Technische Universität München
Zentrum Mathematik
Applications of Least-Squares Regressions to Pricing
and Hedging of Financial Derivatives
Andreas J. Grau
Complete reprint of the dissertation approved by the Fakultät für Mathematik of the Technische
Universität München for the award of the academic degree of
Doktor der Naturwissenschaften (Dr. rer. nat.).
Chair: Univ.-Prof. Dr. Bernd Simeon
Examiners of the dissertation:
1. Univ.-Prof. Dr. Rudi Zagst
2. Prof. Phelim P. Boyle, Ph.D. (em.), Wilfrid Laurier University, Waterloo, Canada
(written assessment only)
3. Univ.-Prof. Dr. Hans-Joachim Bungartz
The dissertation was submitted to the Technische Universität München on 12.12.2007 and accepted
by the Fakultät für Mathematik on 05.02.2008.
Acknowledgements
This thesis would not have been possible without the support of numerous people. First,
I would like to thank my supervisor Prof. Dr. Rudi Zagst for assigning challenging tasks to me,
discussing my ideas patiently and providing assistance in making this thesis readable. He
even left me enough room for my unconventional ideas. I thank Prof. Dr. Peter Forsyth, Prof. Dr.
Ken Vetzal and Prof. Dr. Jan Kallsen for discussions with valuable input and thought-provoking
impulses. This is especially true for my co-supervisor Prof. Dr. Phelim Boyle and the third referee
of this thesis Prof. Dr. Hans Bungartz.
I would also like to thank my colleagues Dr. Stefan Dirnstorfer, Christina Niethammer and
Christoph Hänle for making the time working on my dissertation enjoyable.
Last, but not least, special thanks go to my parents and my brother for their constant full
support from back home.
Contents
Introduction
1 Mathematical Foundations
1.1 Overview
1.2 Regression Methods
1.2.1 Basics
1.2.2 Basis Functions
1.2.3 Approximation Properties
1.3 Pricing and Hedging in Complete Markets
1.3.1 Terminology
1.3.2 General Framework
1.3.3 Exercisable Options
1.4 Numerical Methods for Option Valuation
1.4.1 Overview
1.4.2 Monte Carlo Methods
1.4.3 Direct PDE Methods
2 The Challenge of Path Dependency
2.1 Overview
2.2 Introduction
2.3 Pricing Using Feature Extraction
2.3.1 A Discretely Sampled Asian Option
2.3.2 Simple Example
2.3.3 Numerical Examples
2.3.4 Summary of the Feature Extraction
2.4 Pricing Delayed Barrier Options
2.4.1 Numerical Example: A Parisian Option
2.5 Summary
3 Moving Window Asian Options
3.1 Overview
3.2 Introduction
3.3 Moving Window Asian Option
3.3.1 Continuous Version
3.3.2 Discretization
3.4 Related Problems
3.4.1 Asian American Option
3.4.2 Exponential Weight
3.4.3 Moving Window Asian Option
3.5 Numerical Procedure
3.5.1 Simulation
3.5.2 Choice of Basis Functions
3.5.3 Simple Example
3.6 Numerical Examples
3.6.1 Convergence
3.6.2 Heuristic Extrapolation
3.7 Summary
4 Callable Convertible Bonds
4.1 Overview
4.2 Introduction
4.3 Models for Convertible Bonds
4.3.1 No Default Risk
4.3.2 Credit Risk
4.3.3 Cash Flows, Call and Put Provisions
4.4 Numerical Algorithm
4.4.1 PDE Implementation
4.4.2 Monte Carlo Implementation
4.5 Case Study
4.5.1 Convergence Analysis - PDE
4.5.2 Convergence Analysis - Monte Carlo
4.5.3 Properties of Different Call Strategies
4.5.4 Moving Window and Call Notice Protection
4.6 Summary
5 Simulation-Based Hedging and Incomplete Markets
5.1 Overview
5.2 Introduction
5.3 Derivation
5.3.1 Basic Requirements for a Pricing Method
5.3.2 Hedging and Pricing of a Liquidly Traded Security
5.3.3 Setting for an Illiquid Market
5.3.4 Transaction Costs
5.3.5 American Put Options
5.4 Monte Carlo Implementations
5.4.1 Simple Example
5.4.2 Simulation-Based Hedging in a Black-Scholes Market (European Options)
5.4.3 Hedged Monte Carlo (Potters et al. [96]) in a Black-Scholes Market (European Options)
5.4.4 Simulation-Based Hedging in a Black-Scholes Market (American Put Option)
5.4.5 Remarks on the Computational Efficiency
5.4.6 Numerical Experiments
5.5 Summary
6 Conclusions
7 Appendix
7.1 Important Symbols
7.2 Notes for the Proof of Theorem 1.4
7.3 Proof of Equation (1.8)
7.4 Proof of Equation Set (4.4)-(4.7)
7.5 Feature Extraction in Octave/MATLAB
7.6 Simulation-Based Hedging in Octave/MATLAB
Bibliography
Introduction
The introduction of financial options has made a valuable contribution to the efficiency of
markets around the world. Investors seeking risk - speculators - can use financial options to obtain
large effects with little money. Investors avoiding risk - hedgers - can buy insurance for
their portfolios at reasonable prices. In Book 1, Chapter 11 of Politics, Aristotle already tells the
story of Thales of Miletus (624-547 BC) essentially buying an option on the olive crop. However, it was
not until the 1970s that large volumes of financial options were traded at derivatives exchanges. Today,
the underlying problem of pricing and hedging options is well known, and several approaches to
its solution have been proposed. Assuming a simple complete market without any transaction
costs, the Black-Scholes model has been the most successful since its introduction in 1973 [17].
Despite the beauty and simplicity of the Black-Scholes model, the efficient evaluation of many
exotic options remains challenging. It quickly turned out that analytic solutions, e.g. those of Merton [87], are by far not sufficient for the evaluation of traded securities. Consequently, a large
variety of procedures has been developed for the solution of the governing partial differential
equation (PDE). Direct solvers include a finite difference method by Schwartz [103], the finite element and finite volume methods by Forsyth and Vetzal [50] and Zvan et al. [124], respectively, as
well as a mesh-less method by Li et al. [80]. A popular solver is the Cox-Ross-Rubinstein method
(CRR) [36], which discretizes the asset price process by a binomial tree and solves for the option
price by a simple recursion that is easy to implement. However, the CRR method does not have
convergence properties as good as those of the other PDE methods.
A different approach to solving for option prices in the Black-Scholes model focuses on the underlying stochastic differential equation (SDE). Here, Monte Carlo paths
of the underlying asset are simulated and option prices are computed as an expected value, an approach first presented by
Boyle [21, 23, 53]. While option features such as early exercise can easily be evaluated in a PDE
solver (cp. Forsyth and Vetzal [50]), this is hard for Monte Carlo methods. Carrière [32] presented
the first practical Monte Carlo method for the valuation of options with early exercise features in
1996, which became popular after being extended by Longstaff and Schwartz [81] in 2001. This
method is called Least-Squares Monte Carlo.
In short, the main methods for option valuation are PDE solvers and Monte Carlo simulation. On the one hand, for many pricing problems PDE solvers deliver highly accurate solutions
in little time. This is especially true for low-dimensional pricing problems. However, not all pricing problems
are low-dimensional. Path-dependent options in particular often require the introduction of additional state variables. Once four or five dimensions are reached, pricing usually becomes infeasible
for current PDE and computer technology. On the other hand, Monte Carlo methods can price
options independently of the dimension of the pricing problem. However, these methods converge
slowly, so that highly accurate solutions often cannot be obtained. Additionally, the valuation
of high-dimensional financial derivatives with embedded options such as early exercise is still
challenging, even if the method of Carrière [32, 81] is used.
This work will focus on pricing and hedging of derivatives with Monte Carlo simulation. In
some cases, direct numerical PDE solutions will be used as a reference. We will provide insight
into the versatile applications of regression methods for the Monte Carlo valuation. As a result,
very fast valuation procedures are developed. In some cases the methods developed in this dissertation are the first of their kind to handle specific exotic options. In particular, the pricing of a
high-dimensional Moving Window Asian option with early exercise and the implementation of a
moving window soft-call constraint of convertible bonds are solved for the first time in this thesis.
Prior technology could not cope with a high-dimensional pricing problem combined with
an early exercise feature. The PDE method can deal with an early exercise feature easily, but
high-dimensional problems are infeasible. Monte Carlo methods can deal with high-dimensional
problems, but the early exercise of a high-dimensional option is hard to treat
correctly in the previous setting.
Another contribution of this thesis is the Simulation-Based Hedging method, which connects
realistic models for the underlying with suitable pricing and hedging without a detour to a so-called risk-neutral measure. Simulation-Based Hedging has extraordinary properties. For example,
under the Black-Scholes assumptions its convergence to the Black-Scholes prices is much faster
than that of the comparable Longstaff-Schwartz Least-Squares Monte Carlo [81]. Furthermore, the underlying can follow any real-world process: the algorithm always computes the optimal hedging
strategy and thus attains realistic risk-adjusted prices and hedges. This can also be done using
multiple hedge instruments.
Consequently, Simulation-Based Hedging is a new pricing framework together with
a numerical method for the solution of option pricing problems in so-called incomplete markets.
The whole setting of the framework is new, but related to risk minimization techniques for optimal hedging of financial options presented by several authors [46, 95, 47, 33]. In particular, the
setting of Simulation-Based Hedging can be seen as an extension of the variance minimization
presented by Schweizer [104], and the presented numerical solution is related to a method presented by Potters et al. [96] and by Pochart and Bouchaud [95].
The main results of this dissertation are:
• The usual Monte Carlo method is altered for a quicker evaluation: by extracting the main
features of the option's payoff, a simple regression can significantly accelerate the evaluation of
path-dependent derivatives.
• A sparse basis for Least-Squares Monte Carlo is presented which makes it possible to price Moving
Window Asian Options for the first time. This method is extended to the evaluation of
convertible bonds with complex rights of holders and issuers.
• A powerful Simulation-Based Hedging method has been developed, which determines
option prices and optimal hedges based on physical simulations of the underlying. This
method is an order of magnitude faster than the state-of-the-art Least-Squares Monte Carlo
and can operate with much less restrictive assumptions on the market than the widely used
Black-Scholes model.
Chapter 1
Mathematical Foundations
1.1 Overview
This chapter summarizes the main mathematical tools needed for the pricing and hedging of
financial derivatives. Since this thesis focuses on regression methods, we first define what kind of
regression is meant and which properties of the method are required. Then, the different choices
of regression basis functions are presented. After summarizing the basics for derivatives pricing,
the chapter closes with the corresponding numerical implementation.
1.2 Regression Methods
In this section, we review the mathematical properties of different regression methods. We focus on least-squares regressions due to their desirable properties and extend the regression basis to special sparse basis functions in order to obtain computationally feasible methods for high-dimensional regressions.
1.2.1 Basics
Before we start with the actual topic on regression methods we need to define some terminology.
The function $f(x)$ is said to be of class $C^k$ if the derivatives $\frac{df}{dx}, \frac{d^2 f}{dx^2}, \ldots, \frac{d^k f}{dx^k}$ exist and are
continuous. The function $f(x)$ is said to be of class $C^0$ if it is continuous. The function $f(x)$
is said to be of class $C^\infty$, or smooth, if it has derivatives of all orders.
Approximation by Regression
In the following, we want to show what kind of regressions are useful in the context of option
pricing and which properties they have. There are two main applications of the regression for
option pricing: One is the function approximation, the other is a variance minimization of a portfolio.
First, we start with the function approximation. Therefore we need to define our setting and
what we mean by an approximation.
Assumption 1 A data set $(X, y)$, $X \in \mathbb{R}^{n,s}$, $y \in \mathbb{R}^n$ is provided.
Assumption 2 The rows $x^i \in \mathbb{R}^s$ of $X := (x^1, \ldots, x^n)^T$ are independent and identically distributed (i.i.d.) realizations of a random vector with a probability density function $p(x)$,
which is non-zero everywhere on the cube $D := [x_{\min}, x_{\max}] = ([x_{j,\min}, x_{j,\max}])_{j=1,\ldots,s}$ and
zero outside.
Assumption 3 The provided values of $y = (y^1, \ldots, y^n)^T$ are noisy observations of $f(x^i)$, $i = 1, \ldots, n$, with
$$y^i = f(x^i) + \varepsilon^i, \quad i = 1, \ldots, n,$$
where $\varepsilon^i$ is random with $E[\varepsilon^i] = 0$, independent of $x^i$.
Assumption 4 The function $f : \mathbb{R}^s \to \mathbb{R}$, $f \in B$ has a representation
$$f(x) = \sum_{j=1}^{\infty} a_j b_j(x), \quad x \in \mathbb{R}^s,$$
where $b_j \in B$, $j = 1, \ldots, \infty$ are bounded basis functions $b_j : \mathbb{R}^s \to \mathbb{R}$ of a vector space
$B \subset C^1$ with $\|b_j(x)\|_\infty = c_j < \infty$, $\exists x \in D : |b_j(x)| > 0$, $j = 1, \ldots, \infty$.
Assumption 1 is clear. Assumption 2 explains that there is one stochastic variable which determines the value xi at which a function is evaluated, while Assumption 3 explains that the function value plus a random noise is denoted by y i . Now, Assumption 4 contains a decomposition of
function f into basis functions with a continuous total derivative such that an approximation can
be defined properly. Note that Assumption 4 does not impose a restriction on a numerical evaluation using a subset of the basis functions since e.g. any continuous function can be uniformly
approximated by polynomials (Stone-Weierstrass Theorem [110]).
Theorem 1.1 Let Assumptions 1 to 4 be satisfied. Then the mapping given by
$$\langle b, f \rangle_r := \int_D b(x) f(x) p(x) \, dx, \quad x \in \mathbb{R}^s \tag{1.1}$$
with functions $b(x), f(x) \in B$ and probability density function $p(x)$ is a scalar product on $B$ and thus $B$
is a Euclidean vector space, i.e. a real vector space $B$ with a corresponding definition of a scalar product.
Proof We can prove Theorem 1.1 simply by comparing the conditions for a scalar product with
the corresponding expressions. This is straightforward, so we omit the details. □
The next thing we need is a stochastic approximation of this scalar product, which we provide
by the following theorem.
Theorem 1.2 Let Assumptions 1 to 4 be satisfied. Then
$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} f(x^i) b_j(x^i) = \langle f, b_j \rangle_r$$
holds.
Proof With $x := (x_1, \ldots, x_s) \in D$, $x^i := (x^i_1, \ldots, x^i_s) \in D$, $i = 1, \ldots, n$, and the indicator function
$$I_x(x^i) := \begin{cases} 1 & \text{if } x^i_j < x_j \ \forall\, j = 1, \ldots, s \\ 0 & \text{else} \end{cases}$$
we define the empirical cumulative distribution function $F_n(x)$ of $n$ sample observations $x^i$ as
$$F_n(x) := \frac{1}{n} \sum_{i=1}^{n} I_x(x^i), \quad x \in D,$$
which means that
$$\int_D f \, b_j \, dF_n = \frac{1}{n} \sum_{i=1}^{n} f(x^i) b_j(x^i).$$
From the Glivenko-Cantelli Theorem¹ we know that $F_n(x) \to F(x)$ almost surely and uniformly, where $F(x)$ is the true cumulative distribution function, i.e.
$$\lim_{n \to \infty} \int_D f \, b_j \, dF_n = \int_D f \, b_j \, dF$$
holds for all integrands which are bounded and continuous in the domain $D$. The integrand
$f(x) b_j(x)$ is continuous by Assumption 4 and, since the integration domain $D$ is bounded (Assumption 2), it is bounded everywhere on the domain $D$. Since $p(x)$ is the total first
derivative of the cumulative distribution function $F(x)$,
$$\int_D f \, b_j \, dF = \int_D b_j(x) f(x) p(x) \, dx = \langle b_j, f \rangle_r$$
holds, which completes the proof. □
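The statement of Theorem 1.2 can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a uniform density $p(x)$ on $D = [0, 1]$ and the hypothetical choices $f(x) = x$ and $b(x) = x^2$, for which the scalar product (1.1) equals $\int_0^1 x^3 \, dx = 1/4$.

```python
import random


def empirical_inner_product(f, b, samples):
    """Sample mean of f(x) * b(x), i.e. the left-hand side of Theorem 1.2."""
    return sum(f(x) * b(x) for x in samples) / len(samples)


random.seed(0)  # fixed seed, only to make the sketch reproducible
n = 200_000

# Assumption for this sketch: p(x) is the uniform density on D = [0, 1].
samples = [random.random() for _ in range(n)]


def f(x):
    return x          # hypothetical target function


def b(x):
    return x * x      # hypothetical basis function


estimate = empirical_inner_product(f, b, samples)
exact = 0.25          # integral of x * x^2 over [0, 1]
print(estimate, exact)
```

The estimate approaches the exact value at the usual Monte Carlo rate of order $n^{-1/2}$.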
We defined an approximation for the scalar product, but our goal is to obtain an approximation
$\tilde{f}(x) \approx f(x)$ for any $x$ in $[x_{\min}, x_{\max}]$ based on the set of noisy observations $(X, y)$. That means
we first have to define what we mean by an approximation.
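1 See Fahrmeier et al. [43], p. 315.
Definition 1.3 Let Assumptions 1 to 4 be satisfied. A local basis approximation $\tilde{f}_m$ of the function $f$
induced by the set of samples $(X, y)$, $X \in \mathbb{R}^{n,s}$, $y \in \mathbb{R}^n$ with function space $B_m$ spanned by the basis
functions $b_1, \ldots, b_m \in B$, is given by
$$\tilde{f}_m(x) = \sum_{j=1}^{m} \tilde{a}^n_j b_j(x), \quad \tilde{f}_m \in B_m \subset B, \ \tilde{a}^n_j \in \mathbb{R}, \ j = 1, \ldots, m$$
with coefficient vector $\tilde{a}_m = A(X, y)$, $\tilde{a}_m = (\tilde{a}^n_1, \ldots, \tilde{a}^n_m)^T$ iff
$$\forall \varepsilon > 0 \ \exists N(\varepsilon) : \left\| \begin{pmatrix} \tilde{a}^n_1 \\ \vdots \\ \tilde{a}^n_m \end{pmatrix} - \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix} \right\|_\infty < \varepsilon, \quad \forall n \ge N(\varepsilon).$$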
In the following, we state how one can obtain a suitable function $A(X, y)$, which determines
the coefficient vector $\tilde{a}_m$ given the noisy observations and the basis functions of interest $b_j(x) \in B_m$, $j = 1, \ldots, m$, so that we obtain the local basis approximation.
Theorem 1.4 Let Assumptions 1 to 4 be satisfied. The local basis approximation $\tilde{f}_m(\cdot)$ of the function
$f(\cdot)$ based on a set $(X, y)$ of $n$ noisy observations is given by
$$\tilde{f}_m(x) = \sum_{j=1}^{m} \tilde{a}^n_j b_j(x)$$
with $\tilde{a}_m = (\tilde{a}^n_1, \ldots, \tilde{a}^n_m)^T$, where
$$\tilde{a}_m = A(X, y) = \arg\min_{\tilde{a}_m} \| B(X) \tilde{a}_m - y \|_2 = \left( B(X)^T B(X) \right)^{-1} B(X)^T y \tag{1.2}$$
with
$$B(X) := \begin{pmatrix} b_1(x^1) & \cdots & b_m(x^1) \\ \vdots & \ddots & \vdots \\ b_1(x^n) & \cdots & b_m(x^n) \end{pmatrix}.$$
See Appendix 7.2 for details of the proof.
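Equation (1.2) can be illustrated with a small experiment. The sketch below is an illustration under stated assumptions rather than the implementation used in this thesis: the basis $\{1, x, x^2\}$, the true coefficients, the noise level and all function names are hypothetical, and the normal equations are solved explicitly by Gaussian elimination, which is adequate only for such a small and well-conditioned example.

```python
import random


def design_matrix(xs, basis):
    """B(X): one row per sample x^i, one column per basis function b_j."""
    return [[b(x) for b in basis] for x in xs]


def solve_linear_system(a, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    m = len(rhs)
    aug = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(aug[r][c] * sol[c] for c in range(r + 1, m))
        sol[r] = (aug[r][m] - s) / aug[r][r]
    return sol


def least_squares_coefficients(xs, ys, basis):
    """Solve B^T B a = B^T y, the normal equations behind Equation (1.2)."""
    bmat = design_matrix(xs, basis)
    m = len(basis)
    gram = [[sum(row[j] * row[k] for row in bmat) for k in range(m)]
            for j in range(m)]
    rhs = [sum(row[j] * y for row, y in zip(bmat, ys)) for j in range(m)]
    return solve_linear_system(gram, rhs)


random.seed(1)
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]  # assumed basis
true_a = [1.0, 2.0, -1.0]                              # assumed coefficients

xs = [random.uniform(0.0, 1.0) for _ in range(20_000)]
# Noisy observations y^i = f(x^i) + eps^i with E[eps^i] = 0 (Assumption 3).
ys = [sum(a * b(x) for a, b in zip(true_a, basis)) + random.gauss(0.0, 0.1)
      for x in xs]

coeffs = least_squares_coefficients(xs, ys, basis)
print(coeffs)
```

With growing $n$ the estimated coefficients approach the true ones, which is the convergence required in Definition 1.3.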
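Lemma 1.5 Let Assumptions 1 to 4 be satisfied and $\tilde{f}_m(x)$ be defined as in Theorem 1.4. Then,
$$\lim_{m,n \to \infty} \tilde{f}_m(x) = E[y|x].$$
Proof The order of the limits $n \to \infty$ and $m \to \infty$ is important:
$$\lim_{m \to \infty} \left( \lim_{n \to \infty} \tilde{f}_m(x) \right) = \lim_{m \to \infty} \left( \lim_{n \to \infty} \sum_{j=1}^{m} \tilde{a}^n_j b_j(x) \right) = \lim_{m \to \infty} \left( \sum_{j=1}^{m} a_j b_j(x) \right) = f(x) = E[f(x)|x] = E[y - \varepsilon|x] = E[y|x],$$
since $E[\varepsilon|x] = 0$. □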
Note: In a real application, the quotient $n^\alpha / m$ should be constant in the limiting process; the
optimal exponent $\alpha$ depends on the smoothness of $f$ and the dimension $s$.²
2 For a detailed proof and the optimal exponent see Stentoft [108].
Variance Minimization by Regression
To get a better understanding of what the local basis approximation can do for a variance minimization, we look at an idea which dates back to 1979, when Ederington [42] showed that a static
minimum variance hedge ratio is simply the ratio of the covariance of $V$ and $S$ to the
variance of $S$.³
Ederington argues that
$$\beta := \frac{\operatorname{cov}(S, V)}{\operatorname{var}(S)} \tag{1.3}$$
is the optimal hedge for $V$ with some correlated underlying $S$ in a single period market.
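A short numerical illustration of Equation (1.3) follows. It is a sketch under assumed data: the linear model $V = 0.6\,S + \text{noise}$, the sample size and the function names are chosen only for this example and do not come from Ederington's study.

```python
import random


def mean(values):
    return sum(values) / len(values)


def variance(values):
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


def minimum_variance_hedge_ratio(s, v):
    """beta = cov(S, V) / var(S), cf. Equation (1.3)."""
    mean_s, mean_v = mean(s), mean(v)
    cov_sv = sum((si - mean_s) * (vi - mean_v) for si, vi in zip(s, v)) / len(s)
    return cov_sv / variance(s)


random.seed(2)
# Assumed single-period model: V = 0.6 * S + independent noise.
s = [random.gauss(0.0, 1.0) for _ in range(50_000)]
v = [0.6 * si + random.gauss(0.0, 0.2) for si in s]

beta = minimum_variance_hedge_ratio(s, v)
unhedged_var = variance(v)
hedged_var = variance([vi - beta * si for si, vi in zip(s, v)])
print(beta, unhedged_var, hedged_var)
```

The hedged position $V - \beta S$ retains only the variance of the noise term, which is the part of $V$ that cannot be explained by $S$.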
Our approach can be seen as a generalization of this idea to β(x) as a function of some underlying state x ∈ Rs , such that optimal hedges for multi-period markets can be computed. That
means we allow the optimal hedge to be conditional on the current state of the world x, and thus
we can compute optimal dynamic hedges in the following chapters.
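Given the random variables $\hat{S}, \hat{V}$ with $E[\hat{S}|x] = 0$ and $E[\hat{V}|x] = 0$, dependent on state $x$, the
function
$$\beta(x) := \frac{\operatorname{cov}(\hat{S}, \hat{V}|x)}{\operatorname{var}(\hat{S}|x)} \tag{1.4}$$
shall be estimated from a sample $\{\hat{S}^i, \hat{V}^i, x^i\}$, $i \in \{1, \ldots, n\}$. From linear regressions of stochastic
variables we know that the definition of Equation (1.4) is the solution to the minimization of⁴
$$\frac{1}{n} \sum_{i=1}^{n} \left( \hat{V}^i - \beta(x^i) \hat{S}^i \right)^2, \tag{1.5}$$
which we can write with some basis functions $b_j(x^i)$, $j = 1, \ldots, m$, and $\beta(x^i) = \sum_j \tilde{a}_j b_j(x^i)$ as
$$\{\tilde{a}_j\} = \arg\min_{\tilde{a}_j} \left\| \begin{pmatrix} b_1(x^1) \hat{S}^1 & \cdots & b_m(x^1) \hat{S}^1 \\ \vdots & \ddots & \vdots \\ b_1(x^n) \hat{S}^n & \cdots & b_m(x^n) \hat{S}^n \end{pmatrix} \begin{pmatrix} \tilde{a}_1 \\ \vdots \\ \tilde{a}_m \end{pmatrix} - \begin{pmatrix} \hat{V}^1 \\ \vdots \\ \hat{V}^n \end{pmatrix} \right\|_2. \tag{1.6}$$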
Standard arguments as in the previous section show that with
$$B_S(X) = B_S(x^1, \ldots, x^n) := \begin{pmatrix} b_1(x^1) \hat{S}^1 & \cdots & b_m(x^1) \hat{S}^1 \\ \vdots & \ddots & \vdots \\ b_1(x^n) \hat{S}^n & \cdots & b_m(x^n) \hat{S}^n \end{pmatrix}$$
the values $\tilde{a}_j$ can be computed using efficient algorithms which implicitly solve the normal equations
$$B_S(X)^T B_S(X) \tilde{a} = B_S(X)^T \hat{V} \tag{1.7}$$
with $\tilde{a} = (\tilde{a}_1, \ldots, \tilde{a}_m)^T$ and $\hat{V} = (\hat{V}^1, \ldots, \hat{V}^n)^T$. Then,
$$\beta(x^i) = \frac{\operatorname{cov}(\hat{S}, \hat{V}|x = x^i)}{\operatorname{var}(\hat{S}|x = x^i)} \approx \sum_{j=1}^{m} \tilde{a}_j b_j(x^i)$$
is the desired result which denotes the optimal hedge ratio based on some state $x^i$. This result
equals the previous result using Theorem 1.4 if we set $\hat{S} = S_{j+1} - E[S_{j+1}|x_j]$ and $\hat{V} = V_{j+1} - E[V_{j+1}|x_j]$ such that $E[\hat{S}|x_j] = 0$ and $E[\hat{V}|x_j] = 0$ holds and a local basis approximation of
$\hat{V}$ by $\hat{S}$ is conducted. In implementations, $E[V_{j+1}|x_j]$ and $E[S_{j+1}|x_j]$ can be obtained from the
corresponding local basis approximation of $V_{j+1}$ and $S_{j+1}$ as defined in Theorem 1.4.⁵
3 Ederington's example is the hedge of a future, but it applies to any derivative $V$. For more recent research on futures
hedges see e.g. Allen et al. [3].
4 Cp. [102]
In total, we obtained a simple method for computing a conditional variance minimization of
a portfolio based on regressions. The resulting Equation (1.7) is very similar to Equation (1.2) of
Theorem 1.4. Consequently, this method itself can be seen as a local basis approximation of the
option V by the underlying asset price S.
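The conditional variance minimization can be sketched for the smallest non-trivial case. The example below is an illustration under stated assumptions, not the implementation of this thesis: it uses the assumed basis $\{1, x\}$, a hypothetical true hedge ratio $\beta(x) = 0.5 + x$, already centered samples $\hat{S}, \hat{V}$, and it solves the $2 \times 2$ normal equations (1.7) explicitly by Cramer's rule instead of the more stable algorithms discussed in the following section.

```python
import random


def conditional_hedge_coefficients(s_hat, v_hat, xs):
    """Fit beta(x) = a1 + a2 * x by minimizing sum_i (V^i - beta(x^i) S^i)^2."""
    g11 = g12 = g22 = r1 = r2 = 0.0
    for s, v, x in zip(s_hat, v_hat, xs):
        c1, c2 = s, x * s        # row of B_S(X) for the basis {1, x}
        g11 += c1 * c1
        g12 += c1 * c2
        g22 += c2 * c2
        r1 += c1 * v
        r2 += c2 * v
    # Cramer's rule for the symmetric 2 x 2 system B_S^T B_S a = B_S^T V.
    det = g11 * g22 - g12 * g12
    a1 = (r1 * g22 - r2 * g12) / det
    a2 = (g11 * r2 - g12 * r1) / det
    return a1, a2


random.seed(3)
n = 50_000
xs = [random.uniform(0.0, 1.0) for _ in range(n)]      # state variable
s_hat = [random.gauss(0.0, 1.0) for _ in range(n)]     # E[S_hat | x] = 0
# Assumed model: V_hat = beta(x) * S_hat + noise with beta(x) = 0.5 + x.
v_hat = [(0.5 + x) * s + random.gauss(0.0, 0.2) for x, s in zip(xs, s_hat)]

a1, a2 = conditional_hedge_coefficients(s_hat, v_hat, xs)
print(a1, a2)
```

The fitted coefficients recover the state dependence of the hedge ratio, which a single static ratio as in Equation (1.3) cannot represent.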
The current literature on non-parametric statistics solves slightly different problems, but we
still want to refer to this research: A conditional functional principal components analysis is
proposed by Cardot [31], who presents the computation of conditional covariances based on
kernel smoothers as well as some convergence properties of the approach. Within the GARCH
framework, conditional variance functions are computed by Fan and Yao [44]. They perform regressions on squared residuals and they obtain the asymptotic convergence result that without
knowing the regression function, their method estimates the conditional variance as well as if
the regression functions were given. Finally, non-parametric regressions are presented by several
authors. Especially Härdle [60] provides a good overview of different regression and smoothing
techniques.
Numerical Solution to the Least-Squares Minimization
Now, we want to summarize how the normal equations (1.2) and (1.7) are solved efficiently. The
direct solution by computation of
$$\tilde{a} = \left( B^T B \right)^{-1} B^T y$$
can lead to a numerically unstable solution. In the following, we will provide some insight into the reasons for
this. For more rigorous derivations and detailed error analysis see Higham [62] and the references
therein as well as Voss [114].
First of all, we need a definition and a theorem, which allows us to write the solution of the
minimization problem efficiently.
5 This approach with separate estimations of $E[V_{j+1}|x_j]$, $E[S_{j+1}|x_j]$ and $\beta(x^i)$ can be unified into a single step with
twice as many basis functions. However, the unification is not useful in practice. The cost of the numerical algorithm for estimating the
basis coefficients $a$ is $\approx m^3$, with $m$ denoting the number of basis functions. Estimating the conditional expectations using
the local basis approximation first, subtracting the expectations and estimating the coefficients for $\beta(x)$ with $m > 1$ leads
to $\approx 3 \cdot m^3$, which is less than $(2m)^3$ for the joint estimation.
Definition 1.6 Let $B \in \mathbb{R}^{n,m}$. Then the matrix $B^\dagger \in \mathbb{R}^{m,n}$ which delivers the
solution to the minimization problem $\|Ba - y\|_2 = \min!$, written as $a = B^\dagger y$, is called the pseudo inverse of
$B$.
The next theorem tells us how to obtain the pseudo inverse:
Theorem 1.7 Let $B \in \mathbb{R}^{n,m}$ have a singular value decomposition
$$B = U \Sigma V^T,$$
where $\Sigma$ is a diagonal matrix with the sorted singular values $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_m > 0$ on the diagonal and
the orthonormal vectors $u^i$, $v^i$ as columns of $U$ and $V$, respectively. Then,
• $\Sigma^\dagger = (\sigma_i^{-1} \delta_{ij})$, with $\delta_{ij} := 1$ if $i = j$ and $\delta_{ij} := 0$ otherwise,
• $B^\dagger = V \Sigma^\dagger U^T$,
where $B^\dagger$ is the pseudo inverse.
Proof A proof of Theorem 1.7 can be found in [114], Satz 5.26, as well as in [85], Satz 9.22. □
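To make Theorem 1.7 concrete, the following worked example computes a pseudo inverse by hand. It is our own illustration; the matrix is chosen so that the singular value decomposition can be read off directly.

```latex
B = \begin{pmatrix} 2 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} = U \Sigma V^T,
\quad U = I_3, \ \Sigma = B, \ V = I_2, \quad \sigma_1 = 2, \ \sigma_2 = 1,
\\[1ex]
\Sigma^\dagger = \begin{pmatrix} 1/2 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},
\quad B^\dagger = V \Sigma^\dagger U^T = \Sigma^\dagger,
\\[1ex]
y = (2, 3, 5)^T \ \Rightarrow \ a = B^\dagger y = (1, 3)^T,
\quad Ba - y = (0, 0, -5)^T, \quad \|Ba - y\|_2 = 5 .
```

The remaining residual of 5 is the component of $y$ outside the column space of $B$, which no choice of $a$ can reduce.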
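Now, we take a closer look at the stability of the minimization problem. We consider
the linear minimization problem
$$\{a_i\} = \arg\min_a \|Ba - y\|_2$$
with $B \in \mathbb{R}^{n,m}$, $\operatorname{rank}(B) = m$, $n \ge m$, and a perturbation thereof
$$\{\tilde{a}_i\} = \arg\min_a \|B(a + \Delta a) - (y + \Delta y)\|_2,$$
in which only the values $y$ are perturbed and not the matrix $B$.
Let $a = B^\dagger y$ and $a + \Delta a = B^\dagger (y + \Delta y)$ be the solutions obtained with the corresponding pseudo inverse
$B^\dagger$ of $B$. Then $\Delta a = B^\dagger \Delta y$ holds and from $\|B^\dagger\|_2 = \frac{1}{\sigma_m}$ it follows⁶ that
$$\|\Delta a\|_2 \le \|B^\dagger\|_2 \cdot \|\Delta y\|_2 = \frac{1}{\sigma_m} \|\Delta y\|_2.$$
6 This is a direct result of the definition of a matrix norm:
$$\|B^\dagger\|_2 := \sup_{\|y\|_2 = 1} \|B^\dagger y\|_2 = \sup_{\|y\|_2 = 1} \sqrt{y^T U (\Sigma^\dagger)^T V^T V \Sigma^\dagger U^T y} = \sup_{\|c\|_2 = 1, \ c = U^T y} \sqrt{c^T (\Sigma^\dagger)^T \Sigma^\dagger c} = \sup_{\|c\|_2 = 1, \ c = U^T y} \left( \sum_{i=1}^{m} c_i^2 \frac{1}{\sigma_i^2} \right)^{\frac{1}{2}} = \frac{1}{\sigma_m}$$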
Furthermore, with $c_i = (u^i)^T y$,
$$\|a\|_2^2 \ge \frac{1}{\sigma_1^2} \left\| \sum_{i=1}^{m} (u^i)^T y \, u^i \right\|_2^2 \tag{1.8}$$
holds⁷. Since $\sum_{i=1}^{m} (u^i)^T y \, u^i$ is the projection of $y$ onto the subspace $U$ spanned by the basis
$u^1, \ldots, u^m$, we can estimate the relative error by
$$\frac{\|\Delta a\|_2}{\|a\|_2} \le \frac{\sigma_1}{\sigma_m} \cdot \frac{\|\Delta y\|_2}{\|P_U(y)\|_2}, \tag{1.9}$$
with linear projector $P_U(y)$. Since this inequality describes how a relative error of the input to the
minimization can perturb the solution, we define the condition of a matrix.
Definition 1.8 Let $B \in \mathbb{R}^{n,m}$, $\operatorname{rank}(B) = m$, $n \ge m$ have a singular value decomposition $B = U \Sigma V^T$.
Then, we call $\kappa(B) := \frac{\sigma_1}{\sigma_m}$ the condition of matrix $B$.
Example 1.9 Consider the condition of an orthogonal matrix: For any orthogonal matrix $Q$,
$$Q^T Q = I$$
holds with identity matrix $I$, which means that all eigenvalues of $Q^T Q$ equal 1 and thus $\sigma_i \equiv 1$ holds for all singular values.
Consequently, the condition of matrix $Q$ is
$$\kappa(Q) = 1.$$
This condition number can be computed for other matrices and used for other types of problems, too. We will use the condition number of a rectangular matrix B ∈ Rn,m to compare two
methods of the solution to least-squares regression.
7 See Appendix 7.3 for a proof.
If we now solve the least-squares regression directly by $a = \left( B^T B \right)^{-1} B^T y$, the relative error
of the solution is governed by
$$\kappa\left( B^T B \right) = \kappa\left( (U \Sigma V^T)^T U \Sigma V^T \right) = \kappa\left( V \Sigma^T \Sigma V^T \right) = \frac{\sigma_1^2}{\sigma_m^2} = \kappa(B)^2,$$
with $\kappa(B) \ge 1$. We can improve this result by the QR factorization
$$B = QR$$
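with orthogonal matrix $Q \in \mathbb{R}^{n,n}$ and upper triangular matrix $R \in \mathbb{R}^{n,m}$. Since $Q$ is an orthogonal matrix and thus has $\operatorname{rank}(Q) = n$, $R$ has the same rank as $B$: $\operatorname{rank}(B) = \operatorname{rank}(R) = m$.
Then, in the least-squares sense (the last $n - m$ rows of $R$ are zero),
$$B^T B a = B^T y \iff (QR)^T (QR) a = (QR)^T y \iff R a = Q^T y$$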
holds. From Example 1.9, we know that the orthogonal matrix $Q$ has a condition of one: $\kappa(Q) = 1$.
For the condition of matrix $R$, we consider
$$B = U \Sigma V^T, \quad R = U_R \Sigma_R V_R^T, \quad B = QR = Q U_R \Sigma_R V_R^T,$$
where $Q U_R$ is again an orthogonal matrix. That means $V = V_R$ and $\Sigma = \Sigma_R$, since the singular value decomposition is almost
unique.⁸ That also means that the upper triangular matrix $R$ has the same condition as matrix $B$,
i.e. $\kappa(R) = \kappa(B)$. The relative error of the solution of $a = R^\dagger Q^T y$ is then about $\kappa(B)$ because, in
order to obtain this solution, one only has to consider the upper triangular block $R_\triangle \in \mathbb{R}^{m,m}$ of $R$
and the first $m$ entries of $\hat{y} := Q^T y$, since all other rows of $R$ are zero. Consequently,
$$a = R^\dagger Q^T y = (R_\triangle)^{-1} \begin{pmatrix} \hat{y}_1 \\ \vdots \\ \hat{y}_m \end{pmatrix}$$
holds, which can be solved by back substitution without any further amplification of the error, whereas the solution by
normal equations, which squares the condition of the problem, can increase the error of the solution
significantly.
Consequently, the QR factorization leads to a method for the solution of the least-squares problem which is more stable than the naive solution of the normal equations.
A general description of the least-squares problem can be found in the book of Acton [1], an in-depth analysis in the book of Higham [62], and the description of an efficient implementation of the solution by QR decompositions in the LAPACK Guide [8].
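The comparison between normal equations and QR factorization can be tried out numerically. The following is a minimal sketch, assuming NumPy is available; the test problem (a monomial basis on 200 equidistant points), the coefficient vector `a_true` and the helper `back_substitution` are illustrative choices and are not taken from the thesis.

```python
import numpy as np


def back_substitution(R, c):
    """Solve R a = c for an upper triangular, non-singular R."""
    m = R.shape[0]
    a = np.zeros(m)
    for k in range(m - 1, -1, -1):
        a[k] = (c[k] - R[k, k + 1:] @ a[k + 1:]) / R[k, k]
    return a


# Hypothetical test problem: monomial basis 1, x, ..., x^5 on 200 points in [0, 1].
x = np.linspace(0.0, 1.0, 200)
B = np.vander(x, 6, increasing=True)
a_true = np.array([1.0, -2.0, 0.5, 3.0, -1.0, 0.25])
y = B @ a_true

kappa_B = np.linalg.cond(B)          # sigma_1 / sigma_m of B
kappa_BtB = np.linalg.cond(B.T @ B)  # expected to be close to kappa_B ** 2

# Normal equations: a = (B^T B)^{-1} B^T y.
a_normal = np.linalg.solve(B.T @ B, B.T @ y)

# QR route: reduced factorization B = Q R, with R of shape (m, m).
Q, R = np.linalg.qr(B)
a_qr = back_substitution(R, Q.T @ y)

err_normal = np.linalg.norm(a_normal - a_true) / np.linalg.norm(a_true)
err_qr = np.linalg.norm(a_qr - a_true) / np.linalg.norm(a_true)
print(f"kappa(B) = {kappa_B:.3e}, kappa(B^T B) = {kappa_BtB:.3e}")
print(f"relative error: normal equations {err_normal:.1e}, QR {err_qr:.1e}")
```

The reduced factorization returned by `np.linalg.qr` already drops the zero rows of $R$, so `R` plays the role of $R_\triangle$ and `Q.T @ y` holds the first $m$ entries of $Q^T y$.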
1.2.2 Basis Functions
A tricky part of the function approximation by regression and in fact the crucial challenge is the
careful choice of the basis functions B. As described in the previous section we will use a linear
8 Here, we quote from Manning and Schütze [86], p. 561: For any given SVD solution, you can get additional nonidentical ones by flipping signs in corresponding left and right singular vectors U and V , and, if there are two or more
identical singular values, then only the subspace determined by the corresponding singular vectors is unique, but it can
be described by any appropriate orthonormal basis vectors. Apart from these cases, SVD is unique.
combination of these basis functions to approximate specific functions, e.g. an estimate for the
current option value conditioned on the underlying asset price.
In the following, we will present different choices of basis functions from polynomials to
splines and sparse piecewise linear functions. We compare their properties so that we can choose
the most suitable ones for each problem.
Polynomials
In some cases the choice of the class of basis functions seems to have little effect on the values
computed by Least-Squares Monte Carlo [81]. Consequently, in some cases, it is sufficient to
choose the simplest set of basis functions, polynomials.
In a simple approach we could use the full set $B^{\text{full}}_\ell$ of all $s$-dimensional polynomials up to a certain polynomial degree $\ell = (\ell_1, \ldots, \ell_s) \in \mathbb{N}^s$,
\[ B^{\text{full}}_\ell(x_1, \ldots, x_s) := \left\{ \prod_{i=1}^{s} x_i^{g_i} \;\middle|\; g_i \in \mathbb{N}_0 \wedge g_i \le \ell_i \right\}. \tag{1.10} \]
But it is easy to see that this construction on its own will quickly exhaust any available computational resources. A setting with 10 dimensions and a maximal polynomial degree of one in each direction, $\ell = (1, \ldots, 1)$, already yields as many as $2^{10} = 1024$ basis functions, over which least-squares regression has to be performed. A maximal quadratic polynomial degree in each direction leads to $3^{10} = 59049$ basis functions.
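The growth of the full basis of Equation (1.10) is easy to check: each exponent $g_i$ can take $\ell_i + 1$ values, so the basis has $\prod_i (\ell_i + 1)$ elements. A small sketch follows; the function names are hypothetical and only serve this illustration.

```python
from itertools import product
from math import prod


def full_basis_size(degree):
    """Number of monomials with exponents g_i in {0, ..., degree[i]}."""
    return prod(d + 1 for d in degree)


def full_basis_exponents(degree):
    """Enumerate the exponent tuples (g_1, ..., g_s) of the full basis."""
    return list(product(*(range(d + 1) for d in degree)))


print(full_basis_size((1,) * 10))  # degree one in 10 dimensions
print(full_basis_size((2,) * 10))  # degree two in 10 dimensions
print(full_basis_exponents((1, 2)))
```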
That means a full polynomial basis seems to be useful only in one- or two-dimensional problems, when the computational effort is relatively small. But in these cases, one often wants to decrease the approximation error by adding basis functions to the regression. In the case of polynomials, this leads to an ill-conditioned matrix which cannot be efficiently treated by standard methods. Additionally, a refinement does not necessarily lead to uniform convergence (Theorem of Faber, see [116]). Local basis functions such as piecewise polynomial splines are much more efficient than these global basis functions and show better convergence, especially at the boundary of the regression domain.
Splines
A useful class of basis functions especially for a single dimension (s = 1) is the cubic spline. A
spline is a piecewise polynomial function which lives on a decomposition of the interval $[x_0, x_m]$,
\[ \Delta : x_0 < x_1 < \ldots < x_m. \tag{1.11} \]
We focus on splines which coincide on each interval $[x_{i-1}, x_i]$, $i = 1, \ldots, m$, with a polynomial of degree 3 and lie in the class of twice continuously differentiable functions $C^2$. Consider the
functions
\begin{align*}
b_i(x, x_i) &= \begin{cases} (x_i - x)^3 & \text{if } x_i \ge x \\ 0 & \text{else} \end{cases}, \quad i = 1, \ldots, m - 1, \\
b_m &= x^3, \\
b_{m+1} &= x^2, \\
b_{m+2} &= x, \\
b_{m+3} &= 1,
\end{align*}
then we see from simple differentiation that a linear combination
\[ f^{\text{spline}}(x) = \sum_{i=1}^{m+3} a_i b_i(x) \tag{1.12} \]
is continuous everywhere and has continuous first and second derivatives ($f^{\text{spline}} \in C^2$). Thus the function $f^{\text{spline}}(x)$ in Equation (1.12) is a cubic spline9.
Given the function values $y_0, \ldots, y_m$ at the knots $x_0, \ldots, x_m$, it is easy to find the coefficients $a_i$ of $f^{\text{spline}}$ by solving
\begin{align*}
f^{\text{spline}}(x_i) &= y_i, \quad i = 0, \ldots, m, \\
\frac{d^2 f^{\text{spline}}}{dx^2}(x_0) &= 0, \\
\frac{d^2 f^{\text{spline}}}{dx^2}(x_m) &= 0,
\end{align*}
which leads to natural splines due to the condition that the second derivative at the boundary of the spline is zero.
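The $m + 3$ conditions above form a square linear system for the $m + 3$ coefficients. The following sketch assembles and solves it with NumPy; the knots, the sampled function and all function names are assumptions made for this illustration. It relies on the second derivatives $6 (x_i - x)$ of the truncated cubes (for $x_i \ge x$), $6x$ of $x^3$ and $2$ of $x^2$.

```python
import numpy as np


def basis_row(x, knots):
    """Values of b_1, ..., b_{m+3} at x for knots x_0, ..., x_m."""
    inner = knots[1:-1]                       # x_1, ..., x_{m-1}
    truncated = np.maximum(inner - x, 0.0) ** 3
    return np.concatenate([truncated, [x**3, x**2, x, 1.0]])


def second_derivative_row(x, knots):
    """Second derivatives of b_1, ..., b_{m+3} at x."""
    inner = knots[1:-1]
    truncated = 6.0 * np.maximum(inner - x, 0.0)
    return np.concatenate([truncated, [6.0 * x, 2.0, 0.0, 0.0]])


def natural_spline_coefficients(knots, values):
    """Solve the interpolation conditions plus the two natural boundary conditions."""
    rows = [basis_row(x, knots) for x in knots]
    rows.append(second_derivative_row(knots[0], knots))
    rows.append(second_derivative_row(knots[-1], knots))
    rhs = np.concatenate([values, [0.0, 0.0]])
    return np.linalg.solve(np.array(rows), rhs)


# Hypothetical data: six equidistant knots, samples of sin(2 pi x).
knots = np.linspace(0.0, 1.0, 6)
values = np.sin(2.0 * np.pi * knots)
coefficients = natural_spline_coefficients(knots, values)

fitted = np.array([basis_row(x, knots) @ coefficients for x in knots])
curvature_left = second_derivative_row(knots[0], knots) @ coefficients
curvature_right = second_derivative_row(knots[-1], knots) @ coefficients
print(np.max(np.abs(fitted - values)), curvature_left, curvature_right)
```

This truncated-power formulation becomes ill-conditioned for many knots, so the sketch is only meant for small examples.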
Piecewise Linear Sparse Grids
As already stated, the exponential growth of the number of basis functions of full grids quickly
overextends any computer. Fortunately, a much more efficient selection of basis functions can be
constructed, known as sparse grids or combination method [107, 29]. This kind of function basis
has been successfully applied in the field of high-dimensional function approximation [52] and
many others.
The original idea of sparse grids is based on piecewise linear basis functions which we will call grids. Similar to the full set of $s$-dimensional polynomials, we define the full grid $\Omega_\ell$, $\ell = (\ell_1, \ldots, \ell_s)$, which has a possibly different equidistant spacing $h_\ell := (2^{-\ell_1}, \ldots, 2^{-\ell_s})$ for each dimension of $x = (x_1, \ldots, x_s)$ and has grid points $x_{\ell,i} := (x_{\ell_1,i_1}, \ldots, x_{\ell_s,i_s})$, $0 \le i_j \le 2^{\ell_j}$ $\forall j \in \{1, \ldots, s\}$. The basis functions and thus the values of such a grid are given by10
\[ b_{\ell,i}(x) := \prod_{j=1}^{s} b_{\ell_j,i_j}(x_j) \]
9 See [111] for details of this basis spline formulation. Other formulations of this spline basis are also useful, especially
for fast evaluation [38, 37] and for higher stability [38, 114].
10 See [29], p. 10
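with index vector $i := (i_1, \ldots, i_s)$ which denotes the multi-index of $b_{\ell,i}$ in the grid $\Omega_\ell$. The required one-dimensional basis functions are
\[ b_{\ell_j,i_j}(x_j) := b\!\left( \frac{x_j - i_j \cdot h_{\ell_j}}{h_{\ell_j}} \right) \]
where
\[ b(x) := \begin{cases} 1 - |x| & \text{if } x \in [-1, 1] \\ 0 & \text{otherwise.} \end{cases} \]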
The idea of sparse grids is summarized as follows. Instead of using a full grid $\Omega_\ell$,
\[ \Omega_\ell := \operatorname{span}\{ b_{\ell,i}(x) : 1 \le i_j \le 2^{\ell_j} - 1 \;\forall j \in \{1, \ldots, s\} \}, \]
we combine multiple grids according to a sparse and error optimal scheme $\Omega^{\text{sparse}}_L$,
\[ \Omega^{\text{sparse}}_L := \bigcup_{\sum \ell_i = L} \Omega_\ell. \tag{1.13} \]
Instead of defining a multidimensional level $\ell$ we use the single sparse level $L$ that limits the sum of all components of $\ell = (\ell_1, \ldots, \ell_s)$. Figure 1.1 presents two- and three-dimensional sparse grids.
This kind of combining regular grids has been shown to produce a reasonable sparseness for a
wide class of smooth functions [28].
If we compare full and sparse grids, we can see that the computational effort decreases radically while the error rises only slightly: Comparing grids with minimal mesh size $h_L = 2^{-L}$, a full grid has $O(h_L^{-s})$ grid points and a sparse grid only employs $O(h_L^{-1} |\log h_L|^{s-1})$ points. At the same time, the $L_2$-interpolation error for smooth functions rises from $O(h_L^2)$ to $O(h_L^2 \cdot |\log h_L|^{s-1})$ [29].
In many applications with high-dimensional smooth functions, $L \in \{2, 3, 4\}$ is already sufficient.
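To get a feeling for the difference in size, one can count the distinct grid points of the combined grids in Equation (1.13) and compare them with a full grid of the same minimal mesh size. The sketch below is an illustration under one stated assumption: it counts all grid points $0 \le i_j \le 2^{\ell_j}$ including the boundary, as in the definition of the grid points above. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def compositions(total, parts):
    """All tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def grid_points(level):
    """Points of the full grid with spacing 2^-level[j] on [0, 1] in each dimension."""
    axes = [[Fraction(i, 2**l) for i in range(2**l + 1)] for l in level]
    return set(product(*axes))


def sparse_grid_points(sparse_level, dim):
    """Union of all full grids whose levels sum to `sparse_level`."""
    points = set()
    for level in compositions(sparse_level, dim):
        points |= grid_points(level)
    return points


def full_grid_points(sparse_level, dim):
    """Full grid with mesh size 2^-sparse_level in every dimension."""
    return grid_points((sparse_level,) * dim)


for dim in (2, 3):
    for level in (1, 2, 3):
        print(dim, level,
              len(sparse_grid_points(level, dim)),
              len(full_grid_points(level, dim)))
```

Exact fractions are used for the coordinates so that identical points from different grids are recognized as duplicates.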
Figure 1.1: A three-dimensional sparse grid with L = 5 is presented on the left and two-dimensional sparse grids from L = 0, . . . , 5 on the right-hand side.
Sparse Polynomial Basis Functions
The presented sparse grid approach uses piecewise linear basis functions supporting the grid
nodes. Most of the examples we will consider in the remainder of this dissertation are approximations to continuation value functions for financial options. These functions are usually smooth ($\in C^\infty$) with respect to the state variables. In such a case, differentiable basis functions deliver better results than piecewise linear functions. Consequently, we propose creating a sparse polynomial basis, which is very smooth ($\in C^\infty$). A detailed analysis for this kind of sparse basis can be found in [13] so that we can focus on the main issues.
We combine the idea of a polynomial basis with the idea of sparse grids: instead of using a plain polynomial basis $B^{\text{full}}_\ell$ as in Equation (1.10), we combine multiple polynomial orders according to the same sparse and error optimal scheme as for the sparse grids. The sparse basis $B^{\text{sparse}}_L$,
\[ B^{\text{sparse}}_L(x_1, \ldots, x_s) := \bigcup_{\sum \ell_i = L} B^{\text{full}}_{\beta(\ell)}(x_1, \ldots, x_s) \tag{1.14} \]
has many of the properties of the sparse grid with piecewise linear basis functions but it is smooth everywhere.
The polynomial basis sparse level $L$ again limits the sum of all components of $\ell = (\ell_1, \ldots, \ell_s)$. Furthermore, the degree of the combined full polynomial bases is transformed by a mapping function $\beta$ that turns each level into a maximum polynomial degree
\[ \beta(\ell) = 2 \cdot (2^{\ell_1} - 1, \ldots, 2^{\ell_s} - 1). \]
This transformation cannot be applied to grids because a grid with $\ell_i = 0$ nodes in the $i$th dimension makes no sense. But, for polynomials, this reduces the number of basis functions. An example for a three-dimensional sparse polynomial basis can be found in Figure 1.2.
Example: A sparse polynomial basis with $s = 3$, $L = 2$
First, we have to compute the full basis polynomials of Equation (1.10). The sparse level $L = 2$ requires the computation of
\begin{align*}
\ell = (2, 0, 0) &\to B^{\text{full}}_{6,0,0} = \{1, x_1, x_1^2, x_1^3, x_1^4, x_1^5, x_1^6\} \\
\ell = (1, 1, 0) &\to B^{\text{full}}_{2,2,0} = \{1, x_1, x_2, x_1 x_2, x_1^2, x_2^2, x_1^2 x_2, x_1 x_2^2, x_1^2 x_2^2\} \\
\ell = (1, 0, 1) &\to B^{\text{full}}_{2,0,2} = \{1, x_1, x_3, x_1 x_3, x_1^2, x_3^2, x_1^2 x_3, x_1 x_3^2, x_1^2 x_3^2\} \\
\ell = (0, 2, 0) &\to B^{\text{full}}_{0,6,0} = \{1, x_2, x_2^2, x_2^3, x_2^4, x_2^5, x_2^6\} \\
\ell = (0, 1, 1) &\to B^{\text{full}}_{0,2,2} = \{1, x_2, x_3, x_2 x_3, x_2^2, x_3^2, x_2^2 x_3, x_2 x_3^2, x_2^2 x_3^2\} \\
\ell = (0, 0, 2) &\to B^{\text{full}}_{0,0,6} = \{1, x_3, x_3^2, x_3^3, x_3^4, x_3^5, x_3^6\}
\end{align*}
for the sparse grid basis and leads to 31 basis functions,
\begin{align*}
B^{\text{sparse}}_2(x_1, x_2, x_3) = \bigcup_{\sum \ell_i = 2} B^{\text{full}}_{\beta(\ell)} = \{ & 1, x_1, x_2, x_3, x_1 x_2, x_1 x_3, x_2 x_3, x_1^2, x_2^2, x_3^2, x_1^2 x_2, \\
& x_1 x_2^2, x_1^2 x_2^2, x_1^2 x_3, x_1 x_3^2, x_1^2 x_3^2, x_2^2 x_3, x_2 x_3^2, x_2^2 x_3^2, \\
& x_1^3, x_2^3, x_3^3, x_1^4, x_2^4, x_3^4, x_1^5, x_2^5, x_3^5, x_1^6, x_2^6, x_3^6 \}.
\end{align*}
Figure 1.2: An example for a three-dimensional sparse polynomial basis.
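The construction in Figure 1.2 can be reproduced by enumerating exponent tuples. The following sketch represents each monomial $x_1^{g_1} \cdots x_s^{g_s}$ by its exponent tuple $(g_1, \ldots, g_s)$; the function names are hypothetical and chosen for this illustration only.

```python
from itertools import product


def compositions(total, parts):
    """All tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def beta(level):
    """Map a level vector to maximal polynomial degrees, 2 * (2^l - 1) per dimension."""
    return tuple(2 * (2**l - 1) for l in level)


def full_basis(degree):
    """Exponent tuples of the full basis of Equation (1.10)."""
    return set(product(*(range(d + 1) for d in degree)))


def sparse_basis(sparse_level, dim):
    """Union of the full bases over all levels that sum to `sparse_level`."""
    basis = set()
    for level in compositions(sparse_level, dim):
        basis |= full_basis(beta(level))
    return basis


example = sparse_basis(2, 3)
print(len(example))
print(sorted(example))
```

For $s = 3$ and $L = 2$ the enumeration contains the pure powers up to degree six and the mixed terms up to degree two in two variables, but no term involving all three variables.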
Thin-Plate Spline Basis Functions
Another alternative to produce a sparse and smooth multivariate approximation function is to
use the nodes of a sparse linear grid as basis function nodes of a thin-plate regression spline.
Thin-plate splines are radial basis functions defined by the minimization of a smoothness measure in a function space. We directly use the resulting spline basis and define a thin-plate spline $f^{\text{tp spline}}(x)$, $x \in \mathbb{R}^s$ as
\begin{align*}
f^{\text{tp spline}}(x) &= \sum_{i=1}^{m} a_i b_i(x), \\
b_i(x) &:= \|x - x_i\|^2 \log \|x - x_i\|,
\end{align*}
with basis nodes $x_i$.11 The main advantage of thin-plate splines is that we can choose the grid nodes $x_i \in \mathbb{R}^s$ freely. This allows the use of random samples as grid nodes as well as the introduction
11 See [115] for the basic derivation, [118] for regressions with large data samples on thin-plate splines and [93] for
option pricing with radial basis functions.
of additional nodes into interesting areas.
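A regression on this basis only needs the matrix of basis values at the sample points. The sketch below is a minimal illustration with NumPy; the random nodes, the target function and the function name `tps_basis` are assumptions. It uses the basis exactly as defined above (without an additional polynomial part) and sets $b_i(x_i) = 0$, the limit of $r^2 \log r$ for $r \to 0$.

```python
import numpy as np


def tps_basis(points, nodes):
    """Matrix with entries b_i(x) = ||x - x_i||^2 log ||x - x_i|| (0 where x = x_i)."""
    r = np.linalg.norm(points[:, None, :] - nodes[None, :, :], axis=2)
    out = np.zeros_like(r)
    positive = r > 0.0
    out[positive] = r[positive] ** 2 * np.log(r[positive])
    return out


rng = np.random.default_rng(1)
samples = rng.uniform(0.0, 1.0, size=(500, 2))   # hypothetical regression samples
nodes = rng.uniform(0.0, 1.0, size=(30, 2))      # freely chosen basis nodes
target = np.sin(np.pi * samples[:, 0]) * samples[:, 1]

B = tps_basis(samples, nodes)
coefficients, *_ = np.linalg.lstsq(B, target, rcond=None)
rmse = np.sqrt(np.mean((B @ coefficients - target) ** 2))
print(f"root mean squared residual: {rmse:.4f}")
```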
1.2.3 Approximation Properties
The theoretical convergence properties of an approximation by the presented basis functions are well analyzed. Uniform convergence can be expected from piecewise-linear splines as well as some thin-plate splines [97]. The approximation properties of a polynomial basis are not as good as those of a spline basis; in particular, the values at the boundary of the approximation domain can have a significant error [62].
Table 1.1 presents an overview of the basis functions presented in this chapter. Regressions on
basis functions with a global support usually suffer from artifacts at the boundary of the domain
of the samples. Smooth functions are sometimes required due to numerical properties, such that
C ∞ functions should be preferred. The number of functions, which depends on the choice of
the parameter ` and the dimension s, can sometimes grow large for medium size dimensions.
Polynomials seem to be useful up to s = 2 or s = 3. Piecewise linear sparse grids and sparse basis
functions can reach s = 10 to s = 20 and thin-plate splines might still be useful for s > 20. The
presented B-splines are only useful in a single dimension s = 1.
Table 1.1 Overview of the presented basis function classes.
Basis Function                   Support           # of Functions m(L, s)               Smoothness
polynomials                      global            $m \in O(L^s)$                       $C^\infty$
B-Splines                        (local) global    $m + 3$, $s = 1$ only                $C^2$
piecewise-linear sparse grids    local             $m \in O(2^L |\log L|^{s-1})$        $C^0$
sparse polynomials               global            $m \in O(2^L |\log L|^{s-1})$        $C^\infty$
thin-plate splines               global            $m$                                  $C^\infty$
1.3 Pricing and Hedging in Complete Markets
In the previous sections we saw the basic properties of regression methods and different basis functions. Now we present the application for which these tools are required: option valuation.
1.3.1 Terminology
Before we present the general framework for option valuation, we focus on the instruments we
seek to valuate. All financial options in this thesis depend on a single underlying12 S at specific
dates ti . We will write the value of the underlying at time ti as Sti .
A financial option13 is a contract between an option issuer and the option holder. An option
12 The underlying is also called asset, stock, spot price or equity.
13 A financial option is also called derivative, security or simply option.
gives the holder the right, but not the obligation to receive cash flows in the future dependent on
the value of the underlying S.
Consequently options are characterized by cash flows to the option holder. In this section we
introduce a few essential instruments, to which we will add others in the subsequent chapters if
required.
Definition 1.10 A European call (put) option is the right but not the obligation to receive StT − K
(K − StT ) at maturity time tT . The variable K is called strike price. European call (put) options are also
called vanilla options.
There are many other ways to formulate option contracts. One of the most common option features is an exercise opportunity for the option's holder. This can be specified for a certain date (discrete) or a specific time interval (continuous). We can also make the payoff dependent on the asset history to get a so-called path-dependent option.
Definition 1.11 An American call (put) option is the right to receive $S_{t_i} - K$ ($K - S_{t_i}$) at any time $t_i$ from initial time $t_0$ until maturity time $t_T$. The variable $K$ is called strike price. This option is called early exercisable because the option holder can receive the exercise value $S_{t_i} - K$ ($K - S_{t_i}$) prior to the option's maturity time $t_T$.
In order to find so-called fair values for these and other options, we proceed with the general
pricing framework.
1.3.2 General Framework
The valuation of a derivative security is a common task in mathematical finance. Here, we will
provide a brief summary of the model derivation which can also be found in e.g. [117, 64]. For
the Black-Scholes model, we need to make some assumptions and simplifications. The basic
assumptions are a frictionless market, no transaction costs, risk-less assets earn the risk-free rate r
and short selling is allowed without restrictions. Another more conceptual assumption is that the
price of the underlying S behaves like a geometric Brownian motion. That means that nobody
can foresee future stock prices and the asset evolves as
dSt = µSt dt + σSt dWt
(1.15)
where µ is the drift rate, σ is the volatility of S and dWt is the increment of a Wiener process.
We can now establish a portfolio Π consisting of the security of interest and a short position
of φ shares. The price of the security clearly depends on the underlying stock price St and time t
and will be noted as V (St , t) or just Vt , i.e.
Πt = V (St , t) − φt St .
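Using Itô's Lemma [67], we find that the portfolio changes can be described by
\begin{align*}
d\Pi_t &= dV_t - \phi_t\, dS_t \\
&= \frac{\partial V_t}{\partial t}\, dt + \frac{\partial V_t}{\partial S_t}\, dS_t + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2}\, dt - \phi_t\, dS_t. \tag{1.16}
\end{align*}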
If we choose
\[ \phi_t = \frac{\partial V_t}{\partial S_t} \tag{1.17} \]
we can make the portfolio independent of $dS_t$, the movements of the stock price. That means the portfolio is completely deterministic and carries no risk. Using the no-arbitrage principle14, this risk-free portfolio earns the risk-free rate $r$. Consequently, the changes in $\Pi_t$ are
\[ d\Pi_t = r \Pi_t\, dt. \tag{1.18} \]
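Now, we can use (1.16), (1.17) and (1.18) to get the relationship
\[ \frac{\partial V_t}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + r S_t \frac{\partial V_t}{\partial S_t} - r V_t = 0. \tag{1.19} \]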
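The intermediate step behind Equation (1.19) can be written out as follows. It uses only Equations (1.16) to (1.18) and the portfolio definition $\Pi_t = V(S_t, t) - \phi_t S_t$; no further assumption enters.

```latex
\begin{align*}
d\Pi_t &= \left( \frac{\partial V_t}{\partial t}
        + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} \right) dt
        && \text{(1.16) with } \phi_t = \frac{\partial V_t}{\partial S_t}, \\
r \Pi_t \, dt &= r \left( V_t - \frac{\partial V_t}{\partial S_t} S_t \right) dt
        && \text{(1.18) with } \Pi_t = V_t - \phi_t S_t .
\end{align*}
```

Equating both right-hand sides, dividing by $dt$ and collecting all terms on one side yields the partial differential equation for $V_t$.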
Knowing that at maturity time $t_T$ the option value equals the payoff $P(S_{t_T}, t_T)$,
\[ V(S_{t_T}, t_T) = P(S_{t_T}, t_T), \]
the partial differential Equation (1.19) can be solved by numerical methods as an initial value problem in backwards time. This kind of reasoning was published in 1973 by Fischer Black and Myron Scholes [17].
1.3.3 Exercisable Options
We now look at a possible early exercise by the holder of the security. By the no-arbitrage assumption, an exercise where the holder obtains the payoff P (St , t) leads to
V (St , t) ≥ P (St , t),
(1.20)
i.e. the early exercisable option will always have a value which is greater than or equal to the immediate exercise value. If this were not true, an investor would buy the option, exercise immediately and make a risk-less profit.
In order to find a representation of an exercisable option value similar to Equation (1.19), a little thought leads to the observation that the portfolio $\Pi$ can at most earn the risk-free rate if the no-arbitrage assumption is still to hold. Consequently, we get, instead of Equation (1.18), the relationship
\begin{align*}
d\Pi_t &\le r \Pi_t\, dt \\
\Leftrightarrow\quad \frac{\partial V_t}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + r S_t \frac{\partial V_t}{\partial S_t} - r V_t &\le 0. \tag{1.21}
\end{align*}
14 No-arbitrage means in this case that risk-less investments earn the risk-free rate and investments that earn more than the risk-free rate must be risky.
In the case of an American option the value $V$ is given by the solution to Equations (1.21) and (1.20), where at every point of the solution domain at least one of the two inequalities holds with equality. This is an initial value problem or Cauchy problem in backwards time $\tau = t_T - t$ with a free boundary, starting with the terminal condition, e.g. for an American put option with strike $K$
\[ V(S_{t_T}, t_T) = \max(K - S_{t_T}, 0). \]
Numerical schemes for this kind of PDE solution have been presented by several authors.
An efficient solution has been presented by Forsyth and Vetzal [50], which will also form the
foundation for the PDE reference methods used in this thesis. The next section proceeds with an
overview of the numerical methods currently used in the field of derivatives pricing.
1.4 Numerical Methods for Option Valuation
1.4.1 Overview
The challenge that remains after the introduction of the Black-Scholes framework is the efficient
valuation of arbitrary derivatives. Even though closed-form solutions of the governing Equation (1.19) can be found for European put and call options, no analytic solutions are available
for early exercisable options such as for the American option price which is governed by Equations (1.21) and (1.20). In other cases, the derivation of a closed-form solution might be possible,
but its evaluation still might be a challenge or the derivation too complex. In all these cases a
numerical tool is required which can deliver accurate option prices.
In the past few decades, several option valuation methods have been proposed. Besides the
analytic solution of the Black-Scholes PDE by Merton [87] one of the most common methods is
the Cox-Ross-Rubinstein method [36], which discretizes the asset price process by a binomial
tree. A simple recursive solution for the option value allows the valuation of early exercisable options. This method can be seen as a special case of the finite differences method by Schwartz [103]
which discretizes the Black-Scholes PDE directly and has better convergence properties than the
Cox-Ross-Rubinstein method. Another method with wide application is a Monte Carlo method
by Boyle [21] which simulates the underlying asset price process under a risk-neutral expectation.
Other methods were developed for special cases, e.g. a trinomial model by Boyle [22] for the valuation of options with two correlated underlyings. Furthermore, a multinomial-tree method by Andricopoulos et al. [9], which is based on quadrature methods for integration, has found some applications. This list is certainly not complete, but it assembles the main methods which are in use in academia and the financial industry.
In the following sections, we focus on numerical option valuation by Monte Carlo methods.
As a reference, we will use a solver for the Black-Scholes PDE similar to the finite differences
method.
1.4.2 Monte Carlo Methods
In numerous cases, the Monte Carlo method proved useful due to a simple implementation
and dimension independent convergence properties. Further details and advanced Monte Carlo
methods for option pricing can be found in [53].
Valuation of European Options
As we will see later in this section, the valuation of European style options is equivalent to the
integration of the expected terminal option value using the risk-neutral distribution of the underlying stock value at the maturity time tT of the option. Consequently, we focus on the integration
of functions by Monte Carlo methods, first. Assume that an option value $V$ is given by
\[ V = e^{-r t_T} \int_D P(S')\, p_{S_{t_T}}(S')\, dS' \tag{1.22} \]
where $P(S')$ is the option's payoff function and $p_{S_{t_T}}(S')$ is the risk-neutral probability density function of the stock price at the option's maturity. The area $D \subseteq \mathbb{R}^s$ is the domain of integration, where $s$ denotes the dimension of the state space, e.g. the number of underlying instruments defining the payoff at $t_T$.
The probability density function $p_{S_{t_T}}(S')$ can be obtained by the computation of Green's function to the Black-Scholes PDE. Naturally, it satisfies
\[ \int_{\mathbb{R}^s} p_{S_{t_T}}(S')\, dS' = 1. \]
If we now draw a sample $S^1, \ldots, S^n$ from the distribution given by $p_{S_{t_T}}(S')$, we can compute an estimate $\bar{V}^n$ for the integral given by (1.22),
\[ \bar{V}^n = e^{-r t_T} \frac{1}{n} \sum_{i=1}^{n} P(S^i). \]
The variance of this estimate is given by
\[ \left( \operatorname{std}[\bar{V}^n] \right)^2 = \frac{1}{n - 1} \sum_{i=1}^{n} \left( e^{-r t_T} P(S^i) - \bar{V}^n \right)^2. \]
If we standardize the distribution function of the error, then we get the
\[ \text{standardized error} = \frac{V - \bar{V}^n}{\operatorname{std}[\bar{V}^n]} \cdot \sqrt{n} \tag{1.23} \]
which converges with $n \to \infty$ to a standard normal distribution if the third moment $E\left[ |\text{standardized error}|^3 \right]$ exists and is finite15.
15 This is an immediate result of the Berry-Esseen Theorem [45] or, respectively, the central limit theorem.
From Equation (1.23), we can see that the standard deviation of the error $V - \bar{V}^n$ is equal to
\[ \frac{\operatorname{std}[\bar{V}^n]}{\sqrt{n}} \in O\!\left( \frac{1}{\sqrt{n}} \right). \tag{1.24} \]
The confidence interval at level $\alpha$ can be approximated by a standard normal distribution:
\[ \left[ \bar{V}^n - \frac{\operatorname{std}[\bar{V}^n]}{\sqrt{n}}\, \epsilon(\alpha),\; \bar{V}^n + \frac{\operatorname{std}[\bar{V}^n]}{\sqrt{n}}\, \epsilon(\alpha) \right]. \]
The value $\epsilon(\alpha)$ can be obtained by inverting the cumulative standard normal distribution $\Phi$ such that
\[ \operatorname{Prob}(|\text{standardized error}| \le \epsilon(\alpha)) = 1 - \alpha \]
holds, i.e. $\epsilon(\alpha)$ is the $(1 - \frac{\alpha}{2})$-quantile of the standard normal distribution. For frequently used confidence levels, the intervals are given by Table 1.2. It is important to note that the confidence limits only depend on the Monte Carlo sample value $\bar{V}^n$ as well as the number of samples $n$. They do not depend on the dimension $s$ of the state space, which makes this method suitable for the evaluation of high-dimensional option pricing problems.
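Table 1.2 Confidence intervals for different levels of confidence $1 - \alpha$.
Confidence level $1 - \alpha$    Interval
68%                              $\bar{V}^n \pm 1.00 \cdot \operatorname{std}[\bar{V}^n] / \sqrt{n}$
90%                              $\bar{V}^n \pm 1.65 \cdot \operatorname{std}[\bar{V}^n] / \sqrt{n}$
95%                              $\bar{V}^n \pm 1.96 \cdot \operatorname{std}[\bar{V}^n] / \sqrt{n}$
99%                              $\bar{V}^n \pm 2.58 \cdot \operatorname{std}[\bar{V}^n] / \sqrt{n}$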
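The factors in the table are quantiles of the standard normal distribution and can be recomputed. The following sketch uses `statistics.NormalDist` from the Python standard library; the function names and the numbers passed to `confidence_interval` are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import NormalDist


def epsilon(alpha):
    """(1 - alpha/2)-quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def confidence_interval(estimate, sample_std, n, alpha):
    """Two-sided interval estimate +/- epsilon(alpha) * std / sqrt(n)."""
    half_width = epsilon(alpha) * sample_std / sqrt(n)
    return estimate - half_width, estimate + half_width


for alpha in (0.10, 0.05, 0.01):
    print(f"1 - alpha = {1 - alpha:.2f}: epsilon = {epsilon(alpha):.3f}")

# Hypothetical numbers: estimate 3.85, sample standard deviation 4.5, 10^5 samples.
print(confidence_interval(3.85, 4.5, 100_000, 0.05))
```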
Considering a plain vanilla option in the Black-Scholes framework, the probability density of the terminal stock price values is given by16
\[ p_{S_{t_T}}(S) = \frac{1}{\sigma S \sqrt{2 \pi t_T}} \exp\left( - \frac{\left( \log(S / S_{t_0}) - (r - \frac{1}{2} \sigma^2)\, t_T \right)^2}{2 \sigma^2 t_T} \right). \tag{1.25} \]
But, the inversion of pStT in order to draw a sample S 1 , . . . , S n of terminal asset prices is computationally expensive. A better approach follows from the inspection of the PDE given by Equation (1.19): The Black-Scholes PDE is just the backward Kolmogorov Equation for the process
dSt = rSt dt + σSt dWt ,
(1.26)
which is the same process as in the derivation of Equation (1.19), except that the drift rate µ in
Equation (1.15) is replaced by the risk-free interest rate r.17
16 See [117], p. 94.
17 See Wilmott [117], p. 164 ff. for details.
This observation is very important: The dynamics of the process S required for the evaluation
of an option with S as an underlying is different from the real dynamics of S. The reason for this
is that not the dynamics of S itself are required for the valuation, but the dynamics of a hedged
portfolio which consists of a position in the underlying asset S, the option itself V and money in
a bank account.
In the following, we will call the real process of $S$, which is determined by Equation (1.15), the physical or real-world process, while we call the dynamics for the purpose of option valuation (Equation (1.26)) the risk-neutral process. Furthermore, we will denote expectations which require a risk-neutral dynamic of $S_t$ by $E^Q[S_t]$, while we will leave $E[S_t]$ for the expectation under the physical dynamic:
Definition 1.12 Given an option valuation problem in the Black-Scholes framework.
a) The dynamic of the asset S
dSt = µSt dt + σSt dWt ,
is called physical or real-world process. A formula, which requires an expectation of St using this dynamic
is denoted by
E[St ].
b) The dynamic of the asset S
dSt = rSt dt + σSt dWt ,
is called risk-neutral process. A formula, which requires an expectation of St using this dynamic is denoted
by
EQ [St ].
The easiest sampling technique for an equation such as Equation (1.26) is the Euler method. This method samples $n$ trajectories $S^j_t$, $j = 1, \ldots, n$ at several time steps $t_i$, $i = 1, \ldots, T$, starting at $S^1_{t_0} = S^2_{t_0} = \ldots = S^n_{t_0} =: S_{t_0}$ as
\[ S^j_{t_{i+1}} = S^j_{t_i} + r S^j_{t_i} (t_{i+1} - t_i) + \sigma S^j_{t_i} \theta_{i,j} \sqrt{t_{i+1} - t_i} \tag{1.27} \]
with $\theta_{i,j}$ drawn from a standard normal distribution.
It turns out that a better discretization can be found, which is given by
\[ S^j_{t_{i+1}} = S^j_{t_i} \exp\left( (r - \tfrac{1}{2} \sigma^2)(t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i}\; \theta_{i,j} \right). \tag{1.28} \]
This method is exact in the sense that the distribution of $S_{t_T}$ does not depend on intermediate time steps as in Equation (1.27)18. Consequently, only one time step is required for the valuation of a vanilla European option:
\[ S^j_{t_T} = S_{t_0} \exp\left( (r - \tfrac{1}{2} \sigma^2)(t_T - t_0) + \sigma \sqrt{t_T - t_0}\; \theta_j \right) \tag{1.29} \]
with $\theta_j$ drawn from a standard normal distribution.
18 See Wilmott [117], p. 924 for details.
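The difference between the two discretizations can be made visible on a single path. The sketch below is an illustration in plain Python; the parameter values ($S_{t_0} = 100$, $r = 0.05$, $\sigma = 0.2$, 50 steps) and the function names are assumptions. It applies the Euler step (1.27) and the exponential step (1.28) to the same normal draws and compares both with a single step of type (1.29) that uses the aggregated draw $\sum_i \theta_i / \sqrt{T}$.

```python
import math
import random


def euler_terminal(s0, r, sigma, dt, thetas):
    """Terminal value after repeated Euler steps, Equation (1.27)."""
    s = s0
    for theta in thetas:
        s = s + r * s * dt + sigma * s * theta * math.sqrt(dt)
    return s


def exact_terminal(s0, r, sigma, dt, thetas):
    """Terminal value after repeated exponential steps, Equation (1.28)."""
    s = s0
    for theta in thetas:
        s = s * math.exp((r - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * theta)
    return s


rng = random.Random(42)
steps, maturity = 50, 1.0
dt = maturity / steps
thetas = [rng.gauss(0.0, 1.0) for _ in range(steps)]

# The scaled sum of the draws is again a standard normal draw for one big step.
theta_single = sum(thetas) / math.sqrt(steps)

s_multi = exact_terminal(100.0, 0.05, 0.2, dt, thetas)
s_single = exact_terminal(100.0, 0.05, 0.2, maturity, [theta_single])
s_euler = euler_terminal(100.0, 0.05, 0.2, dt, thetas)
print(s_multi, s_single, s_euler)
```

The multi-step and single-step exponential schemes agree up to floating-point rounding, while the Euler path carries a discretization error that depends on the step size.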
Finally, we summarize the algorithm for pricing a European Put option in the Black-Scholes
framework in Table 1.3.
Table 1.3 Valuation of a European Put option in the Black-Scholes framework with risk-free rate $r$, volatility $\sigma$, maturity time $t_T$ and strike $K$.
1. Simulate $n$ risk-neutral asset price trajectories starting at the current price $S_{t_0}$, $j \in \{1, \ldots, n\}$,
\[ S^j_{t_T} = S_{t_0} \exp\left( (r - \tfrac{1}{2} \sigma^2)(t_T - t_0) + \sigma \sqrt{t_T - t_0}\; \theta_j \right). \]
2. Compute the average payoff and discount it with the risk-free rate $r$, i.e.
\[ \bar{V}^n = e^{-r (t_T - t_0)} \frac{1}{n} \sum_{j=1}^{n} \max(K - S^j_{t_T}, 0), \]
which provides an estimate for the option value $V$, i.e. $V \approx \bar{V}^n$.
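The two steps of Table 1.3 translate almost directly into code. The following is a minimal sketch in plain Python, not the implementation used in this thesis; the parameter values and function names are assumptions. As a cross-check it also evaluates the well-known closed-form Black-Scholes put price, which is not derived in this section.

```python
import math
import random
from statistics import NormalDist


def european_put_mc(s0, strike, r, sigma, maturity, n, seed=0):
    """Monte Carlo estimate and its standard error for a European put."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * maturity
    vol = sigma * math.sqrt(maturity)
    discount = math.exp(-r * maturity)
    discounted = []
    for _ in range(n):
        s_terminal = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))  # step 1
        discounted.append(discount * max(strike - s_terminal, 0.0))    # step 2
    mean = sum(discounted) / n
    variance = sum((p - mean) ** 2 for p in discounted) / (n - 1)
    return mean, math.sqrt(variance / n)


def black_scholes_put(s0, strike, r, sigma, maturity):
    """Closed-form reference value (standard Black-Scholes put formula)."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma**2) * maturity) / (sigma * math.sqrt(maturity))
    d2 = d1 - sigma * math.sqrt(maturity)
    cdf = NormalDist().cdf
    return strike * math.exp(-r * maturity) * cdf(-d2) - s0 * cdf(-d1)


# Hypothetical contract: S_0 = 36, K = 40, r = 6%, sigma = 20%, one year to maturity.
price, std_error = european_put_mc(36.0, 40.0, 0.06, 0.2, 1.0, n=100_000)
reference = black_scholes_put(36.0, 40.0, 0.06, 0.2, 1.0)
print(f"Monte Carlo: {price:.4f} +/- {1.96 * std_error:.4f}, closed form: {reference:.4f}")
```

The printed half-width corresponds to the 95% row of Table 1.2.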
Least-Squares Monte Carlo
Since Monte Carlo is a standard method which is used when dimensionality causes numerical
difficulties, we extend the method in Table 1.3 to exercisable options. We already saw exercisable
options in Section 1.3.3. But another way of formulating this mathematical problem can be used:
The price of an exercisable security is the discounted expected value of the payoff at the optimal
stopping time.
This can be formulated as an optimal stopping problem (see Carrière [32, Section 4]), with
\[ V(S_{t_0}, t_0) = \sup_{\tau \in \mathcal{T}} E^Q\left[ e^{-r (\tau - t_0)} P(S_\tau, \tau) \right], \tag{1.30} \]
where the asset price process $S_t$ evolves in the risk-neutral fashion
\[ dS_t = r S_t\, dt + \sigma S_t\, dW_t, \tag{1.31} \]
and $\mathcal{T}$ is the set of all possible stopping times.
This formulation as an optimal stopping problem (1.30) now leads to the idea of Monte Carlo
algorithms for pricing American options, because one only has to estimate an optimal stopping
(exercise) strategy within the Monte Carlo method for vanilla options. This way, the solution of
complex free boundary PDEs as defined by the Inequalities (1.21) and (1.20) is avoided.
One approach for Monte Carlo valuation of exercisable options is to parameterize the region
of optimal exercise with a function (see [14] and the references therein). The main other approach
is to approximate the conditional expected continuation value with a regression. This method
was first presented by Carrière [32] and is called Least-Squares Monte Carlo. This is the method
we will use in the following.
Similar to the previous section, we simulate different asset paths $S^j$, $j \in \{1, \ldots, n\}$, which follow Equation (1.28),
\[ S^j_{t_{i+1}} = S^j_{t_i} \exp\left( (r - \tfrac{1}{2} \sigma^2)(t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i}\; \theta_{i,j} \right) \]
with $\theta_{i,j}$ drawn from a standard normal distribution.
At each exercise time $t_i$, the holder decides to exercise the option and get the payoff $P(S_{t_i}, t_i)$ or to continue. In this case, the payoff $P(S_{t_i}, t_i)$ may only depend on the value of $S_{t_i}$ at time $t_i$, which can easily be extended to a dependence on the complete asset path's history as we will see in later chapters. In order to maximize the option value $V$, the holder exercises if
\[ P(S_{t_i}, t_i) \ge E^Q[V(S_{t_{i+1}}, t_{i+1}) \,|\, S_{t_i}, t_i] \]
with $E^Q[V(S_{t_{i+1}}, t_{i+1}) \,|\, S_{t_i}, t_i]$ denoting the expected option value under the risk-neutral measure $Q$ if the option is not exercised at time $t_i$. In the Least-Squares Monte Carlo approach, the value of $E^Q[V(S_{t_{i+1}}, t_{i+1}) \,|\, S_{t_i}, t_i]$ is approximated by
\[ P^e(S_{t_i}, t_i) \approx E^Q[V(S_{t_{i+1}}, t_{i+1}) \,|\, S_{t_i}, t_i]. \tag{1.32} \]
The value $P^e(S_{t_i}, t_i)$ is computed using a least-squares regression of many path-realizations $S^j$ on some basis functions $b_k$, i.e. the local basis approximation of $V(S_{t_{i+1}}, t_{i+1})$ given $S_{t_i}$ (Theorem 1.4). The regressions start at the time step $t_{T-1}$, i.e. one step before maturity time $t_T$. The approximated values are
\[ P^e(S_{t_i}, t_i) = \sum_k a^i_k b_k(S_{t_i}) \tag{1.33} \]
with some basis functions $b_k$ and unknown coefficients $a^i_k$ minimizing
\[ \left\| \left( \sum_k a^i_k b_k(S^j_{t_i}) - e^{-r (t_{i+1} - t_i)} V^j_{t_{i+1}} \right)_{j = 1, \ldots, n} \right\|_2 \tag{1.34} \]
where $V^j_{t_{i+1}}$ is the estimate of the option value for time $t_{i+1}$ using the Monte Carlo path realization $S^j$. The value of $V^j_{t_i}$ is given by the maximum between the estimated value of the unexercised option $P^e$ and the intrinsic value $P$,
\[ V^j_{t_i} = \begin{cases} e^{-r (t_{i+1} - t_i)} V^j_{t_{i+1}} & \text{if } P^e(S^j_{t_i}, t_i) > P(S^j_{t_i}, t_i) \\ P(S^j_{t_i}, t_i) & \text{else.} \end{cases} \tag{1.35} \]
In Section 1.2.1, we saw how the solution to Equation (1.33) respectively Equation (1.34) can be
computed.
Given that the option value at maturity time equals the payoff $V^j_{t_T} = P(S^j_{t_T}, t_T)$, a dynamic program solves for all values $V^j_{t_i}$, starting at time $t_T$ and iterating backwards to $t_0$. Based on the
value $V^j_{t_0}$, we can compute an approximation to the option value, which is known as the in-sample price,
\[ V^{\text{in}} = \frac{1}{n_1} \cdot \sum_{j=1}^{n_1} V^j_{t_0}, \]
where the asset paths are $S^j$, $j \in \{1, \dots, n_1\}$. This approach has an obvious disadvantage: each of the estimated option values $V^j_{t_0}$ contains information about its own future asset path $S^j_{t_{i+1}}$, because the regression coefficients were fitted to the same paths. In order to avoid this property, we compute the out-of-sample option price: we generate additional simulation paths $S^l$, $l \in \{n_1 + 1, \dots, n_1 + n_2\}$, but use the coefficients $a_k^i$ which were fitted to the old set of simulation paths $S^j$, $j \in \{1, \dots, n_1\}$. Consequently, the out-of-sample value cannot depend on knowledge of the future paths. It is given by
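\[ V^{out} = \frac{1}{n_2} \sum_{l=n_1+1}^{n_1+n_2} V^l_{t_0} \tag{1.36} \]
with
\[ V^l_{t_i} = \begin{cases} e^{-r(t_{i+1}-t_i)} V^l_{t_{i+1}} & \text{if } P^e(S^l_{t_i}, t_i) > P(S^l_{t_i}, t_i) \\ P(S^l_{t_i}, t_i) & \text{else} \end{cases}, \qquad l \in \{n_1 + 1, \dots, n_1 + n_2\}. \]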
Under optimal conditions, the in-sample and the out-of-sample price converge to the correct arbitrage-free price. However, in our computations, we will only compute the out-of-sample value because it is the value for which we can state the exercise policy without information about the future. Furthermore, the expected value of the out-of-sample price $V^{out}$ of an American option is always a lower bound for the option value. The crucial point for the convergence of Least-Squares Monte Carlo simulation is the estimate $P^e$: we are confined to finitely many samples and to finitely many degrees of freedom in the regressions and will not be able to represent the real shape of $E_Q[V(S_{t_{i+1}}, t_{i+1}) \mid S_{t_i}, t_i]$ perfectly. Thus, a less than optimal exercise strategy is performed, which provides a reduced option value.
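The backward induction of Equations (1.33) to (1.35) and the out-of-sample valuation of Equation (1.36) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this thesis: the basis 1, x, x^2 with x = S/K replaces the spline basis, the maturity of one year, the 10 exercise dates, the path counts and all function names are hypothetical choices, exercise is only allowed at in-the-money nodes (a safeguard that is not spelled out in the text), and no exercise decision is taken at t_0 because all paths coincide there.

```python
import math
import random


def simulate_paths(n, steps, s0, r, sigma, dt, rng):
    """Simulate n risk-neutral geometric Brownian motion paths."""
    drift, vol = (r - 0.5 * sigma ** 2) * dt, sigma * math.sqrt(dt)
    paths = []
    for _ in range(n):
        path = [s0]
        for _ in range(steps):
            path.append(path[-1] * math.exp(drift + vol * rng.gauss(0.0, 1.0)))
        paths.append(path)
    return paths


def basis(s, strike):
    """Hypothetical basis 1, x, x^2 in x = S/K (not the thesis's spline basis)."""
    x = s / strike
    return [1.0, x, x * x]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for k in range(c, n + 1):
                m[i][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def exercise(cont, intrinsic):
    """Exercise rule of (1.35), restricted to in-the-money nodes (added safeguard)."""
    return intrinsic > 0.0 and cont <= intrinsic


def fit(paths, strike, r, dt):
    """Backward induction (1.33)-(1.35); returns coefficients and in-sample price."""
    steps, disc = len(paths[0]) - 1, math.exp(-r * dt)
    values = [max(strike - p[-1], 0.0) for p in paths]      # V at maturity
    coeffs = {}
    for i in range(steps - 1, 0, -1):
        rows = [basis(p[i], strike) for p in paths]
        target = [disc * v for v in values]                 # discounted V at t_{i+1}
        nb = len(rows[0])
        ata = [[sum(row[a] * row[b] for row in rows) for b in range(nb)] for a in range(nb)]
        atb = [sum(row[a] * t for row, t in zip(rows, target)) for a in range(nb)]
        coeffs[i] = solve(ata, atb)                         # normal equations for (1.34)
        for j, p in enumerate(paths):
            cont = sum(c * b for c, b in zip(coeffs[i], rows[j]))
            intrinsic = max(strike - p[i], 0.0)
            values[j] = intrinsic if exercise(cont, intrinsic) else target[j]
    return coeffs, disc * sum(values) / len(values)


def out_of_sample(paths, coeffs, strike, r, dt):
    """Apply the fitted exercise policy to fresh paths, Equation (1.36)."""
    steps, total = len(paths[0]) - 1, 0.0
    for p in paths:
        payoff, when = max(strike - p[-1], 0.0), steps
        for i in range(1, steps):
            intrinsic = max(strike - p[i], 0.0)
            cont = sum(c * b for c, b in zip(coeffs[i], basis(p[i], strike)))
            if exercise(cont, intrinsic):
                payoff, when = intrinsic, i                 # first exercise date
                break
        total += math.exp(-r * dt * when) * payoff
    return total / len(paths)


rng = random.Random(0)
strike, s0, r, sigma, steps = 100.0, 100.0, 0.05, 0.4, 10
dt = 1.0 / steps                                            # one-year maturity assumed
coeffs, v_in = fit(simulate_paths(5000, steps, s0, r, sigma, dt, rng), strike, r, dt)
v_out = out_of_sample(simulate_paths(5000, steps, s0, r, sigma, dt, rng), coeffs, strike, r, dt)
print(f"in-sample {v_in:.3f}, out-of-sample {v_out:.3f}")
```

Stopping each fresh path at the first date where the fitted rule triggers is equivalent to unrolling the recursion below Equation (1.36) from t_0.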
Notes on the Convergence of Least-Squares Monte Carlo
Figure 1.3 presents American put option value estimates computed with Least-Squares Monte
Carlo and different numbers of cubic spline basis functions.
While the Least-Squares Monte Carlo estimates with $n_1 + n_2 = 10^5$ asset paths already reach a minimal error with two basis functions, the estimates with $n_1 + n_2 = 10^7$ asset paths reach a minimal error at 10 basis functions. The remaining error is mainly due to the finite number of time steps ($T = 50$). This means that the optimal number of basis functions has to be chosen depending on the number of asset paths. With more than 10 basis functions, the error increases due to overfitting of the conditional expectation function.
In the following chapters, we will discuss for each specific problem which basis functions $b_k$ are suitable.
[Plot: average absolute error (logarithmic axis, 10^-2 to 10^-1) against the number of basis functions (2 to 18), one curve each for 10^5 paths and 10^7 paths.]
Figure 1.3: The average error of the Least-Squares Monte Carlo price estimate of an American put
option. The put has a strike K = 100, the asset price is St0 = 100, volatility σ = 0.4, risk-free rate
r = 0.05. Different numbers of basis functions are used within the 50 time steps Least-Squares
Monte Carlo. The PDE reference value is 13.66761.
1.4.3 Direct PDE Methods
There are several direct PDE solver methods. The common ones include finite differences [103,
117], finite elements [50] and finite volume methods [124]. Other methods are Laplace and Fourier
transform methods [117] as well as meshless methods [80].
In the following, we present a simple finite element method, which is sufficient for our purposes as a pricing engine. In some cases, this thesis will refer to more advanced techniques. These techniques are mainly based on the methods presented by Forsyth et al. [50].
European Options
In this section, we want to build a very simple PDE solver for the Black-Scholes equation. The Black-Scholes PDE is a Cauchy problem in backwards time $\tau$ where the initial values are given by the payoff at maturity. As usual, at time $t$ we denote the asset price by $S_t$, the value of the option by $V_t$, the volatility of the asset by $\sigma$ and the risk-free rate by $r$. The solver is an implicit finite volume discretization of
\[ \frac{\partial V_\tau}{\partial \tau} = \frac{1}{2} \sigma^2 S_\tau^2 \frac{\partial^2 V_\tau}{\partial S_\tau^2} + r S_\tau \frac{\partial V_\tau}{\partial S_\tau} - r V_\tau \tag{1.37} \]
working backwards in time from maturity to present time $t_0$. This equation is equivalent to Equation (1.19), using backwards time $\tau = t_T - t$. We integrate Equation (1.37) over a finite volume $A^i$ using the discrete values $S^i$, $i = 0, \dots, m$, where
\[ A^i = \left( \frac{S^{i+1} + S^i}{2} \right) - \left( \frac{S^i + S^{i-1}}{2} \right) \]
with $S^0 = 0$ and $S^m$ sufficiently large (see footnote 19). That means that the cell boundaries are placed halfway between the nodes at
\[ S^{i+1/2} = \frac{S^i + S^{i+1}}{2}, \qquad S^{i-1/2} = \frac{S^i + S^{i-1}}{2}. \]
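After the integration of Equation (1.37) over the $i$th cell, we obtain
\[ \int_{A^i} \frac{\partial V_\tau}{\partial \tau} \, ds = \int_{A^i} \frac{\sigma^2}{2} s^2 \frac{\partial^2 V_\tau}{\partial S_\tau^2} \, ds + \int_{A^i} r s \frac{\partial V_\tau}{\partial S_\tau} \, ds - \int_{A^i} r V_\tau \, ds, \]
where $V_\tau$ is a function of $S_\tau$ and $\tau$, i.e. $V_\tau = V(S_\tau, \tau) = V(S_t, t)$, $\tau = t_T - t$. In the following, we denote the value of the option at time $\tau_j$ and asset price $S^i$ as
\[ V(S^i_{\tau_j}, \tau_j) = V^i_j. \]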
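Using approximations we get
\begin{align*}
\int_{A^i} \frac{\sigma^2}{2} s^2 \frac{\partial^2 V_\tau}{\partial S_\tau^2} \, ds
 &\approx \frac{\sigma^2}{2} (S^i)^2 \int_{A^i} \frac{\partial^2 V_\tau}{\partial S_\tau^2} \, ds \\
 &= \frac{\sigma^2}{2} (S^i)^2 \left( \left( \frac{\partial V_\tau}{\partial S_\tau} \right)^{i+1/2} - \left( \frac{\partial V_\tau}{\partial S_\tau} \right)^{i-1/2} \right) \\
 &\approx \frac{\sigma^2}{2} (S^i)^2 \left( \frac{V^{i+1} - V^i}{S^{i+1} - S^i} + \frac{V^{i-1} - V^i}{S^i - S^{i-1}} \right).
\end{align*}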
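Furthermore, we get for the other terms
\begin{align*}
\int_{A^i} r V \, ds &\approx r V^i A^i, \\
\int_{A^i} r s \frac{\partial V_\tau}{\partial S_\tau} \, ds &\approx r S^i \left[ V^{i+1/2} - V^{i-1/2} \right] = r S^i \left[ \frac{V^{i+1} - V^{i-1}}{2} \right].
\end{align*}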
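On the left-hand side of Equation (1.37), we get
\[ \int_{A^i} \frac{\partial V_\tau}{\partial \tau} \, ds \approx A^i \left[ \frac{V^i_{j+1} - V^i_j}{\Delta\tau} \right]. \]
Footnote 19: For a European call option, $S^m = e^{10 \sigma \sqrt{t_T - t_i}} K$ is enough for many realistic settings [49, p.21].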
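All the above equations finally lead to
\[ A^i \left[ \frac{V^i_{j+1} - V^i_j}{\Delta\tau} \right] = \frac{\sigma^2}{2} (S^i)^2 \left( \frac{V^{i+1} - V^i}{S^{i+1} - S^i} + \frac{V^{i-1} - V^i}{S^i - S^{i-1}} \right) + r S^i \left[ \frac{V^{i+1} - V^{i-1}}{2} \right] - r V^i A^i \]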
where we have to define the time level of the right-hand side. Choosing a time level $j$ for the right-hand side, we get a so-called explicit discretization; choosing a level of $j + 1$, the discretization is called implicit. The implicit discretization is more stable, so we will use
\[ A^i \left[ \frac{V^i_{j+1} - V^i_j}{\Delta\tau} \right] = \frac{\sigma^2}{2} (S^i)^2 \left( \frac{V^{i+1}_{j+1} - V^i_{j+1}}{S^{i+1} - S^i} + \frac{V^{i-1}_{j+1} - V^i_{j+1}}{S^i - S^{i-1}} \right) + r S^i \frac{V^{i+1}_{j+1} - V^{i-1}_{j+1}}{2} - r V^i_{j+1} A^i \tag{1.38} \]
in the following. The final algorithm of the PDE solver is obtained by rearranging Equation (1.38) into a matrix $M$ such that the linear equation
\[ M \cdot V_{j+1} = V_j \]
with the known vector $V_j = (V^1_j \dots V^m_j)$ and the unknown $V_{j+1} = (V^1_{j+1} \dots V^m_{j+1})$ can be solved. Note that the time stepping is conducted in backwards time $\tau_j$, i.e. $\tau_0 = 0$ corresponds to maturity $t = t_T$ and $\tau_T = t_T$ corresponds to $t = 0$.
Consequently, the boundary values for a put option valuation, i.e. the first and the last element of $V_j$, can be assumed as
\[ V^0_{j+1} = V^0_j (1 - r \Delta\tau), \quad \text{i.e.} \quad \frac{V^0_j - V^0_{j+1}}{V^0_j} = r \Delta\tau, \qquad V^m_j = 0 \;\; \forall j. \]
Other boundary settings are useful as well; in particular, the condition that the second derivative $\partial^2 V / \partial S^2$ should be zero at the boundary is often used.
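A minimal sketch of the implicit time stepping of Equation (1.38) for a European put is given below. It is an illustration under assumptions rather than the solver used in this thesis: the uniform grid with $S^m = 400$, the 200 time steps, the one-year maturity and the function names are hypothetical choices, and the tridiagonal system $M V_{j+1} = V_j$ is solved with the Thomas algorithm. The Black-Scholes formula serves only as a check.

```python
import math


def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def european_put_fv(strike, r, sigma, maturity, s_max=400.0, m=400, steps=200):
    """Implicit finite volume stepping of Equation (1.38) in backwards time."""
    s = [s_max * i / m for i in range(m + 1)]          # nodes S^0 = 0, ..., S^m
    v = [max(strike - x, 0.0) for x in s]              # payoff at tau = 0
    dtau = maturity / steps
    lower, diag, upper = [0.0] * (m + 1), [1.0] * (m + 1), [0.0] * (m + 1)
    for i in range(1, m):
        area = 0.5 * (s[i + 1] - s[i - 1])             # cell size A^i
        alpha = 0.5 * sigma ** 2 * s[i] ** 2 / (s[i] - s[i - 1])
        beta = 0.5 * sigma ** 2 * s[i] ** 2 / (s[i + 1] - s[i])
        conv = 0.5 * r * s[i]
        k = dtau / area
        lower[i] = -k * (alpha - conv)                 # coefficient of V^{i-1}_{j+1}
        upper[i] = -k * (beta + conv)                  # coefficient of V^{i+1}_{j+1}
        diag[i] = 1.0 + k * (alpha + beta) + r * dtau  # coefficient of V^i_{j+1}
    for _ in range(steps):
        rhs = v[:]
        rhs[0] = v[0] * (1.0 - r * dtau)               # boundary at S = 0
        rhs[m] = 0.0                                   # boundary at S = S^m
        v = thomas(lower, diag, upper, rhs)
    return s, v


def bs_put(s0, strike, r, sigma, maturity):
    """Closed-form Black-Scholes put price, used here only as a reference."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma ** 2) * maturity) / (sigma * math.sqrt(maturity))
    d2 = d1 - sigma * math.sqrt(maturity)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return strike * math.exp(-r * maturity) * cdf(-d2) - s0 * cdf(-d1)


grid, values = european_put_fv(100.0, 0.05, 0.4, 1.0)
pde_price = values[100]                                # node with S = 100 on this grid
exact = bs_put(100.0, 100.0, 0.05, 0.4, 1.0)
print(f"finite volume {pde_price:.4f}, closed form {exact:.4f}")
```

The first and last rows of the matrix are identity rows, so the two boundary values enter through the right-hand side only.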
Exercisable Options
Previously, we saw that the valuation of exercisable options can be formulated (Equations (1.21) and (1.20)) as
\begin{align*}
V(S_t, t) &\ge P(S_t, t), \\
\frac{\partial V_t}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + r S_t \frac{\partial V_t}{\partial S_t} - r V_t &\le 0,
\end{align*}
where at least one of the inequalities holds with equality on the complete solution.
In order to solve this setting with a so-called free boundary, we can use basically the same equation as in the case of non-exercisable options (Equation (1.38)):
\[ A^i \left[ \frac{\tilde V^i_{j+1} - \tilde V^i_j}{\Delta\tau} \right] = \frac{\sigma^2}{2} (S^i)^2 \left( \frac{\tilde V^{i+1}_{j+1} - \tilde V^i_{j+1}}{S^{i+1} - S^i} + \frac{\tilde V^{i-1}_{j+1} - \tilde V^i_{j+1}}{S^i - S^{i-1}} \right) + r S^i \frac{\tilde V^{i+1}_{j+1} - \tilde V^{i-1}_{j+1}}{2} - r \tilde V^i_{j+1} A^i, \]
except that we simply set the node values $V^i_{j+1}$ to the exercise value $P(S^i, t)$ if they are lower than the exercise value (see Wilmott [117, p. 906]):
\[ \tilde V^i_{j+1} = \max(P(S^i, t), V^i_{j+1}). \]
This solver is sufficient for the purpose of this thesis. However, a better method is given by
internal iterations using a penalty method [50].
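The projection step can be sketched as follows for an American put. This is an illustration under assumptions, not the code used in this thesis: the grid, the 400 time steps, the one-year maturity and the function names are hypothetical, and the penalty method of [50] is not implemented. After each unconstrained implicit step, every node value is raised to the exercise value where necessary.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def american_put_fv(strike, r, sigma, maturity, s_max=400.0, m=400, steps=400):
    """Implicit step of Equation (1.38) followed by V = max(P, V) at every node."""
    s = [s_max * i / m for i in range(m + 1)]
    payoff = [max(strike - x, 0.0) for x in s]
    v, dtau = payoff[:], maturity / steps
    lower, diag, upper = [0.0] * (m + 1), [1.0] * (m + 1), [0.0] * (m + 1)
    for i in range(1, m):
        area = 0.5 * (s[i + 1] - s[i - 1])             # cell size A^i
        alpha = 0.5 * sigma ** 2 * s[i] ** 2 / (s[i] - s[i - 1])
        beta = 0.5 * sigma ** 2 * s[i] ** 2 / (s[i + 1] - s[i])
        conv = 0.5 * r * s[i]
        k = dtau / area
        lower[i] = -k * (alpha - conv)
        upper[i] = -k * (beta + conv)
        diag[i] = 1.0 + k * (alpha + beta) + r * dtau
    for _ in range(steps):
        rhs = v[:]
        rhs[0], rhs[m] = v[0] * (1.0 - r * dtau), 0.0  # boundary values
        v = thomas(lower, diag, upper, rhs)            # unconstrained implicit step
        v = [max(p, x) for p, x in zip(payoff, v)]     # set to exercise value if lower
    return s, v


grid, values = american_put_fv(100.0, 0.05, 0.4, 1.0)
american_price = values[100]                           # node with S = 100 on this grid
print(f"American put at S = 100: {american_price:.4f}")
```

Because the constraint is applied only after the linear solve, the result is first-order accurate in the time step at best, which is one reason why the penalty iteration is the better method.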
Chapter 2
The Challenge of Path Dependency
2.1 Overview
A challenging problem in option pricing is the evaluation of path-dependent options. This chapter presents a method which can accelerate the convergence of Monte Carlo pricing [21, 23, 53, 32, 81] significantly. The method is extended such that Monte Carlo simulation and numerical integration methods are combined in a consistent framework called Feature Extraction. As an example of the efficiency of the framework, the computational effort of pricing different types of Parisian and Asian style options is compared to the effort of classical Monte Carlo and PDE pricing. In particular, the fast pricing of a moving window Parisian option is presented using a PDE solver [103, 50] as an efficient tool for the required numerical integration.
While this chapter focuses on options which can only be exercised at maturity time, the later
chapters will address options with an early exercise.
2.2 Introduction
For the numerical evaluation of option prices in the Black-Scholes model, three approaches come to mind, namely direct numerical integration, solving the Black-Scholes partial differential equation (PDE), and Monte Carlo simulation. Whereas the first two methods are fast and accurate in many cases, they encounter considerable difficulties for path-dependent options. If the path dependence involves more than a single or a few additional state variables, the PDE approach may be intractable altogether. Monte Carlo methods, on the other hand, allow for complex path dependencies (high-dimensional problems), but their efficiency is limited by their relatively low convergence rate, as presented in the previous chapter.
In this chapter we suggest combining elements of Monte Carlo simulation and direct numerical integration in order to increase the accuracy of the former. We call this method Feature Extraction because the feature of the option's payoff which is hard to compute by other means is estimated from a Monte Carlo simulation. This feature is then used in a numerical integration method, which would not have been possible before. In this context, we use a PDE solver as an efficient tool for the numerical integration.
For a path-dependent option, we therefore proceed in two steps. First, a Monte Carlo simulation is used to replace the given path-dependent option with a European-style option which has the same price. In a second step, this hypothetical European option is evaluated by direct numerical integration or by solving the Black-Scholes PDE.
Why does this procedure lead to a higher accuracy? In many concrete cases, the value StT of
the underlying at expiration contributes considerably to the actual payoff of the option. The variability of the payoff among all paths of the underlying with terminal value StT is often much
smaller than the variability of the payoff among all conceivable asset price movements. The
possibly slow and inaccurate Monte Carlo step in the Feature Extraction contributes only to the
variability that cannot be explained by the asset’s terminal value StT , thus leading to a higher
precision in the end.
The main idea of this chapter was first presented by Grau [55]. In contrast to this earlier work,
this chapter provides more insight into the method and derives its correctness.
The chapter is organized as follows. In the next section we explain the general idea of the
combined approach and compare the relative performance in the case of a discretely sampled
Asian option. Subsequently, an iterative extension of the method is applied to moving window
Parisian options. The last section concludes.
2.3 Pricing Using Feature Extraction
We denote by $S = (S_t)_{t \in [0, t_T]}$ the price process of an asset in the Black-Scholes model with constant volatility $\sigma$ and riskless interest rate $r$. Our aim is to price an option with a payoff $P(\mathbf{S})$, $\mathbf{S} := \{S_\tau \mid \tau \in I\}$, $I \subseteq \{t_0, t_1, \dots, t_T\}$, at $t_T$ which may depend on the whole path history of $S$. The fair initial price of the option is given by
\[ V_{t_0} = e^{-r t_T} E_Q[P(\mathbf{S})], \]
where the expectation is taken under the risk-neutral measure given an initial asset price $S_{t_0}$. If $E_Q[P(\mathbf{S}) \mid S_{t_T} = s]$ denotes the conditional expectation of the payoff given that the asset price terminates at $s$, it follows that
\[ V_{t_0} = e^{-r t_T} \int_0^\infty E_Q[P(\mathbf{S}) \mid S_{t_T} = s] \cdot p_{S_{t_T}}(s) \, ds, \tag{2.1} \]
where $p_{S_{t_T}}$ is the probability density function of $S_{t_T}$ given an initial asset price $S_{t_0}$ (Equation (1.25)),
\[ p_{S_{t_T}}(s) = \frac{1}{\sigma s \sqrt{2 \pi t_T}} \exp\left( - \frac{\left( \log(s / S_{t_0}) - (r - \frac{1}{2} \sigma^2) t_T \right)^2}{2 \sigma^2 t_T} \right). \]
Now, we define the Feature Extraction as a method which separates the option price computation into the two parts in Equation (2.1), namely into the conditional expected payoff function $E_Q[P(\mathbf{S}) \mid S_{t_T} = s]$ and the corresponding probability density function $p_{S_{t_T}}(s)$. In the later sections, we will see Asian options, where $p_{S_{t_T}}(s)$ is known analytically and $E_Q[P(\mathbf{S}) \mid S_{t_T} = s]$ has to be estimated numerically. Furthermore, we will see Parisian options, where $p_{S_{t_T}}(s)$ has to be estimated numerically and $E_Q[P(\mathbf{S}) \mid S_{t_T} = s]$ is known at each time step of an induction in backwards time. Feature Extraction means using as much analytical information in an option pricing process as possible.
Suppose that $p_{S_{t_T}}(s)$ is known and $E_Q[P(\mathbf{S}) \mid S_{t_T} = s]$ is not known. The intuition behind this method is that we interpret
\[ \tilde f(s) := E_Q[P(\mathbf{S}) \mid S_{t_T} = s] \]
as the payoff of a hypothetical European-style option, which is computed by Monte Carlo simulation. In a second step, the integral in (2.1) is evaluated directly or by numerical solution of the corresponding PDE. As noted in the introduction, the conditional variance $\mathrm{var}(P(\mathbf{S}) \mid S_{t_T} = s)$ is typically much lower than the total variance $\mathrm{var}(P(\mathbf{S}))$, which leads to a higher precision compared to an unconditional Monte Carlo simulation.
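One way to make this variance argument precise is the law of total variance. The identity is standard but is not stated in the original text; the labels under the braces are an interpretation, with all moments taken under $Q$.

```latex
\[
\operatorname{var}\bigl(P(\mathbf{S})\bigr)
  = \underbrace{E_Q\bigl[\operatorname{var}\bigl(P(\mathbf{S}) \mid S_{t_T}\bigr)\bigr]}_{\text{left to the Monte Carlo step}}
  + \underbrace{\operatorname{var}\bigl(E_Q[P(\mathbf{S}) \mid S_{t_T}]\bigr)}_{\text{covered by the known density } p_{S_{t_T}}}
\]
```

Only the first term has to be averaged out by simulation when the density of the terminal asset price is known, so the method gains most when the terminal asset price explains a large share of the payoff variance.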
Since this algorithm works by integrating the payoff of a hypothetical European option, it is clear that an extension to options with early exercise features is not easy. Another obvious limitation is that the probability density function of the terminal asset price distribution has to be known with high precision, ideally in closed form. Apart from these limitations, the pricing of any option with a payoff that can be computed by a simple forward simulation and an underlying price process with a known probability density can profit from the presented approach.
Let us illustrate the approach in the case of a discretely sampled Asian option in the next
section followed by a Parisian option.
2.3.1 A Discretely Sampled Asian Option
An Asian option is a European option with a payoff $P(\mathbf{S})$ that depends on the average $I$ of the past stock prices and the strike $K$. For a discretely sampled Asian option with sample dates $t_0 = 0, t_1, \dots, t_T$, the payoff $P$ is
\[ P(\mathbf{S}) = \max(I(\mathbf{S}) - K, 0) \]
for an Asian call option and
\[ P(\mathbf{S}) = \max(K - I(\mathbf{S}), 0) \]
for an Asian put option, with
\[ I(\mathbf{S}) = \frac{1}{T + 1} \sum_{i=0}^{T} S_{t_i}. \]
As presented in Chapter 1, in the classical Monte Carlo pricing method, one would estimate the discounted expected payoff by the mean of the payoff of $n$ simulated asset paths,
\[ V_{t_0} = e^{-r t_T} E_Q[P(\mathbf{S})] \approx e^{-r t_T} \frac{1}{n} \sum_{j=1}^{n} P(\mathbf{S}^j) \]
with
\[ P(\mathbf{S}^j) = \max\left( \left( \frac{1}{T + 1} \sum_{i=0}^{T} S^j_{t_i} \right) - K, 0 \right) \]
for the Asian call,
\[ P(\mathbf{S}^j) = \max\left( K - \left( \frac{1}{T + 1} \sum_{i=0}^{T} S^j_{t_i} \right), 0 \right) \]
for the Asian put, with
\[ S^j_{t_i} = S^j_{t_{i-1}} \, e^{(r - \frac{1}{2} \sigma^2)(t_i - t_{i-1}) + \sigma \sqrt{t_i - t_{i-1}} \, \theta_{i,j}}, \tag{2.2} \]
where $\theta_{i,j}$, $i = 1, \dots, T$, $j = 1, \dots, n$, denote independent realizations of a random variable drawn from a standard normal distribution.
The value $V_{t_0}$ is the no-arbitrage price in the standard Black-Scholes option pricing model.
As explained in the previous section (see Equation (2.1)), the Asian option pricing problem
can be divided into an expected payoff function and the probability density function pStT of the
asset price at maturity. The PDF is known (see Equation (1.25)) and the expected payoff function
can be estimated by Monte Carlo simulations.
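For reference, the classical Monte Carlo estimate based on Equation (2.2) can be sketched as below. The function names, the seed and the number of paths are hypothetical; the parameters in the usage lines are those of the three-date Asian put that serves as the simple example in this chapter, for which a reference value of about 6.97 is quoted there.

```python
import math
import random


def asian_payoff(path, strike, call=True):
    """Payoff on the arithmetic average I(S) over all sample dates t_0, ..., t_T."""
    average = sum(path) / len(path)
    return max(average - strike, 0.0) if call else max(strike - average, 0.0)


def simulate_path(s0, r, sigma, times, rng):
    """One asset path at the sample dates, following Equation (2.2)."""
    path = [s0]
    for t_prev, t in zip(times, times[1:]):
        dt = t - t_prev
        theta = rng.gauss(0.0, 1.0)                    # standard normal draw
        path.append(path[-1] * math.exp((r - 0.5 * sigma ** 2) * dt
                                        + sigma * math.sqrt(dt) * theta))
    return path


def asian_mc_price(s0, strike, r, sigma, times, n, seed=0, call=True):
    """Discounted mean payoff and its standard error std/sqrt(n)."""
    rng = random.Random(seed)
    payoffs = [asian_payoff(simulate_path(s0, r, sigma, times, rng), strike, call)
               for _ in range(n)]
    mean = sum(payoffs) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in payoffs) / (n - 1))
    disc = math.exp(-r * times[-1])
    return disc * mean, disc * std / math.sqrt(n)


price, std_err = asian_mc_price(100.0, 100.0, 0.05, 0.4, [0.0, 0.5, 1.0], 20000, call=False)
print(f"Asian put: {price:.3f} +/- {std_err:.3f}")
```

The standard error returned alongside the price decays only like one over the square root of the number of paths, which is the slow convergence that the Feature Extraction is meant to improve.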
In order to compute the expected payoff function in the case of an Asian option, we have to compute the payoff for all possible paths starting at $S_{t_0}$. Consider a Monte Carlo simulation with stock price paths $\mathbf{S}^j$ and payoffs $P(\mathbf{S}^j)$, $j = 1, \dots, n$. As in Section 1.2.1, an estimate for the conditional payoff function $\tilde f \approx E_Q[P(\mathbf{S}) \mid S_{t_T} = s]$ is generated by a regression on basis functions. In this case, a B-spline $f^{spline}(S_{t_T}) = \sum_k a_k b_k(S_{t_T})$ proves to be useful. The required regression is given by
\[ \min_{f^{spline}} \sum_{j=1}^{n} \left\| P(\mathbf{S}^j) - f^{spline}(S^j_{t_T}) \right\|_2^2 \;\Leftrightarrow\; \min_{a_1, \dots, a_m} \sum_{j=1}^{n} \left\| P(\mathbf{S}^j) - \sum_{k=1}^{m} a_k b_k(S^j_{t_T}) \right\|_2^2, \tag{2.3} \]
i.e.
\[ \tilde f(s) = E_Q[P(\mathbf{S}) \mid S_{t_T} = s] \approx f^{spline}(s). \tag{2.4} \]
This can easily be done by using the local basis approximation as presented in Chapter 1. The regression is only one-dimensional (on the asset price at maturity), so that the spline is the best choice for the basis. Since this least-squares regression leads to an estimate for the conditional expectation (see Section 1.2.1), this kind of regression produces exactly the results we need. Due to the smoothness of the solution, a small number of cubic spline basis functions (3-5) already generates an acceptable accuracy. A simple implementation can be found in Appendix 7.5.
Figure 2.1 demonstrates the smoothing effect for an Asian option by comparing the conditional payoff of Monte Carlo simulations with the spline approximation of the expected conditional payoff function. On the left, we can see the terminal distribution of the asset price as a
histogram and the realized payoff values as dots. Performing a regression on the data of these
dots, we obtain the line for the conditional expected value of the payoff on the right of the figure.
Instead of using the empirical distribution of the asset price as demonstrated by the histogram on
the left, the Feature Extraction now uses the analytic solution shown as probability density on the
right.
An alternative to a regression on the asset price is the simulation of only very specific asset paths. Using Brownian bridges, one can basically choose the terminal asset price for each path. That means one can determine the conditional payoff on a grid of values of $S_{t_T}$ and interpolate the values to obtain $\tilde f(S_{t_T})$.
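The Brownian bridge alternative can be sketched as follows. This is a hedged illustration, not part of the original implementation: the function names are hypothetical, and the sampler relies on the standard fact that the log-price, conditioned on its terminal value, is a Brownian bridge whose law does not depend on the drift.

```python
import math
import random


def bridge_path(s0, s_terminal, sigma, times, rng):
    """Sample S at `times` (times[0] = 0) conditional on S(times[-1]) = s_terminal."""
    x, x_end, t_end = math.log(s0), math.log(s_terminal), times[-1]
    path = [s0]
    for t_prev, t in zip(times, times[1:]):
        remaining = t_end - t_prev
        dt = t - t_prev
        mean = x + (dt / remaining) * (x_end - x)              # pulled towards the end point
        var = sigma ** 2 * dt * (t_end - t) / remaining        # zero at the last date
        x = mean + math.sqrt(var) * rng.gauss(0.0, 1.0)
        path.append(math.exp(x))
    return path


def conditional_put_payoff(s_terminal, n, rng, s0=100.0, strike=100.0, sigma=0.4):
    """Average Asian put payoff over n bridge paths ending at s_terminal."""
    times = [0.0, 0.5, 1.0]                                    # hypothetical sample dates
    total = 0.0
    for _ in range(n):
        path = bridge_path(s0, s_terminal, sigma, times, rng)
        total += max(strike - sum(path) / len(path), 0.0)
    return total / n


rng = random.Random(1)
example = bridge_path(100.0, 120.0, 0.4, [0.0, 0.25, 0.5, 0.75, 1.0], rng)
grid = [60.0, 80.0, 100.0, 120.0, 140.0]
conditional = [conditional_put_payoff(s, 500, rng) for s in grid]
print(example[-1], conditional)
```

The list `conditional` holds one estimate of the conditional payoff per grid value, which could then be interpolated as described above.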
[Figure 2.1, two panels with axes time t and asset price S. Left panel: histogram of asset paths at maturity (Monte Carlo error) and value of payoff at maturity (Monte Carlo). Right panel: distribution $p_{S_T}$ of asset paths at maturity (known as closed-form solution, no Monte Carlo error) and expected value of payoff at maturity (spline regression on Monte Carlo, Monte Carlo error).]
Figure 2.1: On the left, an Asian option with the conditional payoff of each asset path and the
path distribution histogram is presented. The right figure demonstrates the smoothing effect
taking place using the PDF and the approximation of the expected conditional payoff function.
2.3.2 Simple Example
To clarify the algorithm, this section will proceed in a step by step fashion and explain every
computational task of the evaluation of the discretely sampled Asian option with the Feature
Extraction method. Consider an Asian option with data in Table 2.1. This is a simple Asian put
option with three observation dates for the averaging: t0 = 0, t1 = 0.5 and tT = t2 = 1.
Table 2.1: Data of an Asian put option with three averaging sample dates.

General features
Independent variable I         $\frac{1}{3} \sum_{i=0}^{2} S_{t_i}$
Strike price K                 100
Risk-free rate r               5% p.a.
Current stock price S_{t_0}    100
Volatility σ                   40% p.a.
Maturity time t_T              1 year
Observations                   every 0.5 years
Payoff at maturity P           max(K − I, 0)

We start the evaluation by simulating the underlying's paths using Equation (2.2),
\[ S^j_{t_i} = S^j_{t_{i-1}} \, e^{(r - \frac{1}{2} \sigma^2)(t_i - t_{i-1}) + \sigma \sqrt{t_i - t_{i-1}} \, \theta_{i,j}}, \]
and the data in Table 2.1. With $S^j_{t_0} = 100$ and random numbers $\theta_{i,j}$, $j = 1, \dots, 10$, $i = 1, 2$, we get 10 asset paths:
j     S^j_{t_0}    S^j_{t_1}    S^j_{t_2}
1     100          81.6340      61.7521
2     100          120.7011     86.3528
3     100          89.1249      84.7387
4     100          118.4613     87.9178
5     100          104.5817     118.0245
6     100          81.0836      86.1567
7     100          58.9702      40.8700
8     100          101.6986     72.1828
9     100          65.8471      65.5679
10    100          119.6538     133.2725
Now, we can compute the payoff of the Asian option for each path. With
\[ P^j(S^j_{t_0}, S^j_{t_1}, S^j_{t_2}) = \max\left( 100 - \frac{1}{3} \sum_{i=0}^{2} S^j_{t_i}, \, 0 \right), \quad j = 1, \dots, 10, \]
we obtain
\[ \mathbf{P} := \mathbf{P}(S_{t_0}, S_{t_1}, S_{t_2}) = (18.8713, \; 0.0000, \; 8.7121, \; 0.0000, \; 0.0000, \; 10.9199, \; 33.3866, \; 8.7062, \; 22.8616, \; 0.0000)^T, \]
which can already be used for a price estimate following the traditional Monte Carlo pricing method (cp. Section 1.4.2):
\[ V_{t_0} \approx e^{-r t_T} \frac{1}{10} \sum_{j=1}^{10} P^j \approx e^{-0.05 \cdot 1} \cdot 10.3458 \approx 9.8412. \]
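The arithmetic of this step can be retraced with a few lines of Python. The sketch only re-evaluates the ten tabulated paths; the variable names are illustrative.

```python
import math

# The ten simulated paths (S_t0, S_t1, S_t2) from the table above.
paths = [
    (100.0, 81.6340, 61.7521),
    (100.0, 120.7011, 86.3528),
    (100.0, 89.1249, 84.7387),
    (100.0, 118.4613, 87.9178),
    (100.0, 104.5817, 118.0245),
    (100.0, 81.0836, 86.1567),
    (100.0, 58.9702, 40.8700),
    (100.0, 101.6986, 72.1828),
    (100.0, 65.8471, 65.5679),
    (100.0, 119.6538, 133.2725),
]

strike, r, maturity = 100.0, 0.05, 1.0

# Asian put payoff max(K - I, 0) with I the average over the three sample dates.
payoffs = [max(strike - sum(p) / 3.0, 0.0) for p in paths]

# Traditional Monte Carlo estimate: discounted mean payoff.
mc_price = math.exp(-r * maturity) * sum(payoffs) / len(payoffs)

print([round(x, 4) for x in payoffs])
print(round(mc_price, 4))
```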
However, we want to compute the estimate for the option price using the Feature Extraction, where we need a local basis approximation of $E[P \mid S_{t_T}]$, which we compute following Theorem 1.4 and Lemma 1.5. For this regression estimate, we need to define our basis functions $b_1(x), \dots, b_m(x)$. In this simple case we choose polynomials up to the power of two in $x := S_{t_T}$ such that $b_1(S_{t_T}) = 1$, $b_2(S_{t_T}) = S_{t_T}$ and $b_3(S_{t_T}) = S_{t_T}^2$. Consequently, the matrix $B := B(S_{t_T})$ in the regression is
\[ B = \left. \begin{pmatrix} 1 & S^j_{t_T} & (S^j_{t_T})^2 \end{pmatrix} \right|_{j = 1, \dots, 10}, \]
which is
\[ B = \begin{pmatrix}
1 & 61.7521 & 3813.3171 \\
1 & 86.3528 & 7456.8073 \\
1 & 84.7387 & 7180.6524 \\
1 & 87.9178 & 7729.5404 \\
1 & 118.0245 & 13929.7926 \\
1 & 86.1567 & 7422.9713 \\
1 & 40.8700 & 1670.3548 \\
1 & 72.1828 & 5210.3516 \\
1 & 65.5679 & 4299.1531 \\
1 & 133.2725 & 17761.5684
\end{pmatrix} \]
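in this particular case. This means that the local basis approximation is given by
\[ \tilde f(S_{t_T}) = \sum_{k=1}^{3} \tilde a_k b_k(S_{t_T}) \]
with
\[ (\tilde a_1 \;\; \tilde a_2 \;\; \tilde a_3)^T = \left( B^T B \right)^{-1} B^T \mathbf{P} = B^\dagger \mathbf{P} = (81.70 \;\; -1.375 \;\; 0.005712)^T. \]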
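The regression can be retraced by solving the normal equations $B^T B \, \tilde a = B^T \mathbf{P}$ directly. The sketch below uses plain Gaussian elimination on the tabulated data; the helper name `solve` is illustrative, and for larger problems a QR-based least-squares routine would be the more robust choice.

```python
# Terminal asset prices S_tT and payoffs P of the ten paths tabulated above.
terminal = [61.7521, 86.3528, 84.7387, 87.9178, 118.0245,
            86.1567, 40.8700, 72.1828, 65.5679, 133.2725]
payoffs = [18.8713, 0.0, 8.7121, 0.0, 0.0, 10.9199, 33.3866, 8.7062, 22.8616, 0.0]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for k in range(c, n + 1):
                m[i][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


# Rows of B: basis functions 1, s, s^2 evaluated at each terminal price.
rows = [[1.0, s, s * s] for s in terminal]
btb = [[sum(row[a] * row[b] for row in rows) for b in range(3)] for a in range(3)]
btp = [sum(row[a] * p for row, p in zip(rows, payoffs)) for a in range(3)]
coeffs = solve(btb, btp)

# Least-squares residuals are orthogonal to the basis; in particular they sum to zero.
residuals = [p - sum(c * b for c, b in zip(coeffs, row)) for row, p in zip(rows, payoffs)]
print(coeffs)
```

The printed coefficients can be compared with the rounded values quoted in the text.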
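We use this result for the pricing with Equation (2.1), i.e.
\begin{align*}
V_{t_0} &= e^{-r t_T} \int_0^\infty f(s) \cdot p_{S_{t_T}}(s) \, ds \\
 &\approx e^{-r t_T} \int_0^\infty \tilde f(s) \cdot p_{S_{t_T}}(s) \, ds \\
 &\approx \int_0^\infty e^{-r t_T} \cdot (81.70 - 1.375 \cdot s + 0.005712 \cdot s^2) \cdot \frac{1}{\sigma s \sqrt{2 \pi t_T}} \exp\left( - \frac{\left( \log(s / S_{t_0}) - (r - \frac{1}{2} \sigma^2) t_T \right)^2}{2 \sigma^2 t_T} \right) ds.
\end{align*}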
The last integral can now be evaluated with numerical methods. In this simple case, we use the trapezoidal rule (see Voss [114, p.47]) with equidistant nodes:

s      p_{S_tT}(s)    f̃(s)        F(s) := e^{-r t_T} f̃(s) · p_{S_tT}(s)
25     0.0001272      50.9062      0.0062
50     0.0050471      27.2489      0.1308
75     0.0108062      10.7312      0.1103
100    0.0099455      1.3531       0.0128
125    0.0065309      -0.8852      -0.0055
150    0.0036761      4.0161       0.0140
175    0.0019223      16.0571      0.0294
200    0.0009730      35.2378      0.0326
225    0.0004863      61.5581      0.0285
250    0.0002430      95.0181      0.0220
275    0.0001222      135.6178     0.0158
300    0.0000621      183.3572     0.0108
325    0.0000319      238.2362     0.0072
350    0.0000167      300.2549     0.0048
375    0.0000088      369.4132     0.0031
400    0.0000047      445.7113     0.0020
425    0.0000026      529.1490     0.0013
450    0.0000014      619.7264     0.0008
475    0.0000008      717.4434     0.0005
500    0.0000004      822.3002     0.0004
which leaves us with
\[ V_{t_0} \approx 25 \left( 0.5 \cdot (F(0) + F(500)) + \sum_{i=1}^{19} F(i \cdot 25) \right) \approx 10.69, \qquad F(0) = 0, \]
as the Feature Extraction estimate for the Asian option with data in Table 2.1. This value is far from the true value of 6.97 (estimated with 10,000,000 paths) and even further away than the traditional Monte Carlo estimate of 9.84. However, as we will see in the next section, in realistic settings with many paths and more regression basis functions, the Feature Extraction method converges faster to the true solution and delivers more accurate estimates than the traditional Monte Carlo method.
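The numerical integration of this example can be retraced with the sketch below. It is an illustration under assumptions: the quadratic uses the rounded coefficients quoted above, the nodes are 0, 25, ..., 500 with the integrand set to zero at s = 0, and the density is the lognormal density of the terminal asset price with log-mean $\log S_{t_0} + (r - \sigma^2/2) t_T$. Because of the rounded coefficients, the result can differ from the value quoted in the text in the second decimal.

```python
import math


def lognormal_pdf(s, s0, r, sigma, t):
    """Risk-neutral density of the terminal asset price in the Black-Scholes model."""
    if s <= 0.0:
        return 0.0                                     # limit of the density at s = 0
    z = math.log(s / s0) - (r - 0.5 * sigma ** 2) * t
    return math.exp(-z * z / (2.0 * sigma ** 2 * t)) / (sigma * s * math.sqrt(2.0 * math.pi * t))


def f_tilde(s):
    """Quadratic regression estimate of the conditional payoff (rounded coefficients)."""
    return 81.70 - 1.375 * s + 0.005712 * s * s


def integrand(s, s0=100.0, r=0.05, sigma=0.4, t=1.0):
    """F(s) = exp(-r t) * f_tilde(s) * p(s)."""
    return math.exp(-r * t) * f_tilde(s) * lognormal_pdf(s, s0, r, sigma, t)


h = 25.0
nodes = [h * i for i in range(21)]                     # 0, 25, ..., 500
values = [integrand(s) for s in nodes]

# Composite trapezoidal rule: half weight on the two end nodes.
price = h * (0.5 * (values[0] + values[-1]) + sum(values[1:-1]))
print(round(price, 2))
```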
Table 2.2: Specifications of an Asian option.

Option type                    Asian, arithmetic average
Independent variable I         $\frac{1}{124} \sum_{i=1}^{124} S_{t_i}$
Strike K                       100
Payoff at maturity P           max(I − K, 0)
Maturity T                     0.5 years
Risk-free rate r               5% p.a.
Volatility σ                   25% p.a.
Daily observations Δ_obs t     1/250 years
No observations at             t = t_0, t = t_T
Initial asset price S_{t_0}    100
2.3.3 Numerical Examples
The examples in this section are discretely observed Asian call options based on the specifications
in Table 2.2.
The value of a classical PDE solution of the Asian option with specifications in Table 2.2 is
4.646 (Grid with 1600 nodes in S and 4000 time steps, all four digits correct). All subsequently
reported errors are relative to this value.
In the case of Asian options, antithetic variables and control-variate variance reduction improve a Monte Carlo estimate significantly. See Boyle [21, 23] for more information on variance reduction and antithetic variables. These improvements can be combined with the method of this chapter.
Figure 2.2 shows the frequency distribution of the error with the different Monte Carlo methods, compared with the new method. In both cases, the antithetic variables and the antithetic variables with control variable, the error distribution of the new method is narrower. For each of the methods, the standard deviation is roughly halved by the new method compared with the classical Monte Carlo. That means the new method is an additional improvement of Monte Carlo simulations.
In order to analyze the effect of the Feature Extraction a little further, we are going to separate the error of a Monte Carlo estimate (cp. Equation (1.24)),
\[ Err := \frac{\mathrm{std}[V]}{\sqrt{n}}, \]
into separate effects. From Equation (2.1),
\[ V_{t_0} = e^{-r t_T} \int_0^\infty E_Q[P(\mathbf{S}) \mid S_{t_T} = s] \cdot p_{S_{t_T}}(s) \, ds, \]
we know that we can separate the errors into errors for the estimation of $\tilde f(s) \approx E_Q[P(\mathbf{S}) \mid S_{t_T} = s]$ and errors for the estimation of $\tilde p_{S_{t_T}}(s) \approx p_{S_{t_T}}(s)$. We do not discuss the empirical estimation of a probability density function (PDF) here and refer to [20] for details (see footnote 1). However, the analytic conditional expectation $f(s)$ for the Asian option in Table 2.2 is not known, which is a problem for the computation of Equation (2.1). Consequently, we compute a highly accurate estimate with 10,000,000 asset paths and use the result instead of an analytic formula for $f(s)$. A corresponding implementation can be found in Appendix 7.5.

Footnote 1: The following examples are computed using the Matlab command ksdensity.

[Figure 2.2, six histogram panels with the error on the horizontal axis (−0.5 to 0.5). Left column: "Error: pure Monte Carlo", "Error: MC with antithetic variables", "Error: MC with control variable". Right column: "Error: new method", "Error: new method with antithetic variables", "Error: new method with control variable".]
Figure 2.2: The Asian option with data in Table 2.2: Histograms of the error distribution of different Monte Carlo simulations. The pure Monte Carlo method uses 1,000 asset paths; the antithetic variables method uses the same paths twice: once positive, once negative. The variance reduction uses the paths with antithetic variables and the geometric averaging Asian option as correction. The reference value for this Asian option is 4.646.
Table 2.3 summarizes the results of the different methods for pricing an Asian option with data in Table 2.2. All values are computed using the same set of asset paths. The values in column A (Reference MC) are computed with a pure Monte Carlo method. The mean value is computed using 10,000 valuations with 10,000 simulated asset paths each. This mean value is 4.655 ± 0.001 with 95% confidence. It differs from the reference PDE value by about 0.01 because the Monte Carlo method uses only 125 time steps for the averaging while the PDE method uses 4000.
The next column contains the corresponding values where the option value is based on the Feature Extraction, i.e. the integration with an estimated conditional expectation function $\tilde f(s)$. The following columns contain values where the integration is also conducted using an estimated probability density $\tilde p_{S_{t_T}}(s)$ and combinations of both. The standard deviation (Std) and thus the error of the Feature Extraction with an analytic PDF and an estimate for the conditional expected payoff function (column B) is much smaller than the value of Reference MC. This was expected and demonstrates the effectiveness of the method. The Std for the integration based on the PDF estimation (column C) is also smaller than the value of Reference MC. This is expected because the conditional expectation in this integration was taken from a highly accurate estimation with 10,000,000 asset paths. If we now focus on column D, where both functions $\tilde p_{S_{t_T}}(s)$ and $\tilde f(s)$ are estimated from the sample, we see that the Std is very close to the value of the Reference MC. Again, this is expected because besides smoothing, no additional information was added to the valuation process.
The question arises how the Std values of the different methods are connected. In the following, we give some empirical intuition for the interdependency of the different Monte Carlo errors using the obtained numerical values. Let us assume that the Monte Carlo error $Err$ can be decomposed into
\[ \mathrm{std}(Err) = \mathrm{std}(Err_p + Err_f) = \sqrt{\mathrm{std}(Err_p)^2 + \mathrm{std}(Err_f)^2 + 2 \cdot \mathrm{cov}(Err_p, Err_f)}, \]
where $Err_f$ is the error due to the estimation of $\tilde f(s)$ in Equation (2.4) and $Err_p$ is the error due to the estimation of the probability density $p_{S_{t_T}}$. In our example, we obtain the numerical estimates $\mathrm{std}(Err_p) = 0.05700$, $\mathrm{std}(Err_f) = 0.03701$ and $\mathrm{std}(Err) = 0.06758$ from row "Std" of Table 2.3. A numerical estimate for the covariance of the two errors $\mathrm{cov}(Err_p, Err_f)$ is 0 ± 0.000001, which is effectively zero. Now, computing
\begin{align}
\sqrt{\mathrm{std}(Err_p)^2 + \mathrm{std}(Err_f)^2 + 2 \cdot \mathrm{cov}(Err_p, Err_f)} &\approx \sqrt{0.05700^2 + 0.03701^2 + 2 \cdot 0} \tag{2.5} \\
 &\approx 0.06796, \tag{2.6}
\end{align}
we see that this corresponds surprisingly well to the value of Reference MC (0.06758). Therefore, it seems plausible that the variances of the different error components add up and that the components are uncorrelated. A detailed analysis of this error splitting is beyond the scope of this dissertation and remains open to further research.
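The combination in Equations (2.5) and (2.6) can be retraced in a few lines. The sketch only re-evaluates the numbers quoted in the text; the variable names are illustrative.

```python
import math

std_p = 0.05700        # Std with estimated density (column C of Table 2.3)
std_f = 0.03701        # Std with estimated conditional expectation (column B)
cov_pf = 0.0           # estimated covariance of the two error components
std_reference = 0.06758

# Root of the summed variances, as in Equations (2.5) and (2.6).
combined = math.sqrt(std_p ** 2 + std_f ** 2 + 2.0 * cov_pf)
relative_gap = abs(combined - std_reference) / std_reference

print(round(combined, 5), round(100.0 * relative_gap, 2))
```

The relative gap between the combined value and the Reference MC value stays below one percent.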
However, this numerical example demonstrates that the Feature Extraction really benefits from splitting the error of a Monte Carlo simulation into an error for the expected conditional payoff function and an error for the probability density function. The Feature Extraction uses an analytic expression for the probability density function, and thus only the error for estimating the expected conditional payoff remains.
Table 2.3: Values and error estimates of different ways of estimating the value of the Asian option in Table 2.2 with $S_{t_0} = 100$ and $n = 10{,}000$ simulated asset paths. Under Reference MC, the prices were computed using the regular Monte Carlo technique. The other columns are computed using Equation (2.1) with estimates for the PDF $p_{S_{t_T}}(s) \approx \tilde p_{S_{t_T}}(s)$ and the conditional expectation $f(s) \approx \tilde f(s)$, respectively. Std is the standard deviation of a series of 10,000 option valuations and thus an expected error for a single valuation.

                    A               B          C              D                       E
                    Reference MC    f̃(s)       p̃_{S_tT}(s)    p̃_{S_tT}(s) and f̃(s)    highly accurate f(s)
Mean                4.655           4.652      4.687          4.687                   4.652
Std                 0.06758         0.03701    0.05700        0.06788                 0
Systematic error    -               <0.001     ≈0.03          ≈0.03                   <0.001

2.3.4 Summary of the Feature Extraction for European Path Dependent Options
For discretely observed European-style path-dependent options, the new method can be summarized as follows:
1. Compute the local basis approximation of the expected payoffs $\tilde f(s) = E_Q[P(\mathbf{S}) \mid S_{t_T} = s]$ for all possible asset path histories $\mathbf{S} := \{S_\tau \mid \tau \in I\}$, $I \subseteq \{t_0, t_1, \dots, t_T\}$, and terminal asset price $S_{t_T}$ at maturity time $t_T$. This can be done by Monte Carlo simulations starting at $S_{t_0}$ and Theorem 1.4.
2. Use the Black-Scholes equation to solve for the price $V_{t_0}$, e.g. by a finite difference time stepping of
\[ \frac{\partial V(S_t, t)}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V(S_t, t)}{\partial S_t^2} + r S_t \frac{\partial V(S_t, t)}{\partial S_t} - r V(S_t, t) = 0 \]
with $\tilde f(S_{t_T})$ as terminal condition. Alternatively, solve the Black-Scholes equation by using the distribution of $S_{t_T}$, i.e.
\[ p_{S_{t_T}}(s) = \frac{1}{\sigma s \sqrt{2 \pi t_T}} \exp\left( - \frac{\left( \log(s / S_{t_0}) - (r - \frac{1}{2} \sigma^2) t_T \right)^2}{2 \sigma^2 t_T} \right). \]
The option price $V_{t_0}$ is then given by
\[ V_{t_0} = e^{-r t_T} \int_0^\infty p_{S_{t_T}}(s) \, \tilde f(s) \, ds. \]
2.4 Pricing Delayed Barrier Options
The Feature Extraction method can be used for pricing Parisian options with different kinds of knock-out or knock-in conditions. In order to use fewer computations in the Monte Carlo simulation, one can extend the method from one expected payoff function at time $t_T$ to expected payoff functions at all discrete observation times, with different probabilities for each expected payoff function to occur. The resulting algorithm is based on PDE time stepping in order to integrate the expected payoff functions at the different times consistently. In contrast to the previous example, the probability density function of the terminal asset price is unknown and the conditional expected payoff function is known at each time step of the Parisian option observation. This was reversed in the pricing of an Asian option.
Definition 2.1 We define Parisian options as follows:
(i) A consecutive counting Parisian option is an option which becomes worthless if the underlying
stock stays M consecutive days above a barrier level SB .
(ii) A cumulative counting Parisian option is an option which becomes worthless if the underlying
stock stays M days above a barrier level SB since the initialization of the Parisian option.
(iii) A moving window Parisian option is an option which becomes worthless if the underlying stock
stays M out of the last N days above a barrier level SB .
Note that the moving window Parisian option is a generalization of the consecutive and cumulative counting Parisian option. The consecutive counting Parisian is equivalent to a moving
window Parisian option with N = M . The cumulative counting Parisian is equivalent to a moving window Parisian with N → ∞.
In the following we will consider a moving window Parisian call option. At each time step
we want to consider the fraction of options which is knocked out separately from the options still
alive.
In the context of moving window Parisian options we apply the method in the previous sections recursively in time. The payoff function of the option at expiration tT can be represented
as
f (S) = max(StT − K, 0) · I(S, tT ),
where I(S, t) denotes an indicator variable which equals 0 if the option has been knocked out up
to (and including) time t and 1 if it is still alive at t. Obviously, this indicator depends in a complex
way on the whole path history S := {Sτ |τ ∈ I}, I ⊆ {t0 , t1 , . . . , tT } of the stock price process S.
In order to evaluate the initial fair price
\[
V_{t_0} = e^{-r t_T} E_Q[f(S)],
\]
we suggest computing the conditional expectations
\[
E_Q[f(S) \mid S_{t_i} = s,\ I(S, t_{i+1}) = 1] \tag{2.7}
\]
and
\[
E_Q[f(S) \mid S_{t_i} = s,\ I(S, t_i) = 1] \tag{2.8}
\]
recursively for $i = T-1, \dots, 0$. Since
\[
E_Q[f(S)] = E_Q[f(S) \mid S_{t_0} = S_{t_0},\ I(S, t_0) = 1],
\]
this eventually leads to the fair option price.
For the first Step (2.7) suppose that
EQ [f (S)|Sti+1 = s, I(S, ti+1 ) = 1]
is known by recursion. Note that this is definitely true for i = T − 1 because
EQ [f (S)|StT = s, I(S, tT ) = 1]
= EQ [max(StT − K, 0) · I(S, tT )|StT = s, I(S, tT ) = 1]
= max(s − K, 0).
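Since S is a Markov process, we have
\[
E_Q[f(S) \mid S_{t_i} = s,\ I(S, t_{i+1}) = 1]
= \int E_Q[f(S) \mid S_{t_{i+1}} = \tilde s,\ I(S, t_{i+1}) = 1] \cdot p_{S_{t_{i+1}} \mid S_{t_i} = s}(\tilde s)\, d\tilde s, \tag{2.9}
\]
where $p_{S_{t_{i+1}} \mid S_{t_i} = s}(\tilde s)$ denotes the conditional probability density function of $S_{t_{i+1}}$ given that $S_{t_i} = s$, i.e.
\[
p_{S_{t_{i+1}} \mid S_{t_i} = s}(\tilde s) = \frac{1}{\sigma \tilde s \sqrt{2\pi (t_{i+1} - t_i)}} \exp\left(-\frac{\left(\log(\tilde s / S_{t_i}) - (r - \frac{1}{2}\sigma^2)(t_{i+1} - t_i)\right)^2}{2\sigma^2 (t_{i+1} - t_i)}\right).
\]
For the second step, we need an estimate of the conditional probability
\[
P_{i,i+1}(s) := \mathrm{Prob}\left(I(S, t_{i+1}) = 1 \mid S_{t_i} = s,\ I(S, t_i) = 1\right), \tag{2.10}
\]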
i.e. the probability of survival until time ti+1 if the option has not been knocked out until ti and
the underlying price equals s. This conditional probability is determined in the Monte Carlo step
of our approach. It is the only instance where simulation is actually needed. Using (2.10) we can
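determine (2.8) by
\[
\begin{aligned}
E_Q[f(S) \mid S_{t_i} = s,\ I(S, t_i) = 1]
&= E_Q[f(S) \mid S_{t_i} = s,\ I(S, t_{i+1}) = 1] \cdot P_{i,i+1}(s) \\
&\quad + \underbrace{E_Q[f(S) \mid S_{t_i} = s,\ I(S, t_{i+1}) = 0]}_{=0} \cdot (1 - P_{i,i+1}(s)) \\
&= E_Q[f(S) \mid S_{t_i} = s,\ I(S, t_{i+1}) = 1] \cdot P_{i,i+1}(s).
\end{aligned} \tag{2.11}
\]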
For many path dependent options, the Feature Extraction can be summarized as follows
1. Compute the probabilities Pi,i+1 (s) of survival for all values of s and i. This can be done
e.g. by Monte Carlo simulations, starting at the asset value St0 .
2. Use numerical integration, or better, a numerical solver for the Black-Scholes PDE to compute
EQ [f (S)|Sti = s, I(S, ti+1 ) = 1]
from EQ [f (S)|Sti+1 = s, I(S, ti+1 ) = 1] as in Equation (2.9).
3. Compute EQ [f (S)|Sti = s, I(S, ti ) = 1] from
EQ [f (S)|Sti = s, I(S, ti+1 ) = 1]
and Pi,i+1 (s) using Equation (2.11) and go back to step 2 if i ≠ 0.
4. The price of the path dependent option is given by
Vt0 = e−rtT EQ [f (S)] = e−rtT EQ [f (S)|St0 = St0 , I(S, t0 ) = 1].
This algorithm can be extended to Parasian, lookback or similar options without large efforts.
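The backward recursion of steps 2 to 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the C/C++ implementation used for the numerical results: the integral (2.9) is evaluated by a trapezoid rule on a fixed asset price grid instead of a PDE solver, the survival probabilities are assumed to be given as arrays on that grid, and all function and parameter names are hypothetical.

```python
import numpy as np

def transition_weights(grid, r, sigma, dt):
    """Row k holds quadrature weights for E[g(S_{t+dt}) | S_t = grid[k]]."""
    s_from = grid[:, None]
    s_to = grid[None, :]
    var = sigma**2 * dt
    z = np.log(s_to / s_from) - (r - 0.5 * sigma**2) * dt
    dens = np.exp(-z**2 / (2.0 * var)) / (s_to * np.sqrt(2.0 * np.pi * var))
    # Trapezoid weights on a possibly non-uniform grid.
    w = np.empty_like(grid)
    w[1:-1] = 0.5 * (grid[2:] - grid[:-2])
    w[0] = 0.5 * (grid[1] - grid[0])
    w[-1] = 0.5 * (grid[-1] - grid[-2])
    return dens * w[None, :]

def feature_extraction_price(s0, payoff, survival, grid, r, sigma, times):
    """Backward recursion over Equations (2.9) and (2.11).

    survival[i] is an array with the values of P_{i,i+1} on `grid`.
    """
    value = payoff(grid)  # E[f | S_{t_T} = s, alive at t_T]
    for i in range(len(times) - 2, -1, -1):
        dt = times[i + 1] - times[i]
        value = transition_weights(grid, r, sigma, dt) @ value  # Equation (2.9)
        value = value * survival[i]                              # Equation (2.11)
    discount = np.exp(-r * (times[-1] - times[0]))
    return discount * np.interp(s0, grid, value)
```

With all survival probabilities set to one, the recursion reduces to a European call and should agree with the Black-Scholes price up to the quadrature error; lowering the survival probabilities above the barrier lowers the price.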
2.4.1 Numerical Example: A Parisian Option
In this section, the efficiency of the new method will be compared with the classical Monte Carlo
method. Table 2.4 provides the data of the moving window Parisian option used for the calculations.
Table 2.4 Specifications of a Parisian option.
Option type                           Parisian up-and-out
Payoff at maturity f                  max(StT − K, 0)
Strike K                              100
Maturity tT                           0.25 years
Risk free rate r                      5%
Volatility σ                          25%
Barrier level SB                      120
Daily observations ∆obs t             1/250 years
Length of observation period N        15 ∆obs t
Number of observations to event M     5 ∆obs t
No knock out at                       t = 0, t = tT
[Figure 2.3: log-log plot of the standard deviation (vertical axis) versus CPU-time in sec (horizontal axis) for the pure Monte Carlo method and for this method.]
Figure 2.3: The standard deviation of the value of a Parisian option with the specifications in Table 2.4, plotted versus CPU-time. The standard deviation for each CPU-time is computed using 100 runs of different Monte Carlo simulations. The total number of asset paths ranges from about 1,000 to 15,000,000 (PDE discretization: 800 to 3200 nodes in the S direction, 480 to 1920 time steps; Pi,i+1 grid: 800 to 12800 nodes in the S direction). The CPU time is the run time of a C/C++ implementation on an Intel Xeon 1.7 GHz computer.
For this example of the Parisian options, the values of 1 − Pi,i+1 are presented in Figure 2.4.
The values are estimated using a Monte Carlo simulation. The option price Vt0 is computed using these probabilities and a linear interpolation between the nodes.
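The Monte Carlo estimate of the survival probabilities can be sketched in Python as follows. The sketch is an assumption-laden illustration and not the implementation behind Figure 2.4: it estimates Pi,i+1 by simple binning of the asset price, observes the barrier at every time step (the exclusion of t = 0 and t = tT in Table 2.4 is ignored), and uses hypothetical function and parameter names.

```python
import numpy as np

def estimate_survival(s0, r, sigma, dt, n_steps, barrier, m, n, edges, n_paths, seed=0):
    """Bin-wise Monte Carlo estimate of P_{i,i+1}(s) for a moving window Parisian.

    A path is knocked out once at least `m` of its last `n` observations lie
    above `barrier`. Entry [i, b] is the fraction of paths that were alive at
    t_i with S_{t_i} in bin b and are still alive at t_{i+1}; bins without
    alive paths are set to NaN.
    """
    rng = np.random.default_rng(seed)
    s = np.full(n_paths, float(s0))
    alive = np.ones(n_paths, dtype=bool)
    above = np.zeros((n_paths, n), dtype=bool)  # ring buffer of the last n observations
    n_bins = len(edges) - 1
    surv = np.full((n_steps, n_bins), np.nan)
    for i in range(n_steps):
        bins = np.digitize(s, edges) - 1        # bin index of S_{t_i}
        alive_before = alive.copy()
        z = rng.standard_normal(n_paths)
        s = s * np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
        above[:, i % n] = s > barrier           # observation at t_{i+1}
        alive &= above.sum(axis=1) < m
        for b in range(n_bins):
            sel = alive_before & (bins == b)
            if sel.any():
                surv[i, b] = alive[sel].mean()
    return surv
```

In a pricing run, the rows of the returned array would be interpolated linearly between the bin centers, in line with the interpolation mentioned above. Finer bins need more paths per bin, so bin width and path count have to be balanced.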
In order to get an idea of the improvement of convergence, a confidence interval for the price of the Parisian option is computed, both with a traditional Monte Carlo method and with the Feature Extraction method. In this example, the Feature Extraction uses a PDE solver for the integration with the probability density function. This is very efficient for the treatment of many sub steps, which are required in the Parisian option case.
In Figure 2.3 the confidence interval is given as the standard deviation of different runs of the pricer.2 The standard deviation of 100 runs of the new method and of the classical Monte Carlo method is presented. The figure shows that the Feature Extraction has about half the standard deviation of the pure Monte Carlo simulation.
An interesting property one can observe in Figure 2.3 is that the slope of log(standard deviation) versus log(CPU-time) is about −0.5 for all three methods. That means that the standard deviation is approximately proportional to $(\text{CPU-time})^{-0.5} = 1/\sqrt{\text{CPU-time}}$. This result was expected for the pure Monte Carlo method because we found for this kind of Monte Carlo pricing that it converges with $1/\sqrt{\text{number of paths}}$ (Equation (1.24)) and the CPU-time is proportional to the number of paths. The numerical PDE solution required by the new method needs only a small fraction of the time of the Monte Carlo sampling; thus it has only a minor effect on the CPU-time.
2 This corresponds to 68% probability that the option value is within V ± standard deviation (cp. Table 1.2).
Figure 2.4: Values of the knock-out conditional probability 1 − Pi,i+1 (S) of a Parisian option with specifications in Table 2.4, plotted versus the asset price and time step.
2.5 Summary
This chapter presents a new framework for the valuation of exotic path dependent options, which
we call Feature Extraction. The new framework presented is based on separating the pricing problem into two parts. One part with high complexity is solved by a Monte Carlo method, and a second part with low complexity is solved by standard numerical tools (numerical integration, PDE
solution). The only problem arising is that even though a PDE method can be used in the process,
the Feature Extraction is not easily extended to pricing options with early exercise feature.
Values for different kinds of complex derivatives can be computed. The numerical convergence studies show that the new method is capable of precisely pricing moving window Parisian options. While it is practically impossible for a pure PDE method to handle a moving window Parisian option with long windows (> 20 observations, see [55]), the new solution can deal with this problem.
The improvement of convergence for Asian options using the new method is comparable with
the improvement for Parisian options. Furthermore, the improvements by the new method can
be combined with classical Monte Carlo improvements like antithetic variables and importance
sampling in order to increase convergence.
Chapter 3
Moving Window Asian Options
3.1 Overview
In the previous chapter, we saw a method which can increase the speed of pricing path dependent options which do not allow early exercise. Now, we want to see how regression methods can help in pricing exercisable path dependent options.
The pricing of moving window Asian options with an early exercise feature is considered as
one of the most complex problems in numerical finance. The computational challenge is created
by the unknown optimal exercise strategy and the high dimensionality that is required for its
approximation. We use the Least-Squares Monte Carlo approach together with Sparse Grid type
basis functions to combine two simple and well established methods. The resulting algorithm
provides a convergent and practical method for pricing the moving window Asian options as
well as other high-dimensional, exercisable securities, which to our knowledge have not yet been
solved with reasonable accuracy.
3.2 Introduction
Methods for pricing a large variety of exotic options have been developed in the past decades.
Still, the pricing of high-dimensional American style options remains challenging. The price of this kind of option depends on the complete price path, not only on the stock price at the final exercise date. In this chapter, we consider the price of a moving window Asian option (MWAO) with discrete and continuous observations for the computation of the early exercise value.1 The early exercise value of the MWAO depends on the average value of the underlying stock over a moving period of time, which means that a continuous observation leads to an infinite dimensional problem.
1 Note that we omit "American" from the name of the moving window Asian option with early exercise. We do this because a moving window would be useless without an early exercise or a knock-out feature. In the remainder of this thesis, we will only consider MWAOs with an early exercise.
[Figure 3.1: chart of the Euro Stoxx 50 index and its 21d moving average over time t, Oct. 04 to Oct. 05; vertical axis from 2800 to 3500.]
Figure 3.1: Although being a popular tool for chart analysts, pricing options on a moving average
is challenging.
The idea of computing a moving average value comes from the technical analysis of stock price evolution: chart analysts use the moving average as an indicator for future stock price movements and often present charts similar to Figure 3.1. The figure shows a stock-price index and the corresponding moving average. The analysts claim that there is information about the future in such charts. We will not discuss whether this is true or not; instead, we use the moving average in a different setting, as the strike of stock options. This idea is simple and leads to a product which is easy to understand for investors. However, only a few options which have a moving average as a strike or as an underlying are actively traded [70]. More common is the moving average computation in issuer-call features of some fixed income securities [71]. Our algorithm can easily be adapted to these securities, so we will only present the simple case of MWAOs.
The foundation of almost any option pricing method is laid by the no-arbitrage framework introduced by Black and Scholes [17]. We presented the common methods for valuation in this framework in Chapter 1. Especially important to note is that Least-Squares Monte Carlo, which was first presented by Carrière [32], was improved by Longstaff and Schwartz [81], who already presented an example of a moving window Asian option with early exercise in their publication. However, the option priced by their mathematical formula solves a much easier problem than indicated by their prose. Another application of Least-Squares Monte Carlo to the MWAO is presented by Bilger [16]. His method is very limited and computationally extremely expensive. Accurate values can hardly be expected. However, Bilger’s approach is closely related to our method, which only uses a different choice of basis functions for the conditional expected option value.
There are virtually no analytic pricing formulas known for American type options so that one
has to rely on numerical methods, of which Monte Carlo simulation is among the most common.
Alternative approaches are based on the Cox Ross Rubinstein (CRR) [36] binomial tree model,
which can easily be adapted to American Asian options by using non-recombining trees. The size
of non-recombining trees grows exponentially with the number of time steps, such that accurate
results are hardly obtained. Window options in a recombining CRR model have been presented
by Lau and Kwok [77] using forward shooting grids but they price Parisian or delayed-barrier options and not averaging options. Zvan, Forsyth and Vetzal present PDE methods for continuously
[122] as well as for discretely sampled Asian options [123]. The averaging period in their model
is limited to a start at a fixed point in time and cannot be easily adapted to a moving averaging
period. Other authors like Wilmott [117] present the MWAO with early exercise as a challenging
(“not easy”) problem in a PDE framework.
In fact, pricing methods for MWAOs have been described by very few authors besides Longstaff
and Schwartz [81] or Bilger [16]. Kao and Lyuu [70] present results for moving average-type options which are traded in the Taiwan market. Their method is based on the CRR model and can
handle short averaging periods: the examples include up to 5 discrete observations in the averaging period. To our knowledge, Bilger [16] as well as Kao and Lyuu [70] are pioneers in the
treatment of MWAOs with early exercise features.
Related to the MWAOs is the problem of multi-asset Asian options. An interesting approach
using Markov transition matrices on low distortion grids has been presented by Berridge and
Schumacher [15]. Their method seems to be promising for problems with medium dimensionality
(4 to 10) and should be applicable to moving window Asian options. An implementation of their
method is much more complex and less flexible than ours. Work on European Asian option
contracts has been conducted by several authors, e.g. Kemna and Vorst [71] and Shao and Roe
[106].
As the main extension to Least-Squares Monte Carlo we propose the utilization of sparse grid type basis functions in the regression, which allows for an accurate option valuation of up to 20 discrete observations on prevailing hardware. The idea of this technique was originally discovered by Smolyak [107] and was rediscovered by Zenger [121] for PDE solutions in 1990. It has been applied to many different topics since then, such as integration [19] or the Fast Fourier Transform [59]. Recently, sparse grids have been used for finite element PDE solutions by Bungartz [28], interpolation by Barthelmann et al. [13], and clustering by Garcke et al. [52]. They also have been applied to PDE option pricing by Reisinger [100]. An extensive overview of sparse grid methods is provided by Bungartz and Griebel [29].
This chapter is structured as follows: First, we formulate the problem of moving window Asian option pricing and explain why it is computationally challenging. A brief description of Least-Squares Monte Carlo and the introduction of sparse grids to the framework follow. We show some numerical examples that demonstrate the method’s effectiveness. Finally, we apply an extrapolation technique to further reduce the error originating from the discrete observations and other limiting parameters. A paper version of this chapter is also available [40].
3.3 Moving Window Asian Option
In this section, we work out the details of a moving window Asian option and present some
similar derivatives. The MWAO is a simple option that makes use of the moving average as it
is plotted in many stock price charts. Similar to an American option which pays the difference
between the current underlying price and a fixed strike, the MWAO pays the difference between
the current stock price and the floating moving average. Since the computation of moving averages is well established in chart analysis, this option could be accepted by the market, despite its
computational difficulties. Having derived a precise mathematical formulation for the price of an
MWAO, we will be able to understand its computational challenge. Other securities which seem to be equally challenging at first sight are already very common and actively traded. We will show how the valuation of the related securities avoids the computational difficulties of MWAOs. However, MWAOs might be more interesting for investors than the related securities because they have a more intuitive averaging mechanism.
3.3.1 Continuous Version
Before we go into the details of the financial product we set up the process for the underlying variable. As in the previous chapters, we use a standard diffusion process that models the uncertainty
in the stock price, according to the formula of Black and Scholes. We denote the stock price at time
t with St and the option price in dependence of St := {Sτ |τ ∈ I}, I ⊆ [t0 , t] with Vt := V (St , t).
From the no-arbitrage arguments we know that the option value satisfies the partial differential equation (Equation (1.19)),
\[
\frac{\partial V_t}{\partial t} + \frac{1}{2}\sigma^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + r S_t \frac{\partial V_t}{\partial S_t} - r V_t = 0
\]
with risk-free interest rate r.
Now, the peculiarity of the MWAO is expressed by a boundary condition to the option value V, known as an American constraint. The following condition states the minimum value for the function $V_t$ and has to be satisfied at each time $t > t_0 + t_w$,
\[
V_t \ge P(A_t, S_t), \tag{3.1}
\]
\[
A_t = \frac{1}{\int_0^{t_w} \alpha(\tau)\, d\tau} \int_{t - t_w}^{t} \alpha(t - \tau)\, S_\tau\, d\tau, \tag{3.2}
\]
where P is the option’s payoff function that depends on the current stock price St and a weighted
average At of the historic stock prices using the weight function α. The moving average is taken
over a window ranging from t − tw to t. In the following, we will consider the payoff function
P (At , St ) = max(At − St , 0).
(3.3)
Hence, the exercise value is greater than zero if the stock price falls below its moving average. Effectively, this is the case if the stock price drops either quickly or steadily.
The standard value for the weight α is
α ≡ 1,
which results in an arithmetic average.
We will discuss other values in Section 3.4. The difficulty in this pricing equation is the boundary condition in Equation (3.1), which depends on the whole history of stock prices S within the averaging period t − tw ≤ τ ≤ t. In fact, it is almost impossible to represent this integral numerically unless we discretize the path of S.
3.3.2 Discretization
For the computational implementation of this problem we introduce a number of additional variables that contain samples of historic values of Sti at different times ti ∈ {t0 = 0, t1 , . . . , tT }, i.e.
St := {Sτ |τ ∈ I}, I ⊆ {t0 , t1 , . . . , tT }. The accuracy of this approximation depends on the time
resolution of the samples. Thus the boundary condition (3.1) becomes a constraint in terms of
historic samples. We assume the last M samples to form the historic window. The condition
\[
V_{t_i} = V(S_t, t_i) \ge P\left(\frac{1}{M} \sum_{j=i-M}^{i} \alpha(i - j)\, S_{t_j},\ S_{t_i}\right) \tag{3.4}
\]
holds for i ≥ M, after an initial incubation. For the weight α, we will consider two possible configurations. Since the sample points are used to approximate the integral over the stock price path, we can use the trapezoid method for integration as the preferred method for non-smooth integrands:
\[
\alpha_1(t) = \begin{cases} \frac{1}{2} & \text{for } t = 0 \vee t = M \\ 1 & \text{otherwise} \end{cases}. \tag{3.5}
\]
A simpler method is sometimes closer to reality. With a constant α we do not optimally approximate the continuous integral, but might do better at modeling the practical implementation of such an option. In a realistic setting, this option has predefined dates at which the stock price is fixed and considered in an equally weighted arithmetic average. That means we require a weight function α with
\[
\alpha_2(t) = \begin{cases} 1 & \text{for } t < M \\ 0 & \text{for } t = M \end{cases}. \tag{3.6}
\]
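The two weight configurations can be compared on a small example. The following Python sketch evaluates the discrete average inside condition (3.4) for both weights; the function names and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def alpha_trapezoid(k, M):
    """Weight alpha_1 of Equation (3.5): half weight at both ends of the window."""
    return 0.5 if k in (0, M) else 1.0

def alpha_equal(k, M):
    """Weight alpha_2 of Equation (3.6): the oldest sample is dropped."""
    return 1.0 if k < M else 0.0

def moving_average(samples, i, M, alpha):
    """Discrete average (1/M) * sum_{j=i-M}^{i} alpha(i-j) * S_{t_j}."""
    weights = np.array([alpha(i - j, M) for j in range(i - M, i + 1)])
    window = np.asarray(samples[i - M:i + 1], dtype=float)
    return float(weights @ window) / M

samples = [1.0, 2.0, 3.0, 4.0]
print(moving_average(samples, 3, 3, alpha_trapezoid))  # trapezoid weights
print(moving_average(samples, 3, 3, alpha_equal))      # equal weights on the last M samples
```

Both weight sets sum to M, so a constant price path yields the same average in either configuration; the two versions differ only in how the oldest and newest samples enter.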
Our method for the valuation of the option uses the presented discretization and a quadrature
of either α1 or α2 , depending on the setting. The valuation proceeds backwards in time, starting
at maturity tT , where condition (3.4) holds with equality. Then, we solve for the option value at
current time and current stock price Vt0 = V (St0 , t0 ).
For low values of M this procedure can be rephrased in a PDE setting and solved numerically
by standard methods. Without going into details, we recommend a method that is based on a
finite volume discretization of the Black-Scholes PDE according to the model of Zvan et al [123].
However, due to the “curse of dimensions” it is traditionally thought that a function with more
than three or four dimensions is extremely hard to discretize.
3.4 Related Problems
As we have seen, the moving window Asian option is a derivative with the moving average as one
of its underlyings. In order to determine its price correctly, the full history of previous prices has
to be considered, which leads to an arbitrary number of relevant dimensions. Despite its intuitive
definition the moving average presents a serious computational challenge. This section distinguishes the MWAO from other similar derivatives for which straight-forward implementations
or even analytical formulas were derived. Since all the difficulties originate from the averaging
mechanism At , we will focus on some alternative averaging styles.
3.4.1 Asian American Option
The Asian American option (AAO) is very similar to the moving window Asian option. It differs
in the time horizon over which the average is evaluated. While the MWAO has a moving window
with constant length, the AAO has a window that increases in time. The averaging window
always starts at t0 and ends at the current time t. This slight difference considerably simplifies the
computational procedure. In the following we will briefly show that this pricing problem can be
solved in two dimensions.
Consider an asset price process S with an asset price at time t of $S_t$. The moving average $A_t^{AAO}$ is given by
\[
A_t^{AAO} = \frac{1}{t} \int_0^t S_\tau\, d\tau. \tag{3.7}
\]
Differentiating this expression with respect to time t, we obtain
\[
dA_t^{AAO} = \frac{1}{t} S_t\, dt - \frac{1}{t} A_t^{AAO}\, dt, \tag{3.8}
\]
which does not depend on any historic stock price. Only the current stock price and the previous average are required.
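The step from the integral definition to the update formula is an application of the product rule. Written out as a short check, using only the symbols defined above:

```latex
\[
\frac{d}{dt} A_t^{AAO}
= \frac{d}{dt}\left(\frac{1}{t}\int_0^t S_\tau\, d\tau\right)
= -\frac{1}{t^2}\int_0^t S_\tau\, d\tau + \frac{1}{t} S_t
= \frac{1}{t}\left(S_t - A_t^{AAO}\right).
\]
```

The pair of the current stock price and the current average is therefore a Markovian state, which is why the pricing problem can be solved in two dimensions.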
3.4.2 Exponential Weight
There exists another version of the moving window Asian option for which a good Markovian
approximation of the update formula can be constructed. It uses the variable a as a decay factor which determines how much less old stock prices are weighted compared to newer values.
Consider an exponentially weighted average for the payoff
\[
V_t \ge P(A_t^{Exp}, S_t), \tag{3.9}
\]
\[
A_t^{Exp} = \frac{1}{\int_0^t \alpha(\tau)\, d\tau} \int_0^t \alpha(t - \tau)\, S_\tau\, d\tau, \tag{3.10}
\]
with
\[
\alpha(t) = a \exp(-at). \tag{3.11}
\]
The average theoretically depends on all previous prices, which makes it difficult to implement in practice. However, a simple update formula is available by differentiation of the expression $A_t^{Exp}$ with respect to time,
\[
dA_t^{Exp} = \frac{a}{1 - e^{-at}} \left(S_t - A_t^{Exp}\right) dt. \tag{3.12}
\]
Since this special case assigns virtually no weight to very old asset prices, the method can be seen
as a rough approximation to the MWAO in Equation (3.1) with α(t) = a exp(−at). This kind of
approximation is presented by Longstaff and Schwartz [81].
3.4.3 Moving Window Asian Option
The previous paragraphs presented simple update formulas for averages At of Asian options. A similar update formula cannot be constructed for the MWAO.2 The complete set of historic asset prices in the window is relevant to the exercise decision of MWAOs.
To see that the problem of the MWAO is different from the presented Asian options, we reconsider the averaging function in Equation (3.2) with a weight function α = 1:
\[
A_t = \frac{1}{t_w} \int_{t - t_w}^{t} S_\tau\, d\tau.
\]
Differentiating this expression with respect to time t leads to
\[
dA_t = \frac{1}{t_w} \left(S_t - S_{t - t_w}\right) dt,
\]
which depends on the asset price at two different times. An optimal exercise strategy has to consider the two values $S_t$, $S_{t - t_w}$ and all asset prices in between. The reason for this is that all the values $S_{t_i}$, $t > t_i > t - t_w$, will be used in the computation of future moving averages, which are required in the computation of the expected value of continuation. Since there are infinitely many asset prices $S_{t_i}$, the computation of the optimal exercise strategy is hard.
2 Recall that we defined MWAO to be a moving window Asian option with an early exercise feature.
3.5 Numerical Procedure
The algorithm proposed in this chapter effectively combines three individual techniques which are well established in their respective fields. We combine Monte Carlo simulation, least squares regression and sparse grids into a practical method for American option valuation. Especially in quantitative finance, the technique called sparse grids does not yet fully live up to its potential. One of the purposes of this article is to demonstrate the flexibility and the simplicity of sparse grids. Since all the individual components of our algorithm have been elaborated in full detail by our cited sources, we will just summarize each of the components’ main aspects.
3.5.1 Simulation
As noted in the previous chapters, the standard method which is used when dimensionality
causes numerical difficulties is Monte Carlo simulation. As we will see, this approach does not
resolve our issue but will provide the framework for our algorithm. Again, we simulate different
asset paths. Each of these paths follows the risk-neutral process, a geometric Brownian motion as
in the first chapter (1.26). Recall this process, which is the process underlying the Black-Scholes
Equation (1.19),
dSt = rSt dt + σSt dWt
with a risk-less interest rate r, volatility σ and the increment of a Wiener process $dW_t$. This process is sampled at discrete times $t_i \in \{t_0, t_1, \dots, t_T\}$ so that each of the n realizations $S^j$, $j \in \{1, \dots, n\}$, follows, as in Equation (1.28),
\[
S^j_{t_{i+1}} = S^j_{t_i} \exp\left((r - \tfrac{1}{2}\sigma^2)(t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i}\; \theta_{i,j}\right)
\]
with θi,j drawn from a standard Normal distribution. The price of the MWAO is the discounted
expected value of the payoff at the optimal stopping time. The optimal stopping time provides a
strategy maximizing the option value without information about the future of the asset path. The
optimal stopping time is computed by Least-Squares Monte Carlo as presented in Section 1.4.2. It
is important to recall that the numerical procedure always produces a suboptimal exercise strategy, such that the average option value is underestimated.
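The path simulation described above can be sketched in a few lines of Python. The function name, the seed handling and the vectorized layout are assumptions for illustration; only the exact sampling formula of the geometric Brownian motion is taken from the text.

```python
import numpy as np

def simulate_paths(s0, r, sigma, times, n_paths, seed=0):
    """Sample geometric Brownian motion at the given observation times.

    Returns an array of shape (n_paths, len(times)); column i holds S_{t_i}.
    """
    rng = np.random.default_rng(seed)
    dt = np.diff(times)
    theta = rng.standard_normal((n_paths, dt.size))  # standard normal draws
    log_steps = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * theta
    log_paths = np.concatenate(
        [np.zeros((n_paths, 1)), np.cumsum(log_steps, axis=1)], axis=1
    )
    return s0 * np.exp(log_paths)

# Observation times of an option with maturity 0.4 years and steps of 1/10 year.
paths = simulate_paths(100.0, 0.05, 0.40, np.linspace(0.0, 0.4, 5), n_paths=20)
print(paths.shape)
```

Because the exponential update is exact for this process, the time step does not introduce a discretization bias; the only error in the simulated prices is the sampling error.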
3.5.2 Choice of Basis Functions
A tricky part of our numerical solution and in fact the crucial challenge is the careful choice of
the basis functions bk in Equation (1.34). As described in the previous section we will use a
linear combination of these basis functions to express an estimate for the current option value in
dependence of all relevant input parameters.
Implementation
In our implementation, we perform the regressions required by Equation (1.34) on sparse polynomial basis functions as presented in Chapter 1 (Section 1.2.2). We use sparse levels L from 0 to 3, which are sufficient for our purposes. However, we do not perform the regressions on S directly. Instead, we use scaled values of S such that for each simulated path $S^j$, we compute $x^j = (\gamma_1(S^j_{t_i}), \dots, \gamma_M(S^j_{t_{i-M}}))$, with the linear transformation function
\[
\gamma_j(S^j_{t_i}) := \frac{S^j_{t_i} - \min(S_{t_i})}{\max(S_{t_i}) - \min(S_{t_i})},
\]
such that $x^j \in [0, 1]^{M+1}$ lies in a unit cube. Since sparse polynomial basis functions are used, this creates matrices with better condition numbers than without the transformation.
The regression itself is performed by solving the linear least squares problem of Equation (1.34) implicitly via QR-decompositions (cp. Section 1.2.1). Furthermore, the regression is only performed on the paths with a positive exercise value, i.e. the paths $S^i$ with $P(S^i, t) > 0$. This decreases the computational effort.
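The scaling and the QR-based regression can be sketched in Python as follows. The sketch uses a small polynomial basis with a constant, linear and quadratic term per coordinate as a stand-in for the sparse polynomial basis of Section 1.2.2; the function names are hypothetical, and NumPy's general QR routine is used, which is an assumption about tooling and not the implementation of this thesis.

```python
import numpy as np

def scale_to_unit(s):
    """Column-wise min-max scaling of the samples onto the unit cube."""
    lo, hi = s.min(axis=0), s.max(axis=0)
    return (s - lo) / (hi - lo)

def basis_level1(x):
    """Basis {1, x_k, x_k^2} for every coordinate k (illustrative stand-in)."""
    return np.hstack([np.ones((x.shape[0], 1)), x, x**2])

def regress_qr(B, y):
    """Least squares coefficients from the reduced QR decomposition B = QR."""
    q, r = np.linalg.qr(B)
    return np.linalg.solve(r, q.T @ y)

rng = np.random.default_rng(0)
s = rng.uniform(60.0, 140.0, size=(50, 3))  # hypothetical in-the-money samples
x = scale_to_unit(s)
B = basis_level1(x)
print(np.linalg.cond(B), np.linalg.cond(basis_level1(s)))
```

Comparing the two printed condition numbers shows the effect of the scaling: the basis matrix built from the scaled values is typically far better conditioned than the one built from the raw prices.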
3.5.3 Simple Example
Table 3.1 Specification of a simple moving window Asian option with a floating strike in discrete time.
Option type                         moving window Asian option
Maturity tT                         0.4 years
Risk free rate r                    5% p.a.
Volatility σ                        40% p.a.
Observation frequency ∆obs t        1/10 years
Length of observation period M      3 observations
Exercise value                      $P(S, t_i) = \max\left(\frac{1}{3} \sum_{j=i-2}^{i} S_{t_j} - S_{t_i},\ 0\right)$
Exercise dates                      $t_i \in \{0.3, 0.4\}$
To learn more about the implementation, we consider a simple example of an MWAO with few simulated asset paths and only a single early exercise date. The data of this option is presented in Table 3.1.
As always, the Monte Carlo evaluation starts with simulating asset paths. To keep this example simple, we simulate only 20 paths S j. The paths j = 1, . . . , 10 are used for an in-sample estimate (n1 = 10) and j = 11, . . . , 20 for an out-of-sample estimate (n2 = 10).
j    Stj0   Stj1       Stj2       Stj3       Stj4
1    100    104.0085   80.5268    90.4124    64.3768
2    100    98.1999    91.2582    70.0437    62.3572
3    100    110.3735   118.5273   132.243    111.0303
4    100    98.8469    111.907    96.2535    86.32
5    100    87.1526    80.691     77.7054    78.7367
6    100    110.4539   99.8516    83.3369    103.8844
7    100    78.6144    69.4161    65.5344    63.532
8    100    96.4781    74.4774    90.8101    110.0753
9    100    105.8142   101.2728   96.8907    103.1488
10   100    103.8195   118.0207   101.0886   76.6437
11   100    102.8094   108.1064   121.8964   131.7954
12   100    98.4026    94.6187    76.2072    73.3851
13   100    94.1248    140.5762   150.434    126.2498
14   100    94.6225    81.2551    68.6878    70.1597
15   100    120.4949   135.0204   145.2074   112.4673
16   100    98.0982    89.5304    96.7834    85.1618
17   100    132.8095   125.8557   106.592    107.2087
18   100    96.542     94.6397    105.7335   96.5274
19   100    91.1224    102.0637   88.0567    90.9533
20   100    95.1477    97.4547    78.4171    75.2402
Now, we can compute the value of the option $V_{t_4}$ at maturity time $t_4$ for each of the paths, i.e.
\[
V_{t_4} := P_{t_4}(S_{t_2}, S_{t_3}, S_{t_4}) = \max\left(\frac{S_{t_4} + S_{t_3} + S_{t_2}}{3} - S_{t_4},\ 0\right),
\]
\[
V_{t_4} = (14.0619,\ 12.1958,\ 9.5699,\ 11.8402,\ 0.3077,\ 0,\ 2.6288,\ 0,\ 0,\ 21.9406,\ 0,\ 8.0185,\ 12.8368,\ 3.2078,\ 18.4311,\ 5.3301,\ 6.0101,\ 2.4395,\ 2.7379,\ 8.4638)^T.
\]
This completes the work required at time $t_4$. We proceed at time $t_3$, where we compute the immediate exercise value $P^j_{t_3}$ for each path $S^j$,
\[
P_{t_3}(S_{t_1}, S_{t_2}, S_{t_3}) = \max\left(\frac{S_{t_3} + S_{t_2} + S_{t_1}}{3} - S_{t_3},\ 0\right),
\]
\[
P_{t_3}(S_{t_1}, S_{t_2}, S_{t_3}) = (1.2369,\ 16.4569,\ 0,\ 6.0823,\ 4.1443,\ 14.5439,\ 5.6539,\ 0,\ 4.4352,\ 6.5543,\ 0,\ 13.5356,\ 0,\ 12.834,\ 0,\ 0,\ 15.1604,\ 0,\ 5.6909,\ 11.9227)^T.
\]
Following the Least-Squares approach in Section 1.4.2, we have to compute a three-dimensional local basis approximation for the expected exercise value $P^e(S, t_3) \approx E_Q[V_{t_4} \mid S, t_3]$, $S := \{S_\tau \mid \tau \in I\}$, $I \subseteq \{t_0, t_1, t_2, t_3\}$, which we solve with a sparse polynomial basis as in Section 1.2.2. The approximation is three-dimensional because the known stochastic values at time $t_3$ which determine $V_{t_4}$ and $P_{t_3}$ are $S_{t_3}$, $S_{t_2}$ and $S_{t_1}$.
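The two payoff formulas above can be retraced numerically. The following is a minimal sketch, assuming NumPy is available; the function name `floating_strike_payoff` is hypothetical and only three of the twenty example paths (j = 1, 2, 6) are used.

```python
import numpy as np

def floating_strike_payoff(window_prices):
    """Payoff max(mean(window) - last price, 0) for each row of window_prices."""
    window_prices = np.asarray(window_prices, dtype=float)
    average = window_prices.mean(axis=1)
    return np.maximum(average - window_prices[:, -1], 0.0)

# Columns: S_t1, S_t2, S_t3, S_t4 for the example paths j = 1, 2, 6.
paths = np.array([
    [104.0085, 80.5268, 90.4124, 64.3768],
    [98.1999, 91.2582, 70.0437, 62.3572],
    [110.4539, 99.8516, 83.3369, 103.8844],
])

v_t4 = floating_strike_payoff(paths[:, 1:4])  # window (S_t2, S_t3, S_t4)
p_t3 = floating_strike_payoff(paths[:, 0:3])  # window (S_t1, S_t2, S_t3)
print(np.round(v_t4, 4), np.round(p_t3, 4))
```

Up to rounding in the last digit, the results agree with the corresponding entries of the two vectors above.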
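That means, we compute (Theorem 1.4 and Lemma 1.5)
\[
P^e(S, t_i) \approx E_Q\left[ e^{-r(t_{i+1} - t_i)} V(S, t_{i+1}) \mid S, t_i \right] = \sum_{j=1}^{m} \tilde a_j b_j(x),
\]
\[
(\tilde a_1\ \tilde a_2\ \dots\ \tilde a_m)^T = \left( B(X)^T B(X) \right)^{-1} B(X)^T y,
\]
where we identify the state $X := (x_1\ x_2\ x_3)^T$, with $x_1 = \gamma_1(S_{t_1})$, $x_2 = \gamma_2(S_{t_2})$, $x_3 = \gamma_3(S_{t_3})$,
\[
\gamma_i(S^j_{t_i}) := \frac{S^j_{t_i} - \min(S_{t_i})}{\max(S_{t_i}) - \min(S_{t_i})},
\]
and the function values $y := e^{-r(t_{i+1} - t_i)} V_{t_4}$ to approximate.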
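The regression step above can be sketched in a few lines. This is an illustration under stated assumptions, not the thesis implementation: the data are random, the basis {1, x_k, x_k^2} is chosen purely for illustration, and `numpy.linalg.lstsq` is used in place of the explicit normal equations (it returns the same least-squares minimizer when B has full column rank).

```python
import numpy as np

def to_unit_cube(samples):
    """Column-wise min-max transform gamma, mapping each column onto [0, 1]."""
    lo = samples.min(axis=0)
    hi = samples.max(axis=0)
    return (samples - lo) / (hi - lo)

def basis_matrix(x):
    """Illustrative basis {1, x_k, x_k^2}: one row per path."""
    return np.hstack([np.ones((x.shape[0], 1)), x, x ** 2])

def fit_coefficients(b, y):
    """Least-squares solution of B a = y."""
    coeffs, *_ = np.linalg.lstsq(b, y, rcond=None)
    return coeffs

# Hypothetical data: 8 paths with three state variables each.
rng = np.random.default_rng(0)
states = rng.uniform(60.0, 140.0, size=(8, 3))
x = to_unit_cube(states)
b = basis_matrix(x)

# Synthetic target inside the span of the basis, so the fit must reproduce it.
y = 2.0 + 3.0 * x[:, 0] ** 2 - x[:, 2]
a = fit_coefficients(b, y)
fitted = b @ a
```

Because the synthetic target lies in the span of the basis, `fitted` reproduces `y` up to floating-point error, which gives a quick sanity check of the regression code.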
We will not use the three-dimensional basis presented in Figure 1.2, since its $m = 31$ basis functions
\[
B_2^{\mathrm{sparse}}(x_1, x_2, x_3) = \bigcup_{\sum \ell_i = 2} B^{\mathrm{full}}_{\beta(\ell)}
\]
\[
= \{1,\ x_1,\ x_2,\ x_3,\ x_1 x_2,\ x_1 x_3,\ x_2 x_3,\ x_1^2,\ x_2^2,\ x_3^2,\ x_1^2 x_2,\ x_1 x_2^2,\ x_1^2 x_2^2,\ x_1^2 x_3,\ x_1 x_3^2,\ x_1^2 x_3^2,\ x_2^2 x_3,\ x_2 x_3^2,\ x_2^2 x_3^2,\ x_1^3,\ x_2^3,\ x_3^3,\ x_1^4,\ x_2^4,\ x_3^4,\ x_1^5,\ x_2^5,\ x_3^5,\ x_1^6,\ x_2^6,\ x_3^6\}
\]
are too many for just 8 asset paths (in-sample and in-the-money). Instead, we will use a sparse polynomial basis with $L = 1$, i.e.
\[
B_1^{\mathrm{sparse}}(x_1, x_2, x_3) = \bigcup_{\sum \ell_i = 1} B^{\mathrm{full}}_{\beta(\ell)} = \{1,\ x_1,\ x_2,\ x_3,\ x_1^2,\ x_2^2,\ x_3^2\}.
\]
Since the basis functions do not include high polynomial degrees, we skip the transformation $\gamma(S)$ of the $S$ values onto a unit cube, i.e.
\[
x_i := \gamma_i(S^j_{t_i}) := S^j_{t_i}.
\]
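As a small illustration of the level-1 basis without rescaling, the sketch below evaluates the seven basis functions for the first example path. The function name `sparse_basis_level1` is hypothetical and NumPy is assumed.

```python
import numpy as np

def sparse_basis_level1(s):
    """Evaluate {1, x1, x2, x3, x1^2, x2^2, x3^2} with x_k = S_{t_k} (no rescaling)."""
    s = np.atleast_2d(np.asarray(s, dtype=float))
    return np.hstack([np.ones((s.shape[0], 1)), s, s ** 2])

# (S_t1, S_t2, S_t3) of path j = 1 from the example.
row = sparse_basis_level1([104.0085, 80.5268, 90.4124])[0]
print(np.round(row, 2))
```

The squares differ from the tabulated basis values only in the second decimal place, presumably because the printed asset prices are rounded to four decimals.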
Then, the values of the basis functions for the asset paths $S^j$ which are in the in-sample valuation set and in the money ($P^j_{t_3} > 0$), i.e. $j \in \{1, 2, 4, 5, 6, 7, 9, 10\}$, lead to
\[
B^{in} := B^{in}(S_{t_1}, S_{t_2}, S_{t_3}) = \left( 1 \quad S^j_{t_1} \quad S^j_{t_2} \quad S^j_{t_3} \quad (S^j_{t_1})^2 \quad (S^j_{t_2})^2 \quad (S^j_{t_3})^2 \right) \Big|_{j \in \{1, 2, 4, 5, 6, 7, 9, 10\}},
\]
\[
B^{in} = \begin{pmatrix}
1 & 104.0085 & 80.5268 & 90.4124 & 10817.7681 & 6484.5731 & 8174.3955 \\
1 & 98.1999 & 91.2582 & 70.0437 & 9643.212 & 8328.0551 & 4906.1259 \\
1 & 98.8469 & 111.907 & 96.2535 & 9770.7034 & 12523.1715 & 9264.7272 \\
1 & 87.1526 & 80.691 & 77.7054 & 7595.5709 & 6511.0443 & 6038.1319 \\
1 & 110.4539 & 99.8516 & 83.3369 & 12200.0661 & 9970.3394 & 6945.0413 \\
1 & 78.6144 & 69.4161 & 65.5344 & 6180.2308 & 4818.5902 & 4294.7519 \\
1 & 105.8142 & 101.2728 & 96.8907 & 11196.6402 & 10256.1791 & 9387.8019 \\
1 & 103.8195 & 118.0207 & 101.0886 & 10778.4862 & 13928.8762 & 10218.9062
\end{pmatrix}.
\]
Analogously to $B^{in}$, the remaining in-the-money paths (the out-of-sample paths), $j \in \{12, 14, 17, 19, 20\}$, lead to a basis function value matrix $B^{out}$ of
\[
B^{out} = \begin{pmatrix}
1 & 98.4026 & 94.6187 & 76.2072 & 9683.0692 & 8952.6961 & 5807.5343 \\
1 & 94.6225 & 81.2551 & 68.6878 & 8953.4162 & 6602.3897 & 4718.0082 \\
1 & 132.8095 & 125.8557 & 106.592 & 17638.3735 & 15839.6651 & 11361.8604 \\
1 & 91.1224 & 102.0637 & 88.0567 & 8303.2904 & 10416.9973 & 7753.9758 \\
1 & 95.1477 & 97.4547 & 78.4171 & 9053.0917 & 9497.4135 & 6149.2424
\end{pmatrix}.
\]
We are interested in $P^e(S, t_3)$ of the in-the-money asset path values $S^j_{t_3}$, $j \in \{1, 2, 4, 5, 6, 7, 9, 10, 12, 14, 17, 19, 20\}$, only. These values are obtained by
\[
P^{e,in}_{t_3} = B^{in} \cdot (\tilde a_1\ \tilde a_2\ \dots\ \tilde a_m)^T
\]
for the in-sample, respectively
\[
P^{e,out}_{t_3} = B^{out} \cdot (\tilde a_1\ \tilde a_2\ \dots\ \tilde a_m)^T
\]
for the out-of-sample paths. Altogether with
\[
P^e_{t_3} = \begin{pmatrix} P^{e,in}_{t_3} \\ P^{e,out}_{t_3} \end{pmatrix}
\]
we can just compute
\[
P^e_{t_3} = \begin{pmatrix} B^{in} \\ B^{out} \end{pmatrix} \cdot (B^{in})^{\dagger} \cdot
\begin{pmatrix}
e^{-r (t_4 - t_3)} V^1_{t_4} \\
e^{-r (t_4 - t_3)} V^2_{t_4} \\
e^{-r (t_4 - t_3)} V^4_{t_4} \\
e^{-r (t_4 - t_3)} V^5_{t_4} \\
e^{-r (t_4 - t_3)} V^6_{t_4} \\
e^{-r (t_4 - t_3)} V^7_{t_4} \\
e^{-r (t_4 - t_3)} V^9_{t_4} \\
e^{-r (t_4 - t_3)} V^{10}_{t_4}
\end{pmatrix},
\]
where $(B^{in})^{\dagger}$ is the pseudo inverse of $B^{in}$ (cp. Theorem 1.7). The exercise decision (Equation (1.35)) is in this case
\[
V^j_{t_3} = \begin{cases}
e^{-r(t_4 - t_3)} V^j_{t_4} & \text{if } P^e(S^j, t_3) > P^j_{t_3}, \\
P^j_{t_3} & \text{else.}
\end{cases}
\]
Computing the data for all paths leads to
 j    P^e(S^j, t_3)    P^j_{t_3}    e^{-0.05·0.1} V^j_{t_4}    V^j_{t_3}
 1       16.2631         1.2369          13.9918               13.9918
 2       15.8368        16.4569          12.1350               16.4569
 3          -            0                9.5222                9.5222
 4      -73.5746         6.0823          11.7811                6.0823
 5        9.9375         4.1443           0.3062                0.3062
 6       -1.2629        14.5439           0                    14.5439
 7        1.301          5.6539           2.6157                5.6539
 8          -            0                0                     0
 9        1.7157         4.4352           0                     4.4352
10     -111.2316         6.5543          21.8312                6.5543
11          -            0                0                     0
12        1.9591        13.5356           7.9785               13.5356
13          -            0               12.7728               12.7728
14       16.9106        12.834            3.1918                3.1918
15          -            0               18.3392               18.3392
16          -            0                5.3035                5.3035
17      -85.4283        15.1604           5.9801               15.1604
18          -            0                2.4273                2.4273
19      -23.5981         5.6909           2.7242                5.6909
20       -8.5221        11.9227           8.4216               11.9227
where the last column contains the option values $V_{t_3}$ at time $t_3$. Since the option has no further early exercise dates, we can just compute the values of $V^j_{t_0}$ as the discounted values of $V^j_{t_3}$:
\[
V^j_{t_0} = e^{-r(t_3 - t_0)} V^j_{t_3},
\]
\[
V_{t_0} = (13.8525,\ 16.2931,\ 9.4275,\ 6.0218,\ 0.3031,\ 14.3392,\ 5.5977,\ 0,\ 4.3911,\ 6.4891,\ 0,\ 13.4010,\ 12.6457,\ 3.1600,\ 18.1567,\ 5.2507,\ 15.0096,\ 2.4032,\ 5.6343,\ 11.8041)^T.
\]
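The exercise decision at $t_3$ can be retraced from the tabulated values. The sketch below assumes NumPy and assumes that the thirteen regression estimates listed in the table belong to the in-the-money paths in ascending order of j; this assignment is inferred from the text, not stated in it.

```python
import numpy as np

# In-the-money paths and their tabulated values at t_3.
j = np.array([1, 2, 4, 5, 6, 7, 9, 10, 12, 14, 17, 19, 20])
cont_est = np.array([16.2631, 15.8368, -73.5746, 9.9375, -1.2629, 1.301, 1.7157,
                     -111.2316, 1.9591, 16.9106, -85.4283, -23.5981, -8.5221])
exercise = np.array([1.2369, 16.4569, 6.0823, 4.1443, 14.5439, 5.6539, 4.4352,
                     6.5543, 13.5356, 12.834, 15.1604, 5.6909, 11.9227])
disc_next = np.array([13.9918, 12.1350, 11.7811, 0.3062, 0.0, 2.6157, 0.0,
                      21.8312, 7.9785, 3.1918, 5.9801, 2.7242, 8.4216])

# Continue if the estimated continuation value exceeds the exercise value,
# otherwise exercise immediately.
v_t3 = np.where(cont_est > exercise, disc_next, exercise)
for path, value in zip(j, v_t3):
    print(path, value)
```

Note that the realized discounted value, not the regression estimate, is carried forward on continuation; the estimate only drives the decision.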
The option value $V^{in}$ of the in-sample and the value $V^{out}$ of the out-of-sample set are then the average values of the corresponding path estimates $V^j_{t_0}$,
\[
V^{in} = \frac{1}{10} \sum_{j=1}^{10} V^j_{t_0} = 7.6392, \qquad
V^{out} = \frac{1}{10} \sum_{j=11}^{20} V^j_{t_0} = 8.7029.
\]
In the remainder of this chapter, we will focus on the out-of-sample prices, since on average they
represent a lower bound on the true price as discussed in Section 1.4.2.
3.6 Numerical Examples
In order to demonstrate the efficiency of our approach, a numerical case study is provided in
this final section. We will focus on a discretely sampled MWAO with properties sketched in
Table 3.2. The option is sampled with a regular frequency, e.g. every trading day at a specified
time. We will distinguish between two different sample techniques. The first one has a discretely
sampled averaging window spanning ten observations and is consequently integrated with α2
from Equation (3.6). The second one is aimed at an approximation of the continuous-time version
of the MWAO and is integrated with α1 from (3.5).
Table 3.2 Specifications of a moving window Asian option with a floating strike in discrete time.

Option type                       moving window Asian option
Maturity $t_T$                    0.4 years
Risk free rate $r$                5% p.a.
Volatility $\sigma$               40% p.a.
Daily observations $\Delta_{obs} t$   1/250 years
Early exercise                    at each observation with $t \ge 10/250$ years
Length of observation period $M$  10 days
Exercise value                    $P(S, t_i) = \max\left( \frac{1}{M} \sum_{j=i-M+1}^{i} S_{t_j} - S_{t_i},\ 0 \right)$
3.6.1 Convergence
To analyze the convergence of the presented pricing algorithm for MWAOs, we will denote the computational result of $V^{out}$ according to Equation (1.36) by $\tilde V_a^i(n, L, M)$. Thus, each Monte Carlo value $\tilde V_a^i$ depends on the number of samples $n$, the level of the sparse grid function basis $L$, the number of observations in the window $M$ and the quadrature scheme $\alpha_a$. Using different sets of random numbers, we compute different $\tilde V_a^i(n, L, M)$ with $n$, $L$, $a$ and $M$ fixed in order to get an estimate for the mean
\[
V_a(n, L, M) = \frac{1}{I} \sum_{i=1}^{I} \tilde V_a^i(n, L, M) \tag{3.13}
\]
of $I$ different Monte Carlo prices. The values for $I$ range from 10 to 1000 depending on an estimate of the Monte Carlo error. In most cases $I$ is chosen such that the estimate of the 68%-error is less than 0.001, which means that all presented digits of the option values are correct. The only exceptions to this rule are two presented option values with many basis functions ($L = 3$) and many asset paths ($n > 10^5$), where $I = 1$. Computing more values is too expensive, and the error should already be less than 0.01.

[Figure 3.2: option price (vertical axis) against number of simulations (horizontal axis, 100 to 1,000,000) for levels 0, 1, 2 and 3.]
Figure 3.2: The option value of an MWAO option with data in Table 3.2 estimated by LSMC.
The number of samples n per Monte Carlo price results from the in-sample paths S 1 , . . . , S n1
and the out-of-sample paths S n1 +1 , . . . , S n2 , i.e. n = n1 + n2 . We use 30% of the sample paths for
regressions (n1 ) and 70% for valuation out-of-sample (n2 ).
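The averaging in Equation (3.13) and the 30%/70% split can be sketched as follows. This is a minimal illustration, assuming NumPy and reading the "68%-error" as one standard error of the mean (an interpretation, not stated in the text); the run prices are made-up numbers.

```python
import numpy as np

def mean_and_standard_error(prices):
    """Mean of I Monte Carlo prices and the standard error of that mean."""
    prices = np.asarray(prices, dtype=float)
    mean = prices.mean()
    std_error = prices.std(ddof=1) / np.sqrt(prices.size)
    return mean, std_error

def split_paths(n, in_sample_fraction=0.3):
    """Number of in-sample (regression) and out-of-sample (valuation) paths."""
    n1 = int(round(in_sample_fraction * n))
    return n1, n - n1

# Hypothetical prices from I = 5 independent runs (made-up numbers).
runs = [7.58, 7.57, 7.59, 7.58, 7.58]
mean, std_error = mean_and_standard_error(runs)
n1, n2 = split_paths(1000)
print(mean, std_error, n1, n2)
```

In practice one would keep adding runs until `std_error` falls below the desired tolerance, for example 0.001.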
Figure 3.2 presents the mean $V_2(n, L, 10)$ for different numbers of samples $n$ and different levels $L$. The values at level $L = 0$ converge quickly to a value of about $V = 7.15$, which does not change after 3000 simulations. Using $M = 10$, level 0 consists of just one basis function and the resulting exercise decision is almost trivial. Level 1 consists of 21 basis functions. This allows for a more sophisticated strategy with a better utilization of the option. After about 100,000 simulations, the option value saturates at 7.58. Level 2 with 241 basis functions results in an even higher value of $V = 7.60$ after 1,000,000 simulations. A third level with 2001 basis functions already exceeds our available computational resources, such that the saturation level could not be computed.

Table 3.3 The option value of an MWAO option with data in Table 3.2 estimated by Least-Squares Monte Carlo. The mean of a series of evaluations with level $L$ and a fixed number of samples is denoted by $V_2(n, L, 10)$, whereas the standard deviation $\sigma(\tilde V_2^i(n, L, 10))$ of this series is denoted by $\hat\sigma$.

# samples n   L = 0                   L = 1                   L = 2                   L = 3
              V_2(n,0,10)  \hat\sigma  V_2(n,1,10)  \hat\sigma  V_2(n,2,10)  \hat\sigma  V_2(n,3,10)  \hat\sigma
3 × 10^1      7.010        0.499       -            -           -            -           -            -
1 × 10^2      7.110        0.234       4.053        0.470       -            -           -            -
3 × 10^2      7.138        0.134       6.192        0.264       -            -           -            -
1 × 10^3      7.145        0.073       7.178        0.114       3.813        0.148       -            -
3 × 10^3      7.148        0.043       7.450        0.061       5.399        0.083       -            -
1 × 10^4      7.149        0.022       7.536        0.034       6.869        0.041       3.166        0.069
3 × 10^4      7.149        0.014       7.567        0.018       7.359        0.018       5.357        0.024
1 × 10^5      7.150        0.007       7.576        0.010       7.522        0.009       6.841        0.010
3 × 10^5      7.148        0.005       7.580        0.005       7.578        0.006       7.358        -
1 × 10^6      7.150        0.004       7.582        0.003       7.600        0.002       7.524        -
3 × 10^6      7.149        0.001       7.579        0.001       7.601        0.001       -            -
One thing worth mentioning is the initial inferiority of higher levels due to an over-fitted exercise strategy. This effect arises because a regression with relatively few asset paths on many basis functions is conducted for estimating the optimal early exercise strategy. The basis functions can then predict the behavior of the in-sample data set perfectly and deliver early exercise strategies which have knowledge of specific future path characteristics instead of the average characteristics. In Figure 3.2, the out-of-sample values are presented. For the out-of-sample data set, the trained knowledge of specific in-sample future path characteristics delivers wrong estimates of the expected path development. The over-fitted exercise strategy is therefore suboptimal and delivers lower values than an optimal strategy. The larger the over-fitting effect, the worse the exercise strategy performs on the out-of-sample data set.
The values corresponding to Figure 3.2 are presented in Table 3.3. The mean value of a series of valuations is denoted by $V_2(n, L, 10)$, the standard deviation of the series by $\hat\sigma$. For a single evaluation with the Least-Squares Monte Carlo, $\hat\sigma$ can be seen as a measure of how close the value is to the mean of many valuations. In contrast, $\hat\sigma$ does not provide a measure for the error compared with the real value. The mean estimate will be biased towards values lower than the real value due to the insufficient estimate of the optimal exercise strategy $P^e$. An approximation of the MWAO with 1,000,000 sample paths and level-two regressions delivers estimates accurate to the cent. Consequently, the value of an option with properties in Table 3.2 is at least 7.60.
3.6.2 Heuristic Extrapolation
After having successfully handled the ten-dimensional case, we will aim for the infinite-dimensional problem. While the previous option's exercise value depends on the average of ten discretely sampled stock prices, we will now consider the continuous integral. The option has a payoff as defined in Equation (3.3). In order to present the optimal approximation, we will increase the number of samples in the averaging window and then extrapolate the value based on the obtained convergence properties. The derivative's specification can be found in Table 3.4.
Table 3.4 Specifications of a moving window Asian option with a floating strike in continuous time.

Option type                       moving window Asian option
Maturity $t_T$                    0.4 years
Risk free rate $r$                5% p.a.
Volatility $\sigma$               40% p.a.
Averaging window length $t_w$     10 days = 10/250 years
Early exercise interval           10/250 years $\le t \le$ 0.4 years
Exercise value                    $P(S, t) = \max\left( \frac{1}{t_w} \int_{t - t_w}^{t} S_\tau \, d\tau - S_t,\ 0 \right)$
For the continuous integral we rely on the trapezoidal quadrature rule $\alpha_1$ as stated in (3.5). Hence, we compute $V_1(n, L, M)$ and analyze the effect of increasing $M$ arbitrarily. Figure 3.3 demonstrates the convergence on level $L = 2$ with $10^6$ sample paths. We can clearly recognize the convergence with the number of observation samples $M$ within the averaging window. Despite the converging shape, there is still some slope at the curve's final point $V_1(10^6, 2, 20) = 8.16$. Extrapolation will lead us to a final result that is about 2% higher than our best finite approximation.
In order to approximate the infinite-dimensional result as accurately as possible, we use an extrapolation technique for the Least-Squares Monte Carlo, similar to the Richardson extrapolation [1, 98]. For an extrapolation, we require a convergent, strictly increasing series of option values. Assuming that we knew the order of convergence of the error, we could extrapolate to infinity and solve for the value of the continuously averaging MWAO. Before going into mathematical details, we can think of this method as a way of guessing the limit value based on the information known.
The price of our continuous average option has three main sources of error: the number of simulation paths $n$, the level of the function basis $L$ and the number of integration samples $M$. The best possible approximation would have to let each of these parameters tend to infinity and compute $V(\infty, \infty, \infty)$. We will limit our discussion to $L \in \{0, 1, 2\}$ because this should already give values accurate enough. Furthermore, multidimensional extrapolation is certainly something where little experience has been collected so far. We will do extrapolation not only as a mental exercise, but also as a way to justify our finite results, which are very close to the presumable infinite limit.

[Figure 3.3, upper plot: option value (LS Monte Carlo) against number of samples (M), together with the curve $8.2706 - 3.12026 \cdot M^{-1.1250}$. Lower plot: difference to 8.2706 against number of samples (M).]
Figure 3.3: The mean option value $V_1(10^6, 1, M)$ of an MWAO option with data in Table 3.4 estimated by Least-Squares Monte Carlo together with the function $8.2706 - 3.12026 \cdot M^{-1.1250}$ is presented in the upper plot. The plot on the bottom shows the difference of the option values to the extrapolated value, $8.2706 - V_1(10^6, 1, M)$.
To our knowledge, there has not been any theoretical error analysis of the Least-Squares Monte Carlo for MWAOs. But we can build on a result from Stone [109]: If a regression function $\theta(x) = E[Y \mid X = x]$ is $p$-smooth, then the $L^2$-error of a local polynomial kernel estimator converges to zero at a rate of $n^{-c}$ with some fixed value $c$ and $n$ denoting the number of samples of $X$. This result is related to the Least-Squares Monte Carlo valuation because we use a polynomial basis in order to estimate the conditional expectation and we assume that this is the main source of error. Hence, we can rewrite $V_a(n, L, M)$ as
\[
V_a(n, L, M) \approx V_a(\infty, L, M) - c_0 n^{-c_1} \tag{3.14}
\]
and our empirical data analysis indicates that this is a reasonable guess.
This extrapolation to $n \to \infty$ has little impact, since values based on $n = 10^6$ are already very precise. The difference between $V(10^6, 1, M)$ and $V(\infty, 1, M)$ is just about one cent.
Being able to produce a series of $V_1(\infty, L, M)$, we can continue and focus on the next parameter: $M$. If we want to extrapolate it to infinity, we again have to find the order of the error. We look for a reasonable function which can fit the option values for different $M$. Figure 3.3 presents the Least-Squares Monte Carlo option values for different $M$ together with the function $8.2706 - 3.12026 \cdot M^{-1.1250}$. The fit of the function is almost perfect so that we get $V(10^6, 1, \infty) \approx 8.27$. The same procedure for $L = 2$ leads to $V(10^6, 2, \infty) \approx 8.30$.
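To make the idea of extrapolating in M concrete, the sketch below recovers the limit and the order of an error model of the form V(M) = V_inf - c0 * M^(-c1) from three values at M, 2M and 4M. This three-point (Aitken-type) formula is a swapped-in illustration and not necessarily the fitting procedure used for Figure 3.3; the input values are generated from the fitted curve quoted above rather than from simulation output.

```python
import math

def fitted_value(m, v_inf=8.2706, c0=3.12026, c1=1.1250):
    """Fitted curve quoted in the text: v_inf - c0 * m**(-c1)."""
    return v_inf - c0 * m ** (-c1)

def extrapolate_three_point(v_m, v_2m, v_4m):
    """Limit and order of v(M) = v_inf - c0 * M**(-c1) from v(M), v(2M), v(4M)."""
    d1 = v_2m - v_m
    d2 = v_4m - v_2m
    q = d2 / d1                         # equals 2**(-c1) under the error model
    v_inf = v_4m + d2 * d2 / (d1 - d2)  # remaining error is d2 * q / (1 - q)
    c1 = -math.log(q) / math.log(2.0)
    return v_inf, c1

v_inf, c1 = extrapolate_three_point(fitted_value(4), fitted_value(8), fitted_value(16))
print(v_inf, c1)
```

With noisy Monte Carlo values, the differences d1 and d2 carry sampling error, so a least-squares fit over many values of M is likely more robust than this three-point formula.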
In the end, we can present an informed guess for the value V of a continuously averaging
MWAO with properties in Table 3.4:
V ≈ 8.30.
(3.15)
This kind of extrapolation can be useful to decrease computational effort or to increase accuracy. However, the correct description of the error and in particular the suitability of the error
estimators for the parameters is open for future research.
3.7 Summary
This chapter presents a simple and flexible implementation of a moving window Asian option. Although such options are actively traded, no accurate algorithm has been published so far which could extract the derivative's optimal exercise strategy and its precise value. The computational difficulty stems from one of the option's underlyings: an either discretely or continuously sampled moving average over a stock price path. This leads to a very high dimensionality in the mathematical definition with an exponential complexity in standard algorithms. We have shown that a straightforward approach to this problem can be found by employing Least-Squares Monte Carlo and a technique called sparse grids, which was specifically developed as a cure to the curse of dimensionality. The presented approach can be applied to any derivative that has the moving average as an underlying, as it is commonly plotted in stock price charts. We believe that this thesis can increase the acceptance of such products. Now, there is a simple algorithm for the simple derivative.
Chapter 4
Callable Convertible Bonds
4.1 Overview
Having solved the Moving Window Asian Option pricing problem, we will push the Least-Squares Monte Carlo a little further. This will allow us to determine its efficiency for complex exercise and call features. Convertible bonds embody many of these complex rights for the holder and the issuer, which makes them a suitable choice for the test of Least-Squares Monte Carlo.
Most methods for valuing convertible bonds assume that the bond is continuously and instantly callable by the issuer. However, in practice convertible bonds can often be called only if advance notice is given to the holders. In this chapter, we develop an accurate PDE method for valuing convertible bonds with a finite notice period as a reference. Then, we present a Least-Squares Monte Carlo method capable of pricing convertible bonds and compare the two methods. Example computations are presented which illustrate the effect of varying notice periods and moving window call constraints. It appears that a low-dimensional sparse basis can be used to obtain reasonably accurate prices even in the case of the moving window call constraint.
4.2 Introduction
The market for convertible bonds has been growing rapidly in the past few years [12]. Convertibles can be thought of as normal corporate bonds with embedded call options on the issuer’s
stock. Having properties of both stocks and bonds, convertibles can be an attractive choice for
investors. Studies suggest that the historical average return of convertibles has been roughly the
same as that for the equity market, but convertibles have tended to have lower risk [113, 83].
From the standpoint of the issuing firm, a convertible can be attractive for several reasons [26].
They are particularly desirable in situations where the risk of the issuer is hard to evaluate and its
investment policy is somewhat unpredictable. The prototypical issuer is a relatively small firm
with high growth potential and risk. Such firms are often cash-constrained and willing to give
the embedded call option to investors in exchange for lower coupon payments on their debt.
Convertibles incorporate a variety of features. The instrument might be convertible into shares
of the issuing company or in some cases into shares of a different company. Usually convertibles
may be converted by the holder at any time. Often, these bonds can be sold back (or “put”) to
the issuer at specific dates for a guaranteed price. In addition, the issuer may have the right to
redeem or call back the convertible at a specified call price. If the issuer does so, the investor can
choose between receiving the call price or converting into shares. The ability of the issuer to call
back the issue is frequently restricted by “soft” and “hard” call constraints. A hard call constraint
prohibits calling the issue during the initial life of the contract. A soft call constraint requires that
the issuer’s stock price remains above a discretely observed trigger level before the issue can be
called, e.g. the convertible cannot be called until the underlying has been m out of n days above
the trigger level. In addition, and what is our main focus here, the issuer usually has to give notice
some period in advance (e.g. 1 month) of calling the issue.
A long-standing puzzle with regard to convertibles is the “delayed call” phenomenon. This
has been discussed by many authors [26, 78, 58, 4, 5, 119]. As shown in [65], assuming that the
issuing firm’s management is acting in the interests of the existing shareholders, it is optimal to
call the convertible as soon as its value is equal to the call price. However, the observed behavior
of firms is not consistent with this in that companies often wait until the convertible value is far
above the call price before calling. The optimality of the policy of calling immediately after the
convertible value reaches the call price depends on various other assumptions, and so a variety of
explanations have been proposed to account for the difference between the theoretically optimal
policy and that observed in practice. One possibility is the dilution effect1 [74, 72, 69, 73], another
possibility is the effect of the notice period (see [10] and references therein) on which we will
focus.
A detailed lattice method for valuing convertibles with notice periods is presented in [78].
In [56], we develop a PDE method for pricing convertible bonds with a call notice period. The
main focus of the work in [56] was to compare the pricing results for a call notice period to the
approximations developed by Butler [30], under both the Tsiveriotis and Fernandes [112] model
and the Ayache, Forsyth, Vetzal (AFV) model [11, 12, 7, 119, 84].
Since it now appears that the standard reduced-form pricing model for convertible bonds uses
the AFV assumptions, (see, for example, the most recent version of [64]), we will consider only
models of the AFV type in this thesis.
The main focus of this chapter is on methods for pricing complex path-dependent call and put features of convertible bonds. We will first present a brief overview of the numerical PDE method used for pricing convertible bonds using the AFV [11] model. In particular, this one-factor PDE model can be used to price complex put and call features, including calls with a notice period.
Although the PDE method is very general, the valuation of convertible bonds with complex call constraints is sometimes not feasible due to memory and computational restrictions. Consequently, we present a Monte Carlo method that is capable of pricing securities with a variety of call constraints. In most cases, computing values of convertible bonds using the Monte Carlo method is not nearly as efficient as our PDE method. But the Monte Carlo method allows us to analyze properties of convertibles which we cannot model efficiently in a PDE framework.
1 Dilution is the effect that the relative share of the existing stockholders declines if a holder of a convertible bond decides to exercise into new shares instead of taking the bond's face value at maturity time.
Convertible bonds often have complex call trigger features. For example, a call may be issued
(with notice) only if the underlying was above a trigger level for 20 out of the last 30 days. In
the following, we will refer to this contract feature as moving window call protection. In this
case, the numerical PDE solution would require a solution on a thirty dimensional grid, which is
clearly infeasible. The only real possibility for pricing such a feature is by means of a Monte Carlo
method.
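The moving window call protection described above is easy to state as code, which also shows why the full price history of the window is part of the state. The sketch below is illustrative only: the function name, the default window of 30 observations with 20 required, and the rule that no call is possible before a full window has been observed are assumptions.

```python
def call_allowed(prices, trigger, window=30, required=20):
    """True if the price exceeded the trigger on at least `required`
    of the last `window` observations."""
    if len(prices) < window:
        return False  # assumption: no call before a full window is observed
    recent = prices[-window:]
    days_above = sum(1 for s in recent if s > trigger)
    return days_above >= required

# Hypothetical daily closes: 10 days below and 20 days above a trigger of 130.
history = [120.0] * 10 + [135.0] * 20
print(call_allowed(history, 130.0))                       # 20 of 30 days above
print(call_allowed(history[:-1] + [125.0], 130.0))        # only 19 of 30 days above
```

The decision depends on which of the last 30 observations were above the trigger, not just on how many, because the oldest observation drops out of the window at the next step.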
Most previous work on Monte Carlo methods for pricing convertible bonds [27, 6] relies on
a parameterization of the optimal stock price level for issuing a call. It is difficult to correctly
account for the different cash flows and call constraints in this setting so that we choose a more
rigorous non-parametric approach in this work which is similar to the one used in [84].
For the assessment of complex trigger level call constraints, we present a Monte Carlo method
based on least-squares regressions and special basis functions. Our Monte Carlo Method is based
on the American option pricing procedure presented in [32] and [81]. To the best of our knowledge, the only previous work which uses this approach for convertibles is presented by Lvov et
al [84]. While their focus lies on the general application of Least-Squares Monte Carlo to discretely callable convertibles, we focus on the quantitative comparison of PDE and Monte Carlo
for convertibles with continuous call. Additionally, we present how to incorporate common soft
call constraints.
We will first compare Monte Carlo and PDE methods for convertible bonds with call notice
period features which can be priced using both techniques. We then go on to use the Monte Carlo
method for pricing complex features which cannot be priced using PDE methods.
The main results of this chapter are as follows
• Assuming that the issuer uses various rules of thumb for issuing a call notice, we examine
the impact of these non-optimal strategies on the bond price.
• We verify that the Least-Squares Monte Carlo method can be used to price a convertible
with vanilla call and put provisions. The accuracy of the Monte Carlo method is verified by
comparing with an accurate PDE solution. Our results are consistent with those in [84].
• Since the moving window call protection is heavily path-dependent, it would appear that
the Monte Carlo method would require a very large number of basis functions, even if
sparse grid techniques are used. Our preliminary results indicate that reasonable results
can be obtained even with a small number of basis functions for the least-squares regression.
This is fortunate, since otherwise, the computation would be infeasible.
The chapter is organized as follows. We first present the standard model for convertible bonds
with credit risk and a short summary of new developments in this area. We then derive the
equations which take into account, in a rigorous manner, the call notice period. An outline of the
numerical algorithms is next presented, followed by some illustrative results. A paper version of
this chapter is also available [57].
4.3 Models for Convertible Bonds
Our main focus here is on modeling the call notice period and other call provisions. We will
restrict our attention to the case where interest rates are constant. This is in line with current
practice since it is commonly believed that the effect of stochastic interest rates on convertible
pricing and hedging is small, compared to stochastic stock prices (see [25, 6]). Dilution effects
will also be ignored in the following.
4.3.1 No Default Risk
For ease of explanation, consider first the case where we ignore the credit risk of the issuer of the
convertible. This is basically the same derivation as for the valuation of vanilla options which we
presented in Chapter 1. Recall that we assume that the stock price S evolves according to
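\[
dS_t = r S_t \, dt + \sigma S_t \, dW_t \tag{4.1}
\]
where $r$ is the risk-free interest rate, $\sigma$ is the volatility, and $dW_t$ is the increment of a Wiener process. Then, following Equation (1.19), the value of any claim $V_t$ contingent on $S_t$ satisfies
\[
\frac{\partial V_t}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + r S_t \frac{\partial V_t}{\partial S_t} - r V_t = 0. \tag{4.2}
\]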
Consider the case of a convertible bond which has no put or call provisions and can only be
converted at the terminal time tT . If the convertible has face value F and can be converted into κ
shares, then the value of the convertible V is given from the solution to Equation (4.2), with the
terminal condition
VtT (StT ) = max(F, κStT ).
(4.3)
Note that the index of VtT denotes a time tT at which we observe the variable V .
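The terminal condition (4.3) can be read as a bond plus embedded call options, in line with the description of convertibles in Section 4.2. As a short worked identity, assuming $\kappa > 0$:
\[
\max(F, \kappa S_{t_T}) = F + \kappa \max\left( S_{t_T} - \frac{F}{\kappa},\ 0 \right),
\]
so at maturity the holder of this simplified contract receives the face value $F$ plus $\kappa$ call options on the stock with strike $F / \kappa$. The decomposition applies only to this case without put, call or default features.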
4.3.2 Credit Risk
The above model ignores the credit risk of the issuer of the bond, but this is potentially an important effect. Several models for incorporating credit risk have been proposed. Tsiveriotis and
Fernandes (T&F) [112] proposed a model in 1998 which was widely adopted. The T&F model was
derived in a very heuristic manner, and, as pointed out by Ayache, Forsyth, and Vetzal (AFV) in
[11, 12], seems to be inconsistent in some cases. In the following, we will use the AFV model as a basis for our study. A similar model has been used in [7, 119, 84]. We also note that a simplified form of this model was suggested in [64].
The Hedged Model (AFV Model)
AFV derive a model, based on a hedging portfolio where the risk due to the normal diffusion process is eliminated, and assuming a Poisson default process. The probability of default in [t, t + dt],
conditional on no-default in [0, t] is p(St , t) dt with p(St , t) the hazard rate of the default process.
This model allows different scenarios in the event of default. Upon default, it is assumed that
the stock price jumps according to
\[
S_{t^+} = S_{t^-} (1 - \eta), \qquad 0 \le \eta \le 1,
\]
where St+ is the stock price immediately after default, and St− is the stock price just before default. Further, the holder of the convertible can choose upon default between:
1. Recovering RX, where 0 ≤ R ≤ 1 is the recovery factor. There are various possible assumptions for X, e.g. face value of bond, discounted bond cash flows, or pre-default value of the
bond component of the convertible; or
2. Receiving shares worth κSt+ = κSt− (1 − η).
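For simplicity in the following, we will assume that the recovery rate $R$ as well as the present value of the convertible if we wait until liquidation are zero. This leads to the following partial differential inequality for the convertible value $V$:
\[
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) \ge 0 \tag{4.4}
\]
\[
V_t(S_t, t) \ge \max(B_p(S_t, t), \kappa S_t) \tag{4.5}
\]
\[
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) \le 0 \tag{4.6}
\]
\[
V_t(S_t, t) \le \max(B_c(S_t, t), \kappa S_t), \tag{4.7}
\]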
where either one of (4.4)-(4.5) or (4.6)-(4.7) hold, and one of the inequalities holds with equality at
each point in the solution domain. The terminal condition is given in Equation (4.3). Note that the
call price Bc (St , t) is the price at which the issuer can terminate the convertible and the put price
Bp (St , t) is the price at which the holder can return the convertible. Inequality (4.5) represents
the options of the holder: She can convert into shares worth κSt or put the option to the issuer
for Bp (St , t). The value of the convertible cannot drop below these prices because otherwise
an investor would buy the convertible, convert (resp. put) immediately and receive a risk-less
profit. This is not possible following the no-arbitrage assumption. Inequality (4.7) represents the
option of the issuer, who can call the convertible and pay Bc (St , t). The value of the convertible
will not rise above this price because the issuer calls and thus terminates the convertible as soon
as the convertible reaches the call price. This is in the interest of the existing stock holder (see
Ingersoll [66] for details of this reasoning.).
The above Inequalities (4.4)-(4.7) can be derived by constructing a hedging portfolio
Πt = Vt − φ1 St − φ2 Lt
containing the convertible bond V , φ1 of the underlying stock St and φ2 of a plain bond L issued by the same firm as Vt and with zero recovery. An appropriate choice of φ1 and φ2 renders
this portfolio risk-free, so that we can perform valuations based on pure hedging arguments, see
Appendix 7.4.
Modeling Default Intensity
In order to obtain realistic default behavior of the stock price process, we will use a hazard rate
which depends on the stock price S. For the hazard rate p(St , t), we use the model suggested in
[89] and in [7] where
\[
p(S_t, t) = p(S_t) = p_0 \left( \frac{S_t}{S_0} \right)^{\alpha}.
\]
The parameters p0 > 0 and α < 0 can be calibrated to market data.
As before, we assume in the following, that recovery R = 0 and that the stock jumps to zero
on default (i.e. η = 1) for ease of exposition.
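The hazard rate model above can be sketched in a few lines. This is only an illustration: the function name `hazard_rate` is an assumption, and the parameter values are the base case values of Table 4.1.

```python
def hazard_rate(S, p0, alpha, S0):
    """Stock-dependent default intensity p(S) = p0 * (S / S0)**alpha."""
    if S <= 0.0 or S0 <= 0.0:
        raise ValueError("stock prices must be positive")
    return p0 * (S / S0) ** alpha


# With alpha < 0 the default intensity rises as the stock price falls.
for S in (50.0, 100.0, 200.0):
    print(S, hazard_rate(S, p0=0.02, alpha=-1.2, S0=100.0))
```

At S = S0 the intensity equals the spread p0, which is why p0 can be read as the default intensity at the initial stock price.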
4.3.3 Cash Flows, Call and Put Provisions
Convertible bonds usually have a variety of different features which influence their value. In this
section we will present the most important features and their effects.
Dividends and Coupons
If a discrete dividend Di is paid at time td,i , then the usual no-arbitrage arguments imply that
\[
V(S_{t_{d,i}^+} - D_i, t_{d,i}^+) = V(S_{t_{d,i}^-}, t_{d,i}^-), \tag{4.8}
\]
where $t_{d,i}^-$ is the time immediately before the dividend payment, and $t_{d,i}^+$ is the time immediately
after the payment.
Consider coupon payments ci paid at times tc,i . Denote the time immediately before the payment
as $t_{c,i}^-$ and immediately after the coupon payment as $t_{c,i}^+$. The price of the convertible then
drops according to
\[
V(S_{t_{c,i}^+}, t_{c,i}^+) = V(S_{t_{c,i}^-}, t_{c,i}^-) - c_i. \tag{4.9}
\]
Clean and Dirty Prices
The call price Bc and the put price Bp in the previous equations include accrued interest. Specifically, let Bcl , Bpl be the clean call and put prices. The actual call (put) price is computed by
Bc (St , t) = (Bcl + A(t)) · δcall (St , t),
(4.10)
Bp (St , t) = (Bpl + A(t)) · δput (St , t),
(4.11)
where A(t) is the accrued interest, a fraction of the next coupon payment and St := {Sτ |τ ∈
I}, I ⊆ {t0 = 0, t1 , . . . , tT } denotes the complete asset paths until time t. If the last payment was
at ti−1 and the next payment worth ci is paid at ti , then the accrued interest A(t) is
\[
A(t) = \frac{t - t_{i-1}}{t_i - t_{i-1}} \, c_i.
\]
The function δcall indicates if a call is allowed, δput indicates if a put is allowed. Different specifications follow.
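As a small illustration of Equations (4.10) and (4.11), the following sketch computes the accrued interest and the resulting dirty price. The function names are assumptions made for this example, and the indicator is passed in as a plain number (1, 0 or infinity) as specified in the paragraphs that follow.

```python
import math


def accrued_interest(t, t_prev, t_next, coupon):
    """Linear accrual A(t) of the next coupon between two payment dates."""
    if t_next <= t_prev or not (t_prev <= t <= t_next):
        raise ValueError("need t_prev <= t <= t_next and t_prev < t_next")
    return (t - t_prev) / (t_next - t_prev) * coupon


def dirty_price(clean_price, accrued, indicator):
    """Dirty call or put price: (clean price + accrued interest) * indicator."""
    return (clean_price + accrued) * indicator


# Base case of Table 4.1: semi-annual coupon of 4, clean put price 105.
a = accrued_interest(3.0, 2.5, 3.0, 4.0)      # full coupon accrued at t = 3
print(dirty_price(105.0, a, 1.0))             # dirty put price at t = 3
print(dirty_price(110.0, 2.0, math.inf))      # call not allowed
```

With the base case data the dirty put price at t = 3 evaluates to 109, in line with the entry Bpl + c3 = 109 of Table 4.1.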
Specifications of indicator functions δcall and δput
Another common feature of convertible bonds is the hard call protection which prevents calling in
the initial lifetime of the security. A hard call protection during time [t0 , Th ] can be accommodated
in our model by defining the set of call times Tcall = {t|t > Th } and the indicator function of
callability
\[
\delta_{call}(t) =
\begin{cases}
1 & \text{if } t \in T_{call} \\
\infty & \text{otherwise.}
\end{cases}
\]
Note that we set the indicator to ∞ if the convertible is not callable such that we can use it as
a multiplier to the call price: In Equation (4.10) the dirty call price Bc (St , t) is the product of a
call price and the indicator. If the call feature is not allowed, the indicator and thus the dirty call
price Bc (St , t) are set to infinity. Consequently, Inequality (4.7) is not a binding constraint on the
price of the convertible. But, if the call constraint is active, the indicator is set to 1 and thus the
convertible value V (St , t) cannot exceed the dirty call price. A similar reasoning holds for the
other indicator functions defined in this section.
The indicator for the put feature is defined as
\[
\delta_{put}(t) =
\begin{cases}
1 & \text{if } t \in T_{put} \\
0 & \text{otherwise}
\end{cases}
\]
with the set of put times Tput .
In addition to the hard call protection period, some convertibles have a call trigger price
Bc,trigger which means that the underlying asset value St has to be above the trigger value before a call can be issued
\[
\delta_{call}(S_t, t) =
\begin{cases}
1 & \text{if } t \in T_{call} \text{ and } S_t > B_{c,trigger} \\
\infty & \text{otherwise}
\end{cases}
\]
with St = {Sτ |τ ∈ Tcall }. Again, we set the indicator to ∞ if the convertible is not callable so that
we can use it as a multiplier.
The most challenging level of complexity in traded convertible securities is a moving window
trigger protection. In this case, the asset value S has to be at least M out of the last N days above
the trigger Bc,trigger before a call can be issued. We assume that in this case, the asset value is
measured discretely. This means that
\[
\delta_{call}(S_t, t) =
\begin{cases}
1 & \text{if } t \in T_{call} \text{ and } \sum_{k=i-N+1}^{i} 1_{S_{t_k} > B_{c,trigger}} \ge M \\
\infty & \text{otherwise}
\end{cases}
\tag{4.12}
\]
with St := {Sτ |τ ∈ I}, I ⊆ {t0 , t1 , . . . , tT } and daily observations, i.e. ∀k : tk − tk−1 = 1 day.
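The moving window indicator of Equation (4.12) can be sketched as follows. The function name, the argument layout and the handling of windows that reach back before the first observation (treated here as "not callable") are assumptions of this example, not part of the model description.

```python
import math


def delta_call_moving_window(prices, i, callable_now, trigger, M, N):
    """Indicator (4.12): 1 if a call is allowed at observation index i,
    infinity otherwise.

    prices       -- daily observations S_{t_0}, ..., S_{t_i}, ...
    callable_now -- True if t_i lies in the set of call times
    """
    if not callable_now or i - N + 1 < 0:
        # Assumption: an incomplete window never permits a call.
        return math.inf
    window = prices[i - N + 1 : i + 1]          # the last N observations
    days_above = sum(1 for s in window if s > trigger)
    return 1.0 if days_above >= M else math.inf


prices = [100.0, 120.0, 130.0, 90.0, 125.0]
print(delta_call_moving_window(prices, 4, True, 110.0, M=3, N=4))  # 3 of 4 days above
print(delta_call_moving_window(prices, 4, True, 110.0, M=4, N=4))  # not enough days
```

Returning infinity rather than zero keeps the indicator usable as a multiplier of the call price, as described above.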
Protection by Call Notice Periods
We now add the feature that the issuer has to provide advance notice of calling the convertible.
In particular, upon the notice being provided, the holder has Tn time units to decide whether to
take the face value or to convert into shares. As noted in [30], the issuer is effectively giving the
holder a put option on the shares, plus the shares themselves. The longer the notice period, the
more valuable is this put option.
The value of the shares plus the put option can be viewed as the value V called,t of a new convertible bond starting at time t, maturing at time t + Tn , and having a terminal value of
V called,t (St , t + Tn ) = max(Bc (St , t + Tn ), κSt+Tn ).
Note that the call price Bc includes accrued interest and is set to infinity if no call is allowed
(see Section 4.3.3). Based on the assumption that the issuer acts in the interests of the existing
shareholders, he has to minimize the market value of the convertible [65]. Consequently, the
issuer will call the convertible as soon as V called,t is less than the price of the convertible. That
means that in the model for convertibles, we need to replace all conditions with a call price Bc by
conditions with V called,t .
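For the AFV model, the following equations need to be solved
\[
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) \ge 0 \tag{4.13}
\]
\[
V(S_t, t) \ge \max(B_p(S_t, t), \kappa S_t) \tag{4.14}
\]
\[
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) \le 0 \tag{4.15}
\]
\[
V(S_t, t) \le \delta_{call}(S_t, t) \cdot V^{called,t}(S_t, t), \tag{4.16}
\]
with $V^{called,t}(S_{\hat t}, \hat t)$, $\hat t \ge t$, satisfying
\[
\frac{\partial V^{called,t}_{\hat t}}{\partial \hat t} + \frac{\sigma^2}{2} S_{\hat t}^2 \frac{\partial^2 V^{called,t}_{\hat t}}{\partial S_{\hat t}^2} + (r + p\eta) S_{\hat t} \frac{\partial V^{called,t}_{\hat t}}{\partial S_{\hat t}} - (r + p) V^{called,t}_{\hat t} + p \kappa S_{\hat t} (1 - \eta) \ge 0 \tag{4.17}
\]
\[
V^{called,t}_{\hat t}(S_{\hat t}, \hat t) \ge \max(B_p(S_{\hat t}, \hat t), \kappa S_{\hat t}), \tag{4.18}
\]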
with terminal condition
V called,t (St , t + Tn ) = max(Bc (St , t + Tn ), κSt+Tn ).
(4.19)
Dividend and coupon payments are accounted for by applying Equation (4.8) or (4.9), respectively, to the
convertible bond value V if a dividend time td,i or a coupon time tc,j equals the model time t. Furthermore, if the model time t̂ of the call value V called,t is equal to a dividend time td,i or a coupon
time tc,j , then Equation (4.8) or (4.9), respectively, is applied to V called,t . This treatment is illustrated in
Figure 4.1.
Figure 4.1: A convertible bond is presented with two call dates: t1 and t2 . The maturity time of
the bond is T , the notice time is Tn and a coupon is paid at time tc,1 . While V called,t1 is affected by
the coupon payment, V called,t2 is not.
4.4 Numerical Algorithm
This section presents the outline for an accurate PDE model for valuing convertible bonds with
call notice periods. Furthermore, we explain the details of a Least-Squares Monte Carlo method
which can additionally price moving window soft call constraints. For valuations without a moving window constraint, the PDE method is the superior method due to the slow convergence of
the Monte Carlo method. But, the PDE method cannot handle long moving windows which are
a common feature of convertible bonds.
In the next sections, we first present a brief outline of a PDE implementation with call notice
periods. A Monte Carlo implementation follows, which ignores default and call constraints, so
that we can concentrate on the estimation of optimal call and conversion. Then, we extend the
Monte Carlo implementation to default and soft call constraints. Finally, we present a detailed
description of the regression basis functions used in the Monte Carlo simulation leaving us with
a tool capable of solving high-dimensional problems.
4.4.1 PDE Implementation
The PDEs in the AFV models are parabolic Linear Complementarity Problems (LCP) which in
general cannot be solved analytically. However, the equations can be solved numerically.
The solution of the LCPs in the AFV case is computed via a discretization in two dimensions:
S and t. The solution is generated at discrete values $V(S^i, t_n) = V^i_{t_n}$ on the grid $S = \{S^1, \dots, S^{i_{max}}\}$. As is
usual in finance, the solution proceeds backwards in time. Given the terminal (payoff) conditions
at tn = T , the solution at tn−1 is generated using an implicit finite difference scheme. Dividend
and coupon payments are included as in Equations (4.8)-(4.9).
The pseudo code provided in Listing 4.1 illustrates the solution procedure. We assume the
existence of a function discrete_timestep which, given $V(t_n) = \{V^1_{t_n}, \dots, V^{i_{max}}_{t_n}\}$, does one
time step of the implicit solution method to return $V(t_{n-1}) = \{V^1_{t_{n-1}}, \dots, V^{i_{max}}_{t_{n-1}}\}$. See [50] and [12]
for implementation details of such a function.
Listing 4.1 Pseudo code for the numerical algorithm
function vector = discrete_timestep(Vold, S, t, constraint, ...)
\\ This function is a discrete version of the AFV
\\ model. It uses an implicit method to compute the
\\ values V(t − ∆t) from V(t) and returns the result
\\ as a vector. The constraint on the values V is
\\ implicitly applied with a penalty method [50].

function vector = convertible_with_notice(Vterminal, S, T, σ, r, ...)
{
    \\ Computes the values of a convertible with a notice period
    \\ and returns the prices V(S^i) ∀i at t = 0 as a vector.
    V = Vterminal;
    for all timesteps from t = T down to t = 0
    {
        if notice to call possible
        {   \\ solve for the constraint
            Bc = Bcl + accrued_interest(t + Tn);
            V^called,t(S^i) = max(Bc, κS^i) ∀i;   \\ the terminal condition
            for all timesteps from t̂ = t + Tn down to t̂ = t
            {
                constraint = {V^called,t(S^i) ≥ max(Bp(t̂), κS^i) ∀i};
                V^called,t = discrete_timestep(V^called,t, S, t̂, constraint, ...);
                if cash flow occurs between last timestep and t̂
                    apply_cash_flow();
            }   \\ end of inner time-stepping for loop
        }   \\ end of constraint block
        else   \\ no call possible
        {
            V^called,t(S^i) = ∞ ∀i;
        }
        constraint = {(V ≥ max(Bp, κS)) ∧ (V ≤ max(V^called,t, κS))};
        V_{t−∆t} = discrete_timestep(V_t, S, t, constraint, ...);
        if cash flow occurs between last timestep and t
            apply_cash_flow();
    }   \\ end of time-stepping for loop
    return V;
}   \\ end of function convertible_with_notice
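To make the two-sided constraint in the outer loop concrete, the following sketch enforces it by a simple explicit projection of the grid values onto the admissible band. This is a deliberately simpler substitute for the penalty method of [50] used in the listing, and it assumes that the lower bound never exceeds the upper bound. The function name and argument layout are hypothetical.

```python
import math


def apply_constraints(V, S, kappa, B_p, V_called):
    """Project each grid value onto [max(B_p, kappa*S), max(V_called, kappa*S)].

    V, S and V_called are lists over the same S grid; B_p is the dirty put
    price at the current time (0 if no put is allowed) and V_called is
    infinite where no call is allowed.
    """
    projected = []
    for v, s, v_c in zip(V, S, V_called):
        lower = max(B_p, kappa * s)     # holder: put or convert
        upper = max(v_c, kappa * s)     # issuer: call, holder may still convert
        projected.append(min(max(v, lower), upper))
    return projected


V = [90.0, 115.0, 130.0]
S = [80.0, 100.0, 120.0]
print(apply_constraints(V, S, 1.0, 100.0, [math.inf, 112.0, 118.0]))
```

An explicit projection after each time step is generally less accurate than treating the constraint implicitly, so it serves here only to show which bounds are active at each node.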
An important detail in this implementation is the treatment of cash flows which occur within
the notice period. There are usually no details given in the convertible bond contract about what
happens if the issuer calls and there is a coupon payment within the notice period. We assume
that there is no special treatment in this case and the coupon will be paid as usual. A similar
reasoning applies for dividends. Both types of cash flows, coupons and dividends, which are
paid at time ti are applied at t = ti to calculate V (S, t) and at t̂ = ti to calculate the value for
the constraint V called,t (S, t̂). This allows the holder to obtain the coupon after a notice of call and
then convert into shares before the end of the notice period to get the dividend. The algorithm in
Listing 4.1 can easily be adapted for a different treatment of these cash flows.
4.4.2 Monte Carlo Implementation
The numerical solution of the AFV model by finite difference schemes is very efficient for convertibles with and without call notice periods. However, consider the case where the convertible
has a discretely observed moving window call protection. In this case, the underlying has to be m
out of the last n days above a trigger level before a call can be issued. This would require the solution of an n dimensional PDE. This moving window type feature is computationally extremely
challenging using traditional PDE discretization schemes.
We will present a Monte Carlo method based on least-squares regressions similar to the procedures proposed for American type option pricing [32, 81]. Least-Squares Monte Carlo methods
for convertible bonds have also been suggested in [84]. Our method is extended to handle the
moving window feature by utilization of sparse grid like basis functions.
For explanatory purposes, our first formulation of the Least-Squares Monte Carlo will focus
on the exercise decisions leaving out default and call notice periods. We will then present a second
formulation which extends the first formulation, by adding provisions for call notice periods and
default of the asset. Two sections follow, which explain the basis functions and the specific design
choices made for the implementation used in the case study.
Simulation without Default and Call Notice Period
In a Monte Carlo simulation, we simulate different asset paths. Each of these paths follows a
geometric Brownian motion as described by Equation (4.1). This process is sampled at discrete
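times ti ∈ {t0 , t1 , . . . , tT } so that each realization $S^j$, $j \in \{1, \dots, n_1\}$, follows
\[
S^j_{t_{i+1}} = S^j_{t_i} \exp\left( \left(r - \tfrac{1}{2}\sigma^2\right)(t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i} \, \theta_{i,j} \right) \tag{4.20}
\]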
with θi,j drawn from a standardized Normal distribution. The price of the convertible is the
discounted expected value of the payoff at the optimal stopping time. The optimal stopping
time provides a strategy for maximizing the bond value by optimal conversion of the holder and
minimizing the bond value by optimal calling of the issuer.
It is interesting to note that Equation (4.20) is a discretization of the process described by
Equation (4.1) which does not introduce any time stepping error by itself, i.e. the distribution of
the simulated asset prices ST does not depend on the number of timesteps n. But, in the PDE case
we assume continuous conversion and continuous call. The Least-Squares Monte Carlo can only
evaluate convertibles with discrete conversion and discrete call features. This introduces a time
stepping error which we analyze in Section 4.5.2.
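A minimal sketch of the path simulation in Equation (4.20), using only the Python standard library. The function name and the choice of 250 daily steps per year are assumptions of this example.

```python
import math
import random


def simulate_gbm_path(S0, r, sigma, times, rng):
    """Sample one path of (4.20) at the given increasing observation times."""
    path = [S0]
    for t_prev, t_next in zip(times[:-1], times[1:]):
        dt = t_next - t_prev
        theta = rng.gauss(0.0, 1.0)     # standard normal draw theta_{i,j}
        log_growth = (r - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * theta
        path.append(path[-1] * math.exp(log_growth))
    return path


rng = random.Random(42)
times = [k / 250 for k in range(251)]   # one year of daily observations
path = simulate_gbm_path(100.0, 0.05, 0.20, times, rng)
print(len(path), path[-1])
```

Because each step uses the exact log-normal transition, refining the time grid changes where the path is observed but not the distribution of the terminal value.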
Least-Squares Monte Carlo for Optimal Decision
The valuation of the convertible proceeds as the valuation of exercisable options in the previous
sections. But, the procedure presented in Section 1.3 is generalized to handle the conversion and
put by the holder as well as the issuer’s call constraints.
As usual, the Monte Carlo valuation begins with a simulation of the underlying by (4.20)
forwards in time. At each conversion time ti , the holder decides to convert the bond and receive
the payoff κSti or to continue. At the same time, the issuer decides between a call and paying the
call price Bc (S, ti ) or continuation, where
Sti := {Sτ , τ ∈ I}, I ⊆ {t0 , . . . , ti }
denotes the complete asset paths until time ti . In order to maximize the option value Vti , the
holder exercises if
\[
\kappa S_{t_i} \ge E^Q[e^{-r(t_{i+1} - t_i)} V_{t_{i+1}} \mid S_{t_i}, t_i],
\]
and returns the convertible to the issuer for the put price $B_p(S_{t_i}, t_i)$ if
\[
B_p(S_{t_i}, t_i) \ge E^Q[e^{-r(t_{i+1} - t_i)} V_{t_{i+1}} \mid S_{t_i}, t_i] \quad \text{and} \quad B_p(S_{t_i}, t_i) \ge \kappa S_{t_i},
\]
and the issuer calls if
\[
B_c(S_{t_i}, t_i) \le E^Q[e^{-r(t_{i+1} - t_i)} V_{t_{i+1}} \mid S_{t_i}, t_i],
\]
with $E^Q$ denoting the expectation under the risk-neutral measure. In the Least-Squares Monte
Carlo, this conditional expectation is approximated by
\[
P^e(S_{t_i}, t_i) \approx E^Q[e^{-r(t_{i+1} - t_i)} V_{t_{i+1}} \mid S_{t_i}, t_i].
\]
The value $P^e(S_{t_i}, t_i)$ is computed using a least-squares regression on many path realizations $S^j_{t_i}$.
The regressions start at the time step tT −1 , i.e. one step before maturity time tT . The approximated
values are
\[
P^e(S_{t_i}, t_i) = \sum_k a^i_k b_k(S_{t_i}) \tag{4.21}
\]
with some basis functions $b_k(S_{t_i})$ and unknown coefficients $a^i_k$ which are determined by a least-squares regression corresponding to the procedure presented in Section 1.4.2. We then determine
$a^i_k$ in Equation (4.21) from
\[
\{a^i_k\} = \arg\min_{a^i_k} \left\| \left( \sum_k a^i_k b_k(S^j_{t_i}) - e^{-r(t_{i+1} - t_i)} V^j_{t_{i+1}} \right)_{j=1,\dots,n_1} \right\|_2 \tag{4.22}
\]
where $V^j_{t_{i+1}}$ is the estimate of the current convertible value of a Monte Carlo path realization $S^j_{t_i}$ at
time $t_i$. The value of $V^j_{t_i}$ is given as the maximum between the estimated value of the unexercised
bond $P^e$ and the exercise value $\kappa S_{t_i}$, as well as the minimum of the expected bond value $P^e$ and
the call price $B_c$,
\[
V^j_{t_i} =
\begin{cases}
\max(B_p(S^j_{t_i}, t_i), \kappa S^j_{t_i}) & \text{if } P^e(S^j_{t_i}, t_i) < \max(B_p(S^j_{t_i}, t_i), \kappa S^j_{t_i}) \\
\max(B_c(S^j_{t_i}, t_i), \kappa S^j_{t_i}) & \text{if } P^e(S^j_{t_i}, t_i) > B_c(S^j_{t_i}, t_i) \\
e^{-r(t_{i+1} - t_i)} V^j_{t_{i+1}} & \text{otherwise.}
\end{cases}
\tag{4.23}
\]
Given that the value of the convertible bond at maturity time equals the payoff VTj = F , a dynamic
program solves for all values Vtji , starting at time T and iterating backwards to t0 . Dividend and
coupon payments are included by the application of Equation (4.8) resp. (4.9).
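One backward step of this dynamic program, i.e. the regression (4.22) followed by the decision rule (4.23), might be sketched as below. Several simplifications are assumptions of this example: a plain monomial basis in place of the sparse basis, the normal equations in place of a QR decomposition, constant call and put prices within the step, and all function names.

```python
import math


def fit_polynomial(x, y, degree):
    """Least-squares coefficients a_k of sum_k a_k * x**k (normal equations)."""
    n = degree + 1
    ata = [[sum(xi ** (p + q) for xi in x) for q in range(n)] for p in range(n)]
    aty = [sum(yi * xi**p for xi, yi in zip(x, y)) for p in range(n)]
    for col in range(n):                      # elimination with partial pivoting
        pivot = max(range(col, n), key=lambda row: abs(ata[row][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            f = ata[row][col] / ata[col][col]
            for c in range(col, n):
                ata[row][c] -= f * ata[col][c]
            aty[row] -= f * aty[col]
    coeffs = [0.0] * n
    for row in reversed(range(n)):            # back substitution
        tail = sum(ata[row][c] * coeffs[c] for c in range(row + 1, n))
        coeffs[row] = (aty[row] - tail) / ata[row][row]
    return coeffs


def evaluate(coeffs, x):
    return sum(a * x**k for k, a in enumerate(coeffs))


def lsm_step(S, V_next, r, dt, kappa, B_p, B_c, degree=2):
    """Values V_{t_i} on all paths from V_{t_{i+1}}, following (4.22)-(4.23)."""
    discounted = [math.exp(-r * dt) * v for v in V_next]
    coeffs = fit_polynomial(S, discounted, degree)
    V = []
    for s, cont in zip(S, discounted):
        estimate = evaluate(coeffs, s)        # P^e(S, t_i)
        floor = max(B_p, kappa * s)
        if estimate < floor:                  # holder puts or converts
            V.append(floor)
        elif estimate > B_c:                  # issuer calls
            V.append(max(B_c, kappa * s))
        else:                                 # continue: keep realized value
            V.append(cont)
    return V, coeffs
```

Note that, as in (4.23), the regression estimate is used only for the decision. The value carried backwards on a continuing path is the discounted realized value, not the fitted one.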
Now, there are essentially three different methods which can provide us with an estimate of
the convertible's value $V(S_{t_0}, t_0) = E^Q[V_{t_0} \mid S_{t_0}, t_0]$. The first possibility is the computation using
the regression function given in Equation (4.21):
\[
V \approx P^e(S_{t_0}, t_0) = \sum_k a^0_k b_k(S_{t_0}).
\]
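This is especially useful if the asset path realizations $S^j$ do not start at $S_{t_0}$ but at values spanning
an interval around $S_{t_0}$, e.g. $\forall j: S^j_{t_0} \in [\tfrac{1}{2} S_0, 2 S_0]$. This way, we can easily get estimates for the
hedge ratios delta and gamma,
\[
\frac{\partial V_t}{\partial S_t} = \sum_k a^0_k \frac{\partial b_k(S_t)}{\partial S_t}, \qquad
\frac{\partial^2 V_t}{\partial S_t^2} = \sum_k a^0_k \frac{\partial^2 b_k(S_t)}{\partial S_t^2}.
\]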
But, computing the value of the convertible and the greeks (delta and gamma) this way, the shape
of the basis functions bk (St0 ) can introduce a systematic error, which might not be negligible. This
error will be especially large if the expected value function EQ [Vt0 |St0 , t0 ] is not smooth in St0 .
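For a monomial basis $b_k(S) = S^k$ the derivatives of the regression function are available in closed form. The following sketch evaluates value, delta and gamma from given coefficients. The monomial basis and the function name are assumptions of this example; the implementation described later uses sparse polynomial basis functions.

```python
def value_delta_gamma(coeffs, S):
    """Value, first and second derivative of sum_k a_k * S**k at S."""
    value = sum(a * S**k for k, a in enumerate(coeffs))
    delta = sum(k * a * S ** (k - 1) for k, a in enumerate(coeffs) if k >= 1)
    gamma = sum(k * (k - 1) * a * S ** (k - 2) for k, a in enumerate(coeffs) if k >= 2)
    return value, delta, gamma


# Regression function 1 + 2 S + 3 S^2 evaluated at S = 2.
print(value_delta_gamma([1.0, 2.0, 3.0], 2.0))
```

The example also shows the limitation noted above: with a quadratic basis the gamma estimate is constant in S, whatever the true value function looks like.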
A second possibility for the computation of $V(S_{t_0}, t_0)$ is given by
\[
V^{in} = \frac{1}{n_1} \sum_{j=1}^{n_1} V^j_{t_0}
\]
and a third by
\[
V^{out} = \frac{1}{n_2} \sum_{j=n_1+1}^{n_1+n_2} V^j_{t_0}
\]
with $V^j_{t_i}$ as given in Equation (4.23), which is similar to the usual Least-Squares Monte Carlo
(Equation (1.36)).
As in Section 1.3, we name this value the out-of-sample price. As already noted, this price
has the desirable property that it can present a bound on the true model price for exercisable
options. However, in the presence of exercise and call features, the out-of-sample price is neither
a lower, nor an upper bound since both, the issuer’s call and the holder’s conversion strategy, are
suboptimal.
The next sections will extend this implementation to default and soft call constraints. After
that, we summarize the design choices of the implementation we use for our numerical experiments.
Simulation with Default
Given that the issuer can go into default, the simulation has to account for the effect of default
on the price of the convertible. In general, we could simulate the asset as a combined Brownian
motion and jump default process. However, if we examine the PDEs (4.4)-(4.7) with η = 1, we
can immediately see that the price V can be computed by simulating the process
dSt = (r + p)St dt + σSt dW
(4.24)
and discounting the cash flows back along the path with the effective discount rate (r + p). Note
that in general, we must also add in the effective default cash flow pκSt (1 − η) dt in each timestep
t → t + dt. We will use this approach in the following.
Monte Carlo Valuation with Default and Call Notice
In Section 4.4.2, we discussed the case without a notice period and without default. Now, we want
to consider the call constraints discussed in Section 4.3.3. In contrast to the clean call price Bcl
which is a constant, specified in the convertible bond contract, we defined the call price Bc (St , t) to
account for accrued interest (Equation (4.10)) and the call trigger constraint (Equation (4.12)). That
means, Bc (St , t) may depend on the complete asset path history. Consequently, the optimal call
strategy may depend on the recent history of the asset path. This greatly increases the complexity
of the convertible bond valuation.
In the case of a defaultable convertible, we discretize the asset price process governed by (4.24)
which leads to
\[
S^j_{t_{i+1}} = S^j_{t_i} \exp\left( \left(r + p(S^j_{t_i}) - \tfrac{1}{2}\sigma^2\right)(t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i} \, \theta_{i,j} \right). \tag{4.25}
\]
In contrast to Equation (4.20) for the case without default, this equation introduces some time stepping
error because the default intensity p is not constant. However, the error is already small for relatively few
timesteps, as we will see in Section 4.5.2.
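The discretization (4.25) together with the effective discounting at rate (r + p) can be sketched as follows. The function name and the accumulation of the discount factor along the path are assumptions of this example; the hazard rate is the power law model of Section 4.3 and is frozen at the beginning of each step, which is the source of the time stepping error mentioned above.

```python
import math
import random


def simulate_defaultable_path(S0, r, sigma, p0, alpha, times, rng):
    """One path of (4.25) and the discount factor exp(-sum (r + p) dt)."""
    S = S0
    path = [S0]
    log_discount = 0.0
    for t_prev, t_next in zip(times[:-1], times[1:]):
        dt = t_next - t_prev
        p = p0 * (S / S0) ** alpha            # intensity at the step start
        theta = rng.gauss(0.0, 1.0)
        S *= math.exp((r + p - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * theta)
        log_discount -= (r + p) * dt          # effective discount rate r + p
        path.append(S)
    return path, math.exp(log_discount)


rng = random.Random(7)
times = [k / 250 for k in range(251)]
path, discount = simulate_defaultable_path(100.0, 0.05, 0.20, 0.02, -1.2, times, rng)
print(path[-1], discount)
```

With η = 1 and zero recovery no default cash flow has to be added along the path, so the discount factor is all that is needed to bring a payoff back to t0.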
As for the non-defaultable contingent claim case, the optimal decisions are solved by regressions. In the Least-Squares Monte Carlo, the value of $E^Q[V_{t_{i+1}} \mid S_{t_i}, t_i]$, $S_{t_i} := \{S_\tau, \tau \in I\}$, $I \subseteq \{t_0, \dots, t_i\}$, is approximated by
\[
P^e(S_{t_i}, t_i) \approx E^Q[V_{t_{i+1}} \mid S_{t_i}, t_i].
\]
The value P e (Sti , ti ) is again computed using a least-square regression backwards in time, starting
at time ti−1 . The approximated values are
\[
P^e(S_{t_i}, t_i) = \sum_k a^i_k b_k(S_{t_{i-M}}, \dots, S_{t_i})
\]
with some M-dimensional basis functions $b_k$ and unknown coefficients $a^i_k$ satisfying
\[
\{a^i_k\} = \arg\min_{a^i_k} \left\| \left( \sum_k a^i_k b_k(S^j_{t_{i-M}}, \dots, S^j_{t_i}) - e^{-r(t_{i+1} - t_i)} V^j_{t_{i+1}} \right)_{j=1,\dots,n_1} \right\|_2 \tag{4.26}
\]
where $V^j_{t_{i+1}}$ is the estimate of the current convertible value of a Monte Carlo path realization $S^j$ at
times $t_{i-M}, \dots, t_i$. The value of $V^j_{t_i}$ is given as the maximum between the estimated value of the
unexercised bond $P^e(S_{t_i}, t_i)$ and the exercise value $\kappa S_{t_i}$, as well as the minimum of the expected
bond value $P^e(S_{t_i}, t_i)$ and the effective value of the call price $V^{called,t_i}(S^j_{t_i}, t_i)$,
\[
V^j_{t_i} =
\begin{cases}
\max(B_p(S^j_{t_i}, t_i), \kappa S^j_{t_i}) & \text{if } P^e(S_{t_i}, t_i) < \max(B_p(S^j_{t_i}, t_i), \kappa S^j_{t_i}) \\
\max(V^{called,t_i}(S^j_{t_i}, t_i), \kappa S^j_{t_i}) & \text{if } P^e(S_{t_i}, t_i) > \delta_{call}(S_{t_i}, t_i) \cdot V^{called,t_i}(S^j_{t_i}, t_i) \\
e^{-r(t_{i+1} - t_i)} V^j_{t_{i+1}} & \text{otherwise.}
\end{cases}
\]
The Equation for Vtji now also accounts for a call notice period. The algorithm proceeds according
to Section 4.3.3, i.e. the call price Bc is substituted by the value of the convertible after a call notice.
Since after the notice, the convertible cannot be called again, the value of the convertible after a
call notice V called,t (S j , ti ) is easy to compute. It is the value of a convertible bond with the exact
same properties as the original convertible except that the convertible with price V called,t (Stji , ti )
has a maturity time t + Tn and no issuer call options. This value can be estimated e.g. by Least-Squares Monte Carlo or the PDE method (Equations (4.17) to (4.19)).
As in the valuation of a non-defaultable convertible, a dynamic program solves for all values
$V^j_{t_i}$, starting at time T and iterating backwards to t0 , given that the value of the convertible bond
at maturity time equals the payoff $V^j_T = F$. We can compute the in-sample option price with
\[
V^{in} = \frac{1}{n_1} \sum_{j=1}^{n_1} V^j_{t_0}
\]
with asset paths $S^j$, $j \in \{1, \dots, n_1\}$. And we compute the out-of-sample option price by
\[
V^{out} = \frac{1}{n_2} \sum_{l=n_1+1}^{n_1+n_2} V^l_{t_0}
\]
with $V^l_{t_0}$ given by the case distinction for $V^j_{t_i}$ stated before, based on new simulation paths $S^l$, $l \in \{n_1 + 1, \dots, n_1 + n_2\}$,
and the coefficients $a^i_k$ from the in-sample valuation.
Sparse Grids: Choice of Basis Functions
The described Monte Carlo algorithm works well in case of a single factor model. But, in case of
a moving window call protection, we have to deal with additional dimensions. In some special
cases, the moving window constraint can be reformulated as a set of one-dimensional problems,
where the number of elements grows exponentially with the size of the window [55]. This reformulation is only useful for moving windows of relatively small length: for example, a condition
where the underlying has to stay above a trigger level on 5 out of 15 days before a call can
be issued is feasible, whereas a call protection requiring 20 out of 30 days above a trigger level is not.
Yet the 20 out of 30 protection is fairly typical of real convertibles.
Since the value of the convertible will depend on the daily observations in the window of
historic samples, this represents a high dimensional problem (i.e. 30 dimensional). An American
type option pricing problem with so many dimensions is challenging. The procedure we are following is presented in Section 1.2.2 as well as in [40] and proves to be useful for moving window
type pricing problems.
A key issue in our numerical algorithm revolves around the choice of the basis functions bk
in Equation (4.22). As described in the previous section we will use a linear combination of these
basis functions to express an estimate for the current option value, which includes dependence
on all relevant input parameters. Thus, we need one additional dimension in the basis for each
observation in the window of historic samples. In the case of a window with M samples, we
need, in principle, a basis of dimension M .
As in Chapter 3, where we evaluated a moving window Asian option, we choose a sparse
basis $B_L^{sparse}$ (see Equation (1.14)),
\[
B_L^{sparse}(x_1, \dots, x_M) := \bigcup_{\sum_i \ell_i = L} B^{full}_{\beta(\ell)}(x_1, \dots, x_M),
\]
which is smooth everywhere.
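The union in the sparse basis runs over all level multi-indices $\ell = (\ell_1, \dots, \ell_M)$ with $\sum_i \ell_i = L$. The sketch below only enumerates these multi-indices to show how slowly their number grows with the dimension M. It does not construct the basis functions themselves, since the definitions of $\beta(\ell)$ and $B^{full}$ are given in Equation (1.14) and are not reproduced here. The function name is an assumption.

```python
from math import comb


def level_multi_indices(M, L):
    """All tuples (l_1, ..., l_M) of non-negative integers summing to L."""
    if M == 1:
        return [(L,)]
    result = []
    for first in range(L + 1):
        for rest in level_multi_indices(M - 1, L - first):
            result.append((first,) + rest)
    return result


print(level_multi_indices(2, 2))
# A 30-day window at level L = 3: number of level combinations in the union.
print(len(level_multi_indices(30, 3)), comb(30 + 3 - 1, 3))
```

For M = 30 and L = 3 there are 4960 such multi-indices, whereas allowing every level from 0 to 3 independently in each of the 30 dimensions would give 4^30 (about 1.15e18) combinations.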
Implementation Details
In our implementation, we perform the regressions required by Equation (4.22) on sparse polynomial basis functions as presented in the previous paragraphs. We use sparse levels L from 0
to 3 which are sufficient for our purposes. But, we do not perform the regressions on S directly.
Similar to the moving window Asian option (MWAO) pricing, we use scaled values of S such that for each path j, we
compute $x^j = (\gamma_1(S^j_{t_i}), \dots, \gamma_M(S^j_{t_{i-M}}))$, with the linear transformation function
\[
\gamma_j(S^j_{t_i}) := \frac{S^j_{t_i} - \min(S_{t_i})}{\max(S_{t_i}) - \min(S_{t_i})},
\]
such that the $x^j \in [0, 1]^M$ lie in a unit cube. Since sparse polynomial basis functions are used, this
creates matrices with better condition numbers than without the transformation.
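The scaling can be sketched as a column-wise min-max transformation over all paths. The data layout (one row per path, one column per observation in the window) and the function names are assumptions of this example.

```python
def scale_to_unit_interval(values):
    """Map values linearly so that the minimum goes to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal, scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def scale_columns(rows):
    """Scale each column of `rows` (paths x window observations) to [0, 1]."""
    columns = [scale_to_unit_interval(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# Three paths, window of two observations each.
print(scale_columns([[80.0, 10.0], [100.0, 30.0], [120.0, 20.0]]))
```

Minimum and maximum are taken across paths separately for each observation time, so every coordinate of the regression input covers the full unit interval.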
The regression itself is performed solving the linear least-squares problem of Equation (4.22)
implicitly via QR-decompositions.
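A least-squares solve via a QR decomposition can be sketched in pure Python as follows. The thin QR factorization is computed here by modified Gram-Schmidt, which is an assumption made to keep the example short; a production implementation would more likely rely on a Householder-based library routine. The function name is hypothetical.

```python
import math


def qr_least_squares(A, y):
    """Solve min ||A a - y||_2 using a thin QR factorization A = Q R."""
    m, n = len(A), len(A[0])
    Q = [[A[i][j] for i in range(m)] for j in range(n)]   # columns of A
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j):                                # orthogonalize column j
            R[k][j] = sum(Q[k][i] * Q[j][i] for i in range(m))
            Q[j] = [Q[j][i] - R[k][j] * Q[k][i] for i in range(m)]
        R[j][j] = math.sqrt(sum(q * q for q in Q[j]))
        if R[j][j] == 0.0:
            raise ValueError("basis columns are linearly dependent")
        Q[j] = [q / R[j][j] for q in Q[j]]
    qty = [sum(Q[j][i] * y[i] for i in range(m)) for j in range(n)]
    a = [0.0] * n
    for j in reversed(range(n)):                          # solve R a = Q^T y
        tail = sum(R[j][k] * a[k] for k in range(j + 1, n))
        a[j] = (qty[j] - tail) / R[j][j]
    return a


# Regress on the basis {1, x, x^2} with x already scaled to [0, 1].
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
A = [[1.0, x, x * x] for x in xs]
y = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
print(qr_least_squares(A, y))
```

Working with Q and R avoids forming the normal equations, whose condition number is the square of that of the regression matrix.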
In case of a call notice period, we use the PDE method for the computation of the convertible
bond value V called,ti after a call. The problem that, in general, the required value V called,ti (Stji , ti )
of the Monte Carlo paths S j does not lie on a grid point of the PDE is solved by a cubic-spline
interpolation of the PDE values.
In the following, all reported values for a valuation by our Monte Carlo methods are out-of-sample values V out .
4.5 Case Study
The base case data is presented in Table 4.1 and all following examples are computed using this
data. The data is consistent with the data used by other authors [12, 112].
Any variation to the base case data will be explicitly noted: some of the parameters will be
varied so that the effect on the model price of optimal decisions can be analyzed. We will denote
a computed approximation for the value of a convertible with an asset value St by V(St ). Varied
parameters will be denoted by a “|” sign, e.g.
V(St |Tn = 0, p(St ) = 0)
denotes the approximation for the value of a convertible with no notice period and no default.
We assume that even in the presence of a hard call protection during the initial lifetime of
the bond, a call notice can only be given after this protection period. Note that a call period of
Tcalled = {t|t > (2.0 years + Tn )} means that the first notice of a call can be issued at time t = 2.0.
Table 4.1 Base case data of a convertible bond with AFV default model and a call notice period.

General features
  conversion ratio κ             1
  face value F                   100
  coupon payment ci              4 (8% p.a.)
  coupon times Tcoupon           {0.5, 1.0, . . . , 5.0}
  maturity tT                    5.0 years
  risk-free rate r               5% p.a.
  volatility σ                   20% p.a.
  dividends Di                   0

Callability
  notice period Tn               1/12 years
  call period Tcall              {t|t > (2.0 years + Tn )}
  clean call price Bcl           110
  first call notice              t > 2.0
  call trigger Bc,trigger        0

Putability
  putable at time Tput           {t|t = 3.0 years}
  clean put price Bpl            105
  put price Bp (t = 3)           Bpl + c3 = 109

Default model
  hazard rate p(St )             p0 (St / St0 )^α
  spread p0                      2% p.a.
  α                              −1.2
  St0                            100
  recovery rate R                0
  jump factor η                  1
4.5.1 Convergence Analysis - PDE
In Table 4.2 the values are displayed for a convertible using the base case data in Table 4.1. Crank-Nicolson time stepping is used. To mitigate numerical oscillations, the method presented in [99]
is used. This method uses two implicit time-steps after each non-smooth solution and then proceeds
with Crank-Nicolson time stepping. The reason for this is that the implicit time-steps smooth out
the non-smooth initial conditions, which can then be handled by Crank-Nicolson time stepping.
The Crank-Nicolson method has a better convergence than the implicit time stepping for smooth
initial conditions, such that the combination of both has a better convergence than the implicit
method alone [99, 50].
Table 4.2 shows a numerical convergence analysis. At each refinement, we double the number
of nodes in the S grid and the number of time steps. The number of substeps used to determine
V called,t (the inner time-stepping loop in the pseudo code, Listing 4.1) is also shown. From the column
"difference", we can estimate the order of the convergence. A method with first order convergence has the property that the absolute difference halves from one refinement to the next, while
the absolute difference of a method with second order convergence is reduced to a quarter.
In both cases, with a finite call notice period and with a notice period
of zero, the convergence seems to be about first order. This contrasts with the smooth quadratic
convergence reported in [50] for simple American options. We conjecture that the effect of discrete
coupon payments and accrued interest may cause some difficulties in obtaining smooth convergence. However, this is not a problem of practical concern, since we can obtain results which are
clearly correct to five digits, which is much more accuracy than would be required in any real
situation.
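The rule of thumb for reading the "difference" column can be turned into a short calculation: if successive differences shrink by a factor of 2^q under grid doubling, the observed order is q. The sketch below applies this to the call notice values of Table 4.2; the function name is an assumption.

```python
import math

# V(S = 100, t = 0) with call notice, grids 50 x 50 up to 3200 x 3200 (Table 4.2).
values = [123.0799, 122.9873, 122.9365, 122.9073, 122.8928, 122.8853, 122.8814]


def observed_orders(values):
    """log2 of the ratio of successive differences under grid doubling."""
    diffs = [b - a for a, b in zip(values[:-1], values[1:])]
    return [math.log2(abs(d0 / d1)) for d0, d1 in zip(diffs[:-1], diffs[1:])]


for order in observed_orders(values):
    print(round(order, 2))
```

For this column the observed orders lie roughly between 0.8 and 1.0, in line with the first order behaviour described above. The same calculation is less informative for the case without notice, where the differences change sign between refinements.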
Each time step of the algorithm in Listing 4.1 requires about (#substeps+1) times the work required for a convertible bond with no notice period. In all cases, the constraint V called,t is solved
on a grid with the same spacing as that for V . From Table 4.2, we see that a grid with 3200
nodes has an absolute error of less than 0.01. All results of PDE values in subsequent sections are
reported using a 3200 node grid.
Table 4.2 Convergence results for a convertible with data in Table 4.1 in the AFV model with a
one-month call notice period and without a call notice period. Substeps refers to the number of time steps used to
determine V called,t , at each discrete time.

The AFV model - no call notice V(St |Tn = 0)
  grid for V (S × t)       V (S = 100, t = 0)     difference
  50 × 50                  122.3672
  100 × 100                122.3713                0.0041
  200 × 200                122.3851                0.0138
  400 × 400                122.3692               -0.0159
  800 × 800                122.3660               -0.0032
  1600 × 1600              122.3653               -0.0007
  3200 × 3200              122.3649               -0.0004

The AFV model - call notice V(St ) (Tn = 1/12)
  grid for V , V called,t (S × t × substeps)     V (S = 100, t = 0)     difference
  50 × 50 × 1                                    123.0799
  100 × 100 × 2                                  122.9873               -0.0926
  200 × 200 × 4                                  122.9365               -0.0508
  400 × 400 × 7                                  122.9073               -0.0292
  800 × 800 × 14                                 122.8928               -0.0145
  1600 × 1600 × 27                               122.8853               -0.0075
  3200 × 3200 × 54                               122.8814               -0.0039
4.5.2 Convergence Analysis - Monte Carlo
For the Monte Carlo valuation, we use the algorithm discussed above. The number of in-sample
paths n1 and out-of-sample paths n2 are equal, i.e. n1 = n2 . The sparse basis functions are used
even in the case of a single dimension. In one dimension, the sparse basis functions become a full
basis with $2^L$ basis polynomials. We use a level of L = 3 for up to $n_1 + n_2 = 10^5$ asset paths and
a level of L = 2 for simulations with $n_1 + n_2 = 10^6$ paths. The reported values of the Monte Carlo
procedure are always the out-of-sample values.
In Table 4.3, we list average values we computed using the Monte Carlo method. The reported
average VMC is based on as many Monte Carlo valuations as required to compute the prices with
an accuracy of ±0.02 with 95% confidence (cp. Table 1.2). The reported standard deviations are
the standard deviations of the set of Monte Carlo values.
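To illustrate how the number of repeated Monte Carlo valuations relates to the target accuracy of ±0.02, the sketch below computes the smallest number of runs for which the half-width of the interval falls below the target. It assumes a multiplier of 2 standard errors for roughly 95% confidence; the function name and the default multiplier are our own choices and are not taken from Table 1.2. The standard deviations are those reported in Table 4.3 for the case without notice period.

```python
import math


def required_runs(std_single_run, half_width, z=2.0):
    """Smallest n with z * std_single_run / sqrt(n) <= half_width.

    z = 2.0 is an assumed multiplier for roughly 95% confidence.
    """
    n = (z * std_single_run / half_width) ** 2
    # The small offset guards against floating-point round-off just above an integer.
    return max(1, math.ceil(n - 1e-9))


# Standard deviations of single valuations as reported in Table 4.3 (no notice period).
for paths, std in [("10^4", 0.31), ("10^5", 0.11), ("10^6", 0.05)]:
    print(paths, "asset paths:", required_runs(std, 0.02), "valuations")
```

Since the required number of runs grows with the square of the standard deviation, valuations with few asset paths need many more repetitions to reach the same accuracy.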
We can observe in Table 4.3 that the Monte Carlo simulations appear to converge to the value
computed by the PDE-method. This is the case for the convertible without notice period as well as
for the convertible with notice period. Note that we cannot expect precise convergence to the PDE
values since we are taking daily MC steps leading to errors due to finite sized MC time stepping.
As mentioned before, there are mainly two sources of time stepping errors of the Monte Carlo
Callable Convertible Bonds
Table 4.3 Convergence results of Monte Carlo simulations for the AFV models with and without
a call notice period. Values for convertibles with data in Table 4.1 are presented. VMC denotes the
average of at least 20 simulations and std(VMC ) denotes the standard deviation of these simulations. The real mean values lie in: reported value ±0.02, with a probability of 95%.
no notice period, V_PDE(S_t0 = 100 | T_n = 0) = 122.36
# asset paths     10^4       10^5       10^6
V_MC              122.18     122.37     122.39
std(V_MC)         0.31       0.11       0.05

one month notice, V_PDE(S_t0 = 100) = 122.88
# asset paths     10^4       10^5       10^6
V_MC              122.67     122.88     122.90
std(V_MC)         0.30       0.10       0.04
method compared to the PDE method: the discrete call of the Monte Carlo method and the discretization (4.25) of the asset price process with variable default intensity. Table 4.4 shows the
effect of different time steps. The upper table presents the isolated effect due to the variable default intensity p(S). This effect introduces an error of O(∆t); weekly time steps (5/250) create an
error of only about 0.01. A slightly larger error is introduced by fewer call times, which is presented in the
lower table. The order of the error is again O(∆t), but daily time steps are required for an error of
about 0.01.
However, in spite of this, the MC method appears to be converging to a solution with at least
four digit accuracy, which is sufficient for many practical purposes.
Table 4.4 Convergence results of Monte Carlo simulations for different time steps. Values for
convertibles with data in Table 4.1 are presented with different time intervals of the Monte Carlo
simulation. The convertible evaluated on the upper table is only convertible at maturity, allows neither a put nor a call. The values on the lower table are computed using daily timesteps
(t_{i+1} − t_i = 1/250) for the underlying process but perform call or conversion fewer times (see # time
between calls) than daily. The real mean values lie in the interval of the reported value ±0.02,
with a probability of 95%.

variable default intensity time steps
conversion only at maturity T, no call, no put, no coupons, 10^6 paths
V_PDE(S_t0 = 100) = 104.18
time step (t_{i+1} − t_i)     625/250     125/250     25/250     5/250      1/250
V_MC                          106.01      104.43      104.23     104.19     104.20

variable time between call/conversion
V_PDE(S_t0 = 100) = 122.36
10^6 paths, sampling of paths: t_{i+1} − t_i = 1/250
# time between calls          16/250      8/250       4/250      2/250      1/250
V_MC                          122.51      122.46      122.43     122.41     122.39
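The O(∆t) behaviour can be inspected directly from the upper part of Table 4.4: for a first-order error, the ratio of error to time step should stay roughly constant as long as the error is larger than the Monte Carlo noise of about ±0.02. The following sketch only rearranges the numbers from the table; the function name is our own.

```python
# Data copied from the upper part of Table 4.4 (variable default intensity).
v_pde = 104.18
time_steps = [625 / 250, 125 / 250, 25 / 250, 5 / 250, 1 / 250]
v_mc = [106.01, 104.43, 104.23, 104.19, 104.20]


def errors_per_step(reference, estimates, steps):
    """Return (dt, |error|, |error| / dt) for each time step size."""
    rows = []
    for dt, v in zip(steps, estimates):
        err = abs(v - reference)
        rows.append((dt, err, err / dt))
    return rows


for dt, err, ratio in errors_per_step(v_pde, v_mc, time_steps):
    print(f"dt = {dt:.3f}  error = {err:.2f}  error/dt = {ratio:.2f}")
```

For the smallest time step the error is of the same size as the reported confidence interval, so the ratio for that entry carries little information.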
4.5.3 Properties of Different Call Strategies
The call strategy of the issuer is an important factor for the value of the convertible bond. Early
papers [65, 24] derive the optimal call strategy which an issuer should follow without a notice
period. The optimal call strategy for a continuously callable convertible without a notice period
is to call if the conversion value rises to the effective call price [65]. Or, as an equation:
\[ \kappa S^* = B_c \tag{4.27} \]
with Bc being the effective call price (including accrued interest) and S ∗ the stock price at which
it is optimal for the issuer to call. Note that in case of no default, the sufficient condition for the
optimality is that the coupon rate has to be less than the risk-free rate2.
2 See [65], Theorem III.
[Figure 4.2: plot of S∗ (vertical axis, 100 to 160) against time t (horizontal axis, 2 to 5); curves: 1600 nodes, 3200 nodes.]
Figure 4.2: A convertible using the data from Table 4.1. The level of stock price S ∗ for the optimal
call strategy versus time t approximated by the AFV model using a PDE solver. The plot shows
the solution on a coarse grid (1600 nodes in S , 1600 time steps, 27 substeps) versus a fine grid (3200
nodes in S , 3200 time steps, 54 substeps): Except close to maturity, both estimates lie virtually on
the same line.
Under some simplistic assumptions, these results are extended in [30] to include notice periods. But, the simplification does not lead to realistic approximations for optimal call strategies of
traded securities.
A more sophisticated model that takes more of the complex features of the convertible bond
into account follows from the discretization of the PDE in the AFV model. For each node V j at
time ti of the discretization we check if
\[ V_j(t_i) = (V^{called,t})_j(t_i), \tag{4.28} \]
i.e. we see if the maximum constraint in Equation (4.16) is active. At time step ti , let V j (S j , ti ) be
the node with S j the smallest value in S that results in an active constraint. Then S j gives a good
approximation for the optimal stock price level: S ∗ (ti ) ≈ S j (ti ). This method is simply a nearest
neighbor interpolation.
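The node search described above can be sketched in a few lines. This is an illustration only: the function name, the comparison tolerance and the toy data are hypothetical and are not taken from the PDE solver used for the case study.

```python
def optimal_call_level(s_grid, v, v_called, tol=1e-8):
    """Return the smallest grid value S_j whose node satisfies V_j = V_called_j.

    s_grid, v and v_called are equally long sequences for one time step t_i.
    tol is a hypothetical tolerance for comparing floating-point node values.
    Returns None if the constraint is inactive at every node.
    """
    active = [s for s, vj, vcj in zip(s_grid, v, v_called) if abs(vj - vcj) <= tol]
    return min(active) if active else None


# Toy data for illustration only (not taken from the case study).
s_grid = [100.0, 110.0, 120.0, 130.0, 140.0]
v = [118.0, 124.0, 131.0, 140.5, 150.2]
v_called = [121.0, 126.5, 131.0, 140.5, 150.2]
print(optimal_call_level(s_grid, v, v_called))
```

Because the result is always a grid node, the accuracy of the estimate is limited by the local grid spacing in S.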
[Figure 4.3: two panels, "with default: S∗(V(S))" (top) and "without default: S∗(V(S | p(S) = 0))" (bottom); each plots S∗ against time t for the PDE method and LS Monte Carlo.]
Figure 4.3: The optimal stock price S ∗ for a call of a convertible using the data from Table 4.1
is presented on the top. On the bottom, the stock price S ∗ is presented for a non-defaultable
convertible: p = 0. The level of stock price S ∗ for the optimal call strategy versus time t is
estimated by the AFV model using the PDE solver and the Monte Carlo solver.
[Figure 4.4: two panels, "without notice time: Tn = 0" (left) and "with notice time Tn = 1/12y" (right); each plots S∗ against time t for Bcl = 150 and Bcl = 110.]
Figure 4.4: The optimal stock price S ∗ for a call of a convertible using the data from Table 4.1 except a call notice time of zero is presented on the left. On the right, the stock price S ∗ is presented
with a notice time of one month. The level of stock price S ∗ for the optimal call strategy over time
t approximated by the AFV model using the PDE solver.
The error of approximating the optimal strategy by the PDE method is presented in Figure 4.2.
In this figure, S ∗ is approximated by a PDE solution on a coarse grid and on a fine grid. The
difference between the two approximations is less than 0.02 for t < 4.5. The calculation of S ∗ is
less accurate for t > 4.5 because the grid is still coarse in both discretizations (a structured mesh is
used). Furthermore, the gradients of the convertible bond value and the constraint are very close
for S ≫ Bc:
\[ \frac{\partial V^{called,t}}{\partial S} \approx \frac{\partial V}{\partial S} \approx 1. \tag{4.29} \]
A similar estimation procedure for the stock price S∗ above which it is optimal to call is
possible by the Monte Carlo algorithm:
\[ S^*(t_i) \approx \min\left( S^j_{t_i} : P^e(S^j_{t_i}, t_i) \ge V^{called,t_i}(S^j_{t_i}, \hat{t} = t_i) \right). \]
This leads to similar results as for the PDE. Figure 4.3 presents the estimates for S∗(ti) computed
by a PDE method with 3200 nodes in S and an estimate by a Monte Carlo method with 10^6 asset
paths. One can observe that both approaches deliver comparable results, but the PDE method has
much less noise in the estimate. Consequently, the PDE method provides a reference point for the
other approximations using a sufficiently fine grid. Note that the MC estimate is particularly
noisy near expiry, which happens because the holder of the convertible receives about the same
cash flows with and without an issuer’s call. Consequently, the shapes of V (Sti ) and V called,ti (Sti )
are very similar and the numerical procedure starts oscillating.
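The Monte Carlo estimate of S∗(ti) is a minimum over simulated paths, which can be written compactly as below. The sketch assumes that the values P^e and V^{called,ti} have already been computed for every path at time ti; the function name and the toy numbers are hypothetical.

```python
def mc_call_level(stock_prices, p_e, v_called):
    """Smallest simulated price S^j at time t_i with P^e(S^j) >= V_called(S^j).

    All three arguments hold one entry per simulated path.
    Returns None if the inequality holds on no path.
    """
    candidates = [s for s, pe, vc in zip(stock_prices, p_e, v_called) if pe >= vc]
    return min(candidates) if candidates else None


# Toy numbers for illustration only.
prices = [95.0, 105.0, 118.0, 125.0, 140.0]
p_e = [118.0, 121.0, 127.5, 133.0, 146.0]
v_called = [119.0, 122.0, 127.0, 132.0, 145.0]
print(mc_call_level(prices, p_e, v_called))
```

Since a minimum reacts to a single path on which the inequality flips, small errors in either quantity can move the estimate noticeably when the two curves are close, which may contribute to the noise seen in the Monte Carlo estimate.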
In Figure 4.4, the left graph depicts the PDE estimate of the optimal stock price level for a call
S∗ for two convertibles without notice period and different call prices. The value of the optimal
stock price level equals the clean call price plus the accrued interest, as derived in [65]. The right
graph in Figure 4.4 presents the optimal stock price level S ∗ for the same convertible but including
call protection by a one month notice. Now, the optimal call strategy is more complex: there is
a large drop in S ∗ just before a coupon payment is within the notice period and a jump as soon
as the coupon payment is within the notice period. A closer look at this phenomenon shows that
this is a result of the “screw clause”: The holder will not receive the accrued interest if he chooses
to convert into shares, but he can still receive a coupon if the payment date is within the notice
period. An issuer’s call just before the coupon falls within the notice period avoids this situation.
An interesting property of the optimal call strategy (Figure 4.4) is that it does not seem to be
optimal to call after the last coupon before maturity is paid. This is because the issuer is trying to
minimize the value of the convertible. Consequently, the issuer tries to avoid the situation where
the holder gets a coupon plus the opportunity to convert into shares. Thus, the value for S ∗ is
relatively low just before a coupon payment takes place. But at maturity, the holder gets either
the face value plus the last coupon or κ shares and no coupon. So, there is no need for the issuer
to call because the holder cannot get both.
From Figure 4.4, we can see that in the case of a notice period, the optimal S ∗ at which the
issuer should call the convertible is most of the time higher than in case of no notice period with
Bcl = 150. But, with Bcl = 110, S ∗ is most of the time lower than in case of no notice period. This
is surprising since it means that in general, the delayed call observed in the real market cannot be
explained by a call notice period.
Implications of Different Call Strategies
It is interesting to examine the effect of the call policies on the convertible bond value. In Figure 4.5
on the left, we can see the effect of different notice periods on the value of the convertible. These
results are all obtained using our accurate PDE method (AFV model). The premium for a notice
period varies over the stock price S with a maximum between 85 and 95. As predicted, the
premium is larger for a longer notice period. The premium for a typical notice period with 30
days is about 0.55, a significant addition. The reason for this is that the issuer's interest is to
minimize the value of the convertible, and the notice period makes this more difficult for her. Thus,
the price of the convertible rises when a call notice period is introduced.
The right graph in Figure 4.5 shows the gain in value holding the notice period fixed at one
month notice and varying the clean call prices Bcl . A higher call price leads to higher gains in
value due to the call notice period.
Another interesting subject is the effect of suboptimal call policies, especially the delayed call
phenomenon. Assuming that issuers call their convertibles late, what is the effect on the value?
Consider the following strategy. The issuer calls only if it is beneficial for him to call, but he will
not call until the stock price level S ∗ is reached. This seems to be a realistic assumption, because
the issuers tend to call their convertible bonds late.
[Figure 4.5: two panels plotting the gain ∆V against the asset value S: "constant call price", V(S, Tn | Bc = 110) − V(S | Tn = 0, Bc = 110), with curves Tn = 0.5/12, 1/12, 2/12, 3/12; and "constant notice time", V(S, Bc | Tn = 1/12) − V(S | Tn = 0, Bc = 110), with curves Bcl = 110, 120, 130, 140.]
Figure 4.5: The impact of notice periods on the initial value of the convertible bond. The difference
in value compared to a convertible with Tn = 0 is shown across the stock price (t = 0). The graph
on the left shows the effect of different notice periods with constant call price Bc = 110, while
the graph on the right shows the effect of different call prices and a constant notice period of
Tn = 1/12. All values are computed using the AFV model with data in Table 4.1.
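This last calling strategy is implemented by altering the model for valuation with notice periods. The Inequalities (4.13)-(4.16) become for St < S∗
\[
\begin{aligned}
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) &\ge 0 \\
V(S_t, t) &\ge \max(B_p(S_t, t), \kappa S_t)
\end{aligned}
\]
and for St ≥ S∗
\[
\begin{aligned}
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) &\ge 0 \\
V(S_t, t) &\ge \max(B_p(S_t, t), \kappa S_t) \\
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) &\le 0 \\
V(S_t, t) &\le V^{called,t}(S_t, t),
\end{aligned}
\]
where at least one of the inequalities holds with equality on the complete solution.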
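The effect of restricting the issuer's constraint to St ≥ S∗ can be illustrated by an explicit projection of the node values after a time step. This is a simplified sketch under our own assumptions: the function name and toy data are hypothetical, the holder's constraint is given precedence over the issuer's cap, and a production solver may enforce the constraints differently (for example implicitly within the time step).

```python
def apply_constraints(s_grid, v, v_called, b_put, kappa, s_star):
    """Project node values onto the constraints of the delayed-call strategy.

    Below s_star only the holder's constraint V >= max(B_p, kappa * S) is enforced.
    At or above s_star the issuer's cap V <= V_called is applied first.
    """
    out = []
    for s, vj, vcj, bp in zip(s_grid, v, v_called, b_put):
        if s >= s_star:
            vj = min(vj, vcj)        # issuer may call: value capped by V_called
        vj = max(vj, bp, kappa * s)  # holder may put or convert
        out.append(vj)
    return out


# Toy data for illustration only.
s_grid = [100.0, 130.0, 160.0]
v = [125.0, 150.0, 175.0]
v_called = [120.0, 140.0, 165.0]
b_put = [105.0, 105.0, 105.0]
print(apply_constraints(s_grid, v, v_called, b_put, kappa=1.0, s_star=150.0))
print(apply_constraints(s_grid, v, v_called, b_put, kappa=1.0, s_star=120.0))
```

Raising s_star leaves more nodes uncapped, so the projected values can only stay the same or increase, in line with the premium for late calling discussed in the text.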
The impact of this new call strategy is presented in Figure 4.6 in the graph on the left. The
difference in value compared with the optimal strategy (for a range of S ∗ = {120, 130, 140, 150})
for a convertible bond from Table 4.1 is shown over the stock price. This premium rises sharply
for higher values S ∗ . But, for S ∗ = 150, we have a maximum impact on the value of about 7.00.
[Figure 4.6: two panels plotting the gain ∆V = V(S, S∗ | Bc = 110, Tn = 1/12) − V(S | Bc = 110, Tn = 1/12) against the asset value St0: "late calling" with curves S∗ > 120, 130, 140, 150, and "call at S∗(t) = Bc(t)" with curves Bcl = 110, 120, 130, 140.]
Figure 4.6: The impact of suboptimal calling. The value of a convertible with data in Table 4.1
and optimal call strategy is compared to suboptimal call strategies. A strategy which calls at
higher than optimal values results in an increase of the security value presented in the left graph.
The right graph depicts the difference in value of the convertible called as if there were no notice
compared to the optimal call strategy.
That means that the optimality of the issuer’s behavior also has a significant impact on the value
of a convertible bond.
We now consider a second, somewhat realistic scenario. Suppose the issuer uses an approximate method to determine the optimal call policy which we denote S ∗ . This strategy can be
modeled by replacing Inequalities (4.13)-(4.16) by
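\[
\begin{aligned}
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) &\ge 0 \\
V(S_t, t) &\ge \max(B_p(S_t, t), \kappa S_t)
\end{aligned}
\]
for St < St∗ and by
\[
\begin{aligned}
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) &\ge 0 \\
V(S_t, t) &\ge \max(B_p(S_t, t), \kappa S_t) \\
\frac{\partial V_t}{\partial t} + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 V_t}{\partial S_t^2} + (r + p\eta) S_t \frac{\partial V_t}{\partial S_t} - (r + p) V_t + p \kappa S_t (1 - \eta) &\le 0 \\
V(S_t, t) &= V^{called,t}(S_t, t)
\end{aligned}
\tag{4.30}
\]
for St ≥ St∗, where at least one of the inequalities holds with equality on the complete solution.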
Since this strategy is suboptimal, all values computed using this set of Equations will be larger
than values obtained with the optimal method (Inequalities (4.13)-(4.16)). This makes the resulting premium a good measure of the error of approximating the optimal call strategy.
The graph on the right in Figure 4.6 shows the premium (compared to the optimal strategy)
at t = 0 due to a call strategy which ignores the call notice period. More specifically, we set St∗ =
Bc (St , t), which is optimal for a notice period of zero (cp. Equation (4.27)). One can see that this
policy has only a slight effect on the value (varying from 0.04 to 0.36). In other words, computing
the optimal value S ∗ for calling by ignoring the call notice period can be a good approximation of
the optimal call (but note that this assumes taking into account that the holder receives an option
worth V called upon call notice). This depends on the set of parameters. Using the Ingersoll call
policy in case of dividend payments can add significant value.
4.5.4 Moving Window and Call Notice Protection
We now want to assess the effects of a moving window trigger as soft call protection. More
precisely, the underlying asset has to stay M = 20 out of the last N = 30 days above a trigger
level before a call notice can be issued. In our example, we set the trigger level to the clean call
price, which is often seen in convertible bond contracts. This trigger level constraint will increase
the value of the convertible. Furthermore, the issuer must give 30 days notice before the bond is
called.
A mathematical formulation of the trigger constraint can be found in Equation (4.12). We use
a discrete formulation since discrete observation is more realistic in convertible bond contracts.
As already mentioned, since no similar pricing procedure can be done using the PDE method,
we rely on our Monte Carlo estimates.
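For a single simulated path with discrete daily observations, the M out of N trigger condition can be evaluated with a running count over the window. The sketch below is illustrative; the function name is our own, and it assumes that "above" means strictly above the trigger level, that the current day counts as part of the window, and that no call notice is possible before N observations are available.

```python
def call_allowed(path, trigger, m=20, n=30):
    """For each day, True if at least m of the last n observations exceed trigger.

    The window includes the current day. Days with fewer than n observations
    so far are treated as not callable (an assumption of this sketch).
    """
    above = [1 if s > trigger else 0 for s in path]
    allowed = []
    window_count = 0
    for i, flag in enumerate(above):
        window_count += flag
        if i >= n:
            window_count -= above[i - n]  # drop the observation leaving the window
        allowed.append(i >= n - 1 and window_count >= m)
    return allowed


# Toy path: 10 days below the trigger of 110, then 25 days above it.
path = [100.0] * 10 + [120.0] * 25
flags = call_allowed(path, trigger=110.0)
print(flags.index(True))  # index of the first day on which a call notice is possible
```

In a Least-Squares Monte Carlo setting, such a flag would be computed for every path and time step and used to decide on which paths a call notice may be issued at all.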
Influence of Moving Window Constraints
Numerical experiments of the PDE model show that in our example (no dividends), a conversion
of the convertible is not optimal for a holder at any time, even in the presence of the default model.
Consequently, we use the approximation that only call stopping times are computed for the
Monte Carlo valuations in this section; conversion is ignored. This allows us to interpret the average value
of many Monte Carlo estimates as an estimate for an upper bound of the true price.
Table 4.5 shows the results for various choices of basis functions in the Least-Squares Monte
Carlo regression. The dimension of the basis refers to the number of observations taken into
account in the window of historic observations. The correct method requires a 30 dimensional
basis. The L in Table 4.5 refers to the level of the sparse basis, as in Equation (1.14), which results
in m (# basis) different basis functions.
In Table 4.5 we can see the values for a different number of observations in the moving window. Note: value is the average of #sims simulations with #paths asset paths each. The confidence
level is 2 std(V)/√#sims, where V is the set of values of the simulations with #sims elements, which corresponds to over 95% probability that the correct value lies within the reported interval.3 The
presented values are the best values of a large series of computations. It turns out that an increase
in the sparseness level L (i.e. the #basis) does not always lead to lower upper bound estimates.
This is a result of numerical difficulties (poor condition of the matrix with basis function values)
and insufficiently many simulations for a good approximation of the conditional expected continuation value. The rows where the trigger is set to ”-”, contain the values of the example without
a trigger condition. All other rows contain the values for the example with the constraint that the
underlying stock has to be 20 out of the last 30 days above the trigger price before a call notice
can be issued.
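The reported value and confidence level follow directly from the set of simulation results. A minimal sketch is given below; the function name and the toy values are hypothetical, and the use of the sample standard deviation (rather than the population version) is an assumption.

```python
import math
import statistics


def mean_and_half_width(values, z=2.0):
    """Average of repeated Monte Carlo valuations and z * std(V) / sqrt(#sims)."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (an assumption)
    return mean, z * std / math.sqrt(len(values))


# Toy values standing in for the results of individual simulations.
sims = [123.71, 123.66, 123.70, 123.68, 123.72, 123.67]
value, confidence = mean_and_half_width(sims)
print(f"{value:.2f} +- {confidence:.2f}")
```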
The values in Table 4.5 suggest that the estimation of the continuation value based only on the
current asset price is already extremely good. That is, we include the complete trigger condition in order to decide which paths allow a call notice. But then, we use only the current stock
price of these paths for the computation of the expected continuation value. This obviously introduces a
systematic error. But the approximation is cheap and leads to good estimates: the value is very
close (for Bcl = 110) to the estimate of the full 30 dimensional basis functions, 123.69 ± 0.01. This
is also true for a clean call price of Bcl = 140.
Consequently, it appears that a one dimensional basis function can already provide good estimates for the continuation value of a convertible bond with a 20 out of 30 day moving window
trigger price protection. Furthermore, the value added for the moving window call protection,
compared with an unprotected convertible (0.79 for Bcl = 110 and 0.38 for Bcl = 140) can be a
significant effect.
At first sight, it is difficult to understand why such a comparatively poor approximation (only
considering a basis using the current asset price) should yield such a good solution, when, at least
in theory, the value of the bond depends on the past 30 day history. However, note that in the
Monte Carlo simulation, calls cannot be issued along a particular path, unless the asset has been
above the trigger for 20 out of the past 30 days. Consequently, it appears that much of the path
dependency has already been taken into account, and there is little error introduced when we use
only a low-dimensional basis. In our example, the required CPU time for one evaluation of the
one dimensional approximation is only one tenth of the time required for the thirty dimensional
problem.
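The low-dimensional approximation can be sketched as a regression that uses the path history only to select the paths, and the current asset price only as regressor. The code below is a simplified illustration under our own assumptions: plain monomials stand in for the sparse basis polynomials, the normal equations are solved directly, prices are assumed to be scaled to order one to keep the system well conditioned, and all names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_continuation(s_now, cont_values, eligible, degree=2):
    """Least-squares fit of continuation values on powers of the current price,
    using only paths on which the trigger condition currently allows a call.

    Needs more than `degree` distinct eligible prices, otherwise the system is singular.
    """
    xs = [s for s, ok in zip(s_now, eligible) if ok]
    ys = [y for y, ok in zip(cont_values, eligible) if ok]
    k = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve_linear_system(ata, aty)


# Toy data: scaled prices, continuation values, and the eligibility flag per path.
s_now = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
cont = [2.0 + 3.0 * s for s in s_now[:5]] + [999.0]  # last path is not eligible
eligible = [True, True, True, True, True, False]
coeffs = fit_continuation(s_now, cont, eligible, degree=1)
print([round(c, 6) for c in coeffs])
```

The ineligible path does not enter the fit at all, which mirrors the observation that much of the path dependency is already captured by the selection of paths.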
3 See Table 1.2 for details.
Table 4.5 Average Monte Carlo estimates of the value of a convertible that is callable only when S has been above
Btrigger = Bcl = 110 or Btrigger = Bcl = 140 on 20 out of the last 30 days, compared to one without
this trigger condition. The data of the convertible is given in Table 4.1, the average Monte Carlo
values lie with a probability of about 68% within the reported confidence.
trigger  Bcl  observations in basis          value    confidence  dim  L  #paths n  #sims I  #basis m
-        110  St                             122.91   +- 0.01     1    3  10^6      30       15
110      110  St                             123.69   +- 0.01     1    3  10^6      20       15
110      110  St, St−29                      123.73   +- 0.02     2    2  10^5      250      17
110      110  St, St−5, ..., St−25, St−29    123.72   +- 0.02     7    1  10^5      250      15
110      110  St, St−5, ..., St−25, St−29    123.69   +- 0.02     7    2  3 · 10^5  15       361
110      110  St, St−1, ..., St−29           123.69   +- 0.01     30   1  10^6      30       61
-        140  St                             129.26   +- 0.02     1    2  10^5      250      7
140      140  St                             129.64   +- 0.02     1    2  10^5      250      7
140      140  St, St−29                      129.64   +- 0.02     2    2  10^5      250      17
140      140  St, St−5, ..., St−25, St−29    129.67   +- 0.02     7    1  10^5      250      15
4.6 Summary
Convertible bonds are a popular financial instrument with complex behavior. The notice period
which prevents the issuer from an immediate call for conversion has a significant impact on the
theoretical value of a convertible bond and the optimal call strategy of the issuer.
In this chapter, we compare both PDE and Monte Carlo approaches to pricing convertible
bonds with complex call features. In particular, we examine calls with notice periods, and moving
window call protection, whereby calls cannot be issued unless the underlying asset is observed
above a trigger level for m out of the last n days.
Various authors have analyzed the delayed call phenomenon. Evidence suggests that issuers
wait to call their convertibles until the stock price is well above its optimal level. If we assume
such a delayed call, we of course find that the value of the convertible is larger than the convertible
without a notice period. For example if the convertible is called 10% above the optimal value, with
a notice period of 30 days, the value of the convertible increases by about 1% compared with the
optimal strategy. A notice period of 30 days, assuming optimal issuer behavior, adds about 0.5%
to the value compared to a bond with no notice period.
Some authors argue that the introduction of a notice period results in a higher stock price
level which is optimal for the issuer to call. We find that the call price and the schedule of coupon
payments have a significant effect on this stock price level. In general, the optimal stock price is
higher than the call price for convertibles with notice periods, but in some cases, it is lower. Just
before a coupon payment falls within the notice period, an optimal call by the issuer can be at a
considerably lower stock price than the call price.
For the case of convertible bonds with simple notice periods, we find that a Least-Squares
Monte Carlo approach gives quite good solutions compared with an accurate PDE solution. These
results are consistent with those reported in [84]; we extend their results to continuous call and
conversion. We find that the PDE approach is computationally much more efficient and that it is very
expensive to get cent-accurate Monte Carlo estimates.
However, it is not feasible to value convertibles with the moving window soft call protection
using a PDE method. Our Monte Carlo method is based on least-squares regressions on sparse
basis functions. In principle, we need a large dimensional basis set to take into account the path
dependency of this contract feature. However, it appears that a very low order approximation
(i.e. a basis using only the current asset price) yields a very good solution, compared with a
full dimensional sparse basis. This is very fortunate, since the computational cost of using a full
dimensional basis (even for the case of a sparse basis) is very high.
Chapter 5
Simulation-Based Hedging and
Incomplete Markets
5.1 Overview
The previous chapters demonstrated a few applications of regressions for a complete market
which follows the Black-Scholes assumptions. The presented methods can increase the speed
of Monte Carlo pricing significantly and even allow the evaluation of options for which pricing was
not possible before. By applying regressions to incomplete markets, which do not follow the Black-Scholes assumptions, this chapter goes a large step further than the previous ones.
Even though the axiom that financial markets do not allow arbitrage and the axiom that they
are complete have led to various breakthroughs in the previous decades, this chapter departs from this
line of derivatives research by only assuming that a real-world model for the underlying hedge
instrument exists. The optimal hedging strategies are computed based on statistical properties of
the market, and prices are obtained by the computation of all cost components of the derivatives
hedge including the cost of risk. This allows Monte Carlo pricing to be applied to many more realistic
market scenarios than the ones based on the Black-Scholes assumptions.
It turns out that this approach can be seen as a powerful extension to common pricing frameworks such as Least-Squares Monte Carlo. It also has desirable properties, such as higher
accuracy for complete markets than comparable Least-Squares Monte Carlo methods and more
realistic prices in incomplete markets.
5.2 Introduction
Many extensions to the famous Black-Scholes model have been published, including local volatility surfaces1, stochastic volatility2, jumps3 and transaction costs4. These models often still assume
that markets are arbitrage-free and in some cases even complete. For incomplete markets,
valuation can be conducted in many different ways, since the corresponding so-called equivalent
martingale measure is not unique5, unlike in the complete market case. Solutions to the option
pricing problem in incomplete markets include6 risk minimization [46, 95] (especially variance
minimization [48, 47, 104, 33]), utility maximization [68, 54, 91] and martingale methods [51, 75].
In this chapter we will present a simple numerical framework which can provide numerical solutions to many of the complete and incomplete market models. We propose simple properties
which we think an option price should satisfy. The new setting we propose is similar to the one
presented by Schweizer [104]. His work provides valuable structural results, which correspond
to our findings. But, our work extends these structural results to an efficient numerical valuation
technique based on Monte Carlo simulation.
We present a method which is especially designed for, but not limited to, the pricing and hedging of OTC (Over-The-Counter) options which are not liquidly traded in the market. The new
method is based on the view of the hedging issuer of a derivative and overcomes many of
the deficiencies of other methods. Additionally, the framework fits seamlessly into the risk management already existent in banks. The numerical implementation is very general and can be easily
extended to all kinds of options and hedging scenarios.
We call the algorithm of this chapter Simulation-Based Hedging. It will compute optimal portfolios explicitly in order to obtain prices which have a foundation on a hedging strategy in the
physical or real-world measure. Especially in illiquid markets, where no option prices can be fitted to traded prices, the complete market models suffer from systematic errors. In these situations
the Simulation-Based Hedging can rely on econometric models for the underlying which capture
the real-world dynamics in order to compute realistic prices.
In order to obtain a versatile method which can be used to price a large multitude of options, such as path-dependent, exercisable, or callable options, we present a framework which can
be seen as an extension to the Least-Squares Monte Carlo by Carrière [32] and by Longstaff and
Schwartz [81]. The underlying principles for the valuation of exotic options can easily be adapted
to our algorithm.
1 Local volatility models were introduced as implied trees by Dupire [41] as well as Derman and Kani [39]. Further
references can be found in Wilmott [117, p. 357ff].
2 Introduced by Hull and White in 1987 [63], stochastic volatility models became especially popular after closed-form
solutions were published in 1993 by Heston [61].
3 Option pricing when the underlying jumps was first studied in 1976 by Merton [88].
4 In 1985, option pricing with transaction costs was studied by Leland [79], who introduced proportional transaction
costs to the Black-Scholes option pricing framework. A summary of references can be found in [117, p.353].
5 In an arbitrage-free market, a trader can only earn more than the risk-free rate with a risky investment. These and
other concepts are presented by many authors, e.g. Zagst [120], Panjer [92] and Pliska [94].
6 An overview of the different methods can be found in [90, p.99ff, p.252ff] as well as [34].
Simulation-Based Hedging is similar to mean-variance option pricing in incomplete markets,
which corresponds to the maximization of a quadratic utility function [104]. But the method of
this chapter adjusts the prices to account for the remaining risk of the hedged position, which
delivers prices a bank could directly trade on. A numerical method which is similar to
Simulation-Based Hedging is the Hedged Monte Carlo presented by Potters et al. [96] and by
Pochart and Bouchaud [95]. Potters' method did not find wide acceptance due to its convergence
properties: it is challenging to find the right parameter set which delivers accurate option prices
because they compute their prices directly by regressions. Instead, we perform regressions to
compute the optimal hedge only. This preserves the convergence to the correct values and allows
for a higher accuracy since the option prices do not rely much on the specific set of basis functions
used. Other related methods are proposed by Ryabchenko et al. [101] with a global quadratic optimization as well as Luenberger [82], who unifies a dynamic version of CAPM and Black-Scholes
in a continuous time setting.
The remainder of this chapter is structured as follows: the next section will introduce the setting we want to study including a definition of the objective of the trader and the risk management
of the bank. A section follows, which presents the optimal solution to the setting. Implementing
the optimal solution in a numerical algorithm, a section explains details for the practical pricing
and hedging estimation. Then, we discuss possible extensions illustrated using the example of a
GARCH process for the return distribution of the option’s underlying. Finally, we conclude with
a summary of the results.
5.3 Derivation
5.3.1 Basic Requirements for a Pricing Method
Before we proceed with the derivation of a pricing method for options, we want to see what the
objectives for a derivatives price are.
The motivation for a new pricing method lies mainly in the observation that a large OTC
market exists where positions of exotic options are traded. Some of these exotics are rarely traded
so that a market price is not observable. Others might be traded in higher volume, but the ask-bid spread is substantial. Even the ask prices of different sellers for such an option
might vary considerably because the market participants use different models and
assumptions, since the Black-Scholes prices are not sufficient. The models used instead usually try
to present a fair market value plus some spread the bank wants to earn. But it is not even clear
whether this fair market value covers the cost of the hedging strategy.
The reason why the derivatives prices based on strategies like maximal utility, minimal risk
and market price of risk cannot ensure that the price is at least on average greater than the cost of
the hedging strategy lies in the treatment of risk. On the one hand, the option is priced by a utility
function of the bank - whatever that is - or a market price of risk. On the other hand, the bank has
to pay for the risk of her portfolio by keeping money in a margin account. This margin account
will have to cover losses occurring and has to be supported by the bank’s own funds, which have
to earn the equity return.
In order to develop a pricing tool, which does not suffer from these deficits, we will deduce
four fundamental properties we require for the prices at which a hedging derivative’s issuer
should trade.
Property 5.1 A method which delivers the price of a derivative also delivers a corresponding hedging
strategy, which an issuer can follow.
Property 5.2 The price of a derivative covers at least on average all cost components which occur at the
issuer.
Property 5.3 The hedging strategy reduces the risk of the issuer.
Property 5.4 Any realistic, physical market model is allowed in the pricing method.
The first property is clearly a matter of realism: the issuer of an option desires to replicate the option using a hedging strategy. That means Property 5.1 has to be fulfilled, otherwise the issuer has no information about what her hedging strategy looks like.
Property 5.2 denotes a second principle, which is clearly a requirement of a bank. The average cost of a derivatives trade must be estimated, otherwise one could not control the company's earnings accurately.
The third property seems obvious. The important issue of Property 5.3 is that a hedging strategy should not be designed for maximizing profits or utilities. It should be designed to minimize the required capital in the margin account, since the bank's own funds are limited and thus expensive.
The last property, Property 5.4, is a requirement such that the issuer can tie the derivative's price to as much information as possible. The issuer has specific knowledge about the underlying and its market, and she should be able to make it available to the pricing method.
5.3.2 Hedging and Pricing of a Liquidly Traded Security
Using Properties 5.1-5.4, we will deduce pricing procedures which allow the computation of prices one could trade on in a real market. We start with a derivative which is already traded in the market with a small ask-bid spread. From Properties 5.1-5.4 it is immediately clear what the method for the pricing of such a liquidly traded derivative is: the issuer sells the derivative at a price $V^{\mathrm{ask}}$ with
\[ V^{\mathrm{ask}} = V^{\mathrm{ask,\,market}} + C, \]
where $V^{\mathrm{ask,\,market}}$ is the market ask price of the option and $C$ is the residual transaction cost plus a spread the issuer wants to earn. Now, the issuer's hedging strategy is to buy the derivative at the market for $V^{\mathrm{ask,\,market}}$.
However, the method for pricing of an illiquid derivative is not obvious. In the next sections,
we will see that pricing and hedging are still possible.
5.3.3 Setting for an Illiquid Market
Based on the requirements of Properties 5.1-5.4, we proceed by assembling the parts of a real-world pricing method for illiquid derivatives. We restrict our first setting to a derivative with a value $V$ and a payoff function depending on a single underlying. This is extended in Section 5.4.6 to an example of a hedge with two hedge instruments. The terminal value of the derivative is denoted by $V_{t_T} = P(S_{t_T})$ with payoff $P$, which depends only on the asset price at maturity time $t_T$. A generalization to a payoff which depends on $S := \{S_\tau \mid \tau \in I\}$, $I \subseteq \{t_0, t_1, \ldots, t_T\}$, i.e. on the whole path history, is straightforward. We consider an issuer who sells this derivative and holds no other position in the market.
Property 5.1 requires a discrete-time model, because a hedging party can only buy and sell the underlying at discrete times in order to follow a hedging strategy. Note that this does not prevent the computation of the option's payoff $P(S_{t_T})$, because it may only depend on discrete samples $S_{t_i}$ of the underlying. Consequently, we employ a discrete-time market model, where
\[ S_{t_{i+1}} = f_{t_{i+1}}(S_{t_i}), \qquad (5.1) \]
with the underlying asset $S_{t_i}$ at time $t_i$ and the transition function $f_{t_{i+1}}(S_{t_i})$, which describes the change of $S$ from time $t_i$ to time $t_{i+1}$. The transition function $f_{t_{i+1}}$ is arbitrary with a continuous probability density, which allows a large variety of markets to be modelled. This function can easily be extended to account for state variables such as multiple underlyings or stochastic volatility.
Suppose that the derivative is not actively traded in the market, so that the position in the derivative cannot easily be closed by the issuer. That means the issuer's risk management has to account for the risk over the whole lifetime of the derivative. A hedging strategy reducing the risk involved in holding the short position in the derivative will be initiated, and the question arises how much the issuer should charge for the derivative. It is clear that the issuer should charge at least all costs occurring in the derivatives trade (Property 5.2), i.e. she should charge the expected cost of the hedge portfolio, the transaction costs plus the capital costs of the margin account.
To determine the expected cost of the hedge portfolio, we go through the hedging process backwards in time, starting at maturity.
In this setting, we compute the value of a portfolio $\Pi$ which compensates exactly for the cost occurring in the hedge of the derivative. We will restrict ourselves here to a simple portfolio
\[ \Pi_{t_i} = B_{t_i} + \phi_{t_i} S_{t_i} \qquad (5.2) \]
with a bank account value $B_{t_i}$ and a position $\phi_{t_i}$ in the option's underlying $S_{t_i}$.
The exact quantity of money required in the bank account $B$ and thus the portfolio value $\Pi$ are unknown at the initialization time $t_0$ of the derivatives trade. Thus, distributions for $B_{t_0}$ and $\Pi_{t_0}$ will be computed. The exact values of $\Pi$ and $B$ are known only at the maturity time $t_T$ of the option. This makes the processes $\Pi$ and $B$ measurable at $t_T$. However, since the hedging strategy should be feasible in a real environment (Property 5.1), the hedging strategy itself must be known (measurable) at each time step $t_i$.
Focussing on the bank account $B$, the issuer has to pay the payoff of the derivative at maturity, i.e. the trader sells the hedge,
\[ \phi_{t_T} = 0, \]
and the bank account compensates for the payoff paid to the option holder,
\[ B_{t_T} = V_{t_T}. \]
At this time (maturity), $\Pi$ is measurable:
\[ \Pi_{t_T} = B_{t_T} + \phi_{t_T} S_{t_T} = V_{t_T}. \]
In order to obtain a hedging strategy for the whole lifetime of the option, we proceed by induction from $t_{i+1}$ to $t_i$. The bank account at time $t_i$ should be able to compensate for the money required at time $t_{i+1}$, i.e.
\[ e^{r(t_{i+1}-t_i)} B_{t_i} + \phi_{t_i} S_{t_{i+1}} = B_{t_{i+1}} + \phi_{t_{i+1}} S_{t_{i+1}} \qquad (5.3) \]
\[ \Leftrightarrow \quad B_{t_i} = e^{-r(t_{i+1}-t_i)} \left( B_{t_{i+1}} + (\phi_{t_{i+1}} - \phi_{t_i}) S_{t_{i+1}} \right) \qquad (5.4) \]
where $(\phi_{t_{i+1}} - \phi_{t_i}) S_{t_{i+1}}$ denotes the profit from the position in the underlying and $r$ is the interest rate the option's issuer receives or pays on a bank account. That means $B_{t_i}$ is only measurable at maturity time $t_i = t_T$. Furthermore, the equation for $B$ describes a self-financing hedging strategy$^{7}$. We can use Equation (5.4) to derive the value of the hedge portfolio
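\[
\begin{aligned}
\Pi_{t_i} &= B_{t_i} + \phi_{t_i} S_{t_i} \\
&= e^{-r(t_{i+1}-t_i)} \left( B_{t_{i+1}} + (\phi_{t_{i+1}} - \phi_{t_i}) S_{t_{i+1}} \right) + \phi_{t_i} S_{t_i} \\
&= e^{-r(t_{i+1}-t_i)} \left( \Pi_{t_{i+1}} - \phi_{t_i} \left( S_{t_{i+1}} - e^{r(t_{i+1}-t_i)} S_{t_i} \right) \right), \qquad (5.5)
\end{aligned}
\]
which is helpful in formulating the numerical evaluation in the next section. Again, $\Pi_{t_i}$ is only measurable at maturity time $t_i = t_T$. Finally, the expected cost of the hedging strategy is given by
\[ V_{t_0} = E[\Pi_{t_0}]. \qquad (5.6) \]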
7 A self-financing hedging strategy is a trading strategy which does not require external cash-flows hedging the option
(see Zagst [120, p.52] for a more formal definition): All money required is available in the bank account. Note that
in contrast to Schweizer [104] neither the bank account value nor the portfolio value are measurable at time ti < tT .
Schweizer’s framework contains a measurable portfolio, which is mean self-financing. That means the expected cost of
hedging the option is met by the portfolio value.
Again, it is important to note that the hedge portfolio in Equation (5.5) is not deterministic at time $t_0$, because the bank account $B_{t_0}$ denotes the money required to compensate for the hedging costs including the hedging error. Since the exact path of the underlying is not known at any time $t_i$, $i < T$, the required amount in the bank account is stochastic.
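The backward recursion of Equation (5.5) can be illustrated for a single path. The following is only a minimal sketch: the function name `hedge_portfolio_path`, the equidistant time step `dt` and the example hedge positions are assumptions made for illustration and are not taken from the thesis.

```python
import math

def hedge_portfolio_path(S, phi, payoff, r, dt):
    """Pathwise portfolio values Pi_{t_0}, ..., Pi_{t_T} from Equation (5.5).

    S      : asset prices S_{t_0}, ..., S_{t_T} along one path
    phi    : hedge positions phi_{t_0}, ..., phi_{t_{T-1}} (assumed given)
    payoff : function P(S_{t_T})
    r, dt  : interest rate and (assumed equidistant) time step
    """
    T = len(S) - 1
    assert len(phi) == T, "one hedge position per rebalancing date"
    pi = [0.0] * (T + 1)
    pi[T] = payoff(S[T])                 # Pi_{t_T} = V_{t_T} = P(S_{t_T})
    growth = math.exp(r * dt)
    for i in range(T - 1, -1, -1):       # step backwards in time
        pi[i] = (pi[i + 1] - phi[i] * (S[i + 1] - growth * S[i])) / growth
    return pi

# One path with r = 0 and a call payoff struck at 95 (illustrative numbers).
call = lambda s: max(s - 95.0, 0.0)
pi = hedge_portfolio_path([100.0, 110.0, 100.0], [0.5, 0.75], call, r=0.0, dt=1.0)
```

Each path yields one sample of $\Pi_{t_0}$; averaging these samples over many paths gives an estimate of $V_{t_0}$ in Equation (5.6).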
In order to determine which trading strategy $\phi_{t_i}$, $i \in \{0, \ldots, T\}$, the issuer should follow, we have to consider the objectives of the issuer.
That means the crucial part for the determination of the hedge is the choice of the objective function. The objective of the hedging party of a derivatives trade is to minimize the risk incurred in the derivative (Property 5.3). The risk can be measured in many ways, which lead to different strategies.$^{8}$
Hedging Strategies
There are several objectives of an issuer which we have to consider in order to determine a specific hedging strategy. The issuer is usually a bank which consists of a derivatives trading department and an investment department. While the investment department actively takes risks, the derivatives trading department should not take risks. That means that the market models of the derivatives department are not made for investment strategies.
Consequently, the objective of a derivatives hedge should be to minimize a symmetric risk measure. Most other strategies demand that the drift
\[ \mu = \frac{1}{t_{i+1} - t_i} \cdot \log \frac{E\left[ f_{t_{i+1}}(S_{t_i}) \right]}{S_{t_i}} \]
of the underlying is correctly identified, which is extremely difficult, as this knowledge is usually not available to the derivatives department. If it were, one would still not consider over-investing (or under-investing) in an underlying to make a profit on the investment: the investment department already has an optimal portfolio allocation, and usually such an over-investment in the derivative's underlying does not result in a better portfolio allocation. Consequently, a strategy is required which does not depend much on the underlying's drift. The traditional option pricing model of Black-Scholes perfectly fulfils this property, since the drift $\mu$ does not appear in the pricing equations. In other models it is desirable that the more often a hedge is conducted, the smaller the influence of $\mu$ on the option's price should be. Later, we will see that variance minimization has this property.
Variance is a very simple risk measure, which corresponds to a quadratic utility of the issuer. Some higher-order minimization strategies are also independent of the drift rate $\mu$. However, they are not only harder to evaluate, but one also loses the linear superposition of optimal hedges for individual options in a trading book.$^{9}$
8 It is worth noting that the risk should be measured in the physical measure and not in a risk-neutral way, otherwise
the connection to the real-world trader would be lost.
9 A strategy minimizing the fourth moment of the error and the resulting properties is described by Selmi and
Bouchaud [105].
Another property to look at is the optimization procedure itself. Consider an issuer who has already followed a risk-minimizing strategy for some time. During that time, she lost some money due to hedging errors. Should this issuer change the hedging portfolio in order to compensate for the errors? The answer is no, because the same arguments as before apply: the derivatives department should not invest in order to make gains on the position in the underlying but in order to minimize future risks. Consequently, only future risk should be hedged, which means a hedge should be local in time, not global$^{10}$.
Quadratic Hedging
Before we actually consider the variance minimization strategies of this work, we want to focus on
related work in the literature: A setting similar to the setting of this thesis is the quadratic hedging
or local variance minimization as described by Föllmer and Schweizer [47] and Schweizer [104].11
We provide a brief summary of their framework using a notation similar to the previous section.
We have an option value $V$ with an underlying asset $S$. At maturity time $t_T$, the option has the same value as the payoff $P(S_{t_T})$,
\[ V_{t_T} = P(S_{t_T}). \]
For technical reasons, this local variance minimization is defined with a risk-free interest rate of $r = 0$ and with the process $S$ being a martingale, i.e. $S_{t_i} = E[S_{t_{i+1}} \mid S_{t_i}, t_i]$. Furthermore, given a hedging strategy $\phi = \phi_{t_0}, \ldots, \phi_{t_T}$, we define the function $\tilde{\Pi}_{t_T}$ as
\[ \tilde{\Pi}_{t_T} := P(S_{t_T}) - \sum_{i=0}^{T-1} \phi_{t_i} (S_{t_{i+1}} - S_{t_i}), \]
i.e. it has the value of the payoff $P(S_{t_T})$ minus the so-called gains process $\sum_{i=0}^{T-1} \phi_{t_i} (S_{t_{i+1}} - S_{t_i})$ from time $t_0$ until $t_T$. For $t_i < t_T$, we define
\[ \tilde{\Pi}_{t_i} := E\left[ \tilde{\Pi}_{t_T} \mid S_{t_i}, t_i \right]. \]
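Consider a market such that the contingent claim $P(S_{t_T})$ is attainable$^{12}$. Then, for the option price process $V_{t_i}$,
\[
\begin{aligned}
V_{t_i} &:= E\left[ P(S_{t_T}) - \sum_{k=i}^{T-1} \phi_{t_k} (S_{t_{k+1}} - S_{t_k}) \;\Big|\; S_{t_i}, t_i \right] \\
&= \tilde{\Pi}_{t_i} + \sum_{k=0}^{i-1} \phi_{t_k} (S_{t_{k+1}} - S_{t_k})
\end{aligned}
\]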
10 Local in time means that the objective function for the decision on the hedge position at time $t_i$ is not based on previous variable values at time $t_j$, $j < i$. Global in time means that the complete hedging strategy is defined at the initial time $t_0$.
11 The local variance minimization is based on the work of Föllmer and Sondermann [48], who use variance minimization in a global setting. Černý and Kallsen [35] extended the work of Föllmer and Schweizer [47] to use regressions in a global variance minimization, which is also related to the work of this thesis.
12 A contingent claim $P(S_{t_T})$ is called attainable if there is a feasible hedging strategy which perfectly replicates the contingent claim. For details see Zagst [120, p.62].
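holds; in particular, $V_{t_0} = \tilde{\Pi}_{t_0}$ holds as well. Minimizing the local quadratic risk, defined as the quadratic deviation from $\tilde{\Pi}_{t_{i+1}}$ to $\tilde{\Pi}_{t_i}$ at time $t_i$, the optimal strategy is given by
\[
\begin{aligned}
\{\phi_{t_i}, V_{t_i}\}
&= \arg\min_{\phi_{t_i}, V_{t_i}} E\left[ (\tilde{\Pi}_{t_{i+1}} - \tilde{\Pi}_{t_i})^2 \mid S_{t_i}, t_i \right] \\
&= \arg\min_{\phi_{t_i}, V_{t_i}} \left( E\left[ \left( \left( V_{t_{i+1}} - \sum_{k=0}^{i} \phi_{t_k} (S_{t_{k+1}} - S_{t_k}) \right) - \left( V_{t_i} - \sum_{k=0}^{i-1} \phi_{t_k} (S_{t_{k+1}} - S_{t_k}) \right) \right)^2 \;\Big|\; S_{t_i}, t_i \right] \right) \\
&= \arg\min_{\phi_{t_i}, V_{t_i}} E\left[ \left( V_{t_{i+1}} - V_{t_i} - \phi_{t_i} (S_{t_{i+1}} - S_{t_i}) \right)^2 \mid S_{t_i}, t_i \right].
\end{aligned}
\]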
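In this setting, $V_{t_i}$ and $\phi_{t_i}$ are measurable at time $t_i$. Thus the solution to the minimization problem is given by
\[
\begin{aligned}
\phi_{t_i} &= \frac{\mathrm{cov}\left[ V_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i \right]}{\mathrm{var}\left[ S_{t_{i+1}} \mid S_{t_i}, t_i \right]}, \\
V_{t_i} &= E[V_{t_{i+1}} \mid S_{t_i}, t_i] - \phi_{t_i} E[S_{t_{i+1}} - S_{t_i} \mid S_{t_i}, t_i].
\end{aligned}
\]
This setting is equivalent to the solution of
\[ \{\phi_{t_i}\} = \arg\min_{\phi_{t_i}} \left( \mathrm{var}\left[ \phi_{t_i} S_{t_{i+1}} - V_{t_{i+1}} \mid S_{t_i}, t_i \right] \right) \qquad (5.7) \]
and setting the option value such that it is self-financing on average:
\[ V_{t_i} = E[V_{t_{i+1}} \mid S_{t_i}, t_i] - \phi_{t_i} E[S_{t_{i+1}} - S_{t_i} \mid S_{t_i}, t_i]. \]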
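A single step of this local quadratic hedge can be sketched numerically. The sketch below assumes a finite set of equally probable successor states and $r = 0$; the function name `local_quadratic_hedge` and the example numbers (three successor states around a current price of 100, with a call payoff struck at 95) are chosen for illustration only.

```python
import numpy as np

def local_quadratic_hedge(S_next, V_next, S_now):
    """One step of the local quadratic hedge for equally probable successors.

    Returns (phi, V_now) with
      phi   = cov[V_next, S_next] / var[S_next]
      V_now = E[V_next] - phi * E[S_next - S_now]
    """
    S_next = np.asarray(S_next, dtype=float)
    V_next = np.asarray(V_next, dtype=float)
    dS = S_next - S_next.mean()
    dV = V_next - V_next.mean()
    phi = float(np.sum(dS * dV) / np.sum(dS ** 2))  # normalisation cancels in the ratio
    V_now = float(V_next.mean() - phi * (S_next.mean() - S_now))
    return phi, V_now

# Successor prices 110, 100, 90 with call values max(S - 95, 0) = 15, 5, 0.
phi, V_now = local_quadratic_hedge([110.0, 100.0, 90.0], [15.0, 5.0, 0.0], 100.0)
```

Because the successor prices average to the current price in this example, the martingale assumption of the framework holds and the second term of the value equation vanishes.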
In contrast to the work of this thesis, where no intermediate option prices exist and hedge portfolio values are only measurable at $t_T$, the work of Föllmer and Schweizer is focused on option prices and portfolio values measurable at time $t_i$. That means it is hard to compute global hedging errors, and extensions to non-martingale underlyings and non-zero interest rates are challenging. Furthermore, no efficient numerical methods are known for the computation of these minimum-variance hedges. However, a simple numerical method with a multinomial tree for this setting can be found in Černý [33]. A more general Monte Carlo method for the setting is presented by Potters et al. [96], which we summarize in Section 5.4.3.
We now proceed with the new method of this thesis. In the next section, we look at the different possible types of variance minimization and at which one should be used.
[Figure 5.1: Tree model of stock price $S$ for an incomplete market with states $\omega_{01}, \ldots, \omega_{24}$.]
Variance Minimization Strategies
This section will take a closer look at the different possibilities for a variance minimization of the
required portfolio value Π.
More formally, if we look at the value of the hedge portfolio Πti required to compensate for the
hedging strategy from time t0 until maturity time tT , the choices for variance-risk minimization
we consider are Global Hedging, Local Hedging, and something we call Forward Global Hedging.
We will describe the different strategies based on the example of hedging a European call option in an incomplete market. Consider a discrete market model of the asset price $S$ with states as presented in Figure 5.1 and a risk-free rate of $r = 0$. The option has a payoff $V_{t_2}$ at maturity time $t_2$ of
\[ V_{t_2} = \max(S_{t_2} - 95, 0). \]
Additionally, the hedge portfolio $\Pi_{t_2}$ is given by
\[ \Pi_{t_2} = V_{t_2}. \]
With this information and the information of the transition $\Pi_{t_{i+1}} \to \Pi_{t_i}$, we will present the effects of the different variance minimization strategies.
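Table 5.1 Global optimization of a derivatives hedge in a market as in Figure 5.1.

| $S_{t_0}$ | $\Pi_{t_0}$ | $\phi_{t_0}$ | var($\Pi_{t_0}$) | $S_{t_1}$ | $\Pi_{t_1}$ | $\phi_{t_1}$ | var($\Pi_{t_1}$) | $S_{t_2}$ | $V_{t_2}$ | $\Pi_{t_2}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 10.29 | 0.47 | 2.54 | 110 | 15.00 | 0.65 | 3.18 | 110 | 15 | 15 |
| 100 | 6.74 | | | 110 | 11.45 | | | 100 | 5 | 5 |
| 100 | 8.20 | | | 110 | 12.91 | | | 90 | 0 | 0 |
| 100 | 7.50 | | | 100 | 7.50 | 0.75 | 2.08 | 110 | 15 | 15 |
| 100 | 5.00 | | | 100 | 5.00 | | | 100 | 5 | 5 |
| 100 | 7.50 | | | 100 | 7.50 | | | 90 | 0 | 0 |
| 100 | 8.37 | | | 80 | -1.05 | 0.30 | 2.36 | 100 | 5 | 5 |
| 100 | 6.40 | | | 80 | -3.02 | | | 90 | 0 | 0 |
| 100 | 9.42 | | | 80 | 0.00 | | | 80 | 0 | 0 |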
1.) Global Hedging A globally optimal hedge minimizes the variance of the total hedging error,
\[ \{\phi_{t_0}, \ldots, \phi_{t_T}\} = \arg\min_{\phi_{t_0}, \ldots, \phi_{t_T}} \left( \mathrm{var}\left[ \Pi_{t_0} \mid S_{t_0}, t_0 \right] \right), \]
with
\[ \Pi_{t_i} = e^{-r(t_{i+1}-t_i)} \left( \Pi_{t_{i+1}} - \phi_{t_i} \left( S_{t_{i+1}} - e^{r(t_{i+1}-t_i)} S_{t_i} \right) \right), \qquad \Pi_{t_T} = V_{t_T} = P(S_{t_T}), \]
i.e. it minimizes the variance $\mathrm{var}(\Pi_{t_0})$ at time $t_0$. This can create odd artifacts: in some cases the trader would try to lose gains she made in the past and even pay for that. In other cases, she would change her strategy in order to compensate for losses in the past.
In the example of Figure 5.1, we use the parameters $\phi_{t_0}^{\omega_{01}}$, $\phi_{t_1}^{\omega_{11}}$, $\phi_{t_1}^{\omega_{12}}$ and $\phi_{t_1}^{\omega_{13}}$ to minimize the sample variance $\mathrm{var}(\Pi_{t_0})$.
Table 5.1 presents the result of the single-step four-dimensional optimization. The computation itself starts by setting all required values at time $t_2$. Then, the process proceeds at time $t_1$, where the underlying value $S_{t_2}$ and the position $\phi_{t_1}$ in $S_{t_1}$ are used for the computation of $\Pi_{t_1}$:
\[ \Pi_{t_i} = \Pi_{t_{i+1}} - \phi_{t_i} (S_{t_{i+1}} - S_{t_i}), \qquad i \in \{0, 1\}, \]
since $r = 0$. Then, at time $t_0$, we compute $\Pi_{t_0}$ from $\Pi_{t_1}$, $\phi_{t_0}$, $S_{t_1}$ and $S_{t_0}$. The values for the 4 different $\phi$'s are chosen by a simple gradient-based method such that the variance of the 9 samples of $\Pi_{t_0}$ at time $t_0$ is minimal.
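The global optimization for the nine paths of the example can be sketched as follows. Since $\Pi_{t_0}$ is linear in the four hedge positions for $r = 0$, minimizing its sample variance is an ordinary least-squares problem with an intercept; the sketch uses this observation in place of the gradient-based method mentioned above. The variable names and the path ordering (three successors per $t_1$ state, as in Table 5.1) are assumptions of this illustration.

```python
import numpy as np

S0 = 100.0
S1 = np.repeat([110.0, 100.0, 80.0], 3)                            # S_{t_1} per path
S2 = np.array([110, 100, 90, 110, 100, 90, 100, 90, 80], float)    # S_{t_2} per path
payoff = np.maximum(S2 - 95.0, 0.0)                                # Pi_{t_2} = V_{t_2}

# Pi_{t_0} = payoff - phi_t0*(S1 - S0) - phi_t1(state)*(S2 - S1)  for r = 0.
X = np.zeros((9, 4))
X[:, 0] = S1 - S0                                # regressor for phi_{t_0}
for k, s in enumerate([110.0, 100.0, 80.0]):     # one regressor per t_1 state
    mask = S1 == s
    X[mask, k + 1] = (S2 - S1)[mask]

# Least squares with intercept minimizes the sample variance of the residual.
A = np.column_stack([np.ones(9), X])
coef, *_ = np.linalg.lstsq(A, payoff, rcond=None)
phi = coef[1:]                                   # [phi_t0, phi_t1(110), phi_t1(100), phi_t1(80)]

pi_t0 = payoff - X @ phi                         # nine samples of Pi_{t_0}
var_t0 = pi_t0.var(ddof=1)                       # sample variance
```

Up to rounding, the resulting hedge positions and the sample variance agree with the values listed in Table 5.1.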
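Table 5.2 Local optimization of a derivatives hedge in a market as in Figure 5.1.

| $S_{t_0}$ | $V_{t_0}$ | $\Pi_{t_0}$ | $\phi_{t_0}$ | var($\Pi_{t_0}$) | $S_{t_1}$ | $V_{t_1}$ | $\Pi_{t_1}$ | $\phi_{t_1}$ | var($\Pi_{t_1}$) | $S_{t_2}$ | $V_{t_2}$ | $\Pi_{t_2}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 8.27 | 9.35 | 0.48 | 1.51 | 110 | 14.17 | 15.0 | 0.75 | 2.08 | 90 | 0 | 0 |
| 100 | 8.27 | 9.35 | | | 110 | 14.17 | 15.0 | | | 110 | 15 | 15 |
| 100 | 8.27 | 9.35 | | | 110 | 14.17 | 12.5 | | | 100 | 5 | 5 |
| 100 | 8.27 | 6.67 | | | 100 | 6.67 | 7.5 | 0.75 | 2.08 | 110 | 15 | 15 |
| 100 | 8.27 | 6.67 | | | 100 | 6.67 | 5.0 | | | 100 | 5 | 5 |
| 100 | 8.27 | 6.67 | | | 100 | 6.67 | 7.5 | | | 90 | 0 | 0 |
| 100 | 8.27 | 8.81 | | | 80 | -0.83 | -0.0 | 0.25 | 2.08 | 100 | 5 | 5 |
| 100 | 8.27 | 8.81 | | | 80 | -0.83 | -2.5 | | | 90 | 0 | 0 |
| 100 | 8.27 | 8.81 | | | 80 | -0.83 | -0.0 | | | 80 | 0 | 0 |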
2.) Local Hedging Local hedging minimizes the variance from one time step to the next, which is similar to the quadratic hedging presented in the section about quadratic hedging. This setting is intended for hedging of options for which we can observe a market price $V_{t_i}$. Since the hedge portfolio $\tilde{\Pi}_{t_i}$ should replicate the option price, we define
\[ V_{t_{i+1}} := E[\tilde{\Pi}_{t_{i+1}} \mid S_{t_i}, t_i], \qquad V_{t_T} = P(S_{t_T}), \]
i.e. the expected value of the hedge position equals the option price.
As usual, we define the hedge portfolio as
\[ \tilde{\Pi}_{t_i} := B_{t_i} + \phi_{t_i} S_{t_i}. \]
The objective of this local hedge at time $t_i$ is to minimize the discrepancy between the hedge portfolio $\tilde{\Pi}_{t_{i+1}}$ and the option value $V_{t_{i+1}}$, i.e.
\[
\begin{aligned}
\{\phi_{t_i}\} &= \arg\min_{\phi_{t_i}} \left( \mathrm{var}\left[ \tilde{\Pi}_{t_{i+1}} - V_{t_{i+1}} \mid S_{t_i}, t_i \right] \right) \\
&= \arg\min_{\phi_{t_i}} \left( \mathrm{var}\left[ B_{t_{i+1}} + \phi_{t_{i+1}} S_{t_{i+1}} - V_{t_{i+1}} \mid S_{t_i}, t_i \right] \right).
\end{aligned}
\]
Using the self-financing property of Equation (5.4), this equals
\[
\begin{aligned}
\{\phi_{t_i}\} &= \arg\min_{\phi_{t_i}} \left( \mathrm{var}\left[ e^{r(t_{i+1}-t_i)} B_{t_i} + \phi_{t_i} S_{t_{i+1}} - V_{t_{i+1}} \mid S_{t_i}, t_i \right] \right) \\
&= \arg\min_{\phi_{t_i}} \left( \mathrm{var}\left[ \phi_{t_i} S_{t_{i+1}} - V_{t_{i+1}} \mid S_{t_i}, t_i \right] \right),
\end{aligned}
\]
which is essentially the same solution as presented in the quadratic hedging, Equation (5.7).
The result of this step-wise minimization for the example in Figure 5.1 and payoff $V_{t_2} = \max(S_{t_2} - 95, 0)$ is presented in Table 5.2. One can observe that local minimization results in a different strategy than the global minimization in Table 5.1. However, since the transition from $\tilde{\Pi}_{t_{i+1}} \to \tilde{\Pi}_{t_i}$ is different from that from $\Pi_{t_{i+1}} \to \Pi_{t_i}$ in the global setting, the resulting variances cannot be compared directly.

Table 5.3 Forward global optimization of a derivatives hedge in a market as in Figure 5.1.

| $S_{t_0}$ | $\Pi_{t_0}$ | $\phi_{t_0}$ | var($\Pi_{t_0}$) | $S_{t_1}$ | $\Pi_{t_1}$ | $\phi_{t_1}$ | var($\Pi_{t_1}$) | $S_{t_2}$ | $V_{t_2}$ | $\Pi_{t_2}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 10.18 | 0.48 | 3.07 | 110 | 15.00 | 0.75 | 2.08 | 110 | 15 | 15 |
| 100 | 7.68 | | | 110 | 12.50 | | | 100 | 5 | 5 |
| 100 | 10.18 | | | 110 | 15.00 | | | 90 | 0 | 0 |
| 100 | 7.50 | | | 100 | 7.50 | 0.75 | 2.08 | 110 | 15 | 15 |
| 100 | 5.00 | | | 100 | 5.00 | | | 100 | 5 | 5 |
| 100 | 7.50 | | | 100 | 7.50 | | | 90 | 0 | 0 |
| 100 | 9.64 | | | 80 | -0.00 | 0.25 | 2.08 | 100 | 5 | 5 |
| 100 | 7.14 | | | 80 | -2.50 | | | 90 | 0 | 0 |
| 100 | 9.64 | | | 80 | -0.00 | | | 80 | 0 | 0 |
3.) Forward Global Hedging A strategy between the local and the global one is what we call forward global. In a forward global variance minimization, one solves
\[ \Pi_{t_i} = e^{-r(t_{i+1}-t_i)} \left( \Pi_{t_{i+1}} - \phi_{t_i} \left( S_{t_{i+1}} - e^{r(t_{i+1}-t_i)} S_{t_i} \right) \right), \qquad \Pi_{t_T} = V_{t_T} = P(S_{t_T}) \]
with
\[ \{\phi_{t_i}\} = \arg\min_{\phi_{t_i}} \left( \mathrm{var}\left[ \Pi_{t_i} \mid S_{t_i}, t_i \right] \right) \]
and thus only minimizes the variance of the hedging actions required in the future and ignores all information from the past $t_0, \ldots, t_{i-1}$. In this setting, in contrast to Local Hedging, the complete future hedging errors can be obtained for further computations.
Table 5.3 presents the Forward Global Minimization. Starting at $\phi_{t_1}^{\omega_{11}}$, $\mathrm{var}(\Pi_{t_1}^{\omega_{11}})$ is minimized, followed by $\phi_{t_1}^{\omega_{12}}$, where $\mathrm{var}(\Pi_{t_1}^{\omega_{12}})$ is minimized, and $\phi_{t_1}^{\omega_{13}}$, where $\mathrm{var}(\Pi_{t_1}^{\omega_{13}})$ is minimized. Finally, $\phi_{t_0}^{\omega_{01}}$ is used for minimizing $\mathrm{var}(\Pi_{t_0}^{\omega_{01}})$. The resulting strategy is identical to the Local Minimization and different from the Global Minimization. In fact, the forward global and the local variance minimization strategies coincide in many cases.$^{13}$
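The forward global procedure for the example tree can be sketched in a few lines. The sketch assumes equally probable successors, $r = 0$, and the path ordering of Table 5.3; the helper name `min_var_phi` is hypothetical.

```python
import numpy as np

def min_var_phi(pi_next, s_next):
    """Variance-minimizing position: cov[Pi, S] / var[S] over the given samples."""
    ds = s_next - s_next.mean()
    return float(np.sum(ds * (pi_next - pi_next.mean())) / np.sum(ds ** 2))

S0 = 100.0
S1 = np.repeat([110.0, 100.0, 80.0], 3)
S2 = np.array([110, 100, 90, 110, 100, 90, 100, 90, 80], float)
pi2 = np.maximum(S2 - 95.0, 0.0)              # Pi_{t_2} = V_{t_2}

# Step t_2 -> t_1: one hedge per t_1 state, using only that state's successors.
pi1 = np.empty(9)
phi1 = {}
for s in (110.0, 100.0, 80.0):
    m = S1 == s
    phi1[s] = min_var_phi(pi2[m], S2[m])
    pi1[m] = pi2[m] - phi1[s] * (S2[m] - S1[m])

# Step t_1 -> t_0: a single hedge, using the pathwise Pi_{t_1} of all nine paths.
phi0 = min_var_phi(pi1, S1)
pi0 = pi1 - phi0 * (S1 - S0)
var0 = pi0.var(ddof=1)                        # sample variance of Pi_{t_0}
```

In contrast to the local scheme, the pathwise values `pi1` (not their conditional means) are carried backwards, so the full distribution of the hedging cost remains available at $t_0$.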
Choice of the Variance Minimization Strategy
A forward global risk minimization, not a global risk minimization, should be pursued, since the global risk minimization tries to correct errors from the past by investing. The forward global risk minimization only reduces future hedging errors, based on the assumption that the future strategy does the same.
13 In a similar mean-variance setting with no-arbitrage, this has been shown by Černý [33, Theorem 2]. In another similar setting, this has been proved by Lamberton et al. [76, Proposition 2].
Consequently, we focus on the forward global optimization. Then, the minimization decomposes into a minimization of the risky capital costs for each time step. That means the optimal fraction $\phi_{t_i}$ of the hedge instrument $S$ is given by
\[
\begin{aligned}
\{\phi_{t_i}\} &= \arg\min_{\phi_{t_i}} \left( \mathrm{var}\left[ \Pi_{t_i} \mid S_{t_i}, t_i \right] \right) \\
&= \arg\min_{\phi_{t_i}} \left( \mathrm{var}\left[ e^{-r(t_{i+1}-t_i)} \left( \Pi_{t_{i+1}} - \phi_{t_i} \left( S_{t_{i+1}} - e^{r(t_{i+1}-t_i)} S_{t_i} \right) \right) \mid S_{t_i}, t_i \right] \right) \\
&= \arg\min_{\phi_{t_i}} \left( \mathrm{var}\left[ \Pi_{t_{i+1}} - \phi_{t_i} S_{t_{i+1}} \mid S_{t_i}, t_i \right] \right), \qquad (5.8)
\end{aligned}
\]
which is similar to the quadratic hedging in discrete time which is discussed in the literature (cp. Section 5.3.3, Quadratic Hedging). We basically reduced all future values of the stochastic variables $S$ and $V$ to one stochastic variable $\Pi_{t_{i+1}}$. Now, the solution to (5.8) can be found by setting the first derivative $\frac{\partial}{\partial \phi_{t_i}}$ to zero and solving for $\phi_{t_i}$,
\[ \phi_{t_i} = \frac{\mathrm{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\mathrm{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]}. \qquad (5.9) \]
The values of (5.9) can be computed using the regression method presented in Section 1.2.1.
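The step from (5.8) to (5.9) can be spelled out as follows. This is a short worked derivation added for illustration; all moments are understood as conditional on $(S_{t_i}, t_i)$, and $\mathrm{var}[S_{t_{i+1}}] > 0$ is assumed.

```latex
\begin{aligned}
\mathrm{var}\left[\Pi_{t_{i+1}} - \phi_{t_i} S_{t_{i+1}}\right]
  &= \mathrm{var}\left[\Pi_{t_{i+1}}\right]
     - 2\,\phi_{t_i}\,\mathrm{cov}\left[\Pi_{t_{i+1}}, S_{t_{i+1}}\right]
     + \phi_{t_i}^2\,\mathrm{var}\left[S_{t_{i+1}}\right], \\
\frac{\partial}{\partial \phi_{t_i}}\,
\mathrm{var}\left[\Pi_{t_{i+1}} - \phi_{t_i} S_{t_{i+1}}\right]
  &= -2\,\mathrm{cov}\left[\Pi_{t_{i+1}}, S_{t_{i+1}}\right]
     + 2\,\phi_{t_i}\,\mathrm{var}\left[S_{t_{i+1}}\right] = 0
  \;\Longrightarrow\;
  \phi_{t_i} = \frac{\mathrm{cov}\left[\Pi_{t_{i+1}}, S_{t_{i+1}}\right]}
                    {\mathrm{var}\left[S_{t_{i+1}}\right]}.
\end{aligned}
```

The second derivative equals $2\,\mathrm{var}[S_{t_{i+1}}] > 0$, so the stationary point is indeed a minimum.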
Determination of the Cost of Risk
The cost of risk for a bank is given by the bank's risk management system: the bank has to deposit a certain amount of capital in a margin account. For simplicity, we assume that this capital is deposited from the initial time $t_0$ of the trade until the maturity $t_T$ of the option.
As risk measure we propose the Conditional Value at Risk of the future hedge error at the time of the initiation of the trade $t_0$:
\[ R := \mathrm{CVaR}\left[ \Pi_{t_0} \right]. \]
Other choices of the risk measure (e.g. VaR) are easy to accommodate in the model, too. In a more complicated setting, the risk management department computes a marginal CVaR value, which is the CVaR of all hedges minus the CVaR of all hedges except the one under consideration.
The risk management has to allocate $R$ units of money in the margin account in order to compensate for unexpected future losses on the hedge. This money is equity capital which has to earn the return on equity, a risky interest rate $r_e$. That means the derivative trade has to earn the accrued interest
\[ \left( e^{r_e (t_T - t_0)} - 1 \right) R \qquad (5.10) \]
for the risk management, payable at maturity. Consequently, the cost of capital for the risk management is worth
\[ C_R = e^{-r(t_T - t_0)} \left( e^{r_e (t_T - t_0)} - 1 \right) R \]
at the initial time $t_0$.
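The capital cost can be estimated from simulated samples along the following lines. This is a sketch under stated assumptions: the thesis does not fix a confidence level or an estimator in this passage, so the empirical tail-mean estimator, the level `alpha` and the function names `cvar` and `capital_cost` are illustrative choices, and large sample values are interpreted as high hedging costs (losses).

```python
import numpy as np

def cvar(samples, alpha):
    """Empirical CVaR: mean of the samples at or above the alpha-quantile.

    Assumption: large values are losses; alpha is a hypothetical confidence level.
    """
    x = np.asarray(samples, dtype=float)
    var_level = np.quantile(x, alpha)          # empirical Value at Risk
    return float(x[x >= var_level].mean())

def capital_cost(R, r, r_e, horizon):
    """C_R = exp(-r*(t_T - t_0)) * (exp(r_e*(t_T - t_0)) - 1) * R."""
    return float(np.exp(-r * horizon) * (np.exp(r_e * horizon) - 1.0) * R)

# Toy example: ten loss samples, 80% level, so the tail consists of 9 and 10.
losses = np.arange(1.0, 11.0)
R = cvar(losses, 0.8)
# Illustrative rates: r = 5%, r_e = 15%, one year, margin capital of 10 units.
C_R = capital_cost(10.0, r=0.05, r_e=0.15, horizon=1.0)
```

In an application, `losses` would be replaced by the simulated samples of $\Pi_{t_0}$, or of the hedge error, depending on the convention of the bank's risk management.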
The task of the issuer is to sell the derivative at as high a price as possible and to minimize the capital cost caused by the risk management. Again, the issuer's task is not to earn excess returns from the derivatives hedge, because that would result in an investment strategy. Investment decisions should be made consciously and not based on random derivative sales.
Now, we have collected all ingredients for the option valuation: the value of an illiquid option is given by
\[ V^{\mathrm{ask}}_{t_0} = V_{t_0} + C_R. \]
The unadjusted value $V_{t_0} = E[\Pi_{t_0}]$ is computed using the variance minimization starting at maturity time, while the margin capital cost is a direct result of the residual randomness in the hedged portfolio, measured by the bank's risk management system.
Now, $V^{\mathrm{ask}}_{t_0}$ is the lowest price at which a trader should sell the option: since she wants to maximize the bank's profit$^{14}$, the trader will add an additional spread to the option price, which is an excess profit.
5.3.4 Transaction Costs
In the previous section, we saw how to convert the remaining risk of a derivatives hedge into costs which can be included in the derivative's price. In this section, we want to address another important feature often omitted in option pricing: transaction costs. We briefly discuss possible choices for the inclusion of transaction costs in Simulation-Based Hedging. In the next section, a Monte Carlo method is presented which is then extended to transaction costs. This way, the complete costs of an option hedge can be compared for different hedging strategies.
Any kind of transaction cost model can be implemented for Simulation-Based Hedging. However, an optimal solution requires the introduction of an additional state variable: the current position $\phi$ in the asset. This is computationally expensive, and the optimal rehedge also requires a numerical minimization with respect to this state variable in order to decide whether to rehedge or not. This is often not feasible due to insufficient computing capacity.
A more practical strategy follows from the current practice in option pricing: a time-based rehedge$^{15}$. Here, one fixes specific dates (often equidistant) at which the rehedge occurs. This is a strategy which can deliver satisfying results without introducing large computational efforts. The parameters of the strategy can also be optimized in order to find the optimal time interval. The Monte Carlo method in the following section (Section 5.4.6) is based on this time-based rehedge.
As in the case without transaction costs, a rebalancing strategy which directly minimizes $V^{\mathrm{ask}}_{t_0}$ including the expected replication cost would effectively be a strategy betting on the drift of the underlying. The result would be an over- (or under-) investment in the underlying, which violates Property 5.3.
14 Maximizing the bank's profit maximizes the trader's bonus.
15 Different hedging strategies with transaction costs are discussed in Wilmott [117, p.331]; the time-based rehedge for the Black-Scholes framework was analyzed by Leland [79].
The objective, which one minimizes in order to find an optimal rehedge under transaction
costs requires some further thoughts: A usual risk minimization would result in a strategy which
rehedges every time step. Thus extremely large transaction costs occur. The Simulation-Based
Hedging allows a more intuitive procedure: Instead of minimizing some risk, we try to find
a strategy which minimizes the capital costs of the margin account plus the transaction costs.
Thus, we find a balance between paying transaction costs and taking risk. Later, we will provide
examples for this procedure.
5.3.5 American Put Options
Valuing exercisable options within this general framework cannot be based on no-arbitrage. However, this is still realistic because options are usually not exercised at the optimal values computed
by no-arbitrage arguments [2]. Consequently, the American put option valuation has to be based
on other means: The bank has to prepare for a hostile option holder who exercises at the worst
possible time for the issuer. That means the expected conditional value of the hedge portfolio
has to be at least as high as the exercise value at all times. This is accomplished by replacing Equation (5.6) with
\[
\Pi_{t_i} =
\begin{cases}
K - S_{t_i} & \text{if } e^{-r(t_{i+1}-t_i)} \cdot E[\Pi_{t_{i+1}} \mid S_{t_i}, t_i] < K - S_{t_i}, \\
B_{t_i} + \phi_{t_i} S_{t_i} & \text{else.}
\end{cases}
\]
This is easy to compute following the ideas of Least-Squares Monte Carlo as presented in Section 1.2.1.
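One backward step of this exercise rule might be sketched as follows. The conditional expectation is estimated here by a polynomial least-squares fit in the spirit of Least-Squares Monte Carlo; the function name `american_put_step`, the polynomial degree and the scaling of the regressor by the strike are assumptions of this sketch, not details taken from the thesis.

```python
import numpy as np

def american_put_step(S_i, pi_hedged, pi_next, K, r, dt, degree=2):
    """Apply the exercise rule at time t_i across all simulated paths.

    S_i       : asset prices S_{t_i} per path
    pi_hedged : pathwise B_{t_i} + phi_{t_i} S_{t_i}, e.g. from Equation (5.5)
    pi_next   : pathwise Pi_{t_{i+1}}
    """
    S_i = np.asarray(S_i, dtype=float)
    x = S_i / K                                          # scaled regressor (assumption)
    coeffs = np.polyfit(x, np.asarray(pi_next, float), degree)
    continuation = np.exp(-r * dt) * np.polyval(coeffs, x)   # discounted E[Pi_{t_{i+1}} | S_{t_i}]
    exercise = K - S_i
    exercised = continuation < exercise                  # hostile holder exercises
    return np.where(exercised, exercise, pi_hedged), exercised

# Toy data: continuation values are exactly linear in S, so the fit is exact.
S_i = np.array([60.0, 80.0, 100.0, 120.0])
pi_next = 62.0 - 0.5 * S_i                               # 32, 22, 12, 2
pi_hedged = np.array([31.0, 21.0, 11.0, 1.0])
pi_i, exercised = american_put_step(S_i, pi_hedged, pi_next, K=100.0, r=0.0, dt=0.5)
```

In the toy data, only the first path has an exercise value (40) above the estimated continuation value (32), so only that path is switched to the exercise value.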
5.4 Monte Carlo Implementations
Implementing an algorithm for Simulation-Based Hedging is easy using the techniques presented in Chapter 1; especially the regression method presented in Section 1.2.1 helps to determine the optimal hedges. A simple implementation can be found in Appendix 7.6.
In the next section, we start with a detailed computation of a simple European put option with Simulation-Based Hedging. Then, brief summaries of Simulation-Based Hedging and of the method of Potters et al. [96] follow. A theoretical analysis of the convergence properties of Least-Squares Monte Carlo and Simulation-Based Hedging follows, which is confirmed by numerical experiments.
5.4.1 Simple Example
After the theoretical considerations in the previous sections, we now focus on a simple example of Simulation-Based Hedging. Consider a European put option with the data in Table 5.4.

Table 5.4 Data of a European put option.

| General features | Value |
|---|---|
| strike price $K$ | 100 |
| risk-free rate $r$ | 5% p.a. |
| volatility $\sigma$ | 40% p.a. |
| drift rate $\mu$ | 5% p.a. |
| maturity time $t_T$ | 1 year |
| terminal value $P(S_{t_T}, t_T)$ | $\max(K - S_{t_T}, 0)$ |

We evaluate this option with 10 asset paths using Simulation-Based Hedging. As in the previous chapters, we simulate the underlying's paths using Equation (1.28),
\[ S^j_{t_i} = S^j_{t_{i-1}} \, e^{\left( r - \frac{1}{2} \sigma^2 \right)(t_i - t_{i-1}) + \sigma \sqrt{t_i - t_{i-1}} \, \theta_{i,j}}, \]
and the data in Table 5.4. With $S_{t_0} = 100$ and random numbers $\theta_{i,j}$, $i = 1, 2$, $j = 1, \ldots, 10$, we get 10 asset paths:
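| $j$ | $S^j_{t_0}$ | $S^j_{t_1}$ | $S^j_{t_2}$ |
|---|---|---|---|
| 1 | 100 | 81.6340 | 61.7521 |
| 2 | 100 | 120.7011 | 86.3528 |
| 3 | 100 | 89.1249 | 84.7387 |
| 4 | 100 | 118.4613 | 87.9178 |
| 5 | 100 | 104.5817 | 118.0245 |
| 6 | 100 | 81.0836 | 86.1567 |
| 7 | 100 | 58.9702 | 40.8700 |
| 8 | 100 | 101.6986 | 72.1828 |
| 9 | 100 | 65.8471 | 65.5679 |
| 10 | 100 | 119.6538 | 133.2725 |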
Computing the option's payoff
$$V^j_{t_T} = V^j_{t_2} = \max(100 - S^j_{t_2}, 0)$$
we get
$$V_{t_2} = \begin{pmatrix} 38.2479 \\ 13.6472 \\ 15.2613 \\ 12.0822 \\ 0 \\ 13.8433 \\ 59.1300 \\ 27.8172 \\ 34.4321 \\ 0 \end{pmatrix}.$$
For the portfolio, $\Pi_{t_T} = B_{t_T} = V_{t_2}$ holds. Pathwise, we obtain
$$\Pi_{t_2} = \begin{pmatrix} 38.2479 \\ 13.6472 \\ 15.2613 \\ 12.0822 \\ 0.0000 \\ 13.8433 \\ 59.1300 \\ 27.8172 \\ 34.4321 \\ 0.0000 \end{pmatrix}$$
as scenario values.
Now, all values at time $t_T = t_2$ are computed and we can proceed with the backwards time stepping to the previous time step $t_{T-1} = t_1$. At this time, we are computing the optimal hedge for all possible states based on the sample for the next time step.
We saw in the previous section that the optimal hedge is given by
$$\phi^j_{t_1}(S^j_{t_1}) = \frac{\operatorname{cov}(S_{t_2}, \Pi_{t_2} \mid S^j_{t_1}, t_1)}{\operatorname{var}(S_{t_2} \mid S^j_{t_1}, t_1)}.$$
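Following Section 1.2.1, we approximate $\phi^j_{t_1}(S^j_{t_1})$ by
$$\phi^j_{t_1}(S^j_{t_1}) \approx \sum_{k=1}^{m} \tilde{a}_k b_k(S^j_{t_1}),$$
with $\tilde{a} = (\tilde{a}_1, \ldots, \tilde{a}_m)^T$ defined as (cp. Equation (1.7))
$$\tilde{a} = \left( B_{\hat{S}}(S_{t_1})^T B_{\hat{S}}(S_{t_1}) \right)^{-1} B_{\hat{S}}(S_{t_1})^T \hat{\Pi} = \left( B_{\hat{S}}(S_{t_1}) \right)^{\dagger} \hat{\Pi} \qquad (5.11)$$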
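with $\hat{S}^j = S^j_{t_2} - E[S_{t_2} \mid S_{t_1} = S^j_{t_1}]$ and $\hat{\Pi}^j = \Pi^j_{t_2} - E[\Pi_{t_2} \mid S_{t_1} = S^j_{t_1}]$. $\hat{S}$ and $\hat{\Pi}$ can be computed from the local basis approximation16 $E[S_{t_2} \mid S_{t_1} = S^j_{t_1}]$ of $S_{t_2}$ and from the local basis approximation $E[\Pi_{t_2} \mid S_{t_1} = S^j_{t_1}]$ of $\Pi_{t_2}$, respectively, i.e.
$$\hat{\Pi} := \Pi_{t_2} - B(S_{t_1}) \cdot \left( B(S_{t_1})^{\dagger} \Pi_{t_2} \right)$$
and
$$\hat{S} := S_{t_2} - B(S_{t_1}) \cdot \left( B(S_{t_1})^{\dagger} S_{t_2} \right)$$
with pseudo inverse $B(S_{t_1})^{\dagger}$ of $B(S_{t_1})$ (see Definition 1.6). In these equations, we have to choose the appropriate basis functions $b_1, \ldots, b_m$ for
$$B(S_{t_1}) := \begin{pmatrix} b_1(S^1_{t_1}) & \cdots & b_m(S^1_{t_1}) \\ \vdots & \ddots & \vdots \\ b_1(S^{10}_{t_1}) & \cdots & b_m(S^{10}_{t_1}) \end{pmatrix}.$$
16 The local basis approximation is defined in Theorem 1.4 and Lemma 1.5.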
Since we only have 10 samples to do the regression, we choose a polynomial basis with m = 3 basis functions, such that
$$B := B(S_{t_1}) = \left( 1 \quad S^j_{t_1} \quad (S^j_{t_1})^2 \right) \Big|_{j=1,\ldots,10},$$
i.e.
$$B = \begin{pmatrix}
1 & 81.6340 & 6664.1164 \\
1 & 120.7011 & 14568.7494 \\
1 & 89.1249 & 7943.2463 \\
1 & 118.4613 & 14033.0766 \\
1 & 104.5817 & 10937.3404 \\
1 & 81.0836 & 6574.5581 \\
1 & 58.9702 & 3477.4886 \\
1 & 101.6986 & 10342.6003 \\
1 & 65.8471 & 4335.8452 \\
1 & 119.6538 & 14317.0342
\end{pmatrix}.$$
This results in
$$\hat{\Pi} = \begin{pmatrix} 13.2697 \\ 4.5509 \\ -3.5234 \\ 3.1732 \\ -10.7258 \\ -11.6489 \\ 6.3162 \\ 16.0707 \\ -8.4905 \\ -8.9921 \end{pmatrix}, \qquad
\hat{S} = \begin{pmatrix} -14.284 \\ -17.0855 \\ 1.1383 \\ -14.7551 \\ 22.2798 \\ 10.7196 \\ -5.605 \\ -21.6508 \\ 9.0629 \\ 30.1799 \end{pmatrix},$$
which we use in
$$B_S := B_S(S^1_{t_1}, \ldots, S^{10}_{t_1}) := \begin{pmatrix} b_1(S^1_{t_1}) \hat{S}^1 & \cdots & b_m(S^1_{t_1}) \hat{S}^1 \\ \vdots & \ddots & \vdots \\ b_1(S^{10}_{t_1}) \hat{S}^{10} & \cdots & b_m(S^{10}_{t_1}) \hat{S}^{10} \end{pmatrix}.$$
Then, the numerical values are
$$B_S = \left( \hat{S}^j \quad S^j_{t_1} \cdot \hat{S}^j \quad (S^j_{t_1})^2 \cdot \hat{S}^j \right) \Big|_{j=1,\ldots,10},$$
i.e.
$$B_S = \begin{pmatrix}
-14.284 & -1166.0613 & -95190.2909 \\
-17.0855 & -2062.243 & -248914.949 \\
1.1383 & 101.4493 & 9041.6545 \\
-14.7551 & -1747.9124 & -207059.949 \\
22.2798 & 2330.0611 & 243681.841 \\
10.7196 & 869.1857 & 70476.7514 \\
-5.605 & -330.5282 & -19491.3279 \\
-21.6508 & -2201.8519 & -223925.199 \\
9.0629 & 596.7638 & 39295.1894 \\
30.1799 & 3611.1368 & 432086.279
\end{pmatrix}.$$
Now, the estimate for the hedge position
$$\phi_{t_1} = B \cdot \left( (B_S)^{\dagger} \cdot \hat{\Pi} \right)$$
evaluates to
$$\phi_{t_1} = \begin{pmatrix} -0.9347 \\ -0.3008 \\ -0.781 \\ -0.3259 \\ -0.5121 \\ -0.9466 \\ -1.4926 \\ -0.5573 \\ -1.3086 \\ -0.3124 \end{pmatrix}.$$
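The regression for the hedge position at time $t_1$ can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation from Appendix 7.6: it assumes NumPy, uses the quadratic basis $(1, S, S^2)$ and the path data from the table of asset paths, and the names `residual`, `B_S` and `phi_t1` are chosen here for illustration. Depending on the pseudo-inverse convention and rounding, the printed values may deviate slightly from the vector quoted above.

```python
import numpy as np

# Sample paths at t1 and t2, copied from the table of asset paths.
S1 = np.array([81.6340, 120.7011, 89.1249, 118.4613, 104.5817,
               81.0836, 58.9702, 101.6986, 65.8471, 119.6538])
S2 = np.array([61.7521, 86.3528, 84.7387, 87.9178, 118.0245,
               86.1567, 40.8700, 72.1828, 65.5679, 133.2725])
Pi2 = np.maximum(100.0 - S2, 0.0)            # terminal portfolio values

def residual(B, y):
    """Return y minus its least-squares projection onto the columns of B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return y - B @ coef

B = np.column_stack([np.ones_like(S1), S1, S1**2])   # basis 1, S, S^2
Pi_hat = residual(B, Pi2)                    # Pi_t2 - E[Pi_t2 | S_t1]
S_hat = residual(B, S2)                      # S_t2  - E[S_t2  | S_t1]
B_S = B * S_hat[:, None]                     # rows b_k(S^j_t1) * S_hat^j
a, *_ = np.linalg.lstsq(B_S, Pi_hat, rcond=None)
phi_t1 = B @ a                               # estimated hedge ratio per path
print(np.round(phi_t1, 4))
```

Using `lstsq` instead of forming the pseudo-inverse explicitly is a design choice of this sketch; both solve the same least-squares problem.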
This result is used to compute the portfolio value $\Pi_{t_1}$. Following Equation (5.5), we obtain
$$\Pi^j_{t_1} = e^{-0.05 \cdot 0.5} \cdot \left( \Pi^j_{t_2} - \phi^j_{t_1} \left( S^j_{t_2} - e^{0.05 \cdot 0.5} S^j_{t_1} \right) \right),$$
which is in this case
$$\Pi_{t_1} = \begin{pmatrix} 17.2941 \\ 2.3366 \\ 9.8246 \\ 1.1209 \\ 5.3918 \\ 16.2901 \\ 29.1472 \\ 9.6868 \\ 31.0981 \\ 3.2265 \end{pmatrix}.$$
This completes the steps required at time $t_1$. We proceed with time $t_0$, where we have to compute the next variance-optimal hedge. At time $t_0$, the computation of $\phi_{t_0}$ is easier than at time $t_1$ since all state variables are equal: $S^j_{t_0} = 100 \ \forall j \in \{1, \ldots, 10\}$. Consequently, we can compute $\phi_{t_0}$ without conditioning on $S_{t_0}$:
$$\phi_{t_0} = \frac{\operatorname{cov}(\Pi_{t_1}, S_{t_1})}{\operatorname{var}(S_{t_1})},$$
which is computed using the sample covariance and the sample variance of $S^j_{t_1}$ and $\Pi^j_{t_1}$, $j \in \{1, \ldots, 10\}$. The result is
$$\phi^j_{t_0} = -0.4633 \quad \text{for all } j = 1, \ldots, 10.$$
Using this in the equation for $\Pi_{t_0}$,
$$\Pi^j_{t_0} = e^{-0.05 \cdot 0.5} \left( \Pi^j_{t_1} - \phi^j_{t_0} \left( S^j_{t_1} - e^{0.05 \cdot 0.5} S^j_{t_0} \right) \right),$$
we obtain
$$\Pi_{t_0} = \begin{pmatrix} 7.4243 \\ 10.4891 \\ 3.5241 \\ 8.2913 \\ 6.1851 \\ 6.1964 \\ 8.7438 \\ 9.0712 \\ 13.7539 \\ 10.8838 \end{pmatrix},$$
which means
$$V_{t_0} = \frac{1}{10} \sum_{j=1}^{10} \Pi^j_{t_0} = 8.45629.$$
This can be compared with the traditional Monte Carlo estimate, which is
$$V_{t_0} \approx e^{-r t_T} \frac{1}{10} \sum_{j=1}^{10} V^j_{t_2} \approx e^{-0.05 \cdot 1} \cdot 21.4461 \approx 20.4002,$$
while the true Black-Scholes price is 13.1459. It is easy to observe that the sample standard deviation std(Vt2 ) = 17.58 is a lot larger than the sample standard deviation std(Πt0 ) = 2.8720,
which is a clear indication that the Simulation-Based Hedging can provide better estimates than
traditional Monte Carlo methods.
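The two backward steps of this example can be retraced numerically. The sketch below assumes NumPy, takes the hedge ratios $\phi_{t_1}$ as quoted in the text rather than re-estimating them, and uses a helper `step_back` that is named here only for illustration. Up to rounding of the quoted inputs it should reproduce $\phi_{t_0}$, the hedged estimate and the plain Monte Carlo estimate.

```python
import numpy as np

r, dt = 0.05, 0.5
S0 = np.full(10, 100.0)
S1 = np.array([81.6340, 120.7011, 89.1249, 118.4613, 104.5817,
               81.0836, 58.9702, 101.6986, 65.8471, 119.6538])
S2 = np.array([61.7521, 86.3528, 84.7387, 87.9178, 118.0245,
               86.1567, 40.8700, 72.1828, 65.5679, 133.2725])
# Hedge ratios at t1 as quoted in the text (not re-estimated here).
phi_t1 = np.array([-0.9347, -0.3008, -0.781, -0.3259, -0.5121,
                   -0.9466, -1.4926, -0.5573, -1.3086, -0.3124])

def step_back(Pi_next, phi, S_now, S_next):
    """One backward step of Equation (5.5): discount the hedged portfolio."""
    growth = np.exp(r * dt)
    return (Pi_next - phi * (S_next - growth * S_now)) / growth

Pi2 = np.maximum(100.0 - S2, 0.0)
Pi1 = step_back(Pi2, phi_t1, S1, S2)
# At t0 all states coincide, so a plain sample covariance ratio suffices.
phi_t0 = np.cov(Pi1, S1)[0, 1] / np.var(S1, ddof=1)
Pi0 = step_back(Pi1, phi_t0, S0, S1)

V_hedged = Pi0.mean()                         # Simulation-Based Hedging estimate
V_plain = np.exp(-r * 2 * dt) * Pi2.mean()    # traditional Monte Carlo estimate
print(phi_t0, V_hedged, Pi0.std(ddof=1), V_plain)
```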
5.4.2 Simulation-Based Hedging in a Black-Scholes Market (European Options)
The previous section presented the computations of Simulation-Based Hedging for a simple setting in detail. In order to provide an overview of this method, this section summarizes the important steps of the Simulation-Based Hedging framework for Black-Scholes markets:
1. Simulation of asset $S_{t_i} = (S^1_{t_i}, S^2_{t_i}, \ldots, S^n_{t_i})$ in physical measure,
   e.g. $S^j_{t_{i+1}} = S^j_{t_i} \exp\left( (\mu - \frac{1}{2}\sigma^2)(t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i}\, \theta_{i,j} \right)$
2. Determine payoff $P(S^j)$ and
   $$\phi^j_{t_T} = 0, \qquad \Pi^j_{t_T} = P(S^j_{t_T})$$
   of paths $S^j$.
3. For all j and for i = T − 1 down to 0 repeat
   (a) Perform least-squares regressions to determine $\phi^j_{t_i}$ according to
       $$\phi^j_{t_i} = \frac{\operatorname{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]} \approx \sum_k \alpha_k b_k(S^j_{t_i})$$
   (b) Update portfolio value $\Pi^j_{t_i}$
       $$\Pi^j_{t_i} = e^{-r(t_{i+1}-t_i)} \left( \Pi^j_{t_{i+1}} - \phi^j_{t_i} \left( S^j_{t_{i+1}} - e^{r(t_{i+1}-t_i)} S^j_{t_i} \right) \right)$$
4. Finally, the option value is given by $V_{t_0} \approx \frac{1}{n} \sum_{j=1}^{n} \Pi^j_{t_0}$ as an in-sample value.17

17 $V_{t_0}$ is computed as an in-sample value, only. This is because an out-of-sample value similar to the out-of-sample value of Equation (1.36) has no benefit for the Simulation-Based Hedging: The convergence of the out-of-sample value to the theoretical price is slower than for the in-sample case. Furthermore, there is no useful average bound as in the case of Least-Squares Monte Carlo, where the out-of-sample price is on average a lower bound on the true value.
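The four steps above can be put together in a compact Python sketch. It is a minimal illustration under stated assumptions, not the implementation from Appendix 7.6: it assumes NumPy, a cubic polynomial basis in the scaled state $S/S_0$ (the text leaves the basis open), equidistant time steps, and the function names `project` and `sbh_european_put` are hypothetical.

```python
import numpy as np

def project(B, y):
    """Least-squares estimate of E[y | state] on the basis matrix B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return B @ coef

def sbh_european_put(S0=100.0, K=100.0, r=0.05, mu=0.05, sigma=0.4,
                     maturity=1.0, n_steps=16, n_paths=20000, seed=0):
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    # Step 1: simulate the asset under the physical measure (drift mu).
    theta = rng.standard_normal((n_steps, n_paths))
    incr = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * theta
    S = S0 * np.exp(np.vstack([np.zeros(n_paths), np.cumsum(incr, axis=0)]))
    # Step 2: terminal portfolio value equals the payoff.
    Pi = np.maximum(K - S[-1], 0.0)
    payoff_discounted = np.exp(-r * maturity) * Pi
    growth = np.exp(r * dt)
    # Step 3: backward time stepping.
    for i in range(n_steps - 1, -1, -1):
        B = np.vander(S[i] / S0, 4, increasing=True)   # assumed cubic basis
        S_hat = S[i + 1] - project(B, S[i + 1])
        Pi_hat = Pi - project(B, Pi)
        B_S = B * S_hat[:, None]
        a, *_ = np.linalg.lstsq(B_S, Pi_hat, rcond=None)
        phi = B @ a                                    # step 3(a)
        Pi = (Pi - phi * (S[i + 1] - growth * S[i])) / growth   # step 3(b)
    # Step 4: the in-sample option value is the mean of Pi at t0.
    return Pi, payoff_discounted

Pi0, plain = sbh_european_put()
print(Pi0.mean(), Pi0.std(), plain.mean(), plain.std())
```

Scaling the state by $S_0$ keeps the polynomial regression reasonably well conditioned; at $t_0$ all states coincide, and `lstsq` then returns the minimum-norm solution, which amounts to an unconditional hedge.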
5.4.3 Hedged Monte Carlo (Potters et al. [96]) in a Black-Scholes Market (European Options)
For comparison, we present the Hedged Monte Carlo method of Potters et al. [96].18 Instead of a Forward Global Hedge as in Simulation-Based Hedging, this method minimizes the variance using a form of Local Hedging.
1. Simulation of asset $S_{t_i} = (S^1_{t_i}, S^2_{t_i}, \ldots, S^n_{t_i})$ in physical measure,
   e.g. $S^j_{t_{i+1}} = S^j_{t_i} \exp\left( (\mu - \frac{1}{2}\sigma^2)(t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i}\, \theta_{i,j} \right)$
2. Determine payoff $P(S^j)$ and
   $$\phi^j_{t_T} = 0, \qquad V^j_{t_T} = P(S^j_{t_T})$$
   of paths $S^j$.
3. For all j and for i = T − 1 down to 0 repeat
   (a) Perform least-squares regressions to determine $\phi^j_{t_i}$ according to
       $$\phi^j_{t_i} = \frac{\operatorname{cov}[V_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]} \approx \sum_k \alpha_k b_k(S^j_{t_i})$$
   (b) Update option value $V^j_{t_i}$
       $$V^j_{t_i} = e^{-r(t_{i+1}-t_i)} E\left[ V^j_{t_{i+1}} - \phi^j_{t_i} \left( S^j_{t_{i+1}} - e^{r(t_{i+1}-t_i)} S^j_{t_i} \right) \,\Big|\, S_{t_i}, t_i \right]$$
4. Finally, the option value is given by $V_{t_0} = \frac{1}{n} \sum_{j=1}^{n} V^j_{t_0}$ as an in-sample value.

18 Note that there is no rigorous derivation of the setting or the resulting solution in Potters et al. [96].
5.4.4 Simulation-Based Hedging in a Black-Scholes Market (American Put Option)
In this section, we summarize the implementation of Simulation-Based Hedging for American put options in a Black-Scholes market. That means we extend the algorithm presented in Section 5.4.2 by early exercise opportunities, where the option holder exercises in the way that is worst for the issuer. We already discussed this early exercise behavior in Section 5.3.5. Note that, in addition to the regressions for estimating the optimal hedge position $\phi^j_{t_i}$, regressions are required for estimating the conditional expected continuation value. The regressions are performed on two possibly different sets of basis functions $b_k$, $k = 1, \ldots, m_1$, and $\tilde{b}_k$, $k = 1, \ldots, m_2$.
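1. Simulation of asset $S_{t_i} = (S^1_{t_i}, S^2_{t_i}, \ldots, S^n_{t_i})$ in physical measure,
   e.g. $S^j_{t_{i+1}} = S^j_{t_i} \exp\left( (\mu - \frac{1}{2}\sigma^2)(t_{i+1} - t_i) + \sigma \sqrt{t_{i+1} - t_i}\, \theta_{i,j} \right)$
2. Determine payoff $P(S^j)$ and
   $$\phi^j_{t_T} = 0, \qquad \Pi^j_{t_T} = P(S^j_{t_T})$$
   of paths $S^j$.
3. For all j and for i = T − 1 down to 0 repeat
   (a) Perform least-squares regressions to determine $\phi^j_{t_i}$ according to
       $$\phi^j_{t_i} = \frac{\operatorname{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]} \approx \sum_{k=1}^{m_1} a_k b_k(S^j_{t_i})$$
   (b) Perform least-squares regression to determine
       $$E[\Pi_{t_{i+1}} \mid S_{t_i}, t_i] \approx \sum_{k=1}^{m_2} \tilde{a}_k \tilde{b}_k(S^j_{t_i})$$
   (c) Update portfolio value $\Pi^j_{t_i}$
       $$\tilde{\Pi}^j_{t_i} = e^{-r(t_{i+1}-t_i)} \left( \Pi^j_{t_{i+1}} - \phi^j_{t_i} \left( S^j_{t_{i+1}} - e^{r(t_{i+1}-t_i)} S^j_{t_i} \right) \right)$$
       $$\Pi^j_{t_i} = \begin{cases} K - S^j_{t_i} & \text{if } e^{-r(t_{i+1}-t_i)} \cdot E[\Pi_{t_{i+1}} \mid S^j_{t_i}, t_i] < K - S^j_{t_i} \\ \tilde{\Pi}^j_{t_i} & \text{else.} \end{cases}$$
4. Finally, the option value is given by $V_{t_0} \approx \frac{1}{n} \sum_{j=1}^{n} \Pi^j_{t_0}$.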
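The American variant differs from the European algorithm only by the extra regression in step (b) and the exercise rule in step (c). The following Python sketch illustrates this under several assumptions that are not taken from the text: NumPy, the same cubic polynomial basis in $S/S_0$ for both regressions, equidistant steps, and an exercise test that is additionally restricted to in-the-money paths to guard against polynomial extrapolation artifacts. The function names are hypothetical.

```python
import numpy as np

def fit(B, y):
    """Least-squares estimate of E[y | state] on the basis matrix B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return B @ coef

def sbh_american_put(S0=100.0, K=100.0, r=0.05, mu=0.05, sigma=0.4,
                     maturity=1.0, n_steps=16, n_paths=20000, seed=1):
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    z = rng.standard_normal((n_steps, n_paths))
    incr = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    S = S0 * np.exp(np.vstack([np.zeros(n_paths), np.cumsum(incr, axis=0)]))
    growth = np.exp(r * dt)
    Pi = np.maximum(K - S[-1], 0.0)
    for i in range(n_steps - 1, -1, -1):
        B = np.vander(S[i] / S0, 4, increasing=True)    # assumed basis
        cont = fit(B, Pi)                 # step (b): E[Pi_{t_{i+1}} | S_{t_i}]
        S_hat = S[i + 1] - fit(B, S[i + 1])
        B_S = B * S_hat[:, None]
        a, *_ = np.linalg.lstsq(B_S, Pi - cont, rcond=None)
        phi = B @ a                       # step (a): hedge ratio
        Pi_tilde = (Pi - phi * (S[i + 1] - growth * S[i])) / growth
        payoff = np.maximum(K - S[i], 0.0)
        # Step (c); the in-the-money restriction is an assumption of this sketch.
        exercise = (payoff > 0.0) & (cont / growth < payoff)
        Pi = np.where(exercise, payoff, Pi_tilde)
    return Pi.mean()

print(sbh_american_put())
```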
5.4.5 Remarks on the Computational Efficiency
This new method, based on the computation of optimal hedges and of their cost, is not only more realistic than pricing in the Black-Scholes framework. Under the Black-Scholes assumptions, the new framework is also much more efficient than regular Monte Carlo methods: in particular, the comparison with Least-Squares Monte Carlo shows that the new method requires far fewer computations. We show this by estimating the order of convergence for both methods.
First, we take a look at Least-Squares Monte Carlo. This convergence analysis is similar to
the analysis of plain Monte Carlo in Section 1.4.2. We use a few assumptions which allow a brief
exposition, because a detailed analysis goes beyond the scope of this work.19 However, the result
is also valid for much more general assumptions which is confirmed by the numerical examples
presented later in this chapter.
We assume that the regression leads to perfect estimates of the conditional expectation function $P^e(S_{t_i}, t_i) = E_Q[V(S, t_{i+1}) \mid S_{t_i}, t_i]$ in Equation (1.34). We also assume that a constant share of m out of the n samples is exercised, while the rest reaches maturity. Finally, we assume that all exercised paths are exercised at the asset price $S^*$ with payoff $P(S^*)$, which is constant for all t, and that the interest rate r is zero. Consequently, the expected variance of a single Monte Carlo sample is given by
$$\sigma^2_{LS} = \operatorname{var}\left[ \frac{n-m}{n} V(S, t_T) + \frac{m}{n} P(S^*) \right] = \left( \frac{n-m}{n} \right)^2 \operatorname{var}[V(S, t_T)].$$
That means a Monte Carlo estimate, which is the mean value of the samples, has a variance of $\sigma^2_{LS} / n$. Consequently, the standard deviation is in Landau-O notation
$$O\left( \frac{1}{\sqrt{n}} \right)$$
(cp. Equation (1.24), Section 1.4.2). This result is independent of the number of time steps made in the Least-Squares Monte Carlo.
The situation is different if we look at the Simulation-Based Hedging. Again, we assume that the regressions lead to perfect estimates of the conditional expectation function. That means the Simulation-Based Hedging samples deviate only by the hedging error due to finite time-stepping. Furthermore, we assume that the underlying asset follows the Black-Scholes assumptions (i.e. µ = r = 0). In order to obtain a good estimate of the hedging error $H_{t_i}$ between time $t_i$ and $t_{i+1}$, we look at the total hedging error until timestep $t_i$, which is
$$\Pi_{t_i} - E[\Pi_{t_i} \mid S_{t_i}, t_i].$$
19 See e.g. Stentoft [108] for a more detailed convergence analysis.
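If we now define the value function $V(S_{t_i}, t_i) := E[\Pi_{t_i} \mid S_{t_i}, t_i]$, we get the hedging error $H_{t_i}$ as
$$\begin{aligned}
H_{t_i} &:= \left( \Pi_{t_{i+1}} - V(S_{t_{i+1}}, t_{i+1}) \right) - \left( \Pi_{t_i} - V(S_{t_i}, t_i) \right) \\
&= \left( \Pi_{t_{i+1}} - \Pi_{t_i} \right) - \left( V(S_{t_{i+1}}, t_{i+1}) - V(S_{t_i}, t_i) \right). \qquad (5.12)
\end{aligned}$$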
From Equation (5.5) we know that with r = 0
$$\Pi_{t_{i+1}} - \Pi_{t_i} = \phi_{t_i} \left( S_{t_{i+1}} - S_{t_i} \right)$$
holds. That means together with (5.12)
$$H_{t_i} = \phi_{t_i} \left( S_{t_{i+1}} - S_{t_i} \right) - \left( V(S_{t_{i+1}}, t_{i+1}) - V(S_{t_i}, t_i) \right).$$
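With the Taylor expansion
$$\begin{aligned}
V(S_{t_{i+1}}, t_{i+1}) - V(S_{t_i}, t_i) ={}& \frac{\partial V(S_{t_i}, t_i)}{\partial S} \left( S_{t_{i+1}} - S_{t_i} \right) + \frac{\partial V(S_{t_i}, t_i)}{\partial t} (t_{i+1} - t_i) \\
&+ \frac{1}{2} \frac{\partial^2 V(S_{t_i}, t_i)}{\partial S^2} \left( S_{t_{i+1}} - S_{t_i} \right)^2 + O\left( (t_{i+1} - t_i)^2 + \left( S_{t_{i+1}} - S_{t_i} \right)^3 \right)
\end{aligned}$$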
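and the approximation20
$$\phi_{t_i} = \frac{\operatorname{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]} \approx \frac{\partial V(S_{t_i}, t_i)}{\partial S}$$
we obtain
$$-H_{t_i} \approx \frac{1}{2} \frac{\partial^2 V(S_{t_i}, t_i)}{\partial S^2} \left( S_{t_{i+1}} - S_{t_i} \right)^2 + \frac{\partial V(S_{t_i}, t_i)}{\partial t} (t_{i+1} - t_i) + \text{rest},$$
where $\text{rest} \in O\left( (t_{i+1} - t_i)^2 + (S_{t_{i+1}} - S_{t_i})^3 \right)$. Now, the variance of the hedging error of the portfolio is given by
$$\begin{aligned}
\operatorname{var}[H_{t_i} \mid S_{t_i}, t_i] &\approx \operatorname{var}\left[ \frac{1}{2} \frac{\partial^2 V(S_{t_i}, t_i)}{\partial S^2} \left( S_{t_{i+1}} - S_{t_i} \right)^2 + \frac{\partial V(S_{t_i}, t_i)}{\partial t} (t_{i+1} - t_i) + \text{rest} \,\Big|\, S_{t_i}, t_i \right] \\
&= \operatorname{var}\left[ \frac{1}{2} \frac{\partial^2 V(S_{t_i}, t_i)}{\partial S^2} \left( S_{t_{i+1}} - S_{t_i} \right)^2 + \text{rest} \,\Big|\, S_{t_i}, t_i \right],
\end{aligned}$$
since $\frac{\partial V(S_{t_i}, t_i)}{\partial t} (t_{i+1} - t_i)$ is a constant at time $t_i$ and does not contribute to the variance of the one-step error. Assuming that $\left\| \frac{1}{2} \frac{\partial^2 V(S_{t_i}, t_i)}{\partial S^2} \right\|_\infty = c < \infty$ and that $\|\text{rest}\|_\infty \approx 0$, and using the Black-Scholes assumptions for the asset price process with r = 0, we obtain
$$\begin{aligned}
\operatorname{var}[H_{t_i} \mid S_{t_i}, t_i] &\le \operatorname{var}\left[ c \cdot (S_{t_{i+1}} - S_{t_i})^2 \mid S_{t_i}, t_i \right] \\
&\approx \operatorname{var}\left[ c \cdot \left( S_{t_i} \left( e^{(-0.5\sigma^2)(t_{i+1} - t_i) + \sigma \theta \sqrt{t_{i+1} - t_i}} - 1 \right) \right)^2 \,\Big|\, S_{t_i}, t_i \right] \\
&\approx \operatorname{var}\left[ c \cdot (S_{t_i})^2 \left( (-0.5\sigma^2)(t_{i+1} - t_i) + \sigma \theta \sqrt{t_{i+1} - t_i} \right)^2 \,\Big|\, S_{t_i}, t_i \right] \\
&\approx (S_{t_i})^4 c^2 \operatorname{var}\left[ (-0.5\sigma^2)^2 (t_{i+1} - t_i)^2 + \sigma^2 \theta^2 (t_{i+1} - t_i) \mid S_{t_i}, t_i \right] \\
&\in O\left( (t_{i+1} - t_i)^2 \right)
\end{aligned}$$
(dropping the higher-order cross term of the squared bracket in the second-to-last step)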
20 This can be seen from the fact that with zero interest-rate in a one-period Black-Scholes market, both expressions
minimize the variance of the hedge portfolio. Further discussions can be found in [33], Section 3.2.
with θ ∼ N (0, 1) holds, which is the variance of a single time step.
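The $O((t_{i+1} - t_i)^2)$ behavior of the one-step variance can be checked with a small simulation. This is only a sanity check under the stated assumptions (r = 0, constant c, NumPy available); the function name and default values are chosen here for illustration. Halving the step size should reduce the variance of $c \cdot (S_{t_{i+1}} - S_{t_i})^2$ by a factor close to four.

```python
import numpy as np

def one_step_error_variance(dt, sigma=0.4, S=100.0, c=1.0,
                            n=1_000_000, seed=0):
    """Monte Carlo estimate of var[c * (S_{t_{i+1}} - S_{t_i})^2] with r = 0."""
    theta = np.random.default_rng(seed).standard_normal(n)
    dS = S * (np.exp(-0.5 * sigma**2 * dt + sigma * np.sqrt(dt) * theta) - 1.0)
    return np.var(c * dS**2)

# Same seed for both step sizes, so the comparison uses common random numbers.
ratio = one_step_error_variance(1 / 64) / one_step_error_variance(1 / 128)
print(ratio)   # a value near 4 is consistent with a variance of order dt^2
```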
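Since there are $T := t_T / (t_{i+1} - t_i)$ time steps in each path (assuming equal time steps), the total variance of the set of Simulation-Based Hedging paths is given by
$$\operatorname{var}\left[ \sum_{i=0}^{T-1} H_{t_i} \right].$$
Assuming furthermore that the hedging errors are uncorrelated and that $\operatorname{var}[H_{t_{i^*}} \mid S_{t_{i^*}}, t_{i^*}]$ denotes the largest hedging error variance, i.e. $\operatorname{var}[H_{t_{i^*}} \mid S_{t_{i^*}}, t_{i^*}] \ge \operatorname{var}[H_{t_i} \mid S_{t_i}, t_i] \ \forall i = 0, \ldots, T-1$, then
$$\operatorname{var}\left[ \sum_{i=0}^{T-1} H_{t_i} \right] \le T \operatorname{var}[H_{t_{i^*}}] \in O\left( \frac{t_T \, (t_{i^*+1} - t_{i^*})^2}{t_{i^*+1} - t_{i^*}} \right) = O\left( t_{i^*+1} - t_{i^*} \right).$$
That means the variance of the mean of n sample paths of hedged portfolios is in $O\left( \frac{1}{nT} \right)$ and the standard deviation is in
$$O\left( \frac{1}{\sqrt{nT}} \right).$$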
These results show that the Least-Squares Monte Carlo method requires about T times as many asset path simulations as the Simulation-Based Hedging method for a computation with the same accuracy. Using equally many asset paths, both methods require the same order of numerical operations: the regressions of the Simulation-Based Hedging require about triple21 the computations of the Least-Squares Monte Carlo, which is irrelevant in Landau-O notation. In practice, where many time steps are used, Simulation-Based Hedging is an order of magnitude faster than Least-Squares Monte Carlo for pricing American put options.
5.4.6 Numerical Experiments
After we theoretically confirmed the high speed of the Simulation-Based Hedging and went through a simple example, we now want to gain more insight into the practical properties of this new method. A few numerical experiments demonstrate its efficiency and other effects compared with regular Least-Squares Monte Carlo.
Black-Scholes: American Put Options
First, we focus on an American option as in Table 5.5. Its Black-Scholes price is given by V = 13.66761 (all digits correct, computed by a PDE method). In order to compare the continuous-time PDE value with the discrete American (thus Bermudan) option values of the Monte Carlo methods, we first compare the computed option values for different numbers of time steps. Note
21 Simulation-Based Hedging requires three regressions at each time-step $t_i$ for estimating $\hat{\Pi}_{t_i} = E[\Pi_{t_{i+1}} \mid S_{t_i}, t_i]$, $\hat{S}_{t_i} = E[S_{t_{i+1}} \mid S_{t_i}, t_i]$ and $\phi_{t_i}$, while Least-Squares Monte Carlo requires a regression for estimating $E[V_{t_{i+1}} \mid S_{t_i}, t_i]$, only.
Table 5.5 Base case data of an American put option.
General features
  strike price K                   100
  risk-free rate r                 5% p.a.
  volatility σ                     40% p.a.
  drift rate µ                     5% p.a.
  dividends D_i                    none
  maturity time t_T                1 year
  terminal value P(S, t_T)         max(K − S_{t_T}, 0)
  exercise price at t_j < t_T      max(K − S_{t_j}, 0)
that we are computing Black-Scholes prices, i.e. we are not assigning a margin capital cost to the
remaining hedging error.
A summary of these results is presented in Table 5.6, which contains the values for the Least-Squares Monte Carlo, the Simulation-Based Hedging and the Hedged Monte Carlo22. Both the Least-Squares Monte Carlo and the Simulation-Based Hedging seem to converge to the reference value: 128 time steps are required for cent-accurate values (±0.01). On the other hand, our implementation of the Hedged Monte Carlo does not converge to the reference value, which is mainly due to the nature of the Hedged Monte Carlo. The main difference between Hedged Monte Carlo and Simulation-Based Hedging is that in Hedged Monte Carlo, the regression is performed on a function that is interpreted as the option value V, which is unique in each state. In Simulation-Based Hedging, instead, we perform regressions on the amount Π required for a self-financing hedge, which is stochastic in each state. Consequently, the Hedged Monte Carlo has a large interpolation error in the expected value V at each time step, which increases with the number of time steps. This is not the case for Simulation-Based Hedging, where the regression (or interpolation) error influences the hedging strategy φ only; the expected value of Π is not affected.
Another experiment (Table 5.7) shows that if the drift rate is changed, the option value with few time steps can differ considerably, but with an increasing number of time steps, the value still converges to the Black-Scholes price. This is expected since the drift of the underlying is irrelevant for the Black-Scholes price. However, one can see that the option price depends on the underlying market such that even daily hedging (256 time steps) introduces a difference of about 0.03 (13.6395 for drift µ = 30%) compared with the Black-Scholes price of 13.66761. This is not a model or numerical error: a real-world trader would obtain the same results for this market with a daily hedge rebalancing.23
A different aspect is presented in Figure 5.2: one can see that the Simulation-Based Hedging is indeed considerably more accurate due to its faster convergence. While the standard deviation of the sample of Least-Squares Monte Carlo valuations does not change with an increasing number of time steps, it decreases quickly for the Simulation-Based Hedging valuations. Again, this is expected as we
22 As presented in [96].
23 For details of different hedging strategies we refer the reader to [117], p. 326.
Table 5.6 Valuation results for an American option as in Table 5.5, comparing the values of Least-Squares Monte Carlo, Potters' method (Hedged Monte Carlo) and the Simulation-Based Hedging. All values are computed using the same set of 1,000,000 asset paths and 1 to 256 time-steps of the sample. The PDE reference value is 13.66761.

# time steps   V Simulation-Based Hedging   V Least-Squares Monte Carlo   V Hedged Monte Carlo
256            13.6605                      13.6584                       13.8073
128            13.6593                      13.6605                       13.8016
64             13.6533                      13.6569                       13.7787
32             13.6431                      13.6476                       13.7384
16             13.6268                      13.6317                       13.7071
8              13.5906                      13.5782                       13.6283
4              13.5126                      13.4927                       13.5257
2              13.3926                      13.3574                       13.2526
1              13.1474                      13.1288                       12.4762
Table 5.7 Valuation results for an American option as in Table 5.5, comparing the values of Simulation-Based Hedging with different drift rates µ. All values are computed using the same set of 1,000,000 asset paths. The PDE reference value is 13.66761.

Simulation-Based Hedging with different drift rates
# time steps   V: µ = 5%   V: µ = 30%   V: µ = −15%
256            13.6605     13.6395      13.6588
128            13.6593     13.6205      13.6540
64             13.6533     13.5764      13.6518
32             13.6431     13.4908      13.6352
16             13.6268     13.3285      13.6062
8              13.5906     12.9984      13.5357
4              13.5126     12.3494      13.4267
2              13.3926     11.1942      13.1903
1              13.1474      9.3091      12.7706
theoretically saw the faster convergence of the Simulation-Based Hedging in Section 5.4.5.
Black-Scholes + Transaction Cost
After the verification in the previous section that the proposed Simulation-Based Hedging method yields the correct values, we extend the portfolio strategy by introducing a proportional transaction cost factor κ:
$$B^j_{t_i} = e^{-r(t_{i+1}-t_i)} \left( B^j_{t_{i+1}} + (\phi^j_{t_{i+1}} - \phi^j_{t_i}) S^j_{t_{i+1}} + \kappa \left| (\phi^j_{t_{i+1}} - \phi^j_{t_i}) S^j_{t_{i+1}} \right| \right).$$
[Figure 5.2: Convergence with the number of time-steps. Two panels, "Simulation-Based Hedging" and "Least-Squares Monte Carlo", each plotting the option value (axis 13 to 14) against the log2 number of time steps (1 to 7).]
Figure 5.2: Valuation results for an American option as in Table 5.5, comparing the values of Least-Squares Monte Carlo and the Simulation-Based Hedging. All values are mean values of 100 valuations, computed using a set of $10^4$ asset paths and 1 to 128 time steps. The presented interval is the mean value of the valuations ± the standard deviation of the valuations. The PDE reference value is 13.66761.
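The resulting optimization problem is
$$\begin{aligned}
\{\phi_{t_i}\} &= \arg\min_{\phi_{t_i}} \left( \operatorname{var}[\Pi_{t_i} \mid S_{t_i}, t_i] \right) \\
&= \arg\min_{\phi_{t_i}} \operatorname{var}\left[ e^{-r(t_{i+1}-t_i)} \left( \Pi_{t_{i+1}} - \phi_{t_i} (S_{t_{i+1}} - e^{r(t_{i+1}-t_i)} S_{t_i}) + \kappa \left| (\phi_{t_{i+1}} - \phi_{t_i}) S_{t_{i+1}} \right| \right) \,\Big|\, S_{t_i}, t_i \right] \\
&= \arg\min_{\phi_{t_i}} \operatorname{var}\left[ \Pi_{t_{i+1}} - \left( \phi_{t_i} - \kappa \left| \phi_{t_{i+1}} - \phi_{t_i} \right| \right) S_{t_{i+1}} \,\Big|\, S_{t_i}, t_i \right].
\end{aligned}$$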
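The corresponding solution is similar to the one in the case without transaction costs (Equation (5.9)), i.e. the solution is
$$\phi_{t_i} - \kappa \left| \phi_{t_{i+1}} - \phi_{t_i} \right| = \frac{\operatorname{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]},$$
which is for $\phi_{t_{i+1}} > \phi_{t_i}$:
$$\phi_{t_i} = \frac{1}{1+\kappa} \left( \frac{\operatorname{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]} + \phi_{t_{i+1}} \kappa \right),$$
and for $\phi_{t_{i+1}} < \phi_{t_i}$:
$$\phi_{t_i} = \frac{1}{1-\kappa} \left( \frac{\operatorname{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]} - \phi_{t_{i+1}} \kappa \right).$$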
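Interestingly, for
$$\frac{1}{1-\kappa} \left( \frac{\operatorname{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]} - \phi_{t_{i+1}} \kappa \right) < \phi_{t_i} < \frac{1}{1+\kappa} \left( \frac{\operatorname{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]} + \phi_{t_{i+1}} \kappa \right),$$
the solution does not exist. We propose that in this situation $\phi_{t_i} = \phi_{t_{i+1}}$ holds, which means that one does not change the hedge.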
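The case distinction can be written as a small helper. The Python sketch below is an illustration only: `c` stands for the ratio of conditional covariance and variance, and the function name and the order in which the two branches are tested are choices made here, not taken from the text. Each branch is accepted only if the resulting hedge is consistent with the sign assumption under which it was derived; otherwise the hedge is left unchanged, as proposed above.

```python
def hedge_with_costs(c, phi_next, kappa):
    """Solve phi - kappa * |phi_next - phi| = c for phi, given phi_next."""
    up = (c + kappa * phi_next) / (1.0 + kappa)     # derived for phi_next > phi
    down = (c - kappa * phi_next) / (1.0 - kappa)   # derived for phi_next < phi
    if up < phi_next:
        return up
    if down > phi_next:
        return down
    return phi_next                                 # fallback: keep the hedge

for c, phi_next in [(-0.5, -0.3), (-0.3, -0.5), (-0.4, -0.4)]:
    phi = hedge_with_costs(c, phi_next, kappa=0.02)
    print(c, phi_next, phi, phi - 0.02 * abs(phi_next - phi))
```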
Table 5.8 Base case data of a European put option. The Black-Scholes option price is V_BS = 13.1459.
General features
  strike price K                 100
  risk-free rate r               5% p.a.
  volatility σ                   40% p.a.
  drift rate µ                   5% p.a.
  dividends D_i                  none
  maturity time T                1 year
  terminal value P(S, T)         max(K − S_{t_T}, 0)
  exercise price at t < T        0
Table 5.9 Valuation results for a European option as in Table 5.8 with proportional transaction costs (κ) in a risk-neutral setting, i.e. µ = r; 8 time steps are used for rehedging. The table provides values for comparing the non-admissible strategy φ with the approximate strategy φ̂. All values are computed such that the 95% confidence interval is at most ±0.005.

         strategy φ               strategy φ̂
κ        V          std(Π_{t_0})  V          std(Π_{t_0})
0.001    13.238     8.0476        13.267     8.052
0.020    14.847     8.2429        15.581     8.433
Now, it is important to note that, unlike in the case without transaction costs, the hedging strategy $\phi_{t_i}$ is not measurable at time $t_i$. The strategy at time $t_i$ depends on the strategy at time $t_{i+1}$, which is clearly not available at time $t_i$. Consequently, we approximate $\phi_{t_{i+1}}$ with a simple estimate, i.e. $\phi_{t_{i+1}} \approx \phi_{t_i}$. This leads to an approximate hedging strategy $\hat{\phi}_{t_i}$,
$$\hat{\phi}_{t_i} = \frac{\operatorname{cov}[\Pi_{t_{i+1}}, S_{t_{i+1}} \mid S_{t_i}, t_i]}{\operatorname{var}[S_{t_{i+1}} \mid S_{t_i}, t_i]},$$
which is the same as in the case without transaction costs. Table 5.9 presents values of the example with data in Table 5.8 for the non-admissible strategy φ as well as for the admissible approximate solution φ̂. In the case with low transaction costs, κ = 0.001, the option value and the hedging error are not much different for both solutions, which suggests that the approximate solution is useful in this case. In the case with large transaction costs, the hedging error is still not much different (about 2% difference), but the option value is affected by almost 5%, which is significant. However, it is important to emphasize that the optimal strategy $\phi_{t_i}$ is not admissible and thus super-optimal, because it is measurable only at time $t_{i+1}$.
Leland [79] already solved the time-based hedging problem for the case of a Black-Scholes delta hedge in a risk-neutral setting with constant revision intervals $\Delta t = (t_{i+1} - t_i)$. His result is that one effectively has to price the option with an adjusted volatility
$$\sigma^2_{Leland} = \sigma^2 \left( 1 + \sqrt{\frac{8}{\pi}} \, \frac{\kappa}{\sigma \sqrt{\Delta t}} \right),$$
Table 5.10 Valuation results for a European option as in Table 5.8 with proportional transaction costs (κ = 0.001) in a risk-neutral setting, i.e. µ = r. All values are computed such that the 95% confidence interval is at most ±0.005.

n_t    V, κ = 0.001   V_Leland
512    13.864         13.851
256    13.669         13.659
128    13.528         13.521
64     13.426         13.423
32     13.351         13.353
16     13.300         13.303
8      13.267         13.268
4      13.249         13.243
2      13.228         13.226
1      13.219         13.213
such that the option price is given by
$$V_{Leland} = V_{BS}(\sigma_{Leland}) + \| \kappa \, \theta_{t_0} S_{t_0} \|, \qquad (5.13)$$
where $V_{BS}(\sigma)$ is a function which returns the Black-Scholes price for the option under consideration and $\theta_{t_0}$ is the Black-Scholes delta at time $t_0$. The additive term $\| \kappa \, \theta_{t_0} S_{t_0} \|$ results from our assumption that the transaction cost for the position $\theta_{t_0} \cdot S_{t_0}$ bought at the initial time has to be paid. This is not the case in the original model of Leland. Furthermore, we assume physical delivery at maturity, which corresponds to the Leland model.
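Equation (5.13) is easy to evaluate directly. The Python sketch below uses only the standard library and the data of Table 5.8; the function names are hypothetical, and evaluating the delta at the unadjusted volatility σ is an assumption made here, since the text does not state which volatility enters the delta. With these choices the sketch should land close to the Leland column of Table 5.10.

```python
from math import erf, exp, log, pi, sqrt

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_put(S, K, r, sigma, T):
    """Black-Scholes price and delta of a European put."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    price = K * exp(-r * T) * norm_cdf(-d2) - S * norm_cdf(-d1)
    return price, norm_cdf(d1) - 1.0

def leland_put(S, K, r, sigma, T, kappa, n_steps):
    """Price with Leland's adjusted volatility plus the initial transaction cost."""
    dt = T / n_steps
    sigma_leland = sigma * sqrt(1.0 + sqrt(8.0 / pi) * kappa / (sigma * sqrt(dt)))
    price, _ = bs_put(S, K, r, sigma_leland, T)
    _, delta = bs_put(S, K, r, sigma, T)    # assumption: delta at unadjusted sigma
    return price + abs(kappa * delta * S)

for n_t in (256, 8, 1):
    print(n_t, round(leland_put(100.0, 100.0, 0.05, 0.4, 1.0, 0.001, n_t), 3))
```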
Table 5.10 presents the results for a European option with data in Table 5.8. The Black-Scholes price of the option is 13.1459, the column V presents the Simulation-Based Hedging prices for a risk-neutral drift µ = r, and $V_{Leland}$ presents the price with Leland's adjusted volatility according to Equation (5.13). In most cases the difference between the Leland price and the Simulation-Based Hedging price is small and within the confidence interval of the Monte Carlo estimates. However, for decreasing time intervals, i.e. larger numbers of time steps T, the costs in Simulation-Based Hedging seem to increase a bit faster than in Leland's model. This is mainly due to numerical properties: the estimated representation of the hedging strategy leads to additional oscillations in the hedge portfolio, which result in higher transaction costs.
Simulation-Based Hedging with Transaction Cost and Margin Capital Cost
In the previous paragraphs, we saw how the transaction costs behave in the Simulation-Based Hedging setting. The values missing for a valuation of the ask price $V^{ask}_{t_0}$ of an option are the margin capital costs $C_R$. These capital costs are easy to obtain in the simulation framework since the distribution of the required capital in the portfolio $\Pi_{t_0}$ is already computed using simulated asset paths in the Monte Carlo algorithm. In this case, we are interested in the conditional value at risk (CVaR). Consequently, we can compute the required margin capital cost $C_R$ by the corresponding
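quantile of paths (using Equation (5.10)):
$$\begin{aligned}
C_R &= e^{-r(t_T - t_0)} \left( e^{r_e (t_T - t_0)} - 1 \right) CVaR_{95\%}(\Pi_{t_0}), \\
CVaR_\alpha(\Pi_{t_0}) &:= E[\Pi_{t_0} \mid \Pi_{t_0} \le VaR_\alpha(\Pi_{t_0})], \\
VaR_\alpha(\Pi_{t_0}) &:= \sup\{ x \in \mathbb{R} \mid P(\Pi_{t_0} \le x) < \alpha \}.
\end{aligned}$$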
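On a finite sample of portfolio values these definitions translate into order statistics. The Python sketch below transcribes the three formulas literally, assuming NumPy; the function names are hypothetical, and the handling of the empirical quantile (the smallest sample value whose empirical distribution function reaches α) is one possible reading of the supremum in the definition of the value at risk.

```python
import numpy as np

def var_cvar(samples, alpha=0.95):
    """Empirical VaR_alpha and CVaR_alpha following the definitions above."""
    x = np.sort(np.asarray(samples, dtype=float))
    # Smallest k with k / n >= alpha; the small offset guards against round-off.
    k = int(np.ceil(alpha * x.size - 1e-12))
    var = x[k - 1]
    cvar = x[x <= var].mean()
    return var, cvar

def margin_capital_cost(samples, r, r_e, horizon, alpha=0.95):
    """C_R = exp(-r * horizon) * (exp(r_e * horizon) - 1) * CVaR_alpha."""
    _, cvar = var_cvar(samples, alpha)
    return np.exp(-r * horizon) * (np.exp(r_e * horizon) - 1.0) * cvar

demo = np.arange(1, 101)      # toy sample 1, 2, ..., 100
print(var_cvar(demo), margin_capital_cost(demo, r=0.05, r_e=0.20, horizon=1.0))
```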
Assuming an expected return on equity capital $r_e$ of 20% p.a., we obtain the capital costs presented in Table 5.11 for the hedge of a European option with data in Table 5.8 and proportional transaction costs κ = 0.001. Comparing the prices obtained with different numbers of time steps, a few facts are obvious: the transaction costs are lowest with only a single time step and increase with the number of time steps. The reverse is true for the required margin capital costs $C_R$, i.e. the more time steps are used for rehedging, the lower the capital costs. This was expected. Now, we can find an optimal strategy which minimizes the sum of transaction costs and margin capital costs: the lowest costs occur at T = 64 time steps, which corresponds to roughly weekly rehedging. This option value is 13.6638 and carries costs of about 0.52 compared with the Black-Scholes price, which is a significant addition. Note that these costs can be dramatically reduced using an expected drift rate µ > r, non-equidistant hedging intervals or move-based rehedges.24
Table 5.11 Valuation results for a European option as in Table 5.8 with proportional transaction costs (κ = 0.001) in a risk-neutral setting, i.e. µ = r. All values are computed such that the 95% confidence interval is at most ±0.005.

T      V, κ = 0.001   trans. V − V_BS   capital cost C_R   trans. + C_R   V^ask_{t_0}
256    13.6690        0.5231            0.1245             0.6476         13.7935
128    13.5280        0.3821            0.1735             0.5556         13.7015
64     13.4260        0.2801            0.2378             0.5179         13.6638
32     13.3510        0.2051            0.3385             0.5436         13.6895
16     13.3000        0.1541            0.4843             0.6384         13.7843
8      13.2670        0.1211            0.6823             0.8034         13.9493
4      13.2490        0.1031            0.9324             1.0355         14.1814
2      13.2280        0.0821            1.2940             1.3761         14.5220
1      13.2190        0.0731            1.7359             1.8090         14.9549
Simulation-Based Hedging with an Econometric Model for the Underlying
This section is not intended to promote a specific model; it is intended to demonstrate the potential of Simulation-Based Hedging. Therefore, we describe an econometric model which is somewhat realistic, but which will not satisfy everybody due to its specific model restrictions. At the end of this section, we briefly summarize the method's abilities and possible extensions.
24 For details of different hedging strategies, we refer the reader to Wilmott [117] and the references therein.
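As an econometric model, we use a GARCH(1,1) volatility model as proposed by Bollerslev [18], with
$$\begin{aligned}
r &= \text{const.} \\
R_{t_i} &= a + \epsilon_{t_{i-1}}, \qquad \epsilon_{t_i} \sim N(0, \sigma^2_{t_i}) \\
\sigma^2_{t_i} &= \alpha + \beta \sigma^2_{t_{i-1}} + \gamma \epsilon^2_{t_{i-1}} \\
S_{t_{i+1}} &= e^{R_{t_i}} S_{t_i}.
\end{aligned}$$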
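Using a parameter estimation algorithm as provided in MATLAB, we estimate the parameters from data (German DAX, daily from 01.01.2003 to 06.11.2006):
$$\begin{aligned}
r &= 3.9\% \\
R_{t_i} &= 9.28 \cdot 10^{-4} + \epsilon_{t_{i-1}}, \qquad \epsilon_{t_i} \sim N(0, \sigma^2_{t_i}) \\
\sigma^2_{t_i} &= 1.69 \cdot 10^{-6} + 0.91 \sigma^2_{t_{i-1}} + 0.079 \epsilon^2_{t_{i-1}} \\
S_{t_{i+1}} &= e^{R_{t_i}} S_{t_i}.
\end{aligned}$$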
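Simulating paths from the estimated model is straightforward. The Python sketch below assumes NumPy and daily time steps; the conversion of the 18% annual starting volatility with 252 trading days per year, the simplified timing of the innovation, and the function name are assumptions made here for illustration.

```python
import numpy as np

def simulate_garch(S0=6500.0, sigma0_annual=0.18, n_days=252, n_paths=1000,
                   a=9.28e-4, alpha=1.69e-6, beta=0.91, gamma=0.079, seed=0):
    """Simulate daily asset paths with GARCH(1,1) variance."""
    rng = np.random.default_rng(seed)
    var = np.full(n_paths, sigma0_annual**2 / 252.0)    # assumed day count
    eps = np.sqrt(var) * rng.standard_normal(n_paths)
    S = np.empty((n_days + 1, n_paths))
    S[0] = S0
    for i in range(n_days):
        S[i + 1] = S[i] * np.exp(a + eps)               # log-return R = a + eps
        var = alpha + beta * var + gamma * eps**2       # variance recursion
        eps = np.sqrt(var) * rng.standard_normal(n_paths)
    return S

paths = simulate_garch()
long_run_var = 1.69e-6 / (1.0 - 0.91 - 0.079)           # stationary daily variance
print(paths.shape, np.sqrt(252.0 * long_run_var))
```

The stationary daily variance α / (1 − β − γ) implied by the estimates corresponds to an annualized volatility of roughly 20% under the assumed day count.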
In order to complicate the setting a little further, we evaluate an option using two hedge instruments. Table 5.12 contains the data of a one-year barrier option V which we would like to sell. The hedge will consist of a two-year option $V_c$ on the same underlying plus the underlying S itself. Consequently, we need a model for the volatility of the hedge. Assuming that $V_c$ were liquidly traded, Figure 5.3 suggests that the implied volatility of $V_c$ can be approximated by
$$\sigma_{implied, t_i} = 0.29 \cdot 0.306 + 0.71 \cdot \sigma_{t_i} \qquad (5.14)$$
with GARCH volatility $\sigma_{t_i}$ at time $t_i$.
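In this setting with two hedge instruments, we set up the portfolio similar to Equation (5.2) as
$$\Pi_{t_i} = B_{t_i} + \phi_{t_i} S_{t_i} + \psi_{t_i} V_{c,t_i}$$
with an option which is tradable at price $V_{c,t_i}$. The self-financing condition leads to
$$\begin{aligned}
e^{r(t_{i+1}-t_i)} B_{t_i} + \phi_{t_i} S_{t_{i+1}} + \psi_{t_i} V_{c,t_{i+1}} &= B_{t_{i+1}} + \phi_{t_{i+1}} S_{t_{i+1}} + \psi_{t_{i+1}} V_{c,t_{i+1}} \\
\Leftrightarrow \quad B_{t_i} &= e^{-r(t_{i+1}-t_i)} \left( B_{t_{i+1}} + (\phi_{t_{i+1}} - \phi_{t_i}) S_{t_{i+1}} + (\psi_{t_{i+1}} - \psi_{t_i}) V_{c,t_{i+1}} \right)
\end{aligned}$$
for the bank account value $B_{t_i}$. From the objective to minimize the variance of $\Pi_{t_i}$ at each timestep $t_i$, we obtain the optimal solution by
$$\{\phi_{t_i}, \psi_{t_i}\} = \arg\min_{\phi_{t_i}, \psi_{t_i}} \left( \operatorname{var}\left[ e^{-r(t_{i+1}-t_i)} \left( \Pi_{t_{i+1}} - \phi_{t_i} (S_{t_{i+1}} - e^{r(t_{i+1}-t_i)} S_{t_i}) - \psi_{t_i} (V_{c,t_{i+1}} - e^{r(t_{i+1}-t_i)} V_{c,t_i}) \right) \,\Big|\, S_{t_i}, t_i \right] \right),$$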
139
Table 5.12 The data for the hedge of an up-and-out barrier option with two hedge instruments in
a market with GARCH volatility.
Up-and-Out Barrier Option
Barrier B
Strike K
Payoff P (S)
Maturity Time T
Kock-out observation
7800
6500
1
100 max(StT − K)
1.00 years
12:00h daily
Hedge Instruments: S and Vc
Underlying S
current asset price St0
current volatility σt0
transaction cost κS
6500
18%
10 basis points (0.001)
Call option
Strike Kc
Payoff Pc (S)
Maturity Time T
transaction cost κV
6500
max(StT − K)
2.00 years
200 basis points (0.02)
using a regression set of basis functions $\{b_1^\phi, b_1^\psi, b_2^\phi, b_2^\psi, \ldots\}$ and the optimal strategy
\[
\phi(x^i) \approx \sum_j \tilde{a}_j^\phi\, b_j^\phi(x^i)
\qquad\text{and}\qquad
\psi(x^i) \approx \sum_j \tilde{a}_j^\psi\, b_j^\psi(x^i),
\]
where the state variable is
\[
x^i := (\sigma_{t_i}^j, S_{t_i}^j),
\]
thus two-dimensional. We solve this regression using thin-plate splines as presented in Chapter 1.
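To illustrate the variance minimization for two hedge instruments, the sketch below treats the simplest possible case: a single time step and constant basis functions, so that the hedge ratios do not depend on the state. This is a simplifying assumption for illustration only; the thesis regresses on thin-plate splines of the two-dimensional state. The function names and the synthetic data are hypothetical. With constant hedge ratios, the minimizer of the sample variance solves a 2×2 system built from sample covariances.

```python
import random


def covariance(xs, ys):
    """Sample covariance (normalized by n) of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


def min_variance_hedge(pi_next, d_s, d_v):
    """Return (phi, psi) minimizing var(pi_next - phi * d_s - psi * d_v).

    d_s and d_v are the discounted excess gains of the two hedge
    instruments; the hedge ratios are assumed constant across states.
    """
    c_ss, c_vv, c_sv = covariance(d_s, d_s), covariance(d_v, d_v), covariance(d_s, d_v)
    c_sp, c_vp = covariance(d_s, pi_next), covariance(d_v, pi_next)
    det = c_ss * c_vv - c_sv ** 2
    if det == 0.0:
        raise ValueError("hedge instruments are perfectly collinear")
    phi = (c_sp * c_vv - c_sv * c_vp) / det   # Cramer's rule for the 2x2 system
    psi = (c_ss * c_vp - c_sv * c_sp) / det
    return phi, psi


# Synthetic check: a liability that is an exact linear combination of the
# two instruments is hedged perfectly by phi = 2 and psi = 3.
rng = random.Random(0)
d_s = [rng.gauss(0.0, 1.0) for _ in range(500)]
d_v = [0.5 * s + rng.gauss(0.0, 1.0) for s in d_s]
pi_next = [1.0 + 2.0 * s + 3.0 * v for s, v in zip(d_s, d_v)]
phi, psi = min_variance_hedge(pi_next, d_s, d_v)
```

Replacing the constant hedge ratios by linear combinations of basis functions of the state turns the 2×2 system into a larger least-squares problem of the same type.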
The numerical results are presented in Table 5.13. These results are accurate only to about ±0.10, since the Monte Carlo evaluation of such a barrier option is challenging. The transaction costs (row t.-cost) are the difference between one evaluation with transaction costs and a second evaluation on the same set of Monte Carlo paths without transaction costs. One can observe that the transaction costs rise dramatically the more often the portfolio is rehedged, while the cost of capital CR stays roughly constant for 1 to 16 time steps and rises for more than 32. Different objectives can now be pursued when choosing the rehedging frequency. Looking at the accumulated costs (transaction costs + margin requirements), the optimal rehedging frequency seems to lie between 0 and 3 times during the one year
Figure 5.3: Comparison of at-the-money implied volatilities (VDAX) and estimated daily volatilities σ_{t_i} for 2000-2007. [Plot: implied volatility in % p.a. (20% to 60%) against date (2000 to 2007); series: VDAX NEW 3M and 0.29 · 30.6% + 0.71 · σ_{t_i}.]
until maturity (1 to 4 time steps). However, one could sell the option for less when one rehedges
8 or 16 times per year.
Table 5.13 Costs and prices computed using Simulation-Based Hedging for the up-and-out barrier call with data in Table 5.12 in a market with GARCH volatility.

  T      V      t.-cost   CR     t.-cost + CR   ψ      φ       V^ask_{t_0}
  256    0.77    9.04     1.71   10.75          1.55   -0.90   11.52
  128    0.77    3.19     0.84    4.03          0.92   -0.69    4.80
   64    0.83    1.28     0.59    1.87          0.75   -0.43    2.69
   32    0.88    0.58     0.47    1.05          0.50   -0.28    1.93
   16    1.00    0.18     0.44    0.61          0.13   -0.05    1.61
    8    1.04    0.14     0.43    0.57          0.03    0.02    1.61
    4    1.18    0.05     0.42    0.47          0.05    0.00    1.65
    2    1.21    0.17     0.41    0.58          0.05   -0.01    1.79
    1    1.48   -0.06     0.42    0.36          0.03   -0.01    1.84
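The columns of Table 5.13 can be cross-checked with a few lines of arithmetic. The sketch below is an illustration rather than thesis code. It assumes that the ask price is the sum of the value V, the transaction costs and the cost of capital CR; this reading is inferred from the cost decomposition in the chapter summary and agrees with the tabulated values up to rounding.

```python
# Rows of Table 5.13 keyed by the number of time steps T:
# (V, t.-cost, CR, ask price)
TABLE_5_13 = {
    256: (0.77, 9.04, 1.71, 11.52),
    128: (0.77, 3.19, 0.84, 4.80),
    64: (0.83, 1.28, 0.59, 2.69),
    32: (0.88, 0.58, 0.47, 1.93),
    16: (1.00, 0.18, 0.44, 1.61),
    8: (1.04, 0.14, 0.43, 1.61),
    4: (1.18, 0.05, 0.42, 1.65),
    2: (1.21, 0.17, 0.41, 1.79),
    1: (1.48, -0.06, 0.42, 1.84),
}


def recomputed_ask(row):
    """Ask price as value plus transaction costs plus cost of capital."""
    value, t_cost, cost_of_capital, _ = row
    return value + t_cost + cost_of_capital


# Largest deviation between the recomputed and the tabulated ask price.
max_gap = max(abs(recomputed_ask(row) - row[3]) for row in TABLE_5_13.values())

# Time-step counts with the lowest tabulated ask price.
lowest_ask = min(row[3] for row in TABLE_5_13.values())
cheapest_ask_steps = sorted(t for t, row in TABLE_5_13.items() if row[3] == lowest_ask)

# Time-step count with the lowest accumulated cost (t.-cost + CR).
lowest_cost_steps = min(TABLE_5_13, key=lambda t: TABLE_5_13[t][1] + TABLE_5_13[t][2])

print(max_gap, cheapest_ask_steps, lowest_cost_steps)
```

The accumulated costs are lowest for very few rehedges, while the ask price is lowest at 8 and 16 time steps, which matches the discussion in the text.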
5.5 Summary
The question this chapter tried to answer was: What should a pricing and hedging strategy look like, especially for the seller of an exotic OTC contract? We found that the price of the option should cover all cost components of the bank. These costs consist of the cost of the hedge plus the cost of the risk involved. While the cost of the hedge is the expected capital required for the hedge including transaction costs, the cost of the risk is the required margin capital for the residual risk of the hedge. The cost of the risk is minimized using a dynamic hedging strategy which minimizes the variance of the hedged portfolio.
In short, this chapter presented a new and versatile Monte Carlo method called Simulation-Based Hedging, capable of pricing derivatives based on optimal hedges. The method can be used in very general market settings including econometric market models such as GARCH. Theoretical considerations and numerical experiments confirm that the new method is capable of efficient pricing and hedging in incomplete markets, while being faster than regular Least-Squares Monte Carlo in complete markets.
Chapter 6
Conclusions
Summary of Results
This thesis provides three new ideas for the application of Least-Squares regressions to the pricing of financial derivatives. All of them contribute to low-variance Monte Carlo pricing, each in a separate context.
The first idea is called Feature Extraction. Parts of it were previously presented by Grau [55]; here the idea has been extended and its theoretical correctness has been shown. The idea is to accelerate the pricing of path-dependent options in complete markets by separating the pricing algorithm into two parts: one which estimates a conditional expected payoff function, and another which uses this function in a numerical integration to determine the option value. As a result, the required computational effort for the pricing of a delayed barrier option could be reduced dramatically. The method is new and basically combines the ideas of Monte Carlo pricing as presented by Boyle [21] with quadrature pricing as presented by Andricopoulos et al. [9].
The second idea is the utilization of sparse basis functions for the Least-Squares Monte Carlo method. The Least-Squares Monte Carlo method introduced by Carrière [32] as well as Longstaff and Schwartz [81] is the state-of-the-art method for pricing exercisable options within a simulation framework. So far, the complexity of the options' payoff in Least-Squares Monte Carlo was very limited since the regression in the algorithm could only handle low dimensionality. The sparse basis functions, which are similar to sparse grids, allow regressions on relatively high-dimensional functions and thus allow the valuation of much more complex derivatives than before. The successful application of Least-Squares Monte Carlo to a Moving Window Asian option demonstrates the ability of this powerful method. No other practical and convergent approach has been presented yet. In a second application, the complex rights of holders and issuers of convertible bonds are implemented for a numerical valuation of a convertible bond. Although a so-called moving window soft call protection is common in convertible bond contracts, this thesis is the first to evaluate this kind of constraint correctly. This work is the first to present a combination of Least-Squares Monte Carlo [32, 81] with the idea of sparse grids [107, 29].
The last application of regression methods leads to the most powerful method of this thesis: a method is presented which can evaluate options much more quickly than comparable methods such as Least-Squares Monte Carlo [32, 81] or Hedged Monte Carlo [96, 95] while dropping the complete market assumption. For the pricing of American put options, the new method is an order of magnitude faster than the state-of-the-art Least-Squares Monte Carlo. This remarkable result is obtained by the direct computation of optimal hedging portfolios. Therefore, we call this new method Simulation-Based Hedging. The option price in an incomplete market is then given by the cost of the hedge plus the cost of the required margin capital for the remaining risk. Since the provided prices are based on a risk-minimizing strategy similar to the variance minimization by Schweizer [104], which a trader can follow in the real world, this approach can be a benchmark for all issuers in the market.
Altogether, this thesis has shown that Least-Squares regression is a useful tool in a wide area of derivatives pricing.
Future Work
There are a few topics open for further research. The main question about the Feature Extraction presented in Chapter 2 is how it could be extended to the pricing of American options. Besides, a careful analysis of the proposed error splitting (Equation (2.5)) would provide more insight into the efficiency of the Feature Extraction.
Even though the sparse basis functions presented in Chapters 3 and 4 create significant speed-ups in the evaluation of exercisable path-dependent options, the required computations are still expensive and further work has to be conducted in order to obtain a fast valuation procedure. Our findings suggest that further research on approximate exercise and call strategies can lead to such a fast valuation procedure.
The Simulation-Based Hedging presented in Chapter 5 could be extended to a joint minimization of transaction costs and costs of risk (costs of the margin account) in an efficient procedure. Furthermore, trading strategies under transaction costs other than the presented time-based rehedging should be analyzed. Finally, another open task is the transfer of Simulation-Based Hedging to other numerical procedures, such as lattice or PDE methods, in order to obtain even better convergence.
Chapter 7
Appendix
7.1 Important Symbols

  Symbol                          Explanation
  α                               the level of confidence
  β                               a regression coefficient
  i, j                            index variables
  m                               number of basis functions
  n                               number of observations
  B = (b_1 ... b_m)               regression basis
  b_j = (b(x^1) ... b(x^n))^T     single basis function of the observations x^1, ..., x^n
  x = (x^1 ... x^n)^T             vector of random observations
  y = (y_1 ... y_n)^T             vector of random observations
  f(x^i)                          some function of the random observation, e.g. a payoff
  z                               state ∈ R^s
  s                               dimension of the state space
  S                               underlying asset price process
  t                               time
  T                               number of time steps
  φ                               number of assets in a portfolio
  θ                               random number, drawn from a standard normal distribution
  f                               some function
  P(S, t_T)                       payoff at maturity time
  Prob(X)                         probability of event X
  E[X]                            expected value of X
  var(X)                          variance of X
  cov(X, Y)                       covariance of X and Y
  std(X)                          standard deviation of X
  ‖·‖_2                           Euclidean vector norm
  κ                               condition number (stability of a problem)
7.2 Notes for the Proof of Theorem 1.4
In this section, we present the proof of Theorem 1.4.
The following lemmas and definitions restate standard properties from linear algebra. For details and proofs we refer the reader to [85], Chapters 6 and 7.
Lemma 7.1 Let $B$ be a possibly infinite-dimensional Euclidean vector space and let $B^m$ be a linear subspace of $B$ with dimension $m$. Then, for every $b \in B$, there exists a unique decomposition
\[
b = \tilde{b} + \epsilon
\]
with $\tilde{b} \in B^m$ and $\epsilon \perp B^m$.
Proof See [85], Satz 6.10. $\Box$
Definition 7.2 Let $B$ be a possibly infinite-dimensional Euclidean vector space and let $B^m$ be a linear subspace of $B$ with dimension $m$. Furthermore, let $b \in B$ and let $b = \tilde{b} + \epsilon$ be the decomposition from Lemma 7.1. Then, $\tilde{b}$ is called the orthogonal projection of $b$ onto $B^m$. The mapping $P : B \to B^m$, which assigns to each $b \in B$ its orthogonal projection $\tilde{b} \in B^m \subset B$, is called the orthogonal projection from $B$ onto $B^m$.
Lemma 7.3 Using any orthonormal basis $b_1, \ldots, b_m$ of $B^m$, the orthogonal projection of $b$ onto $B^m$ is given by
\[
P(b) = \sum_{j=1}^{m} \langle b, b_j \rangle\, b_j
\]
with a scalar product $\langle \cdot, \cdot \rangle$.
Proof See [85], Bemerkung 6.12. $\Box$
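Lemma 7.4 Let $b_1, \ldots, b_m$ be an arbitrary basis of $B^m$. Then $P(b)$ has a unique representation $P(b) = \sum_{j=1}^{m} a_j b_j$, where $a_1, \ldots, a_m$ are determined by the unique solution to
\[
\begin{pmatrix}
\langle b_1, b_1 \rangle & \cdots & \langle b_1, b_m \rangle \\
\vdots & \ddots & \vdots \\
\langle b_m, b_1 \rangle & \cdots & \langle b_m, b_m \rangle
\end{pmatrix}
\begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix}
=
\begin{pmatrix} \langle b, b_1 \rangle \\ \vdots \\ \langle b, b_m \rangle \end{pmatrix}.
\]
Proof See [85], Bemerkung 6.14. $\Box$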
Lemma 7.5 Let $B$ be a possibly infinite-dimensional Euclidean vector space, let $B^m$ be a linear subspace of $B$ with dimension $m$ and let $P$ be the orthogonal projection onto $B^m$. Then, for all $b \in B$,
\[
\| b - P(b) \| < \| b - \tilde{b} \| \quad \forall\, \tilde{b} \in B^m,\ \tilde{b} \neq P(b)
\]
holds, i.e. $P(b)$ is the best approximation of $b$ in $B^m$ and the approximation problem has a solution which is unique for the norm $\|b\| := \sqrt{\langle b, b \rangle}$ induced by the inner product $\langle \cdot, \cdot \rangle$.
Proof See [85], Satz 6.16. $\Box$
Finally, having described the above setting, we can start with the proof of Theorem 1.4.
Proof of Theorem 1.4: We first look at the local basis approximation of a non-noisy observation sample $(X, y)$ with $y_i = f(x^i)$, $f(x) = \sum_{j=1}^{\infty} a_j b_j(x)$ and basis functions $b_1, \ldots, b_m$. We can use Theorem 1.1 to get
\begin{align*}
\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} f(x^i)\, b_j(x^i) &= \int_D f(x)\, b_j(x)\, p(x)\, dx \\
&= \langle f, b_j \rangle_r
\end{align*}
with random vectors $x^i \in \mathbb{R}^s$, $i = 1, \ldots, n$, which are distributed according to the probability density function $p(x)$. Lemma 7.5 provides that if we choose a basis $b_1, \ldots, b_m \in B^m$, we can approximate $f \in B$, $B^m \subset B$ by
\[
f(x) \approx \sum_{j=1}^{m} \tilde{a}_j^n\, b_j(x)
\]
using the projection given by Lemma 7.4. Since the projection is unique, the coefficients $\tilde{a}_j^n$, $j = 1, \ldots, m$, are unique. Since the basis functions $b_1, \ldots, b_m$ form a basis of $B^m$, the coefficients $\tilde{a}_j^n$ of the approximation are equal to the coefficients $a_j$ of the represented function $f(x) = \sum_{j=1}^{\infty} a_j b_j(x)$, i.e. $\tilde{a}_j^n = a_j$.
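If we now introduce independent noise into the observations such that the sample is given by $(X, y)$ with $y_i = f(x^i) + \epsilon_i$, where $\epsilon_i$ and $x^i$ are independent, $i = 1, \ldots, n$, and $f(x) = \sum_{j=1}^{\infty} a_j b_j(x)$, then $a_j = \tilde{a}_j^n$ still holds for $n \to \infty$, since the discrete version of the scalar product of Equation (1.1) is
\begin{align*}
\langle b, y \rangle &= \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} b(x^i)\, y_i \\
&= \lim_{n\to\infty} \left( \frac{1}{n} \sum_{i=1}^{n} b(x^i) f(x^i) + \frac{1}{n} \sum_{i=1}^{n} b(x^i)\, \epsilon_i \right) \\
&= \int_D b(x) f(x) p(x)\, dx + E[b(x)\,\epsilon] && \text{(Theorem 1.1)} \\
&= \int_D b(x) f(x) p(x)\, dx + E[b(x)] \cdot E[\epsilon_i] && \text{(independence of $x$, $\epsilon$)} \\
&= \int_D b(x) f(x) p(x)\, dx.
\end{align*}
The last equality requires the noise to have zero mean, $E[\epsilon_i] = 0$.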
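If we now take a closer look at the explicit representation of the coefficient vector $\tilde{a}^n$:
\begin{align*}
\tilde{a}^n &= (B(X)^T B(X))^{-1} B(X)^T y \\
&= \begin{pmatrix}
\sum_{i=1}^{n} b_1(x^i) b_1(x^i) & \cdots & \sum_{i=1}^{n} b_1(x^i) b_m(x^i) \\
\vdots & \ddots & \vdots \\
\sum_{i=1}^{n} b_m(x^i) b_1(x^i) & \cdots & \sum_{i=1}^{n} b_m(x^i) b_m(x^i)
\end{pmatrix}^{-1}
\begin{pmatrix}
\sum_{i=1}^{n} b_1(x^i)\, y_i \\ \vdots \\ \sum_{i=1}^{n} b_m(x^i)\, y_i
\end{pmatrix} \\
&= \begin{pmatrix}
\frac{1}{n}\sum_{i=1}^{n} b_1(x^i) b_1(x^i) & \cdots & \frac{1}{n}\sum_{i=1}^{n} b_1(x^i) b_m(x^i) \\
\vdots & \ddots & \vdots \\
\frac{1}{n}\sum_{i=1}^{n} b_m(x^i) b_1(x^i) & \cdots & \frac{1}{n}\sum_{i=1}^{n} b_m(x^i) b_m(x^i)
\end{pmatrix}^{-1}
\begin{pmatrix}
\frac{1}{n}\sum_{i=1}^{n} b_1(x^i)\, y_i \\ \vdots \\ \frac{1}{n}\sum_{i=1}^{n} b_m(x^i)\, y_i
\end{pmatrix}.
\end{align*}
Thus
\begin{align*}
\lim_{n\to\infty} \tilde{a}^n &=
\begin{pmatrix}
\langle b_1, b_1 \rangle_r & \cdots & \langle b_1, b_m \rangle_r \\
\vdots & \ddots & \vdots \\
\langle b_m, b_1 \rangle_r & \cdots & \langle b_m, b_m \rangle_r
\end{pmatrix}^{-1}
\begin{pmatrix} \langle f, b_1 \rangle_r \\ \vdots \\ \langle f, b_m \rangle_r \end{pmatrix} \\
&= \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix}.
\end{align*}
$\Box$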
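The convergence statement can be illustrated numerically. The sketch below is not taken from the thesis. It assumes the two basis functions $b_1(x) = 1$ and $b_2(x) = x$, a hypothetical target $f(x) = 2 + 3x$, uniformly distributed observations on $[0, 1]$ and independent zero-mean Gaussian noise. It solves the 2×2 system of Lemma 7.4 with sample averages in place of the scalar products.

```python
import random


def fit_line(xs, ys):
    """Least-squares coefficients (a1, a2) for the basis b1(x) = 1, b2(x) = x.

    The scalar products <b_j, b_k> and <y, b_j> are replaced by sample
    averages, and the resulting 2x2 system is solved by Cramer's rule.
    """
    n = len(xs)
    g11 = 1.0                                        # <b1, b1>
    g12 = sum(xs) / n                                # <b1, b2>
    g22 = sum(x * x for x in xs) / n                 # <b2, b2>
    r1 = sum(ys) / n                                 # <y, b1>
    r2 = sum(x * y for x, y in zip(xs, ys)) / n      # <y, b2>
    det = g11 * g22 - g12 * g12
    return (r1 * g22 - g12 * r2) / det, (g11 * r2 - g12 * r1) / det


def sample(n, noise_std, seed):
    """Draw n observations of f(x) = 2 + 3x with additive Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [2.0 + 3.0 * x + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


exact = fit_line(*sample(100, 0.0, seed=42))       # non-noisy sample
noisy = fit_line(*sample(20000, 0.5, seed=42))     # noisy sample, large n
print(exact, noisy)
```

Without noise the coefficients are recovered up to floating-point error for any sample size, while with noise the estimates approach (2, 3) only as the number of observations grows.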
7.3 Proof of Equation (1.8)
Proof Starting with
\[
\|a\|_2^2 = \sum_{i=1}^{m} \frac{c_i^2}{\sigma_i^2} \geq \frac{1}{\sigma_1^2} \sum_{i=1}^{m} c_i^2
\]
and $U = (u^1, \ldots, u^m)$ we get
\[
\sum_{i=1}^{m} (u^i)^T y\, u^i = \sum_{i=1}^{m} c_i u^i = U^T c.
\]
Furthermore,
\[
\left\| \sum_{i=1}^{m} (u^i)^T y\, u^i \right\|_2^2 = \left\| U^T c \right\|_2^2 = (U^T c)^T (U^T c) = c^T U U^T c = c^T I c = c^T c = \sum_{i=1}^{m} c_i^2
\]
holds with identity matrix $I$. That means
\[
\|a\|_2^2 \geq \frac{1}{\sigma_1^2} \left\| \sum_{i=1}^{m} (u^i)^T y\, u^i \right\|_2^2.
\]
$\Box$
7.4 Proof of Equation Set (4.4)-(4.7)
In this section we follow the arguments of Ayache et al. [12] in order to prove Equations (4.4)-(4.7).
The Equation set (4.4)-(4.7) describes the risk-neutral dynamics of a convertible bond. First, we restate the partial differential inequalities for the convertible value $V$:
\begin{align}
\frac{\partial V}{\partial t} + \frac{\sigma^2}{2} S^2 \frac{\partial^2 V}{\partial S^2} + (r + p\eta) S \frac{\partial V}{\partial S} - (r + p)V + p\kappa S(1 - \eta) &\geq 0 \tag{7.1} \\
V(S, t) &\geq \max(B_p(S, t), \kappa S) \tag{7.2} \\
\frac{\partial V}{\partial t} + \frac{\sigma^2}{2} S^2 \frac{\partial^2 V}{\partial S^2} + (r + p\eta) S \frac{\partial V}{\partial S} - (r + p)V + p\kappa S(1 - \eta) &\leq 0 \tag{7.3} \\
V(S, t) &\leq \max(B_c(S, t), \kappa S), \tag{7.4}
\end{align}
where either (4.4)-(4.5) or (4.6)-(4.7) holds, and one of the inequalities holds with equality at each point in the solution domain.
Note that we leave out the indices for the time t of the stochastic processes S and V in this
section to make the equations more readable.
The main difference to the derivation of the Black-Scholes Equation in Section 1.3 lies in the
possibility of default of the company. In this case, one has to model what happens to the holder
of the convertible.
Consider an instantaneous probability p(S, t) of default in the time interval [t, t + dt] conditional on no default in [t0 , t]. In general, we assume that the asset drops to (1 − η)S upon default
and the present value of holding the convertible until liquidation of the company is F giving the
convertible a value of
max(κ(1 − η)S, F ).
For simplicity, we assume that F is zero. In the following, we are considering a hedged portfolio
Π = V (S, t) − φ1 S − φ2 L
with defaultable bond L of the same issuer as V and zero recovery rate upon default. Furthermore, let dL = rL dt, F = 0 = R and p := p(S, t) hold.
Then, we need to determine the dynamics of portfolio Π due to changes of the underlyings. There are two main cases:
No default with probability 1 − p(S, t) dt. The arguments from Section 1.3 apply since the portfolio is hedged against small changes in the underlying asset S. That means
δΠ = dV − φ1 dS − φ2 dL,
which is consistent with Equation (1.16).
Default with probability p(S, t) dt. The change δΠ in the portfolio is given by the assets held in the portfolio and the value of the bond as well as the underlying after default:
δΠ = κS(1 − η) − V − φ1 (−ηS) − φ2 · 0.
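Unifying both cases and computing the dynamics of $\Pi$ from Itô's Lemma [67] we obtain
\begin{align}
d\Pi &= (1 - p\,dt) \cdot (dV - \phi_1\, dS - \phi_2\, dL) + p\,dt \cdot (\kappa S(1 - \eta) - V - \phi_1(-\eta S) - \phi_2 \cdot 0) \notag \\
&= (1 - p\,dt) \cdot \left( \left[ \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right] dt + \frac{\partial V}{\partial S}\, dS - \phi_1\, dS - \phi_2\, dL \right) \notag \\
&\quad + p\,dt \cdot (\kappa S(1 - \eta) - V + \phi_1 (\eta S)). \tag{7.5}
\end{align}
When choosing $\phi_1 = \frac{\partial V}{\partial S}$, Equation (7.5) becomes
\begin{align*}
d\Pi &= \left( \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt - (1 - p\,dt) \cdot \phi_2\, dL \\
&\quad + p\,dt \cdot \left( \kappa S(1 - \eta) - V + \frac{\partial V}{\partial S}(\eta S) - \left[ \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right] dt \right),
\end{align*}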
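which should be risk-free assuming that the probability of default $p$ is given in the risk-neutral world. Consequently, using $dL = rL\,dt$, we obtain
\begin{align*}
d\Pi = r\Pi\,dt &= r\left( V - \frac{\partial V}{\partial S} S - \phi_2 L \right) dt \\
\Leftrightarrow\quad 0 &= \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + (r + p\eta) S \frac{\partial V}{\partial S} - (r + p)V \\
&\quad + p\kappa S(1 - \eta) - p \cdot \left( \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) - rL\phi_2.
\end{align*}
If we now choose $\phi_2 = \frac{1}{rL} \left( \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} \right)$ we finally obtain the governing PDE of the convertible, which is
\begin{equation}
\frac{\partial V}{\partial t} + \frac{\sigma^2}{2} S^2 \frac{\partial^2 V}{\partial S^2} + (r + p\eta) S \frac{\partial V}{\partial S} - (r + p)V + p\kappa S(1 - \eta) = 0. \tag{7.6}
\end{equation}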
Since the holder can convert into shares worth $\kappa S$ or return the convertible to the issuer for the put price $B_p$, and the issuer can call the convertible for $B_c$, the following boundary constraints have to hold during the lifetime of the security:
\begin{align*}
V(S, t) &\geq \kappa S, \\
V(S, t) &\geq B_p, \\
V(S, t) &\leq B_c.
\end{align*}
Imposing these constraints on the value of the convertible, described by Equation (7.6), leads to the Equation set (4.4)-(4.7).
7.5 Feature Extraction in Octave/MATLAB

function start(n)
% start(n) is a test function for the Asian option pricer with
% Feature Extraction value(t_T, T, S0, K, r, sigma). n is the
% number of valuations performed.
for i = 1:n
  [V1(i) V2(i) V3(i) V4(i) V5(i)] = value(0.5, 125, 100, 100, 0.05, 0.25);
end
disp(sprintf('V1: mean=%g std=%g', mean(V1), std(V1)/sqrt(n)));
disp(sprintf('V2: mean=%g std=%g', mean(V2), std(V2)/sqrt(n)));
disp(sprintf('V3: mean=%g std=%g', mean(V3), std(V3)/sqrt(n)));
disp(sprintf('V4: mean=%g std=%g', mean(V4), std(V4)/sqrt(n)));
disp(sprintf('V5: mean=%g std=%g', mean(V5), std(V5)/sqrt(n)));
% V1..V5 are row vectors; transpose them so that each estimator
% forms one column of the n-by-5 sample matrix
disp('The correlation is:')
corrcoef([V1' V2' V3' V4' V5'])
disp('The covariance is:')
cov([V1' V2' V3' V4' V5'])
end

function B = Bspline(S, S_min, S_max)
% return basis B of cubic spline according to Section 1.2.2
m = 4;
x = linspace(S_min, S_max, m+2);
x = x(2:end-1);
B = [ones(size(S)) S S.^2 S.^3 max(repmat(S,1,m) - repmat(x,size(S,1),1), 0).^3];
end

function f = B_approx(x, a, S_min, S_max)
% return value of spline at x
% 'a' are the basis function coefficients
% 'S_min' and 'S_max' are the boundary values of the spline
f = Bspline(x', S_min, S_max) * a;
end
function [V0_nn V0_fn V0_np V0_fp V0_sim] = value(t_T, T, S0, K, r, sigma)
% [V0_nn V0_fn V0_np V0_fp V0_sim] = value(t_T, T, S0, K, r, sigma)
% Computes the value of an Asian option with maturity time t_T, T observations,
% initial stock price S0, strike K, risk-free rate r and volatility sigma.

% simulate stock S and payoff P
S = S0*ones(1000,1);
dt = t_T/T;
I = 0;
for i = 1:T
  S = S.*exp((r-0.5*sigma^2)*dt + sqrt(dt)*sigma*randn(size(S)));
  if (i<T)
    I = I + S/(T-1);
  end
end
P = max(I-K, 0);

% define probability density function p(x), est stands for estimated;
% nest for not estimated
p_nest = @(x) 1./(x*sigma*sqrt(2*pi*t_T)).* ...
    exp(-(log(S0./x) + (r-0.5*sigma^2)*t_T).^2/(2*sigma^2*t_T));
p_est = @(x) (ksdensity((S),(x)));

% perform regression to compute f_tilde := E(P|S)
a_est = Bspline(S, min(S), max(S)) \ (P);

% for the case with no estimation use precalculated highly-accurate values
S_nest = 1.0e+002 * [0.35049011165481375 2.476959609710143];
a_nest = 1.0e+002 * [-2.271270933408031; 0.095967077029140; -0.001335089683737
    0.000006124458780; -0.000006609485300; -0.000000109620663
    0.000002091568479; -0.000007210515689];
f_est = @(x) B_approx(x, a_est, min(S), max(S));
f_nest = @(x) B_approx(x, a_nest, min(S_nest), max(S_nest));

% V_t0 computed as integral
V0_nn = exp(-r*t_T) * quad(@(x) f_nest(x).*p_nest(x)', 0, 10*S0, 1E-5);
V0_fn = exp(-r*t_T) * quad(@(x) f_est(x).*p_nest(x)', 0, 10*S0, 1E-5);
V0_np = exp(-r*t_T) * quad(@(x) f_nest(x).*p_est(x)', 0, 10*S0, 1E-5);
V0_fp = exp(-r*t_T) * quad(@(x) f_est(x).*p_est(x)', 0, 10*S0, 1E-5);

% V_t0 computed using the simulations
V0_sim = exp(-r*t_T) * mean(P);
end
7.6 Simulation-Based Hedging in Octave/MATLAB

function [V, delta] = value(S0, K, r, mu, sigma, T, timesteps, number_paths)
% Computes the value V and the optimal hedge delta
% of an American put option.
%
% [V delta] = value(S0, K, r, mu, sigma, T, timesteps, number_paths)
% Example: [V delta] = value(100, 100, 0.05, 0.05, 0.4, 1, 10, 10000)

% Simulate the asset values 1:number_paths/2 starting in [0.5*S0, 1.5*S0]
% the other half (number_paths/2+1:number_paths) of the assets start at S0
dt = T/timesteps;
S = zeros(number_paths, timesteps+1);
S(:,1) = [linspace(0.5*S0, 1.5*S0, number_paths/2) S0*ones(1, number_paths/2)]';
for i = 2:timesteps+1
  S(:,i) = S(:,i-1).*exp((mu-0.5*sigma*sigma)*dt + sigma*sqrt(dt).* ...
      randn(number_paths,1));
end

% Determine values at maturity time T
V = max(K-S(:,end), 0);
B = zeros(number_paths, timesteps+1);
delta = zeros(number_paths, timesteps+1);
B(:,end) = -V;
delta_tmp = zeros(number_paths,1);

% Determine the other values using a dynamic program
for i = timesteps:-1:1
  % Determine the expected portfolio value Pi
  Pi_j = -B(:,i+1) - delta(:,i+1).*S(:,i+1);
  E_Pi = regress(S(:,i+1), Pi_j); % E[Pi(t_{i+1}) | S(t_{i+1})]
  % Compare with Exercise Value
  Payoff = K - S(:,i+1);
  ex_index = (E_Pi < Payoff) & (Payoff > 0);
  % Adjust bank accounts of exercised paths
  if sum(ex_index) > 0
    B(ex_index,i+1) = -(Payoff(ex_index)) - delta(ex_index,i+1).*S(ex_index,i+1);
  end
  % Compute cov/var, then update delta and bank account B
  % E[S(t_{i+1}) | S(t_i)]
  E_S = exp(mu*dt)*S(:,i);
  E_Pi = regress(S(:,i), Pi_j); % E[Pi(t_{i+1}) | S(t_i)]
  delta(:,i) = -regress_cov(S(:,i), S(:,i+1)-E_S, Pi_j-E_Pi);
  B(:,i) = exp(-r*dt)*(B(:,i+1) + (delta(:,i+1)-delta(:,i)).*S(:,i+1));
end
% adjust bank account for initial delta position
B(:,1) = B(:,1) + delta(:,1).*S(:,1);
% return the option parameters for the paths starting at S0
V = -mean(B(number_paths/2:end,1));
delta = -mean(delta(number_paths/2:end,1));
% END of value function

function [E_y] = regress(x_hat, y);
% Perform a regression on a polynomial basis
% to compute conditional expectation E[y|x]

% scale x domain onto [0,1]
x = (x_hat-min(x_hat))/(max(x_hat)-min(x_hat));
% compute polynomial basis up to degree n
n = 6;
B = [];
for i = 0:n
  B = [B x.^i];
end
n = round(length(x)/2);
% perform the regression (fitted on the first half of the paths only)
E_y = B*(B(1:n,:)\y(1:n));
% END of regress

function [Cov] = regress_cov(x_hat, co, y);
% Perform a regression on a polynomial basis for cov/var parameter,
% which is the conditional regression coefficient cov(co,y|x_hat)/var(co|x_hat)

% scale x domain onto [0,1]
x = (x_hat-min(x_hat))/(max(x_hat)-min(x_hat));
% compute polynomial basis up to degree n
n = 6;
B = [];
B_plain = [];
for i = 0:n
  B = [B co.*x.^i];
  B_plain = [B_plain x.^i];
end
n = round(length(x)/2);
Cov = B_plain*(B\y);
% END of regress_cov
Bibliography
[1] F. S. Acton. Numerical Methods that Work, page 106. The Mathematical Association of America, August 1997.
[2] H. Ahn and P. Wilmott. On trading American options. Technical report, Mathematical
Finance Group at the University of Oxford (OCIAM), 1997.
[3] D. E. Allen, G. MacDonald, K. D. Walsh, and D. M. Walsh. Using regression techniques
to estimate futures hedge ratios, some results from alternative approaches applied to Australian 10 year treasury bond futures. Edith Cowan Finance & Business Economics Working
Paper, September 2001.
[4] Z. A. Altintig and A. Butler. Are they still late? The effect of notice period on calls of
convertible bonds. Journal of Corporate Finance, 11:337–350, 2002.
[5] M. Ammann, A. Kind, and C. Wilde. Are convertible bonds underpriced? An analysis of
the French market. Journal of Banking and Finance, 27:635–653, 2003.
[6] M. Ammann, A. Kind, and C. Wilde. Simulation-based pricing of convertible bonds. Working paper, University of St. Gallen, Switzerland, 2005.
[7] L. Andersen and D. Buffum. Calibration and implementation of convertible bond models.
Journal of Computational Finance, 7(2):1–34, 2003/04.
[8] E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. LAPACK Users' Guide, chapter Linear Least Squares (LLS) Problems. SIAM, Philadelphia, 3rd edition, 1999.
[9] A. D. Andricopoulos, M. Widdicks, P. W. Duck, and D. P. Newton. Universal option valuation using quadrature methods. Journal of Financial Economics, 67:447–471, 2003.
[10] P. Asquith. Convertible bonds are not called late. The Journal of Finance, 50(4):1275–1289,
September 1995.
[11] E. Ayache, P. A. Forsyth, and K. R. Vetzal. Next generation models for convertible bonds
with credit risk. Wilmott Magazine, pages 68–77, December 2002.
[12] E. Ayache, P. A. Forsyth, and K. R. Vetzal. The valuation of convertible bonds with credit
risk. Journal of Derivatives, 11:9–29, Fall 2003.
[13] V. Barthelmann, E. Novak, and K. Ritter. High dimensional polynomial interpolation on
sparse grids. Advances in Computational Mathematics, 12:273–288, 2000.
[14] C. Bender, A. Kolodko, and J. Schoenmakers. Iterating snowballs and related path dependent callables in a multi-factor libor model. WIAS Preprint, ISSN 0946-8633, 2005.
[15] S. J. Berridge and J. H. Schumacher. Pricing high-dimensional American options using local
consistency conditions. Technical Report 2004-19, CentER Discussion Paper, 2004.
[16] R. Bilger. Valuing American-Asian options using the Longstaff-Schwartz algorithm. MSc
thesis in computational finance, Oxford University, 2003.
[17] F. Black and M. Scholes. The pricing of options and corporate liabilities. Journal of Political
Economy, 81:637–659, 1973.
[18] T. Bollerslev. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31:307–327, 1986.
[19] T. Bonk. A new algorithm for multi-dimensional adaptive numerical quadrature. In W. Hackbusch, editor, Proceedings of the 9th GAMM-Seminar, Kiel 1993, Braunschweig, January 1994.
Vieweg.
[20] A. W. Bowman and A. Azzalini. Applied Smoothing Techniques for Data Analysis: The Kernel
Approach with S-Plus Illustrations. Oxford Statistical Science Series, 1997.
[21] P. Boyle. Options: A Monte Carlo approach. Journal of Financial Economics, 4(3):323–338,
1977.
[22] P. Boyle. A lattice framework for option pricing with two state variables. Journal of Financial
and Quantitative Analysis, 23(1):1–12, March 1988.
[23] P. Boyle, M. Broadie, and P. Glasserman. Monte Carlo methods for security pricing. Journal
of Economic Dynamics and Control, 21(8/9):1276–1321, 1997.
[24] M. J. Brennan and E. S. Schwartz. Convertible bonds: Valuation and optimal strategies for
call and conversion. Journal of Finance, 32:1699–1715, 1977.
[25] M. J. Brennan and E. S. Schwartz. Analysing convertible bonds. Journal of Financial and
Quantitative Analysis, 15:907–929, 1980.
[26] M. J. Brennan and E. S. Schwartz. The case for convertibles. Chase Financial Quarterly,
1(3):27–46, 1982.
[27] M. J. Buchan. Convertible bond pricing: Theory and evidence. PhD thesis, Harvard University,
1997.
[28] H. J. Bungartz. Higher order finite elements on sparse grids. Electronic Transactions on
Numerical Analysis, 6:63–77, December 1997.
[29] H.-J. Bungartz and M. Griebel. Sparse grids. Acta Numerica, pages 147–269, 2004.
[30] A. W. Butler. Revisiting optimal call policy for convertible bonds. Financial Analysts Journal,
58(1):50–55, 2002.
[31] H. Cardot. Conditional functional principal components analysis. Scandinavian Journal of
Statistics, (OnlineEarly Articles):–, 2006.
[32] J. F. Carrière. Valuation of the early-exercise price for options using simulations and nonparametric regression. Insurance: Mathematics and Economics, 19:19–30, 1996.
[33] A. Cerny. Dynamic programming and mean-variance hedging in discrete time. Technical
report, Cass Business School Research Paper, October 2003.
[34] A. Cerny. Mathematical Techniques in Finance: Tools for Incomplete Markets. Princeton University Press, January 2004.
[35] A. Cerny and J. Kallsen. Hedging by sequential regression revisited. Working paper, City
University London and TU München, 2007.
[36] J. Cox, S. Ross, and M. Rubinstein. Option pricing: A simplified approach. Journal of
Financial Economics, 7:229–263, 1979.
[37] M. G. Cox. The numerical evaluation of b-splines. IMA Journal of Applied Mathematics,
10(2):134–149, 1972.
[38] C. de Boor. A practical guide to splines. Springer, 1985.
[39] E. Derman and I. Kani. The volatility smile and its implied tree. Quantitative strategies
research notes, Goldman Sachs, January 1994.
[40] S. Dirnstorfer, A. J. Grau, and R. Zagst. Moving window Asian options: Sparse grids and
Least-Squares Monte Carlo. submitted, December 2005.
[41] B. Dupire. Pricing with a smile. Risk magazine, 7(1):18–20, 1994.
[42] L. Ederington. The hedging performance of the new futures markets. Journal of Finance,
34(1):157ff, Mar 1979.
[43] L. Fahrmeir, R. Künstler, and I. Pigeot. Statistik. Springer, 2004.
[44] J. Fan and Q. Yao. Efficient estimation of conditional variance functions in stochastic regression. Technical report, Department of Statistics, UCLA, 1998.
[45] W. Feller. An Introduction to Probability Theory and Its Applications, Vol. 2, 3rd ed., chapter The
Berry-Esseen Theorem, pages 542–546. Wiley, 1971.
[46] H. Föllmer and P. Leukert. Quantile hedging. Finance and Stochastics, 3:251–273, 1999.
[47] H. Föllmer and M. Schweizer. Hedging by sequential regression: an introduction to the
mathematics of option trading. ASTIN Bulletin, 18:147–160, 1989.
[48] H. Föllmer and D. Sondermann. Contributions to Mathematical Economics, volume 34, chapter
Hedging of Non-Redundant Contingent Claims, pages 205–223. North-Holland, 1986.
[49] P. A. Forsyth. Lecture notes for seminar on computational finance. Faculty of Mathematics,
University of Waterloo, May 2001.
[50] P. A. Forsyth and K. R. Vetzal. Quadratic convergence of a penalty method for valuing
American options. SIAM Journal on Scientific Computing, 23:2096–2123, 2002.
[51] M. Frittelli. The minimal entropy martingale measure and the valuation problem in incomplete markets. Mathematical Finance, 10(1):39–52, 2000.
[52] J. Garcke, M. Griebel, and M. Thess. Data mining with sparse grids. Computing, 67(3):225–
253, October 2001.
[53] P. Glasserman. Monte Carlo Methods in Financial Engineering. Springer, Berlin, 2003.
[54] T. Goll and J. Kallsen. Optimal portfolios for logarithmic utility. Stochastic Processes and their
Applications, 89(1):31–48, 2000.
[55] A. J. Grau. Soft call constraints in convertible bonds and a pricing framework for path
dependent options. Master’s thesis, University of Waterloo, Canada, 2003.
[56] A. J. Grau, P. A. Forsyth, and K. R. Vetzal. Convertible bonds with call notice periods.
In Proceedings to IASTED International Conference on Financial Engineering and Applications,
Banff, Canada, July 2003.
[57] A. J. Grau, P. A. Forsyth, and K. R. Vetzal. PDE and Monte Carlo methods for pricing convertible bonds with soft call constraints. Technical report, University of Waterloo, Canada,
2006. http://www.cs.uwaterloo.ca/~paforsyt/.
[58] D. Greiner, A. Kalay, and H. K. Kato. The market for callable-convertible bonds: Evidence
from Japan. Pacific-Basin Finance Journal, 10:1–27, 2002.
[59] K. Hallatschek. Fouriertransformation auf dünnen Gittern mit hierarchischen Basen. Numer. Math., 63:83–97, 1992.
[60] W. Härdle. Applied Nonparametric Regression. Cambridge University Press, 1992.
[61] S. L. Heston. A closed-form solution for options with stochastic volatility with applications
to bond and currency options. The Review of Financial Studies, 6(2):327–343, 1993.
[62] N. J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, 2nd edition, 2002.
[63] J. Hull and A. White. The pricing of options on assets with stochastic volatilities. Journal of
Finance, 42(2):281–300, June 1987.
[64] J. C. Hull. Options, Futures and Other Derivatives, chapter 21, pages 520–524. Prentice Hall,
New Jersey, 6th edition, 2006.
[65] J. Ingersoll. A contingent-claims valuation of convertible securities. Journal of Financial
Economics, 4:289–322, 1977.
[66] J. Ingersoll. An examination of corporate call policies on convertible securities. Journal of
Finance, 32:463–478, 1977.
[67] K. Ito. On stochastic differential equations. Memoirs, American Mathematical Society, 4:1–51,
1951.
[68] J. Kallsen. A utility maximization approach to hedging in incomplete markets. Mathematical
Methods of Operations Research, 50:321–338, 1999.
[69] J. Kallsen and C. Kühn. Exotic Option Pricing and Advanced Lévy Models, chapter Convertible
bonds: financial derivatives of game type. Wiley, Chichester, 2005.
[70] C.-H. Kao and Y.-D. Lyuu. Pricing of moving average-type options with applications. Journal of Futures Markets, 23(5):415–440, 2003.
[71] A. Kemna and A. Vorst. A pricing method for options based on average asset values. Journal
of Banking and Finance, 14:113–129, 1990.
[72] C. Koziol. Valuation of Convertible Bonds when Investors Act Strategically, volume 110 of
Beiträge zur betriebswirtschaftlichen Forschung. Dt. Univ.-Verl., Wiesbaden, Germany, 2004.
[73] C. Koziol. Optimal exercise strategies for corporate warrants. Quantitative Finance, 6(1):37–
54, February 2006.
[74] C. Koziol and W. Bühler. Valuation of convertible bonds with sequential conversion.
Schmalenbach Business Review (sbr), 54:302–334, 2002.
[75] C. Kühn. Shocks and Choices - an Analysis of Incomplete Market Models. PhD thesis, TU München, Germany, Fakultät für Mathematik, November 2002.
[76] D. Lamberton, H. Pham, and M. Schweizer. Local risk-minimization under transaction
costs. Mathematics of Operations Research, 23:585–612, 1998.
[77] K. W. Lau and Y. K. Kwok. Pricing algorithms for options with exotic path dependence.
Journal of Derivatives, pages 28–38, Fall 2001.
[78] K. W. Lau and Y. K. Kwok. Optimal calling policies in convertible bonds. Proceedings of
International Conference on Computational Intelligence for Financial Engineering, March 2003.
[79] H. E. Leland. Option pricing and replication with transactions costs. Journal of Finance,
40(5):1283–1301, 1985.
[80] J. Li, Y. C. Hon, and C. S. Chen. Numerical comparisons of two meshless methods using
radial basis functions. Engineering Analysis with Boundary Elements, 26(3):205–225, March
2002.
[81] F. A. Longstaff and E. S. Schwartz. Valuing American options by simulation - a simple
least-squares approach. The Review of Financial Studies, 14(1):113–147, 2001.
[82] D. G. Luenberger. Pricing a nontradable asset and its derivatives. Journal of Optimization
Theory and Applications, 121(3):465–487, June 2004.
[83] S. L. Lummer and M. W. Riepe. Convertible bonds as an asset class: 1957-1992. Journal of
Fixed Income, 3(2):47–57, September 1993.
[84] D. Lvov, A. Yigitbasioglu, and N. E. Bachir. Pricing convertible bonds by simulation. Working Paper, ISMA Center, the University of Reading, 2004.
[85] W. Mackens and H. Voss. Mathematik I. HECO-Verlag, Alsdorf, 1993.
[86] C. D. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. The
MIT Press, 1999.
[87] R. Merton. Theory of rational option pricing. Bell Journal of Economics and Management
Science, 4(1):141–183, 1973.
[88] R. C. Merton. Option pricing when the underlying stock returns are discontinuous. Journal
of Financial Economics, 3:125–144, 1976.
[89] Y. Muromachi. The growing recognition of credit risk in corporate and financial bond markets. Technical Report Paper # 126, Financial Research Group, NLI Research Institute, 1-1-1
Yurakucho, Chiyoda-ku, Tokyo 100-0006, Japan, 1999.
[90] M. Musiela and M. Rutkowski. Martingale Methods in Financial Modelling. Springer, 1997.
[91] C. R. Niethammer. On convergence to the exponential utility problem with jumps. Stochastic
Analysis & Applications, forthcoming.
[92] H. H. Panjer, P. P. Boyle, S. H. Cox, H. U. Gerber, H. H. Mueller, H. W. Pedersen, S. R.
Pliska, M. Sherris, E. S. Shiu, and K. S. Tan. Financial Economics: With Applications to
Investments, Insurance and Pensions. The Actuarial Foundation, 1998.
[93] U. Pettersson, E. Larsson, G. Marcusson, and J. Persson. Improved radial basis function
methods for multi-dimensional option pricing. Technical report 2006-028, Uppsala University, 2006.
[94] S. R. Pliska. Introduction to Mathematical Finance: Discrete Time Models. Wiley, July 1997.
[95] B. Pochart and J.-P. Bouchaud. Option pricing and hedging with minimum local expected
shortfall. Quantitative Finance, 4:607–618, 2004.
[96] M. Potters, J.-P. Bouchaud, and D. Sestovic. Hedged Monte Carlo: Low variance derivative
pricing with objective probabilities. Physica A, 289:517–525, 2001.
[97] M. J. D. Powell. The uniform convergence of thin plate spline interpolation in two dimensions. Numerische Mathematik, 68(1):107–128, 1994.
[98] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. Numerical Recipes in FORTRAN: The Art of Scientific Computing, chapter Richardson Extrapolation and the Bulirsch-Stoer Method, pages 718–725. Cambridge University Press, Cambridge, England, 2nd edition, 1992.
[99] R. Rannacher. Finite element solution of diffusion problems with irregular data. Numerische
Mathematik, 43:309–327, 1984.
[100] C. Reisinger. Numerische Methoden für hochdimensionale parabolische Gleichungen am Beispiel
von Optionspreisaufgaben. Dissertation, Ruprecht-Karls-Universität, Heidelberg, June 2004.
[101] V. Ryabchenko, S. Sarykalin, and S. Uryasev. Pricing European options by numerical replication: Quadratic programming with constraints. Asia-Pacific Financial Markets, 11(3):301–333,
2004.
[102] H. Scheid, editor. Schüler Duden: Die Mathematik, chapter Regression, page 340ff. Dudenverlag, 3rd edition, 1991.
[103] E. S. Schwartz. Valuation of warrants implementing a new approach. Journal of Financial
Economics, 4:79–93, 1977.
[104] M. Schweizer. Variance-optimal hedging in discrete time. Mathematics of Operations Research,
20:1–32, 1995.
[105] F. Selmi and J. Bouchaud. Alternative large risks hedging strategies for options. Wilmott
Magazine, March:64–67, 2005.
[106] R. Shao and B. Roe. The design and pricing of fixed- and moving window contracts: An
application of Asian-basket option pricing methods to the hog-finishing sector. The Journal
of Futures Markets, 23(11):1047–1073, 2003.
[107] S. A. Smolyak. Quadrature and interpolation formulas for tensor products of certain classes
of functions. Dokl. Akad. Nauk SSSR, 148:1042–1043, 1963.
[108] L. Stentoft. Convergence of the least squares Monte-Carlo approach to American option
valuation. School of Economics & Management, University of Aarhus, September 2003.
[109] C. J. Stone. Optimal rates of convergence for nonparametric regression. Annals of Statistics,
10(4):1040–1053, 1982.
[110] M. H. Stone. The generalized Weierstrass approximation theorem. Mathematics Magazine,
21(4):167–184, 1948.
[111] W. Törnig and P. Spellucci. Numerische Mathematik für Ingenieure und Physiker. Springer-Verlag, 2nd edition, 1990.
[112] K. Tsiveriotis and C. Fernandes. Valuing convertible bonds with credit risk. Journal of Fixed
Income, 8(2):95–102, September 1998.
[113] E. van de Heiligenberg, W. Klijnstra, and M. Lundin. The optimal portfolio choice in today’s
markets. Technical report, Fortis Investment Management, 2002.
[114] H. Voss. Grundlagen der numerischen Mathematik. http://www.tu-harburg.de/mat/LEHRE/material/grnummath.pdf, 2004.
[115] G. Wahba. Spline Models for Observational Data. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, 1990.
[116] H. Werner and R. Schaback. Praktische Mathematik II, pages 150–151. Springer Verlag,
1972.
[117] P. Wilmott. Paul Wilmott on Quantitative Finance. John Wiley & Sons Ltd., West Sussex,
England, 2000.
[118] S. N. Wood. Thin plate regression splines. Journal of the Royal Statistical Society: Series B
(Statistical Methodology), 65(1):95, February 2003.
[119] A. Yigitbasioglu and C. Alexander. Pricing and hedging convertible bonds: delayed calls
and uncertain volatility. International Journal of Theoretical & Applied Finance, 9(3):415–437,
2004.
[120] R. Zagst. Interest Rate Management. Springer Verlag, Heidelberg, 2002.
[121] C. Zenger. Sparse grids. In W. Hackbusch, editor, Notes on Numerical Fluid Mechanics, volume 31, Braunschweig, 1991, 1990. Vieweg.
[122] R. Zvan, P. A. Forsyth, and K. R. Vetzal. Robust numerical methods for PDE models of
Asian options. Journal of Computational Finance, 1:39–78, 1998.
[123] R. Zvan, P. A. Forsyth, and K. R. Vetzal. Discrete Asian barrier options. Journal of Computational Finance, 3:41–68, 1999.
[124] R. Zvan, P. A. Forsyth, and K. R. Vetzal. A finite volume approach for contingent claims
valuation. IMA Journal of Numerical Analysis, 21:703–731, 2001.