A Monte Carlo method for estimating sensitivities of reflected diffusions in convex polyhedral domains

In this work we develop an effective Monte Carlo method for estimating sensitivities, or gradients of expectations of sufficiently smooth functionals, of a reflected diffusion in a convex polyhedral domain with respect to its defining parameters --- namely, its initial condition, drift and diffusion coefficients, and directions of reflection. Our method, which falls into the class of infinitesimal perturbation analysis (IPA) methods, uses a probabilistic representation for such sensitivities as the expectation of a functional of the reflected diffusion and its associated derivative process. The latter process is the unique solution to a constrained linear stochastic differential equation with jumps whose coefficients, domain and directions of reflection are modulated by the reflected diffusion. We propose an asymptotically unbiased estimator for such sensitivities using an Euler approximation of the reflected diffusion and its associated derivative process. Proving that the Euler approximation converges is challenging because the derivative process jumps whenever the reflected diffusion hits the boundary (of the domain). A key step in the proof is establishing a continuity property of the related derivative map, which is of independent interest. We compare the performance of our IPA estimator to a standard likelihood ratio estimator (whenever the latter is applicable), and provide numerical evidence that the variance of the former is substantially smaller than that of the latter. We illustrate our method with an example of a rank-based interacting diffusion model of equity markets. Interestingly, we show that estimating certain sensitivities of the rank-based interacting diffusion model using our method for a reflected Brownian motion description of the model outperforms a finite difference method for a stochastic differential equation description of the model.

For instance, they arise in the study of rank-based interacting diffusion models in mathematical finance [2,12] and as diffusion approximations in queueing theory [6,18,19,23,24,25]. For use in uncertainty qualification, stochastic optimization and other areas (see [1,Chapter VII] for a list of the numerous applications), it is of interest to estimate the gradient of the expectation of a functional of a reflected diffusion with respect to its defining parameters -namely, its initial condition, drift and diffusion coefficients, and directions of reflection along the boundary of its domain. (Henceforth, we simply write "sensitivities" to mean "gradients of expectations of functionals".) The main contribution of this work is to develop a broadly applicable asymptotically unbiased estimator of a large class of sensitivities of reflected diffusions, which can be used to approximate sensitivities of reflected diffusions via a Monte Carlo method. Our estimator is based on a probabilistic representation for sensitivities of reflected diffusions that was obtained in [15,Corollary 3.15]. This representation expresses the sensitivity as the expectation of a functional of a reflected diffusion and its associated derivative process. The derivative process satisfies a constrained linear stochastic differential equations with jumps whose coefficients, domain and directions of reflection are modulated by the reflected diffusion, and it was shown in [15,Theorem 3.13] that the pathwise derivatives of a reflected diffusion can be described via the derivative process. While this representation provides an unbiased estimator for sensitivities of a reflected diffusion, computation of sensitivities via this representation would entail simulation of the reflected diffusion and its associated derivative process, which typically involves discrete-time approximations. There is a large literature devoted to approximating reflected diffusions in convex polyhedral domains (see, e.g., [4,5,10,17,20,21,27,28]); however, there is no method for approximating the derivative process. In this work we propose an Euler scheme for the approximation of a reflected diffusion and its derivative process, and prove that the associated estimators of sensitivities of the reflected diffusion are asymptotically unbiased, as the discretization parameter goes to zero. The proof that the Euler scheme for the reflected diffusion is asymptotically unbiased follows an argument analogous to the one used in the proof of [28,Theorem 3.2], which established the result in the case of normal reflection. The proof that the Euler scheme for the derivative process is asymptotically unbiased is much more challenging because the derivative process jumps whenever the reflected diffusion hits the boundary of the polyhedral domain. Establishing convergence of the approximation is quite subtle and relies on a continuity property of a certain map, called the derivative map. This continuity property is of independent interest; for example, it is used in [16] to prove that a reflected Brownian motion (RBM) in a convex polyhedral cone and its derivatives process are jointly Feller continuous.
Our method, which relies on pathwise derivatives of a reflected diffusion, falls into the class of infinitesimal perturbation analysis (IPA) methods used in sensitivity analysis (see, e.g., [1,Chapter VII.2]). We compare the performance of our IPA method to a standard likelihood ratio (LR) method and provide numerical evidence that the variance of the former is substantially smaller than that of the latter. It is also worth mentioning here that the LR method only applies to perturbations of the drift and in many applications it is of interest to study perturbations with respect to all of the parameters (e.g., when estimating sensitivities of diffusion approximations of queueing networks, which we plan to investigate in future work). We also apply our method to study a particular rank-based interacting diffusion model called the Atlas model, originally proposed by Fernholz [8,Example 5.3.3] to model equity markets, and subsequently generalized by Banner, Fernholz and Karatzas [2] and Ichiba et. al. [12]. This model is described by a stochastic differential equation (SDE) with discontinuous drift coefficients, and its sensitivities can be estimated using a standard finite difference (FD) method (although this method remains biased as the time-discretization vanishes). On the other hand, the dynamics of this model can also be expressed in terms of an RBM in a convex polyhedral domain (see, e.g., [8] and Section 5.2 below). We estimate sensitivities by applying our method to this RBM representation, and provide numerical evidence to show that it performs much better than the FD method applied to the SDE description of the model.
In summary, the main contributions of this work are as follows: • An IPA method for estimating sensitivities of a reflected diffusion (Algorithm 1).
• Proof of convergence of Euler schemes for a reflected diffusion (Theorem 3.3) and its associated derivative process (Theorem 3.6). • Comparison of our method with the LR method, with numerical evidence that our method performs better, especially over long time horizons (Section 4). • Application of our method to estimate certain sensitivities of the Atlas model, and numerical evidence that our method applied to the RBM representation of the model performs better than the FD method applied to an SDE description of the model (Section 5.2). • A continuity property of the derivative map (Theorem 6.15).

1.2.
Outline. The remainder of this paper is organized as follows. In Section 2 we give precise definitions of a reflected diffusion and its associated derivative process, and we recall the probabilistic representation of sensitivities of reflected diffusions from [15]. In Section 3 we define an Euler approximation for a reflected diffusion and its derivative process, state our main convergence result and describe a numerical algorithm for estimating sensitivities. In Section 4 we compare our algorithm with an LR algorithm for gradient estimation in cases when the latter applies. In Section 5 we present numerical results from applying our method to a one-dimensional RBM and to the Atlas model. In Section 6, we define and state properties of the Skorokhod problem (SP) and the derivative problem, which are used to characterize a reflected diffusion and its derivative process, respectively. These are used in the proofs of our main results, which are presented in Sections 7-9. = N ∪ {∞}. Given J ∈ N, we use R J + to denote the closed nonnegative orthant in J-dimensional Euclidean space R J . When J = 1, we suppress J and write R for (−∞, ∞) and R + for [0, ∞). For r, s ∈ R let r = max{k ∈ N 0 : k ≤ r}, r ∧ s = min(r, s) and r ∨ s = max(r, s). For a subset A ⊂ R, let inf A and sup A denote the infimum and supremum, respectively, of A, with the convention that the infimum and supremum of the empty set are respectively defined to be ∞ and −∞. For a column vector x ∈ R J , let x j denote the jth component of x, for j = 1, . . . , J, and let |x| denote the usual Euclidean norm of x. We let {e 1 , . . . , e J } denote the standard normal basis in R J , where e j is the column vector in R J whose jth component is one and whose other components are zero, for j = 1, . . . , J. We let S J−1 = {x ∈ R J : |x| = 1} denote the unit sphere in R J . For J, K ∈ N, let R J×K denote the set of real-valued matrices with J rows and K columns. We write M T ∈ R K×J for the transpose of a matrix M ∈ R J×K . Let I J denote the J × J identity matrix. Let · denote the operator norm on R J×K .
Given E ⊆ R J we let cone(E) denote the convex cone generated by E; that is, and let span(E) denote the set of all possible finite linear combinations of vectors in E, with the convention that cone(∅) and span(∅) are equal to {0}.
Throughout this paper we fix a filtered probability space (Ω, F, {F t }, P) satisfying the usual conditions; that is, (Ω, F, P) is a complete probability space, F 0 contains all P-null sets in F and the filtration {F t } is right-continuous. We write E to denote expectation under P. We abbreviate "almost surely" as "a.s." By a K-dimensional {F t }-Brownian motion W on (Ω, F, P), we mean that (W 1 , . . . , W K ) are independent and for each k = 1, . . . , K, {W k (t), F t , t ≥ 0} is a continuous martingale with W k (0) = 0 and quadratic variation [W k ] t = t for t ≥ 0. We let C p < ∞, for p ≥ 2, denote the universal constants in the Burkholder-Davis-Gundy (BDG) inequalities (see, e.g., [26, Chapter IV, Theorem 42.1]).

Background on reflected diffusions and their sensitivities
2.1. A parameterized family of reflected diffusions. Let G be a closed polyhedron in R J equal to the intersection of finitely many closed half spaces in R J ; that is, for a positive integer N ∈ N, unit vectors n i ∈ S J−1 and constants c i ∈ R, for i = 1, . . . , N . Let . = {i ∈ I : x ∈ F i } to denote the (possibly empty) set of indices associated with the faces that intersect at x.
Let M ∈ N and U be an open parameter set in R M . For each i ∈ I, fix a continuously differentiable function d i : U → R J that satisfies d i (α), n i = 1 for all α ∈ U . For α ∈ U , d i (α) will denote the (constant) direction of reflection along the face F i associated with the parameter α. Since the direction of reflection d i (α) can be renormalized, our assumption d i (α), n i = 1 for all α ∈ U is without loss of generality. We let R : U → R J×N denote the continuous differentiable function defined by R(α) . = d 1 (α) · · · d N (α) for α ∈ U , and let R (α) denote the Jacobian of R(·) at α ∈ U . We refer to R(α) as the reflection matrix associated with α. Fix continuously differentiable functions For α ∈ U , x 0 (α), b(α, ·) and σ(α, ·) will respectively be the initial condition, drift and dispersion coefficients for the reflected diffusion associated with α. We refer to a(α, ·) . = σ(α, ·)σ T (α, ·) as the diffusion coefficient associated with α ∈ U . For α ∈ U and x ∈ G, we let x 0 (α) denote the Jacobian of x 0 (·) at α ∈ U , b α (α, x) denote the Jacobian of b(·, x) at α ∈ U , b x (α, x) denote the Jacobian of b(α, ·) at x ∈ G, and similarly define σ α (α, x) and σ x (α, x). Definition 2.1. Given α ∈ U and a K-dimensional {F t }-Brownian motion W on (Ω, F, P), a reflected diffusion associated with α and driving Brownian motion W is a J-dimensional contin- adapted process such that a.s. L α (0) = 0 and for every i ∈ I, the ith component of L α , denoted L α,i , is nondecreasing and can only increase when Z α lies in face F i ; that is, We say that pathwise uniqueness holds if given α ∈ U and a K-dimensional {F t }-Brownian motion W on (Ω, F, P), any two reflected diffusions associated with α and driving Brownian motion W are indistinguishable.
Remark 2.2. In [15, Definition 2.1] the authors define a family of reflected diffusions in which the drift and dispersion coefficients and directions of reflection are parameterized by α ∈ U , but the initial condition is parameterized by x ∈ G. In [15], this allowed for a characterization of pathwise derivatives of flows of reflected diffusions and was a convenient representation in the proofs. In contrast, here we will find it more convenient to assume that the initial condition is a continuously differentiable function x 0 (·) on U taking values in G.
Remark 2.3. Given α ∈ U and a reflected diffusion Z α with L α satisfying the conditions in It follows from the definition of R(α) and the conditions on L α in Definition 2.1 that Y α satisfies, for all 0 ≤ s < t < ∞, In particular, we see that Z α satisfies [ Let ζ 1 , ζ 2 lie in C 1 b (G), the space of real-valued functions on G that are continuously differentiable with bounded first partial derivatives. Suppose that for each α ∈ U there exists a unique reflected diffusion Z α associated with α. Then for t ≥ 0, define Θ t (·) to be the mapping from U to R defined by In the next two sections we state our main assumptions, introduce the derivative process along Z α and characterize the Jacobian of Θ t (·) in terms of Z α and the derivative process along Z α .
2.2. Main assumptions. The assumptions stated in this section are assumed to hold, without restatement, throughout this work.
The first three assumptions on the data {(d i (·), n i , c i ), i ∈ I} ensure the associated SP is well defined and the associated Skorokhod map (SM) is Lipschitz continuous (see Proposition 6.5 below), which is useful for proving strong existence of reflected diffusions and establishing pathwise uniqueness.
Given a convex set B, let ν B (z) . = {ν ∈ S J−1 : ν, y − z ≥ 0 ∀ y ∈ B} denote the set of inward normal vectors to the set B at z ∈ ∂B.
Remark 2.7. Under Assumption 2.5, there can be at most one map π α : R J → G that satisfies the conditions stated in Assumption 2.6 (see, e.g., the paragraph before [ Along with the last three assumptions, the final two assumptions on the coefficients b(·, ·), σ(·, ·) and R(·) ensure existence and pathwise uniqueness of the reflected diffusion and the derivative process, as well as the characterization of sensitivities of reflected diffusions in terms of the derivative process (see Theorem 2.13 below). Assumption 2.9. There exists κ 1 < ∞ such that for all α ∈ U and x ∈ G, In addition, there exist κ 2 < ∞ and γ ∈ (0, 1] such that for all α, β ∈ U and x, y ∈ G, Assumption 2.10. For each α ∈ U there exists c(α) > 0 such that for all x ∈ G, y T a(α, x)y ≥ c(α)|y| 2 , y ∈ R J .

2.3.
Sensitivities of reflected diffusions in terms of the derivative process. We first define the derivative process, which was introduced in [15, Definition 3.5] to characterize pathwise derivatives of a reflected diffusion. Given x ∈ G, define Recall the definition of d(·, ·) in (2.4).
Definition 2.11. Let α ∈ U , W be a K-dimensional {F t }-Brownian motion on (Ω, F, P) and Z α be a reflected diffusion associated with α and driving Brownian motion W . A derivative process along Z α is an RCLL {F t }-adapted process J α = {J α (t), t ≥ 0} taking values in R J×M such that a.s. for all t ≥ 0, J α m (t) ∈ H Z α (t) for m = 1, . . . , M and J α satisfies is an RCLL {F t }-adapted process taking values in R J×M such that a.s. K α (0) = 0 and for each m = 1, . . . , M and all 0 ≤ s < t < ∞, . We say that pathwise uniqueness holds if for each α ∈ U , K-dimensional {F t }-Brownian motion W and reflected diffusion Z α associated with α ∈ U with driving Brownian motion W , any two derivative processes along Z α are indistinguishable.
The equation 2.9 for the derivative process can be viewed as a linearized version of the equation (2.2) for the reflected diffusion Z α , with the key feature, established in [15], that R (α)L α + K α serves as the appropriate linearization of the constraining process R(α)L α .
Remark 2.12. Definition 2.11 for the derivative process is slightly different from the definition given in [15,Definition 3.5] due to the fact that the initial condition here is a function of α ∈ U . To clarify the relation, suppose J α satisfies Definition 2.11 and J α,x satisfies [15,Definition 3.5]. Then, for m = 1, . . . , M , the mth column vector of J α satisfies J α m (t) = J α,x0(α) t [e m , x 0 (α)e m ] for all t ≥ 0 (where the right-hand side of the equality is written in the notation of [15]).
In the following theorem we state the probabilistic representation of sensitivities of reflected diffusions that was obtained in [15] and serves as the starting point for the method for sensitivity estimation that we introduce in this work. Recall that the assumptions stated in Section 2.2 hold.
Theorem 2.13. For each α ∈ U and K-dimensional {F t }-Brownian motion W on (Ω, F, P), the following hold: (i) There exists a pathwise unique reflected diffusion Z α associated with α and driving Brownian motion W , and it is a strong Markov process. (ii) There exists a pathwise unique derivative process J α along Z α . (iii) Given t ≥ 0, a.s. J α (·) is continuous at t, and a.s. J α (·) is continuous at almost every s ∈ [0, t). (iv) Given t ≥ 0, the function Θ t (·) defined in (2.5) is differentiable at α and its Jacobian satisfies While (2.11) provides an unbiased estimator for Θ (α), exact sampling of functionals of (Z α , J α ) is complicated by the discontinuous dynamics when Z α reaches the boundary ∂G. As is often the case when simulating diffusion processes, we sample from a discrete-time Euler approximation of (Z α , J α ). Our main result (see Corollary 3.7 below) states that the Euler approximation can be used to construct an asymptotically unbiased estimator for Θ t (α).

Main results
Recall that the assumptions stated in Section 2.2 hold. Fix α ∈ U and a K-dimensional {F t }-Brownian motion W on (Ω, F, P). Let Z α denote the pathwise unique reflected diffusion associated with α and driving Brownian motion W , and let J α denote the pathwise unique derivative process along Z α . Throughout this section, given a time step ∆ > 0, define the and let {δ ∆ n W } n∈N be the sequence of i.i.d. K-dimensional Gaussian random variables with mean zero and diagonal covariance matrix ∆I K given by In addition, let {F ∆ t } denote the discrete filtration defined by 3.1. Euler scheme for the reflected diffusion. In this section we present an Euler scheme for approximating just the reflected diffusion Z α . This is an extension of a result obtained in [28] to allow for reflected diffusions with oblique reflection. We also prove a convergence result for an Euler approximation of the process L α that will be needed in the next section. Let π α denote the unique mapping satisfying the conditions in Assumption 2.6. In order to define the Euler scheme, we need the following lemma.
Lemma 3.1. There exists a unique map ξ α : Remark 3.2. In the case of a simple polyhedral cone with vertex at the origin (i.e., N = J and c i = 0 for i ∈ I), Assumption 2.9 implies R(α) is invertible for each α ∈ U and so ξ α (x) = (R(α)) −1 (π α (x) − x) for each α ∈ U and x ∈ R J .
Proof. According to Assumption 2.6 and Remark 2.7, π α : R J → G is the unique mapping that satisfies π α (x)−x ∈ d(α, π α (x)). By the definition of d(·, ·) in (2.4) and the linear independence of the vectors {d i (α), i ∈ I(π α (x))} guaranteed by Assumption 2.4, it follows that for each x ∈ R J there is a unique vector ξ α (x) ∈ R N + satisfying the conditions of the lemma.
The proof of Theorem 3.3 is given in Section 7. The proof is a straightforward adaptation of the proof of [28, Theorem 3.2(i)], which proves (3.8) in the case of normal reflection along the boundary. The main difference between the two results is that we allow for oblique reflection along the boundary and also prove convergence of the constraining process in (3.9).

3.2.
Euler scheme for the derivative process. We now construct an Euler scheme for the derivative process. We start with a lemma that introduces a linear projection map that can be interpreted as a linearization of π α . Recall the definition of the linear subspace H x , for x ∈ G, given in (2.8). Remark 3.5. For each x ∈ G, since L α x is a linear map from R J to H x ⊆ R J , we can write L α x as a J × J matrix whose column vectors lie in H x . Throughout this work we view L α x as this J × J matrix and refer to L α x as the derivative projection matrix. For ∆ > 0 recall the sequence {t ∆ n } n∈N0 , the random vectors δ ∆ n W , n ∈ N, the discrete filtration {F ∆ t } and the pair of processes (Z α,∆ , L α,∆ ) defined in the last section. Let J α,∆ = {J α,∆ (t), t ≥ 0} denote the piecewise constant RCLL {F ∆ t }-adapted process taking values in R J×M defined as follows: Set and, for n ∈ N, set where X α,∆ n is the random element taking values in R J×M defined by ). The following establishes convergence of the Euler scheme for the derivative process.
The proof of Theorem 3.6, which is given in Section 8, relies on continuity properties of the related derivative map, which are established in Theorem 6.15.
Proof. The corollary follows from Theorem 2.13(iv) and Theorem 3.6.
Corollary 3.7 suggests an asymptotically unbiased estimator for Θ t (α). In the next section we describe an algorithm for estimating Θ t (α) based on the Euler discretization. It is also of interest to investigate whether there is an exact sampling algorithm for the joint process (Z α , J α ), which would avoid the bias introduced by the Euler discretization. Even in the setting of an RBM, where an exact sampling method has been proposed in [4] for RBMs in the nonnegative orthant with reflection matrices that are so-called M-matrices, it is a challenging problem to exactly sample the associated derivative process due to the fact that the derivative process jumps whenever the RBM reaches the boundary of the domain. We leave this as an interesting open problem for future work.
3.3. Numerical algorithm. Fix 0 < ∆ < t < ∞. Since ∆ is typically much smaller than t, for simplicity we can assume N .
In the next section we compare our algorithm with other methods for sensitivity analysis. In Section 5 below, we illustrate our algorithm with an example of a one-dimensional RBM and an example of an RBM that arises in the study of a rank-based interacting diffusion model of equity markets.

Comparison with existing methods
The main existing (asymptotically unbiased) estimator for estimating sensitivities of reflected diffusions is the LR method. The LR method is applicable only when the law of the perturbed process is absolutely continuous with the law of the original process. In the context of multidimensional (reflected) diffusions, this only holds for perturbations to the drift. In this case, the LR method uses a change of measure argument to recast expectations of functionals of the perturbed process as the expectation, under the law of the original process, of the functional multiplied by the likelihood ratio or the Radon-Nikodym derivative. The LR method for sensitivities of diffusions with respect to drift was introduced in [32] and the authors describe an extension of their method to reflected diffusions (see [32,Section 9]). Here we provide a brief summary of the method in the context of reflected diffusions in convex polyhedral domains and refer the reader to [32] for the details.
Since we can only consider perturbations to the drift in this section, in addition to the assumptions stated in Section 2.2, we also assume that the initial condition, dispersion coefficient and directions of reflection are constant in α ∈ U ; that is, x 0 (·) ≡ x 0 , σ(·, ·) ≡ σ(·) and R(·) ≡ R. Fix t < ∞ and α ∈ U . For m = 1, . . . , M , by a standard argument using the Lipschitz continuity of b(·, ·) and Girsanov's transformation (see, e.g., [26,Chapter IV.38]), there is a family of probability measures {P α,ε m } ε>0 on the measurable space (Ω, F t ) that are mutually absolutely continuous with respect to the underlying measure P with Radon-Nikodym derivative As described in Section 3.1, we have an Euler approximation Z α,∆ for Z α , which yields the approximation D α,∆,m for D α,m , for m = 1, . . . , M , given by According to [32, Theorem 4.1], We now propose a numerical algorithm for estimating Θ t (α) based on (4.2) and (4. Algorithm 2. Set Z α,∆ (t ∆ 0 ) = x 0 (α). For n = 1, . . . , N , recursively complete the following two steps: 1. Generate a K-dimensional Gaussian random variable δ ∆ n W with mean zero and diagonal covariance matrix ∆I K . 2. Define Z α,∆ (t ∆ n ) as in (3.6). Define D α,∆ as in (4.2) and set the (LR) estimator Θ t (α) LR of Θ t (α) equal to the right-hand side of (4.4).
We briefly mention some of the advantages and drawbacks of the LR method. When applicable, the main advantage of the LR method is that it applies to a broad class of functionals of reflected diffusions, requiring only that ζ 1 and ζ 2 be measurable (and integrable) without imposing additional smoothness conditions that are required for our IPA method (Algorithm 1). In addition, since there are exact sampling methods for reflected diffusions (see [4]), they can potentially be adopted to obtain an unbiased estimator of Θ t (α). On the other hand, as mentioned above, the main drawback of the LR method is that it is only applicable to perturbations of the drift. This is quite restrictive as it is often of interest to compute sensitivities with respect to parameters other than the drift (e.g., in diffusion approximations of queueing networks, which we plan to study in future work). In [32,Section 8] the authors show that in the one-dimensional setting, the approach can be adapted to allow for perturbations of the diffusion coefficient, but those methods do not extend to higher dimensions. Furthermore, even when considering perturbations to the drift, numerical evidence suggests that the variance of an LR estimator performs worse than the our method, especially over large time intervals (see Figure  2 below). This supports the observation made in [1, Chapter VII.5] that IPA estimators usually outperform LR estimators when both methods are applicable.
Remark 4.1. It is worthwhile to compare our IPA algorithm with the class of FD methods, which can be used to estimate sensitivities. These are based on approximating the derivative by a finite difference and include forward, backward and central difference approximations (see, e.g., [1, Chapter VII.1]). For example, the forward difference estimator associated with fixed ε, ∆ > 0 (and assuming once again that N . = t/∆ is a positive integer) for ∂ m Θ t (α), for m = 1, . . . , M , is given by with t ∆ n defined as in (3.1). The appeal of an FD method is that it is simple to implement and can be used to approximate sensitivities with respect to all of the parameters. The main drawback is that it introduces an extra source of bias from the derivative approximation. Our IPA method can be viewed as a limiting version of the FD method that eliminates the bias from the derivative approximation, without any increase in the computational complexity of implementation. Since there is no computational advantage to using the FD method and in general there do not appear to be any other advantages to using an FD method over an IPA method when both are applicable (see, e.g., the first paragraph at the the top of [1, page 219]), we do not provide a numerical comparison of our method and an FD method.

Examples
We illustrate the value of our method (Algorithm 1) with two examples: a one-dimensional RBM and a three-dimensional RBM that arises in the study of rank-based interacting diffusion models of equity markets. We used the program R for all computations, which were performed on an Apple machine with a 2.26 GHz Intel Core 2 Duo processor. Table 1. Analytic values (approximated up to four decimal places) of the first partial derivatives of Θ 1 (·, ·, ·) defined as in (5.1) and evaluated at (1, −1, 1), as well as 95% confidence intervals for the LR estimate (when applicable) and the IPA estimate of the first partial derivatives of Θ 1 (·, ·, ·). Here ∆ denotes the time step. Each estimate was obtained using 10 5 trials and took approximately 10∆ −1 CPU seconds to compute.
It is readily verified that the assumptions in Section 2.2 hold. For t ≥ 0, define Θ t : U → R + by We use our algorithm (Algorithm 1) to estimate the first partial derivatives ∂ x Θ 1 (1, −1, 1), , we compare our method IPA with the LR method. In Table 1 we compare the estimates with the analytic values obtained using the formula for the cumulative distribution function of a one-dimensional RBM given on [11, page 49], and, in the case of perturbations to the drift, with the estimates obtained using the LR method (Algorithm 2). In Figure 1 we plot confidence intervals for the magnitude of the relative biases of the LR estimator and our IPA estimator for different time steps ∆, which, as expected, decrease with ∆. (The "relative bias" of an estimator is equal to the difference between the mean of the estimator and the analytic value divided by the magnitude of the analytic value.) In Figure 2 we compare the empirical variance of our estimator of  [12]. In the usual model of an equity market, the growth and volatility of each stock are fixed and depend on the identity of the stock, whereas in the class of Atlas models, the growth and volatility of each stock are determined by its relative Figure 1. 95% confidence intervals for the magnitude of the relative biases of the LR estimators (when applicable) and IPA estimators for the first partial derivatives of Θ 1 (·, ·, ·) at (1, −1, 1) using different time steps. Each estimate was obtained using 10 5 trials. rank within the market and thus are time-varying (and discontinuous). We consider the simplest version of the Atlas model (with J ≥ 2 stocks) in which the stocks have equal volatility, and the smallest stock, referred to as the Atlas stock, has positive growth rate, whereas all other stocks have zero growth. More precisely, let g > 0, σ > 0 and W be a J-dimensional Brownian motion. Let S = {S(t), t ≥ 0} denote the J-dimensional process whose jth component at time t ≥ 0, denoted S j (t), is equal to the value of the jth stock at time t. Then S evolves according to the stochastic differential equation (with discontinuous drift) for all k = j and equal to zero otherwise, for j = 1, . . . , J. Weak existence and uniqueness of solutions to (5.2) follows from their relation to martingale problems, and existence and uniqueness results for martingale problems established in Stroock and Varadhan [29] and Bass and Pardoux [3]. Let Z = {Z(t), t ≥ 0} be the rank-ordered process, where Z(t) is obtained by permuting the components of log S(t) (the logarithm is applied componentwise) so that they are in descending order, for t ≥ 0. Then according to [2] (see the paragraph following the proof of Proposition 2. In many applications in mathematical finance, it is of interest to compute sensitivities of functionals of the stocks in a market with respect to the underlying parameters of the market. For example, sensitivities of derivative prices with respect to the underlying market parameters, commonly referred to as the "Greeks", play an important role in risk management and hedging strategies (see, e.g., [9,Chapter 7]). Here, we study sensitivities related to the Atlas model, which have not yet received much attention. A performance measure of interest in this context is the so-called diversity of the equity market, which is expressed in terms of the J-dimensional process µ = {µ(t), t ≥ 0} of relative market capitalizations defined by µ(t) .
Fix p ∈ (0, 1) and define f : Then f is the pth power of the diversity function and f (µ(t)) is a measure of the diversity of the market (see [8,Example 3.4.4]). Since f (x) is invariant under permutations of the components of x and Z is obtained by permuting the components of log S, it follows from (5.3) and (9.56) that f (µ(t)) = ζ(Z(t)), where ζ : G → [J p−1 , 1) is the continuously differentiable function defined by A straightforward computation shows that the gradient of ζ is bounded and continuous. Here, we first use a FD method based on the stochastic differential equation representation (5.2) and then our IPA method for the reflected diffusion representation to estimate the sensitivity of E[f (µ(t))] to an underlying parameter of the market. Fix σ > 0 and set U . = (0, ∞). For each α ∈ U set the growth rate to be g(α) . = σ 2 /2α, let S α = {S α (t), t ≥ 0} be the J-dimensional process satisfying (5.2) with g = g(α), let µ α = {µ α (t), t ≥ 0} be the process of relative market capitalizations defined by µ α (t) .

The Skorokhod problem and the derivative problem
In this section we recall the deterministic SP, extended Skorokhod problem (ESP) and derivative problem. We state some useful properties and also state a new continuity result for the derivative problem (see Theorem 6.15 below) that is needed in the proof of Theorem 3.6 and whose proof is deferred to Section 9. Throughout this section we fix α ∈ U . Recall that Assumptions 2.4, 2.5 and 2.6 hold. 6.1. The Skorokhod problem and extended Skorokhod problem.
where (0) = 0 and for each i ∈ I, the ith component of , denoted i , is nondecreasing and can only increase when h lies in face F i ; that is, If there exists a unique solution (h, ) to the SP {(d i (α), n i , c i ), i ∈ I} for f , we write h = Γ α (f ) and refer to Γ α as the associated SM.
In the following we state the ESP, which is an extension of the SP that allows for a constraining term with unbounded variation. We introduce the ESP formulation because it will be more convenient to work with in subsequent proofs.
0) and the following conditions hold: If there exists a unique solution (h, g) to the ESP {(d i (α), n i , c i ), i ∈ I} for f , we write h =Γ α (f ) and refer toΓ α as the associated ESM. Remark 6.3. Given a reflected diffusion Z α with L α satisfying the conditions in Definition 2.1 define the {F t }-adapted J-dimensional continuous process X α = {X α (t), t ≥ 0} by Then by the properties of (Z α , L α ) stated in Definition 2.1, (Z α , L α ) is a solution to the SP The following is a time-shift property to the ESP. A similar property holds for the SP; however, it is not needed in this work.
The following proposition ensures the SP and ESP are well defined and the associated SM and ESM are Lipschitz continuous. We close this section by stating a version of the boundary jitter property that was introduced in [14, Definition 3.1]. The boundary jitter property will be important for establishing continuity properties of the so-called derivative map. Definition 6.6. A pair (h, g) ∈ C(G) × C(R J ) satisfies the boundary jitter property if the following hold: 1. If h(t) ∈ ∂G for some t 1 < t < t 2 , then g is nonconstant on (t 1 ∨ 0, t 2 ). 2.' h does not spend positive Lebesgue time in the boundary ∂G; that is, 3. If h(t) ∈ N for some t > 0, then for each i ∈ I(h(t)) and all δ ∈ (0, t) there exists s ∈ (t − δ, t) such that I(h(s)) = {i}. For the following, given α ∈ U and the associated reflected diffusion Z α , let Y α denote the constraining process introduced in Remark 2.3. 6.2. The derivative problem. The derivative problem was first introduced in [14] as an axiomatic framework for studying directional derivatives of the ESM. The formulation of the derivative problem can be thought of as a linearization to the ESP (note the similarities between Definition 6.2 and Definition 6.9).
is a solution to the derivative problem along h for ψ if η(0) ∈ span[d(α, h(0))] and the following conditions hold: If there exists a unique solution (φ, η) to the derivative problem along h for ψ, we write φ = Λ α h (ψ) and refer to Λ α h as the derivative map associated with h.
Remark 6.10. Definition 6.9 is slightly different from the definition given in [14,Definition 3.4], where h was assumed to be continuous. Here we allow for h to lie in D(R J ) and impose condition 4, which follows from condition 3 when h is continuous. When h is not continuous, condition 3 only guarantees that η(t) − η(t−) ∈ span [d(α, h(t−)) ∪ d(α, h(t))].
Remark 6.11. Given a derivative process J α along a reflected diffusion Z α , define the continuous For each m = 1, . . . , M , it follows from the properties of (J α m , K α m ) stated in Definition 2.11 that a.s. (J α m , K α m ) is a solution to the derivative problem along Z α for H α m . In other words, a.s.
In the following we state some important properties of the derivative problem and its associated derivative map. The first lemma states a time-shift property of the derivative problem. Lemma 6.12. Let (h, g) be a solution to the ESP {(d i , n i , c i ), i ∈ I} for f ∈ D G (R J ). Suppose (φ, η) is a solution to the derivative problem along h for ψ ∈ D(R J ). Let s ≥ 0 and define h s ∈ D(G) as in (6.2), and define φ s , η s , ψ s ∈ D(R J ) by Then (φ s , η s ) is a solution to the derivative problem along h s for ψ s .
Proof. This was shown for f ∈ C G (R J ) in [14,Lemma 5.2]. A similar argument can be used to prove the lemma in the case that f ∈ D G (R J ). The only difference is that condition 4 must also be verified. To see that condition 4 holds, fix s ≥ 0 and let t > 0. By the definition of η s , condition 4 to the DP and the definition of h s , This proves condition 4 of the DP holds.
, and, for k = 1, 2, (φ k , η k ) is a solution to the derivative problem along h for ψ k ∈ D(R J ), then for all t < ∞, Proof. When f ∈ C G (R J ), the proposition follows from [14,Theorem 5.4]. When f is not continuous, the proof follows exactly analogously to the proof of [14,Theorem 5.4] except that [14, equation (3.5)] does not follow from condition 3 of the derivative problem, but rather is due to condition 4 of the derivative problem.
For the following, let N denote the nonsmooth part of the boundary; that is, Proposition 6.14. Let α ∈ U . Given f ∈ C G (R J ) suppose the solution (h, g) to the ESP for f satisfies the boundary jitter property (Definition 6.6). Then given ψ ∈ D(R J ) there exists a unique solution (φ, η) to the derivative problem along h for ψ. Furthermore, φ is continuous at times t > 0 that h(t) ∈ G • ∪ N .
Proof. Uniqueness follows from Proposition 6.13. Existence and the continuity property follows from [ We close this section with the following continuity property of the derivative map that will be instrumental in the proof of our main result. In addition to its use in this work, it is also used in [16] to show that the joint reflected diffusion and derivative process is Feller continuous. In contrast to the other results in this section, we explicitly state exactly which assumptions are required for the following theorem. Recall that we have equipped D(R J ) with the Skorokhod J 1 -topology. Theorem 6.15. Suppose Assumptions 2.5 and 2.6 hold. Given f ∈ C G (R J ) suppose the solution (h, g) to the ESP for f satisfies the boundary jitter property (Definition 6.6). Let {f l } l∈N be a sequence in D(R J ) such that f l converges to f in D(R J ) as l → ∞ and for each l ∈ N, let (h l , g l ) denote the solution to the ESP for f l . Suppose ψ ∈ C(R J ) satisfies ψ(0) ∈ H h(0) and {ψ l } l∈N is a sequence in D(R J ) such that ψ l converges to ψ in D(R J ) as l → ∞. Then Λ α h l (ψ l ) converges to Λ α h (ψ) in D(R J ) as l → ∞. The proof of Theorem 6.15 is deferred to Section 9.

Convergence of the Euler scheme for reflected diffusions
For ∆ > 0 recall the sequence {t ∆ n } n∈N0 in R + defined in (3.1), define the RCLL step function ρ ∆ : R + → R + by recall the definition of the discrete filtration {F ∆ t } given in (3.3), and define the piecewise constant Observe that ρ ∆ and W ∆ are constant on the interval [t ∆ n , t ∆ n+1 ) for each n ∈ N 0 and where the third equality uses the fact that, by Lemma 3.1, R(α)ξ α (Ξ α j ) = π α (Ξ α,∆ j ) − Ξ α,∆ j for each j ∈ N. Since Z α,∆ , L α,∆ , ρ ∆ and W ∆ are constant on intervals of the form [t ∆ n , t ∆ n+1 ) for n ∈ N 0 , it follows from (3.4), (7.3), (7.4) and the last display that, for all t ≥ 0, In addition, by (3.4)-(3.6) and Lemma 3.1, we see that a.s. L α,∆ (0) = 0 and for each i ∈ I, the ith component of L α,∆ is nondecreasing and can only increase when Z α,∆ lies in F i .
We now prove the convergence of the Euler scheme for the reflected diffusion. The argument is analogous to the one given in [28] for reflected diffusions in convex polyhedral domains with normal reflection. Given a function f : R + → R J and 0 < ∆ < t, define   as ∆ ↓ 0.
Proof. Fix α ∈ U and p ≥ 2. The proof is analogous to the proof of [28, Theorem 3.2], which assumes normal reflection along the boundary. In particular, following [28, equations (3.3)-(3.4)], there exists a constant C < ∞ such that, for all t ≥ 0, Due to the facts that Z α,∆ = Γ α (X α,∆ ) and Z α = Γ α (X α ) by Remarks 6.3 and 7.1, and the Lipschitz continuity of the SM shown in Proposition 6.5(i), we have, for t ≥ 0, Substituting the last inequality into the integrand on the right hand side of (7.8) and applying Gronwall's inequality yields which completes the proof of the lemma.

Relation between the discretized derivative process and the derivative map.
Recall the definition of H x given in (2.8) for x ∈ G. Suppose x 0 (α) ∈ F i for some i ∈ I. Then, for all β ∈ R M and ε > 0 sufficiently small so that α + εβ ∈ U , we have where the final inequality is due to the fact that x 0 (·) takes values in G. Since the last display holds for all β ∈ R M , by considering limits as ε → 0 from the left and right, this implies that the column vectors of x 0 (α) take values in {y ∈ R J : y, n i = 0}. In particular, it follows that given α ∈ U , the column vectors of x 0 (α) take values in H x0(α) . Thus, by (3.1), (3.10) and (3.4), Here [x 0 (α)] m denotes the mth column vector of x 0 (α). By (3.10)-(3.13) and (3.4), we have, for each n ∈ N, where K α,∆ = {K α,∆ (t), t ≥ 0} is the piecewise constant RCLL {F ∆ t }-adapted process taking values in R J×M defined by K α,∆ (0) = 0 and, for each n ∈ N, K α,∆ (t) .
The remainder of this section is devoted to the proof Proposition 8.4. 8.3. Some useful lemmas. Recall the derivative processes J α introduced in Definition 2.11 and the associated process H α introduced in Remark 6.11. We first recall a basic estimate. Lemma 8.5 ([15,Lemma 5.7]). For each α ∈ U , p ≥ 2 and t < ∞, We now prove an estimate that relates the derivative map along the reflected diffusion with the derivative map along the Euler discretization of the reflected diffusion. Lemma 8.6. For each α ∈ U , p ≥ 2, m = 1, . . . , M and t < ∞, Proof. Fix α ∈ U , p ≥ 2, m = 1, . . . , M and t < ∞. Recall the constraining process Y α introduced in Remark 2.3 and the process X α defined in (6.1). By Remark 6.3 and Proposition 6.8, a.s. (a) (Z α , Y α ) is the solution to the ESP for X α , and (b) (Z α , Y α ) satisfies the boundary jitter property.
By the Lipschitz continuity of the derivative map (Proposition 6.13), where κ 1 and κ 2 denote the constants in Assumption 2.9, κ Γ (α) denotes the constant in Proposition 6.5 and κ Λ (α) denotes the constant in Proposition 6.13.

Continuity of the derivative map
Throughout this section we fix α ∈ U . For brevity, we drop the α notation and write d(·), Λ h and L x for d(α, ·), Λ α h and L α x , respectively. 9.1. Proof of Theorem 6.15. In this section we first prove Theorem 6.15 when ψ l = ψ for all l ∈ N and ψ lies in a class of simple functions (see Lemma 9.2 below). With that in mind, let D s (R J ) denote the set of simple functions in D(R J ); that is, denote the times in [0, ∞) that h lies in the boundary ∂G, and for ψ ∈ D(R J ) let denote the jump or discontinuity points of ψ in (0, ∞). Recall the definition of H x , x ∈ G, given in (2.8). For h ∈ C(G), define Lemma 9.1. Given h ∈ C(G), suppose B h has zero Lebesgue measure. Then for ψ ∈ C(R J ) with ψ(0) ∈ H h(0) , there is a sequence {ψ k } k∈N in D h s (R J ) such that ψ k converges to ψ uniformly on compact intervals as k → ∞.
Since s ∈ [0, t] was arbitrary, this completes the proof.
We now state a key convergence lemma, Lemma 9.2, and show how it, together with Lemma 9.1, can be used to prove Theorem 6.15. The proof of Lemma 9.2 is deferred to Section 9.3, after first establishing some preliminary results in Section 9.2. Lemma 9.2. For α ∈ U , given f ∈ C G (R J ) suppose that the solution (h, g) to the ESP for f satisfies the boundary jitter property. Let {f l } l∈N be a sequence in D(R J ) such that f l converges to f in D(R J ) as l → ∞ and for each l ∈ N, let (h l , g l ) denote the solution to the ESP for f l . Then for all ψ ∈ D h s (R J ), Λ h l (ψ) converges to Λ h (ψ) in D(R J ) as l → ∞. Proof of Theorem 6.15. Fix α ∈ U and, for any h ∈ D(R J ), denote Λ α h just by Λ h . Fix f ∈ C G (R J ) and a sequence {f l } l∈N in D(R J ) such that f l converges to f in D(R J ) as l → ∞. Let (h, g) denote the solution to the ESP for f and for each l ∈ N, let (h l , g l ) denote the solution to the ESP for f l . Fix ψ ∈ C(R J ) and a sequence {ψ l } l∈N in D(R J ) such that ψ l converges to ψ in D(R J ) as l → ∞. Let d J1 (·, ·) denote a metric on D(R J ) that is compatible with the Skorokhod J 1 -topology. Let d U (·, ·) denote a metric on D(R J ) that is compatible with the topology of uniform convergence on compact intervals. Since the topology of uniform convergence on compact intervals is a stronger topology than the Skorokhod J 1 -topology, we can assume the metrics are chosen such that d J1 (·, ·) ≤ d U (·, ·). Then by the triangle inequality and Lipschitz continuity of the derivative map (Proposition 6.13), Since ψ ∈ C(R J ) and the Skorokhod J 1 -topology relativized to C(R J ) coincides with the topology of uniform convergence on compact intervals there, ψ l converges to ψ uniformly on compact intervals as l → ∞. Therefore, d U (ψ l , ψ) converges to zero as l → ∞. We are left to show d J1 (Λ h l (ψ), Λ h (ψ)) converges to zero as l → ∞. According to Lemma 9.1, there is a sequence {ψ k } k∈N in D h s (R J ) such that ψ k converges to ψ in D(R J ) as k → ∞. For each k ∈ N, by the Lipschitz continuity of the derivative map, By Lemma 9.2, d J1 (Λ h l (ψ k ), Λ h (ψ k )) converges to zero as l → ∞. Then letting k → ∞, d U (ψ, ψ k ) converges to zero. This completes the proof.

9.2.
Characterization of solutions to the derivative problem. In this section we characterize solutions to the DP along h when the input is simple up until the first time h hits the intersection of two or more faces. As a preliminary result, in Lemma 9.4, we first characterize solutions to the DP when the input ψ is constant.
In order to characterize the solutions to the DP, we first need some definitions. Recall that F i = {x ∈ G : x, n i = c i } denotes the ith face of G, for i ∈ I; and I(x) = {i ∈ I : x ∈ F i } for x ∈ G. The following is an upper semicontinuity property of the set-valued function I(·). Suppose (h, g) is the solution to the ESP for f ∈ C G (R J ) and 0 ≤ S < T < ∞. Definet 1 . = S and for k ≥ 1 such thatt k < T , define (9.2)ρ k . = inf{t ∈ (t k , T ] : I(h(t)) ⊆ I(h(t k ))} ∧ T to be the first time aftert k that h lies on a boundary face that is distinct from the faces on which h(t k ) lies. Also, define We claim thatt k = T for some k ∈ N, at which point we set K . = k and end the sequence. To see that the claim holds, suppose for a proof by contradiction thatt k < T for all k ∈ N. Since {t k } k∈N is increasing and bounded from above, there existst ∞ ≤ T such thatt k →t ∞ as k → ∞. By the upper semicontinuity of I(·) shown in Lemma 9.3 and the continuity of h, this implies there exists k 0 ∈ N such that I(h(t)) ⊆ I(h(t ∞ )) for all t ≥t k0 . Sincet k0 <ρ k0 <t k0+1 <t ∞ , this implies (9.4) I(h(t)) ⊆ I(h(t ∞ )) for all t ≥ρ k0 .
However, (9.3) and (9.4) implyt k0+1 ≥t ∞ , which yields a contradiction. With this contradiction thus obtained, we see thatt k = T for some k ∈ N. We now characterize solutions to the DP when the input ψ is constant. In this case φ = Λ h (ψ) is constant while h is in the interior of G and jumps whenever h hits the boundary ∂G and the jump can be expressed in terms of a derivative projection matrix. (Recall the derivative projection matrices L x , x ∈ G, introduced in Lemma 3.4.) Lemma 9.4. Given f ∈ D(R J ) and ψ ∈ D(R J ), suppose (h, g) is the solution to the ESP for f and (φ, η) is the solution to the derivative problem along h for ψ. Let 0 ≤ S < T < ∞ and define the nested sequence S =t 1 <ρ 1 ≤t 2 < · · · ≤t K = T as in (9.2)-(9.3). Suppose ψ is constant on [S, T ]. Then Proof. We will prove the relation We can then iterate (9.6) and use the fact thatt 1 = S to obtain (9.7). Let k ∈ {2, . . . , K}. We first claim that φ is constant on [t k−1 ,ρ k−1 ). Due to the uniqueness that follows from Lipschitz continuity of the derivative map stated in Proposition 6.13, in order to prove the claim it suffices to show that if (φ, η) satisfies conditions 1-4 of the derivative problem for t ∈ [0,t k−1 ] and is constant on [t k−1 ,ρ k−1 ), then (φ, η) satisfies conditions 1-4 of the derivative problem for t ∈ [t k−1 ,ρ k−1 ). Let t ∈ [t k−1 ,ρ k−1 ). Since (φ, η) satisfies condition 1 of the derivative problem att k−1 and φ, η, ψ are constant on [t k−1 ,ρ k−1 ), we have φ(t) = φ(t k−1 ) = ψ(t k−1 ) + η(t k−1 ) = ψ(t) + η(t), so condition 1 of the derivative problem holds at t. Next, since φ satisfies condition 2 of the derivative problem att k−1 , φ is constant on [t k−1 ,ρ k−1 ) and I(h(t)) ⊆ I(h(t k−1 )) by the definition ofρ k−1 in (9.2), we have where the last inclusion holds by (2.8).
Thus, condition 2 of the derivative problem holds at t. Lastly, since η is constant on [t k−1 ,ρ k−1 ), conditions 3 and 4 of the derivative problem are straightforward to verify, so we omit the details. This concludes the proof that (φ, η) is constant on [t k−1 ,ρ k−1 ). Next, by the definition oft k in (9.3), the upper semicontinuity of I(·) shown in Lemma 9.3 and the continuity of h, there exists δ > 0 such that I(h(t)) ⊆ I(h(t k )) for all t ∈ [ρ k−1 − δ,t k ]. Along with the fact that φ is constant on [t k−1 ,ρ k−1 ), conditions 1-4 of the derivative problem, the fact that ψ is constant on [S, T ], and the definition of d(·) in (2.4), this implies that φ(t k ) ∈ H h(t k ) and The unique characterization of L h(t k ) stated in Lemma 3.4 then implies (9.6) holds.
We have the following corollary of Lemma 9.4. Corollary 9.5. Given f ∈ D(R J ) and ψ ∈ D(R J ), suppose (h, g) is the solution to the ESP for f and (φ, η) is the solution to the derivative problem along h for ψ. Let 0 ≤ S < T < ∞ and suppose ψ is constant on [S, T ]. Then there existsm ∈ N and S <ŝ 1 < · · · <ŝm = T such that In Lemma 9.6 below, we characterize solutions to the DP along h for simple ψ up until the first time h hits the intersection of two or more faces. In the next section we use an inductive argument (on the first time h hits n or more faces) to prove continuity properties of the derivative map. With that in mind, given a solution (h, g) to the ESP for f ∈ C G (R J ), define for some m ∈ N, y 1 , . . . , y m ∈ R J such that y j = y j+1 for 1 ≤ j < m, and 0 = s 1 < · · · < s m+1 < ∞ with s m+1 ≤ θ 2 . For each 1 ≤ j ≤ m, define K j ∈ N ∞ and the sequence {ρ j (k)} k=1,...,Kj in [s j , s j+1 ] as follows: set ρ j (1) . = s j and for k ∈ N such that ρ j (k) < s j+1 , recursively define (9.10) ρ j (k + 1) . = inf{t > ρ j (k) : I(h(t)) ⊆ I(h(ρ j (k)))} ∧ s j+1 .
Before proving the induction step, we need the following helpful lemma. There is a norm on R J , denoted · B , such that under · B the derivative projection matrix L x is a contraction for all x ∈ G; that is, L x y B ≤ y B for all x ∈ G and y ∈ R J . Furthermore, since norms on R J are equivalent, there exists C B < ∞ such that y B ≤ C B |y| for all y ∈ R J . Lemma 9.9. Let 2 ≤ n ≤ N and suppose that Statement 1 holds. Then Statement 1 holds with n + 1 in place of n.
Proof. Let f , (h, g), {f l } l∈N , {(h l , g l )} l∈N and ψ be as in Statement 1. If θ n+1 = θ n then (9.15) and (9.16) hold by assumption. For the remainder of the proof we assume that θ n+1 > θ n . We split the proof into four parts (whose titles are indicated using italic font). The first two parts are devoted to defining the time-changes and the latter two parts are devoted to proving convergence results for the time-changes and the time-changed paths.
Partition of the interval [0, θ n+1 ) into subintervals {[t k , t k+1 )} 1≤k<K+1 : Set t 0 . = 0, t 1 . = θ n and for k ≥ 1 such that t k < θ n+1 , define ρ k to be the first time after t k that h hits a face that does not lie in the subset of faces that h(t k ) lies in; that is, (9.42) ρ k . = inf{t > t k : I(h(t)) ⊆ I(h(t k ))} and let t k be the first time after ρ k that h lies at the intersection of n or more faces; that is, (9.43) t k+1 . = inf{t ≥ ρ k : |I(h(t))| ≥ n}.
If t k = θ n+1 for some k ∈ N, then set K . = k and end the sequence. Alternatively, if t k < θ n+1 for all k ∈ N, define K . = ∞. In this case, we claim that t k → θ n+1 as k → ∞. The claim clearly holds if t k → ∞ as k → ∞; and on the other hand, if t k is uniformly bounded for all k ∈ N, then there must be an accumulation point and it follows from (9.42)-(9.43) that the accumulation point must be θ n+1 . In either case, the following holds: (9.44) Given T < θ n+1 , there exists 1 ≤ k < K + 1 such that T < t k .
For each l ∈ N, define λ l on the interval [0, t 1 ) by (9.47) λ l (t) . = λ 1 l (t) for t ∈ [0, t 1 ). Now suppose 1 ≤ k < K. Since h spends zero Lebesgue measure on the boundary ∂G due to condition 2 of the boundary jitter property (Definition 6.6), it follows from the definition of ρ k in (9.42) and the upper semicontinuity of I(·) shown in Lemma 9.3 that there exists (9.48) t k < ξ k < ρ k such that h(ξ k ) ∈ G • and |I(h(t))| < n for all t ∈ [ξ k , ρ k ). (9.49) Since h(ρ k ) ∈ ∂G by (9.42), the definition of D h s (R J ) in (9.1) implies that ψ is constant in a neighborhood of ρ k . In particular, by choosing ξ k ∈ (t k , ρ k ) possibly larger, we can assume that (9.50) ψ is constant on [ξ k , ρ k ].
= h(ξ k ) + f (ξ k + ·) − f (ξ k ). (9.51) By the time-shift property of the ESP, (h k , g k ) is a solution to the ESP for f k . In addition, since (h, g) satisfies the boundary jitter property and h(ξ k ) ∈ G • by (9.49), it follows from (6.2) that (9.52) (h k , g k ) satisfies the boundary jitter property.
(9.54) Again, by the time-shift property of the ESP, (h k l , g k l ) is a solution to the ESP for f k l . Since f l converges to f in uniformly on compact intervals as l → ∞, it follows that (9.55) f k l converges to f k uniformly on compact intervals as l → ∞. Therefore, by the Lipschitz continuity of the ESM (Proposition 6.5(ii)), (9.56) (h k l , g k l ) converges to (h k , g k ) uniformly on compact intervals as l → ∞. Define (9.57) ψ k (·) . = Λ h (ψ)(ξ k ) + ψ(ξ k + ·) − ψ(ξ k ).
The definitions of ψ k and h k , the fact that h(ξ k ) ∈ G • so H h(ξ k ) = R J , the fact that ψ ∈ D h s (R J ) and (9.1) imply that (9.59) ψ k ∈ D h k s (R J ).
By (9.52), (9.55), (9.59) and the fact that Statement 1 holds by assumption (with h k , g k , f k , h k l , g k l , f k l , ψ k and θ k n in place of h, g, f , h l , g l , f l , ψ and θ n , respectively), there is a sequence {λ k l } l∈N of time-changes mapping [0, θ k n ) to [0, θ k n ) such that for all T < ∞, Λ h k l (ψ k (λ k l (t)) − Λ h k (ψ k )(t) = 0, For each l ∈ N, define λ l on the interval [t k , t k+1 ) = [t k , ξ k + θ k n ) (due to (9.53)) by . Convergence of the time-changed paths on intervals of the form [0, t k ): We use the principle of mathematical induction to show that for each 1 ≤ k < K + 1 and for all T < ∞, |Λ h l (ψ)(λ l (t)) − Λ h (ψ)(t)| = 0.
Due to the upper semicontinuity of I(·) shown in Lemma 9.3 and the definition of ρ k in (9.42), we can choose δ > 0 possibly smaller to ensure that (9.71) I(h(t)) ⊆ I(h(t k )) for all t ∈ [t k − δ, ρ k ).
Then by the last display, (9.71) and the upper semicontinuity of I(·), we can choose l 0 ∈ N possibly larger such that for all l ≥ l 0 , (9.74) I(h l (λ l (t))) ⊆ I(h(t k )) for all t ∈ [t k − δ, ρ k − δ].
In addition, since Λ h (ψ)(t k ) ∈ H h(t k ) (by condition 2 of Definition 6.9) and (9.74) holds, it follows from the uniqueness of the derivative projection matrices L x stated in Lemma 3.4 that L h l (sj ) Λ h (ψ)(t k ) = Λ h (ψ)(t k ), 1 ≤ j ≤ m.