Scaling limit of a limit order book model via the regenerative characterization of Lévy trees

We consider the following Markovian dynamic on point processes: at constant rate and with equal probability, either the rightmost atom of the current configuration is removed, or a new atom is added at a random distance from the rightmost atom. Interpreting atoms as limit buy orders, this process was introduced by Lakner et al. to model a one-sided limit order book. We consider this model in the regime where the total number of orders converges to a reflected Brownian motion, and complement the results of Lakner et al. by showing that, in the case where the mean displacement at which a new order is added is positive, the measure-valued process describing the whole limit order book converges to a simple functional of this reflected Brownian motion. Our results make it possible to derive useful and explicit approximations on various quantities of interest such as the depth or the total value of the book. Our approach leverages an unexpected connection with Lévy trees. More precisely, the cornerstone of our approach is the regenerative characterization of Lévy trees due to Weill, which provides the elegant proof strategy that we unfold.

Context. The limit order book is a financial trading mechanism that facilitates the buying and selling of securities by market participants. It keeps track of orders made by traders, which makes it possible to fulfill them in the future. For instance, a trader may place an order to buy a security at a certain level π. If the price of the security is larger than π when the order is placed, then the order is kept in the book and will be fulfilled if the price of the security falls below π. Due to its growing importance in modern electronic financial markets, the limit order book has attracted a significant amount of attention in the applied probability literature recently. One may consult, for instance, the survey paper by Gould et al. [15] for a list of references. Several mathematical models of the limit order book have been proposed in recent years, ranging from stylized models such as Yudovina [35] to more complex models such as those proposed by Cont et al. [10] and Garèche et al. [13]. Broadly speaking, these models may be categorized as being either discrete and closely adhering to the inherent quantized nature of the limit order book, or as being continuous in order to better capture the high frequency regime in which the order book typically evolves.
In the present paper, we attempt to bridge the gap between the discrete and continuous points of view by establishing the weak convergence of a discrete limit order book model to a continuous one in an appropriately defined high frequency regime where the speed at which orders arrive grows large. Similar weak convergence results have recently been considered in various works. However, most of the time, only finite-dimensional statistics of the limit order book are tracked, such as the bid price (the highest price associated with a buy order on the book), the ask price (defined in a symmetric way from the limit sell orders) or the spread (equal to the difference between these two quantities), see for instance [1,6,8,9,22]. In contrast, in the present paper we establish the convergence of the full limit order book which we model by a measure-valued process. This approach has also been taken in [30]. In [16] the authors also model the entire book, but with a different approach, namely, they track the density of orders which they see as random elements of an appropriate Banach space.
Relation with previous work. The particular discrete model that we study is a variant of the limit order book model proposed by Lakner et al. [23]. We are interested in a one-sided limit order book with only limit buy orders, which are therefore fulfilled by market sell orders. In this model, limit buy orders and market sell orders arrive according to two independent Poisson processes. Each time a limit buy order arrives, it places an order on the book at a random distance from the current highest buy price. The sequence of differences between the arriving limit buy orders and the current highest buy price is i.i.d., with common distribution that of a random variable J. This sequence of differences is also assumed to be independent of the two Poisson processes according to which limit and market orders arrive (cf. the next section for a precise definition).
Under the assumption E(J) < 0, meaning that traders on average place their limit buy orders at a price lower than the bid price, it was shown in [23] that, under an appropriate rescaling in the high frequency regime, the entire limit order book is asymptotically concentrated at the bid price, and that the latter converges to a monotonically decreasing process.
In the present paper we complement this result by considering the case E(J ) > 0. In stark contrast to the case E(J ) < 0, our main result (Theorem 2.1 below) shows that when E(J ) > 0, the price process converges to a reflected Brownian motion and that at any point in time the measure describing the book puts mass on a non-empty interval.
It is worthwhile to note that in the high frequency regime that we consider, the total number of orders in the book converges to a reflected Brownian motion and that in both cases E(J ) < 0 and E(J ) > 0, the limiting measure is a deterministic function of this reflected Brownian motion. However, this deterministic function changes completely depending on the sign of E(J ), and this interesting dichotomy reflects the asymmetric nature of the discrete limit order book model itself. Another interpretation of this dichotomy is discussed at the end of Section 6.

MODEL AND MAIN RESULT
Let M be the set of positive measures on [0, ∞). We equip M with the weak topology and consider D([0, ∞), M), the class of càdlàg mappings from [0, ∞) to M, which we endow with the Skorohod topology. Let z ∈ M be the zero measure, δ_a be the Dirac mass at a ≥ 0 and M_F ⊂ M be the set of finite point measures, i.e., measures ν ∈ M with finite support and of the form ν = Σ_p ς_p δ_p for some integers ς_p. For a measure ν ∈ M let π(ν) be the supremum of its support: π(ν) = sup{y ≥ 0 : ν([y, ∞)) > 0}, with the convention π(z) = 0; π(ν) will be called the price of the measure ν, and an atom of ν ∈ M_F will be referred to as an order. We use the canonical notation and denote by (X_t, t ≥ 0) the canonical M-valued process. Let P_χ be the law of the M_F-valued (strong) Markov process started at χ ∈ M_F and with generator ω given by

ω(f)(ν) = λ E[f(ν + δ_{(π(ν)+J)^+}) − f(ν)] + λ 𝟙_{ν ≠ z} [f(ν − δ_{π(ν)}) − f(ν)],

where a^+ = max(0, a) for a ∈ R, and where λ > 0 and J, a real-valued random variable, are the only two parameters of the model under consideration. In words, the dynamic is as follows. We are given two independent Poisson processes, each of intensity λ. When the first one rings, a new order is added to the process, located at a distance distributed like J from the current price, independently from everything else (J will sometimes be referred to as the displacement of the newly added order). Note however that an order cannot be placed on the negative half-line, and so an order with displacement J is placed at (π(ν) + J)^+ (this boundary condition will be discussed in Section 6). When the second Poisson process rings and provided that at least one order is present, an order currently sitting at the price is removed (it does not matter which one).
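The dynamic just described is straightforward to simulate, which helps build intuition for the scaling result below. The following sketch is ours (the function names and the displacement law passed in are illustrative, not part of the model's definition):

```python
import random

def simulate_book(lam, sample_J, horizon, seed=0):
    """Simulate the order book dynamic: two independent Poisson clocks of
    intensity lam, merged into one clock of intensity 2*lam; with equal
    probability an event is an arrival (new order at (price + J)^+) or a
    cancellation (an order sitting at the price is removed, if any)."""
    rng = random.Random(seed)
    book = {}           # finite point measure: location -> number of orders
    price = 0.0         # price of the zero measure is 0 by convention
    t, path = 0.0, []
    while t < horizon:
        t += rng.expovariate(2 * lam)       # next ring of the merged clock
        if rng.random() < 0.5:              # arrival of a limit buy order
            loc = max(0.0, price + sample_J(rng))
            book[loc] = book.get(loc, 0) + 1
        elif book:                          # market order removes one order at the price
            book[price] -= 1
            if book[price] == 0:
                del book[price]
        price = max(book) if book else 0.0  # price = supremum of the support
        path.append((t, price))
    return book, path
```

Running this with a displacement law of positive mean and rescaling the price path produces, by the main theorem below, something resembling a reflected Brownian motion.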
In the sequel we will omit the subscript when the initial state is the empty measure z, i.e., we will write P and P n for P z and P n z , respectively, with their corresponding expectations E and E n . For convenience we will also use P and E to denote the probability and expectation of other generic random variables (such as when we write E(J ), or when we consider random trees).
Let M_F^n = ϑ_n(M_F) = {ϑ_n(ν) : ν ∈ M_F}. In the sequel we will denote by ν_n, for ν ∈ M_F^n, the unique measure in M_F such that ϑ_n(ν_n) = ν. Let in the sequel W be a standard Brownian motion reflected at 0 and α = (2λ)^{1/2}. The following result, which is the main result of the paper, shows that P^n converges weakly to a measure-valued process which can simply be expressed in terms of W.

Theorem 2.1. Assume that E(J) > 0 and that J ∈ {−j*, −j* + 1, . . . , 0, 1} for some j* ∈ N. Then as n → +∞, P^n converges weakly to the probability measure under which π • X is equal in distribution to αE(J)W and, for each t ≥ 0, X_t is absolutely continuous with respect to Lebesgue measure with density 𝟙_{0 ≤ y ≤ π(X_t)}/E(J), i.e.,

(2.2) X_t([0, y]) = (1/E(J)) min(y, π(X_t)), t, y ≥ 0.
Remark. We will prove more than is stated, namely, we will show that X converges jointly with its mass and price processes, and also with their associated local time processes at 0 (see Lemma 5.2).
In the rest of the paper we assume that the assumptions of this theorem hold, i.e., E(J) > 0 and J ∈ {−j*, . . . , 1} for some j* ∈ N. The behavior when E(J) < 0 is completely different and has been treated in Lakner et al. [23] using stochastic calculus arguments; see the Introduction and Section 6 for more details.
Link with Lévy trees: detailed discussion. The following lemma is at the heart of our approach to prove Theorem 2.1. Let in the sequel D be the set of real-valued càdlàg functions with domain [0, ∞) and ζ( f ) = inf{t > 0 : f (t ) = 0} for f ∈ D. We call excursion, or excursion away from 0, a function f ∈ D with 0 < ζ( f ) < +∞ and f (ζ( f ) + t ) = 0 for all t ≥ 0 (note that we only consider excursions with finite length). We call height of an excursion its supremum, and denote by E the set of excursions. For a ≥ 0 and g ≤ d we say that the function e = ( f ((g +t )∧d)−a, t ≥ 0) is an excursion of f above level a if e ∈ E , e t ≥ 0 for every t ≥ 0 and f (g −) ≤ a.

Lemma 2.2.
Under P, the successive excursions of π • X above level a, for any integer a, are i.i.d., with common distribution the first excursion of π • X away from 0 under P_{δ_1}.
Proof. Consider X under P ν for any ν ∈ M F with π(ν) ≤ a that only puts mass on integers. Then when the first excursion (of π • X ) above a begins, the price is at a and an order is added at a + 1. Thus if g is the left endpoint of the first excursion above a, X g must be of the form X g = X g − + δ a+1 with π(X g − ) = a. This excursion lasts as long as at least one order sits at a + 1, and if d is the right endpoint of the first excursion above a, then what happens during the time interval [g , d] above a is independent from X g − and is the same as what happens above 0 during the first excursion of π • X away from 0 under P δ 1 . Moreover, X d only puts mass on integers and satisfies π(X d ) ≤ a, so that thanks to the strong Markov property we can iterate this argument. The result therefore follows by induction.
Remark. For a ≥ 0 let R a : M → M be defined by R a (ν)([y, ∞)) = ν([a + y, ∞)), and call (R a (X t ), g ≤ t ≤ d) an excursion of X above level a if (π(X t ), g ≤ t ≤ d) is an excursion of π • X above level a. Then the above proof actually shows that the successive excursions above level a of X are i.i.d., with common distribution the first excursion above 0 of X under P δ 1 .
Lemma 2.2 is at the heart of our proof of Theorem 2.1. Indeed, this regenerative property is strongly reminiscent of Galton Watson branching processes. More precisely, consider a stochastic process H ∈ E with finite length and continuous sample paths, that starts at 1, increases or decreases with slope ±1 and only changes direction at integer times.
For integers a ≥ 0 and p > 0 and conditionally on H having p excursions above level a, let (e_{a,p}^k, k = 1, . . . , p) be these p excursions. Then H is the contour function of a Galton Watson tree if and only if for each a and p, the (e_{a,p}^k, k = 1, . . . , p) are i.i.d. with common distribution that of H. Indeed, H can always be seen as the contour function of some discrete tree. With this interpretation, the successive excursions of H above a code the subtrees rooted at the nodes at depth a + 1 in the tree. The (e_{a,p}^k, k = 1, . . . , p) being i.i.d. therefore means that the subtrees rooted at the nodes at depth a + 1 in the tree are i.i.d.: this is precisely the definition of a Galton Watson tree.
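For the geometric offspring distribution with parameter 1/2 that appears later in the coupling, the walk coding the tree is particularly simple: the Harris-type walk is a simple random walk started at 1 and stopped at 0, so the regenerative property above is easy to observe numerically. A small sketch (names ours):

```python
import random

def harris_walk(rng, cap=10**6):
    """Simple random walk started at 1 and stopped when it first hits 0:
    for Geometric(1/2) offspring this codes a critical Galton Watson tree
    (the height of the walk corresponds to the depth in the tree)."""
    h, path = 1, [1]
    while h > 0 and len(path) < cap:
        h += 1 if rng.random() < 0.5 else -1
        path.append(h)
    return path

def excursions_above(path, a):
    """Successive excursions of the walk strictly above level a, each
    shifted so that it again starts at height 1."""
    out, cur = [], None
    for x in path:
        if x > a:
            if cur is None:
                cur = []
            cur.append(x - a)
        else:
            if cur is not None:
                out.append(cur)
                cur = None
    if cur is not None:
        out.append(cur)
    return out
```

Each list returned by `excursions_above` is again a path of the same kind as `harris_walk` produces: this is the discrete regenerative property just described.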
The difference between this regenerative property and the regenerative property satisfied by π • X under P and described in Lemma 2.2 is that, when conditioned to belong to the same excursion away from 0, consecutive excursions of π • X above some level are neither independent, nor identically distributed. If for instance we condition some excursion above level a to be followed by another such excursion within the same excursion away from 0, this biases the number of orders put in {0, . . . , a} during the first excursion above a. Typically, one may think that more orders are put in {0, . . . , a} in order to increase the chance of the next excursion above a to start soon, i.e., before the end of the current excursion away from 0.
However, this bias is weak and will be washed out in the asymptotic regime that we consider. Thus it is natural to expect that π•X under P, properly renormalized, will converge to a process satisfying a continuous version of the discrete regenerative property satisfied by the contour function of Galton Watson trees.
Such a regenerative property has been studied in Weill [34], who showed that it characterizes the contour process of Lévy trees (see for instance Duquesne and Le Gall [11] for background on this topic). Thus upon showing that this regenerative property passes to the limit, we will have drastically reduced the set of possible limit points, and it will remain to show that, among the contour processes of Lévy trees, the limit that we have is actually a reflected Brownian motion. From there, a simple argument based on local time considerations allows us to conclude that Theorem 2.1 holds.
In summary, our proof of Theorem 2.1 will be divided into four main steps: (1) showing tightness of P^n; (2) showing, based on Lemma 2.2, that for any accumulation point P, π • X under P satisfies the regenerative property studied in Weill [34] (most of the proof is devoted to this point); (3) arguing that among the contour processes of Lévy trees, π • X under P must actually be a reflected Brownian motion; (4) showing that X_t under P has density 𝟙_{y ≤ π(X_t)}/E(J) with respect to Lebesgue measure.

COUPLING WITH A BRANCHING RANDOM WALK
In this section we introduce the coupling of [33] between our model and a particular random walk with a barrier. As mentioned in the Introduction, this coupling plays a crucial role in the proof of Theorem 2.1. Let T be the set of colored, labelled, rooted and oriented trees, endowed with the lexicographic order. In addition to its genealogical structure, each edge of a tree T ∈ T carries a real-valued label and each node has one of three colors: white, green or red.
In the sequel we write v ∈ T to mean that v is a node of T, and we denote by ∅ ∈ T the root of T, by |T| its size (the total number of nodes) and by h(T) its height. Nodes inherit labels in the usual way, i.e., the root has some label and the label of a node other than the root is obtained recursively by adding to the label of its parent the label on the edge between them. If v ∈ T we write ψ(v, T) for the label of v (in T), |v| for the depth of v (so that, by our convention, |∅| = 1 and h(T) = sup_{v∈T} |v|) and v_k ∈ T, for k = 1, . . . , |v|, for the node at depth k on the path from the root to v (so that v_1 = ∅ and v_{|v|} = v). Also, ψ*(T) = sup_{v∈T} ψ(v, T) is the largest label in T, and γ(T) is the green node in T with largest label, with γ(T) = ∅ if T has no green node; in case several green nodes have the largest label, γ(T) is the last one. Finally, Γ(T) ∈ M_F is the point measure that records the labels of green nodes:

Γ(T) = Σ_{v ∈ T green} δ_{ψ(v,T)}.

We say that a node v ∈ T is killed if its label is smaller than the label of the root, while every other node on the path from the root to v has a label at least equal to that of the root. Let K(T) ⊂ T be the set of killed nodes:

K(T) = {v ∈ T : ψ(v, T) < ψ(∅, T) and ψ(v_k, T) ≥ ψ(∅, T) for k = 1, . . . , |v| − 1},

and consider B(T) ∈ T the tree obtained from T by removing all the descendants of the killed nodes (but keeping the killed nodes themselves), and B^+(T) the tree obtained from B(T) by applying the map x ↦ x^+ to the label of every node of B(T). Note that since B(T) is a subtree of T, we always have ψ*(B(T)) ≤ ψ*(T).
Let Φ : T → T be the operator acting on a tree T ∈ T as follows. If T has no green node then Φ(T) = T. Otherwise, Φ changes the color of one node of T according to the following rule:
• if γ(T) has at least one white child, then its first white child becomes green;
• if γ(T) has no white child, then γ(T) becomes red.
Let Φ^k be the kth iterate of Φ, i.e., Φ^0 is the identity map and Φ^{k+1} = Φ • Φ^k, and let also τ(T) = inf{k ≥ 0 : ψ(γ(Φ^k(T)), T) < ψ(∅, T)}. We will sometimes refer to the process (Φ^k(T), k = 0, . . . , τ(T)) as the exploration of the tree T.
Consider a tree T ∈ T such that all the nodes are white, except for the root which is green. For such a tree, the dynamic of Φ is such that τ(T) is the smallest k at which the nodes of B(Φ^k(T)) \ K(Φ^k(T)) are red, the nodes of K(Φ^k(T)) are green and the other nodes are still white. It has taken one iteration of Φ to make each node of K(T) green, and two to make each node of B(T) \ K(T) red (first each of them had to be made green), except for the root which was already green to start with. Thus for such a tree we have

(3.1) τ(T) = 2(|B(T)| − |K(T)|) + |K(T)| − 1.

Let finally T_x for x ∈ R be the following random tree:
• its genealogical structure is a (critical) Galton Watson tree with geometric offspring distribution with parameter 1/2;
• ψ(∅, T_x) = x and the labels on the edges are i.i.d., independent from the genealogical structure, and with common distribution J;
• all nodes are white, except for the root which is green.
Because of the last property and the preceding remark, (3.1) applies to T_x. Note that since J ≤ 1, we have ψ*(T_1) ≤ h(T_1), and in particular ψ*(B(T_1)) ≤ h(T_1). The following result is a slight variation of Theorem 2 in Simatos [33], where the same model in discrete time and without the boundary condition (i.e., an order may be added on the negative half-line) was studied. The intuition behind this coupling is to create a genealogy between orders in the book, a newly added order being declared the child of the order corresponding to the current price; see Section 3.1 in Simatos [33] for more details.
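A toy implementation of the exploration may help parse these definitions; the tree representation and names below are ours, and ties in γ are resolved by taking the last node in depth-first order, as specified above:

```python
WHITE, GREEN, RED = "white", "green", "red"

class Node:
    def __init__(self, label, children=()):
        self.label, self.color, self.children = label, WHITE, list(children)

def nodes_in_order(root):
    """Depth-first (lexicographic) list of the nodes of the tree."""
    out, stack = [], [root]
    while stack:
        v = stack.pop()
        out.append(v)
        stack.extend(reversed(v.children))
    return out

def gamma(root):
    """Green node with the largest label (the last such node in case of
    ties), or None if the tree has no green node."""
    best = None
    for v in nodes_in_order(root):
        if v.color == GREEN and (best is None or v.label >= best.label):
            best = v
    return best

def phi(root):
    """One iteration of the operator Phi, acting in place."""
    g = gamma(root)
    if g is None:
        return
    for c in g.children:
        if c.color == WHITE:
            c.color = GREEN        # first white child of gamma becomes green
            return
    g.color = RED                  # gamma has no white child: it becomes red

def tau(root):
    """Iterations until the largest green label drops below the root label."""
    k = 0
    while True:
        g = gamma(root)
        if g is None or g.label < root.label:
            return k
        phi(root)
        k += 1
```

For instance, on the tree whose green root has label 1 and two white leaves with labels 2 and 0, the leaf with label 0 is the only killed node, and the exploration takes 2·(3 − 1) + 1 − 1 = 4 steps, ending with the killed leaf green and the other two nodes red, in accordance with the counting argument above.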
Theorem 3.1 (Theorem 2 in [33]). Let a be any integer and g < d be the endpoints of the first excursion of π • X above level a. Then the process (X_t − X_{g−}, g ≤ t ≤ d) under P, embedded at jump epochs, is equal in distribution to the process (Γ(Φ^k(B^+(T_{a+1}))), k = 0, . . . , τ(T_{a+1})).
Ambient tree. Thanks to this coupling, we can see any piece of path of X corresponding to an excursion of the price process above some level a as the exploration of some random tree T_{a+1}: we will sometimes refer to this tree as the ambient tree. Note that the ambient tree of an excursion above a, say e, is a subtree of the ambient tree of the excursion above a − 1 containing e. Moreover, the remark following Lemma 2.2 implies that the ambient trees corresponding to successive excursions above some given level are i.i.d.
Exploration time. Theorem 3.1 gives, via (3.1), the number of steps needed to explore the ambient tree, say T . However, we are interested in X in continuous time. Since jumps in X under P occur at rate 2λ, independent from everything else, the length of the corresponding excursion is given by S (τ(T )), where, here and in the sequel, S is a random walk with step distribution the exponential random variable with parameter 2λ, independent from the ambient tree T .
More generally, we will need to control the time needed to explore certain regions of T, which will translate into controlling S(β) for some random times β defined in terms of T, and thus independent from S. As it turns out, the random variables β that need to be considered have a heavy-tailed distribution. Since on the other hand the jumps of S are light-tailed, the approximation P(S(β) ≥ y) ≈ P(β ≥ 2λy) will accurately describe the situation. Let us make this approximation rigorous: for the upper bound, we write

P(S(β) ≥ y) ≤ P(β ≥ λy) + P(S(λy) ≥ y).

Then, a large deviations bound shows that P(S(λy) ≥ y) ≤ e^{−µy} with µ = (1 − log 2)λ. Carrying out a similar reasoning for the lower bound (which yields the exponent (2 log 2 − 1)λ), we get

(3.2) P(β ≥ 4λy) − e^{−µy} ≤ P(S(β) ≥ y) ≤ P(β ≥ λy) + e^{−µy}

with µ = (1 − log 2)λ ∧ (2 log 2 − 1)λ = (1 − log 2)λ.
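The large deviations bound used for the upper bound can be checked by a standard Chernoff argument; here is a sketch under our reading of the constants (integer parts omitted). Since the steps of S are exponential with parameter 2λ, for 0 < θ < 2λ,

```latex
\mathbb{P}\big(S(\lambda y)\ge y\big)
 \;\le\; \inf_{0<\theta<2\lambda} e^{-\theta y}\,\mathbb{E}\Big[e^{\theta S(\lambda y)}\Big]
 \;=\; \inf_{0<\theta<2\lambda} e^{-\theta y}\Big(\frac{2\lambda}{2\lambda-\theta}\Big)^{\lambda y}
 \;=\; 2^{\lambda y}\,e^{-\lambda y}
 \;=\; e^{-(1-\log 2)\lambda y},
```

the infimum being attained at θ = λ, which gives the exponent µ = (1 − log 2)λ.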

NOTATION AND PRELIMINARY REMARKS
4.1. Additional notation and preliminary remarks. We will write in the sequel P_x, P, P^n_x, P^n for P_{δ_x}, P_z, P^n_{δ_x} and P^n_z, respectively, and denote by E_x, E, etc., the corresponding expectations. Remember that we will also use P and E to denote the probability and expectation of other generic random variables (such as when we write E(J)). In the sequel it will be convenient to consider some arbitrary probability measure P on D([0, ∞), M) and to write Y_n ⇒_n Y to mean that the law of Y_n under P^n converges weakly to the law of Y under P (Y_n and Y being measurable functions of the canonical process). Once we have proved the tightness of P^n we will fix P to be one of its accumulation points, but until then P remains arbitrary. Let M(ν) for ν ∈ M be the mass of ν, i.e., M(ν) = ν([0, ∞)). We will need various local time processes at 0. First of all, ℓ_t = ∫_0^t 𝟙_{π(X_u)=0} du denotes the Lebesgue measure of the time spent by the price process at 0. For discrete processes, i.e., under P^n, we will also need the following local time processes at 0 of M • X and π • X:

L^{n,M}_t = n ∫_0^t 𝟙_{M(X_u)=0} du and L^{n,π}_t = n ∫_0^t 𝟙_{π(X_u)=0} du.

For the continuous processes that will arise as the limits of π • X and M • X, we consider the operator L acting on continuous functions f : [0, ∞) → [0, ∞) as follows:

L(f)_t = lim_{ε→0} (1/ε) ∫_0^t 𝟙_{f(u) ≤ ε} du.

We will only consider L applied to random processes equal in distribution to βW for some β > 0, in which case this definition makes sense and indeed leads to a local time process at 0. Note that for any β > 0 and any f for which L(f) is well-defined, we have L(βf) = β^{−1}L(f). Moreover, according to Tanaka's formula the canonical semimartingale decomposition of W is given by

(4.1) W_t = W̃_t + (1/2) L(W)_t,

where W̃ is a standard Brownian motion.
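If, as is standard, L is the occupation-time limit L(f)_t = lim_{ε→0} ε^{−1} ∫_0^t 𝟙_{f(u) ≤ ε} du (an assumption on our part, consistent with the relation stated above), then the scaling identity L(βf) = β^{−1}L(f) is a one-line change of variables:

```latex
L(\beta f)_t
 \;=\; \lim_{\varepsilon\to 0}\frac{1}{\varepsilon}\int_0^t \mathbf{1}_{\{\beta f(u)\le\varepsilon\}}\,du
 \;=\; \frac{1}{\beta}\,\lim_{\varepsilon\to 0}\frac{\beta}{\varepsilon}\int_0^t \mathbf{1}_{\{f(u)\le\varepsilon/\beta\}}\,du
 \;=\; \beta^{-1}\,L(f)_t .
```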
In the sequel we will repeatedly use the fact that the process π • X under P n (or P) is regenerative at 0, in the sense that successive excursions away from 0 are i.i.d.. Note also that the time durations between successive excursions away from 0 are also i.i.d., independent from the excursions, with common distribution the exponential random variable (with parameter λP(J = 1)n 2 under P n , and λP(J = 1) under P).
Moreover, jumps of π • X under P n have size 1/n, and so if π • X under P n converges weakly, then the limit must be almost surely continuous (see for instance Theorem 13.4 in Billingsley [5]).
Let θ_t and σ_t for t ≥ 0 be the shift and stopping operators associated to π • X, i.e., θ_t = (π(X_{t+s}), s ≥ 0) and σ_t = (π(X_{s∧t}), s ≥ 0). Since, by the previous remark, accumulation points of π • X under P^n are continuous, these operators are continuous in the following sense (see for instance Lambert and Simatos [24]).

Lemma 4.1. Consider some arbitrary random times T_n, T ≥ 0. If (π • X, T_n) ⇒_n (π • X, T), then (θ_{T_n}, σ_{T_n}) ⇒_n (θ_T, σ_T).
We will finally need various random times. For t and ε ≥ 0 let

G_t = sup{s ≤ t : π(X_s) = 0}, D_t = inf{s ≥ t : π(X_s) = 0} and D_{t,ε} = inf{s ≥ t : π(X_s) ≤ ε}.

Note that G_t and D_t are the endpoints of the excursion of π • X straddling t, where we say that an excursion straddles t if its endpoints g ≤ d satisfy g ≤ t ≤ d. For 0 ≤ a < b, T_b = inf{t ≥ 0 : π(X_t) ≥ b} is the first time the price process reaches level b, g_{a,b} ≤ d_{a,b} are the endpoints of the first excursion of π • X above level a with height ≥ b − a, and U_{a,b} = d_{a,b} − g_{a,b} is its length. Note that, in terms of trees, the interval [g_{a,b}, d_{a,b}] corresponds to the exploration of a tree distributed like T_a conditioned on ψ*(T_a) > b, since the height of the excursion corresponds to the largest label in the ambient tree. Also, it follows from the discussion at the end of Section 3 that U_{a,b} is equal in distribution to S(τ(T_a)) under the same conditioning.

4.2. An aside on the convergence of hitting times. At several places in the proof of Theorem 2.1 it will be crucial to control the convergence of hitting times. For instance, we will need to show in the fourth step of the proof that if (X, π • X) ⇒_n (X, π • X), then D_t ⇒_n D_t for any t ≥ 0. Let us explain why, in order to show that D_t ⇒_n D_t, it is enough to show that for any η > 0,

(4.2) limsup_{n→+∞} P^n(D_t ≥ D_{t,ε} + η) → 0 as ε → 0.

Let us say that π • X goes across ε if inf_{[D_{t,ε}, D_{t,ε}+η]} π • X < ε for every η > 0, and let G = {ε > 0 : π • X goes across ε}. Then the following property holds (see for instance Proposition VI.2.11 in Jacod and Shiryaev [17] or Lemma 3.1 in Lambert and Simatos [24]): if P(ε ∈ G) = 1, then D_{t,ε} ⇒_n D_{t,ε}.
On the other hand, the complement G^c of G is precisely the set of discontinuities of the process (D_{t,ε}, ε > 0). Since (D_{t,ε}, ε > 0) is càglàd, being the left-continuous inverse of the process (inf_{[t,t+s]} π • X, s ≥ 0), the set {ε > 0 : P(ε ∈ G^c) > 0} is at most countable, see for instance Billingsley [5, Section 13]. Gathering these two observations, we see that D_{t,ε} ⇒_n D_{t,ε} for all ε > 0 outside a countable set. Then, writing for any ε, η > 0 and y ≥ 0

P^n(D_t ≥ y + η) ≤ P^n(D_{t,ε} ≥ y) + P^n(D_t ≥ D_{t,ε} + η),

we can control the first term of the right-hand side for all ε outside a countable set, since for those ε we have D_{t,ε} ⇒_n D_{t,ε}. Next, observe that D_{t,ε} → D_t as ε → 0, P-almost surely. Indeed, D_{t,ε} is monotone in ε, and as ε ↓ 0 its limit D′ must satisfy t ≤ D′ ≤ D_t, since t ≤ D_{t,ε} ≤ D_t, and also π(X_{D′}) = 0, since π(X_{D_{t,ε}}) ≤ ε and π • X is P-almost surely continuous; hence D′ = D_t. Thus letting first ε → 0 in the previous display, we obtain by (4.2)

limsup_{n→+∞} P^n(D_t ≥ y + η) ≤ P(D_t ≥ y) for almost every y ≥ 0,

and letting then η → 0 shows that D_t ⇒_n D_t by the Portmanteau theorem. This reasoning, detailed for D_t and used in the proof of Lemma 5.6, will also be used in Section 5.4 to control the asymptotic behavior of the hitting times T_b, g_{a,b} and d_{a,b}.
We will also use the following useful property: if π • X and D t converge weakly, then the convergence actually holds jointly. The reasoning goes as follows. If π • X and D t under P n converge to π• X and D t under P, then (π• X , D t ) under P n is tight (we always consider the product topology). Let (P ′ , D ′ ) be any accumulation point.
Since projections are continuous, P′ is equal in distribution to π • X under P, in particular it is almost surely continuous, and D′ is equal in distribution to D_t under P, in particular it is almost surely ≥ t. Further, assume using Skorohod's representation theorem that (P_n, D^n_t) is a version of (π • X, D_t) under P^n which converges almost surely to (P′, D′). Since P_n(D^n_t) = 0 and P′ is continuous, we get P′(D′) = 0 and thus, since D′ ≥ t, inf{s ≥ t : P′(s) = 0} ≤ D′. Since these two random variables are both equal in distribution to D_t under P, they must be (almost surely) equal. This shows that (P′, D′) is equal in distribution to (π • X, D_t) under P, which uniquely identifies accumulation points.
This reasoning applies to all the hitting times considered in this paper, in particular to T_b, g_{a,b} and d_{a,b}. Thus, once we have shown the convergence of π • X and, say, T_b, we will typically be in a position to use Lemma 4.1 and deduce the convergence of θ_{T_b} and σ_{T_b}.

Convention.
In the sequel we will need to derive numerous upper and lower bounds, where only the asymptotic behavior up to a multiplicative constant matters. It will therefore be convenient to denote by C a strictly positive and finite constant that may change from line to line, and even within the same line, but which is only allowed to depend on λ and the law of J .

PROOF OF THEOREM 2.1
We decompose the proof of Theorem 2.1 into several steps. The coupling of Theorem 3.1 makes it possible to translate many questions on P^n into questions on B(T_1), and in order to keep the focus of the proof on P^n, we postpone to Appendix A the proofs of the various results on B(T_1) which we need along the way.
At a high level, it is useful to keep in mind that, since E(J) > 0, the law of large numbers prevails and the approximation ψ(v, T_1) ≈ E(J)|v| describes accurately enough (for our purposes) the labels in the tree B(T_1). In some sense, most of the randomness of B(T_1) lies in its genealogical structure, and the results of Appendix A aim at justifying this approximation.
Note that results similar to the ones we need here are known in a more general setting, but for the tree without the barrier, i.e., for T_1 instead of B(T_1); see, e.g., Durrett et al. [12] and Kesten [21].
We begin with a preliminary lemma: recall that W is a reflected Brownian motion, that α = (2λ)^{1/2} and that L(βW) = β^{−1}L(W) for any β > 0.

Lemma 5.1. As n → +∞, (M • X, L^{n,M}, L^{n,π}) under P^n converges weakly to (αW, L(αW), (1/E(J))L(αW)).

Proof. By definition, M • X under P is a critical M/M/1 queue with input rate λ, which is well known to converge under P^n to αW. Further, λL^{n,M} is the finite variation process that appears in its canonical (semimartingale) decomposition, and standard arguments show that it converges, jointly with M • X, to the finite variation process that appears in the canonical decomposition of αW, equal to (α/2)L(W) by (4.1). Dividing by λ we see that L^{n,M} under P^n converges to (α/(2λ))L(W) = L(αW). This shows that (M • X, L^{n,M}) under P^n converges weakly to (αW, L(αW)).
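The prelimit object in this proof is easy to simulate. The sketch below (names ours) generates the queue-length path of a critical M/M/1 queue under the diffusive rescaling of the paper, i.e., jumps of size 1/n attempted at rate λn² in each direction, with reflection at 0; for large n its law approximates that of αW:

```python
import random

def rescaled_mm1(lam, n, horizon, seed=0):
    """Rescaled critical M/M/1 queue length: up and down jumps of size 1/n,
    each attempted at rate lam * n**2, with no departures from an empty queue."""
    rng = random.Random(seed)
    t, k = 0.0, 0               # k = queue length in units of 1/n
    path = [(0.0, 0.0)]
    while t < horizon:
        t += rng.expovariate(2 * lam * n * n)   # next attempted jump
        if rng.random() < 0.5:
            k += 1                              # arrival
        elif k > 0:
            k -= 1                              # departure (reflection at 0)
        path.append((t, k / n))
    return path
```

The time such a path spends at 0, inflated by n, is the prelimit local time L^{n,M} appearing in the lemma.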
We now show that L^{n,π} under P^n converges weakly to (1/E(J))L(αW), jointly with M • X and L^{n,M}. If we prove the convergence of the one-dimensional distributions, this will imply that L^{n,π} under P^n converges in the sense of finite-dimensional distributions to (1/E(J))L(αW) (jointly with M • X and L^{n,M}), and so, since L^{n,π} and L(W) are continuous and increasing, Theorem VI.2.15 in Jacod and Shiryaev [17] will imply the desired functional convergence result.
To this end, consider the process Q = (M(X_{ℓ^{-1}(t)}), t ≥ 0), where ℓ^{-1} stands for the right-continuous inverse of ℓ. The composition with ℓ^{-1} makes Q evolve only when the price is at 0. Under P and while the price is at 0, the dynamic of Q is as follows:
• Q increases by one at rate λP(J ≤ 0) (which corresponds to an order with a displacement ≤ 0 being added) and decreases by one at rate λ, provided Q > 0 (which corresponds to an order being removed);
• when an order with displacement > 0 is added, which happens at rate λP(J = 1), the price makes an excursion away from 0. When it comes back to 0, Q resumes evolving and, by the coupling, a random number of orders distributed like |K(T_1)| and independent from everything else have been added at 0.
Thus we see that Q under P is stochastically equivalent to a G/M/1 single-server queue with two independent Poisson flows of arrivals: customers arrive either one by one at rate λP(J ≤ 0), or by batches of size distributed according to |K(T_1)| at rate λP(J = 1). Customers have i.i.d. service requirements following an exponential distribution with parameter λ. In particular, the load of this queue is P(J ≤ 0) + P(J = 1)E(|K(T_1)|). Since E(J) > 0, Q is positive recurrent and in particular, the long-term average idle time is equal to one minus the load. Fix on the other hand some y > 0. Since M • X is a reflected critical random walk with jump size ±1/n and jump rates λn², one easily proves that E^n(M(X_ε)²) ≤ 2λε, which gives the desired result by the Cauchy–Schwarz inequality.

5.1. First step: tightness of P^n. To show the tightness of P^n, it is enough to show that M • X under P^n is tight, and that for each continuous φ which is infinitely differentiable with compact support, f_φ • X under P^n is tight (recall that f_φ(ν) = ∫φ dν); see for instance Theorem 2.1 in Roelly-Coppoletta [32]. The tightness of M • X is a direct consequence of Lemma 5.1, and so it remains to show the tightness of f_φ • X. First of all, note that the jumps of f_φ • X under P^n are upper bounded by sup|φ|/n, and so we only need to control the oscillations of this process (see for instance the Corollary on page 179 in Billingsley [5]). Using standard arguments, we see that the process f_φ • X is a special semimartingale with canonical decomposition

f_φ(X_t) = f_φ(X_0) + ∫_0^t Ω^n(f_φ)(X_u) du + Z^n_t,

where Ω^n is the generator of P^n, and Z^n is a local martingale with predictable quadratic variation process ⟨Z^n⟩ (see, e.g., Lemma VIII.3.68 in Jacod and Shiryaev [17]).
In particular, we have and from which it follows that Thus there exists a finite constant C′, depending only on λ, the law of J and φ, such that for any finite stopping time V and any ε > 0 we have where the last inequality follows from (5.2) combined with the strong Markov property at time V. Similarly, E_n(|⟨Z^n⟩_{V+ε} − ⟨Z^n⟩_V|) ≤ C′ε, and these upper bounds imply the tightness of f_φ • X by standard arguments for the tightness of a sequence of semimartingales; see for instance Theorem VI.4.18 in Jacod and Shiryaev [17], or Theorem 2.3 in Roelly-Coppoletta [32].
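The moment bound E_n(M(X_ε)²) ≤ 2λε used above can be made concrete with a small Monte-Carlo sketch of the reflected critical random walk. The simulation below is illustrative only: the Poisson number of jumps in [0, ε] is replaced by its mean, a simplification not made in the text.

```python
import random

def estimate_second_moment(lam=1.0, n=50, eps=0.01, n_runs=2000, seed=1):
    """Monte-Carlo estimate of E_n[M(X_eps)^2] for the reflected critical
    random walk with jumps of size 1/n and total jump rate 2*lam*n^2;
    down-jumps are suppressed at 0, since removals need a nonempty book."""
    rng = random.Random(seed)
    n_events = int(2 * lam * n * n * eps)   # mean number of jumps in [0, eps]
    total = 0.0
    for _ in range(n_runs):
        x = 0                               # mass in units of 1/n
        for _ in range(n_events):
            if rng.random() < 0.5:
                x += 1                      # an order is added
            elif x > 0:
                x -= 1                      # an order is removed
        total += (x / n) ** 2
    return total / n_runs

est = estimate_second_moment()              # should be of order 2*lam*eps = 0.02
```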
We now know that P_n is tight: it remains to identify the accumulation points. As planned in Section 4.1, we let P be an arbitrary accumulation point of P_n and assume without loss of generality that P_n converges weakly to P. In particular, we have f_φ • X ⇒_n f_φ • X for every continuous function φ ≥ 0 with compact support; see for instance Theorem 16.16 in Kallenberg [19]. Also, as noted in Section 4.1, the process π • X under P is almost surely continuous, and since the jumps of f_φ • X under P_n are upper bounded by n^{-1} sup|φ|, the same argument shows that the process f_φ • X under P is also almost surely continuous.

5.2. Second step: joint convergence. We now show that X under P_n actually converges jointly with its mass, price and local time processes.
Lemma 5.2. The following joint convergence holds: Moreover, it holds P-almost surely that M(X t ) ≥ π(X t ) for every t ≥ 0.
Proof. Lemma 5.1 and the first step imply that the sequence (X, M • X, L^{n,M}, L^{n,π}) under P_n is tight. Let (X′, M′, L(M′), L(E(J)M′)) be any accumulation point (which is necessarily of this form by Lemma 5.1), and assume in the rest of the proof, using Skorohod's representation theorem, that X^{(n)} is a version of X under P_n and that L^{(n),M} and L^{(n),π} are defined in terms of X^{(n)} in the same way that L^{n,M} and L^{n,π} are defined in terms of X, such that (X^{(n)}, M • X^{(n)}, L^{(n),M}, L^{(n),π}) → (X′, M′, L(M′), L(E(J)M′)) almost surely. Then, in order to prove the joint convergence, we only have to prove that M′ = M • X′ and that We will use the following key observation: under P and provided that M(X_t) > 0, we have X_t({p}) ≥ 1 for every integer p ≤ π(X_t). It follows that, P_n-almost surely, π(X_t) ≤ M(X_t) for every t ≥ 0 and so π( Note that this implies the desired inequality π(X_t) ≤ M(X_t) under P, once we have proved that M′ = M • X and that for those n, and since the left-hand side of this equality converges to ∫φ dX′_t while the right-hand side converges to

) is bounded and any accumulation point is upper bounded by
. Consider any such accumulation point p, and assume without loss of generality that (this holds for every a < b < p outside a countable set). Then for n large enough we have a < b < π(X^{(n)}_t), and so a consequence of the key observation made at the beginning of the proof is that Letting a ↑ p completes the proof. Since under P_n orders are added at distance at most j*/n from the current price, it follows readily from the previous result that X under P only evolves locally around its price, in the sense that if y ≥ 0 and 0 ≤ g ≤ d are such that π(X_t) > y for g ≤ t ≤ d, then for t ∈ [g, d] the measures X_t and X_g restricted to [0, y] are equal. Actually we will only need the following weaker property.

Corollary 5.3. The following property holds P-almost surely. Let t, y ≥ 0 be such that π(X_t) > y, and let g be the left endpoint of the excursion of π • X above y straddling t. Then X_t([0, y]) = X_g([0, y]).

5.3. Third step: P is regenerative at 0. So far, we have used the fact that π • X under P_n is regenerative at 0, in the natural sense that successive excursions away from 0 are i.i.d. Under P there is no first excursion away from 0, and so we need a more general notion of regeneration (we also do not know, at this point, that π • X under P is a Markov process). The goal of this step is to show that π • X under P is regenerative at 0 in the following sense (recall that θ_t and σ_t are the shift and stopping operators associated to π • X, see Section 4.1): i) the zero set of π • X has zero Lebesgue measure under P; ii) D′_0 = 0 P-almost surely, i.e., P(D′_0 ≥ ε) = 0 for every ε > 0; iii) for every t ≥ 0 and every continuous, bounded functions f and g on E: Note for (5.4) that D_t is P-almost surely finite, since D_t ≤ inf{s ≥ t : M(X_s) = 0} by Lemma 5.2, and this upper bound is P-almost surely finite by Lemma 5.1.
It can be checked, following the proof of Theorem 22.11 in Kallenberg [19], that if π • X under P satisfies the three properties i)–iii) above, then π • X admits an excursion measure away from 0, denoted by N, by which we mean, in accordance with the literature, that: (1) there is a continuous, nondecreasing process L, increasing only on the zero set of π • X (the local time); (2) the right-inverse L^{-1} of L is a subordinator, and the excursion process sending t to the corresponding excursion if L^{-1} jumps at time t (and to a cemetery point otherwise) is a Poisson point process with intensity measure dm dN (m stands for the Lebesgue measure).
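In discrete time, the decomposition into excursions away from 0 that underlies this description is elementary; a minimal sketch (the path below is illustrative):

```python
def excursions_away_from_zero(path):
    """Decompose a nonnegative path into its excursions away from 0,
    i.e. the maximal stretches on which the path is positive."""
    excs, cur = [], []
    for v in path:
        if v > 0:
            cur.append(v)
        elif cur:
            excs.append(cur)
            cur = []
    if cur:
        excs.append(cur)
    return excs

path = [0, 1, 2, 1, 0, 0, 1, 0]
print(excursions_away_from_zero(path))   # -> [[1, 2, 1], [1]]
```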
The rest of this step is devoted to showing that P satisfies the properties i)-iii) above. We begin with a preliminary lemma.
Let D_0 = 0 and, for k ≥ 1, let D_k be the endpoint of the kth excursion of π • X away from 0, and let K(y) = Σ_{k≥1} ½{D_k ≤ y} be the number of excursions finishing before y. In particular, The coupling shows that for each k ≥ 1 we can write D_k − D_{k−1} = E_k + V_k, where E_k and V_k are independent and, under P: du is the time that the price process stays at 0 before the kth excursion starts; in particular, E_k follows an exponential distribution with parameter λP(J = 1); • V_k is the time taken to explore the ambient tree T_k, distributed according to T_1, corresponding to the kth excursion. According to the discussion at the end of Section 3, we can write V_k = S_k(τ(T_k)), where S_k is a random walk independent of T_k, with step distribution an exponential random variable with parameter 2λ.
Note furthermore that, since π • X under P is regenerative at 0, the random variables where the last inequality follows from the independence between K(n²t) and the E_k's. By definition of K(n²t) we have and since the V_i's are i.i.d. with common distribution S_1(τ(T_1)), we end up with According to (A.7) we have P(τ(T_1) ≥ u) ≥ Cu^{-1/2}, and so the lower bound in (3.2) implies that P(S_1(τ(T_1)) ≥ u) obeys a similar lower bound, which completes the proof.
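The u^{-1/2} tail appearing in (A.7) is the familiar total-progeny tail of a critical branching tree. As a purely illustrative check, assume (for computability only, this is not the law of T_1 specified in the paper) a critical Galton–Watson tree with Geometric(1/2) offspring; the Otter–Dwass formula then gives the exact progeny distribution:

```python
from math import comb, sqrt, pi

def progeny_pmf(n):
    """P(|T| = n) for a critical Galton-Watson tree with Geometric(1/2)
    offspring, P(k children) = 2^{-(k+1)}, via the Otter-Dwass formula
    P(|T| = n) = P(S_n = n - 1)/n, with S_n a sum of n offspring variables."""
    k = n - 1                       # S_n is negative binomial
    return comb(n + k - 1, k) / (n * 2 ** (n + k))

# P(|T| >= N) ~ N^{-1/2}/sqrt(pi): the same u^{-1/2} decay as in (A.7)
tail = 1.0 - sum(progeny_pmf(n) for n in range(1, 1000))
print(sqrt(1000) * tail)            # close to 1/sqrt(pi) ~ 0.5642
```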
Proof. We will use the compensation formula for the Poisson point process of excursions of π • X away from 0 associated to the local time ℓ; see, e.g., Corollary IV.11 in Bertoin [4]. More precisely, under P_n the first jump of the right-continuous inverse of ℓ occurs at rate λP(J = 1)n², which uniquely identifies the excursion measure of π • X associated to ℓ as being equal to λP(J = 1)n² times the law of an excursion of π • X away from 0; see for instance Proposition O.2 in Bertoin [4]. In particular, if (β_s, s ≥ 0) is the Poisson point process of excursions of π • X away from 0 associated to ℓ, so that β_s ∈ E ∪ {∂} for some cemetery state ∂, then for any t ≥ 0 and any measurable function Since nℓ_t = L^{n,π}_t, the previous lemma thus gives Let us now prove the result: actually, it is enough to prove that for any η > 0, Indeed, if this holds, then using the convergence ∫φ dX_t ⇒_n ∫φ dX_t for continuous φ with compact support, this implies 0 which yields P(ℓ_t = 0) = 1 for each fixed t ≥ 0, and thus P(∀t ≥ 0 : ℓ_t = 0) = 1 by continuity of ℓ. So let us show (5.6): using Markov's inequality, writing Since ℓ_t = L^{n,π}_t/n, the first term of the above upper bound is at most C t^{1/2}/(ηn) by Lemma 5.4. As for the second term, using (5.5) In terms of the exploration of the ambient tree (equal in distribution to T_1), the integral under the expectation corresponds to the time spent while the largest label of a green node was ≤ εn. Since transitions occur at rate 2λ, independently of everything else, we therefore have Further, we can write The sum Σ_{k≥0} ½{γ(Φ^k(T_1)) = v} counts the number of times the node v has been the price: this is in fact equal to 1 + C(v), with C(v) the number of children of v in T_1.
The 1 accounts for the first time v becomes the price, and the additional C(v) accounts for the fact that each child of v makes v stay the price one more unit of time (either immediately, if the child has a smaller label, or later on if the child has a larger label). Thus Now this sum counts twice all nodes in B(T_1) \ K(T_1) with label in {1, . . . , εn}; it also counts once the nodes in K(T_1), as well as the nodes with label εn + 1 whose parent has label εn. In particular, and so taking the mean and using (A.8) finally gives E_1( Proof. We first prove property iii). First, assume that for every η > 0 it holds that As explained in Section 4.2, this implies that ( On the other hand, since π • X under P_n is regenerative at 0, (5.4) holds with E_n instead of E, and so passing to the limit and using (θ_{D_t}, σ_{D_t}) ⇒_n (θ_{D_t}, σ_{D_t}) we obtain the desired result. Thus we only have to prove (5.8), which we do now.
Let A_t = t − G_t be the age of the excursion straddling t, and let G^u < D^u for u > 0 be the endpoints of the first excursion of π • X with length > u, say e^u: then Theorem (5.9) in Getoor [14] shows that for u > 0 and ν ∈ M_F, conditionally on {A_t = u, X^n_{G_t} = ν} (recall the definition of X^n_t before Theorem 2.1), the excursion of π • X straddling t is equal in distribution to e^u conditionally on {X^n_{G^u} = ν}: in particular, Further, under P_n, X^n_{G_t} is almost surely of the form ν = ςδ_0 + δ_1 for some ς ≥ 0. For such an initial condition, the number ς of orders sitting at 0 does not influence the first excursion, which is distributed like the first excursion under P_1: thus and the goal is now to prove that which will complete the proof of (5.8). Rescaling, we obtain and to control this term we consider any ε′ > 0 and write

High-level description. Let us explain in words how we are going to upper bound each term in the right-hand side of (5.9); this reasoning will also be used in the proof of Lemma 5.8. Let T̃ be the ambient tree corresponding to the first excursion of π • X away from 0, so that D_0 is the sum of τ(T̃) i.i.d. exponential random variables with parameter 2λ, and the conditioning D_0 > un² therefore amounts, by (3.1), to B(T̃) having a large number of nodes. When S ≤ (ε + ε′)n, then D_{un²} − D_{un²,εn} is smaller than the time spent exploring all the nodes in T̃ with label ≤ (ε + ε′)n. We have good control on the number of such nodes (they are of order (ε + ε′)²n²), which translates into good control on D_{un²} − D_{un²,εn} on this event.
On the other hand, to control the probability of S being large, i.e., S > (ε + ε′)n, we observe that S is equal to the largest supremum of the excursions above εn that start between times D_{un²,εn} and D_0. By Lemma 2.2 these excursions are i.i.d., with common distribution the exploration of a tree distributed like T_1. In particular, we can control their suprema (each equal in distribution to ψ*(T_1)), and to control their number, we use the crude upper bound that there cannot be more excursions above level εn than there are nodes in T̃ with label εn. Again, we have good control on these two quantities which, combined, will give us sufficiently good control on the probability of S being large.
Let us now make these arguments rigorous. As just explained, D_{un²} − D_{un²,εn} is, on the event {S ≤ (ε + ε′)n, D_0 > un²}, smaller than the time spent exploring the nodes of the ambient tree that have a label ≤ (ε + ε′)n. This means that if is the number of such nodes, then (the factor 2 in 2N_≤ comes from the same reason as the 2 in the right-hand side of (5.7)). Invoking (3.2), we get and so (A.10) finally gives We now control the second term in the right-hand side of (5.9). Let e_k be the kth excursion of π • X above level εn − 1 to start after time un², and let N_= be the number of excursions above level εn − 1 that belong to the first excursion of π • X away from 0: then, as explained above, for any κ_0 > 0 we have Thanks to the coupling, we have and so (A.11) gives P_1(N_= ≥ κ_0 n | D_0 > un²) ≤ Cε/κ_0. On the other hand, since under P_1( · | D_0 > un²) the (e_k, k ≥ 1) are i.i.d., with common distribution the first excursion of π • X under P_1 (as a consequence of Lemma 2.2), the union bound gives By the coupling, where the last inequality follows from Lemma A.3. Gathering the previous bounds, we see that Choosing κ_0 = ε^{1/2} and ε′ = ε^{1/4}, and letting first n → ∞ and then ε → 0, completes the proof of (5.8), and in particular of property iii).
We now prove property ii), i.e., that for any ε > 0 we have P(D′_0 ≥ ε) = 0. For any η > 0 we have P(D′_0 ≥ ε) ≤ P(D_η ≥ ε − η). Because we have just proved that D_η ⇒_n D_η, we have P_n(D_η ≥ ε′) → P(D_η ≥ ε′) for all ε′ outside a countable set. Adapting the previous arguments, it is on the other hand not difficult to see that limsup_{n→+∞} P_n(D_η ≥ ε′) → 0 as η → 0, which concludes the proof of the lemma.
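The counting identity used in this step (a node v is "the price" exactly 1 + C(v) times) is the standard contour-traversal fact: a depth-first exploration that returns to the parent after each child visits v once on arrival and once after each of its children. A minimal sketch (the tree below is an illustrative example):

```python
def visit_counts(tree, root=0):
    """Count how many times each node is the 'current' node in a
    depth-first exploration that returns to the parent after each child."""
    counts = {v: 0 for v in tree}
    def explore(v):
        counts[v] += 1              # v becomes current for the first time
        for child in tree[v]:
            explore(child)
            counts[v] += 1          # v is current again after each child
    explore(root)
    return counts

# tree as adjacency: node -> list of children
tree = {0: [1, 2], 1: [3], 2: [], 3: []}
print(visit_counts(tree))           # -> {0: 3, 1: 2, 2: 1, 3: 1}
```

Each count equals 1 plus the number of children, as claimed.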

5.4. Fourth step: a regenerative property at the excursion level.
Let N in the sequel be an excursion measure of P, whose existence was proved in the previous step. With a slight abuse of notation we will consider that N acts on measurable functions f : E → [0, ∞) via N(f) = ∫ f dN. Note that N is determined only up to a multiplicative constant: in this step the value of this constant is irrelevant (because we only consider N under various conditionings), and it will be fixed at the end of the next step.
The goal of this step is to show that N satisfies the following regenerative property (R) studied in Weill [34]. In the sequel we use the canonical notation for excursions: we let ǫ = (ǫ_t, t ≥ 0) denote the canonical excursion, and let ξ(a, u), for a, u > 0, denote the number of excursions of ǫ above level a that have height > u.
(R) For every a, u > 0 and p ∈ N, under the probability measure N ( · | sup ǫ > a) and conditionally on the event {ξ(a, u) = p}, the p excursions of ǫ above level a with height greater than u are independent and distributed according to the probability measure N ( · | sup ǫ > u). This property implies that N is the law of the excursion height process of a spectrally positive Lévy process that does not drift to +∞, see the next step for more details.
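In discrete time, ξ(a, u) is straightforward to compute; the following sketch (the path is an illustrative example) extracts the excursions above a level and counts those of height greater than u:

```python
def excursions_above(path, a):
    """Maximal stretches of the path strictly above level a."""
    excs, cur = [], []
    for v in path:
        if v > a:
            cur.append(v)
        elif cur:
            excs.append(cur)
            cur = []
    if cur:
        excs.append(cur)
    return excs

def xi(path, a, u):
    """Number of excursions above level a whose height exceeds u,
    a discrete stand-in for xi(a, u) in property (R)."""
    return sum(1 for e in excursions_above(path, a) if max(e) - a > u)

path = [0, 1, 2, 3, 2, 1, 2, 1, 0, 1, 0]
print(xi(path, 1, 1))   # only the excursion reaching 3 has height > 1 -> 1
```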
The rest of this step is therefore devoted to proving that N satisfies the regenerative property (R). Fix for the rest of this step a, u > 0, p ∈ N, and continuous, bounded, non-negative functions (f_k, k = 1, . . . , p) on E. Consider the first excursion of ǫ (or π • X) with exactly p excursions above a of height larger than u, and let (ǫ̂_k, k = 1, . . . , p) be these p excursions: in order to show that N satisfies (R), we have to show that To prove this we will prove that while at the same time Let, in the rest of the proof, ĝ_k < d̂_k be the endpoints of ǫ̂_k, let ǫ_k be the kth excursion of π • X above a with height > u, and let g_k < d_k be its endpoints. Note in particular that (g_1, d_1) = (g_{a,a+u}, d_{a,a+u}).

5.4.1. Proof of (5.10). Since N is an excursion measure of π • X under P, the probability distribution N( · | ξ(a, u) = p) is the law of the first excursion of π • X under P that has exactly p excursions above a with height > u, and in particular Thus in order to prove (5.10) it is enough to show that i.e., that (ǫ̂_k, k = 1, . . . , p) ⇒_n (ǫ̂_k, k = 1, . . . , p). In view of Lemma 4.1 it is enough to show that the corresponding endpoints converge, i.e., that we have the convergence ((ĝ_k, d̂_k), k = 1, . . . , p) ⇒_n ((ĝ_k, d̂_k), k = 1, . . . , p). We first show in the following two lemmas that T_{a+u} and (g_{a,a+u}, d_{a,a+u}) converge, and explain after Lemma 5.8 why this implies the convergence of ((ĝ_k, d̂_k), k = 1, . . . , p). In the following lemma, the limit means that the left-hand side is arbitrarily small provided that b − b, b − b ≥ 0 are small enough. The limit in Lemma 5.8 has the same meaning.
where the supremum is taken over [g_{b′n,bn}, d_{b′n,bn}]. Rescaling, we obtain When S ≥ (ε + ε′)n, then necessarily g_{b′n,bn} ≤ T_{bn} ≤ T_{bn} ≤ d_{b′n,bn}, and so The coupling implies that U_{b′n,bn} under P is equal in distribution to S(τ(T_1) + 1) conditionally on ψ*(B(T_1)) ≥ ε′n, and also that S under P is equal in distribution to ψ*(B(T_1)) conditioned on ψ*(B(T_1)) ≥ ε′n. Using τ(T_1) + 1 ≤ 2|T_1| by (3.1), together with (3.2), we therefore get In view of (A.12) and (A.13), choosing ε′ = ε^{1/2} and letting first n → +∞ and then ε → 0 gives the result. The proofs for d and g are very similar to one another, and also very similar to the proof of Lemma 5.6. Let us first sketch the proof for d. First of all, we are interested in the excursion straddling T_{bn} and above an, so the ambient tree, say T̃, is distributed like T_1 conditioned on ψ*(B(T_1)) > (b − a)n. Let ε = a − a, consider any ε′ > 0 and define Rescaling and introducing S = sup π • X − an, where the supremum is taken over [d_{an,bn}, d_{an,bn}], we obtain

(5.13) P_n(d_{a,b} − d_{a,b} ≥ η) ≤ P(d_{an,bn} − d_{an,bn} ≥ ηn², S ≤ (ε + ε′)n) + P(S > (ε + ε′)n).
To control the right-hand side of the above upper bound, we use reasoning similar to that in the proof of Lemma 5.6 (see the high-level description there). To control the first term of the right-hand side, we observe that on the event S ≤ (ε + ε′)n, the difference d_{an,bn} − d_{an,bn} is upper bounded by the time spent exploring nodes with label ≤ (ε + ε′)n in T̃, which leads to the bound P(d_{an,bn} − d_{an,bn} ≥ ηn²) To control the second term of the right-hand side of (5.13), we observe that (1) S is equal to the largest supremum of the excursions above an that start after d_{an,bn} and end before d_{0,bn}; (2) the number of such excursions is smaller than the number of nodes with label εn in T̃; and (3) the excursions above an starting after time d_{an,bn} are i.i.d., with common distribution the first excursion of π • X under P^n_1. This leads to the bound In view of (A.14) and (A.15) we get the desired result for d. For g we derive the exact same upper bound by considering S = sup π • X − an, where the supremum is now taken over [g_{an,bn}, g_{an,bn}]. There is one additional minor difference, namely that the excursions above an that end before g_{an,bn} are i.i.d., but with distribution the first excursion of π • X above an conditioned on having height < bn. Since the probability of this event goes to one, this additional conditioning has no influence on the result.
We now explain why the two previous lemmas imply the convergence of the vector ((ĝ_k, d̂_k), k = 1, . . . , p) (by which we mean that ((ĝ_k, d̂_k), k = 1, . . . , p) ⇒_n ((ĝ_k, d̂_k), k = 1, . . . , p)). First of all, the discussion in Section 4.2 shows that π • X shifted at time T_{a+u} converges. Thus by Lemma 5.8, d_{a,a+u}, which is the hitting time of (0, a] by the process π • X shifted at time T_{a+u}, converges. Moreover, the arguments in the proof of Lemma 5.8 go through for a = 0, which shows that D_{d_{a,a+u}} converges. Since g_{a,a+u} is the hitting time of (0, a] by the process π • X shifted at time T_{a+u} and run backward in time, and since the map that associates to a function the same function run backward in time is continuous, we obtain for the same reasons the convergence of g_{a,a+u}. Recall that ǫ_k is the kth excursion of π • X above a with height > u, and that g_k < d_k are its endpoints. Let also T_k = inf{t ≥ g_k : π(X_t) ≥ a + u}. The idea is now to iterate the above arguments by looking at the process π • X shifted at time d_k. Let us look at k = 1, for which we have (g_1, d_1, T_1) = (g_{a,a+u}, d_{a,a+u}, T_{a+u}). Inspecting the proof of Lemma 5.7, we see that T_2 converges: indeed, all that matters in the proof of Lemma 5.7 is the local behavior around b, for which the initial state ν, as long as π(ν) is far below b, is irrelevant (note that this is the case when shifting π • X at time d_1, since by definition π(X_{d_1}) ≤ a).
Moreover, since the successive excursions above a are i.i.d. by Lemma 2.2, and since T_2 converges, Lemma 5.8 implies that d_2, g_2 and D_{d_2} converge. Iterating, we obtain the convergence of d_k, g_k and D_{d_k} for every k ≥ 1. Finally, it is not hard to see that these convergences hold jointly, i.e., ((g_k, d_k, D_{d_k}), k ≥ 1) ⇒_n ((g_k, d_k, D_{d_k}), k ≥ 1). There are two ways to see this: either use arguments as at the end of the discussion in Section 4.2, or use the fact that the proofs of Lemmas 5.7 and 5.8 show more than weak convergence, namely that the limiting functions have no fixed point of discontinuity, and then use the continuous mapping theorem.
Having the joint convergence with the D_{d_k}'s makes it possible to determine whether two successive excursions above a with height > u belong to the same excursion away from 0. In particular, if k* ≥ 0 is the first index such that D_{d_{k*}} < D_{d_{k*+1}} = · · · = D_{d_{k*+p}} < D_{d_{k*+p+1}} (with the convention d_0 = 0), then (ĝ_k, d̂_k) = (g_{k*+k}, d_{k*+k}) for k = 1, . . . , p. From the convergence of ((g_k, d_k, D_{d_k}), k ≥ 1) we obtain the convergence of k*, which therefore entails the convergence of ((ĝ_k, d̂_k), k = 1, . . . , p), as desired. This completes the proof of (5.10).

5.4.2. Proof of (5.11). Let m_k = sup π • X, where the supremum is taken over the interval [d_k, D_{d_k}]: then (ǫ̂_k, k = 1, . . . , p) is equal in distribution to (ǫ_k, k = 1, . . . , p) conditioned on {m_p < a + u < m_1, . . . , m_{p−1}} (to be understood as {m_1 < a + u} when p = 1). In particular, with A^{n,1}_p defined similarly to A^n_p by taking all the f_k's equal to the constant function one. Moreover, let us introduce for ν ∈ M_F ∏ f_{k+p−q}(ǫ_k); g_1 < D_0, m_q < a + u < m_1, . . . , m_{q−1} for q = 1, . . . , p − 1, for q = 0, . . . , p − 1 (recall that ν_n ∈ M_F is the measure such that ϑ_n(ν_n) = ν) and finally φ^n_q = E_n(f_{p−q+1}(ǫ_1)) for q = 1, . . . , p. Note that Lemma 2.2, together with Lemmas 4.1, 5.7, 5.8 and the definition of N, implies that We now derive some relations between all these quantities. First of all, and since π • X regenerates at D_0, the second term of the above right-hand side is equal to P^n_ν(D_0 < g_1) E_n(∏_{k=1}^{q} f_{k+p−q}(ǫ_k); m_q < a + u < m_1, . . . , m_{q−1}).
Since the two events {D_0 < g_1} and {D_0 < d_1} coincide, we obtain (5.15). Second, for ν ∈ M_F with π(ν) < an, we have (5.16) A^n_q(ν) = φ^n_q × E^n_ν(B^n_{q−1}(X^n_{g_1−})) + ∆^n_{q−1}(ν) for q = 1, . . . , p. Indeed, the strong Markov property at time d_1 gives for q = 1, . . . , p. Since π(ν) < an, under P^n_ν, ǫ_1 and X_{g_1−} are independent, and ǫ_1 is distributed according to ǫ_1 under P_n, which gives (5.16). Combining (5.15) and (5.16), we end up with the following recursion for A^n_q: for q = 1, . . . , p, with the boundary condition Since the functions f_k were arbitrary in deriving this recursion, we obtain a similar recursion for A^{n,1}_q(ν), but with all the terms φ^n_q replaced by one and the ∆^n_q(ν)'s replaced by ∆^{n,1}_q(ν), defined similarly to ∆^n_q(ν) but with the functions f_q equal to the constant function taking value one. Now consider Ã^n_q and Ã^{n,1}_q satisfying the same recursion (5.17)–(5.18), but with all the ∆^n_q equal to 0, i.e., for every ν ∈ M_F, for q = 2, . . . , p, with the boundary condition Ã^n_1(ν) = φ^n_1 × E^n_ν(B^n_0(X^n_{g_1−})), and similarly for Ã^{n,1}_q(ν) with all the terms φ^n_q replaced by one. By induction one gets Ã^n_p(z)/Ã^{n,1}_p(z) = ∏_{k=1}^{p} φ^n_k, and so (5.14) implies that On the other hand, for ν with π(ν) < an, we have A^n_1(ν) − Ã^n_1(ν) = ∆^n_0(ν), while for q = 2, . . . , p, then by induction we obtain |A^n_p(z) − Ã^n_p(z)| ≤ C∆ε(n) for some finite constant C, and a similar upper bound holds for |A^{n,1}_p(z) − Ã^{n,1}_p(z)| (note that, to perform the induction, we use the fact that π(X^n_{g_1−}) < an, P^n_ν-almost surely, for any ν with π(ν) < an). In view of (5.19), the following lemma therefore completes the proof of (5.11).

Lemma 5.9. ∆ε(n) → 0 as n → +∞.
Proof. By definition we have for q = 0, . . . , p − 1 ..,p sup f_q. To control the difference appearing in this last expectation, we use the following observation: under P^n_ν, for any ν ∈ M_F with π(ν) < an, we can write X^n_{d_1} = X^n_{g_1−} + Ξ, where Ξ ∈ M_F corresponds to the orders added below a during the first excursion of π • X above a with height > u. In particular, the coupling implies that Ξ is independent of X^n_{g_1−}, that its law does not depend on ν, and that M(Ξ) is equal in distribution to |K(T_1)| conditioned on ψ*(B(T_1)) > un. In particular, Thus we need to control terms of the form B^n_q(ν) − B^n_q(ν + ν̄) uniformly in ν ∈ M_F with π(ν) < an, where ν̄ plays the role of Ξ. In view of the definition of B^n_q, we thus need to understand the difference between X under P^n_ν and P^n_{ν+ν̄}. More precisely, all the events and random variables involved in the computation of B^n_q depend on X stopped at D_0, and so we actually only need to compare the processes (X_t, 0 ≤ t ≤ D_0) under P^n_ν and P^n_{ν+ν̄}.
In order to do so we extend the coupling of Theorem 3.1: recall that this coupling couples X under P_a with T_a. Using it, it is straightforward to couple X under P_ν with a forest of trees F(ν) = (T^{(k)}, k = 1, . . . , M(ν)) such that the trees T^{(k)} are independent and, if ν = Σ_a ς_a δ_a, then exactly ς_a of the trees T^{(k)} are distributed like T_a. This coupling relies on extending the map Φ to make it act on forests in the obvious manner.
This coupling between P_ν and F(ν) provides a coupling between P_ν and P_{ν+ν̄} as follows: first, one considers the forest F(ν) used to construct X under P_ν. Then, one adds M(ν̄) independent trees to this forest, say (T̄^{(k)}, k = 1, . . . , M(ν̄)), such that if ν̄ = Σ_a ς̄_a δ_a then exactly ς̄_a of these trees are distributed according to T_a. We thus get a larger forest, say F̄ = F(ν) ∪ {T̄^{(k)}, k = 1, . . . , M(ν̄)}, and by exploring this forest with successive iterations of Φ we get a new process X̄ on the same probability space as X. By construction and thanks to Theorem 3.1, this process is a version of X under P_{ν+ν̄}.
Note moreover that, as mentioned previously, we are only interested in X before time D_0. In particular, we can truncate the trees T^{(k)} and T̄^{(k)} by removing all the nodes that have a label ≤ 0. It is thus convenient to consider the operator B_0 : T → T that removes all the nodes of a tree T ∈ T with label ≤ 0.
If ǭ_k, ḡ_1, m̄_k and D̄_0 are the quantities associated to X̄ in the same way that ǫ_k, g_1, m_k and D_0 are associated to X, then using the definition of B^n_q we have Now the key observation is that on the event {max_k ψ*(B_0(T̄^{(k)})) < (a + u)n}, the two random variables (the one defined in terms of X and the one defined in terms of X̄) in the previous expectation are equal. Indeed, on this event, the excursions above a with height > u for X and X̄ coincide. In particular, since the random variables under consideration are bounded, we obtain Recall that the trees T̄^{(k)} are independent. Further, ψ*(B_0(T_y)) ≤ ψ*(T_y), and ψ*(T_y) is (stochastically) increasing in y, so that using the union bound we get, for any ν̄ ∈ M_F with π(ν̄) < an, In view of (5.20) and the discussion preceding it, we therefore get The supremum over n ≥ 1 of the expectation in the above right-hand side is finite by (A.16), and since P(ψ*(T_0) > un) → 0 as n → +∞, the result is proved.

5.5. Fifth step: π • X under P is a reflected Brownian motion. At this point, we know that N is a σ-finite measure on E that satisfies the following properties: which by Lemma 5.1 is a Brownian motion with no drift reflected at 0); II) N(ǫ is not continuous) = 0 (since π • X under P is almost surely continuous).
In particular, N induces a σ-finite measure Θ on the set of compact real trees via the usual coding of a compact real tree by a continuous excursion with finite length, see for instance Le Gall and Miermont [26,Section 3].
Further, let y > 0 and let ǫ_1 be the first excursion of π • X away from 0 with height > y. Then by Lemmas 5.7 and 5.8, we have P_n(sup ǫ_1 > x) → P(sup ǫ_1 > x) for all x outside a countable set, where this latter quantity is equal to N(sup ǫ > x | sup ǫ > y) by definition of N. On the other hand, P_n(sup ǫ_1 > x) = P(ψ*(B(T_1)) > xn | ψ*(B(T_1)) > yn), which, for any 0 < y < x, converges to y/x by Lemma A.3. Thus for all x > y outside a countable set we have N(sup ǫ > x | sup ǫ > y) = y/x, from which one deduces that N(sup ǫ > x) = c/x for every x > 0 and some finite constant c > 0 (this constant will be fixed shortly). Thus N satisfies the following additional properties: III) N(sup ǫ = 0) = 0 (by definition of an excursion measure); VI) N satisfies the regenerative property (R) (by the previous step).
Properties III)–V) above immediately translate into Θ having infinite mass, Θ(H = 0) = 0 and Θ(H > x) ∈ (0, ∞) (where H denotes the height of the canonical tree t). Moreover, the last property VI) means exactly that Θ satisfies the property (R) of Weill [34]: indeed, excursions of ǫ above level a under N correspond to the subtrees of t above level a under Θ. Finally, we see that the assumptions of Theorem 1.1 in Weill [34] are satisfied, which gives the existence of a spectrally positive Lévy process Y, with Laplace exponent Ψ satisfying ∫^{+∞}(1/Ψ) < +∞, such that Θ is the (excursion) law of the Ψ-Lévy tree. In particular, N is an excursion measure of the height process associated to Y.
We now fix the normalization constant as in Duquesne and Le Gall [11] (which amounts to choosing the constant c above), so that according to Corollary 1.4.2 in Duquesne and Le Gall [11] (remember that In other words, Y is equal in distribution to (2/c)^{1/2} W̃, with W̃ a standard Brownian motion, and the height process associated to this Lévy process is equal in distribution to (2c)^{1/2} W (to see this, consider for instance the CSBP Z associated to Y, which has branching mechanism Ψ and satisfies the SDE dZ_t = (2Z_t/c)^{1/2} dW_t, and use (20) and (21) in Pardoux and Wakolbinger [31]). Since π • X under P is equal in distribution to the height process of Y, we obtain that π • X under P is equal in distribution to (2c)^{1/2} W. The following lemma makes it possible to identify c and, more importantly, to conclude the proof of Theorem 2.1.
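To see why the quadratic branching mechanism is the right one, one can check directly, using the relation of Corollary 1.4.2 in Duquesne and Le Gall [11] between Ψ and the tail of the excursion measure of the height process, that Ψ(u) = u²/c is consistent with N(sup ǫ > x) = c/x (a sketch, with the normalization conventions left implicit):

```latex
\[
  v(x) := N\bigl(\sup \epsilon > x\bigr)
  \quad\text{is determined by}\quad
  \int_{v(x)}^{+\infty} \frac{du}{\Psi(u)} = x .
\]
\[
  \text{With } \Psi(u) = \frac{u^2}{c}:\qquad
  \int_{v(x)}^{+\infty} \frac{c\,du}{u^2} = \frac{c}{v(x)} = x
  \quad\Longrightarrow\quad
  v(x) = \frac{c}{x},
\]
```

which matches the computation above; and since a spectrally positive Lévy process σW̃ has Laplace exponent Ψ(u) = σ²u²/2, the choice σ = (2/c)^{1/2} gives precisely Ψ(u) = u²/c.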

Lemma 5.10. For any
≤ε} du and using the triangle inequality, we first obtain P_n(|L^{n,π}_t − (1/ε)∫_0^t ½{π(X_u) ≤ ε} du| ≥ η) = P_n(L^{n,π}_t ≥ ηεn/2 The first term of the above upper bound goes to 0 by Lemma 5.4, and so we need to control the second term. Rescaling leads to Let, as in the proof of Lemma 5.4, K(y) be the number of excursions of π • X away from 0 that end before time y, E_k be the time that π • X stays at 0 before the kth excursion, and V_k(y) be the time spent exploring nodes with label ≤ y in the kth ambient tree: then if π(X_{n²t}) = 0, we have If π(X_{n²t}) > 0, then the residual term, instead of being E_{K(n²t)}, is the time spent exploring nodes with label ≤ εn in the K(n²t)th ambient tree. In each case, one can show that this residual term does not contribute in the regime n → +∞ and then ε → 0 that we are interested in, and so we only have to show that For y > 0 introduce the following quantities: m(y) = E(E_1) − E(V_1(y))/y, Υ_k(y) = E_k − V_k(y)/y − m(y), σ(y)² = E(Υ(y)²), Ῡ(y) = Υ(y)/σ(y), and Then the triangle inequality gives and so Let C = sup_{y≥0}(y^{-1/2} E(K(y))), which was shown in the proof of Lemma 5.4 to be finite. Using Markov's inequality, the first term of the above upper bound is thus upper bounded by (5.22) P(K(n²t) ≥ ηn/(2|m(εn)|)) ≤ (2/η)C t^{1/2} × |m(εn)|.
Thus letting first n → +∞, then ε → 0 and finally K → +∞ completes the proof.

5.6. Last step. At this point, we know that, under P: (1) π • X is equal in distribution to (2c)^{1/2} W (by the fifth step); (2) M • X is equal in distribution to (2λ)^{1/2} Before proving this lemma, let us quickly conclude the proof of Theorem 2.1. Fix some t, y ≥ 0: we have to prove (2.2). If π(X_t) = 0, then M(X_t) = 0 by Lemma 5.11 and (2.2) holds. Otherwise, assume first that y < π(X_t) and let g be the left endpoint of the excursion of π • X above y straddling t. Then according to Corollary 5.3, we have X_t([0, y]) = X_g([0, y]). On the other hand, we have y = π(X_g) by definition of g, and so X_g([0, y]) = M(X_g), which is equal to π(X_g)/E(J) = y/E(J) by Lemma 5.11. This proves that X_t([0, y]) = y/E(J) for y < π(X_t); and since X_t([0, y]) = M(X_t) for y ≥ π(X_t), which is equal to E(J)^{-1}π(X_t) by Lemma 5.11, this proves (2.2) and concludes the proof of Theorem 2.1.
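The occupation-time approximation of the local time in Lemma 5.10 can be illustrated numerically. The sketch below (discretization parameters are arbitrary choices) computes the estimate (1/ε)∫₀¹ ½{|W_u| ≤ ε} du for two values of ε on the same simulated Brownian paths; both estimates approximate the same (suitably normalized) local time at 0, so their ratio should be close to one.

```python
import random

def occupation_estimate(eps, n_paths=50, n_steps=10_000, seed=2):
    """Average over Brownian paths of (1/eps) * int_0^1 1{|W_u| <= eps} du;
    as eps -> 0 this approximates a constant multiple of the local time
    of the reflected path at 0, as in Lemma 5.10."""
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    acc = 0.0
    for _ in range(n_paths):
        w, occ = 0.0, 0.0
        for _ in range(n_steps):
            w += rng.gauss(0.0, dt ** 0.5)
            if abs(w) <= eps:
                occ += dt
        acc += occ / eps
    return acc / n_paths

# same seed, hence same paths: the two estimates should be close to each other
a, b = occupation_estimate(0.10), occupation_estimate(0.05)
```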
Proof of Lemma 5.11. Thanks to (4.1) we can write $\pi = cL(\pi) + \tilde\pi$ and $M = \lambda L(M) + \tilde M$, where $(2c)^{-1/2}\tilde\pi$ and $(2\lambda)^{-1/2}\tilde M$ are two standard Brownian motions; in the rest of the proof, in order to ease the notation, we write $\pi$ and $M$ for $\pi\circ X$ and $M\circ X$, respectively. Moreover, $L(\pi)$ is on the one hand equal in distribution to $L((c/\lambda)^{1/2} M)$, because $\pi$ is equal in distribution to $(c/\lambda)^{1/2} M$. Since $(M_t^2 - 2\lambda t,\ t\ge0)$ and $(\pi_t^2 - 2ct,\ t\ge0)$ are also martingales, another application of the optional sampling theorem implies that the stopped process $(M_{T_a\wedge t},\ t\ge0)$ is uniformly integrable, and letting $t\to+\infty$ in (5.23), we thus obtain $\mathbb{E}(M_{T_a}) = \mathbb{E}(\pi_{T_a})/\mathbb{E}(J) = a/\mathbb{E}(J)$. On the other hand, letting $t\to+\infty$ in (5.24) and using Fatou's lemma, we obtain $\mathbb{E}(M_{T_a}^2) \le (a/\mathbb{E}(J))^2$, which implies that $M_{T_a} = a/\mathbb{E}(J)$.
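For concreteness, the second-moment step can be reconstructed as follows (a sketch under our reading of the proof, assuming $\pi_0 = M_0 = 0$ and using that $(c/\lambda)^{1/2} = \mathbb{E}(J)$, which is forced by $\pi \stackrel{d}{=} (c/\lambda)^{1/2}M$ together with $\mathbb{E}(M_{T_a}) = \mathbb{E}(\pi_{T_a})/\mathbb{E}(J)$):

```latex
\mathbb{E}\bigl(M_{T_a\wedge t}^2\bigr)
  = 2\lambda\,\mathbb{E}(T_a\wedge t)
  = \frac{2c\,\mathbb{E}(T_a\wedge t)}{\mathbb{E}(J)^2}
  = \frac{\mathbb{E}\bigl(\pi_{T_a\wedge t}^2\bigr)}{\mathbb{E}(J)^2}
  \xrightarrow[t\to+\infty]{} \frac{a^2}{\mathbb{E}(J)^2},
```

since $0 \le \pi_{T_a\wedge t} \le a$ and $\pi_{T_a} = a$. Together with Fatou's lemma and $\mathbb{E}(M_{T_a}) = a/\mathbb{E}(J)$, this yields $\operatorname{Var}(M_{T_a}) \le 0$, whence $M_{T_a} = a/\mathbb{E}(J)$ almost surely.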
A calculation similar to (5.24) shows that the stopped Brownian motions $(\tilde M_{T_a\wedge t},\ t\ge0)$ and $(\tilde\pi_{T_a\wedge t},\ t\ge0)$ are uniformly integrable. We can then apply another version of the optional sampling theorem, such as in Karatzas and Shreve. Since we have proved that $M_{T_a} = a/\mathbb{E}(J)$, the last display identifies $\tilde M_{T_a\wedge t}$; the exact same reasoning shows that the left-hand side of that display is also equal to $\tilde\pi_{T_a\wedge t}$, and so $M_{t\wedge T_a} = \pi_{t\wedge T_a}/\mathbb{E}(J)$. Letting $a\to+\infty$ completes the proof.

DISCUSSION
The main purpose of this paper was to exploit the connection between the regenerative characterization of Lévy trees of Weill [34] and the present model of the limit order book. The assumptions made on J in Theorem 2.1 correspond to the simplest interesting case where this connection can be exploited, but this result should hold under more general assumptions on J and λ. For instance, our arguments should readily extend to a triangular scheme where the rates at which orders are added to and removed from the book may be different, and the model's parameters depend on n in a suitable way. We believe that the results of Theorem 2.1 would still hold, with the limiting price process being a Brownian motion with drift reflected at 0.
A more delicate generalization consists in relaxing the assumption that $J \in \{-j^*, \ldots, 1\}$. The proofs of most results go through in this more general case, but the main problem is that, for a general random variable $J$, the successive excursions above level $a$ are no longer i.i.d., which invalidates Lemma 2.2. However, the dependence between successive excursions lies in the overshoot of the price above $a$ at the beginning of each excursion above $a$, and so, under suitable moment assumptions on $J$, this dependence should be washed out in the limit.
Further, different boundary conditions can also be considered. In Lakner et al. [23] and Simatos [33], for instance, orders can be placed on the negative half-line. In Lakner et al. [23] there is the additional constraint that the number of orders cannot fall below some level, say $\varepsilon n$; this is meant to model the presence of a market maker.
In the presence of such a market maker, Theorem 2.1 remains valid and the proofs go through. Indeed, imagine that $\varepsilon n$ orders initially sit at $0$. Since these orders can only be displaced when the price is at $0$ and, while the price is at $0$, the number of orders evolves according to a critical random walk, the price process needs to accumulate of the order of $n^2$ units of local time at $0$ in order to go through this initial stack of orders. Lemma 5.1 shows that this takes of the order of $n^4$ units of time, and so, on the time scale that we are interested in, this does not happen. Pushing this reasoning a bit further actually shows that Theorem 2.1 should remain valid as long as the initial number of orders, say $m_n$, diverges to $+\infty$. Indeed, in this case, after accumulating $m_n^2$ units of local time at $0$ these orders will only have moved by a constant distance, and it would thus take $n m_n^2$ units of local time at $0$, i.e., about $n^2 m_n^4 \gg n^2$ units of normal time, to move them by a distance of the order of $n$.
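The time-scale arithmetic in this argument can be summarized as follows (using, as above, the Lemma 5.1 correspondence that accumulating $\ell$ units of local time at $0$ takes of the order of $\ell^2$ units of time):

```latex
\underbrace{m_n^2}_{\substack{\text{local time to move}\\ \text{the stack by } O(1)}}
\;\leadsto\;
\underbrace{n\,m_n^2}_{\substack{\text{local time to move}\\ \text{the stack by } O(n)}}
\;\leadsto\;
\underbrace{(n\,m_n^2)^2 = n^2 m_n^4 \gg n^2}_{\text{units of time}}.
```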
On the other hand, when orders can be placed on the negative half-line and there is no market maker, we conjecture that the price process should converge to a Brownian motion (without reflection), say $W$, and the measure-valued process should converge to the process having constant density $1/\mathbb{E}(J)$ with respect to Lebesgue measure restricted to $[I_t, W_t]$, where $I_t = \inf_{[0,t]} W$. The key observation is indeed that if, in this "free" case, one reflects the measure-valued process by taking $I^\pi$, the past infimum of the price process, as the origin of space and collapsing all the orders below $I^\pi$ onto $I^\pi$, then one obtains precisely the model studied here. Thus the only thing left to prove would be that $I^\pi$ converges to the local time at $0$ of the reflected price process.
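To make the dynamics concrete, here is a minimal toy simulation of the free (unreflected) model discussed above. All names are ours, the boundary handling (never letting the book empty out) is an arbitrary choice, and no claim is made that the finite-size output matches the conjectured $1/\mathbb{E}(J)$ density beyond heuristics.

```python
import random

def simulate_free_book(steps, jumps, seed=0):
    """Toy sketch of the one-sided book: at each step, with equal
    probability either the rightmost order is removed, or a new order
    is placed at (rightmost + J) with J drawn uniformly from `jumps`.
    Returns the final book and the price (rightmost order) path."""
    rng = random.Random(seed)
    book = [0]               # positions of the orders; start with one order at 0
    price_path = []
    for _ in range(steps):
        p = max(book)
        if rng.random() < 0.5 and len(book) > 1:
            book.remove(p)   # cancel the rightmost order (keep the book non-empty)
        else:
            book.append(p + rng.choice(jumps))  # new order at distance J from the price
        price_path.append(max(book))
    return book, price_path

# J uniform on {-1, 1, 1}: E(J) = 1/3, conjectured density 3 orders per unit.
book, path = simulate_free_book(50_000, [-1, 1, 1], seed=1)
```

On such a run one can compare the empirical number of orders per unit length between the running infimum and the current price with $1/\mathbb{E}(J) = 3$; at finite size the agreement is only heuristic.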
Let us finally mention that we have focused here on the case $\mathbb{E}(J) > 0$. When $\mathbb{E}(J) < 0$, under minor moment assumptions on $J$ the probability $\mathbb{P}(\psi^*(B(T_1)) > u)$ decays exponentially fast, since for this to happen one needs the supremum of a random walk with negative drift to be large (see for instance Theorem 2 in Addario-Berry and Broutin [2]). This is in sharp contrast with the polynomial decay proved in Lemma A.3 when $\mathbb{E}(J) > 0$, and it implies, when $\mathbb{E}(J) < 0$, that $\pi\circ X$ under $\mathbb{P}_n$ converges weakly to $0$ (since one would need to see an exponential number of excursions before seeing a macroscopic one). Note that the case $\mathbb{E}(J) < 0$ with a different boundary condition (see the discussion above) has been studied in Lakner et al. [23] via a completely different approach. The fact that $\pi\circ X$ under $\mathbb{P}_n$ converges to $0$ means, in terms of the free process studied in [23], that the limiting price process is increasing (see Proposition 9.12 there).
To conclude, we note that the case $\mathbb{E}(J) = 0$, which is in some sense the true critical case where both the offspring and displacement distributions of $T_1$ are critical, remains open.

APPENDIX A. RESULTS ON A BRANCHING RANDOM WALK WITH A BARRIER
We prove in this section the various results on $B(T_1)$ that have been used in the proof of Theorem 2.1. These results may also be of independent interest; see for instance Durrett et al. [12] and Kesten [21], where closely related results are proved for $T_1$.
Note that we consider the case of a geometric offspring distribution but, except for the second estimate in (A.3), this does not play a role in the proofs. Similarly, except for some explicit expressions such as (A.4), the following results would hold for any random variable $J$ with $\mathbb{E}(J) > 0$ and a suitable moment assumption, e.g., $\mathbb{E}(e^{\varrho J}) < +\infty$ for some $\varrho > 0$.
A.1. Preliminary results. Let in the sequel $Z_m = \sum_{v\in T_1} \mathbb{1}_{\{|v|=m\}}$ for $m \ge 1$ be the number of nodes at depth $m$ in $T_1$, so that $(Z_m, m\ge1)$ is a Galton–Watson branching process with geometric offspring distribution with parameter $1/2$, $h(T_1)$ is its extinction time and $|T_1|$ is its total progeny. By induction one easily obtains
\[
\operatorname{Cov}(Z_m, Z_n) = 2\,(m\wedge n), \qquad m, n \ge 1. \tag{A.1}
\]
Moreover, it is well known that there exists a finite constant $C_S > 0$ such that
\[
\mathbb{P}(|T_1| \ge u) \underset{u\to+\infty}{\sim} C_S/u^{1/2}
\quad\text{and}\quad
\mathbb{P}(h(T_1) \ge u) \underset{u\to+\infty}{\sim} 1/u, \tag{A.2}
\]
see for instance [3, Theorem 23], where these estimates are established for any finite-variance Galton–Watson process. Most of the time, upper and lower bounds will be enough, and we will for instance often write $1/(Cu^{1/2}) \le \mathbb{P}(|T_1| \ge u) \le C/u^{1/2}$ and $1/(Cu) \le \mathbb{P}(h(T_1) \ge u) \le C/u$.
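For later use, recall the standard second-moment computation for a critical Galton–Watson process (a sketch: $\sigma^2$ denotes the offspring variance, equal to $2$ for the geometric law with parameter $1/2$, and a single ancestor is placed at generation $0$). Conditioning on $Z_m$ gives

```latex
\mathbb{E}(Z_{m+1}\mid Z_m) = Z_m,
\qquad
\operatorname{Var}(Z_{m+1})
  = \mathbb{E}\bigl(\operatorname{Var}(Z_{m+1}\mid Z_m)\bigr)
    + \operatorname{Var}\bigl(\mathbb{E}(Z_{m+1}\mid Z_m)\bigr)
  = \sigma^2\,\mathbb{E}(Z_m) + \operatorname{Var}(Z_m),
```

so that $\mathbb{E}(Z_m) = 1$ and $\operatorname{Var}(Z_m) = m\sigma^2 = 2m$; likewise, for $n \ge m$, $\mathbb{E}(Z_n \mid Z_m) = Z_m$ gives $\operatorname{Cov}(Z_m, Z_n) = \operatorname{Var}(Z_{m\wedge n})$.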
We will also need the existence of a finite constant $C > 0$ such that for every $u, m \ge 1$,
\[
\mathbb{E}\bigl(Z_m \,\big|\, |T_1| \ge u\bigr) \le C m
\quad\text{and}\quad
\mathbb{E}\bigl(Z_m \,\big|\, h(T_1) \ge u\bigr) \le C m. \tag{A.3}
\]
The first bound can be found in, e.g., [18, Theorem 1.13], where it is proved for any finite-variance Galton–Watson process. The second bound is very natural in view of the first one, since trees conditioned on having a large size or a large height are known to have the same scaling limits, but we could not find a precise reference for it. When the critical offspring distribution is geometric we can nevertheless give a simple proof of this fact.
Proof of the second bound in (A.3). Let $C = (C(t), t \ge 0)$ be the contour process of the corresponding Galton–Watson tree, let $L^C_m = \sum_{t\ge0} \mathbb{1}_{\{C(t)=m\}}$ be its number of visits to $m$, and let $T^C_m = \min\{t > 0 : C(t) = m\}$. Note that, because of our convention that the root has height one, $C$ starts at one and ends at zero, and so the two events $\{h(T_1) \ge u\}$ and $\{T^C_u < T^C_0\}$ coincide.
Since each node of the tree is visited at least once by $C$, we have $Z_m \le L^C_m$, and so we only need to prove that $\mathbb{E}(L^C_m \mid T^C_u < T^C_0) \le Cm$. On the event $\{T^C_u < T^C_0\}$ we can write $L^C_m = 1 + G_1 + 1 + G_2$, where $1 + G_1$, resp. $1 + G_2$, accounts for the number of visits to $m$ before, resp. after, time $T^C_u$. The key observation is that, since the offspring distribution is geometric, $C$ is a simple random walk: see for instance [26, Proposition 2.6]. Let us look at the case $m < u$: the fact that $C$ is a simple random walk implies that, in the above decomposition $L^C_m = 1 + G_1 + 1 + G_2$, $G_i$ is a geometric random variable with success probability $p_i$, where the subscript in $\mathbb{P}_m$ refers to the initial state of the random walk $C$. Since $C$ is a simple random walk we have $p_2 = 1 - 1/(2m)$, and the desired bound follows. The case $m \ge u$ follows similarly.
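The key observation — the contour of a Galton–Watson tree with geometric($1/2$) offspring is a simple random walk — can be illustrated with a short simulation (a sketch with our own naming; the code only checks the structural $\pm1$-step property, while the geometric offspring law is what makes the steps i.i.d. fair coin flips):

```python
import random

def geometric_offspring(rng):
    """P(k children) = 2^{-(k+1)} for k = 0, 1, 2, ...: the critical
    geometric offspring law with parameter 1/2."""
    k = 0
    while rng.random() < 0.5:
        k += 1
    return k

def contour_process(rng, max_len=10**5):
    """Heights visited by a depth-first traversal of the tree.  With the
    convention that the root has height one, the contour starts at 1 and,
    if the tree is finite, ends at 0."""
    heights = [1]
    stack = [(1, geometric_offspring(rng))]  # (height, children left to visit)
    while stack and len(heights) < max_len:
        h, todo = stack[-1]
        if todo > 0:
            stack[-1] = (h, todo - 1)
            stack.append((h + 1, geometric_offspring(rng)))
            heights.append(h + 1)            # step up to the next child
        else:
            stack.pop()
            heights.append(h - 1)            # step back down to the parent
    return heights

C = contour_process(random.Random(7))
assert all(abs(b - a) == 1 for a, b in zip(C, C[1:]))  # +-1 steps, as for a simple walk
```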
Let in the rest of this section $S = (S_m, m \ge 0)$ be a random walk started at $0$ with step distribution $J$, independent of $T_1$, and let $\underline{S}_m = \min_{0\le k\le m} S_k$.
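Since $\mathbb{E}(J) > 0$, the walk drifts to $+\infty$ and its lower deviations decay exponentially; the standard Chernoff bound behind the exponential estimates used below can be reconstructed as: for $\kappa, s \ge 0$,

```latex
\mathbb{P}(S_m \le -s)
 = \mathbb{P}\bigl(e^{-\kappa S_m} \ge e^{\kappa s}\bigr)
 \le e^{-\kappa s}\,\bigl(\mathbb{E}(e^{-\kappa J})\bigr)^m,
```

and, since $\mathbb{E}(J) > 0$, one can choose $\kappa > 0$ small enough that $\mathbb{E}(e^{-\kappa J}) < 1$.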
Lemma A.1. We have $\mathbb{E}(|K(T_1)|^2) < +\infty$ and (A.4).

Proof. To compute the mean number of killed nodes, we write the number of killed nodes as a sum over depths. Taking the mean and using the fact that labels are independent from the genealogical structure, we obtain $\mathbb{E}(|K(T_1)|)$ as a series over $m$. Since the genealogical structure $Z$ of $T_1$ is a critical Galton–Watson process, we have $\mathbb{E}(Z_m) = 1$, which gives $\mathbb{E}(|K(T_1)|) = \mathbb{P}(\underline{S}_\infty < 0)$. Since $J \in \{-j^*, -j^*+1, \ldots, 0, 1\}$, $S$ is, in the terminology of Brown et al. [7], a skip-free (to the right) random walk with positive drift. In particular, Corollary 1 in this reference implies that $\mathbb{P}(\underline{S}_\infty \ge 0) = \mathbb{E}(J)/\mathbb{P}(J = 1)$, which gives (A.4).
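Our reading of the first-moment computation: a node at depth $m$ is killed when the label process along its ancestral line first goes below $0$ at step $m$, so that, since labels are independent of the genealogy and $\mathbb{E}(Z_m) = 1$ (with $\underline{S}_0 = 0$),

```latex
\mathbb{E}\bigl(|K(T_1)|\bigr)
 = \sum_{m\ge1} \mathbb{E}(Z_m)\,
     \mathbb{P}\bigl(\underline{S}_{m-1}\ge 0,\ S_m < 0\bigr)
 = \sum_{m\ge1} \mathbb{P}\bigl(\underline{S}_{m-1}\ge 0,\ S_m < 0\bigr)
 = \mathbb{P}\bigl(\underline{S}_\infty < 0\bigr).
```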
As for the second moment, we define $v \wedge v'$, for $v, v' \in T_1$, as the most recent common ancestor of $v$ and $v'$, and we write $|K(T_1)|^2 = |K(T_1)| + \Sigma$, so that we only have to prove that $\mathbb{E}(\Sigma) < +\infty$, where $\Sigma$ is a sum over distinct pairs of killed nodes and $g(i, s) = \mathbb{P}(\underline{S}_{i-1} \ge -s,\ S_i < -s)$ for $i \ge 1$ and $s \in \mathbb{N}$; we define $g(0, s) = 0$. Since, by the branching property, the subtrees rooted at the nodes at depth $M$ in the tree are i.i.d. and independent of the number $Z_M$ of nodes at depth $M$, we may decompose $\Sigma$ along the most recent common ancestor; here $Z(w, i)$ denotes the number of nodes at depth $i$ in the subtree of $T$ rooted at $w \in T$. Taking the mean, and noting that the number of distinct pairs of children of the root is equal in distribution to $Z_1(Z_1 - 1)$, which has mean $2$, we obtain a bound in terms of $g$. By definition, $g(m, s) = \mathbb{P}(\underline{S}_{m-1} \ge -s,\ S_m < -s)$, so that a Chernoff bound applies for any $\kappa > 0$. Since $\mathbb{E}(J) > 0$, we can choose $\kappa > 0$ such that $\beta = \mathbb{E}(e^{-2\kappa J}) < 1$; the resulting two sums are then finite, which proves that $|K(T_1)|$ has a finite second moment.

Lemma A.2. As $u \to +\infty$, we have $u\,\mathbb{P}(h(B(T_1)) \ge u) \to \mathbb{E}(J)/\mathbb{P}(J = 1)$.
Proof. Define for simplicity $\kappa = |K(T_1)|$, let $(v^B_k,\ k = 1, \ldots, \kappa)$ be the $\kappa$ killed nodes of $T_1$, and let $(T^{(k)},\ k = 1, \ldots, \kappa)$ be the subtrees attached to them. Next, we observe that, conditionally on $B(T_1)$, the $(h(T^{(k)}),\ k = 1, \ldots, \kappa)$ are i.i.d. with common distribution that of $h(T_1)$. Thus, defining $H(u) = \mathbb{P}(h(T_1) \ge u)$, we obtain an expression for $\mathbb{P}(h(B(T_1)) \ge u)$ in terms of $Y(u)$. It follows from (A.2) that the random variable $uY(u)\mathbb{1}_{\{h(B(T_1))<u\}}$ converges almost surely, as $u \to +\infty$, to $\kappa$. If we had uniform integrability, then we would obtain the desired limit, which would prove the result by (A.4). Thus it remains to show that the family of random variables $(uY(u)\mathbb{1}_{\{h(B(T_1))<u\}},\ u \ge 0)$ is uniformly integrable. On the event $\{V^B \ge u/2\}$, we have $N \ge 1$, where $N$ is the number of nodes $v \in T_1$ that satisfy $|v| \ge u/2$ and $\psi(v, T_1) \le 0$. Using Markov's inequality, together with the bound $1 - (1-x)^y \le xy$ for $y \ge 0$, we obtain an upper bound with two terms. Since the probability $\mathbb{P}(S_m \le 0)$ decays exponentially fast as $m \to +\infty$ by (A.5), the first term of this upper bound is bounded in $u$. The second term being also bounded in $u$ by (A.2) and Lemma A.1, the proof is complete.

Lemma A.3. As $u \to +\infty$, we have $u\,\mathbb{P}(\psi^*(B(T_1)) \ge u) \to (\mathbb{E}(J))^2/\mathbb{P}(J = 1)$.
We show that $\mathbb{E}(\psi^{12})$ is finite. Upper bounding the supremum by the sum, we obtain a bound in terms of the $(Y_i)$, i.i.d. centered random variables distributed as $J - \mathbb{E}(J)$; to derive it, we used the independence, in $T_1$, between the genealogical structure and the labels. The central limit theorem implies that $|Y_1 + \cdots + Y_m|/m^{1/2}$ converges weakly and, since the $Y_k$'s are bounded, all the moments of this random variable are bounded uniformly in $m$, so that by uniform integrability we can write $\mathbb{E}(|Y_1 + \cdots + Y_m|^{12}) \le C m^6$ for all $m \ge 1$ and some finite constant $C$ independent of $m$. This gives $\mathbb{E}(\psi^{12}) \le C \sum_{m\ge1} m^{-2}$, which is finite.
Starting from the lower bound in (A.6) a corresponding lower bound can be proved using the same arguments, which completes the proof.
A.2. Various results. We now provide the proofs of the various results that have been used in the proof of Theorem 2.1.

Results needed in the proof of Lemma 5.6. To complete the proof of Lemma 5.6, we must show that there exists a finite constant $C > 0$ such that (A.10) holds for every $p \ge 1$ and every $\kappa, u > 0$. Note that (A.11) implies (A.10) by summation over $p$, so we only need to prove (A.11).
Conditioning on the genealogical structure leads, as before, to
\[
\mathbb{E}\bigl(N_p \,\big|\, |T_1| > u/2\bigr) = \sum_{m\ge1} \mathbb{E}\bigl(Z_m \,\big|\, |T_1| > u/2\bigr)\,\mathbb{P}(S_m = p),
\]
and combining the two previous displays with (A.3), we end up with the desired bound. Since $S$ visits the value $p$ a geometric number of times, at times of order $p/\mathbb{E}(J)$, the term $\mathbb{E}\bigl(\sum_{m\ge1} m\,\mathbb{1}_{\{S_m = p\}}\bigr)$ is of the order of $p$ when $p$ grows large, which concludes the proof.
Results needed in the proof of Lemma 5.7. To complete the proof of Lemma 5.7, we need to prove the two following results. From there, (A.15) can be proved by repeating verbatim the proof of (A.11), with the following caveat: one needs to replace the conditioning on $|T_1|$ by a conditioning on $h(T_1)$, and thus to use the second bound in (A.3) instead of the first one.
Conditioning on the genealogical structure $(Z_m, m \ge 1)$, we get the first result.

Result needed in the proof of Lemma 5.10. To complete the proof of Lemma 5.10, we need to prove that the constant $C^*$ defined there is finite. Let $N(y)$ be the number of nodes in $B(T_1)$ with label $\le y$: going back to the definition of $C^*$, we see that we have to prove that $\sup_y\bigl(\operatorname{Var}(S(N(y)))/y^3\bigr)$ is finite, where $\operatorname{Var}(Y)$ denotes the variance of a real-valued random variable $Y$. Thanks to (3.2), we only have to show that $\sup_y\bigl(\operatorname{Var}(N(y))/y^3\bigr)$ is finite. Further, using the same estimates as in the proof of Lemma A.3, we can show that $N(y)$ behaves like the number of nodes in $T_1$ at depth $\le y/\mathbb{E}(J)$, so that in particular $\operatorname{Var}(N(y))$ is of the order of $\operatorname{Var}(Z_1 + \cdots + Z_{\lfloor y/\mathbb{E}(J)\rfloor})$. Thus, in order to prove that $C^* < +\infty$, we only have to prove that $\operatorname{Var}(Z_1 + \cdots + Z_n)$ grows at most like $n^3$. Let $v_n = \operatorname{Cov}(Z_{n+1}, Z_1 + \cdots + Z_n)$, so that (A.1) gives $v_n = 2n + v_{n-1}$; in particular, $v_n$ grows quadratically. On the other hand, we have $\operatorname{Var}(Z_1 + \cdots + Z_{n+1}) = 2(n+1) + 2v_n + \operatorname{Var}(Z_1 + \cdots + Z_n)$, and since $v_n$ grows quadratically in $n$, this implies that $\operatorname{Var}(Z_1 + \cdots + Z_n)$ grows like $n^3$, which proves the result.
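As a sanity check of this last step, the recursion can be iterated numerically (a sketch: `var_partial_sums` is our name, the base case $\operatorname{Var}(Z_1) = 2$ is the convention consistent with the recursion in the text, and the closed form $n(n+1)(2n+1)/3$ follows from the same recursion by a direct induction):

```python
def var_partial_sums(n_max):
    """Iterate v_n = 2n + v_{n-1} (v_0 = 0) together with
    W_{n+1} = 2(n+1) + 2*v_n + W_n (W_1 = Var(Z_1) = 2),
    where W_n stands for Var(Z_1 + ... + Z_n)."""
    W = [0, 2]          # W[0] unused; W[1] = Var(Z_1) = 2
    v = 0               # v_0 = 0
    for n in range(1, n_max):
        v = 2 * n + v                           # v_n = 2n + v_{n-1}
        W.append(2 * (n + 1) + 2 * v + W[n])    # W_{n+1}
    return W

W = var_partial_sums(50)
# Cubic growth, with the exact closed form W_n = n(n+1)(2n+1)/3:
assert all(W[n] == n * (n + 1) * (2 * n + 1) // 3 for n in range(1, 51))
```

In particular $W_n \sim \tfrac{2}{3}n^3$, confirming the $n^3$ growth used above.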