Exact asymptotics for a multi-timescale model, with applications in modeling overdispersed customer streams

In this paper we study the probability $\xi_n(u):={\mathbb P}\left(C_n\geqslant u n \right)$, with $C_n:=A(\psi_n B(\varphi_n))$ for L\'{e}vy processes $A(\cdot)$ and $B(\cdot)$, and $\varphi_n$ and $\psi_n$ non-negative sequences such that $\varphi_n \psi_n =n$ and $\varphi_n\to\infty$ as $n\to\infty$. Two timescale regimes are distinguished: a `fast' regime in which $\varphi_n$ is superlinear and a `slow' regime in which $\varphi_n$ is sublinear. We provide the exact asymptotics of $\xi_n(u)$ (as $n\to\infty$) for both regimes, relying on change-of-measure arguments in combination with Edgeworth-type estimates. The asymptotics have an unconventional form: the exponent contains the commonly observed linear term, but may also contain sublinear terms (the number of which depends on the precise form of $\varphi_n$ and $\psi_n$). To showcase the power of our results we include two examples, covering both the case where $C_n$ is lattice and non-lattice. Finally we present numerical experiments that demonstrate the importance of taking into account the doubly stochastic nature of $C_n$ in a practical application related to customer streams in service systems; they show that the asymptotic results obtained yield highly accurate approximations, also in scenarios in which there is no pronounced timescale separation.


Introduction, preliminaries, notation, literature
Consider a scalar Lévy process A(·), and (independently of A(·)) an increasing scalar Lévy process B(·). These are uniquely characterized by their characteristic exponents, defined as α(ϑ) := log E e^{ϑA(1)} and β(ϑ) := log E e^{ϑB(1)}, which are in fact the logarithmic moment generating functions of A(1) and B(1). Targeting large-deviations asymptotics, we impose the assumption that both characteristic exponents are finite in an open neighborhood of the origin, implying that all moments of A(t) and B(t) exist. Let ϕ n and ψ n be non-negative sequences such that ϕ n ψ n = n and ϕ n → ∞ as n → ∞. Our objective is to find the exact asymptotics of ξ n (u) := P(A(ψ n B(ϕ n )) ⩾ un), i.e., we wish to find a sequence f n such that ξ n (u)/f n → 1 as n → ∞. We assume u > ab, with a := E A(1) > 0 and b := E B(1) > 0, so that the event under consideration becomes increasingly rare as n → ∞. One special case has been studied in detail: the choice ϕ n = n and ψ n = 1 reduces ξ n (u) to P(A(B(n)) ⩾ un), whose exact asymptotics follow from [2] (using that A(B(·)) is a Lévy process). To the best of our knowledge, other cases have not been analyzed in the literature.
Effect of multiple timescales. To get some intuition for the behavior of ξ n (u), we write it as ξ n (u) = P(A(n · B(ϕ n )/ϕ n ) ⩾ un).
We proceed by explaining that there are two timescale regimes. (i) If ϕ n is superlinear, it is anticipated that B(ϕ n )/ϕ n is close to b, such that ξ n (u) resembles P(A(bn) ⩾ un). We refer to this setting as the 'fast regime', as the fluctuations of the process B(·) are so fast that it can be replaced by its mean value. (ii) If on the contrary ϕ n is sublinear (which we will refer to as the 'slow regime', for obvious reasons), then one may expect that the event of interest roughly looks like a · nB(ϕ n )/ϕ n ⩾ un, and therefore ξ n (u) essentially behaves as P(aB(ϕ n ) ⩾ uϕ n ). The objective of this paper is to make these claims precise. Our main contribution is that we succeed in identifying the exact asymptotics of ξ n (u) as n → ∞. These turn out to have a non-standard form, in the sense that the exponent, in addition to the linear term that also appears in the classical asymptotics [10], may also contain sublinear terms.
To further investigate the two timescale regimes identified above, it is instructive to calculate the variance of C n := A(ψ n B(ϕ n )). To this end, we first express the log-moment generating function (l-mgf) γ n (·) of C n in terms of α(·) and β(·). It requires a direct computation to verify that γ n (ϑ) = ϕ n β(α(ϑ)ψ n ).
Then it is direct that E C n = nα′(0)β′(0) = nab and
Var C n = γ″ n (0) = nψ n (α′(0))² β″(0) + nα″(0)β′(0) = nψ n σ²− + nσ²+,  (1)
with σ²− := a²β″(0) and σ²+ := α″(0)b. In this decomposition of the variance, the aforementioned regime dichotomy is nicely reflected, as can be seen as follows. If ϕ n is superlinear (and hence ψ n vanishes), then the first term on the right-hand side of (1) is small relative to the second term, and Var C n essentially behaves as nσ²+ (which is also the variance of A(bn)). On the other hand, if ϕ n grows sublinearly then so does ψ n , so that in this case the first term of (1) will dominate; as a consequence, Var C n behaves as nψ n σ²− (which equals the variance of anB(ϕ n )/ϕ n ). The above intuition can be translated into a central limit theorem by following a classical approach. In the fast regime (in which ϕ n is superlinear), direct computations reveal that the characteristic function of D n := (C n − abn)/(√n σ +) converges to that of a standard Normal random variable. Hence, as a direct application of Lévy's convergence theorem, D n converges in distribution to a standard Normal random variable. Likewise, in the slow regime (in which ϕ n is sublinear), (C n − abn)/(ψ n √ϕ n σ −) converges in distribution to a standard Normal random variable. We conclude that the dichotomy described above also manifests itself in a central limit context.
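The variance decomposition (1) can be checked numerically. The following Python sketch computes the mean and variance of C n by finite differences of γ n (·) for an illustrative Poisson/Gamma choice of A(·) and B(·); the parameter values λ, r, µ and the choice ϕ n = n^{3/2} are assumptions of the illustration, not part of the model.

```python
# Numerical check of Var C_n = n*psi_n*sigma_minus^2 + n*sigma_plus^2 (decomposition (1)),
# with A(.) Poisson with rate lam (alpha(t) = lam*(e^t - 1)) and B(.) a Gamma process
# with shape r and rate mu (beta(t) = r*log(mu/(mu - t))); all parameters illustrative.
import math

lam, r, mu = 1.0, 2.0, 3.0
a, b = lam, r / mu                       # a = E A(1), b = E B(1)
alpha_pp0, beta_pp0 = lam, r / mu**2     # alpha''(0), beta''(0)

n = 1000
phi_n = n**1.5                           # fast regime: phi_n superlinear
psi_n = n / phi_n

def gamma_n(t):
    """l-mgf of C_n = A(psi_n * B(phi_n)): gamma_n(t) = phi_n * beta(alpha(t) * psi_n)."""
    return phi_n * r * math.log(mu / (mu - lam * (math.exp(t) - 1.0) * psi_n))

h = 1e-3
mean_num = (gamma_n(h) - gamma_n(-h)) / (2 * h)                  # ~ gamma_n'(0) = n*a*b
var_num = (gamma_n(h) - 2 * gamma_n(0.0) + gamma_n(-h)) / h**2   # ~ gamma_n''(0)
var_formula = n * psi_n * a**2 * beta_pp0 + n * alpha_pp0 * b    # right-hand side of (1)

print(mean_num, n * a * b)    # means agree
print(var_num, var_formula)   # variances agree
```

The same check can be repeated with a sublinear ϕ n , in which case the first term of the decomposition dominates.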
Large deviations, contributions. In this paper we assess to what extent the above dichotomy carries over to a large-deviations setting. Based on the above reasoning, it is tempting to believe the following conjecture: in the fast regime ξ n (u)/P(A(bn) ⩾ un) → 1 as n → ∞, and in the slow regime ξ n (u)/P(aB(ϕ n ) ⩾ uϕ n ) → 1 as n → ∞. Our study, however, shows that this conjecture does not hold in general. In more detail, denoting by k n ∼ ℓ n that k n /ℓ n → 1 as n → ∞, the main result is that we find explicit sequences λ +,n and λ −,n such that in the fast regime, as n → ∞, ξ n (u) ∼ P(A(bn) ⩾ un) λ +,n , and in the slow regime, as n → ∞, ξ n (u) ∼ P(aB(ϕ n ) ⩾ uϕ n ) λ −,n .
Here, λ +,n and λ −,n do not necessarily equal 1, thus refuting the above conjecture. Note that hereby the exact asymptotics of ξ n (u) are fully identified, as sequences κ +,n and κ −,n such that P(A(bn) ⩾ un)/κ +,n → 1 and P(aB(ϕ n ) ⩾ uϕ n )/κ −,n → 1 are given in [2]. (As an aside, we mention that in specific situations λ +,n and λ −,n do equal 1, so that in those situations the conjecture applies; we return to this below.) The resulting exact asymptotics of ξ n (u) can be written as the product of a polynomial and an exponential part. In the fast regime, the polynomial part is inversely proportional to √n, as was found before in various related settings (such as the one studied in [2]). The exponential part, however, has a rather unusual shape: not only does the exponent contain the commonly observed term that is linear in n, in addition it contains finitely many sublinear terms (the number of which depends on the specific form of ϕ n and ψ n ). In the slow regime similar results apply, but with the role of n taken over by ϕ n , meaning that the polynomial part is inversely proportional to √ϕ n and the exponent is the sum of a term that is linear in ϕ n and finitely many terms that are o(ϕ n ).
An immediate consequence of our result is that the conjecture we stated above is true if the timescales of both Lévy processes are sufficiently separated; then there are no sublinear terms in the exponent (where 'sublinear' means o(n) in the fast regime and o(ϕ n ) in the slow regime). More specifically, the conjecture holds in the fast regime if nψ n → 0 (i.e., then λ +,n = 1), and in the slow regime if ϕ n /ψ n → 0 (i.e., then λ −,n = 1). It is further remarked that on a logarithmic scale the conjecture always holds, albeit in the following (weaker) sense: we show that in the fast regime n⁻¹ log ξ n (u) − n⁻¹ log P(A(bn) ⩾ un) → 0, whereas in the slow regime ϕ n ⁻¹ log ξ n (u) − ϕ n ⁻¹ log P(aB(ϕ n ) ⩾ uϕ n ) → 0.

Example 1. The leading example, which we will visit several times in this paper, is that of ϕ n ∼ n^f and ψ n ∼ n^{1−f} for some f > 0. The fast regime corresponds to f > 1 and the slow regime to f ∈ (0, 1). The above criterion yields that λ +,n = 1 if f > 2 and λ −,n = 1 if f ∈ (0, ½). ♦

Remark 1. We mentioned before that the case ϕ n = n can be dealt with in a straightforward way. For the sake of completeness, we include this argumentation here. In this case γ n (ϑ) = nβ(α(ϑ)), which effectively means that C n can be written as the sum of n i.i.d. random variables, bringing us directly into the setting of [2]. With ϑ ⋆ solving β′(α(ϑ))α′(ϑ) = u, one then obtains the exact asymptotics directly from [2]. ♦

We proceed with a few words on the approach followed. In the large-deviations analysis a crucial role is played by the 'twisting factor', i.e., the solution ϑ = ϑ n of the equation un = γ′ n (ϑ), or, equivalently,
β′(α(ϑ)ψ n ) α′(ϑ) = u.  (2)
As u > ab, it follows that ϑ n is positive. In the fast regime, ϑ n is close to the solution ϑ ⋆ of u = bα′(ϑ ⋆ ); in the slow regime ϑ n resembles τ ⋆ /ψ n , where τ ⋆ solves u = β′(aτ ⋆ )a. Our proofs rely on a change of measure via an exponential twist that is based on the solution of (2).
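The twisting equation (2) generally has no closed-form solution, but it is straightforward to solve numerically. The sketch below does so by bisection for an illustrative Poisson/Gamma parametrization (rates λ, r, µ and the level u are assumptions of the illustration), and shows that in the fast regime the solution approaches ϑ ⋆ as ψ n → 0.

```python
# Solving the twisting equation beta'(alpha(theta)*psi_n) * alpha'(theta) = u by
# bisection, for the illustrative pair alpha(t) = lam*(e^t - 1) (Poisson) and
# beta(t) = r*log(mu/(mu - t)) (Gamma); here beta'(x) = r/(mu - x).
import math

lam, r, mu, u = 1.0, 2.0, 3.0, 1.0   # u > a*b = 2/3, so the event is rare
alpha   = lambda t: lam * (math.exp(t) - 1.0)
alpha_p = lambda t: lam * math.exp(t)
beta_p  = lambda x: r / (mu - x)     # valid for x < mu

def twist(psi_n, lo=0.0, hi=1.0, iters=100):
    """Solve beta'(alpha(theta)*psi_n) * alpha'(theta) = u on [lo, hi]."""
    f = lambda t: beta_p(alpha(t) * psi_n) * alpha_p(t) - u
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

theta_star = math.log(u * mu / (r * lam))   # solves b * alpha'(theta) = u
for n in [10**2, 10**4, 10**6]:
    psi_n = n ** (-0.5)                     # fast regime (phi_n = n^{3/2})
    print(n, twist(psi_n), theta_star)      # twist(psi_n) approaches theta_star
```

The same routine applies in the slow regime, where ψ n ϑ n approaches τ ⋆ instead.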
By and large, we follow the proof presented in [10] that underlies the Bahadur–Rao result [2]: first we capture the exponential part of ξ n (u) by applying a change of measure, after which the polynomial part of ξ n (u) is identified using delicate Berry–Esséen-based calculations (that are considerably more involved than the ones needed in the 'classical' Bahadur–Rao case). In line with the proof in [10], the case where the underlying Lévy processes are lattice has to be dealt with separately.
Literature and motivation. Our work fits in the tradition of large-deviations asymptotics of sample-mean related quantities in the light-tailed setting. Without attempting to provide an exhaustive overview, we give a number of key references. In the classical framework, S n is defined as the sum of i.i.d. random variables X 1 , . . . , X n , where the X i are assumed to have a finite moment generating function in a neighborhood of the origin. The exceedance probability χ n (u) := P(S n ⩾ un) is the object of study, with u > E X 1 . Cramér [9] characterized a function I(u) such that n⁻¹ log χ n (u) → −I(u) as n → ∞; this type of result is commonly referred to as logarithmic asymptotics. The proof technique relied on change-of-measure argumentation that has been applied extensively since then.
Cramér's seminal work was extended in several directions. In [8] a uniform upper bound on χ n (u), generally known as the Chernoff bound, was derived. We refer to [3] for a generalization of Cramér's logarithmic asymptotics to infinite-dimensional topological vector spaces. Cramér's theorem was also extended to sums of dependent random variables [20] and vectors [11,13].
Whereas the above results focus on logarithmic asymptotics, another strand of research addresses exact asymptotics. In this respect we mention the pioneering paper [2] showing that, under the conditions of Cramér's result, χ n (u) e^{nI(u)} √n converges to a positive constant as n → ∞; in [6] a similar result had already been derived for the case that the X i are lattice. We refer to e.g. [7] for exact asymptotics in the vector-valued case, and to [17] for a uniform framework covering both the CLT regime and large deviations.
Our investigations were motivated by models recently suggested in order to incorporate overdispersion. The reason for developing such models was the observation that in various types of service systems [5,21] the customer arrival process is intrinsically more variable than the traditionally used Poisson process. An approach to overcome the lack of overdispersion was proposed in [16]: extra variability is produced by periodically resampling the Poisson arrival rate. As a consequence, when resampling every unit of time, the number of arrivals in [0, m], denoted by C(m) (for m ∈ N), is Poisson distributed with (random) parameter Λ 1 + · · · + Λ m , where the Λ i are i.i.d. non-negative random variables. Assuming for simplicity that the Λ i are integer-valued as well, this means that
C(m) =_d X 1 + · · · + X_{Λ 1 + · · · + Λ m },  (3)
with X j a sequence of i.i.d. Poisson random variables with parameter 1. We have thus introduced overdispersion, as desired; more specifically, by the law of total variance, Var C(m) = m EΛ + m Var Λ > m EΛ = E C(m), making the process more variable than the Poisson process. Now observe that the 'two-timescale random walk' (3) we introduced can be seen as the discrete counterpart of the two-timescale Lévy processes considered in the present paper. In [16] the above two-timescale random walk model is studied under certain scalings (comparable to the scalings with ϕ n and ψ n that are imposed in the present paper). The results in [16] include logarithmic asymptotics, which are for special cases refined to exact asymptotics in [15].
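The overdispersion produced by the resampling mechanism is easily seen in simulation. The sketch below takes the Λ i exponential (as in Section 5) and uses Knuth's Poisson sampler; the parameter values, the sampler, and the sample sizes are all assumptions of the illustration.

```python
# Monte Carlo illustration of overdispersion under rate resampling: the sample
# variance of C(m) clearly exceeds the sample mean (for a Poisson process the
# two would coincide). Lambda_i ~ exponential with mean 1/mu; parameters illustrative.
import math
import random

def poisson(rate, rng):
    """Knuth's Poisson sampler (adequate for moderate rates)."""
    L = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(7)
mu, m, N = 1.0, 50, 20000
samples = []
for _ in range(N):
    rate = sum(rng.expovariate(mu) for _ in range(m))  # Lambda_1 + ... + Lambda_m
    samples.append(poisson(rate, rng))                  # C(m) given the resampled rates

mean_hat = sum(samples) / N
var_hat = sum((x - mean_hat)**2 for x in samples) / (N - 1)
print(mean_hat, var_hat)   # law of total variance: Var = m*(E L + Var L) > m*E L = mean
```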
Organization. The fast regime (in which ϕ n grows superlinearly) is covered by Section 2, followed by the slow regime (in which ϕ n grows sublinearly) in Section 3. Examples are presented in Section 4. Section 5 describes how our findings can be applied when modeling overdispersed customer streams.

Fast regime
In this section we analyze the case that ϕ n is superlinear, entailing that ψ n → 0 as n → ∞; this corresponds to f > 1 in the context of Example 1. The approach followed echoes the proof of e.g. [10, Thm. 3.7.4], and comprises the following elements:
• We first identify in Section 2.1 the 'twisting factor' ϑ n (i.e., the solution of (2)). Twisting C n by ϑ n leads to a random variable with mean un. More concretely, we define a measure Q n through (in self-evident notation) Q n (C n ∈ dx) = P(C n ∈ dx) e^{ϑ n x − γ n (ϑ n )}, for which it indeed holds that ∫ x Q n (C n ∈ dx) = un.
• Then we rewrite in Section 2.2 the probability ξ n (u) using the ϑ n -twisted version of C n . As in [1, Ch. XIII], we can express ξ n (u) as a mean under Q n : ξ n (u) = E_{Q n }[L(C n ) 1{C n ⩾ un}], (4) for an appropriately chosen L(·) (which can be interpreted as a likelihood ratio).
• The right-hand side of (4) turns out to consist of an exponential part and a polynomial part (in n, that is). Section 2.3 analyzes the exponential part. As opposed to the standard Bahadur–Rao result [10, Thm. 3.7.4], the exponent potentially also contains sublinear terms.
• The polynomial part in the right-hand side of (4) is technically the most demanding element of the proof, and relies on Berry–Esséen-based arguments; see Section 2.4.
• In Section 2.5 all elements are put together, and our result is stated.
The formal assumption we impose on ψ n (Assumption 1) entails that there is an ε > 0 such that ψ n < n^{−ε} , and hence ϕ n > n^{1+ε} ; in particular, ϕ n is superlinear.
In the main theorem of this section, i.e., Thm. 1, we assume that A(·) is non-lattice. To establish the result for the case where A(·) is lattice (covering e.g. Poisson processes), a minor adaptation needs to be made. More specifically, the steps of Sections 2.1 up to 2.3 apply to the lattice as well as the non-lattice case, whereas in Section 2.4 slightly different bounds have to be used in the lattice case (fully analogously to the treatment of the lattice and non-lattice cases in the proof of [10, Thm. 3.7.4]). The result for the lattice case is discussed separately in Remark 2. We refer to Sections 4 and 5 for illustrative examples covering both the lattice and the non-lattice case.
2.1. Analysis of twisting factor. As ψ n → 0 as n → ∞, the twisting factor ϑ n converges to the solution ϑ ⋆ of bα′(ϑ) = u, where b = E B(1) = β′(0). To establish the exact asymptotics of ξ n (u), it turns out that we need an expansion of ϑ n ; we start by arguing that ϑ n has the form
ϑ n = ϑ ⋆ + v 1 ψ n + v 2 ψ n ² + · · · ,  (5)
and then we point out how to determine the coefficients v k .
• The main idea is that one can (implicitly) define a function ϑ + (x) as the solution of the equation
β′(α(ϑ)x) α′(ϑ) = u;
such a solution is unique due to the fact that the left-hand side of the previous display is, as a function of ϑ, the derivative of a convex function (hence increasing). Now observe that ϑ n (solving γ′ n (ϑ n ) = un) equals ϑ + (ψ n ); recall that ψ n → 0 as n → ∞. Observe that ϑ + (0) = ϑ ⋆ (with ϑ ⋆ , as defined above, solving bα′(ϑ ⋆ ) = u). Now write down the Taylor expansion (at x = 0) of the implicitly defined function ϑ + (x): ϑ + (x) = ϑ ⋆ + v 1 x + v 2 x² + · · · . The v k can be determined in terms of the derivatives of α(·) and β(·) (which are well-defined). Below we provide a constructive procedure that identifies all these coefficients. Upon combining the above elements, we thus obtain the expansion (5).
• We now present a constructive procedure that yields the coefficients v k . To this end, insert the expansion of ϑ n into the previous display, evaluate α(·) and α′(·) as Taylor series around ϑ ⋆ and β′(·) as a Taylor series around 0, and group together the appropriate terms (i.e., terms with the same power of ψ n ); this effectively determines all v k . We demonstrate this procedure for v 1 . Up to terms that are o(ψ n ),
β′(α(ϑ n )ψ n ) α′(ϑ n ) = u + (β″(0) α(ϑ ⋆ ) α′(ϑ ⋆ ) + b α″(ϑ ⋆ ) v 1 ) ψ n = u.
Now v 1 can be determined:
v 1 = − β″(0) α(ϑ ⋆ ) α′(ϑ ⋆ ) / (b α″(ϑ ⋆ )).
The v k for k ∈ {2, 3, . . .} can be computed in the exact same way, but this does not lead to clean expressions.
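The first-order behavior ϑ n ≈ ϑ ⋆ + v 1 ψ n can be checked numerically, with v 1 = −β″(0)α(ϑ ⋆ )α′(ϑ ⋆ )/(bα″(ϑ ⋆ )) as obtained from the procedure above. The sketch below does so for the illustrative Poisson/Gamma pair; all parameter values are assumptions of the illustration.

```python
# Check that (theta_plus(x) - theta_star)/x -> v_1 as x -> 0, for the illustrative
# pair alpha(t) = lam*(e^t - 1), beta(t) = r*log(mu/(mu - t)); note that for this
# alpha, alpha'(t) = alpha''(t) = lam*e^t.
import math

lam, r, mu, u = 1.0, 2.0, 3.0, 1.0
b = r / mu

def theta_plus(x, lo=0.0, hi=1.0, iters=200):
    """Solve beta'(alpha(theta)*x) * alpha'(theta) = u by bisection."""
    f = lambda t: (r / (mu - lam * (math.exp(t) - 1.0) * x)) * lam * math.exp(t) - u
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

theta_star = math.log(u * mu / (r * lam))        # solves b * alpha'(theta) = u
alpha_star = lam * (math.exp(theta_star) - 1.0)  # alpha(theta_star)
# v_1 = -beta''(0)*alpha(theta_star)*alpha'(theta_star) / (b*alpha''(theta_star)):
v1 = -(r / mu**2) * alpha_star * (lam * math.exp(theta_star)) / (b * lam * math.exp(theta_star))

x = 1e-4
slope = (theta_plus(x) - theta_star) / x         # ~ v_1 for small x
print(slope, v1)
```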

2.2. Change of measure. The next step is to rewrite the probability of interest, relying on the usual change-of-measure procedure. This effectively means that we let Q n correspond to twisting the distribution of C n by the solution ϑ n of (2). The l-mgf of C n under Q n can thus be expressed in terms of the l-mgf of C n under the original measure: γ^{Q n } n (ϑ) := γ n (ϑ + ϑ n ) − γ n (ϑ n ). Later on we need the mean and variance of C n under Q n . From the definition of ϑ n , it follows that E_{Q n } C n = γ′ n (ϑ n ) = un. Differentiating once more, we obtain Var_{Q n } C n = γ″ n (ϑ n ). The next step is to use the change of measure to rewrite ξ n (u). By applying the definition of Q n , we find the identity ξ n (u) = E_{Q n }[e^{γ n (ϑ n ) − ϑ n C n } 1{C n ⩾ un}], realizing that e^{γ n (ϑ n ) − ϑ n C n } plays the role of the likelihood ratio dP/dQ n . On the event {C n ⩾ un}, C n will typically be relatively close to un under the new measure Q n (recall E_{Q n } C n = un). To exploit this, we define, with (σ^Q +)² := b α″(ϑ ⋆ ),
D n := (C n − un)/(√n σ^Q +),
which is a random variable that has, by construction, mean 0 under Q n . (Due to the √n in the denominator, it is anticipated that, under Q n , D n can be approximated by a standard Normal random variable; we come back to this idea below.) We thus obtain that, for all n, ξ n (u) = e^{γ n (ϑ n ) − ϑ n un} ∆ n , with ∆ n := E_{Q n }[e^{−ϑ n √n σ^Q + D n } 1{D n ⩾ 0}]. Consequently, we are left with analyzing the exponential factor δ n := exp(γ n (ϑ n ) − ϑ n un) and the expectation ∆ n , as n grows large.
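The change-of-measure identity used here is a finite-n identity, not an approximation; this can be verified directly. The sketch below does so for a single Poisson random variable (an illustration outside the two-timescale model): twisting Poisson(ν) by ϑ again yields a Poisson law, with parameter ν e^ϑ, and all parameter values are assumptions of the illustration.

```python
# Sanity check of the identity P(X >= k) = E_Q[ e^{gamma(theta) - theta*X} 1{X >= k} ],
# with X ~ Poisson(nu), gamma(theta) = nu*(e^theta - 1) its l-mgf, and Q the
# theta-twisted law, which is Poisson(nu * e^theta).
import math

nu, theta, k = 5.0, 0.7, 12
gamma = nu * (math.exp(theta) - 1.0)     # l-mgf of Poisson(nu) at theta
nu_q = nu * math.exp(theta)              # parameter of the twisted law

def pois_pmf(m, x):
    # pmf evaluated in log-space to avoid overflow for large x
    return math.exp(-m + x * math.log(m) - math.lgamma(x + 1))

direct = sum(pois_pmf(nu, x) for x in range(k, 200))
twisted = sum(math.exp(gamma - theta * x) * pois_pmf(nu_q, x) for x in range(k, 200))
print(direct, twisted)   # the two evaluations coincide term by term
```

The point of the twist is that the event {X ⩾ k}, rare under the original law, is central under Q, which is what makes the representation of ∆ n amenable to analysis.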
2.3. Analysis of δ n as n → ∞. In Lemma 1 we show that the exponent log δ n = γ n (ϑ n ) − ϑ n un consists of the linear term (bα(ϑ ⋆ ) − ϑ ⋆ u)n plus finitely many sublinear terms, the number of which is governed by an integer m + determined by ϕ n and ψ n ; the empty sum (arising for m + = 1) is defined as 0.
The validity of this claim (as well as the procedure to determine the coefficients) can be demonstrated as follows. Relying on Taylor expansions, we expand γ n (ϑ n ) − ϑ n un into terms proportional to ϕ n ψ n ^k for k = 1, 2, . . . . The claim for m + = 1 can be directly verified. Now consider the case m + = 2; any higher value of m + can be dealt with fully analogously. For m + = 2, the sequence ϕ n ψ n ^k converges to 0 when k > 2, whereas ϕ n ψ n ² = nψ n stays away from 0; think of e.g. ψ n = n^{−2/3} . Hence we obtain that, as n → ∞, only the terms with k ∈ {1, 2} contribute to the exponent. It is clear that for m + = 3, an additional term needs to be included. In general, using this procedure any value of m + can be dealt with.
2.4. Analysis of ∆ n as n → ∞. We are left with analyzing ∆ n for n large. Our objective is to prove that √n ∆ n converges to a positive constant as n → ∞. Mimicking the line of reasoning of [10, Eqn. (3.7.7)], we apply integration by parts:
∆ n = ∫_0^∞ ϑ n √n σ^Q + e^{−ϑ n √n σ^Q + x} Q n (D n ∈ [0, x]) dx.
As we will intensively rely on this representation of ∆ n , an important role in our argumentation is played by the probability distribution x ↦ Q n (D n ⩽ x), where it is noted that E_{Q n } A(ψ n B(ϕ n )) = un. Estimates for this distribution can be established, essentially as variants of the classical Edgeworth expansion. As such expansions follow by applying well-established techniques, we restrict ourselves to providing the main steps of the derivation in the appendix. An excellent introduction to the Edgeworth expansion is provided in [14, Ch. II]; see also [4, Ch. IV].
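The integration-by-parts identity underlying this representation can be illustrated in isolation. The sketch below verifies E[e^{−cD} 1{D ⩾ 0}] = ∫_0^∞ c e^{−cx} P(0 ⩽ D ⩽ x) dx for D standard Normal (in which case the left-hand side equals e^{c²/2}(1 − Φ(c))); the value of c and the quadrature scheme are assumptions of the illustration.

```python
# Numerical illustration of the integration-by-parts identity used for Delta_n:
# E[e^{-c D} 1{D >= 0}] = int_0^inf c e^{-c x} P(0 <= D <= x) dx,
# with D standard Normal, so that the closed form is e^{c^2/2} * (1 - Phi(c)).
import math

c = 2.0
Phi = lambda x: 0.5 * math.erfc(-x / math.sqrt(2.0))   # standard Normal cdf

# trapezoidal rule on [0, 20]; the integrand is negligible beyond that
h, T = 1e-3, 20.0
xs = [i * h for i in range(int(T / h) + 1)]
f = lambda x: c * math.exp(-c * x) * (Phi(x) - Phi(0.0))
numeric = h * (0.5 * f(xs[0]) + sum(f(x) for x in xs[1:-1]) + 0.5 * f(xs[-1]))

closed = math.exp(c * c / 2.0) * (1.0 - Phi(c))
print(numeric, closed)   # both evaluate E[e^{-2D} 1{D >= 0}]
```

In the proof, the Normal distribution function is in turn replaced by its Edgeworth-corrected counterpart, which is what produces the error terms handled below.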
• In the case where lim_{n→∞} ψ n √n = 0 (which corresponds to f > 3/2 in the context of Example 1), we have, as pointed out in the appendix (cf. Eqn. (30)), an Edgeworth-type estimate for the distribution of D n under Q n whose correction term features H 2 (x) = x² − 1; here φ(·) denotes the probability density function of a standard Normal random variable, and Φ(·) the corresponding cumulative distribution function. We are now in a position to prove that √n ∆ n converges to a constant; we provide the upper bound, the lower bound following fully analogously. By virtue of (8), for any ε > 0 and n sufficiently large, √n ∆ n can be bounded from above by three terms. The first term, using the known limit x(1 − Φ(x))/φ(x) → 1 as x → ∞, in combination with ϑ n → ϑ ⋆ , converges to
1/(ϑ ⋆ σ^Q + √(2π)).  (9)
Using that H 2 (x) = x² − 1, the second term can be written as the sum of two terms t +,n and t̄ +,n ; as the integrands involved are bounded, applying the dominated convergence theorem yields that t +,n → 0 and t̄ +,n → 0. The third term equals ε, which can be made arbitrarily small. By the same token it can be verified that the corresponding lower bound applies as well. We thus conclude that √n ∆ n converges to the constant in (9).
• We now proceed with the case lim inf_{n→∞} ψ n √n > 0 (which corresponds to f ∈ (1, 3/2] in the context of Example 1). From Eqn. (31) in the appendix we obtain an analogous estimate, now with a correction term featuring H 1 (x) = x. By (10), for any ε > 0 and n sufficiently large, using that H 1 (x) = x and hence H 1 (0) = 0, √n ∆ n can be bounded from above by four terms. The first, third, and fourth term can be dealt with as in the case ψ n √n → 0; as φ(·) is bounded, the second term vanishes. It follows that again √n ∆ n converges to the constant in (9). We have thus established the following lemma.

2.5. Result. Upon combining Lemmas 1 and 2, as presented in the previous subsections, we arrive at the following result.
An immediate consequence of this theorem is that ξ n (u) behaves as P(A(bn) ⩾ un) when ϕ n ψ n ² = nψ n → 0. This is for instance the case when ψ n = n^ζ for ζ < −1, or, in the setting of Example 1, when f > 2. It reflects that the timescale of B(·) is so much faster than that of A(·) that B(·) can be replaced by its mean. In addition, it implies that the rough (logarithmic) asymptotics are not affected by the choice of ψ n (as long as Assumption 1 is fulfilled). These findings are summarized in the following corollary.
Remark 2. So far we have throughout assumed that the Lévy process A(·) is non-lattice, thus ruling out e.g. Poisson processes. With a minor adaptation, however, the lattice case can be dealt with as well. Let, for some x 0 and d and any t ⩾ 0, the random variable d⁻¹(A(t) − x 0 ) be integer almost surely, and let d be the largest number with this property. Then, under Assumption 1, the statement of Thm. 1 holds with the factor 1/ϑ ⋆ replaced by d/(1 − e^{−ϑ ⋆ d}), as n → ∞. This behavior is consistent with that of Thm. 1 when taking d ↓ 0. ♦

Slow regime
In this section we analyze the case that ϕ n grows sublinearly, implying that ψ n → ∞ as n → ∞ (also sublinearly). As before, we first characterize the solution ϑ n of (2), then we rewrite ξ n (u) using the ϑ n -twisted version of C n , and finally we determine the corresponding exact asymptotics. The formal assumption we impose on ψ n in this section (Assumption 2) consists of two inequalities. The first inequality ensures that there is an ε ∈ (0, 1) such that ψ n > n^ε , and hence ϕ n < n^{1−ε} , so that ϕ n is sublinear. In addition, the second inequality entails that ψ n is sublinear, too. For the moment we assume that B(·) is non-lattice; see Remark 3 for the corresponding result in the lattice case.
The procedure we follow to prove the exact asymptotics in the slow regime is in line with the one we developed for the fast regime: we perform a change of measure, take out the exponential factor δ n , and analyze the remainder term ∆ n . As this procedure echoes the one developed in the previous section, we only include the main steps.
The change of measure is again based on the solution ϑ n of equation (2). In this case, as we argue below, ϑ n obeys the expansion
ϑ n = w 1 ψ n ⁻¹ + w 2 ψ n ⁻² + · · · ,  (11)
where the coefficients w k can be recursively determined. The reasoning is analogous to the argumentation followed in the fast regime.
• To show the validity of the expansion (11), we write ϑ n = x ϑ − (x) with x := ψ n ⁻¹, where ϑ − (x) is the solution of the equation obtained by substituting this into (2). We now write ϑ − (·) as a Taylor expansion around x = 0: ϑ − (x) = w 1 + w 2 x + w 3 x² + · · · , with w 1 = τ ⋆ . As in the fast regime, the coefficients allow expressions in terms of the derivatives of α(·) and β(·). Combining the above, we thus obtain the expansion (11).
• The procedure to identify the coefficients w k in the slow regime is analogous to the procedure to identify the coefficients v k in the fast regime. In particular, w 1 solves aβ′(aw 1 ) = u; in the sequel, we refer to w 1 as τ ⋆ .
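The leading-order behavior ϑ n ∼ τ ⋆ /ψ n can again be checked numerically. The sketch below solves (2) by bisection for the illustrative Poisson/Gamma pair of Section 2 (for which a β′(aτ) = u gives τ ⋆ = µ − r); all parameter values are assumptions of the illustration.

```python
# Numerical check that psi_n * theta_n -> tau_star in the slow regime, for the
# illustrative pair alpha(t) = e^t - 1 (lam = 1) and beta(t) = r*log(mu/(mu - t));
# with a = 1, the equation a*beta'(a*tau) = r/(mu - tau) = u yields tau_star = mu - r.
import math

lam, r, mu, u = 1.0, 2.0, 3.0, 1.0

def twist(psi, iters=200):
    """Solve beta'(alpha(theta)*psi) * alpha'(theta) = u by bisection."""
    f = lambda t: (r / (mu - lam * (math.exp(t) - 1.0) * psi)) * lam * math.exp(t) - u
    lo, hi = 0.0, math.log(1.0 + 0.99 * mu / psi)   # keeps alpha(theta)*psi < mu
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

tau_star = mu - r
for psi in [10.0, 100.0, 1000.0]:
    print(psi, psi * twist(psi), tau_star)   # psi * theta_n approaches tau_star
```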
We define (σ^Q −)² := a² β″(aτ ⋆ ); cf. the decomposition (1). For this regime we define
E n := (C n − un)/(ψ n √ϕ n σ^Q −),
which is a random variable that has, by construction, mean 0 and variance converging to 1 under Q n ; comparing E n with D n , observe that the denominator now contains ψ n √ϕ n rather than √n.
As before, it remains to analyze the exponential factor δ n := exp(γ n (ϑ n ) − ϑ n un) and the expectation ∆ n , in the regime of n growing large. The analysis of δ n precisely follows the corresponding step in the fast regime: it thus follows that, as n → ∞, the exponent consists of a term that is linear in ϕ n and finitely many sublinear terms, for certain constants (with the empty sum being defined as 0). We continue with the analysis of ∆ n . Analogously to our analysis in the fast regime, applying integration by parts, we write ∆ n as
∆ n = ∫_0^∞ ϑ n ψ n √ϕ n σ^Q − e^{−ϑ n ψ n √ϕ n σ^Q − x} Q n (E n ∈ [0, x]) dx.
We again proceed by using the Edgeworth expansion presented in the appendix.
• In the case lim_{n→∞} ϕ n ^{3/2} /n = 0 (which corresponds to f < 2/3 in the context of Example 1), we have, as pointed out in Eqn. (34) in the appendix, an Edgeworth-type estimate for the distribution of E n under Q n . Our objective is to prove that √ϕ n ∆ n converges to a constant; as in the fast regime, we provide the upper bound, the lower bound following fully analogously. By (13) and (14), for any ε > 0 and n sufficiently large, √ϕ n ∆ n can be bounded from above by three terms. The first term, using x(1 − Φ(x))/φ(x) → 1 and ψ n ϑ n → τ ⋆ , converges to
1/(τ ⋆ σ^Q − √(2π)).  (15)
As before, we split the second term (recalling that H 2 (x) = x² − 1) into two terms t −,n and t̄ −,n ; mimicking the reasoning used in the fast regime, it can be shown that t −,n → 0 and t̄ −,n → 0 as n → ∞. The third term equals ε, which can be made arbitrarily small. Combining this with the corresponding lower bound, we find that √ϕ n ∆ n converges to (15).
• Considering lim inf_{n→∞} ϕ n ^{3/2} /n > 0 (which corresponds to f ∈ [2/3, 1) in the context of Example 1), we obtain from Eqn. (35) in the appendix an analogous estimate. By (16), for any ε > 0 and n sufficiently large, using that H 1 (x) = x and hence H 1 (0) = 0, √ϕ n ∆ n can be bounded from above by four terms. The first, third, and fourth term can be dealt with as in the case ϕ n ^{3/2} /n → 0. Analogously to the reasoning used in the fast regime, the second term vanishes. The corresponding lower bound is established in the same way. We conclude that also in this case √ϕ n ∆ n converges to (15). We have thus proven the following results.

Remark 3. In the case B(·) is lattice, the result of Thm. 2 has to be slightly adjusted; cf. Remark 2. Let, for some x 0 and d and any t ⩾ 0, the random variable d⁻¹(B(t) − x 0 ) be integer almost surely, and let d be the largest number with this property. Then, under Assumption 2, we obtain a lattice counterpart of Thm. 2, with the prefactor adapted analogously to Remark 2, as n → ∞. Analogously to what we observed in the fast regime, this behavior is consistent with that of Thm. 2 when taking d ↓ 0. ♦

Remark 4. In our analysis we have assumed that a > 0; one may wonder whether our findings extend to general a. When picking a ⩽ 0, however, complications arise in the slow regime. The constant τ ⋆ , which plays a crucial role in this regime, should solve the equation u = β′(aτ ⋆ )a, but observe that β′(·) is positive (as β(·) is the characteristic exponent of a subordinator); as a consequence, u = β′(aτ ⋆ )a has no solution for a ⩽ 0. This implies that the case a ⩽ 0 has to be dealt with using an entirely different technique, in the sense that the change-of-measure approach used above cannot work. ♦

Examples
In this section we include two examples that demonstrate how the asymptotic expansion can be evaluated.

4.1. A(·) is a Poisson process and B(·) a Gamma process. Let A(·) be a Poisson process with rate λ > 0, so that α(ϑ) = λ(e^ϑ − 1) for ϑ ∈ R. Let B(·) be a Gamma process with parameters r > 0 (shape) and µ > 0 (rate), so that β(ϑ) = r log µ − r log(µ − ϑ), where we require ϑ < µ. Observe that A(t) has a Poisson distribution with parameter λt, and that B(t) has a Gamma distribution with parameters rt and µ. In particular, in the terminology of our paper, a = λ and b = r/µ. To make the event of interest rare, we assume u > ab; writing ̺ := λr/(µu), this translates into assuming ̺ < 1. Note that A(·) is lattice but B(·) is not, so that we should apply Remark 2 in the fast regime, and Thm. 2 in the slow regime.
We now distinguish between the fast regime and the slow regime. In the fast regime, in which ψ n → 0 as n → ∞, applying the Taylor expansion of the logarithm yields an expression for the coefficients v k in terms of ζ 1 := λ/µ and ζ 2 := u/r. We then compute the coefficients featuring in Lemma 1. To this end, note that inserting the first-order condition (18) into (17) yields that −r v 1 = (1 − ̺)u = bα(ϑ ⋆ ). We thus observe that δ n indeed has the form that was established in Lemma 1, with the coefficients for k ∈ {2, 3, . . .} following in the same manner.

Remark 5. It is noted that the expressions encountered in this example align with those featuring in [15, Section 3], which deals with a case in which ξ n (u) could be evaluated explicitly. It can be checked that C n has a negative binomial distribution, as follows. Observe that
γ n (ϑ) = rϕ n log( µ / (µ − λ(e^ϑ − 1)ψ n ) ).
In case a random variable has a negative binomial distribution with parameters k and p, its l-mgf is of the form k log( p / (1 − (1 − p)e^ϑ) ). It thus follows that in the setting considered in this subsection C n is negative binomial, with success probability p = µ/(µ + λψ n ) and number of successes k = rϕ n . Further intuition behind the appearance of the negative binomial distribution is provided in [15, Remark 6]. ♦
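The negative binomial identification can be verified by comparing the two l-mgfs pointwise; in the sketch below the parameter values (and the grid of evaluation points ϑ) are assumptions of the illustration.

```python
# Check of Remark 5: the l-mgf of C_n,
#   gamma_n(t) = r*phi_n * log(mu / (mu - lam*(e^t - 1)*psi_n)),
# coincides with the l-mgf k*log(p / (1 - (1-p)*e^t)) of a negative binomial law
# with k = r*phi_n and p = mu/(mu + lam*psi_n).
import math

lam, r, mu = 1.0, 2.0, 3.0
phi_n, psi_n = 100.0, 0.25
k, p = r * phi_n, mu / (mu + lam * psi_n)

diffs = []
for t in [-1.0, -0.3, 0.1, 0.5]:
    g1 = r * phi_n * math.log(mu / (mu - lam * (math.exp(t) - 1.0) * psi_n))
    g2 = k * math.log(p / (1.0 - (1.0 - p) * math.exp(t)))
    diffs.append(abs(g1 - g2))
max_diff = max(diffs)
print(max_diff)   # the two l-mgfs coincide (up to floating-point error)
```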

4.2. A(·) is a Gamma process and B(·) a Poisson process. In our second example the Gamma process and the Poisson process swap roles. In other words, α(ϑ) = r log µ − r log(µ − ϑ) for ϑ < µ, and β(ϑ) = λ(e^ϑ − 1) for ϑ ∈ R. Again, to make the event of interest rare, we assume that u > ab, or alternatively ̺ := λr/(µu) < 1 (where now a = r/µ and b = λ). In this case A(·) is non-lattice whereas B(·) is lattice, so that we have to use Thm. 1 in the fast regime, and Remark 3 in the slow regime.
Remark 6. Note that the random variable A(ψ n B(ϕ n )) is compound Poisson in this case, with Gamma distributed jumps. More precisely, the jumps are generated according to a Poisson process with rate ϕ n λ, and the jumps have a Gamma distribution with parameters rψ n and µ. ♦
In this case the l-mgf is given by
γ n (ϑ) = λϕ n ( (µ/(µ − ϑ))^{rψ n } − 1 ).  (21)
Again we have to distinguish between the fast regime and the slow regime. As before, we start with the fast regime. From (21) it follows, by solving the first-order condition, that
ϑ n = µ(1 − ̺^{η n }), with η n := 1/(1 + rψ n ),
so that γ n (ϑ n ) = λϕ n ( (1/̺) · ̺^{η n } − 1 ). To find the v k , we write ϑ n as a Taylor series in ψ n . It is directly seen that ϑ ⋆ = µ(1 − ̺); some elementary calculus thus yields for the first coefficients v 1 = −µr̺ log(1/̺) and v 2 = µr²̺ (log(1/̺)) (1 − ½ log(1/̺)); higher coefficients can be found in the same way.
Also the coefficients featuring in Lemma 1 can be found; note that the first term of the exponent equals (bα(ϑ ⋆ ) − ϑ ⋆ u)n, in accordance with the result in Lemma 1. Thm. 1, with (σ^Q +)² = u²/(λr), then yields the exact asymptotics of ξ n (u). We now proceed with the slow case. The w k follow by expanding ϑ n as a Taylor series with respect to ψ n ⁻¹ (recalling that ψ n → ∞). Routine calculations show that τ ⋆ = w 1 = (µ/r) log(1/̺), and higher coefficients follow along the same lines. The coefficients featuring in the exponent can be determined as before; leaving out a few intermediate steps, they are given by −u(w k /r + w k+1 ). Applying Remark 3, with (σ^Q −)² = (r/µ)² · λ/̺ = ru/µ, then yields the exact asymptotics of ξ n (u).
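The closed-form twisting factor ϑ n = µ(1 − ̺^{η n }) of this example can be checked numerically against the first-order condition γ′ n (ϑ n ) = un, as in the sketch below; the parameter values and the choice ϕ n = n^{3/2} are assumptions of the illustration.

```python
# Check of the Gamma/Poisson example: theta_n = mu*(1 - rho^{eta_n}) should satisfy
# gamma_n'(theta_n) = u*n for gamma_n(t) = lam*phi_n*((mu/(mu - t))^{r*psi_n} - 1),
# and gamma_n(theta_n) should equal lam*phi_n*((1/rho)*rho^{eta_n} - 1).
import math

lam, r, mu, u = 1.0, 2.0, 3.0, 1.0
rho = lam * r / (mu * u)                   # < 1, so the event is rare
n = 10_000
phi_n = n**1.5                             # fast regime (f = 3/2 in Example 1)
psi_n = n / phi_n

eta = 1.0 / (1.0 + r * psi_n)
theta_n = mu * (1.0 - rho**eta)

def gamma_n(t):
    """l-mgf of the compound Poisson variable A(psi_n * B(phi_n))."""
    return lam * phi_n * ((mu / (mu - t))**(r * psi_n) - 1.0)

h = 1e-6
deriv = (gamma_n(theta_n + h) - gamma_n(theta_n - h)) / (2.0 * h)
closed = lam * phi_n * ((1.0 / rho) * rho**eta - 1.0)   # claimed value of gamma_n(theta_n)
print(deriv, u * n)                    # first-order condition holds
print(gamma_n(theta_n), closed)        # closed form for gamma_n(theta_n) holds
```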

Applications in modeling overdispersed customer streams
In operations research, the performance evaluation and design of service systems is a key topic of interest. Traditionally, the dominant assumption when modeling customer arrival processes is that of Poisson arrivals. As we mentioned in the introduction, however, measurements indicate that the level of variation observed in practice may significantly exceed that predicted by the Poisson process [5,18,19,21]. More concretely, where for Poisson arrivals the mean and variance of the number of arrivals in a given time interval coincide, in practice one typically observes that the variance is larger than the mean; the phenomenon is typically referred to as overdispersion.
In [16] a mechanism is proposed that produces overdispersion. The main idea is that the arrival rate is random rather than deterministic; this is achieved by resampling the arrival rate periodically. The rationale behind the resampling is that there is a (perhaps unobservable) environmental process that influences all potential customers in the same way; think for instance of the weather conditions, or the occurrence of specific events. As we argued in the introduction, our two-timescale framework can be used to represent this 'resampled Poisson process'.
Renormalizing time such that the resampling intervals have length 1, considering $K$ such intervals, and letting the samples be i.i.d. non-negative random variables $\Lambda_1,\ldots,\Lambda_K$, the number of arrivals in these $K$ intervals is distributed as
\[C(K) := \sum_{i=1}^{K}\mathrm{Pois}(\Lambda_i),\]
where the Poisson random variables are conditionally independent given $\Lambda_1,\ldots,\Lambda_K$. When designing service centers (for instance when making staffing decisions), one would like to control the probability of excessive waiting times. For that reason, it is informative to have a handle on the tail distribution of the number of arrivals in a given time window.
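A minimal simulation sketch of this resampling mechanism (parameter values are illustrative) confirms the overdispersion: for exponentially distributed rates with mean $1/\mu$, the slot counts have mean $1/\mu$ but variance $1/\mu+1/\mu^2$.

```python
import numpy as np

# Sketch of the 'resampled Poisson' mechanism described above: in each of K
# unit-length slots the Poisson rate is an independent sample Lambda_i, and
# the slot count is Pois(Lambda_i). Parameter values are illustrative.
rng = np.random.default_rng(1)
K, mu = 100_000, 0.5                   # many slots; Lambda ~ Exp(mu), mean 1/mu
lam = rng.exponential(scale=1 / mu, size=K)
counts = rng.poisson(lam)

# For a mixed Poisson, Var = E[Lambda] + Var(Lambda) > E[Lambda] = mean;
# here the mean is 1/mu and the variance is 1/mu + 1/mu^2.
print(counts.mean(), counts.var())     # the variance clearly exceeds the mean
```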
The remainder of this section has two objectives. In the first place, we quantify the error made by neglecting overdispersion (i.e., by working with the situation in which the number of arrivals in the $K$ slots has a Poisson distribution with parameter $K\,\mathbb{E}\Lambda$). In the second place, we compare our refined asymptotics with the crude asymptotics that were found in [16], which involve only the dominant term in the exponent (as featuring in the second statements of Corollaries 1 and 2).

5.1. Setup. In our experiments, the following example is studied. Let $\Lambda_1,\ldots,\Lambda_K$ be i.i.d. samples from an exponential distribution with mean $1/\mu$. We are interested in using our expansions to approximate the probability
\[\Pi(K,u,\mu) := \mathbb{P}(C(K)\geqslant u)\tag{22}\]
for some $u > K/\mu$.
• Importantly, as we have seen in Remark 5, this example allows an explicit solution: here $C(K)$ has a negative binomial distribution with parameters $K$ and success probability $\mu/(1+\mu)$. We thus have, in self-evident notation,
\[\Pi(K,u,\mu) = \mathbb{P}\left(\mathrm{NegBin}\left(K,\frac{\mu}{1+\mu}\right)\geqslant u\right).\]
We chose the example of exponentially distributed $\Lambda_i$ because the explicit expressions allow us to compare the various approximations with the corresponding exact values. • When neglecting the overdispersion, one does not work with the sampled values $\Lambda_1,\ldots,\Lambda_K$, but rather with the deterministic arrival rate $\mathbb{E}\Lambda$ in every interval. In other words, ignoring the overdispersion amounts to approximating the probability $\Pi(K,u,\mu)$ by
\[\Pi^{(\mathrm{Pois})}(K,u,\mu) := \mathbb{P}\left(\mathrm{Pois}(K\,\mathbb{E}\Lambda)\geqslant u\right).\]
This approximation, corresponding with the fast regime, is expected to work well if $K$ is large relative to $u$, and the contributions of the individual $\Lambda_i$ are relatively small (i.e., $\mu$ is relatively large). • Along the same lines, one could consider the regime in which $K$ is substantial, but not of the order of $u$, and the contributions of the individual $\Lambda_i$ are relatively large (i.e., $\mu$ is relatively small). Now the slow regime is expected to apply, suggesting to ignore the Poisson sampling. Bearing in mind that the sum of exponentially distributed random variables (with equal parameter) has an Erlang distribution, we end up with (in self-evident notation)
\[\Pi^{(\mathrm{Gamma})}(K,u,\mu) := \mathbb{P}(\mathrm{Gamma}(K,\mu)\geqslant u).\]
We wish to compare the crude approximations $\Pi^{(\mathrm{Pois})}(K,u,\mu)$ and $\Pi^{(\mathrm{Gamma})}(K,u,\mu)$ with approximations based on our two-timescale model. To this end, we convert the asymptotics that we found in Section 4 into approximations. We fix a number $n>1$; this scaling parameter can be considered 'artificial' in the sense that the choice of $n$ will not affect the approximation (as will turn out below). Then we pick $f$ such that $n^f = K$ (i.e., $f = \log K/\log n > 0$), and we set $\bar u := u/n$ and $\bar\mu := \mu n^{1-f}$. With $\psi_n = n^{1-f}$, we thus obtain
\[\Pi(K,u,\mu) = \mathbb{P}\Bigl(\sum_{i=1}^{n^f}\mathrm{Pois}\bigl(\psi_n\Lambda^\bullet_i\bigr)\geqslant \bar u\,n\Bigr),\]
where $\Lambda^\bullet_i$ is exponentially distributed with parameter $\bar\mu$. We have thus rewritten the probability (22) as $\mathbb{P}(A(\psi_n B(\varphi_n))\geqslant \bar u\,n)$. Now notice that this representation falls in the framework of Section 4.1, with $\varphi_n = n^f$. More specifically, we are in the situation that $A(\cdot)$ is a Poisson process with rate $\lambda=1$, and $B(\cdot)$ a Gamma process with shape parameter $r=1$ and rate parameter $\bar\mu$.
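For concreteness, the exact probability and the two crude approximations introduced above can be evaluated with standard routines; the parameter values below are illustrative.

```python
from scipy.stats import nbinom, poisson, gamma

# Sketch of the three quantities discussed above, for exponentially
# distributed Lambda_i with mean 1/mu (illustrative parameter values).
K, u, mu = 50, 150, 0.5               # note that u > K/mu is required

# exact: C(K) is negative binomial with K 'successes', p = mu/(1+mu)
exact = nbinom.sf(u - 1, K, mu / (1 + mu))       # P(C(K) >= u)

# crude 'fast' proxy: ignore overdispersion, Poisson with mean K*E[Lambda]
pois_approx = poisson.sf(u - 1, K / mu)          # P(Pois(K/mu) >= u)

# crude 'slow' proxy: ignore the Poisson sampling, Erlang(K, mu) tail
gamma_approx = gamma.sf(u, K, scale=1 / mu)      # P(Gamma(K, mu) >= u)

print(exact, pois_approx, gamma_approx)
```

Both proxies underestimate the exact tail here, the Poisson one most severely, since each ignores one of the two sources of randomness.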
The next step is to translate the asymptotics that we identified for $\xi_n(\bar u)$, which are in terms of the parameters of the scaled model (i.e., $\bar\mu$, $f$, and $\bar u$), into approximations for $\Pi(K,u,\mu)$, which are in terms of the original parameters (i.e., $\mu$, $K$, and $u$).

5.2.
Fast regime. We start by considering the regime in which $K$ is large relative to $u$, so that it makes sense to work with the asymptotics pertaining to the case $f>1$ (i.e., the fast regime). First observe that, in terms of the original parameters,
\[\varrho = \frac{\lambda r}{\bar u\,\bar\mu} = \frac{K}{u\mu},\tag{23}\]
which we assumed to be smaller than 1; the other quantities featuring in the asymptotics translate in the same way. Based on (19), we eventually obtain the approximation $\Pi^{(\mathrm{fast})}(K,u,\mu)$ for the probability (22) stated as (25), with $\varrho$ given by (23) and the $t^+_k$ by (24). In practice one should choose the threshold $M^+$ such that adding more terms in the sum does not affect the outcome; realize that the object $m^+$ is not well-defined in the pre-limit context. Observe that, as announced, the scaling parameter $n$ does not appear in the approximation.
In [16] logarithmic asymptotics of $\xi_n(u)$ were found. These lead to (crude) approximations that only cover the dominant term in the exponent (as in the second statement of Corollary 1). More concretely, one would approximate the probability of our interest by
\[\widehat\Pi^{(\mathrm{fast})}(K,u,\mu) = \exp\left(\left(1-\varrho+\log\varrho\right)u\right).\]
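As a numerical sanity check (ours, not taken from [16]): in the original parameters $\varrho$ reduces to $K/(u\mu)$, and the crude approximation then indeed captures the exponential decay of the Poisson tail, the remaining gap being only a subexponential prefactor.

```python
import math
from scipy.stats import poisson

# Check (our own sketch, illustrative parameters): with rho = K/(u*mu) < 1,
# the crude approximation exp((1 - rho + log(rho))*u) tracks the exponential
# decay of P(Pois(K/mu) >= u); the ratio grows only polynomially in u.
K, mu = 1000, 2.0                      # Poisson mean K/mu = 500
ratios = []
for u in (600, 800, 1000):
    rho = K / (u * mu)
    crude = math.exp((1 - rho + math.log(rho)) * u)
    exact = poisson.sf(u - 1, K / mu)
    ratios.append(crude / exact)
print(ratios)                          # moderate ratios, growing slowly in u
```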

5.3.
Slow regime. Now consider the regime in which $K$ is small relative to $u$, in which case we opt for relying on the asymptotics corresponding to $f<1$. The quantity $\varrho$ and the remaining parameters translate into the original parameters as before. Based on (20), we thus obtain the approximation $\Pi^{(\mathrm{slow})}(K,u,\mu)$ for (22) stated as (27), with $\varrho$ given by (23) and the $t^-_k$ by (26). As before, the truncation level $M^-$ is chosen such that including additional terms would not have any impact on the approximation. Again, we observe that the scaling parameter $n$ does not appear in the approximation.
We also compare with the approximation that only covers the dominant term in the exponent (as in the second statement of Corollary 2, in line with the decay rate identified in [16]). This approximation reads
\[\widehat\Pi^{(\mathrm{slow})}(K,u,\mu) = \exp\left(\left(1-\frac{1}{\varrho}+\log\frac{1}{\varrho}\right)K\right).\]
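An analogous sanity check (again our own sketch) applies in the slow regime: with $\varrho = K/(u\mu)$, the crude approximation captures the exponential-in-$K$ decay of the Erlang tail, up to a prefactor growing like $\sqrt{K}$.

```python
import math
from scipy.stats import gamma

# Check (illustrative parameters): with rho = K/(u*mu) < 1 held fixed, the
# crude approximation exp((1 - 1/rho + log(1/rho))*K) tracks the
# exponential-in-K decay of the Erlang tail P(Gamma(K, mu) >= u).
mu, u_over_K = 2.0, 1.0     # keep u/K fixed so that rho = 1/(u_over_K*mu) is constant
ratios = []
for K in (50, 100, 200):
    u = u_over_K * K
    rho = K / (u * mu)
    crude = math.exp((1 - 1 / rho + math.log(1 / rho)) * K)
    exact = gamma.sf(u, K, scale=1 / mu)
    ratios.append(crude / exact)
print(ratios)               # ratios grow like sqrt(K), i.e., subexponentially
```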

5.4.
Numerical experiments. We conclude this section by presenting a set of examples that illustrate the use of the developed approximations. We subsequently consider an example that corresponds to the fast regime, and one that corresponds to the slow regime. They provide indications of the accuracy that can be achieved by the expressions (25) and (27) that were based on our refined asymptotics, relative to more crude approximations.
We start with an example corresponding to the fast regime. Recalling (25), notice that our approximation $\Pi^{(\mathrm{fast})}(K,u,\mu)$ is parameterized by the threshold $M^+$. In Table 1 below, the column with superscript 0 corresponds to neglecting the sum in the exponent of (25) altogether, which could be seen as choosing $M^+:=1$, whereas the column with superscript 1 corresponds to including one term in the sum in the exponent of (25), i.e., $M^+:=2$. Throughout this example $\varrho=\frac{1}{3}$ and $u=150$ are held fixed, which explains that the values in the third, fourth and fifth columns are constant (as can be seen from the expressions for the approximations $\Pi^{(\mathrm{Pois})}$, $\widehat\Pi^{(\mathrm{fast})}$, and $\Pi^{(\mathrm{fast},0)}$).
The following observations can be made from Table 1. (i) In the two top rows there is so much timescale separation that $\Pi^{(\mathrm{fast},0)}$ provides highly accurate results. In the other rows the timescale separation becomes less pronounced, which leads to $\Pi^{(\mathrm{fast},1)}$ becoming the preferred approximation. Here it is noted that including 2 or more terms in the exponent hardly improves the approximation in the third and fourth rows. In the last row the timescales are so poorly separated that $\Pi^{(\mathrm{fast},1)}$ is still relatively far off; we mention that the performance is significantly improved by using $M^+=3$, which leads to the highly accurate approximation $5.90\cdot10^{-6}$. (ii) As anticipated, the Poisson approximation (neglecting the overdispersion) performs well if there is a substantial degree of timescale separation, but leads to substantial underestimation if the timescales are relatively close together.
In the slow regime our approximation $\Pi^{(\mathrm{slow})}(K,u,\mu)$, as given by (27), is parameterized by the threshold $M^-$. Analogously to the notation used in Table 1, in Table 2 the column with superscript 0 corresponds to neglecting the sum in the exponent of (27), which one could identify with the choice $M^-:=0$, whereas the column with superscript 1 corresponds to including one term in the sum in the exponent of (27), i.e., $M^-:=1$. We chose scenarios in which $\varrho=\frac{2}{3}$ and $K=100$ are held fixed, implying that the values in the fourth and fifth columns are constant (as can be seen from the expressions for $\widehat\Pi^{(\mathrm{slow})}$ and $\Pi^{(\mathrm{slow},0)}$); note that in this slow regime the values in the third column (i.e., $\Pi^{(\mathrm{Gamma})}$) are not constant: they (very slowly) increase.
The conclusions from Table 2 are as follows. (i) Again, in the top rows (with a strong timescale separation) $\Pi^{(\mathrm{slow},0)}$ performs well. This approximation, however, degrades in the lower rows, where the accuracy substantially improves if $M^-=1$ (i.e., the approximation $\Pi^{(\mathrm{slow},1)}$) is used. We mention that for the parameters in the last row, the accuracy is further improved by taking $M^-=2$, yielding $1.34\cdot10^{-5}$. (ii) The Gamma approximation (assuming full separation of timescales) provides accurate results in the top row, but performs substantially worse in the bottom rows (where there is less timescale separation), as expected.

Discussion and concluding remarks
Motivated by recent developments in arrival process modeling, this paper presents tail asymptotics for a multi-timescale model. The focus has been on approximating $\xi_n(u) = \mathbb{P}(A(\psi_n B(\varphi_n))\geqslant un)$, with $A(\cdot)$ a Lévy process and $B(\cdot)$ a Lévy subordinator. Identifying the exact asymptotics turns out to require a subtle analysis. The structure found is considerably richer than in the model's classical (single-timescale) counterpart; in particular, the exponent contains additional sublinear terms.
Effect of leaving customers. In the case where $A(\cdot)$ is a Poisson process, we recover the mixed Poisson model proposed in [16], which can be thought of as a Poisson process with periodically resampled rate (so as to generate overdispersion). An interesting extension of the results developed in the present paper could concern the model in which the arrived clients leave after a service time distributed as the random variable $J$ with $F_J(t) := \mathbb{P}(J\leqslant t)$. Such a setting could be considered as an infinite-server queue with overdispersed input. Using the insights from [16], one thus obtains an expression for the number of customers $C_n$ present at time $\varphi_n$, under the scaling considered in the present paper. Using calculation rules for Lévy processes, one obtains
\[\log \mathbb{E}\,e^{\vartheta C_n} = \int_0^{\varphi_n}\beta\bigl(\alpha(\vartheta)\,\psi_n\,(1-F_J(t))\bigr)\,{\rm d}t;\]
plugging in $F_J(t)=0$ for all $t\in[0,\varphi_n]$, we recover $\varphi_n\beta(\alpha(\vartheta)\psi_n)$ (which makes sense, as for this choice customers do not leave the system). As a topic for further research, one could pursue deriving the exact asymptotics of $\mathbb{P}(C_n\geqslant un)$ (which may require scaling the service times $J$). These asymptotics can be converted into approximations, to be used as the basis of refined staffing rules.
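The l-mgf above is straightforward to evaluate numerically. The sketch below does so for an illustrative instantiation (Poisson $A(\cdot)$, Gamma $B(\cdot)$, and exponentially distributed $J$; the service rate $\nu$ and all parameter values are our own illustrative choices), and checks consistency with the no-departures case.

```python
import math
from scipy.integrate import quad

# Illustrative evaluation (our own sketch) of
#   log E exp(theta*C_n) = int_0^{phi_n} beta(alpha(theta)*psi_n*(1-F_J(t))) dt
# for A(.) Poisson(lam), B(.) Gamma(r, mu), and J ~ Exp(nu), so 1-F_J(t)=exp(-nu*t).
lam, r, mu, nu = 1.0, 1.0, 2.0, 0.1
phi_n, psi_n = 100.0, 1.0
theta = 0.05

alpha = lambda th: lam * (math.exp(th) - 1)     # Poisson characteristic exponent
beta = lambda th: -r * math.log(1 - th / mu)    # Gamma-subordinator exponent

val, _ = quad(lambda t: beta(alpha(theta) * psi_n * math.exp(-nu * t)), 0, phi_n)

# consistency check: with F_J == 0 (customers never leave) the integral
# collapses to phi_n * beta(alpha(theta) * psi_n)
no_departures = phi_n * beta(alpha(theta) * psi_n)
print(val, no_departures)    # val is smaller, as departures thin the count
```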
Follow-up research. Other directions for future work include the following. • In the first place, multivariate extensions could be considered. One could for instance consider the probability that the vector $(A_1(\psi_n B(\varphi_n)), A_2(\psi_n B(\varphi_n)))$, with $A(\cdot)=(A_1(\cdot),A_2(\cdot))$ a bivariate Lévy process, attains a value in the set $[u_1 n,\infty)\times[u_2 n,\infty)$. The components of $A(\cdot)$ could be assumed dependent, but observe that $A_1(\psi_n B(\varphi_n))$ and $A_2(\psi_n B(\varphi_n))$ are dependent anyway (as the same $B(\varphi_n)$ is used). The multivariate single-timescale results of [7] show that one should expect various cases to arise: there will be cases in which one component exceeding its threshold implies (with high probability) that the other component exceeds its threshold as well, but also cases in which both constraints are 'tight'. • In Remark 4 we pointed out that in our setup it is required to assume $a>0$, particularly in the slow regime. For $a\leqslant 0$, one may anticipate that $\xi_n(u)$ decays exponentially in $n$ (unlike in the case $a>0$, where $\xi_n(u)$ decays exponentially in $\varphi_n$), based on the following reasoning. The starting point is the large-deviations heuristic
\[\xi_n(u)\approx \max_{x>0}\,\mathbb{P}(B(\varphi_n)\approx x\varphi_n)\,\mathbb{P}(A(xn)\approx un).\tag{28}\]
The main insight is that, taking into account only the leading term of the asymptotics, the first probability on the right-hand side of (28) decays exponentially in $\varphi_n$, but the second exponentially in $n$, so that their product decays exponentially in $n$. This should be contrasted with the case $a>0$ studied in the present paper: there the maximum will be attained (roughly) at $x=u/a>0$, so that $\{A(xn)\approx un\}$ is not a rare event, and hence $\mathbb{P}(aB(\varphi_n)\approx u\varphi_n)$ dictates the tail behavior (thus leading to exponential decay in $\varphi_n$). A comparable situation has been considered in [15, Section 2.2]. We stress, however, that relative to the $a>0$ case, this $a\leqslant 0$ case has considerably lower practical relevance.
By a direct computation, we find that $\mathbb{E}^{\mathbb{Q}_n}\,e^{\vartheta D_n}$ equals $\exp(\frac12\vartheta^2)$ times a correction factor of the form $1+\frac16\,b\,\alpha'''(\vartheta^\star)\,(\cdots)$. Using the familiar inversion procedure for characteristic functions, we thus obtain, with $\phi(\cdot)$ denoting the probability density function of a standard Normal random variable,
\[\mathbb{Q}_n(D_n\in{\rm d}x) = \phi(x)\Bigl(1+H_3(x)\,\frac16\,b\,\alpha'''(\vartheta^\star)\,(\cdots)\Bigr)\,{\rm d}x,\]
with $H_k(\cdot)$ the Hermite polynomial of degree $k$. This leads to a corresponding expansion of the distribution function, with $\Phi(\cdot)$ denoting the cumulative distribution function of a standard Normal random variable. With the same reasoning as in the standard Edgeworth expansion (i.e., that for sums of i.i.d. random variables), the error term (being small relative to $1/\sqrt{n}$) is uniform in $x$.
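The role of the $H_3$ correction can be illustrated on a standard example (a standardized sum of i.i.d. exponentials, not the specific tilted measure $\mathbb{Q}_n$): including the third-cumulant term reduces the density error from order $1/\sqrt{n}$ to order $1/n$.

```python
import math
from scipy.stats import gamma, norm

# Generic illustration of the first-order Edgeworth density correction:
# for the standardized sum of n i.i.d. Exp(1) variables (third cumulant
# kappa3 = 2), the density is approximately
#   phi(x) * (1 + kappa3/(6*sqrt(n)) * H3(x)),  with H3(x) = x^3 - 3x.
n, kappa3 = 400, 2.0
H3 = lambda x: x**3 - 3 * x

errs_plain, errs_edge = [], []
for x in (-1.0, 0.5, 2.0):
    # exact density of (S_n - n)/sqrt(n), S_n ~ Gamma(n, 1)
    exact = math.sqrt(n) * gamma.pdf(n + x * math.sqrt(n), n)
    plain = norm.pdf(x)
    edge = plain * (1 + kappa3 / (6 * math.sqrt(n)) * H3(x))
    errs_plain.append(abs(plain - exact))
    errs_edge.append(abs(edge - exact))
print(errs_plain, errs_edge)   # the corrected values are markedly closer
```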
In the case where $\liminf_{n\to\infty}\psi_n\sqrt{n}>0$, inserting our expansion into (29) and using the same procedure as above yields an expansion that holds uniformly in $x$. In our setting, however, we need an error that is small relative to $1/\sqrt{n}$; this can be achieved by expanding $\Gamma_n''(0)$ further (where it can be verified that, due to Assumption 1, the contributions of the higher derivatives of $\Gamma_n$ remain bounded). Using the above type of reasoning, we conclude that there are constants $c_1,\ldots,c_{k^+}$ for which the expansion (31) holds. In the boundary case where $\psi_n\sqrt{n}$ converges to a constant we have $k^+=1$; the sum in (31) then consists of only one term.
Using the expansion of ϑ n , the right-hand side of the previous display reads