Big jobs arrive early: From critical queues to random graphs

We consider a queue to which only a finite pool of $n$ customers can arrive, at times depending on their service requirement. A customer with stochastic service requirement $S$ arrives to the queue after an exponentially distributed time with mean $S^{-\alpha}$ for some $\alpha\in[0,1]$; so larger service requirements trigger customers to join earlier. This finite-pool queue interpolates between two previously studied cases: $\alpha = 0$ gives the so-called $\Delta_{(i)}/G/1$ queue and $\alpha = 1$ is closely related to the exploration process for inhomogeneous random graphs. We consider the asymptotic regime in which the pool size $n$ grows to infinity and establish that the scaled queue-length process converges to a diffusion process with a negative quadratic drift. We leverage this asymptotic result to characterize the head start that is needed to create a long period of activity. We also describe how this first busy period of the queue gives rise to a critically connected random forest.


Introduction
This paper introduces the $\Delta^{\alpha}_{(i)}/G/1$ queue that models a situation in which only a finite pool of $n$ customers will join the queue. These $n$ customers are triggered to join the queue after independent exponential times, but the rates of their exponential clocks depend on their service requirements. When a customer requires $S$ units of service, its exponential clock rings after an exponential time with mean $S^{-\alpha}$ with $\alpha \in [0, 1]$. Depending on the value of the free parameter $\alpha$, the arrival times are i.i.d. ($\alpha = 0$) or decrease with the service requirement ($\alpha \in (0, 1]$). The queue is attended by a single server that starts working at time zero, works at unit speed, and serves the customers in order of arrival. At time zero, we allow for the possibility that $i$ of the $n$ customers have already joined the queue, waiting for service. We will take $i \ll n$, so that without loss of generality we can assume that at time zero there are still $n$ customers waiting for service. These initial customers are numbered $1, \ldots, i$ and the customers that arrive later are numbered $i + 1, i + 2, \ldots$ in order of arrival. Let $A(k)$ denote the number of customers arriving during the service time of the $k$-th customer. The busy periods of this queue will then be completely characterized by the initial number of customers $i$ and the random variables $(A(k))_{k \geq 1}$. Note that the random variables $(A(k))_{k \geq 1}$ are not i.i.d. due to the finite-pool effect and the service-dependent arrival rates. We will model and analyze this queue using the queue-length process embedded at service completions.
We consider the ∆ α (i) /G/1 queue in the large-system limit n → ∞, while imposing at the same time a heavy-traffic regime that will stimulate the occurrence of a substantial first busy period. By substantial we mean that the server can work without idling for quite a while, not only serving the initial customers but also those arriving somewhat later. For this regime we show that the embedded queue-length process converges to a Brownian motion with negative quadratic drift. For the case α = 0, referred to as the ∆ (i) /G/1 queue with i.i.d. arrivals [16,17], a similar regime was studied in [5], while for α = 1 it is closely related to the critical inhomogeneous random graph studied in [7,18].
While the queueing process consists of alternating busy periods and idle periods, in the ∆ α (i) /G/1 queue we naturally focus on the first busy period. After some time, the activity in the queue inevitably becomes negligible. The early phases of the process are therefore of primary interest, when the head start provided by the initial customers still matters and when the rate of newly arriving customers is still relatively high. The head start and strong influx together lead to a substantial first busy period, and essentially determine the relevant time of operation of the system.
We also consider the structural properties of the first busy period in terms of a random graph. Let the random variable H(i) denote the number of customers served in the first busy period, starting with i initial customers. We then associate a (directed) random graph to the queueing process as follows. Say H(i) = N and consider a graph with vertex set {1, 2, . . . , N } and in which two vertices r and s are joined by an edge if and only if the r-th customer arrives during the service time of the s-th customer. If i = 1, then the graph is a rooted tree with N labeled vertices, the root being labeled 1. If i > 1, then the graph is a forest consisting of i distinct rooted trees whose roots are labeled 1, . . . , i, respectively. The total number of vertices in the forest is N .
This random forest is exemplary for a deep relation between queues and random graphs, perhaps best explained by interpreting the embedded ∆ α (i) /G/1 queue as an exploration process, a generalization of a branching process that can account for dependent random variables (A(k)) k≥1 . Exploration processes arose in the context of random graphs as a recursive algorithm to investigate questions concerning the size and structure of the largest components [3]. For a given random graph, the exploration process declares vertices active, neutral or inactive. Initially, only one vertex is active and all others are neutral. At each time step one active vertex (e.g. the one with the smallest index) is explored, and it is declared inactive afterwards. When one vertex is explored, its neutral neighbors become active for the next time step. As time progresses, and more vertices are already explored (inactive) or discovered (active), fewer vertices are neutral. This phenomenon is known as the depletion-of-points effect and plays an important role in the scaling limit of the random graph. Let A(k) denote the neutral neighbors of the k-th explored vertex. The exploration process then has increments (A(k)) k≥1 that each have a different distribution. The exploration process encodes useful information about the underlying random graph. For example, excursions above past minima are the sizes of the connected components. The critical behavior of random graphs connected with the emergence of a giant component has received tremendous attention [2,6,7,8,9,10,18,14,15]. Interpreting active vertices as being in a queue, and vertices being explored as customers being served, we see that the exploration process and the (embedded) ∆ α (i) /G/1 queue driven by (A(k)) k≥1 are identical.
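The exploration procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the adjacency-dictionary input and the function name `explore` are hypothetical choices made here, and active vertices are explored in first-in-first-out order, which is one admissible rule among many.

```python
from collections import deque


def explore(adj):
    """Explore a graph given as {vertex: iterable of neighbours}.

    Returns (increments, sizes): increments[k-1] is A(k), the number of
    neutral neighbours found when the k-th vertex is explored, and sizes
    lists the component sizes in order of exploration.
    """
    neutral = set(adj)                 # vertices not yet discovered
    increments, sizes = [], []
    while neutral:
        root = min(neutral)            # start a new component
        neutral.remove(root)
        active = deque([root])         # discovered but not yet explored
        size = 0
        while active:
            v = active.popleft()       # v is explored and becomes inactive
            found = 0
            for u in adj[v]:
                if u in neutral:       # neutral neighbours become active
                    neutral.remove(u)
                    active.append(u)
                    found += 1
            increments.append(found)
            size += 1
        sizes.append(size)
    return increments, sizes


# A path 0-1-2 and a separate edge 3-4.
example = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
print(explore(example))
```

In this sketch the number of neutral vertices shrinks at every step, which is the depletion-of-points effect in its simplest form: later increments are drawn from an ever smaller pool.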
The analysis of the ∆ α (i) /G/1 queue and associated random forest is challenging because the random variables (A(k)) k≥1 are not i.i.d. In the case of i.i.d. (A(k)) k≥1 , there exists an even deeper connection between queues and random graphs, established via branching processes instead of exploration processes [19]. To see this, declare the initial customers in the queue to be the 0-th generation. The customers (if any) arriving during the total service time of the initial i customers form the 1-st generation, and the customers (if any) arriving during the total service time of the customers in generation t form generation t+1 for t ≥ 1. Note that the total progeny of this Galton-Watson branching process has the same distribution as the random variable H(i) in the queueing process. Through this connection, properties of branching processes can be carried over to the queueing processes and associated random graphs [11,21,22,24,25,26]. Takács [24,25,26] proved several limit theorems for the case of i.i.d. (A(k)) k≥1 , in which case the queue-length process and derivatives such as the first busy period weakly converge to (functionals of) the Brownian excursion process. In that classical line, the present paper can be viewed as an extension to exploration processes with more complicated dependency structures in (A(k)) k≥1 .
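The generation construction can be made concrete with a small sketch. The function name `generation_sizes` and the list-based input (entry `A[k-1]` is the number of arrivals during the service of the `k`-th customer) are assumptions of this illustration; the code works for any increment sequence, i.i.d. or not, and only groups customers into generations.

```python
def generation_sizes(i, A):
    """Sizes of generations 0, 1, 2, ... in the first busy period.

    i is the number of initial customers (generation 0) and A[k-1] is the
    number of arrivals during the service of the k-th customer served.
    """
    sizes = [i]
    served = 0
    while sizes[-1] > 0:
        current = sizes[-1]
        if served + current > len(A):
            raise ValueError("busy period does not end within the given increments")
        # Customers arriving during the services of the current generation
        # form the next generation.
        offspring = sum(A[served:served + current])
        served += current
        sizes.append(offspring)
    return sizes[:-1]                  # drop the final empty generation


sizes = generation_sizes(1, [2, 0, 1, 0])
print(sizes, sum(sizes))               # generation sizes and total progeny
```

The total progeny, `sum(sizes)`, equals the number of customers served in the first busy period for the same increments.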
In Section 2 we describe the ∆ α (i) /G/1 queue and associated graphs in more detail and present our main results. The proof of the main theorem, the stochastic-process limit for the queue-length process in the large-pool heavy-traffic regime, is presented in Sections 3 and 4. Section 5 discusses some interesting questions related to the ∆ α (i) /G/1 queue and associated random graphs that are left open.

Model description
We consider a sequence of queueing systems, each with a finite (but growing) number $n$ of potential customers labelled with indices $i \in [n] := \{1, \ldots, n\}$. Customers have i.i.d. service requirements with distribution $F_S(\cdot)$. We denote by $S_i$ the service requirement of customer $i$ and by $S$ a generic random variable with the same distribution $F_S(\cdot)$. In order to obtain meaningful limits as the system grows large, we scale the service speed by $n/(1 + \beta n^{-1/3})$ with $\beta \in \mathbb{R}$, so that the service time of customer $i$ is given by

$$\tilde{S}_i = \frac{S_i (1 + \beta n^{-1/3})}{n}.$$

We further assume that $E[S^{2+\alpha}] < \infty$. If the service requirement of customer $i$ is $S_i$, then, conditioned on $S_i$, its arrival time $T_i$ is assumed to be exponentially distributed with mean $1/(\lambda S_i^{\alpha})$, with $\alpha \in [0, 1]$ and $\lambda > 0$. Hence

$$T_i \stackrel{d}{=} \mathrm{Exp}_i(\lambda S_i^{\alpha}), \tag{2.2}$$

with $\stackrel{d}{=}$ denoting equality in distribution and $\mathrm{Exp}_i(c)$ an exponential random variable with mean $1/c$, independent across $i$. Note that conditionally on the service times, the arrival times are independent (but not identically distributed). We introduce $c(1), c(2), \ldots, c(n)$ as the indices of the customers in order of arrival, so that $T_{c(1)} \leq T_{c(2)} \leq \cdots \leq T_{c(n)}$.

We will study the queueing system in heavy traffic, in a similar heavy-traffic regime as in [5,4]. The initial traffic intensity $\rho_n$ is kept close to one by imposing the heavy-traffic relation (2.3), where $\lambda = \lambda_n$ can depend on $n$ and the error term $f_n$ satisfies $f_n n^{1/3} \stackrel{P}{\to} 0$ as $n \to \infty$, that is, $f_n = o_P(n^{-1/3})$. The parameter $\beta$ then determines the position of the system inside the critical window: the traffic intensity is greater than one for $\beta > 0$, so that the system is initially overloaded, while the system is initially underloaded for $\beta < 0$.
Our main object of study is the queue-length process embedded at service completions, given by $Q_n(0) = i$ and

$$Q_n(k) = (Q_n(k-1) + A_n(k) - 1)^+, \tag{2.4}$$

with $x^+ = \max\{0, x\}$ and $A_n(k)$ the number of arrivals during the $k$-th service, given by the sum (2.5), whose index set is determined by $\nu_k$, where $\nu_k \subseteq [n]$ denotes the set of customers who have been served or are in the queue at the start of the $k$-th service. Note that the cardinality $|\nu_k|$ is related to the queue length through (2.6). Given a process $t \mapsto X(t)$, we define its reflected version through the reflection map $\phi(\cdot)$ as

$$\phi(X)(t) := X(t) - \inf_{s \leq t} \min\{X(s), 0\}. \tag{2.7}$$

The process $Q_n(\cdot)$ can alternatively be represented as the reflected version of a certain process $N_n(\cdot)$, that is

$$Q_n(k) = \phi(N_n)(k), \tag{2.8}$$

where $N_n(\cdot)$ is given by $N_n(0) = i$ and

$$N_n(k) = N_n(k-1) + A_n(k) - 1. \tag{2.9}$$

We assume that whenever the server finishes processing one customer, and the queue is empty, the customer to be placed into service is chosen according to the size-biased distribution (2.10), where we tacitly assumed that customer $j$ is the $i$-th customer to be served. With definitions (2.5) and (2.10), the process (2.4) describes the $\Delta^{\alpha}_{(i)}/G/1$ queue with exponential arrivals (2.2), embedded at service completions.
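The equivalence between the embedded recursion and the reflection map can be checked numerically. The sketch below assumes the standard Lindley-type forms $Q_n(k) = (Q_n(k-1) + A_n(k) - 1)^+$ and $N_n(k) = N_n(k-1) + A_n(k) - 1$, together with the one-sided reflection $\phi(X)(t) = X(t) - \inf_{s \leq t} \min\{X(s), 0\}$; the function names and the Poisson increments are illustrative choices only and are not the increments of the model.

```python
import numpy as np


def embedded_queue(i, A):
    """Queue length after each service: Q(k) = max(Q(k-1) + A(k) - 1, 0)."""
    Q = [i]
    for a in A:
        Q.append(max(Q[-1] + int(a) - 1, 0))
    return np.array(Q)


def free_process(i, A):
    """Unreflected walk: N(0) = i, N(k) = N(k-1) + A(k) - 1."""
    steps = np.asarray(A, dtype=int) - 1
    return i + np.concatenate(([0], np.cumsum(steps)))


def reflect(N):
    """One-sided reflection: N(k) minus the running minimum of min(N, 0)."""
    running_min = np.minimum.accumulate(np.minimum(N, 0))
    return N - running_min


rng = np.random.default_rng(0)
A = rng.poisson(0.9, size=200)         # placeholder increments
same = np.array_equal(embedded_queue(3, A), reflect(free_process(3, A)))
print(same)
```

Because both constructions operate on integers, the two paths agree exactly rather than up to rounding.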
Remark 1 (A directed random tree). The embedded queueing process (2.4) and (2.8) gives rise to a certain directed rooted tree. To see this, associate a vertex $i$ to customer $i$ and let $c(1)$ be the root. Then, draw a directed edge to $c(1)$ from each of $c(2), \ldots, c(A_n(1) + 1)$, that is, from all customers who joined during the service time of $c(1)$. Then, draw an edge from all customers who joined during the service time of $c(2)$ to $c(2)$, and so on. This procedure draws a directed edge to $c(i)$ from each of $c(i + \sum_{j=1}^{i-1} A_n(j)), \ldots, c(i + \sum_{j=1}^{i} A_n(j))$ if $A_n(i) \geq 1$. The procedure stops when the queue is empty and there are no more customers to serve. When $Q_n(0) = i = 1$ (resp. $i \geq 2$), this gives a random directed rooted tree (resp. forest). The degree of vertex $c(i)$ is $1 + A_n(i)$ and the total number of vertices in the tree (resp. forest) is given by

$$H_{Q_n}(0) = \inf\{k \geq 0 : Q_n(k) = 0\}, \tag{2.11}$$

the hitting time of zero of the process $Q_n(\cdot)$.
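The construction in Remark 1 can be sketched as a parent-pointer forest. In this illustration customers are relabelled $1, 2, \ldots$ in order of service (which, under first-come-first-served, is also the order of arrival), so the labels are positions rather than the original indices $c(\cdot)$; the name `build_forest` and the list input are assumptions made here.

```python
def build_forest(i, A):
    """Parent pointers of the forest generated by the first busy period.

    Vertices 1..i are roots (parent None). A[k-1] is the number of
    customers arriving during the service of the k-th customer served;
    each of them gets an edge pointing to vertex k.
    """
    parent = {root: None for root in range(1, i + 1)}
    next_label = i + 1                 # label of the next customer to arrive
    k = 1                              # customer currently in service
    while k < next_label:              # customer k exists: the queue is non-empty
        if k > len(A):
            raise ValueError("busy period does not end within the given increments")
        for _ in range(A[k - 1]):
            parent[next_label] = k     # directed edge from the arrival to k
            next_label += 1
        k += 1
    return parent


forest = build_forest(1, [2, 0, 1, 0])
print(forest, len(forest))
```

The number of vertices, `len(forest)`, coincides with the first time the embedded queue started from `i` hits zero for the same increments.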
Remark 2 (An inhomogeneous random graph). If $\alpha = 1$, the random tree constructed in Remark 1 is distributionally equivalent to the tree spanned by the exploration process of an inhomogeneous random graph. Let us elaborate on this. An inhomogeneous random graph is a set of vertices $\{i : i \in [n]\}$ with (possibly random) weights $(W_i)_{i \in [n]}$ and edges between them. In a rank-1 inhomogeneous random graph, given $(W_i)_{i \in [n]}$, vertices $i$ and $j$ share an edge with the probability given in (2.12). The tree constructed from the $\Delta^{1}_{(i)}/G/1$ queue then corresponds to the exploration process of a rank-1 inhomogeneous random graph, defined as follows. Start with a first arbitrary vertex and reveal all its neighbors. Then the first vertex is discarded and the process moves to a neighbor of the first vertex, and reveals its neighbors. This process continues by exploring the neighbors of each revealed vertex, in order of appearance. By interpreting each vertex as a different customer, this exploration process can be coupled to a $\Delta^{1}_{(i)}/G/1$ queue, for a specific choice of $(W_i)_{i=1}^{n}$ and $\lambda_n$. Indeed, when $W_i = (1 + \beta n^{-1/3}) S_i$ for $i = 1, \ldots, n$, the probability that $i$ and $j$ are connected is given by (2.13), where $T_j$ is the exponential arrival time in (2.14) and $\lambda_n = n / \sum_{i \in [n]} S_i$. The rank-1 inhomogeneous random graph with weights $(S_i)_{i=1}^{n}$ is said to be critical (see [7, (1.13)]) when condition (2.15) holds. Consequently, if $\beta = 0$ and $\lambda_n = n / \sum_{i \in [n]} S_i$, the heavy-traffic condition (2.3) for the $\Delta^{1}_{(i)}/G/1$ queue implies the criticality condition (2.15) for the associated random graph (and vice versa).
Remark 3 (Results for the queue-length process). By definition, the embedded queue (2.4) neglects the idle time of the server. Via a time-change argument it is possible to prove that, in the limit, the (cumulative) idle time is negligible and the embedded queue is arbitrarily close to the queue-length process uniformly over compact intervals. This has been proven for the ∆ (i) /G/1 queue in [5], and the techniques developed there can be extended to the ∆ α (i) /G/1 queue without additional difficulties.
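For readers who want to experiment with the model, the following is a minimal simulation sketch of the number of customers served in the first busy period. Several choices are assumptions of this illustration and not prescribed by the text: service requirements are exponential with mean one, the arrival rate is set to $\lambda = 1/E[S^{1+\alpha}] = 1/\Gamma(2+\alpha)$, and the `q` initial customers are taken to be the `q` earliest arrivals, moved to time zero. The function name `first_busy_period` is hypothetical.

```python
from math import gamma

import numpy as np


def first_busy_period(n, alpha, beta=0.0, q=1, rng=None):
    """Number of customers served before the server first idles."""
    rng = np.random.default_rng() if rng is None else rng
    S = rng.exponential(1.0, size=n)               # service requirements
    lam = 1.0 / gamma(2.0 + alpha)                 # assumed: lam * E[S^(1+alpha)] = 1
    T = rng.exponential(1.0 / (lam * S**alpha))    # clocks with mean 1/(lam * S^alpha)
    order = np.argsort(T)                          # customers in order of arrival
    arrival = T[order]
    arrival[:q] = 0.0                              # assumed head start of q customers
    service = S[order] * (1.0 + beta * n ** (-1.0 / 3.0)) / n
    clock, served = 0.0, 0
    # First-come-first-served: the next customer is served only if it has
    # already arrived when the current service ends.
    while served < n and arrival[served] <= clock:
        clock += service[served]
        served += 1
    return served


rng = np.random.default_rng(1)
sample = [first_busy_period(2000, 0.5, q=3, rng=rng) for _ in range(5)]
print(sample)
```

Averaging such samples after multiplying by $n^{-2/3}$ gives a rough Monte Carlo estimate of the rescaled busy period; the accuracy depends on `n` and on the number of repetitions.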

The scaling limit of the embedded queue
All the processes we consider are elements of the space $D := D([0, \infty))$ of càdlàg functions, that is, functions that are continuous from the right and admit left limits. To simplify notation, for a discrete-time process $X(\cdot) : \mathbb{N} \to \mathbb{R}$, we write $X(t)$, with $t \in [0, \infty)$, instead of $X(\lfloor t \rfloor)$. Note that a process defined in this way has càdlàg paths. The space $D$ is endowed with the usual Skorokhod $J_1$ topology. We then say that a process converges in distribution in $(D, J_1)$ when its law converges weakly as a probability measure on the space $D$, when this is endowed with the $J_1$ topology. We are now able to state our main result. Recall that $Q_n(\cdot)$ is the embedded queue-length process of the $\Delta^{\alpha}_{(i)}/G/1$ queue and let

$$\mathbf{Q}_n(t) := n^{-1/3} Q_n(t n^{2/3}) \tag{2.16}$$

be the diffusion-scaled queue-length process.
By the Continuous-Mapping Theorem and Theorem 1 we have the following:

Theorem 2 (Number of customers served in the first busy period). Assume that $\alpha \in [0, 1]$, $E[S^{2+\alpha}] < \infty$ and that the heavy-traffic condition (2.3) holds. Assume further that $Q_n(0) = q$. Then, as $n \to \infty$, the number of customers served in the first busy period, $\mathrm{BP}_n$, satisfies the limit (2.19), where $W(\cdot)$ is given in (2.18).
In particular, if $|F_n|$ denotes the number of vertices in the forest constructed from the $\Delta^{\alpha}_{(i)}/G/1$ queue in Remark 1, then (2.20) holds as $n \to \infty$.

Theorem 1 implies that the typical queue length for the $\Delta^{\alpha}_{(i)}/G/1$ system in heavy traffic is $O_P(n^{1/3})$, and that the typical busy period consists of $O_P(n^{2/3})$ services. The linear drift $t \mapsto \beta\lambda t$ describes the position of the system inside the critical window. For $\beta > 0$ the system is initially overloaded and the process $W(\cdot)$ is more likely to cause a large initial excursion. For $\beta < 0$ the traffic intensity approaches 1 from below, so that the system is initially stable. Consequently, the process $W(\cdot)$ has a strong initial negative drift, so that $\phi(W)(\cdot)$ is close to zero also for small $t$. Finally, the negative quadratic drift $t \mapsto -\lambda E[S^{1+2\alpha}]\, t^2 / (2E[S^{\alpha}])$ dominates for large $t$, so that $\phi(W)(t)$ performs only small excursions away from zero. See Figure 1.
Let us now compare Theorem 1 with two known results. For $\alpha = 0$, the limit diffusion (2.18) simplifies to the one in [5, Theorem 5]. In [7] it is shown that, when $(W_i)_{i \in [n]}$ are i.i.d. and further assuming that $E[W^2]/E[W] = 1$, the exploration process of the corresponding inhomogeneous random graph converges to a Brownian motion with negative quadratic drift. For $\alpha = 1$, (2.18) can be rewritten using (2.3) in the same form. Therefore the two processes coincide if $W_i = S_i$, as expected.

Numerical results
We now use Theorem 2 to obtain numerical results for the first busy period. We shall also use the explicit expression of the probability density function of the first passage time of zero of $\phi(W)$ obtained by Martin-Löf [23], see also [14]. Let $\mathrm{Ai}(x)$ and $\mathrm{Bi}(x)$ denote the classical Airy functions (see [1]). The first passage time of zero of $W(t) = q + \beta t - t^2/2 + \sigma B(t)$ has the probability density (2.24) given in [23], where $c = (2\sigma^2)^{1/3}$ and $a = q/\sigma^2 > 0$. The result (2.24) can be extended to a diffusion with a general quadratic drift through the scaling relation $W(\tau^2 t) \stackrel{d}{=} \tau(q/\tau + \beta\tau t - \tau^3 t^2/2 + \sigma B(t))$. Figure 2 shows the empirical density of $\mathrm{BP}_n$, for increasing values of $n$ and various values of $\alpha$, together with the exact limiting value (2.24). Table 1 shows the mean busy period for different choices of $\alpha$ and different service time distributions. We computed the exact value for $n = \infty$ by numerically integrating (2.24).

Table 1: Numerical values of $n^{-2/3} E[\mathrm{BP}_n]$ for different population sizes and the exact expression for $n = \infty$ computed using (2.24). The service requirements are displayed in order of increasing coefficient of variation. In all cases $q = \beta = E[S] = 1$. The hyperexponential service times follow a rate $\lambda_1 = 0.501$ exponential distribution with probability $p_1 = 1/2$ and a rate $\lambda_2 = 250.5$ exponential distribution with probability $p_2 = 1 - p_1 = 1/2$. Each value for finite $n$ is the average of $10^4$ simulations.
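As a complement to the explicit density, the first passage time of zero of $W(t) = q + \beta t - t^2/2 + \sigma B(t)$ can also be approximated by direct simulation. The sketch below samples the Brownian path on a regular grid and reports the first grid point at which the path is non-positive. The grid step `dt`, the horizon `t_max` and the function name are illustrative assumptions, and the discretization slightly overestimates the true passage time because crossings between grid points are missed.

```python
import numpy as np


def first_passage_time(q, beta, sigma, dt=1e-3, t_max=50.0, rng=None):
    """Approximate first time W(t) = q + beta*t - t^2/2 + sigma*B(t) hits zero."""
    rng = np.random.default_rng() if rng is None else rng
    steps = int(t_max / dt)
    t = dt * np.arange(1, steps + 1)
    # Brownian motion on the grid: cumulative sum of independent N(0, dt) steps.
    B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=steps))
    W = q + beta * t - 0.5 * t**2 + sigma * B
    hit = np.flatnonzero(W <= 0.0)
    return t[hit[0]] if hit.size else np.inf


# Without noise the passage time solves q + beta*t - t^2/2 = 0.
print(first_passage_time(1.0, 1.0, 0.0), 1.0 + 3.0**0.5)

rng = np.random.default_rng(0)
samples = [first_passage_time(1.0, 1.0, 1.0, rng=rng) for _ in range(200)]
print(np.mean(samples))
```

The deterministic case $\sigma = 0$ gives a simple sanity check, since the root $\beta + \sqrt{\beta^2 + 2q}$ is known in closed form.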
Observe that $E[\mathrm{BP}_n]$ decreases with $\alpha$. This might seem counterintuitive, because the larger $\alpha$, the more likely customers with larger service join the queue early, who in turn might initiate a large busy period. Let us explain this apparent contradiction. When the arrival rate $\lambda$ is fixed, assumption (2.3) does not necessarily hold and $E[\mathrm{BP}_n]$ increases with $\alpha$, as can be seen in Table 2. However, our heavy-traffic condition (2.3) implies that $\lambda$ depends on $\alpha$ since $\lambda = 1/E[S^{1+\alpha}]$. The interpretation of condition (2.3) is that, on average, one customer joins the queue during one service time. Notice that, due to the size-biasing, the average service time is not $E[S]$. Therefore, the number of customers that join during a (long) service is roughly equal to one as $\alpha \uparrow 1$. However, when customers with large services leave the system, they are not able to join any more. As $\alpha \uparrow 1$, customers with large services leave the system earlier. Therefore, as $\alpha \uparrow 1$, the resulting second order depletion-of-points effect causes shorter excursions as time progresses, see also Figure 1. In the limit process, this phenomenon is represented by the fact that the coefficient of the negative quadratic drift increases as $\alpha \uparrow 1$, as shown in the following lemma; see (2.25).

Table 2: Expected number of customers served in the first busy period of the non-scaled $\Delta^{\alpha}_{(i)}/G/1$ queue with mean one exponential service times and arrival rate $\lambda = 0.01$. In all cases $q = 1$. Each value is the average of $10^4$ simulations.

Proof. We split the left-hand side in two identical terms and show that each of them dominates one term on the right-hand side. We prove the first of these bounds, (2.28), the proof of the second bound being analogous.
The inequality (2.28) is equivalent to (2.29). The term on the left and the two terms on the right of (2.29) can be rewritten as expectations of functions of a size-biased random variable $W$, so that (2.29) is equivalent to (2.30). Finally, the inequality (2.30) holds because $W$ is positive with probability one and $x \mapsto \log(x)$ and $x \mapsto x^{1+\alpha}$ are increasing functions.
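The monotonicity of the quadratic drift coefficient can be checked numerically in a concrete case. The sketch assumes that the coefficient is $\lambda E[S^{1+2\alpha}]/(2E[S^{\alpha}])$ with $\lambda = 1/E[S^{1+\alpha}]$, and takes $S$ exponential with mean one, for which $E[S^{p}] = \Gamma(1+p)$; both the choice of distribution and the function name `drift_coefficient` are assumptions of this illustration.

```python
from math import gamma


def drift_coefficient(alpha):
    """E[S^(1+2a)] / (2 E[S^a] E[S^(1+a)]) for S ~ Exp(1), where E[S^p] = Gamma(1+p)."""
    return gamma(2.0 + 2.0 * alpha) / (2.0 * gamma(1.0 + alpha) * gamma(2.0 + alpha))


grid = [k / 10 for k in range(11)]
values = [drift_coefficient(a) for a in grid]
for a, v in zip(grid, values):
    print(f"alpha = {a:.1f}   coefficient = {v:.4f}")
```

For exponential service requirements the coefficient grows from $1/2$ at $\alpha = 0$ to $3/2$ at $\alpha = 1$, in line with the shorter excursions observed for larger $\alpha$.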
Overview of the proof of the scaling limit
The proof of Theorem 1 extends the techniques we developed in [5]. However, the dependency structure of the arrival times complicates the analysis considerably. Customers with larger job sizes have a higher probability of joining the queue quickly, and this gives rise to a size-biasing reordering of the service times. In the next section we study this phenomenon in detail.

Preliminaries
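Given two sequences of random variables $(X_n)_{n \geq 1}$ and $(Y_n)_{n \geq 1}$, we say that $X_n$ converges in probability to $X$, and we denote it by $X_n \stackrel{P}{\to} X$, if $P(|X_n - X| > \varepsilon) \to 0$ as $n \to \infty$ for each $\varepsilon > 0$. We write $X_n = o_P(Y_n)$ if $X_n / Y_n \stackrel{P}{\to} 0$. We say that $X_n$ converges in distribution to $X$ if $P(X_n \leq x) \to P(X \leq x)$ for all $x \in \mathbb{R}$ at which the limiting distribution function is continuous. For our results, we condition on the entire sequence $(S_i)_{i \geq 1}$. More precisely, if the random variables that we consider are defined on the probability space $(\Omega, \mathcal{F}, P)$, then we define the conditional probability measure $P_S(\cdot) := P(\cdot \mid (S_i)_{i \geq 1})$. Correspondingly, for any random variable $X$ on $\Omega$ we write $E_S[X]$ for the expectation with respect to $P_S$, and $E[X]$ for the expectation with respect to $P$. We say that a sequence of events $(E_n)_{n \geq 1}$ holds with high probability (briefly, w.h.p.) if $P(E_n) \to 1$ as $n \to \infty$.

First, we recall a well-known result that will be useful on several occasions.

Lemma 2. Let $(X_i)_{i \geq 1}$ be i.i.d. positive random variables with $E[X_1] < \infty$. Then $\max_{i \in [n]} X_i / n \stackrel{P}{\to} 0$ as $n \to \infty$.

Proof. Fix $\varepsilon > 0$. By the union bound, $P(\max_{i \in [n]} X_i \geq \varepsilon n) \leq n P(X_1 \geq \varepsilon n)$. Since for any positive random variable $Y$, $\varepsilon 1_{\{Y \geq \varepsilon\}} \leq Y 1_{\{Y \geq \varepsilon\}}$ almost surely, it follows that $n P(X_1 \geq \varepsilon n) \leq \varepsilon^{-1} E[X_1 1_{\{X_1 \geq \varepsilon n\}}]$. The right-most term tends to zero as $n \to \infty$ since $E[X_1] < \infty$, and this concludes the proof.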
Given a vectorx = (x 1 , x 2 , . . . , x n ) with deterministic, real-valued entries, the size-biased ordering ofx is a random vector X (s) = (X (s) 1 , X (s) 2 , . . . , X (s) n ) such that More generally, for any α ∈ R the α-size-biased ordering ofx is given by a vectorX as the set of the first k customers served. The following lemma is the first step in understanding the structure of the arrival process: Lemma 3 (Size-biased reordering of the arrivals). The order of appearance of customers is the α-size-biased ordering of their service times. In other words, Proof. Conditioned on (S l ) n l=1 , the arrival times are independent exponential random variables. By basic properties of exponentials, we have as desired.
We remark that (3.8) differs from the classical size-biased reordering in that the weights are a non-linear function of the (S i ) n i=1 . The next lemma is crucial, establishing stochastic domination between the service requirements of the customers in order of appearance. In our definition of the queueing process (2.4)-(2.5), we do not keep track of the service requirements of the customers that join the queue, but only of their arrival times (2.2). Therefore, at the start of service, a customer's service requirement is a random variable that depends on the arrival time relative to the remaining customers. Lemma 3 then gives the precise distribution of the service requirement of the j-th customer entering service.
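The statement of Lemma 3 can be illustrated by comparing two ways of generating the order of appearance: sorting independent exponential clocks with rates proportional to $S_i^{\alpha}$, and drawing customers one by one without replacement with probability proportional to $S_i^{\alpha}$ among those remaining. The sketch below implements both; the function names, the toy weights and the Monte Carlo check of the first position are assumptions of this illustration.

```python
import numpy as np


def arrival_order(S, alpha, lam=1.0, rng=None):
    """Order of arrival when customer i joins after an Exp(lam * S_i^alpha) time."""
    rng = np.random.default_rng() if rng is None else rng
    S = np.asarray(S, dtype=float)
    T = rng.exponential(1.0 / (lam * S**alpha))
    return np.argsort(T)


def size_biased_order(S, alpha, rng=None):
    """Sequential sampling without replacement, weights proportional to S_i^alpha."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(S, dtype=float) ** alpha
    remaining = list(range(len(w)))
    order = []
    while remaining:
        p = w[remaining] / w[remaining].sum()
        pick = int(rng.choice(len(remaining), p=p))
        order.append(remaining.pop(pick))
    return np.array(order)


rng = np.random.default_rng(0)
S = np.array([1.0, 2.0, 4.0])
runs = 20000
firsts = np.array([arrival_order(S, 1.0, rng=rng)[0] for _ in range(runs)])
print(np.bincount(firsts, minlength=3) / runs)   # empirical law of the first arrival
print(S / S.sum())                               # size-biased probabilities
```

Only the law of the first position is compared here; the same comparison can be repeated for later positions by conditioning on the customers already drawn.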
Recall that X stochastically dominates Y (with notation Y X) if and only if there exists a probability space (Ω,F,P) and two random variablesX,Ȳ defined onΩ such thatX
Proof. We compute explicitly We have the almost sure bound where S α (1) ≤ S α (2) ≤ . . . ≤ S α (n) denote the order statistics of the finite sequence (S α i ) i∈ [n] . There exists p ∈ (0, 1) such that n − k + 1 ≥ pn, for large enough n. Consequently, so that we have . (3.13) Let us denote by ξ p the p-th quantile of the distribution F S (·) and let us assume, without loss of generality, that f S (ξ p ) > 0.
Indeed, the assumption f S (ξ p ) > 0 implies that F S (·) is invertible in a neighborhood of ξ p . We have that, as n → ∞, (3.14) In particular, as n → ∞, If α > 0, as is the case in our setting, the proof of Lemma 4 shows that, uniformly in k = O(n 2/3 ), and therefore If f (·) is an increasing function, (3.18) makes precise the intuition that, if α > 0, customers with larger job sizes join the queue earlier. We will often make use of the expression (3.18).
The following lemma will often prove useful in dealing with sums over a random index set: Lemma 5 (Uniform convergence of random sums). Let (S j ) n j=1 be a sequence of positive random variables such that E[S 2+α ] < +∞, for α ∈ (0, 1).
We now focus on the $i$-th customer joining the queue (for $i$ large) and characterize the distribution of its service time. In particular, for $\alpha > 0$ this is different from $S_i$.

Lemma 6 (Size-biased distribution of the service times). For every bounded, real-valued continuous function $f(\cdot)$, as $n \to \infty$, the convergence (3.21) holds uniformly for $i = O_{P_S}(n^{2/3})$. Moreover, as $n \to \infty$,

Proof. First note that This can be further decomposed as ], using (3.21) and the Dominated Convergence Theorem the second claim follows.
In Lemma 6 we have studied the distribution of the service time of the i-th customer, and we now focus on its (conditional) moments. The following lemma should be interpreted as follows: Because of the size-biased re-ordering of the customer arrivals, the service time of the i-th customer being served (for i large) is highly concentrated.
where the error term is uniform in i = O P S (n 2/3 ). Moreover, the convergence holds in L 1 , i.e.
Proof. In order to apply Lemma 6, we first split where K > 0 is arbitrary, so that The first term is bounded, and therefore converges to E[(S ∧ K) 1+γ S α ]/E[S α ] by Lemma 6. The second term is bounded through Markov's inequality, as Therefore, The proof of Lemma 4 shows that, for any ε > 0, lim K→∞ C f K ,S ≤ ε, thus lim K→∞ C f K ,S = 0. Therefore, by letting K → ∞ in (3.33), (3.27) follows. Next, we split (3.34) The second term can be bounded as in (3.32). For the first term, where we have used that |(a − b)/(c − d) − a/c| ≤ ad/c 2 + bc/c 2 , for positive a, b, c, d. The second and third terms converge uniformly over i = O P S (n 2/3 ) by Lemma 5. Summarizing, Letting first n → ∞ and then K → ∞, (3.28) follows.
We will make use of Lemma 7 several times throughout the proof, with the specific choices γ ∈ {0, α, 1}. The following lemma is of central importance in the proof of the uniform convergence of the quadratic part of the drift: Proof. By Lemma 7, (3.37) is equivalent to We split the event space and separately bound n −2/3 sup (3.39) and n −2/3 sup for a sequence (K n ) n≥1 that we choose later on and is such that K n → ∞. We start with (3.39).
Since the sum inside the absolute value is a martingale as a function of $j$, (3.39) can be bounded through Doob's $L^p$ inequality [20, Theorem 11.2] with $p = 2$ as which converges to zero as $n \to \infty$ if and only if $K_n^{\alpha}/n^{1/3}$ does. We now turn to (3.40) and apply Doob's $L^1$ martingale inequality [20, Theorem 11.2] to obtain (3.43). We have used Lemma 7 in the second inequality, and Lemma 4 with $f(x) = x^{1+\alpha} 1_{\{x^{1+\alpha} > K_n\}}$ in the third. The right-most term in (3.43) is $o_P(1)$ as $n \to \infty$ by the strong Law of Large Numbers. Note that this side of the bound does not impose additional conditions on $K_n$, so that, if we take $K_n = n^c$, it is sufficient that $c < 1/(3\alpha)$, with the convention that $1/0 = \infty$.

We conclude this section with a technical lemma concerning error terms in the computations of quadratic variations. Denote the density (resp. distribution function) of a rate $\lambda$ exponential random variable by $f_E(\cdot)$ (resp. $F_E(\cdot)$): holds almost surely for $0 < \varepsilon < 1$ and $C > 0$, which gives Therefore, where in the last step we used Lemma 7. Note that, since $E[S^{2+\alpha}] < \infty$, by Lemma 2 $\max_{j \in [n]} S_j^{2\varepsilon} = o_P(n^{2\varepsilon/(2+\alpha)})$. The right-most term in (3.46) then tends to zero as $n$ tends to infinity as long as $0 < \varepsilon < \min\{1, 2/\alpha\}$.

Proving the scaling limit
We first establish some preliminary estimates on $N_n(\cdot)$ that will be crucial for the proof of convergence. We will upper bound the process $N_n(\cdot)$ by a simpler process $N_n^U(\cdot)$ in such a way that the increments of $N_n^U(\cdot)$ almost surely dominate the increments of $N_n(\cdot)$. We also show that, after rescaling, $N_n^U(\cdot)$ converges in distribution to $W(\cdot)$. The process $N_n^U(\cdot)$ is defined in (4.1), with $c_{n,\beta} = 1 + \beta n^{-1/3}$ and increments $A_n^U(\cdot)$ given in (4.2). An interpretation of the process $N_n^U(\cdot)$ is that customers are not removed from the pool of potential customers until they have been served. Therefore, a customer could potentially join the queue more than once. We couple the processes $N_n(\cdot)$ and $N_n^U(\cdot)$ as follows. Consider a sequence of arrival times $(T_i)_{i=1}^{\infty}$ and of service times $(S_i)_{i=1}^{\infty}$, then define $A_n(\cdot)$ as in (2.5) and $A_n^U(\cdot)$ as in (4.2). With this coupling we have that, almost surely, $A_n(k) \leq A_n^U(k)$ for all $k$. Consequently, $N_n(k) \leq N_n^U(k)$ (4.6) and $Q_n(k) \leq Q_n^U(k)$ (4.7) almost surely. While in general only the upper bounds (4.6) and (4.7) hold, the processes $N_n(\cdot)$ and $N_n^U(\cdot)$ (resp. $Q_n(\cdot)$ and $Q_n^U(\cdot)$) turn out to be very close to each other. We start by proving results for $N_n^U(\cdot)$ and $Q_n^U(\cdot)$ because they are easier to treat, and only then are we able to prove that identical results hold for $N_n(\cdot)$ and $Q_n(\cdot)$.
In fact, we introduce the upper bound $N_n^U(\cdot)$ to deal with the complicated index set for the summation in (2.5). The difficulty arises as follows: in order to estimate $N_n(\cdot)$ one has to estimate $A_n(\cdot)$. To do this, one has to separately (uniformly) bound each element in the sum, and also estimate the number of elements in the sum. The first goal is accomplished, for example, through Lemma 7, while for the second the crude upper bound $n$ is not sharp enough. However, estimating $|\nu_k|$ requires an estimate on $N_n(\cdot)$ itself, as (2.6) shows. To resolve this circularity, we introduce a bootstrap argument: first, we upper bound $N_n(\cdot)$ and obtain estimates on the upper bound; from these follows an estimate on $|\nu_k|$, and this in turn allows us to estimate $N_n(\cdot)$.
This technique can be applied to solve a recently found technical issue in the proof of the main result of [7]. The authors in [7] prove convergence of a process which upper bounds the exploration process of the graph. Therefore, their main result is analogous to Theorem 3. However, a further step is required to complete the proof of convergence of the exploration process, and this is provided by our approach.
Theorem 3 (Convergence of the upper bound).
where W (·) is the diffusion process in (2.18). In particular, The next section is dedicated to the proof of Theorem 3.

Convergence of the upper bound
We use a classical martingale decomposition followed by a martingale FCLT. The process N U n (·) in (4.1) can be decomposed as N U n (k) = M U n (k) + C U n (k), where M U n (·) is a martingale and C U n (·) is a drift term, as follows: Moreover, (M U n (k)) 2 can be written as (M U n (k)) 2 = Z U n (k) + B U n (k) with Z U n (k) a martingale and B U n (k) the compensator, or quadratic variation, of M U n (k) given by In order to prove convergence of N U n (·) we separately prove convergence of C U n (·) and of M U n (·). We prove the former directly, and the latter by applying the martingale FCLT [13,Theorem 7.1.4]. For this, we need to verify the following conditions:

Proof of (i) for the upper bound
First we obtain an explicit expression for The third term is an error term. Indeed, for some ζ n ∈ [0, S c(i) S l /n], since |F E (x)| ≤ λ 2 for all x ≥ 0. By Lemma 7 this can be bounded by where C n is bounded w.h.p. and the o P (1) term is uniform in i = O(n 2/3 ). Therefore, the third term in (4.12) is o P (n −1/3 ). The remaining terms in (4.12) can be simplified as For the first term of (4.15), using c Note that the right-most term in (4.16) and the second term in (4.15) cancel out. This cancellation is what makes the analysis of N U n (·) considerably easier than the analysis of N n (·). Moreover, Lemma 7 implies that the third term in (4.15) is also o P (n −1/3 ). (4.12) is then simplified to 17) and the o P (n −1/3 ) term is uniform in i = O(n 2/3 ). We are now able to compute n −1/3 C U n (tn 2/3 ) = n −1/3 = tn 1/3 c n,β λ n n j=1 S 1+α j − 1 − c n,β λ n 4/3  so that, for α ∈ [0, 1], Since c n,β = 1+O(n −1/3 ), the second term in (4.23) converges uniformly to −t 2 λE[S 1+2α ]/2E[S α ] by Lemma 8.

Proof of (ii) for the upper bound
Rewrite B U n (k), for k = O(n 2/3 ), as where we have used the asymptotics for Again by (4.17), E S [A n (i) | F i−1 ] = 1 + o P (1), uniformly in i = O(n 2/3 ), so that (4.24) simplifies to We then focus on the second term in (4.25), which we compute as (4.28) The leading contribution to B U n (k) is given by the first term, while the second term is an error term by Lemma 5. We have shown that B U n (·) can be rewritten as which concludes the proof of (ii).

Proof of (iii) for the upper bound
The jumps of $B_n^U(k)$ are given by (4.17), the second term is of order $O_P(n^{-1/3})$, uniformly in $i = O_P(n^{2/3})$. The first term was computed in (4.27). Therefore, (4.32) After rescaling and taking the expectation, we obtain the bound (4.33) Proof. For $\varepsilon > 0$ split the expectation as We bound the expected value in the first term as where we used Lemma 4 with $f(x) = x^2 \mathbf{1}_{\{x > \varepsilon n^{1/3}\}}$. Computing the expectation explicitly we get which tends to zero as $n \to \infty$ since $\mathbb{E}[S^{2+\alpha}] < \infty$ and $\varepsilon > 0$ is arbitrary.
By Lemma 10 the right-hand side of (4.33) converges to zero, and this concludes the proof.

Proof of (iv) for the upper bound
First we split We then stochastically dominate $(A_n^U(k))_{k \le tn^{2/3}}$ by a sequence of Poisson processes $(\Pi_k)_{k \le tn^{2/3}}$, according to Indeed, if $E_1, E_2, \ldots, E_n$ are exponential random variables with parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$, there exists a coupling with a Poisson process $\Pi(\cdot)$ such that The coupling is constructed as follows. Each random variable $E_i$ is coupled with a Poisson process $\Pi^{(i)}$ with intensity $\lambda_i$ in such a way that $\mathbf{1}_{\{E_i \le t\}} \le \Pi^{(i)}(\lambda_i t)$. Moreover, by basic properties of We bound (4.40) via martingale techniques. First, we decompose it as (4.41). Applying Doob's $L^2$ martingale inequality [20, Theorem 11.2] to the first term we see that it converges to zero, since S α i n . (4.42) The last equality follows from the expression for the variance of a Poisson random variable. The right-most term converges to zero by Lemma 7. We now bound the second term in (4.41), as (4.43). By Lemma 10 the right-hand side of (4.43) converges to zero, concluding the proof of (iv).
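The domination of exponential clocks by Poisson counts can be checked numerically. The following is a minimal sketch, not the construction of the proof itself: it assumes that each $E_i$ is realised as the first jump time of its own Poisson process with rate $\lambda_i$, so that the indicator of $\{E_i \le t\}$ can never exceed the number of points in $[0, t]$. The function name `first_jump_and_count` and the rates used are hypothetical.

```python
import numpy as np

def first_jump_and_count(rate, t, rng):
    """Simulate a Poisson process with the given rate on [0, t].

    Returns (E, count): E is the first jump time, an Exp(rate) variable,
    and count is the number of jumps in [0, t].
    """
    first = rng.exponential(1.0 / rate)   # numpy expects the mean, not the rate
    count, s = 0, first
    while s <= t:
        count += 1
        s += rng.exponential(1.0 / rate)
    return first, count

rng = np.random.default_rng(0)
rates = np.array([0.5, 1.0, 2.0, 4.0])    # illustrative lambda_i (assumption)
t = 0.7

pairs = [first_jump_and_count(r, t, rng) for r in rates]
indicators = [int(e <= t) for e, _ in pairs]
counts = [c for _, c in pairs]

# Pathwise domination: 1{E_i <= t} <= number of Poisson points in [0, t].
assert all(i <= c for i, c in zip(indicators, counts))
# Summed over i, the number of clocks that rang by time t is dominated by
# a Poisson variable with mean sum(rates) * t.
print(sum(indicators), "<=", sum(counts))
```

Because the processes are independent, the sum of the counts is again Poisson, which is what makes the dominating sequence tractable with variance computations of the kind used above.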
As an immediate consequence of (2.6) and Lemma 11, we have the following important corollary. Recall that $\nu_i$ is the set of customers who have left the system or are in the queue at the beginning of the $i$-th service, so that $|\nu_i| = i + Q_n(i)$. Recall also that $0 \le Q_n(t) \le Q_n^U(t)$.
Corollary 1. As $n \to \infty$, Intuitively, this implies that the main contribution to the downward drift in the queue-length process comes from the customers that have left the system, and not from the customers in the queue. Alternatively, the order of magnitude of the queue length, that is $n^{1/3}$, is negligible with respect to the order of magnitude of the number of customers who have left the system, which is $n^{2/3}$.
In order to prove Theorem 1 we proceed as in the proof of Theorem 3, but we now need to deal with the more complicated drift term. As before, we decompose $N_n(k) = M_n(k) + C_n(k)$, where (4.46) As before, we separately prove the convergence of the drift $C_n(k)$ and of the martingale $M_n(k)$, by verifying the conditions (i)-(iv) in Section 4.1. Verifying (i) proves to be the most challenging task, while the estimates for (ii)-(iv) in Section 4.1 carry over without further complications.

Proof of (i) for the embedded queue
By expanding $\mathbb{E}_S[A_n(i) \mid \mathcal{F}_{i-1}] - 1$ as in (4.15), we get (4.47). By further expanding the first term in (4.47) as in (4.16), we get (4.49).
Therefore, to conclude the proof of (i) it is enough to show that the second term vanishes, after rescaling. We do this in the following lemma: Lemma 12. As $n \to \infty$, Proof. By Lemma 11, $\sup_{i \le tn^{2/3}} Q_n(i) \le C_1 n^{1/3}$ w.h.p. for a large constant $C_1$, and by Lemma for another large constant $C_2$. This implies that, w.h.p., n −1/3 c n,β λt n ≤ c n,β λC 2t The right-most term converges to zero in probability as $n \to \infty$ by Lemma 8. This concludes the proof.

Proof of (ii), (iii) and (iv) for the embedded queue
Proceeding as before, we find that The second term is an error term by Lemma 5 and Corollary 1. This implies that $B_n(\cdot)$ can be rewritten as which concludes the proof of (ii).
To conclude the proof of Theorem 1, we are left to verify (iii) and (iv). However, the estimates in Sections 4.1.3 and 4.1.4 also hold for $B_n(\cdot)$ and $M_n(\cdot)$, since they rely respectively on (4.33) and (4.40) to bound the lower-order contributions to the drift. This concludes the proof of Theorem 1.

Conclusions and discussion
In this paper we have considered a generalization of the $\Delta_{(i)}/G/1$ queue, which we coined the $\Delta^{\alpha}_{(i)}/G/1$ queue, a model for the dynamics of a queueing system in which only a finite number of customers can join. In our model, the arrival time of a customer depends on its service requirement through a parameter $\alpha \in [0, 1]$. We have proved that, under a suitable heavy-traffic assumption, the diffusion-scaled queue-length process embedded at service completions converges to a stochastic process $W(\cdot)$. A distinctive characteristic of our results is the so-called depletion-of-points effect, represented by a quadratic drift in $W(\cdot)$. A (directed) tree is associated with the $\Delta^{\alpha}_{(i)}/G/1$ queue in a natural way, and the heavy-traffic assumption corresponds to criticality of the associated random tree. Our result interpolates between two already known results. For $\alpha = 0$ the arrival clocks are i.i.d. and the analysis simplifies significantly. In this context, [5] proves an analogous heavy-traffic diffusion approximation result. Theorem 1 can then be seen as a generalization of [5, Theorem 5]. If $\alpha = 1$, the $\Delta^{\alpha}_{(i)}/G/1$ queue has a natural interpretation as an exploration process of an inhomogeneous random graph. In this context, [7] proves that the ordered component sizes converge to the excursion of a reflected Brownian motion with parabolic drift. Our result can then also be seen as a generalization of [7] to the directed components of directed inhomogeneous random graphs.
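To make the model concrete, the following minimal sketch simulates the queue length embedded at service completions. It rests on several assumptions that are not taken from the paper: service requirements are Exp(1), no customers are present at time zero, an idle server simply waits for the next arrival, and the arrival clock of a customer with requirement $S$ has rate `lam * S**alpha`. The function name `embedded_queue_length` is hypothetical.

```python
import numpy as np

def embedded_queue_length(n, alpha, lam, rng):
    """Queue length just after each service completion (FIFO, single server).

    Assumptions for illustration only: S ~ Exp(1), empty system at time
    zero, and an idle server waits for the next arrival.
    """
    S = rng.exponential(1.0, size=n)             # service requirements
    T = rng.exponential(1.0 / (lam * S**alpha))  # arrival clocks, rate lam * S^alpha
    order = np.argsort(T)
    S, T = S[order], T[order]                    # relabel in order of arrival

    q = np.empty(n, dtype=int)
    clock = 0.0
    for k in range(n):
        start = max(clock, T[k])                 # wait if the server is idle
        clock = start + S[k]                     # time of the k-th service completion
        arrived = np.searchsorted(T, clock, side="right")
        q[k] = arrived - (k + 1)                 # arrived minus departed
    return q

q = embedded_queue_length(n=10_000, alpha=1.0, lam=0.5, rng=np.random.default_rng(1))
print(q[:20], q.max())
```

Since the pool is finite, the last entry of the returned array is always zero: every customer has arrived and departed by the final service completion.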
Lemma 6 implies that the distribution of the service time of the first $O(n^{2/3})$ customers to join the queue converges to the $\alpha$-size-biased distribution of $S$, irrespective of the precise time at which the customers arrive. This suggests that it is possible to prove Theorem 1 by approximating the $\Delta^{\alpha}_{(i)}/G/1$ queue via a $\Delta_{(i)}/G/1$ queue with service time $S^*$ distributed according to the $\alpha$-size-biased distribution of $S$ and i.i.d. arrival times distributed as $T_i \sim \exp(\lambda \mathbb{E}[S^{\alpha}])$. This conjecture is supported by two observations. First, the heavy-traffic conditions for the two queues coincide. Second, the standard deviation of the Brownian motion is the same in the two limiting diffusions. However, this approximation fails to capture the higher-order contributions to the queue-length process. As a result, the coefficients of the negative quadratic drift in the two queues are different, and thus the approximation of the $\Delta^{\alpha}_{(i)}/G/1$ queue with a $\Delta_{(i)}/G/1$ queue is insufficient to prove Theorem 1.
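The size-biasing of early arrivals is easy to observe in simulation. The sketch below assumes $S \sim \mathrm{Exp}(1)$ and arrival clocks with rate `lam * S**alpha` (both choices are illustrative, as is the function name), and compares the average service requirement of the first few arrivals with the $\alpha$-size-biased mean $\mathbb{E}[S^{1+\alpha}]/\mathbb{E}[S^{\alpha}]$, which equals $1 + \alpha$ for this particular distribution.

```python
import numpy as np

def mean_service_of_first_arrivals(n, m, alpha, lam, rng):
    """Average service requirement of the first m customers to arrive."""
    S = rng.exponential(1.0, size=n)             # assumption: S ~ Exp(1)
    T = rng.exponential(1.0 / (lam * S**alpha))  # arrival clocks, rate lam * S^alpha
    first = np.argsort(T)[:m]                    # indices of the m earliest arrivals
    return S[first].mean()

rng = np.random.default_rng(42)
alpha = 1.0
observed = mean_service_of_first_arrivals(n=200_000, m=2_000, alpha=alpha, lam=1.0, rng=rng)

# For S ~ Exp(1): E[S^(1+alpha)] / E[S^alpha] = Gamma(2+alpha) / Gamma(1+alpha) = 1 + alpha.
size_biased_mean = 1.0 + alpha
print(observed, size_biased_mean)
```

Taking `m` much smaller than `n` keeps the depletion of the pool small, so the empirical mean should sit close to the size-biased value; as `m` grows relative to `n` the average drifts back towards $\mathbb{E}[S]$.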
Surprisingly, the assumption that $\alpha$ lies in the interval $[0, 1]$ plays no role in our proof. On the other hand, we see from (2.18) that (5.2) is a necessary condition for Theorem 1 to hold. From this we conclude that Theorem 1 remains true as long as $\alpha \in \mathbb{R}$ is such that (5.2) is satisfied. From the modelling point of view, $\alpha > 1$ represents a situation in which customers with larger job sizes have a stronger incentive to join the queue. On the other hand, when $\alpha < 0$ the queue models a situation in which customers with large job sizes are lazy and thus favour joining the queue later. We remark that the form of the limiting diffusion is the same for all $\alpha \in \mathbb{R}$, but different values of $\alpha$ yield different fluctuations (standard deviation of the Brownian motion), and a different quadratic drift.
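As a rough illustration of how the quadratic drift depends on $\alpha$, the sketch below evaluates the coefficient $\lambda \mathbb{E}[S^{1+2\alpha}]/(2\mathbb{E}[S^{\alpha}])$ that appears in the proof of (i), for the illustrative choice $S \sim \mathrm{Exp}(1)$, whose moments are $\mathbb{E}[S^p] = \Gamma(p+1)$. When no rate is supplied, the code sets $\lambda = 1/\mathbb{E}[S^{1+\alpha}]$; treating this as the critical value is an assumption made for this example, and the function names are hypothetical.

```python
from math import gamma

def exp1_moment(p):
    """E[S^p] for S ~ Exp(1), valid for p > -1."""
    return gamma(p + 1.0)

def quadratic_drift_coefficient(alpha, lam=None):
    """lam * E[S^(1+2*alpha)] / (2 * E[S^alpha]) for S ~ Exp(1).

    If lam is None it is set to 1 / E[S^(1+alpha)]; using this as the
    critical arrival-rate scale is an assumption of this illustration.
    """
    if lam is None:
        lam = 1.0 / exp1_moment(1.0 + alpha)
    return lam * exp1_moment(1.0 + 2.0 * alpha) / (2.0 * exp1_moment(alpha))

for a in (0.0, 0.5, 1.0, 1.5):
    print(a, quadratic_drift_coefficient(a))
```

Under these assumptions the coefficient grows with $\alpha$: the stronger the dependence of arrival times on job sizes, the faster the large jobs are depleted from the pool.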