Heavy traffic approximation for the stationary distribution of a generalized Jackson network: the BAR approach

In the seminal paper of Gamarnik and Zeevi (2006), the authors justify the steady-state diffusion approximation of a generalized Jackson network (GJN) in heavy traffic. Their approach involves the so-called limit interchange argument, which has since become a popular tool employed by many others who study diffusion approximations. In this paper we illustrate a novel approach by using it to justify the steady-state approximation of a GJN in heavy traffic. Our approach involves working directly with the basic adjoint relationship (BAR), an integral equation that characterizes the stationary distribution of a Markov process. As we will show, the BAR approach is a more natural choice than the limit interchange approach for justifying steady-state approximations, and can potentially be applied to the study of other stochastic processing networks such as multiclass queueing networks.

In a seminal paper, Gamarnik and Zeevi [17] proved that for a sequence of GJNs indexed by n = 1, 2, . . .,

r_n L^(n)(∞) ⇒ Z(∞) as n → ∞, (1.1)

where the symbol ⇒ denotes convergence in distribution, {r_n} is a sequence of positive numbers that converges to zero, L^(n)(∞) is a random vector whose ith component represents the steady-state number of customers at station i in the nth network, and Z(∞) is a random vector that has the stationary distribution of a certain d-dimensional semimartingale reflecting Brownian motion (SRBM) Z = {Z(t), t ≥ 0} that was first defined in [22]. Readers are referred to the introduction of [17] for the important motivation of this problem, and for a review of the literature up to that point. Gamarnik and Zeevi [17] proved (1.1) under two key conditions: (a) the heavy traffic condition, and (b) the exponential moment condition. Condition (a) is standard and can be expressed in terms of a d-dimensional vector ρ^(n), where ρ_i^(n) is the traffic intensity at station i in the nth network. This condition requires that ρ_i^(n) < 1 at each station i, and that ρ_i^(n) → 1 as n → ∞. The scaling parameter r_n in (1.1) is closely tied to this heavy traffic condition and describes how quickly each ρ_i^(n) converges to one. In particular, r_n goes to zero at the same rate as 1 − ρ_i^(n); namely, lim_{n→∞} (1 − ρ_i^(n))/r_n exists and is positive for each station i. Condition (b) requires that interarrival and service times have finite exponential moments; such a condition is unnecessarily strong. In a follow-up work by Budhiraja and Lee [8], this condition is relaxed to a new moment condition: (b') interarrival and service times have finite second moments, and the sequences (indexed by n) of squared interarrival and squared service times are uniformly integrable. Conditions (a) and (b') in [8] represent the weakest possible conditions for (1.1) to hold.
In this paper, we prove (1.1) under conditions (a) and (b'); see Theorem 2.1 in Section 2.4. Our proof of (1.1) uses a novel approach, and is drastically different from the ones in [8] and [17]. This approach was used in [30] to study the steady-state approximation of a single server queue in heavy traffic; however, ours is the first paper to apply it to the network setting. Beyond the proof of Theorem 2.1, the ideas laid out in this paper can be applied to study steady-state diffusion approximations of other systems of interest. One promising direction for future work is to generalize this approach to study multiclass queueing networks, such as those studied by Bramson and Dai [4]. We now outline the approach.
A priori, we cannot exclude the possibility that ϕ*(θ) may be degenerate, i.e., be the MGF of some nonnegative measure on R^d that has total mass strictly less than 1 (or possibly 0). To invoke the uniqueness result in [13] and conclude that ϕ*(θ) = ϕ(θ), we must also prove that ϕ*(θ) is the MGF of some probability measure, i.e., that it is not degenerate. For example, ϕ*(θ) ≡ 0 and ϕ*_j(θ) ≡ 0 clearly satisfy BAR (2.30), but ϕ(θ) ≢ 0. To this end, we show that

lim_{θ↑0} ϕ*(θ) = 1, (1.3)

which implies that the sequence of probability measures corresponding to {ϕ^(n_k), k ≥ 1} is tight; see Lemma 6.1. It turns out that condition (1.3) can be verified algebraically from the fact that ϕ*(θ) and ϕ*_j(θ) (j = 1, . . . , d) satisfy BAR (2.30). Most of the steps in this algebraic procedure are carried out in Proposition 5.1. Once we have this proposition, showing (1.3) becomes straightforward; see, for instance, the proof of (6.3). The proof of Proposition 5.1 requires that the reflection matrix of the SRBM be an M-matrix. An M-matrix is an invertible square matrix whose diagonal entries are non-negative, whose off-diagonal entries are non-positive, and whose inverse has non-negative entries [2, Chapter 6]. This condition is always satisfied in the GJN setting, because the reflection matrix of the SRBM has the form I − P^T, where P is the routing matrix of the GJN.
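As a quick numerical illustration of why I − P^T is an M-matrix for a (substochastic) routing matrix, the sketch below uses a hypothetical two-station routing matrix and the Neumann series (I − P^T)^{-1} = Σ_k (P^T)^k, whose terms are entrywise non-negative; the particular entries of P are assumptions chosen for the example only.

```python
# Hypothetical 2-station routing matrix (entries are assumptions for
# illustration); row sums are < 1, so some customers leave the network.
P = [[0.0, 0.5],
     [0.3, 0.0]]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

Pt = transpose(P)
I = [[1.0, 0.0], [0.0, 1.0]]

# Neumann series for (I - Pt)^{-1}; it converges because the spectral
# radius of P is strictly less than 1 for a substochastic routing matrix.
inv = I
term = I
for _ in range(200):
    term = matmul(term, Pt)
    inv = add(inv, term)

# The inverse is entrywise non-negative: the hallmark of an M-matrix.
assert all(x >= 0 for row in inv for x in row)

# Sanity check: (I - Pt) * inv is (numerically) the identity.
ImPt = [[I[i][j] - Pt[i][j] for j in range(2)] for i in range(2)]
prod = matmul(ImPt, inv)
assert all(abs(prod[i][j] - I[i][j]) < 1e-9 for i in range(2) for j in range(2))
print("(I - P^T)^{-1} =", inv)
```

For this 2x2 example, the inverse can also be checked by hand: det(I − P^T) = 1 − 0.15 = 0.85, so every entry of the inverse is a non-negative number divided by 0.85.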
The procedure of considering a sequence ϕ^(n_k)(θ) → ϕ*(θ) (and ϕ_j^(n_k)(θ) → ϕ*_j(θ)), and then verifying (1.3), resembles an application of Lévy's convergence theorem [39, Section 18.1], with one key difference. Lévy's result says that if a sequence of characteristic functions (CFs) converges, and the limit is continuous at zero, then the corresponding sequence of probability measures converges weakly to a limiting probability measure. Our use of MGFs in this paper, instead of CFs, is not accidental. Applying Lévy's theorem requires knowing a priori that the sequence (or at least a subsequence) of CFs converges, and for a complicated model like a GJN this is essentially impossible to verify. In contrast, MGFs are monotone functions, and any sequence of MGFs always has a convergent subsequence by Helly's selection principle [21, Theorem 8.1].
To prove Proposition 4.1 (that ϕ^(n_k)(θ) and its boundary counterparts satisfy BAR (2.30) asymptotically), we work with a continuous-time Markov process X^(n) = {X^(n)(t), t ≥ 0} that describes the dynamics of the nth GJN. In addition to the queue length process L^(n) = {L^(n)(t), t ≥ 0}, this Markov process also keeps track of the remaining interarrival times and remaining service times at all stations. Although L^(n) is a jump process taking values ℓ in Z^d_+, the other component of X^(n) is a piecewise deterministic process taking values (u, v) in R^{2d}_+. Nevertheless, the stationary distribution of X^(n) satisfies a BAR (3.15) for all "good" test functions f(ℓ, u, v). The BAR for X^(n) has two components: the first involves the deterministic dynamics and the second involves the jumps; the latter is generally difficult to analyze. To handle this difficulty, we choose f(ℓ, u, v) to be an exponential function of the state variable (ℓ, u, v) of the form

f(ℓ, u, v) = e^{⟨θ,ℓ⟩ + ⟨η,u⟩ + ⟨ζ,v⟩}, (1.4)

where (θ, η, ζ) ∈ R^{3d} are parameters and ⟨·, ·⟩ is the Euclidean inner product (we actually use truncated versions of these functions to accommodate general interarrival and service time distributions, which may not have exponential moments; see (3.21)). By judiciously choosing η = η(θ) and ζ = ζ(θ) as functions of θ ∈ R^d, we eliminate the jump term of the BAR (3.15), leaving us with the jump-free BAR (3.26). To obtain Proposition 4.1 from this jump-free BAR (3.26), we perform Taylor expansions of η(θ) and ζ(θ) to obtain their quadratic approximations (Lemma 4.1), and establish corresponding error bounds (Lemma 4.2).
Both [8] and [17] focused on proving the tightness of {r_n L^(n)(∞), n ≥ 1} by ingeniously constructing appropriate Lyapunov functions. Both papers rely on the Lipschitz continuity of the Skorohod map corresponding to the SRBM for their tightness argument. Such a map does not exist in the multiclass queueing network setting, which makes the generalization of results from [8] and [17] to the multiclass setting difficult. Gurvich [19] is the only paper that provides a sufficient condition for proving (1.1) in the multiclass setting. In addition to the strong exponential moment assumption, [19] assumes a strong state space collapse (SSC) condition: (c) fluid solutions of a critically loaded fluid model converge to their equilibria at a "uniform linear rate" in finite time. Gurvich [19] focused on generalizing the approach in [17]; in particular, he retained the strong exponential moment assumption. In a recent paper, Ye and Yao [40] focused on generalizing the approach in [8] to resource-sharing networks that lie outside of the multiclass queueing network setting. Using a multiclass queueing network example, in Section 5 of [40], the authors outline the steps needed for generalizing their approach to the multiclass queueing network setting. The authors are able to keep the weak moment condition (b'), relaxing condition (c) to condition (c'): fluid solutions converge to their equilibria "uniformly fast", but not necessarily in finite time. However, they imposed a strong "bounded workload condition", which is difficult to check in general.
The approach of Gamarnik and Zeevi [17] is known as the limit interchange argument, and has since been used by others to study steady-state approximations of various queueing systems. In the single-server setting, Katsuda [27] studied a multiclass single server queue with feedback, and Zhang and Zwart [41] studied a limited processor sharing queue. In the many-server setting, Tezcan [38] considered a parallel-server system with multiple server pools and no customer abandonment, Gamarnik and Stolyar [16] examined a multiclass, many-server queue with abandonment, where customer service and patience times are exponentially distributed with means varying between different customer classes, and Dai et al. [12] considered a many-server queue with abandonment, where service times follow a phase-type distribution. In recent years, several papers [6,7,18,20,24] have gone beyond limit theorems and established rates of convergence to the approximating distribution. The framework underlying those papers (except for [20]) is known as Stein's method [37,10,34].
The limit interchange and Stein's method frameworks represent the two general approaches used to establish convergence of steady-state distributions. Our paper adds a third approach to this set. Each approach has its own pros and cons. The Stein approach is able to provide rates of convergence, which is a step beyond just convergence, but this comes at a cost: successfully applying it requires deeper knowledge about the underlying system than either the limit interchange approach or the one presented in this paper. In particular, Stein's method has not been applied to queueing networks and has so far been limited to systems with a single station. The limit interchange approach has been the prominent method in the past decade. At its core, it requires the use of a Lyapunov function to prove tightness. However, each stochastic system requires a separate Lyapunov function; finding one is typically very difficult and can be considered an art. In contrast, the method in this paper is algorithmic in nature, and requires no guesswork to find any Lyapunov function. For instance, there is little wiggle room in choosing the exponential test function in (1.4), and our tightness argument in Section 5 is algebraic and "procedural". In terms of the generality of our method, multiclass queueing networks do add an extra layer of difficulty to our approach: those networks exhibit SSC, and the reflection matrix of the SRBM no longer has to be an M-matrix. It is the subject of ongoing research to extend our approach to the multiclass setting.
The rest of the paper is structured as follows. In Section 2, we introduce the sequence of GJNs, the heavy traffic condition, describe the approximating SRBM, and state our main results. In Section 3, we derive the BAR (3.15) for each GJN in the sequence, and introduce conditions on test functions under which the jump term there disappears. Section 4 is devoted to proving Proposition 4.1, which states that the MGFs of the queue lengths of the GJN approximately satisfy the BAR of the SRBM. In Section 5 we present Proposition 5.1, which we use together with Proposition 4.1 to prove our main result, Theorem 2.1, in Section 6. We defer proofs of certain technical lemmas to the Appendix.
We advise the reader that the present document serves as an improved version of the recently published paper [5]. The improvement contains two minor changes. The first change is in the first sentence of the proof of Theorem 2.1 in Section 6. The existence of a subsequence {n ′′ } follows directly from Helly's selection principle; there is no need to use the notion of "weak compactness". The second change is in the proof of part (b) of Lemma 6.1, replacing reference [36] by [26].
1.1. Notation. All random variables and stochastic processes are defined on a common probability space (Ω, F, P), and all stochastic processes X = {X(t), t ≥ 0} are assumed to be right continuous on [0, ∞) and to have left limits on (0, ∞). For a sequence of random variables {Y_n, n ≥ 1} and a random variable Y, we write Y_n ⇒ Y if Y_n converges in distribution to Y. For an integer d ≥ 1, R^d denotes the d-dimensional Euclidean space, and Z^d_+ and R^d_+ denote the spaces of d-dimensional vectors whose elements are non-negative integers and non-negative real numbers, respectively. For vectors x, y ∈ R^d, we write x_i to denote the ith component of x, 1 ≤ i ≤ d. Furthermore, we write x ≤ y if x_i ≤ y_i for all i = 1, ..., d, and we let ⟨x, y⟩ be their Euclidean inner product. All vectors are understood to be column vectors. For integers a, b with a > b, we adopt the convention that Σ_{i=a}^{b} (·) = 0 (empty sums equal zero). For integers i, j, we let δ_ij be the Kronecker delta; i.e., δ_ij = 1 if i = j, and zero otherwise. We let x^T and A^T denote the transpose of a vector x and matrix A, respectively. We reserve I for the identity matrix, e for the vector of all ones, and e^(i) for the vector that has a one in the ith element and zeroes elsewhere; the dimensions of these vectors will be clear from the context.

2. Heavy traffic approximation.
In this section, we introduce the generalized Jackson network and state the main results of this paper.

2.1. Network description.
To be able to state our main results, we first introduce a generalized Jackson network, and define a Markov process that describes it. This network has d stations, numbered 1, 2, . . . , d. Let J = {1, 2, . . . , d}. Each station has a single server that serves customers in the first-in-first-out (FIFO) manner. A station may have customers arriving from outside the network; we refer to such arrivals as external arrivals. A customer who completes service at a station either goes to another station or leaves the network. The following is a mathematical description of a GJN.
Let E be the subset of J whose members are the stations that have external arrivals. External arrivals at station i ∈ E follow a renewal process with i.i.d. interarrival times and we let T e,i be a nonnegative random variable having the distribution of the interarrival times. We assume that T e,i has finite variance and a non-zero mean. External arrivals at different stations are independent.
Service times at station i are i.i.d. random variables and we let T s,i be a nonnegative random variable having the distribution of the service times. We assume that T s,i has finite variance and a nonzero mean. Service times at different stations are independent. Service time sequences and external interarrival time sequences are assumed to be independent.
A customer that completes service at station i ∈ J goes to station j ∈ J with probability p_ij, or exits the network with probability p_i0 = 1 − Σ_{j∈J} p_ij, independently of everything else. Let P be the d × d square matrix whose (i, j)th entry is p_ij for i, j ∈ J.
This queueing network is referred to as a generalized Jackson network (GJN). We now introduce a Markov process for describing the GJN. For time t ≥ 0, denote the number of customers, the residual external interarrival time, and the residual service time at station i ∈ J by L_i(t), R_{e,i}(t), and R_{s,i}(t), respectively. We set R_{e,i}(t) = 0 for i ∈ J \ E, and R_{s,i}(t) = T_{s,i}(m) if no customer is in service at station i at time t and the service time of the next customer at station i is T_{s,i}(m). In Section 2.1 of [11] and Section 2.1.1 of [8], R_{s,i}(t) is defined to be zero if no customer is in service at time t. Our definition will make it slightly easier to derive condition (3.19) in Section 3.1 to annihilate the jump term in (3.15) there.
Denote the vectors whose entries are L_i(t), R_{e,i}(t), and R_{s,i}(t) by L(t), R_e(t), and R_s(t), respectively. Throughout the paper we refer to {L(t), t ≥ 0} as the queue length process (even though it includes customers currently in service as well). Let X(t) = (L(t), R_e(t), R_s(t)); then X = {X(t), t ≥ 0} is a Markov process with state space Z^d_+ × R^{2d}_+. The GJN is a natural generalization of the Jackson network, but its stationary distribution is hard to obtain in closed form. This motivates the study of heavy traffic approximations in [32] and [25]. To introduce the notion of heavy traffic, we introduce a sequence of generalized Jackson networks in the next section.

2.2. A Sequence of Networks and Their Assumptions.
Consider a sequence of GJNs indexed by n = 1, 2, . . .. We denote quantities associated with the nth GJN by a superscript (n); for example, X(t), L(t), R_e(t), R_s(t) are denoted by X^(n)(t), L^(n)(t), R_e^(n)(t), R_s^(n)(t). Then λ_{a,i}^(n) can be interpreted as the total arrival rate at station i. The traffic equation can be written as the vector-valued equation

λ_a^(n) = λ_e^(n) + P^T λ_a^(n), (2.4)

where P^T is the transpose of P. Equation (2.4) has a unique solution, given by λ_a^(n) = (I − P^T)^{-1} λ_e^(n). We assume the following heavy traffic condition: there exist a positive vector b ∈ R^d_+ and a sequence of positive numbers r_n converging to zero such that

lim_{n→∞} (1 − ρ_i^(n))/r_n = b_i for each i ∈ J. (2.5)

It will be convenient to express condition (2.5) in terms of the primitive data; the resulting condition is (2.6). Note that r_n is chosen to be 1/√n in [32], and in much of the literature as well, but it is intuitively clear that this is not essential as long as (2.5) holds and r_n converges to zero as n → ∞. For example, some authors take r_n = 1/n (see, e.g., [9]). In this paper, we do not make any specific choice for r_n; this conveys the same spirit as Kingman's [28] heavy traffic approximation (see [30] for details). We make the following moment assumptions on the sequence of networks:

λ_{e,i}^(n) → λ_{e,i} > 0 for each i ∈ E, (2.9)
σ_{s,j}^(n) → σ_{s,j} < ∞ for each j ∈ J. (2.10)

In addition, we assume that

{(T_{e,i}^(n))², n ≥ 1} is uniformly integrable for each i ∈ E, (2.11)
{(T_{s,j}^(n))², n ≥ 1} is uniformly integrable for each j ∈ J. (2.12)

Here, λ_a = (I − P^T)^{-1} λ_e. The diffusion approximation focuses on the sequence L^(n) = {L^(n)(t), t ≥ 0}, which is the first component of X^(n). Clearly, L^(n) is not a Markov process in general, but the standard approach in the literature (e.g., [25, 32]) shows that the diffusion-scaled process Z^(n) = {Z^(n)(t), t ≥ 0}, defined as

Z^(n)(t) = r_n L^(n)(t/r_n²), t ≥ 0, (2.14)

converges in distribution to a semimartingale reflecting Brownian motion (SRBM) Z = {Z(t), t ≥ 0} (to be defined in Section 2.3). The heavy traffic assumptions (2.5) and (2.6) are crucial for the time and space scalings in (2.14) to be correct.
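To make the traffic equation concrete, the following sketch solves λ_a = λ_e + P^T λ_a by fixed-point iteration for a hypothetical two-station network and computes the traffic intensities ρ_i = λ_{a,i}/µ_i (with µ_i the service rate, i.e., the reciprocal of the mean service time). All rates here are assumptions chosen for illustration and are not tied to any particular network in the sequence.

```python
# Hypothetical primitives for a 2-station GJN (illustrative assumptions).
lam_e = [1.0, 0.5]        # external arrival rates
mu    = [2.0, 1.8]        # service rates (1 / mean service time)
P     = [[0.0, 0.5],
         [0.3, 0.0]]      # routing probabilities p_ij

d = 2
# Solve the traffic equation lam_a = lam_e + P^T lam_a by fixed-point
# iteration; it converges because the spectral radius of P is < 1.
lam_a = lam_e[:]
for _ in range(500):
    lam_a = [lam_e[i] + sum(P[j][i] * lam_a[j] for j in range(d))
             for i in range(d)]

rho = [lam_a[i] / mu[i] for i in range(d)]
print("total arrival rates:", lam_a)
print("traffic intensities:", rho)
assert all(0 < r < 1 for r in rho)   # this example is stable (sub-critical)
```

In a heavy traffic sequence, one would let the primitives depend on n so that each ρ_i computed this way approaches one at rate r_n.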
We also assume that, for each n,

the Markov process X^(n) is positive Harris recurrent. (2.15)

This assumption is satisfied under heavy traffic condition (2.5) and some additional regularity assumptions on the interarrival time distributions; see, for example, Theorem 3.8 of Down and Meyn [15] and Theorem 5.1 of Dai [11]. Since X^(n) is positive Harris recurrent, it has a unique stationary distribution. We let X^(n)(∞) be the vector having that stationary distribution. Since X^(n) has a stationary distribution, L^(n) has a stationary distribution. We use L^(n)(∞) and Z^(n)(∞) to denote the random vectors having the stationary distributions of L^(n) and Z^(n), respectively. Note that the stationary distribution of Z^(n) is independent of the time scaling in (2.14) for each fixed n. Furthermore, it is clear that Z^(n)(∞) =_d r_n L^(n)(∞). For future reference, we use π^(n) to denote the stationary distribution of Z^(n). As stated in (1.1), the primary result of this paper is to prove that Z^(n)(∞) ⇒ Z(∞), where Z(∞) has the stationary distribution of the SRBM Z, which we now define.
The drift vector µ and covariance matrix Σ of the approximating SRBM are built from the primitive data; in particular, δ_ij is the Kronecker delta, p_ij are the routing probabilities in the GJN, and the quantities λ_{e,i}, σ_{e,i}, σ_{s,i}, and λ_{s,i} are given in (2.9), (2.10), and (2.13), respectively. The matrix Σ is always non-negative definite. Throughout the document, we assume that Σ is positive definite, which is a standard assumption in both [17] and [8]. Associated with the data (µ, Σ, R) is Z = {Z(t), t ≥ 0}, a semimartingale reflecting Brownian motion (SRBM) with drift vector µ and covariance matrix Σ that satisfies (2.19)–(2.22). The matrix R is called the reflection matrix of Z. Recall that an M-matrix is an invertible square matrix whose diagonal entries are non-negative, whose off-diagonal entries are non-positive, and whose inverse has non-negative entries [2, Chapter 6]. Since R in (2.8) is an M-matrix, it follows from [22] that the SRBM Z exists and is unique as a strong solution to (2.19)–(2.22).
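As a reminder, the standard formulation of a (µ, Σ, R)-SRBM on the non-negative orthant (cf. [22]) can be sketched as follows; the notation and equation numbering here are generic and may differ slightly from displays (2.19)–(2.22).

```latex
% Standard (\mu,\Sigma,R)-SRBM on the orthant (cf. [22]):
% W is a d-dimensional Brownian motion with drift \mu and covariance \Sigma,
% and Y is the "pushing" process that keeps Z in \mathbb{R}^d_+.
\begin{align*}
  Z(t) &= Z(0) + W(t) + R\,Y(t) \in \mathbb{R}^d_+, \qquad t \ge 0,\\
  Y(0) &= 0, \qquad Y_j \text{ is nondecreasing for each } j \in \mathcal{J},\\
  0 &= \int_0^\infty \mathbf{1}\{Z_j(t) > 0\}\, dY_j(t), \qquad j \in \mathcal{J}.
\end{align*}
```

The last condition says that Y_j can increase only at times when Z_j = 0, i.e., pushing in the direction of the jth column of R occurs only on the jth boundary face.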
Again, because R is an M-matrix and the assumption above is satisfied, it follows from [23] that the SRBM Z has a unique stationary distribution π on R^d_+. We now discuss the characterization of π. Let E_π denote the expectation when Z(0) has distribution π. For j ∈ J, let F_j = {x ∈ R^d_+ : x_j = 0} denote the jth boundary face, and define the boundary probability measure π_j by normalizing the expected boundary pushing exerted on F_j under π. Thus π_j concentrates on F_j, but we define it on R^d_+ for notational simplicity.
The following lemma gives a characterization of the stationary distribution π and its associated boundary measures π 1 , . . ., π d .
(a) The stationary distribution π and its associated boundary measures π_1, . . ., π_d satisfy the basic adjoint relationship (BAR) (2.24) for every function f such that f, its first-order derivatives, and its second-order derivatives are bounded and continuous. (b) Conversely, assume that π, π_1, . . ., π_d are probability measures on R^d_+ satisfying BAR (2.24). Then π must be the stationary distribution of the (µ, Σ, R)-SRBM, and π_1, . . ., π_d the corresponding boundary measures.
We denote the moment generating functions (MGFs) of π and π_j by ϕ and ϕ_j, respectively; namely, for θ ∈ R^d with θ ≤ 0,

ϕ(θ) = ∫_{R^d_+} e^{⟨θ,x⟩} π(dx) and ϕ_j(θ) = ∫_{R^d_+} e^{⟨θ,x⟩} π_j(dx), j ∈ J.

We also derive the MGF identity (2.28) from BAR (2.24), and we now use it to show that (2.29) holds. To do so, we let θ = αe^(k), where α ≤ 0 is a real number and k ∈ J. Dividing both sides of (2.28) by α and taking α ↑ 0, we obtain a linear equation for each k ∈ J, in which r_kj is the (k, j)th entry of R. In vector form, this system is equivalent to a relation that, since R is invertible, yields (2.29). Therefore, (2.28) can be rewritten in the more practical form (2.30). We refer to this as the MGF version of the BAR.
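Since the displays (2.28)–(2.30) are referenced only by number above, it may help to sketch the generic structure such an MGF identity takes for an SRBM. The following is a standard computation in the style of the BAR literature (e.g., [13]); the constants β_j play the role of normalizations of the boundary measures, and this is a sketch rather than a verbatim reproduction of (2.28)–(2.30).

```latex
% Plugging f(x) = e^{\langle\theta, x\rangle} (for \theta \le 0) into the BAR
% yields, with R^{(j)} the j-th column of R and \beta_j > 0 the normalizing
% constant of \pi_j,
\begin{align*}
  \Big(\langle \mu, \theta\rangle
     + \tfrac{1}{2}\langle \theta, \Sigma\theta\rangle\Big)\,\phi(\theta)
   + \sum_{j=1}^{d} \beta_j \,\langle R^{(j)}, \theta\rangle\,
     \phi_j(\theta) = 0, \qquad \theta \le 0.
\end{align*}
% Setting \theta = \alpha e^{(k)}, dividing by \alpha and letting
% \alpha \uparrow 0 (so that \phi, \phi_j \to 1) gives
% \mu_k + \sum_j r_{kj}\beta_j = 0 for all k, i.e. \mu + R\beta = 0, and
% invertibility of R pins down
\begin{align*}
  \beta = -R^{-1}\mu .
\end{align*}
```

This is exactly the kind of "divide by α and let α ↑ 0" step described in the paragraph above: the quadratic term vanishes at first order, and invertibility of R identifies the boundary constants.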
It is shown in the appendix of the arXiv version of [14] that the MGF version of the BAR (2.30) and the standard version of the BAR (2.24) are equivalent. Thus, it follows from the characterization obtained in [13] that ϕ(θ) and ϕ_j(θ) are the unique moment generating functions that satisfy (2.30).

2.4. Main Results.
Recall that π^(n) is the stationary distribution of Z^(n); for j ∈ J, the boundary measures π_j^(n) are defined analogously to π_j. We are now ready to present our main results.
Theorem 2.1. Consider the sequence of GJNs indexed by n. Assume that the heavy traffic conditions (2.5) and (2.6), the positive recurrence condition (2.15), and the moment conditions (2.9)–(2.12) hold. Then: (a) as n → ∞, the stationary distribution π^(n) of Z^(n) converges weakly to π, the stationary distribution of the SRBM Z; (b) for each j ∈ J, π_j^(n) converges weakly to π_j, the corresponding boundary measure on F_j.
Part (b) appears to be new. As mentioned in Section 2.2, Part (a) of the theorem is known, but we prove it using a new approach. To elaborate on this approach, we first discuss the existing approach that uses process limit and tightness.
Assuming the distribution of Z^(n)(0) converges, Reiman [32] proved that the scaled queue length process {Z^(n)(t), t ≥ 0} converges in distribution to the SRBM Z; we refer to this as the process limit (2.31). Here Z(∞) is a random vector having distribution π. As we mentioned earlier, part (a) of Theorem 2.1 was already proved by Gamarnik and Zeevi [17] and by Budhiraja and Lee [8]. Both sets of authors rely on the process limit result (2.31) and reduce the problem to showing tightness of the stationary distributions {π^(n) : n = 1, 2, . . .}. Together with the process limit, this tightness implies (1.1). Our proof of Theorem 2.1 will not use the process limit (2.31). Instead, we work directly with the BAR associated with each network in the sequence, which we derive in Section 3.
3. Network dynamics and basic adjoint relationship. In Section 3.1 we derive the BAR for the GJN and describe a special class of test functions for which the BAR reduces to a very simple expression. In Section 3.2, we focus on a family of exponential test functions that belong to this special class. This lays down the foundation for Section 4, where we compare the BAR of the GJN to that of the approximating SRBM.
3.1. Network Dynamics and Tractable BAR. In this section, we derive the BAR (3.15) for a GJN using sample path dynamics of the network. The main result of this section is Lemma 3.1 below. It describes a special class of functions for which the BAR has a simple form. This lemma can be compared to the BAR in Miyazawa [31], which studies a heterogeneous multiserver queue. However, our approach is different and we make detailed comments on this difference at the end of this section.
We first describe the network dynamics in terms of network primitives. We fix a network in the sequence and temporarily drop the index n for the remainder of the section. For station j ∈ J, we introduce the sequence of i.i.d. routing vectors {φ^(j)(q), q ≥ 1}, where P(φ^(j)(q) = e^(k)) = p_jk for k ∈ J and P(φ^(j)(q) = 0) = p_j0, and p_jk is the probability that a customer goes to station k after completing service at station j. These random vectors represent the routing of customers after completion of service at station j.
Recall the definition of X = {X(t), t ≥ 0} from Section 2.1, and let X(0) = (L(0), R_e(0), R_s(0)) be the initial state of the network. We recall the sequences of interarrival and service times from (2.1) and (2.2), and for each integer q ≥ 1 we define the corresponding primitive processes. For station j ∈ J and time t ≥ 0, let E_j(t) and D_j(t) be the total number of external arrivals and the total number of departures, respectively, on the interval [0, t], and let B_j(t) be the cumulative busy time of the server at station j. We can then define V_j(q) to be the time of the qth service completion at station j. Recalling that only stations in E have external arrivals, we see that for every t ≥ 0 and on every sample path, the queue lengths, residual interarrival times, and residual service times satisfy a set of flow balance and residual-clock equations. In the last of these equations, we have adopted the convention that when station j is empty at time t, the remaining service time R_{s,j}(t) is set to be the service time of the next customer at this station. Clearly, between jumps the residual interarrival times decrease at unit rate, and the residual service time at station j decreases at unit rate whenever the server there is busy, where the derivatives at a jump time t are interpreted as the right derivatives.
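As an entirely illustrative companion to these dynamics, the following sketch simulates a small two-station GJN with exponential interarrival and service times and checks the flow balance relation L_j(t) = L_j(0) + E_j(t) + (routed internal arrivals to j) − D_j(t) along the sampled path. All parameter values are assumptions chosen for the example; none come from the paper.

```python
import random

random.seed(7)

d = 2
# Illustrative primitives (assumptions, not from the paper):
lam_e = [1.0, 0.5]          # external arrival rates
mu    = [3.0, 3.0]          # service rates
P     = [[0.0, 0.5],        # routing probabilities p_jk
         [0.2, 0.0]]

L = [0, 0]                  # queue lengths (including customer in service)
E_count = [0, 0]            # external arrivals E_j(t)
D_count = [0, 0]            # departures D_j(t)
I_count = [0, 0]            # internal (routed) arrivals to station j
next_arrival = [random.expovariate(lam_e[j]) for j in range(d)]
next_departure = [float('inf')] * d   # empty network: no services running

T_END = 200.0
while True:
    clocks = next_arrival + next_departure
    m = min(range(2 * d), key=lambda i: clocks[i])
    t = clocks[m]
    if t > T_END:
        break
    if m < d:                         # external arrival at station j = m
        j = m
        E_count[j] += 1
        L[j] += 1
        next_arrival[j] = t + random.expovariate(lam_e[j])
        if L[j] == 1:                 # server was idle: start a service
            next_departure[j] = t + random.expovariate(mu[j])
    else:                             # departure from station j = m - d
        j = m - d
        D_count[j] += 1
        L[j] -= 1
        next_departure[j] = (t + random.expovariate(mu[j])
                             if L[j] > 0 else float('inf'))
        r, acc, dest = random.random(), 0.0, None
        for k in range(d):            # sample the routing vector phi^(j)
            acc += P[j][k]
            if r < acc:
                dest = k
                break
        if dest is not None:          # routed internally (else leaves network)
            I_count[dest] += 1
            L[dest] += 1
            if L[dest] == 1:
                next_departure[dest] = t + random.expovariate(mu[dest])

# Flow balance: L_j(t) = L_j(0) + E_j(t) + internal arrivals - D_j(t).
for j in range(d):
    assert L[j] == E_count[j] + I_count[j] - D_count[j]
    assert L[j] >= 0
print("flow balance holds; final queue lengths:", L)
```

The flow balance assertion is exactly the discrete analogue of the first of the sample-path equations described above; the residual-clock equations correspond to the `next_arrival` and `next_departure` bookkeeping.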
With the network dynamics rigorously defined, we proceed to derive a BAR for the network. Let D be the space of all bounded functions f(ℓ, y) : Z^d_+ × R^{2d}_+ → R satisfying the following condition: for each i ∈ {1, . . ., 2d} and each fixed (ℓ, (y_k)_{k≠i}), view f(ℓ, y) as a one-dimensional function of the y_i component. We require that this one-dimensional function be continuously differentiable at all but finitely many points, and have bounded derivatives whose bound is independent of the point (ℓ, (y_k)_{k≠i}). For instance, D contains the space of all bounded functions f : Z^d_+ × R^{2d}_+ → R such that y ↦ f(ℓ, y) is continuously differentiable with a bounded derivative, whose bound is independent of ℓ. The reason for enforcing the component-wise conditions on the set D is that the test functions we use in this paper always have some form of truncation, which prevents them from being everywhere differentiable in the variable belonging to R^{2d}_+.
Now for any f ∈ D and any interval [t, t + h] ⊂ R_+ free of jumps, we can use the fundamental theorem of calculus together with (3.10) to express the change of f(X(·)) over [t, t + h]. For the remainder of this section, we let ν denote the stationary distribution of X, and let P_ν be the probability measure conditioned on X(0) having the distribution ν. To deal with the jumps of X, we introduce some notation. The jumps of the process X correspond to external arrivals and departures at various stations. We use the term event to refer to a single external arrival or departure. At a time instance s, there may be simultaneous events; a jump of X at time s constitutes all the events that occur at s. Since we assumed that E[T_{e,i}] > 0 and E[T_{s,j}] > 0 for all i ∈ E and j ∈ J, it follows from basic renewal theory (see, for instance, [33, Theorem 3]) that E_ν[D_j(t) − D_j(0)] is finite for every t > 0; this finiteness comes from the fact that {D_j(t), t ≥ 0} is dominated by the renewal process corresponding to the times {V_j(q), q ≥ 1}. Therefore, for every t > 0 we know that, P_ν-almost surely, the process X has finitely many events on the interval (0, t]. It follows that, P_ν-almost surely,

the process X has countably many jumps on the interval (0, ∞), and every jump instant on this interval has finitely many events. (3.13)

In what follows, we deal only with the sample paths where (3.13) holds. Suppose now that k events occur simultaneously at time s. We can order them in an arbitrary manner, provided that we do not violate the network dynamics. For example, if station j is empty at time s− and experiences both an arrival and a departure at time s, then the arrival must happen first. The particular order assigned to the simultaneous events does not matter, because (3.13) implies there are always finitely many events at a time instance. We can therefore order all of the events that occur on the interval (0, ∞), and represent them as δ_1 < δ_2 < . . ., where δ_m represents the mth event to occur after time 0.
We let T(δ_m) represent the time at which the mth event happens. Now for integer m ≥ 1, we write X_{δ_m} to represent the value of the process immediately after the mth event has been applied to it. Our use of X_{δ_m}, as opposed to X(T(δ_m)), is intentional: if δ_{m_1} and δ_{m_2} represent two simultaneous arrivals to some station, then X(T(δ_{m_1})) = X(T(δ_{m_2})), but X_{δ_{m_1}} ≠ X_{δ_{m_2}}. We also write X_{δ_m−} to represent the value of the process X right before the mth event is applied to it, but after the first m − 1 events have been applied.
From (3.12), we see how f(X(·)) evolves between jumps. By isolating the times when jumps occur, one can verify that for any t > 0 and f ∈ D, the change f(X(t)) − f(X(0)) decomposes into an integral term plus the sum of the jump terms f(X_{δ_m}) − f(X_{δ_m−}). Since f ∈ D is bounded, we can take the expectation under ν. Furthermore, we know that Af(x) is bounded for all x ∈ Z^d_+ × R^{2d}_+, meaning we can apply the Fubini–Tonelli theorem to interchange the expectation with the integral in the first term. Stationarity implies that E_ν[f(X(t))] = E_ν[f(X(0))]. This gives an intermediate identity; we then use (3.14) and the boundedness of f ∈ D to apply the Fubini–Tonelli theorem again and arrive at the BAR (3.15) for the GJN. Before proceeding further, let us pause and discuss the implications of (3.15). The equation encodes information about the distribution ν. To access this information, we choose various test functions f ∈ D and plug them into (3.15). However, (3.15) is of limited practical use for most test functions, because the jump term is hard to handle analytically. To get around this difficulty, we now describe how to engineer f ∈ D so that the jump terms vanish. Roughly speaking, the idea is to ensure that at each jump time s and for each state x, the test function remains unchanged in expectation after a jump. In Section 3.2, we describe a family of exponential test functions satisfying this property, which is also rich enough to asymptotically characterize the distribution of the queue lengths. It may be helpful for the reader to compare the arguments that follow with the proof of Lemma 2.3 in [31, Appendix A.2], where the author also seeks test functions for which the jump terms vanish; in that paper, simultaneous events are handled by an approach that is slightly different from ours. To make the jump terms vanish, we now analyze the terms f(X_{δ_m}) − f(X_{δ_m−}).
If the event δ_m corresponds to the qth external arrival to station i ∈ E, then X_{δ_m} = X_{δ_m−} + (e^{(i)}, T_{e,i}(q) e^{(i)}, 0). Similarly, if δ_m corresponds to the qth departure from station j ∈ J, then X_{δ_m} = X_{δ_m−} + (−e^{(j)} + φ^{(j)}(q), 0, T_{s,j}(q) e^{(j)}).
Since T_{s,j}(q) and φ^{(j)}(q) are independent of X_{δ_m−} and S_j(q), we have that the jump condition (3.19) holds for every feasible state (ℓ, u, v) ∈ Z^d_+ × R^{2d}_+ with ℓ_j > 0 and v_j = 0. We summarize our analysis in the following lemma.
Remark 3.1. The idea of engineering a test function to kill the jump terms in (3.15) first appeared in the work of Miyazawa [30] for a single-server queue, and was further studied for a multiserver queue with heterogeneous servers in [31]. The novel feature of the present paper is the careful consideration of tightness in Section 5. This issue was not present in [30, 31] because those papers deal with a single queue as opposed to a queueing network.
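The role played by (3.15) has a transparent finite-state analogue that can be verified numerically. The sketch below is an illustration only (a truncated M/M/1 birth-death chain with made-up rates, not the paper's GJN): the stationary distribution π of a generator Q satisfies the discrete analogue of the BAR, E_π[Qf] = 0, for any bounded test function f.

```python
import numpy as np

# Truncated M/M/1 queue: arrival rate lam, service rate mu, states 0..N.
# All parameters here are illustrative assumptions.
lam, mu, N = 0.8, 1.0, 200
Q = np.zeros((N + 1, N + 1))
for x in range(N):
    Q[x, x + 1] = lam        # arrival jump
    Q[x + 1, x] = mu         # departure jump
np.fill_diagonal(Q, -Q.sum(axis=1))

# Stationary distribution: solve pi Q = 0 together with sum(pi) = 1.
A = np.vstack([Q.T, np.ones(N + 1)])
b = np.zeros(N + 2)
b[-1] = 1.0
pi = np.linalg.lstsq(A, b, rcond=None)[0]

# Discrete BAR: E_pi[Q f] = 0 for a bounded exponential test function f.
f = np.exp(-0.3 * np.arange(N + 1))
bar_residual = pi @ (Q @ f)
print(abs(bar_residual))     # numerically ~0
```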
In the next section, we describe a useful family of exponential test functions that satisfies (3.17) and (3.19).

Exponential Test Functions for a Sequence of GJNs.
We consider the sequence of Markov processes {X^{(n)}, n ≥ 1} from Section 2.2. Recall our goal, which is to show that the MGF of Z^{(n)}(∞) = r_n L^{(n)}(∞) asymptotically satisfies the BAR (2.30) of the approximating SRBM. As a first step, we have already derived a BAR for X^{(n)}. We now describe a family of exponential test functions that satisfy conditions (3.17) and (3.19).
For all n ∈ Z_+, let us first define the truncation function g^{(n)}: R → R as g^{(n)}(y) = min(y, 1/r_n). It is not hard to verify that f^{(n)}_{θ,η,ζ} ∈ D. Our choice in (3.21) is meant to resemble a moment generating function. Indeed, if η and ζ were allowed to be independent of θ, and if we chose g^{(n)}(y) = y, then this family of test functions would characterize the stationary distribution of X^{(n)} via its BAR (3.15). However, applying (3.15) to f^{(n)}_{θ,η,ζ} is of little practical use, because the jump terms become too complicated to work with.
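The truncation g^{(n)}(y) = min(y, 1/r_n) is easy to sketch directly (the value of r_n below is illustrative): it agrees with the identity below the cap 1/r_n, is bounded above by 1/r_n so that the exponential test function stays in D, and converges pointwise to the identity as r_n ↓ 0.

```python
def g(y, r_n):
    # Truncation used inside the exponential test functions: min(y, 1/r_n).
    return min(y, 1.0 / r_n)

r_n = 0.1
print(g(5.0, r_n))    # 5.0: identity below the cap 1/r_n = 10
print(g(50.0, r_n))   # 10.0: capped at 1/r_n
# Pointwise convergence to the identity as r_n decreases to 0:
print([g(50.0, r) for r in (0.1, 0.01, 0.001)])
```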
We have rewritten η_i(θ_i) because it is independent of θ_k for k ≠ i. For the remainder of the paper, we write f^{(n)}_θ for the resulting test function. This reduced family of test functions can only characterize the distribution of Z^{(n)}(∞) asymptotically as n → ∞, which is enough for our purposes. The following lemma is similar to Lemma 2.3 of [31], and says that f^{(n)}_θ satisfies conditions (3.17) and (3.19).

Proof. Once we assume that η^{(n)}_i(θ_i) and ζ^{(n)}_j(θ) are well defined and finite, the second claim of the lemma becomes trivial to verify by (3.22) and (3.23). We now check that these functions are well defined for all θ ∈ R^d with ‖θ‖ ≤ M. The argument shows that η^{(n)}_i(θ_i) is finite for each n; the latter follows from (2.9) and the fact that λ_{e,i} < ∞ there. We conclude that ζ^{(n)}_j(θ) is well defined and satisfies the stated bound, where t_j(θ) is defined in (3.23).
We now present the BAR for this special family of exponential test functions.
Lemma 3.3. Assume that X^{(n)} is positive Harris recurrent, and let X^{(n)}(∞) have the stationary distribution of X^{(n)}. Then (3.26) holds.

Proof. This lemma follows immediately from applying Lemmas 3.1 and 3.2 to the test function f^{(n)}_θ(x) from (3.21), because its right-side partial derivatives can be computed explicitly. In order for Lemma 3.3 to be of practical use, we need to know the behavior of η^{(n)}_i(θ_i) and ζ^{(n)}_j(θ) for ‖θ‖ ≤ M, where M > 0 is as in Lemma 3.2.
This result states that the steady state distributions of the queue lengths asymptotically satisfy (2.30) as n → ∞. It plays an essential role in the proof of Theorem 2.1 in Section 6.
The idea behind the proof is to use (3.26) as a starting point, and to use quadratic approximations of η^{(n)}_i(θ_i) and ζ^{(n)}_j(θ) to arrive at (4.2). In Section 4.1 we use Taylor expansion to obtain the quadratic approximations of η^{(n)}_i(θ_i) and ζ^{(n)}_j(θ). We then show that, under heavy traffic scaling, the associated approximation error vanishes in an appropriate fashion. We prove Proposition 4.1 in Section 4.3.

Taylor Expansions.
We begin with a general lemma, which describes the behavior of functions that are implicitly defined in a manner similar to η^{(n)}_i(θ_i) and ζ^{(n)}_j(θ). The proof of this lemma is deferred to Section A.1. We now apply it to obtain expansions of η^{(n)}_i(θ_i) and ζ^{(n)}_j(θ). Observe that the uniform integrability assumptions (2.11) and (2.12), together with the definition of g^{(n)}(x), imply that for all i ∈ E and j ∈ J, λ^{(n)}_{e,i} → λ_{e,i}. Recall that ζ^{(n)} is defined in (3.24) and that log t_j(θ) = −θ_j + log(Σ_{k∈J} p_{jk} e^{θ_k} + p_{j0}) belongs to C^∞(R^d).
By Taylor expansion, one can verify that log t_j(θ) agrees with its linear part −θ_j + Σ_{k∈J} p_{jk} θ_k up to the quadratic error recorded in (4.9). Note that both c_{1,j}(θ) and c_{2,j}(θ) are finite for each θ and j ∈ J because log t_j(θ) belongs to C^∞(R^d).
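The quadratic nature of the error in this expansion can be checked numerically. The routing matrix below is a hypothetical two-station example (not from the paper): the gap between log t_j(θ) and its linear part −θ_j + Σ_k p_{jk} θ_k shrinks like the square of the scaling.

```python
import numpy as np

# Hypothetical substochastic routing matrix P and exit probabilities p_{j0}.
P = np.array([[0.0, 0.5],
              [0.3, 0.0]])
p0 = 1.0 - P.sum(axis=1)

def log_t(j, theta):
    # log t_j(theta) = -theta_j + log(sum_k p_{jk} e^{theta_k} + p_{j0})
    return -theta[j] + np.log(P[j] @ np.exp(theta) + p0[j])

theta = np.array([0.4, -0.7])
j = 0
linear = -theta[j] + P[j] @ theta   # first-order (linear) part of the expansion
for alpha in (1.0, 0.1, 0.01):
    err = abs(log_t(j, alpha * theta) - alpha * linear)
    print(alpha, err / alpha**2)    # ratio stays bounded: the error is O(alpha^2)
```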
Applying Lemma 4.1 to χ^{(n)}_j, together with the Taylor expansions above, we have the following result about ζ^{(n)}_j(θ): for each j ∈ J and n ≥ 1, the expansion (4.11) holds, where c_{1,j}(θ) and c_{2,j}(θ) are as in (4.10), and for any K ∈ (0, M], the constant c_{Lip,j}(K) is finite because log t_j(θ) ∈ C^∞(R^d) and is therefore locally Lipschitz.

Error bounds.
In order for the quadratic approximations of η^{(n)}_i(θ_i) and ζ^{(n)}_j(θ) to be useful, we need uniform control of the associated error terms; in particular, we need sup_{n≥1} ĉ^{(n)}_{s,j}(r_n K) < ∞.
In the rest of this section, we will frequently use the following bound. Recall the definition of f^{(n)}_{r_n θ}(x) from (3.21), where x = (ℓ, u, v) ∈ Z^d_+ × R^{2d}_+. Using (4.8), (4.12), and Lemma 4.3, it follows that for any K ∈ (0, M], we can define c_f(K) to satisfy (4.17); this bound holds for all x ∈ Z^d_+ × R^{2d}_+. We now state several lemmas that we will use to prove Proposition 4.1. We would like to point out that Lemmas 4.4 and 4.5 are not novel, and can be proved using the well-developed theory of Palm calculus (see, for example, [1, Chapter 1] or [3, Chapter 4]). However, to keep this paper self-contained, we avoid using Palm calculus and prove these lemmas in Section A.4 using the BAR (3.15). The proof of Lemma 4.6 is postponed until Section A.5.

4.3. Proof of Proposition 4.1. We now prove Proposition 4.1 using the auxiliary lemmas stated in the previous section. As a starting point, we multiply (3.26) by −1/r_n^2. We claim that the last three lines of the resulting equation are negligible. For this, it suffices to show that the first of these lines vanishes; similar statements then hold for the remaining two lines. Observe that for each j ∈ J, the relevant term is bounded by (4.12) and (4.17). By Lemma 4.3 and Lemma 4.5, this upper bound vanishes as n → ∞. Thus, we have succeeded in proving that the first line vanishes. For the next step, we apply (4.17) and Lemma 4.6 and arrive at an intermediate result. Recall the definition of ϕ^{(n)}_j(θ) from (4.1) and use a telescoping sum. Using (4.22), we see that to complete the proof of Proposition 4.1, all we need to do is show that the upper bound above equals zero. To show that the first term is zero, we recall the corresponding statement from Lemma 4.4. Recalling the form of γ_j(θ) from (2.27) and the bound c_f(M) from (4.17), we see that the first equality is justified by (2.13) together with the approximation of ζ^{(n)}_j(r_n θ) from (4.11) and Lemma 4.2, and the second equality follows from (2.13) and (4.6). Now for the second term, Lemmas 4.4 and 4.6 tell us that the corresponding expression vanishes for each j ∈ J. This concludes the proof of Proposition 4.1.

Tightness of Stationary Distributions: an Essential Proposition.
This section is centered around the statement and proof of Proposition 5.1, which is critical to proving that the sequence {Z^{(n)}(∞), n ≥ 1} is tight. The tightness argument itself is provided as part of the proof of Theorem 2.1 in Section 6, and relies on both Propositions 4.1 and 5.1. We now motivate the proposition and introduce some notation needed to state it.
Let C be the class of functions f that satisfy the following conditions. Clearly, the MGF of any probability measure on R^d_+ belongs to C. Suppose that f is a pointwise limit of a sequence of MGFs of probability measures; then f ∈ C. Such a pointwise limit is not necessarily the MGF of a probability measure; for example, this happens when the sequence of measures is not tight. One can prove that the pointwise limit f is the MGF of a probability measure if and only if f is left continuous at 0; see Lemma 6.1 in Section 6. By left continuity, we mean that f(θ) → f(0) as θ ↑ 0, where θ ↑ 0 means that θ ∈ R^d approaches 0 from the left in an arbitrary way.
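The failure mode described above can be made concrete with a standard example (not taken from the paper): the point masses δ_n escape to infinity, so the sequence is not tight, and on θ ≤ 0 their MGFs e^{nθ} converge pointwise to a limit that is 0 for θ < 0 but 1 at θ = 0, hence not left continuous at the origin and not the MGF of any probability measure.

```python
import numpy as np

def mgf_delta(n, theta):
    # MGF of the point mass at n, evaluated at theta <= 0: exp(n * theta).
    return np.exp(n * theta)

n = 10**6  # the measures delta_n escape to infinity as n grows
for theta in (-1.0, -0.1, -0.01):
    print(theta, mgf_delta(n, theta))   # ~0 for every theta < 0
print(0.0, mgf_delta(n, 0.0))           # exactly 1 at theta = 0
```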
In the tightness argument of Section 6, we deal with the sequence of MGFs corresponding to {Z (n) (∞), n ≥ 1}. Loosely speaking, Proposition 4.1 tells us that the pointwise limit of every convergent subsequence of the MGFs satisfies the BAR of the SRBM (2.30). Proposition 5.1 then leverages the structure of this BAR to prove that the pointwise limit of the MGFs is left continuous at the origin, thereby proving tightness of the corresponding sequence of probability measures. This argument is made precise in Section 6. Having sufficiently motivated the necessity of Proposition 5.1, we now continue with its setup.
For our purposes, it will be beneficial to take limits as θ ↑ 0 along rays stemming from the origin. Each ray corresponds to some direction vector c_A > 0. For a fixed direction c_A > 0, we consider the limit lim_{α↑0} f(αc_A), α ∈ R, which always exists by the monotonicity of f. We note the following fact: the limit does not depend on the particular choice of direction vector.

Proof. Since c_A > 0 and c̃_A > 0, there exist constants m, M > 0 such that m c̃_A ≤ c_A ≤ M c̃_A componentwise. Using the monotonicity of f(θ), the limits along the two directions coincide.

In view of this lemma, the limit along a ray depends only on the index set A, and we denote it accordingly. The following proposition is one of the main tools used to prove Theorem 2.1. It will allow us to show that whenever the sequence of the MGFs of Z^{(n)}(∞) has a limit, it must be left continuous at 0.
Proposition 5.1. Assume that ψ ∈ C and {ψ_j, j ∈ J} ⊂ C satisfy (5.1) for θ ≤ 0, and that ψ_j(θ) is independent of θ_j. Then (5.2) and (5.3) hold for any A ⊂ J.

Proof. Recall the definitions of γ(θ) and γ_j(θ) from (2.27). For any A ⊂ J, c_A > 0, and α < 0, we set θ = αc_A in (5.1). We divide both sides by α and let α ↑ 0. By Lemma 5.1, c_A = (c_1, . . . , c_d)^T can be arbitrary as long as c_A > 0. For each fixed i ∈ A, we set c_i = 1 and let c_k ↓ 0 for k ∈ A \ {i}. We arrive at an equation involving r_{ij}, the (i, j)th entry of the reflection matrix R. Next, we split the summation in this formula into two parts and consider the resulting equations as an |A|-dimensional vector equation. Let R_A be the principal submatrix of R whose entry indices are taken from A. Since R is an M-matrix, R_A is also an M-matrix, so it has an inverse whose entries are all nonnegative. Denote the (i, j)th entry of the inverse of R_A by r^A_{ij}. Multiplying the vector version of (5.4) from the left by R_A^{-1} yields (5.5).

Set A = J \ B, where B ⊂ J and B ≠ J. Using induction on the size of the set B, we will show that (5.6) holds. We first take B = ∅, meaning A = J. Since the summation in (5.5) vanishes, we obtain (5.3); hence the base case is true, and we now justify the inductive step. For a general set B ⊂ J, (5.5) becomes (5.7), where the inequality follows because ψ ∈ C. We now assume the inductive hypothesis. Note that r_{ij} ≤ 0 for j ≠ i and r^{J\B}_{ki} ≥ 0 because R and R_A are M-matrices. Hence, together with (5.7) this immediately gives us (5.8). We combine (5.8) and (5.9) to complete the proof.
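The M-matrix fact used in the proof is easy to check numerically. The sketch below uses an illustrative reflection matrix of the form R = I − P^T with P substochastic (an assumption, not the paper's data): every principal submatrix R_A is then an M-matrix, and its inverse is entrywise nonnegative.

```python
import numpy as np
from itertools import combinations

# Hypothetical substochastic routing matrix P; then R = I - P^T is an M-matrix.
P = np.array([[0.0, 0.4, 0.2],
              [0.3, 0.0, 0.3],
              [0.1, 0.2, 0.0]])
R = np.eye(3) - P.T

# Check: the inverse of every principal submatrix R_A has nonnegative entries.
for size in (1, 2, 3):
    for A in combinations(range(3), size):
        R_A = R[np.ix_(A, A)]
        inv = np.linalg.inv(R_A)
        assert (inv >= -1e-12).all()   # entries r^A_{ij} are nonnegative
print("all principal submatrix inverses are entrywise nonnegative")
```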
We now state a lemma that will be used frequently in this section. Its proof is given in Section B.
Suppose that there exists a function f(θ) such that the MGFs of ν^{(n)} converge pointwise to f(θ) for θ ≤ 0, and that f is left continuous at 0. Then f is the MGF of some probability measure ν on R^d_+, which immediately implies that ν^{(n)} ⇒ ν as n → ∞.
In the remainder of the proof, we prove (6.3), which implies (6.4) by (5.3). Let A = {j}; then (5.2) implies (6.6), where Z^{(n'')}(∞) = r_{n''} L^{(n'')}(∞) is the scaled steady-state queue length vector of the n''th GJN. Applying Lemma 6.1 to (6.6), we see that {Z^{(n'')}_j(∞)} is tight for each j ∈ J. Since all the marginals are tight, {Z^{(n'')}(∞)} is tight as well. We invoke Lemma 6.1 again to conclude that (6.3) holds.
7. Concluding remarks. To summarize, this paper uses a novel approach to justify the steady-state diffusion approximation of GJNs. Our method does not rely on first establishing the process limit Z^{(n)}(·) ⇒ Z(·) and then the tightness of {Z^{(n)}(∞), n ≥ 1}, as in [17] or [8]. Rather, we work directly with the basic adjoint relationships of the GJN and the SRBM, which characterize their respective stationary distributions. By applying the BAR (3.15) of the GJN to a carefully designed exponential test function, we show in Proposition 4.1 that the sequence of moment generating functions of the GJN (together with a corresponding sequence of boundary measures) asymptotically satisfies the BAR (2.30) of the SRBM.
In addition to Proposition 4.1, we also require tightness of the sequence {Z^{(n)}(∞), n ≥ 1}, as in [17] and [8]. However, unlike the previous two papers, we do not rely on constructing Lyapunov functions to prove this tightness. Our tightness argument from Section 6 relies on algebraic manipulations of the BAR (2.30) of the SRBM, the bulk of which are presented as Proposition 5.1. A critical condition for this proposition is that the reflection matrix of the SRBM is an M-matrix; in particular, its diagonal entries are non-negative and its off-diagonal entries are non-positive.
An important direction for future research is generalizing the methodology presented in this paper to the multiclass queueing network setting, where new sources of difficulty emerge. One source of difficulty is the need to handle state space collapse. In a multiclass network with d stations, the vector of queue lengths has dimension higher than d, while the approximating diffusion process remains d-dimensional. This will manifest itself as an extra error term on the right hand side of (4.2) in Proposition 4.1, and we would need some way to control it. Another source of difficulty is relaxing the M-matrix assumption currently needed to prove tightness, as the M-matrix structure is a feature unique to GJNs. Preliminary experiments with several multiclass networks suggest that a variant of Proposition 5.1 can be established even when the reflection matrix is not an M-matrix. However, it remains an open problem to find the minimal conditions on the reflection matrix under which our approach can be carried out.
The first and second derivatives of f(x) are as follows.
Observe that f′(x) < 0, which implies that f(x) is decreasing. To prove concavity, we wish to show that f′′(x) ≤ 0; this follows from the Cauchy-Schwarz inequality. We then combine a second-order Taylor expansion of f(x) with the facts established above. Using (A.1), we see that η^{(n)′′}_i(θ_i) can be written symbolically as a ratio of expectations. To prove (A.2), we want to replace each a_i(r_n y) by a_i(0), one at a time. For this, we need to show that sup_{n≥1} sup_{|y|≤K} a_1(r_n y) < ∞, and similarly for the remaining terms. In the first inequality, we use (A.5) together with the accompanying bound. The second inequality is by (4.8), and the fact that the limit equals zero follows from (2.11).

Lastly, we tackle ĉ^{(n)}_{e,i}(r_n K). Recalling the definition of η^{(n)}_i(θ_i) in (3.22) and the truncation function g^{(n)}(y) = min(y, 1/r_n) defined for y ∈ R, we observe that the relevant quantity can be bounded, where in the last inequality we used part (b) of Lemma 4.1, which states that η^{(n)}_i(θ_i) is a decreasing function of θ_i. To bound this term, we first argue that there exist ǫ ∈ (0, 1) and y_1 > 0 such that (A.9) holds. Suppose (A.9) is not true. Then lim inf_{n→∞} P(T^{(n)}_{e,i} ≥ y) < ǫ for every ǫ ∈ (0, 1) and y > 0. On the other hand, it follows from the uniform integrability assumption (2.11) that for any ǫ′ > 0, there exists an a_0 such that for any a ≥ a_0, E[T^{(n)}_{e,i} 1{T^{(n)}_{e,i} ≥ a}] ≤ ǫ′. Taking ǫ, y → 0 and then ǫ′ → 0, we conclude that lim_{n→∞} E[T^{(n)}_{e,i}] = 0, which contradicts λ_{e,i} < ∞ in (2.9) and therefore proves (A.9). In addition to (A.9), the uniform integrability assumption (2.11) implies that there exists a y_2 > y_1 such that sup_{n≥1} P(T^{(n)}_{e,i} ≥ y_2) is suitably small. Fixing such y_1 and y_2, and recalling that η^{(n)}_i(r_n K) < 0 for all n ≥ 1, we obtain the desired bound.
We recall our choices of y_1 and y_2 from (A.9) and (A.10) to arrive at the bound below. Since η^{(n)}_i(r_n K) < 0, we choose y = y_1 from (A.9) and use the fact that e^{−r_n K} → 1 to bound the corresponding term. Therefore, the claim follows, where the last inequality is obtained by taking the limit inferior inside the logarithm. This proves (A.7). The argument to show that sup_{n≥1} ĉ^{(n)}_{s,j}(r_n K) < ∞ is nearly identical to the one for ĉ^{(n)}_{e,i}(r_n K), and requires two slight modifications. First, observe that sup_{n≥1} c_{Lip,j}(r_n K) < ∞, which is a trivial consequence of its definition in (4.14). Second, the equality in (A.8) would be modified so that C > 0 is an upper bound for sup_{‖y‖≤r_n K} |log t_j(y)| that is independent of n, where t_j(θ) is defined in (3.23).
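The Cauchy-Schwarz step invoked at the start of this appendix can be checked numerically. The sketch below is an illustration under assumptions (T exponential, parameters made up): for a nonnegative random variable T and any x, the empirical version of (E[T e^{−xT}])² ≤ E[e^{−xT}] · E[T² e^{−xT}] holds, which is the inequality that controls the sign of f′′.

```python
import numpy as np

# Monte Carlo check of the Cauchy-Schwarz inequality
# (E[T e^{-xT}])^2 <= E[e^{-xT}] * E[T^2 e^{-xT}] for T >= 0.
rng = np.random.default_rng(0)
T = rng.exponential(scale=1.0, size=10**6)   # illustrative interarrival times

for x in (0.1, 1.0, 3.0):
    w = np.exp(-x * T)
    lhs = np.mean(T * w) ** 2
    rhs = np.mean(w) * np.mean(T**2 * w)
    assert lhs <= rhs   # holds for every sample: it is Cauchy-Schwarz
print("Cauchy-Schwarz bound holds at all sampled x")
```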
A.4. Proofs of Lemmas 4.4 and 4.5. Let ν^{(n)} be the probability measure corresponding to X^{(n)}(∞). For i ∈ E and j ∈ J, recall the external arrival process {E^{(n)}_i(t), t ≥ 0}. In this section we prove the following five statements, which imply Lemmas 4.4 and 4.5. We show that (A.11)-(A.15) hold for all i ∈ E and j ∈ J, any real number K > 0, and any integer n ≥ 1. Assuming these five equations are correct, we see that Lemma 4.4 is immediately implied by (A.11) and (A.13). To see why Lemma 4.5 is true, we write the relevant quantity as a sum, where in the last equality we use the fact that the reflection matrix satisfies R = I − P^T.

In the rest of this section, we prove (A.11)-(A.15) using the BAR (3.15), which is repeated here as (A.16). The quantities appearing in (A.16) are introduced below (3.13) in Section 3.1, the operator A is defined in (3.11), and the class of functions D is defined above (3.11). Fix an integer κ ≥ 1 and j ∈ J. We now prove (A.11) by applying the BAR (A.16) to f_{κ,1}(x) = v_j ∧ κ (which belongs to D). For station j, we recall the sequence of service times {T^{(n)}_{s,j}(q), q ≥ 1} and the sequence of departure times {S^{(n)}_j(q), q ≥ 1}, defined in (2.2) and (3.6), respectively. From the form of the test function, we see that f_{κ,1}(X_{δ_m}) = f_{κ,1}(X_{δ_m−}) for all events δ_m that do not correspond to departures from station j. Therefore, the jump term reduces to a sum over departures. In the second equality, we used the independence of T^{(n)}_{s,j}(q) and S^{(n)}_j(q), and in the last equality we use the Fubini-Tonelli theorem, justified by the fact that the summands are non-negative. We set t = 1, take the limit as κ → ∞ on each side, and apply the monotone convergence theorem to conclude the proof of (A.11).
To prove (A.12), we use the test function f_{κ,2}(x) = u_i ∧ κ (with κ ≥ 1 and i ∈ E) and repeat the argument for (A.11), with the departure times replaced by the qth external arrival time to station i ∈ E, as defined in (3.2). An alternative way to prove (A.12) would be to verify that the process E^{(n)}_i is stationary when E^{(n)}_i(0) is initialized according to ν^{(n)}; then (A.12) would follow from [35, Theorem 76].
We now move on to prove (A.13). We recall from (3.9) that a pathwise identity holds on every sample path, where E_j(t) ≡ 0 if j ∈ J \ E and {Φ^{(k)}(m), m ≥ 1} is the customer routing process from (3.4). We first establish a moment identity that would have been immediate had we known that E[L^{(n)}_j(∞)] < ∞. Instead, we use the stationarity of ν^{(n)} to see that the truncated identity holds for every κ ≥ 1. We wish to take the limit as κ → ∞ and apply the dominated convergence theorem. To justify doing so, observe that the truncated quantity is dominated by a random variable whose expectation (with respect to ν^{(n)}) is finite by (3.12). We apply the dominated convergence theorem to obtain the desired identity.

Suppose we knew that (A.19) holds. Letting D^{(n)}(0, t] ∈ R^d be the vector whose elements are D^{(n)}_k(0, t], we would use (2.4) and (A.12) to conclude that E[D^{(n)}(0, t]] = (I − P^T)^{−1} λ^{(n)}_e t = λ^{(n)}_a t, thereby completing the proof of (A.13). It remains to verify (A.19). Observe that φ^{(k)}_j(q) is the indicator that the qth customer departing station k routes to station j, as defined in (3.1); in the second equality we apply the Fubini-Tonelli theorem, justified by the non-negativity of the summands. By definition, φ^{(k)}_j(q) is independent of S^{(n)}_j(q) for all q ≥ 1, and E_{ν^{(n)}}[φ^{(k)}_j(q)] = p_{kj} for q ≥ 1. Repeating the arguments used to obtain (A.17), we conclude (A.19), which completes the proof of (A.13).
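The flow-balance identity behind (A.13) — in steady state, the expected departure flow matches the effective arrival rate — can also be checked by simulation. The sketch below is a minimal single-station illustration with made-up parameters (an M/M/1 queue, so λ_a = λ_e): starting the queue in its geometric stationary distribution, the long-run departure rate matches the arrival rate.

```python
import numpy as np

# Illustrative parameters: stationary M/M/1 queue with lam < mu.
rng = np.random.default_rng(1)
lam, mu, horizon = 0.7, 1.0, 2 * 10**5
rho = lam / mu

# Start from the stationary queue length: P(q = k) = rho^k (1 - rho).
q = rng.geometric(1 - rho) - 1

t, departures = 0.0, 0
while t < horizon:
    rate = lam + (mu if q > 0 else 0.0)     # total event rate in current state
    t += rng.exponential(1.0 / rate)        # time to next event
    if q > 0 and rng.random() < mu / rate:  # departure with probability mu/rate
        q -= 1
        departures += 1
    else:                                   # otherwise an arrival
        q += 1

print(departures / horizon)   # long-run departure rate, close to lam
```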