Tightness of stationary distributions of a flexible-server system in the Halfin-Whitt asymptotic regime

We consider a large-scale flexible service system with two large server pools and two types of customers. Servers in pool 1 can only serve type 1 customers, while server in pool 2 are flexible -- they can serve both types 1 and 2. (This is a so-called"N-system."Our results hold for a more general class of systems as well.) The service rate of a customer depends both on its type and the pool where it is served. We study a priority service discipline, where type 2 has priority in pool 2, and type 1 prefers pool 1. We consider the Halfin-Whitt asymptotic regime, where the arrival rate of customers and the number of servers in each pool increase to infinity in proportion to a scaling parameter $n$, while the overall system capacity exceeds its load by $O(\sqrt{n})$. For this system we prove tightness of diffusion-scaled stationary distributions. Our approach relies on a single common Lyapunov function $G^{(n)}(x)$, depending on parameter $n$ and defined on the entire state space as a functional of the {\em drift-based fluid limits} (DFL). Specifically, $G^{(n)}(x)=\int_0^\infty g(y^{(n)}(t)) dt$, where $y^{(n)}(\cdot)$ is the DFL starting at $x$, and $g(\cdot)$ is a"distance"to the origin. ($g(\cdot)$ is same for all $n$). The key part of the analysis is the study of the (first and second) derivatives of the DFLs and function $G^{(n)}(x)$. The approach, as well as many parts of the analysis, are quite generic and may be of independent interest.

1. Introduction.In this paper we consider a large-scale service system in the so-called Halfin-Whitt asymptotic regime.Such systems received a lot of attention in the literature, especially in the past 10-15 year, because they find a variety of applications, including, e.g., large customer contact centers [1,9] and large computer farms in network clouds.The Halfin-Whitt regime, introduced originally in [12], is such that the system capacity (roughly, number of servers) increases in proportion to a scaling parameter n, and exceeds the system load by O( √ n).It is attractive because it allows -in principle, under a good control algorithm -to achieve both good performance (e.g.waiting times) and high resource utilization.
In the Halfin-Whitt regime, the stochastic process describing the system behavior is usually studied under diffusion scaling, i.e. it is centered at the system equilibrium point and scaled down by n −1/2 .This name reflects the fact that, in the limit on n → ∞, on any finite time interval, the sequence of diffusion-scaled processes Y (n) (•) "typically" converges to a diffusion process Y (•).Then, a fundamental question is whether or not the following limit interchange property holds: the limit of stationary distributions of Y (n) (•) is equal to the stationary distribution of Y (•).In turn, the key The state space of the diffusion-scaled process for N -system has five domains (where the process drift is given by different affine functions).Nevertheless, we construct a single Lyapunov function G(x) on the entire state space, as a functional of the drift-based fluid limits (DFL), which are the deterministic trajectories defined by the drift of the process.Specifically, (1) where y(•) is the DFL starting at x, and g(•) is a "distance" to the origin.For Lyapunov functions of this type, in a setting more general than needed for the proof of Theorem 2, we give sufficient conditions for the tightness in Theorem 5; the key condition a bound on the Lyapunov function second derivatives.This result may be of independent interest.
The proof of Theorem 2 verifies the conditions of Theorem 5 for the N -system.This requires the analysis of the DFL structure, and of the (first and second) derivatives of DFLs and corresponding functionals G(x) on the initial state x; it also requires an appropriate choice of the "distance" g(•).Many parts of this analysis are quite generic and may also be of independent interest.
We note that for a deterministic dynamic system, with trajectories y(•) defined by a continuous derivative-field, the function G(x) given by ( 1) is a natural Lyapunov function (as will be illustrated in Section 2), as long as it is well defined (the integral in (1) is finite).In particular, this observation is used in [13,17] to establish the existence of a Lyapunov function for stable deterministic fluid models.This observation, however, does not imply that G(x) defined by (1) via DFLs y(•) can serve as a Lyapunov function for a (family of) random processes(es).In this paper we give sufficient conditions under which Lyapunov functions G(x) can be used to establish tightness of stationary distributions, and then verify these conditions for the N -system.
1.2.Layout of the rest of the paper.In Section 2, we informally discuss our general approach and the Lyapunov function construction.Section 3 formally defines the N -system, the Halfin-Whitt regime for it, and states the tightness (Theorem 2) and the limit-interchange (Corollary 4) results.
In Section 4, in a setting more general than needed for the N -system, we give a formal construction of the DFLs and the Lyapunov function, and sufficient conditions for the tightness (Theorem 5).Section 5 contains the proof of Theorem 2; here we choose a specific "distance" function g and verify the conditions of Theorem 5 for the N -system.A generalization of the N -system, for which our results still hold, is described in Section 6.Finally, in Section 7, we discuss our approach and results.
1.3.Basic notation.Symbols R, R + , Z, Z + denote the sets of real, real non-negative, integer, and integer non-negative numbers, respectively.In the Euclidean space R I (of dimension I ≥ 1): |x| denotes standard Euclidean norm of vector x = (x 1 , . . ., x I ), while x = i |x i | denotes its L 1norm; scalar product of two vectors is denoted x • y = i x i y i ; diag(x) denotes diagonal square matrix, with diagonal elements given by x; we write simply 0 for a zero matrix or vector; vectors are written as row-vectors, but in matrix expressions they are viewed as column-vectors (without using a transposition sign).For real numbers u and w: u ∨ w = max{u, w}, u ∧ w = min{u, w}, and u denotes the largest integer not greater than u.
For a vector-function y(•) = (y(t), t ≥ 0), we denote y(•) = sup [0,∞) y(t) .Abbreviation u.o.c.means uniform on compact sets convergence.If X(t), t ≥ 0, is a Markov process, we write X(∞) for a random element with the distribution equal to a stationary distribution of the process.(In all cases considered in this paper, the stationary distribution will be unique.)Symbol ⇒ denotes convergence in distribution of random elements; random processes are random elements in the appropriate Skorohod space.For a condition/event H, the indicator function I{H} is equal to 1 when H holds and 0 otherwise.

2.
The intuition for the Lyapunov function construction.The discussion in this entire section is informal.Consider a deterministic dynamic system governed by ODE where state y is a vector, and the vector-field v(•) is Lipschitz continuous.Suppose the system has unique stable point 0. Let g(x) be a non-negative continuous (and sufficiently smooth) function, which measures a "distance" from 0. (In our results, we will use g(x) which is a smooth approximation of L 1 -norm x .)Suppose that for any initial state y(0) = x the trajectory y(t), t ≥ 0 converges to 0 and, moreover, (2) Then, obviously, G(•) is a Lyapunov function for this dynamic system, in the sense that where G denotes the gradient of G.
Suppose now that instead of a deterministic system we have a Markov process Y (•), for which vector-field v(•) gives the drift.Then we can define deterministic trajectories y(•), and function G(•), the same way as above.(The trajectories y(•) we call drift-based fluid limits (DFL).)Suppose further that the process generator A is such that where C is a constant and G denotes the Hessian matrix of second derivatives.(To interpret (3) one can think, for example, of a diffusion process with bounded diffusion coefficients.In this paper we will work not with diffusion processes, but rather with diffusion-scaled processes for our queueing system -their behavior can be very different from that of diffusions, especially when the system state is "far" from the equilibrium point.Nevertheless, the process generator will have form (3).) Then, we have If we can show that ( 4) with a sufficiently small C 1 , then for some > 0, This is a Lyapunov-Foster type condition from which we can obtain the steady-state bound Eg(Y ) ≤ C 3 / .
Finally, suppose we consider a family of processes Y = Y (n) , with the drift v(•) and generator A depending on n.If for some common function g such that g(x) → ∞ as x → ∞, we can derive estimates (3)-( 5) with constants independent of n, then the family of stationary distributions of This is the program that we implement in this paper, for the sequence of diffusion-scaled processes for the N -system.The difficult part is obtaining the second derivative bound (4).Since G is defined as a functional of the DFLs y(•), this involves the analysis of the dependence of DFLs on the initial state.
3. N -system with absolute priority.Consider a so-called N -system, with absolute priorities.(See Fig. 1.)There are two customer types, arriving according to as independent Poisson processes with rates Λ 1 > 0 and Λ 2 > 0, respectively.There are two server pools, with B 1 and B 2 identical servers, respectively.The total service requirement of any customer is an independent, exponentially distributed random variable with mean ) ∨ 0 wait in the queue.(Here ∧ and ∨ denote minimum and maximum, respectively.)Therefore, the total service rate of all type 2 customers is The type 1 customers have absolute preference to be served in pool 1, and have lower preemptresume priority in pool 2. Namely, if there are X 1 type 1 customers in the system, then ∨ 0] are served in pool 2, and the remaining [X 1 − (B 1 + B 2 ) + (X 2 ∧ B 2 )] ∨ 0 wait in queue.The total service rate of all type 1 customers is We consider a sequence of such systems, indexed by a positive scaling parameter n, increasing to infinity.(See Fig. 2.) In a system with parameter n, where the positive parameters b, λ Given this definition, and the priorities, the system "desired operating point," which we will refer to as equilibrium point, is such that X 2 = ψ 22 n and X 1 = ψ 11 n + ψ 12 n, where type 1 customers occupy the entire pool 1 and ψ 12 n servers in pool 2; the equilibrium point is such that b √ n servers in pool 2 are empty -this is the "margin" by which system capacity exceeds its load.(Again, see Fig. Remark 1.To be precise, in the definition of the sequence of systems, we need to make sure that B 1 and B 2 are integer.Equations ( 9), as written above, assume that B 1 and B 2 "happen to be" integer.We make this assumption throughout the paper to simplify the exposition, while maintaining rigor of the results and arguments.More specifically, we could replace (9) with, for example, (11) If we do that, it is easy to check that for each n we can choose numbers ψ 11), ( 12), (22), and b (n) , such that: |ψ n for some constant C 55 > 0; (11) can be rewritten as ( 12) and (10) can be rewritten as ( 13) The sequence of systems will then be defined by ( 8), ( 12), (13).Then, the entire analysis in this paper will hold as is, with ψ ij and b replaced everywhere with ψ (n) ij and b (n) , respectively.(We note that the components of the equilibrium point, namely 12 n, need not be integer.) It is easy to see that for each n the process is continuous-time countable irreducible Markov chain, with the state space being (for each n) Z 2 + .Further, it is not difficult to check that, for each sufficiently large n, this Markov process is positive recurrent, and therefore has unique stationary distribution.Indeed, due to absolute priority, type 2 customers "do not see" type 1, and therefore X steady-state occupies on average ψ 22 n servers in pool 2. This means that on average ψ 12 n + b √ n servers in pool 2 are available to serve type 1 customers; this is in addition to all ψ 11 n servers on pool 1 which are available exclusively to type 1; therefore, the average total service capacity available to type 1 is We omit further details, which are rather straightforward.
The diffusion-scaled version X(n is defined by centering at the equilibrium point and rescaling by 1/ √ n: Theorem 2. For some C > 0 and all sufficiently large n, The proof of Theorem 2 is given in the rest of this paper.It relies on a family of Lyapunov functions (indexed by n), each being a functional of a fluid models, determined by the process drift.Such fluid models will be referred to as a drift-based fluid limits (DFL).In the rest of this section we define DFLs for the N -system under consideration, and give motivation for the form of Lyapunov function.
Then, in Section 4, we give the Lyapunov function construction and sufficient tightness conditions (Theorem 5) in a setting that is more general than needed for the N -system.In the following sections we verify the conditions of Theorem 5 for the N -system, thus proving Theorem 2.
For each n, for the unscaled process X (n) (•), we define a drift function (vector field) 2 ) for x = (x 1 , x 2 ) ∈ R 2 + .(Note that it is defined on R 2 + , and not just on the lattice Z 2 + .)It is defined in the natural way, as the difference of arrival and service rates (see ( 6)-( 7)): (15) where Λ 1 , Λ 2 , B 1 , B 2 are the functions of n givem in ( 8)- (10).
Let us denote by L n the affine mapping X (n) → X(n) , defined by (14).Then, the state space of The drift function for X(n) is defined accordingly: We emphasize, that v (n) (x) is defined on the continuous, convex set X (n) , which contains the discrete state space S (n) .It is important, however, that at each point x ∈ S (n) , v (n) (x) gives exactly the average drift of the process.Namely, where ν (n) (x, x ) is the Markov process transition rate from state x to state x ; note that there is only a finite number of "neighbor" states x for which ν (n) (x, x ) > 0.
It is easy to observe that v (n) (x) = 0 if and only if x = 0; also, uniformly in n, v which is easily seen to stay within X (n) for all t ≥ 0. This trajectory y (n) (t), t ≥ 0, will be called drift-based fluid limit (DFL), starting from x.
As we will show later in Section 5.1, each DFL y (n) (t) → 0 as t → ∞.Moreover, after a finite time, this convergence is exponentially fast, so that The Lyapunov function we will use to prove Theorem 2 is where y (n) (•) is the DFL starting from x, and g(•) is a smooth non-negative function (common for all n) approximating • .
Remark 3. Deterministic trajectories defined by the drift vector field, which we call DFL, have been considered in the literature; see e.g.[3], where they are called fluid models.However, the way we use DFLs in this paper -namely, to directly construct a Lyapunov function from them -is completely different from their use in [3].
3.1.Limit interchange.We conclude this section by noting that the tightness of stationary distributions of the processes X(n) (•), which follows from Theorem 2, allows us to easily establish the limit interchange result, given in Corollary 4 below.Observe that v for all large n.) Corollary 4. The following convergence holds where where W 1 , W 2 are independent standard Brownian motions and the diffusion coefficients are The proof is fairly straightforward, we just give an outline.First, the following convergence on a finite interval holds (see e.g.[11]).Namely, consider a sequence of processes X(n) (•) with fixed initial states X(n) (0) → x ∈ R 2 .Then, for any fixed where X(•) is a strong solution of (19) with initial state X(0) = x.Then, (18) can be established, together with the existence and uniqueness of a stationary distribution of X(•), as follows.We consider the sequence of stationary versions of the processes X(n) (•) on a fixed finite time interval [0, T 0 ], and let n → ∞.Given tightness of stationary distributions of pre-limit processes, we can choose a subsequence along which X(n) (0) ⇒ X(0) for some random vector X(0); then we also have X(n) (T 0 ) ⇒ X(0).We then use (20) to show that the distribution of X(0) must be a stationary distribution of X(•).The uniqueness of the latter stationary distribution is easy to establish, for example, using a coupling argument.
4. Lyapunov function construction and a tightness criterion.The model in this section is quite general (including the N-system as a special case).For this model we define DFLs, construct a functional of DFL, and give sufficient conditions under which this functional can serve as a Lyapunov function to prove tightness of stationary distributions.The section is self-contained, because its main construction and result may be of independent interest.However, it may help the reader to keep the N-system described in Section 3 in mind as an example, to make the material more concrete.
Let I ≥ 1 be a fixed positive integer.For each n = 1, 2, 3, . .., we consider a Markov chain X(n) (t), t ≥ 0, with a countable state space S (n) which has the form where X (n) is a convex closed subset of R I , containing 0, and The Markov chain is irreducible positive-recurrent and is such that the total transition rate out of any state is upper bounded by R 1 n and any single transition has the jump size of at most R 2 / √ n, where R 1 , R 2 are positive constants independent of n.Defined on X (n) is a drift function (vector field) v (n) (x), which is Lipschitz continuous uniformly in n.At each point x ∈ S (n) , v (n) (x) gives exactly the average drift of the process.Namely, where ν (n) (x, x ) is the Markov process transition rate from state x to state x ; note that there is only a finite number of "neighbor" states x for which ν (n) (x, x ) > 0.
For any x ∈ X (n) , there is a unique solution y (n) (t), t ≥ 0, to the ODE which stays within X (n) .This solution is called drift-based fluid limit (DFL), starting from x.
Suppose a continuous non-negative function g where y (n) (•) is the DFL starting from x.
Denote by ∇ z G (n) (x) the directional derivative of G (n) at x ∈ X (n) in the direction of vector z ∈ R I : imsart-ssy ver.2011/12/06 file: hw-tightness.texdate: March 20, 2014 when the limit exists.(To be precise, if x in on the boundary of X (n) , it is also required that the direction z from x points into X (n) .)Then, ∇ z * [∇ z G (n) ](x) is the second derivative, first in the direction z and then z * .
Theorem 5. Suppose that for any C 1 > 0, there exists a function g(x), x ∈ R I , and a constant C 2 > 0, such that the following conditions hold uniformly in n.
(i) g(x) is Lipschitz continuous non-negative and such that g(x) → ∞ as x → ∞.
(ii) Function G (n) (x), x ∈ X (n) , is finite for all x; it has continuous gradient ∇G (n) (x); for any x and any fixed unit-length vectors z, z * ∈ R I , (24) Then, for some C 3 > 0, uniformly in n, The second derivative condition ( 23) is the key one.It implies that this second derivative exists.An equivalent form of ( 23) is as follows: for any compact set D ⊆ X (n)  and any unit-length vector z ∈ R I , the first derivative Proof of Theorem 5.By definition of G (n) and its assumed continuous differentiability, (25) Let A (n) denote the infinitesimal generator of the Markov process X (n) .For any fixed k > 0, the function ∧ k is such that it has constant value k for all states x ∈ S (n) except a finite subset.Then it is easy to see that function G (n),k is within the domain of A (n) and ( 26) (See also [7], page 31, for more details.)For all x ∈ S (n) we have where R 2 / √ n is the maximum possible size of one jump of the process, r (n) (x) ≤ R 1 n is the total transition rate from state x, and the second-term coefficient h Recalling also (25), we obtain We choose a sufficiently small constant C 1 > 0 (and then a corresponding function g and constant C 2 ), so that imsart-ssy ver.2011/12/06 file: hw-tightness.texdate: March 20, 2014 Therefore, from (26) we obtain Letting k → ∞, by monotone convergence, The constant in the RHS is same for all n. 2 5. Proof of Theorem 2. We will prove Theorem 2 by choosing specific function g(•) and then verifying (in Theorem 10) the assumptions of Theorem 5 for N -system.
In this section, we study properties of DFL trajectories and their G (n) -functionals, for a system with a fixed scaling parameter n.We will drop upper index (n) from now on.So, for example, will write simply X and y(t) instead of X (n) and y (n) (t), respectively.(However, the expressions may contain n as a variable.)From this point on in the paper, we say that C is a universal constant if C depends only on the system parameters λ i , ψ ij , µ ij , b, but does not depend on scaling parameter n. (If the sequence of systems is defined as in Remark 1, then a universal constant C depends on the values ψ ij and b, and not on the sequences ψ In this subsection we first establish some basic properties of DFLs and their directional (Gateaux) derivatives.
Then we specify function g(•), and obtain the expressions for the first derivatives of the corresponding function G(•).(All results of this subsection hold for systems far more general than N-system.In particular, they still hold for the systems under the Leaf Activity Priority LAP discipline in [14,15], in the Halfin-Whitt regime; our priority discipline for the N -system is a special case of LAP.) The DFL trajectories y(•) have the following structure.Recall that v(x) is (uniformly in n) Lipschitz continuous on the entire X .There is a finite number M (same for any n) of domains, indexed by m = 0, . . ., M − 1; within each of them v(x) is a given linear function.More precisely, the DFL satisfies a linear ODE where u m is a constant I × I matrix (same for each n), and a m is a constant vector (depending on n).Informally speaking, a domain is determined by which service pools a fully occupied and/or which queues are non-empty.
Formally, the domains are easier to define (and think of) in terms of unscaled quantities X 1 ≥ 0 and X 2 ≥ 0, and unscaled pool sizes B 1 = ψ 11 n and B 2 = ψ 12 n + ψ 22 n + b √ n.Each domain is defined by a combination of the directions of three strict inequalities: If , and For a given fluid trajectory, let us call time point t ≥ 0 a switching point if y(t) belongs to the intersection of two or more closed domains X m .(i.e. it is on a boundary separating different domains).
(ii) DFL y(•) depends on its initial state x continuously, in the sense of y(•) -norm.
(iii) DFL y(•) has at most K switching points, t 1 < t 2 < . . .< t K , 0 ≤ K ≤ K , and t K < x T .Moreover, the set of switching points is upper semicontinuous: as a DFL initial state converges to x, the limiting points of the set of switching points are within the set of switching points for initial state x.
(iv) For any interval [C 3 , C 4 ], not containing 0, there exists a constant T 3 > 0 (independent of n), such that the total time the condition y i (t) ∈ [C 3 , C 4 ] holds for at least one i, is upper bounded by T 3 .
Proof of Lemma 6.Given equation (30), condition y 2 (t) = ψ 12 √ n + b ( X 2 = B 2 ) can hold at most at one point t 2 ≥ 0, which will be a switching point.Similarly, by (31), there is at most one point t 1 ≥ 0, at which condition y 1 (t) = −ψ 12 √ n (corresponding to X 1 = B 1 ) can hold, and if so, it will be a switching point.
) hold.Therefore, y(t) can be only in one of the two domains X 0 or X 1 , depending on whether y 1 + y 2 ≤ b (no queues) or y 1 + y 2 ≥ b (queue size y 1 + y 2 − b of type 1).It is easy to see from equations (d/dt)y 2 = −µ 22 y 2 , (33), (34), that if y(t) is in X 1 , then the trajectory eventually leaves X 1 and can never return.This implies that at most two transitions between X 0 and X 1 can occur after t .Specifically, either the trajectory stays in X 0 , or it is in X 1 and then X 0 , or it is in X 0 then X 1 then X 0 .The boundary cases are also possible; for example, the trajectory may stay in the open domain y(t) ≤ C x for some universal positive constants T, C , C 37 .Obviously, t ≥ τ , so that τ ≤ T x + C 37 .However, if x ≤ α, i.e. y(0) = x is already in X 0,α , then obviously τ = 0. Therefore, in the bound τ ≤ T x + C 37 , we can drop C 37 by rechoosing T , if necessary.
For future reference, we also make the following observation.Suppose, µ 12 = µ 22 .Then, there can be at most one switching point after time t , let us call it t 3 ≥ t , and it is such that y(t) ∈ X 0 for all t > t 3 .Indeed, in this case, in the domain X 0 , we have simply Let us prove properties (i)-(iv).In fact, (i) has been proved already.For a given x, let us choose τ such that τ < τ for all initial states sufficiently close to x. (On a finite interval [0, τ ], y(•) depends on the initial state continuously, because it is a solution to an ODE with Lipschitz continuous RHS.) But, for t ≥ τ , the DFL with any initial state close to x is such that y(t) ∈ X 0,α ; this implies uniform convergence across all t ≥ 0, which proves (ii).The part of property (iii), stating that there is at most K switching points, all of which are smaller than τ ≤ T x , has already been proved, in fact we specified that K ≤ 4.Then, the upper continuity of the set of switching points follows from continuity of trajectories w.r.t.initial state; this proves (iii).Consider a fixed interval [C 3 , C 4 ], not containing 0. It is clear from (30) that y 2 (t) can spend only a finite time within [C 3 , C 4 ].Now, y 1 (t) can be in [C 3 , C 4 ] only after time t 1 , and then in every domain the trajectory visits y 1 (t) satisfies one of the equations (32)-(34).If we examine each of these equations (and recall that (34) holds within domain X 1 , where (d/dt)y 2 = −µ 22 y 2 ), we see that even if the equation were to hold up to infinite time, y 1 (t) can spend only a finite time within [C 3 , C 4 ].And there is only a finite, uniformly bounded number of domains that a trajectory can visit.This proves (iv). 2 Next, let us consider the first-order dependence of DFL on the initial state.Let y(t; x) denote y(t) with initial state y(0) = x ∈ X .For any x ∈ X and any direction z ∈ R I (which does not point outside X ), we use the following notation for the directional (Gateaux) derivative of y(t; x) at x in the direction z: Theorem 7. (i) For any fixed x ∈ X and a fixed vector z, the directional derivative exists.It has the following structure.Let 0 < t 1 < t 2 < . . .< t K be the switching points of y(•; x).Then, ξ(0) = z, and in each interval where matrix u m is the matrix u for the domain X m containing y(t; x).Solutions q(t), t ≥ 0, to the equation (d/dt)q = u m q, for any m, are such that for a universal constant C 7 > 0.
(iii) There exists a universal constant C 8 > 0, such that Proof.The proof of (i) relies on the following observations.(a) In any time interval, where both y(t; x + zδ) and y(t; x) are within same domain X m , they are governed by the same ODE (d/dt)y = v m (y), and therefore their difference ∆y(t) = y(t; x + zδ) − y(t; x), is governed by the linear homogeneous ODE (d/dt)∆y = u m ∆y.Moreover, it is easy to check that within any domain X m the corresponding matrix u m is such that ∆y(t) can increase at most by some universal factor C 8 .Indeed, consider ∆y 2 first, and then ∆y 1 .The equation for ∆y 2 is either we also note that if ∆y 2 satisfies (38) then necessarily u m 12 = 0. We see that in any case, in any time interval, |∆y 2 (t)| is upper bounded by the initial ∆y 2 times a universal constant.This observation, in particular, proves (36).(b) The total length of "switching intervals", where y(t; x + zδ) and y(t; x) belong to different domains vanishes as δ → 0 (by upper semicontinuity of the set of switching points), and therefore the total change of ∆y(t) within those intervals is "small".More precisely, let t be fixed and [θ 1 , θ 2 ] be a switching interval such that θ 1 , θ 2 → t.Then, ∆y(θ 2 ) − ∆y(θ 1 ) / ∆y(θ 1 ) → 0, because v(x) is Lipschitz.Combining observations (a) and (b), and further observing that the number of intervals where both y(t; x + zδ) and y(t; x) are within same domain X m (i.e.outside the switching intervals) is upper bounded, we take the δ ↓ 0 limit to obtain (i).
(ii) This follows from the continuity of the set of switching points on x.
(iii) By (36), in any domain ξ(t) can increase at most by some factor C 7 .There is only a finite number of domains that y(t) visits.This proves (iii). 2 We now introduce a specific function g, which we will use in the definition (22) of the Lyapunov function W . Definition 8. Let parameter C > 0 be fixed.Let a function f (η) of real η be fixed, which satisfies the following conditions.It is a non-negative, even, convex, twice continuously differentiable, (Such a function can be defined explicitly.Since C is a parameter, essentially, we fix the shape of function f (C + ζ), ζ ≥ 0.) Note that both f and f are uniformly bounded, and f = 0 outside of the intervals Obviously, |f (η) − |η| | is uniformly bounded by a constant, and then so is |g(x) − x |.
Then, by ( 22) we have Theorem 9.For i = 1, 2, the following holds.For any x ∈ X and any direction vector z, Proof.Expression (40) follows from Theorem 7(i) and the fact that f is continuous bounded.The continuity of ∇ z G i (x) is obtained using Theorem 7(ii). 2 5.2.Second derivative bounds for the Lyapunov function.
Theorem 10.The assumptions of Theorem 5 hold for a function g in Definition 8, with appropriately chosen parameter C > 0.
Note that for a function g satisfying Definition 8, condition (i) of Theorem 5 holds automatically.Condition (24) is also automatic given the definition of G and basic properties of DFL, namely the fact that the time for a DFL to reach a given compact set increases to infinity as x → ∞.Therefore, to prove Theorem 10, it remains to prove condition (23), and it suffices to prove it separately for G i , i = 1, 2 (see (39)).We will do this first for the case µ 22 = µ 12 , and then for µ 22 = µ 12 .(The proof of condition (23) in this section applies to the N-system, as well as its generalization described in Section 6.It does not apply for LAP discipline.) For a given x and a time τ * > 0, denote by S(τ * ; x) the set of time points, consisting of τ * and all switching points 0 ≤ t < τ * of the DFL y(•; x).
Lemma 11.Suppose µ 22 = µ 12 .For any > 0 there exists a sufficiently large C 9 > 0, such that, for all sufficiently large n, the following holds for any fixed x and any unit-length vector z.Let τ 9 be the first time the DFL y(•; x) hits set { y ≤ C 9 }.Then for all sufficiently small δ > 0, any point in S(τ 9 ; x + zδ) is within distance at most δ from a point in S(τ 9 ; x).Recall that to prove Theorem 10, it remains to prove the second derivative condition (23).The proper second derivative may not exist, hence we must "settle" for the estimate (23).But, to illustrate the proof that follows, let us write down the expression for the second derivative, by formally applying ∇ z * differentiation to (40) (this expression is not used in the proof): Proof of Theorem 10, case µ 22 = µ 12 .We choose small > 0 and then C 9 > 0 as in Lemma 11.Then choose parameter C > 0 of function g large enough so that any DFL starting from the set { y ≤ 2C 9 } never hits set { y ≥ C}. (We can do this by Lemma 6(i).) (The integrals of the terms (44) and (45), correspond to the integrals (42) and (43), respectively, in the formal second derivative expression.)We consider the third switching point, and so on.We see that the first-order component of ξ(t; x + z * δ, z) − ξ(t; x, z) will be upper bounded by C * δ, for a sufficiently large universal C * .(There will be also higher order terms δ , ≥ 2, with uniformly bounded coefficients.)This proves claim (46).
Now, for all sufficiently small δ, the integral of the term (45), In particular, in a small neighborhood of time t, (d/dt)[y 1 (t) + y 2 (t)] ≤ −(b/2)µ 22 < 0. These facts imply that the switching interval, corresponding to switching time t, is such that its end points are within C 40 δ from t, for some universal constant C 40 > 0. This means that the contribution of this last switching interval, as well as of the remaining time interval up to the time τ 9 , into the integral of (45), is upper bounded by a universal constant C 41 > 0. 2 6.Generalization of the N -system.Theorem 2, along with its proof, easily extend to the generalization of N -system, shown in Figure 3, in the Halfin-Whitt regime.The system has two customer types and arbitrary number of server pools.There is exactly one server pool that is flexible, i.e. can serve both types.(On Figure 3, it is the pool in the middle.)Each of the remaining pools is dedicated to service of either type 1 or 2. (The two pools on the left in the figure are dedicated to type 1, while the two pools on the right -to type 2.) Each customer type has absolute preference for its dedicated server pools, in some fixed priority order, over the flexible pool.In the flexible pool, the absolute preemptive priority is given to one of the types.
The key features that the generalized system shares with the N -system are that there are two customer types and only one flexible server pool, which can be shared by the customers of different types.These features are exploited in Section 5.2, where we estimated second derivatives of the Lyapunov function.(We note again that all results in Section 5.1, which concern with first derivatives, hold for far more general systems, e.g.those under LAP discipline [14,15].)The behavior of the DFLs for the generalized system is more complicated, simply because the number of state space domains can be very large.However, as in the N -system, after a finite time all dedicated server pools stay fully occupied, which means that the DFL dynamics depends only on "what happens" in the flexible pool.Consequently, our analysis goes through with very minor adjustments.

Discussion
. In this paper we address the problem of tightness of stationary distributions, and the limit interchange, for flexible multi-pool service systems in the Halfin-Whitt regime.The behavior of such systems can be very complicated, which makes the problem challenging.This is, in particular, due to the difficulty of constructing Lyapunov functions.We propose an approach which uses a single Lyapunov function, defined as an integral functional of the drift-based fluid limits (DFL) y(•): G(x) = ∞ 0 g(y(t))dt, y(0) = x.The problem then reduces to studying the (first and second) derivatives of a DFL -and the corresponding integral G(x) -on the initial state x.We apply this approach to show the tightness property for the N -model under a priority discipline.
Both the approach and many parts of our analysis are quite generic and might be applicable to other models as well.In this respect, note that there is a lot of flexibility in choosing the "distance" imsart-ssy ver.2011/12/06 file: hw-tightness.texdate: March 20, 2014 function g(•).It might also be possible to combine the approach with other approaches.For example, a Lyapunov function of the type we consider could be defined and applied on a subspace, if it could be shown by other means that the stationary distributions concentrate (in appropriate sense) on that subspace.Exploring these directions may be a subject of future research.

1 .
Basic DFL properties.First derivatives of DFLs and the Lyapunov function.

Fig 3 .
Fig 3. A more general system.
The constants C 14 and C * are universal, while C 15 depends on C, which depends on C 9 , which depends on .It remains to choose small enough so that C 14 C * < C 1 .Then the value of C 14 C 15 C * , plus the corresponding upper bound on the integral of (44), gives constant C 2 . 2 Proof of Theorem 10, case µ 22 = µ 12 .This case is treated the same way as µ 12 = µ 22 , with the following modifications.If there is no switching point t ∈ S(τ 9 ; x), associated with equality y 1 (t) + y 2 (t) = b, then the proof is unchanged.Suppose there is a switching point t ∈ S(τ 9 ; x), associated with equality y 1 (t) + y 2 (t) = b.Then, in the notation of the proof of Lemma 6, we must have t ≥ t , and by the observation we made in that proof, t is the last switching point, and therefore it is the only switching point associated with equality y 1 (t) + y 2 (t) = b.Moreover, all the properties we established in the µ 22 = µ 12 case proof, still apply to all switching points before t.After time t, the process stays within the domain X 0 , and therefore (d/dt)[y 1 (t) + y 2 (t)] = −µ 22 [y 1 (t) + y 2 (t)].