Open Access

Jumping Fluid Models and Delay Stability of Max-Weight Dynamics Under Heavy-Tailed Traffic

Arsalan Sharifnassab
Corresponding Author
Arsalan Sharifnassab
[email protected]
https://orcid.org/0000-0002-3910-2878
Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2R3, Canada;
Search for more papers by this author
,
John Tsitsiklis
John Tsitsiklis
[email protected]
https://orcid.org/0000-0003-2658-8239
Laboratory for Information and Decision Systems, Electrical Engineering and Computer Science Department, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
Search for more papers by this author

Arsalan Sharifnassab

Corresponding Author

Arsalan Sharifnassab

[email protected]

https://orcid.org/0000-0002-3910-2878

Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2R3, Canada;

Search for more papers by this author

John Tsitsiklis

[email protected]

https://orcid.org/0000-0003-2658-8239

Laboratory for Information and Decision Systems, Electrical Engineering and Computer Science Department, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Search for more papers by this author

Published Online:14 Apr 2023https://doi.org/10.1287/stsy.2023.0110

Abstract

We say that a random variable is light-tailed if moments of order $2 + ϵ$ are finite for some $ϵ > 0$ ; otherwise, we say that it is heavy-tailed. We study queueing networks that operate under the max-weight scheduling policy for the case in which some queues receive heavy-tailed and some receive light-tailed traffic. Queues with light-tailed arrivals are often delay stable (that is, expected queue sizes are uniformly bounded over time) but can also become delay unstable because of resource sharing with other queues that receive heavy-tailed arrivals. Within this context and for any given “tail exponents” of the input traffic, we develop a necessary and sufficient condition under which a queue is robustly delay stable, in terms of jumping fluid models—an extension of traditional fluid models that allows for jumps along coordinates associated with heavy-tailed flows. Our result elucidates the precise mechanism that leads to delay instability through a coordination of multiple abnormally large arrivals at possibly different times and queues and settles an earlier open question on the sufficiency of a particular fluid-based criterion. Finally, we explore the power of Lyapunov functions in the study of delay stability.

1. Introduction

We say that a random variable is light-tailed if moments of order $2 + ϵ$ are finite for some $ϵ > 0$ ; otherwise, we say that it is heavy-tailed. We study queueing networks that operate under the max-weight (MW) scheduling policy for the case in which some queues receive heavy-tailed traffic, whereas some other queues receive light-tailed traffic; our motivation stems from the fact that heavy-tailed processes are often natural models of the inputs to computer and communications networks (Foss et al. 2011). Queues that receive heavy-tailed traffic are naturally delay unstable, that is, they incur infinite expected delay as an immediate consequence of the Pollaczek–Khintchine formula. However, it is also known that, because of the relatively complex max-weight dynamics, some of the queues that receive light-tailed traffic may also end up delay unstable.¹ Our aim is to develop conditions that determine whether any particular queue is delay stable or not.

This problem is studied extensively (Markakis 2013; Markakis et al. 2014, 2018; Nair et al. 2015), and a necessary condition for delay stability is given in Markakis et al. (2014, 2018). In particular, Markakis et al. (2018) consider the associated fluid model, initialized at zero, except for a positive initial condition at some queue that receives heavy-tailed arrivals. If another queue happens to eventually become positive under that fluid model, one can then conclude that the latter queue is delay unstable. This result leads to the natural question whether this condition is also sufficient, that is, whether delay stability is guaranteed when any such fluid trajectory (with a positive initialization at any single one of the queues that receive heavy-tailed traffic) keeps the queue of interest at zero level. Such a sufficiency result might appear plausible because, in models involving heavy-tailed random variables, large fluctuations are often the consequence of a single abnormally large value in the underlying heavy-tailed random variables (Pakes 1975, Veraverbeke 1977, Anantharam 1989, Asmussen 1996, Durrett 1980, Foss et al. 2011).

1.1. Our Contributions

We start in Section 3 by showing that the aforementioned possible sufficiency result does not hold. We accomplish this by providing a fairly simple example in which a large arrival at any single heavy-tailed queue does not cause a certain queue of interest to grow, but a combination of two large arrivals, at two different heavy-tailed queues, can result in large backlogs at the queue of interest. We also provide necessary and sufficient conditions for delay stability in that particular example, and then provide intuition for a possible more general result. Interestingly, delay instability manifests itself only when the tail exponents (defined precisely in Section 2.2) of the heavy-tailed arrival processes lie in a specific range. As a consequence, we need to take these exponents explicitly into account, something that traditional fluid models cannot do.

We then generalize, by developing general and tight (necessary and sufficient) conditions for delay stability, in terms of deterministic fluid-like models in which there can be multiple jumps (in different queues and possibly at different times) at the heavy-tailed queues.

Our conditions are not easy to test computationally, but this seems unavoidable: because the conditions are necessary and sufficient, the complexity of testing them reflects the intrinsic complexity of testing delay stability. On the positive side, our conditions

Provide a conceptual understanding of the mechanism that results in delay instability.
Can be checked in special cases, for instance, for the example in Section 3; see Section 5.3.

On the technical side, our general conditions involve a small but technically crucial reformulation of the delay stability problem. To understand the underlying issue, note that fluid models do not always lead to definite conclusions when the underlying system is marginally stable but are generally effective when used to analyze “robust” properties. For this reason, we consider a network with given nominal arrival rates and focus on robust stability. Namely, we ask whether a certain queue is delay stable for all arrival processes with given tail exponents and for all (possibly time-varying) arrival rates that lie in some open ball around the nominal ones. When the problem is framed that way, definitive necessary and sufficient conditions for (robust) delay stability become possible. Furthermore, with this formulation, it is only the nominal arrival rates and the tail exponents that matter as opposed to the details of the distribution of the input traffic.

Finally, earlier works Markakis (2013) and Markakis et al. (2018) show that Lyapunov functions with certain structural properties can be used to certify delay stability. But it was not known whether this methodology is “complete,” that is, whether delay stability can always be established through a suitable Lyapunov function. Our results make progress in this direction, for the case of “very heavy” tails, that is, when there exists some $γ > 0$ for which arriving traffic moments of order $1 + γ$ are infinite but γ can be arbitrarily small. For this regime, we derive some necessary and sufficient conditions for delay stability: we show that a Lyapunov function of a special kind can be used to certify delay stability together with a partial converse.

1.2. Related Works on Multiple Big Jumps

As already mentioned, in several systems that involve heavy-tailed random variables, large fluctuations are the consequence of a single large jump in the underlying heavy-tailed variables. However, this is not always the case. There are known systems that have small fluctuations under a single large jump, and large deviations can only arise as a consequence of multiple jumps in heavy-tailed random variables with suitable timing. Early examples are studied in Jelenković and Momčilović (2003) and Zwart et al. (2004) for a single-buffer system with multiple on/off input processes with heavy-tailed periods. Furthermore, Foss and Korshunov (2012) provide conditions on the finiteness of moments of queue sizes in a multiserver G/G/s queue and show the special effects caused by multiple jumps. Perhaps the closest work to ours that demonstrates the necessity of multiple big jumps is Chen et al. (2019), who propose a rare-event simulation technique to estimate the probability of rare events in stochastic systems that receive heavy-tailed inputs. As an application, they consider a queuing network with fixed fluid services and fixed routing in which each queue receives independent heavy- or light-tailed exogenous arrivals. They characterized the tail exponent of the queues in terms of a knapsack-type constraint that involves the sum of numbers of large jumps in the different inputs that are required to make the queue of interest large, in which the sum is weighted by the tail exponents of the corresponding inputs. Such a knapsack-type constraint also appears in our bounds, and the tail exponent fully determines the delay stability/instability of the queue of interest. Thus, our results parallel those in Chen et al. (2019) but for a different setting. The main differences are that our network operates under the more complex max-weight scheduler, and our assumptions allow for non–independent and identically distributed (i.i.d.) arrival processes. Furthermore, our proof techniques are very different from those in Chen et al. (2019); it would be interesting to explore whether the techniques in Chen et al. (2019) can be used to obtain alternative proofs for our setting (Theorem 1).

1.3. Outline

The rest of the paper is organized as follows. We start in the next section with the details of our model and some definitions. In Section 3, we discuss an example that provides insights on the ways that arrival rates and tail exponents affect delay stability and demonstrate that criteria based on traditional fluid models are inadequate for the purpose of deciding delay stability. In Section 4, we introduce a fluid-like model, which we call the jumping fluid (JF) model and underlies our main result, the necessary and sufficient conditions for (robust) delay stability that we present in Section 5. In Section 6, we study the power of Lyapunov functions for our problem. In Sections 7–9, we provide the proofs of our results. As the proofs are quite involved, they are presented as a sequence of lemmas with the proofs of the lemmas provided in Appendices B and C. We discuss the results and directions for future research in Section 10. Finally, in Appendix A, we explore alternative definitions of robust stability and corresponding variants of our jumping fluid conditions and explain why they are unlikely to yield sharp necessary and sufficient conditions.

1.4. Notation

We collect here some notational conventions to be used throughout the paper. We use boldface symbols to denote vectors and ordinary font to denote scalars. For any vector v, we use v_i to denote its ith component and $| v |$ to denote the sum $| v_{1} | + \dots + | v_{n} |$ . We also use the notation ${[v]}^{+}$ to denote the vector with components $\max {0, v_{i}}$ . Finally, we let e_j stand for the jth unit vector.

We use $R_{+}$ and $Z_{+}$ to denote the sets of nonnegative reals and nonnegative integers, respectively. Furthermore, for a vector v, we write $v ⪰ 0$ (respectively, $v ≻ 0$ ) to indicate that all components are nonnegative (positive). For any set S, we denote its convex hull by $conv (S)$ .

Throughout, $‖ \cdot ‖$ stands for the Euclidean norm. Sometimes, we use the alternative notation $d (x, y)$ in place of $‖ x - y ‖ .$ We also let $d (x, S)$ be the distance of a vector $x$ from a set S, that is, $d (x, S) = \inf_{y \in S} ‖ x - y ‖$ . We finally use $𝟙 (\cdot)$ to denote the indicator function and $\log$ to denote the natural logarithm.

For any time function $x (\cdot)$ that is right-continuous with left limits, $(d x / d t) (t)$ or $\dot{x} (t)$ stand for the right derivative of $x (t)$ with the implicit assumption that it exists, and $x (t^{-})$ stands for $\lim_{τ ↑ t} x (τ)$ .

2. The Model

2.1. Network Model and the Max-Weight Policy

We consider a switched network that operates in discrete time. For simplicity and ease of presentation, we restrict ourselves to single-hop networks. However, our results are easily generalized to multihop networks of the type considered in Sharifnassab et al. (2020).

The network consists of $ℓ$ queues that buffer incoming packets (or jobs). For any $t \in Z_{+}$ , we let $Q (t)$ be a nonnegative vector whose jth component is the length of the jth queue at time t. Packets arrive to the queues according to a nonnegative stochastic vector arrival process, $A (\cdot)$ . In particular, $A_{j} (t)$ stands for the exogenous arrival to the jth queue at time t. We assume that the random variables $A_{j} (t)$ , for different j and t, are independent. We refer to $E [A (t)]$ as the arrival rate vector at time t.

At each time t, the amount of service received by the queues is a nonnegative vector $μ (t)$ , which is chosen by a scheduler from a finite set $M$ of possible service vectors. The queue lengths then evolve according to

Q (t + 1) = {[Q (t) - μ (t)]}^{+} + A (t) .

(1)

As in Sharifnassab et al. (2020), we assume throughout the paper that, for any $μ \in M$ , the set $M$ also contains all vectors that result from setting some entries of $μ$ to zero. This assumption is naturally valid in most contexts.

We focus exclusively on the popular MW scheduling policy,

μ (t) \in \underset{ν \in M}{argmax} ν^{T} Q (t),

(2)

which is known to have favorable stability properties (Tassiulas and Ephremides 1992): whenever there exists a policy under which the queues remain stable, MW results in stable queues. More specifically, let us consider the set

\bar{M}

, defined as the convex hull,

conv (M)

, of the set of all possible service vectors, which is the so-called capacity region of the network. For the case of i.i.d. arrivals and under common stochastic assumptions, it is known that, if the arrival rate vector lies in the interior of the capacity region, then MW results in stable queues; conversely, if the arrival rate vector lies outside the capacity region, the queues are unstable under every scheduling policy (Tassiulas and Ephremides 1992).

2.2. Light- and Heavy-Tailed Arrivals

In this section, we present some definitions related to the tails of the arrival process distributions.

To any nonnegative random variable X, we associate a tail exponent, defined as the value of γ at which $E [X^{1 + γ}]$ switches from finite to infinite:

γ^{*} = \sup {γ : E [X^{1 + γ}] < \infty} .

(3)

As an example, consider a continuous random variable whose probability density function $f (\cdot)$ satisfies

c \cdot x^{- (2 + γ)} \leq f (x) \leq \log^{k} x \cdot x^{- (2 + γ)}, \forall x \geq x_{0},

(4)

where c, γ, k, and x₀ are positive constants. Such a distribution has a tail exponent equal to γ.

For an i.i.d. arrival process, a tail exponent is unambiguously defined as the tail exponent of the marginal distribution at an arbitrary time. However, once we bring robustness into the picture, we are led to consider arrival processes with nonidentically distributed $A (t)$ . To any arrival process $A_{j} (\cdot)$ , we associate a tail exponent, γ_j, defined as the largest value of γ such that $A_{j} (t)$ is dominated by some nonnegative random variable X with tail exponent γ for all times t:

\begin{array}{l} γ_{j} = \sup {γ : there exists a r . v . X \geq 0 s . t . & X dominates A_{j} (t) for all t \geq 0, \\ and E [X^{1 + γ}] < \infty} . \end{array}

(5)

Here, the term “dominates” refers to stochastic dominance: a random variables X dominates a random variable Y if $P (X > a) \geq P (Y > a)$ for all $a \in R$ . We say that $A_{j} (\cdot)$ is heavy-tailed if $γ_{j} \leq 1$ and light-tailed otherwise. Note that this definition of a heavy-tailed process is aimed to capture the boundedness of variance of the input distribution and differs from the conventional definition of heavy-tailed random variables, which requires subexponential decay of tail probabilities (Nair et al. 2020). The tail behavior of the different arrival processes is summarized by the vector $γ = (γ_{1}, \dots, γ_{ℓ})$ .

We note that, as long as $E [A_{j} (t)] \leq \bar{μ}$ for some constant $\bar{μ}$ and for all times t, the tail exponent γ_j is well-defined and lies in the range $[0, \infty]$ . Indeed, suppose that $λ_{j} (t) = E [A_{j} (t)] \leq \bar{μ}$ for some finite constant $\bar{μ}$ and for all t. Consider a random variable X with probability density function $f_{X} (x) = (\bar{μ} / x^{2}) 𝟙 (x \geq \bar{μ})$ . The Markov inequality implies that

P (A_{j} (t) > a) \leq \min {1, E [A_{j} (t)] / a} \leq \min {1, \bar{μ} / a} = P (X > a), \forall a > 0 .

In particular, X dominates $A_{j} (t)$ for all t. Furthermore, $E [X^{1 + γ}] < \infty$ , for every $γ < 0$ . Thus, γ_j is at least as large as any negative number, which implies that $γ_{j} \geq 0$ .

Conversely, if $γ_{j} > 0$ , then $E [A_{j} (t)]$ is finite and bounded as a function of t. We finally note that γ_j may be infinite; this is the case for bounded or, more generally, exponential-type distributions.

2.3. Robust Delay Stability

As already mentioned, we are interested in the question of delay stability under the MW policy in the presence of heavy-tailed arrivals. Given a set of arrival processes, we say that queue m is delay stable if, starting from $Q (0) = 0$ , we have $\sup_{t} E [Q_{m} (t)] < \infty$ .

The arrival rates and the tail exponents do not provide enough information to decide whether we have delay stability or even stability. For example, there is sometimes indeterminacy on the boundary of the capacity region. Furthermore, a tail exponent of $γ_{j} = 1$ is compatible with $E [A_{j}^{2} (t)]$ being either finite or infinite, and this may be critical as far as delay stability is concerned (cf. the Pollaczek–Khintchine formula). These difficulties, all related to indeterminacy at certain boundaries, can be circumvented by focusing on a robust version of delay stability that incorporates two distinct elements.

We require delay stability for all arrival process distributions with given tail exponents. This allows us to focus on conditions that involve the tail exponents and ignore other details of these distributions.
We require delay stability for all arrival rates, possibly time-varying, in the vicinity of a given nominal rate. This allows us to avoid issues of indeterminacy when the arrival rate lies at a threshold between delay stability and instability.

Definition 1

(Robust Delay Stability). Let us fix a network with $ℓ$ nodes and a set $M$ of possible service vectors. We consider a tail exponent vector $γ$ with components in $[0, \infty]$ and a nominal arrival rate vector $λ^{*} ⪰ 0$ .

Given some $δ \geq 0$ , an arrival process $A (\cdot)$ belongs to the class $A_{δ} (γ; λ^{*})$ if
1. The random variables $A_{j} (t)$ , for different j and t, are independent and have finite means.
2. For every j, the tail exponent of the process $A_{j} (\cdot)$ is γ_j.
3. For every $t \geq 0$ , we have $‖ E [A (t)] - λ^{*} ‖ \leq δ$ .
For $m \in {1, \dots, ℓ}$ , we say that queue m is robustly delay stable (RDS) if there exists some $δ > 0$ such that queue m is delay stable for all arrival processes in the class $A_{δ} (γ; λ^{*})$ .

3. A Counterexample and the Insufficiency of Fluid Models

In this section, we discuss a simple example with the following features:

Similar to existing examples, heavy-tailed arrivals to some queues can cause delay instability at a queue that receives light-tailed traffic.
Tight criteria for delay instability need to take into account the values of the tail exponents. In particular, criteria that are based on traditional fluid models cannot be conclusive because they do not involve the tail exponents.
Delay instability may emerge from coordinated large arrivals at multiple heavy-tailed queues.

Consider the three-queue network in Figure 1. The set of possible service vectors $M$ is such that, at any time slot, up to two queues can be served, each with rate one, but not all three queues can be served simultaneously. Thus, at each time step, the MW policy chooses two queues with the largest backlogs and serves each one of them with rate one.

Figure 1. A Single-Hop Network with Three Queues, Two of Which Receive Heavy-Tailed Traffic, Whereas the Third One Receives Deterministic Traffic
*Notes*. In this figure, the dotted ellipses illustrate the different possible service vectors. The third queue is delay unstable for certain ranges of λ and γ.

The third queue receives deterministic arrivals with $A_{3} (t) = λ < 1$ for all $t \geq 0$ so that $γ_{3} = \infty$ . The other two queues receive heavy-tailed traffic with a density of the form (4) and tail exponent $γ \in (0, 1)$ and with rate 0.5; that is, $E [A_{1} (t)] = E [A_{2} (t)] = 0.5$ .

The total arrival rate is $1 + λ$ , which is less than two, and the network is stable in the conventional sense. The first two queues are automatically delay unstable because they receive heavy-tailed arrivals. However, the third queue can be either delay stable or delay unstable, depending on the values of λ and γ. In what follows, we provide an informal discussion of the different cases.

Case 1

( $0.5 < λ < 1$ ). In this case, queue 3 is delay unstable through a scenario similar to those considered in earlier works (Markakis et al. 2014). Intuitively, because of its heavy-tailed arrivals, Q₁ occasionally receives large inputs. When that happens, with $A_{1} (t_{0})$ being large at some time $t_{0} > 0$ , Q₁ becomes and stays largest for some time. During that time, the MW policy keeps serving Q₁, while the remaining service capacity is split between Q₂ and Q₃. Because $λ > 0.5$ , the sum of the arrival rates to Q₂ and Q₃ exceeds the aggregate service rate to these two queues over a time interval of duration $Ω (A_{1} (t_{0}))$ . Thus, Q₂ and Q₃ build up to size $Ω (A_{1} (t_{0}))$ . Because $γ < 1$ , we have $E [A_{1}^{2} (t)] = \infty$ , and using this property, it can be shown that $E [Q_{3} (t)]$ grows unbounded.

The intuition behind this argument is captured by an existing criterion from Markakis et al. (2014) that examines certain trajectories $q (\cdot)$ of a corresponding fluid model (also called fluid trajectories). In that fluid model, the arrival processes are replaced by deterministic flows with the same rates. The initial conditions are $q_{h} (0) = 1$ for some heavy-tailed queue (in our example, h = 1 or h = 2) and $q_{j} (0) = 0$ for $j \neq h$ . (Because of symmetry, we only need to consider the case in which h = 1.) Let us say that the “zero fluid” (ZF) condition holds if the solution to the fluid model (known to be unique for the MW policy) keeps $q_{3} (\cdot)$ at zero. According to the criterion in Markakis et al. (2014), the failure of the ZF condition certifies the delay instability of queue 3. This is indeed the case here: starting with the initial conditions $q (0) = (1, 0, 0)$ , it can be verified that, for small positive times t, we have $q_{3} (t) = (λ - 0.5) t / 2 > 0$ .

In summary, for this particular case, we have delay instability, and this is correctly predicted by the failure of the ZF condition and available results.

Case 2

( $0 < λ < 0.5$ and $γ > 0.5$ ). In this case, queue 3 turns out to be delay stable (in fact, robustly delay stable). This is a consequence of our general result in Section 5.

For this case, the fluid model initialized at $q (0) = (1, 0, 0)$ satisfies $q_{3} (t) = 0$ for all positive times, and the ZF condition holds. In particular, the ZF condition is “aligned” with delay stability. Observations of this nature lead to the question whether the ZF condition can be used as a certificate of delay stability (Markakis et al. 2018). However, this is not the case as we discuss next.

Case 3

( $0 < λ < 0.5$ and $γ \leq 0.5$ ). In this case, the ZF condition holds, similar to Case 2. However, queue 3 turns out to be delay unstable; this follows from the proof² of our main result, Theorem 1.

We summarize here the underlying intuition. When $γ \leq 0.5$ , there is considerable probability that both queues 1 and 2 receive large inputs within a certain time interval. More concretely, for large values of M, there is probability $Ω (1 / M)$ that both Q₁ and Q₂ receive aggregate arrivals of size at least 3M within the time interval $[0, M]$ . If this happens, both Q₁ and Q₂ become large enough so that Q₃ receives no service during the interval $[M, 2 M]$ . As a result, $Q_{3} (2 M) \geq λ M$ with probability $Ω (1 / M)$ . One can then use this fact to show that, as t increases, $E [Q_{3} (t)]$ grows unbounded, and queue 3 is delay unstable.

In summary, the presence or absence of delay stability depends on the tail exponents in a nontrivial manner. Furthermore, the ZF condition cannot discriminate between Cases 2 and 3 and, thus, cannot account for the different outcomes (delay stability in Case 2, delay instability in Case 3). In fact, the same obstacle arises with any other criterion that relies on traditional fluid models because fluid models do not take the tail exponents into account. In order to make progress, we need to consider the probability that large inputs (or jumps) may arrive within a certain time interval as a function of the tail exponents (the probability is larger when the tail exponents are smaller). Furthermore, as illustrated by Case 3, we may have to consider the effect of “coordinated” large jumps at more than one queue within the same time interval. This is accomplished by the model in the next section: it is still in the spirit of traditional fluid models except that it allows for jumps along the heavy-tailed flows, subject to a “budget” on allowed jumps as determined by the tail exponents.

4. Jumping Fluid Models

In this section, we introduce a generalization of the fluid model that allows for jumps along certain coordinates. We proceed by first defining a traditional fluid model and then modifying it.

4.1. The Fluid Model

A fluid model is a deterministic continuous-time dynamical system that replaces the arrival process with a fluid stream of arrivals and updates queue lengths along max-weight drifts. The literature provides a few, somewhat different but equivalent, definitions of the fluid model (Shah and Wischik 2012, Markakis et al. 2018), which typically involve differential equations with boundary conditions. Here, we adopt an equivalent but somewhat simpler definition³ from Sharifnassab et al. (2020).

Recall the definition of $M$ as the set of all possible service vectors. For any $x \in R_{+}^{ℓ}$ , we define

M (x) = {μ \in M | μ^{T} x \geq ν^{T} x, \forall ν \in M} = \underset{ν \in M}{argmax} ν^{T} x,

(6)

which is the set of all possible service vectors that, for the given x, attain the maximum in the definition of the MW policy; see (2). We also let

\bar{M} (x) = conv (M (x)) .

(7)

Furthermore, given an arrival rate vector $λ ⪰ 0$ and any $x \in R_{+}^{ℓ}$ , we let

{\bar{D}}_{λ} (x) = λ - \bar{M} (x) = conv ({λ - μ | μ \in \underset{ν \in M}{argmax} ν^{T} x}),

(8)

which is the set of candidate drifts when the queue length vector is x.

For the definitions that follow, recall our convention that $\dot{q} (t)$ denotes the right derivative of $q (\cdot)$ with respect to time at time t.

Definition 2

(Fluid Trajectories). Let us fix a network with $ℓ$ nodes, a set $M$ of possible service vectors, and an arrival rate vector $λ ⪰ 0$ . A fluid trajectory corresponding to $λ$ is a nonnegative, continuous, and right-differentiable $ℓ$ -dimensional function $q (\cdot)$ that satisfies the differential inclusion

\dot{q} (t) \in {\bar{D}}_{λ} (q (t)), \forall t \geq 0 .

(9)

Given some $λ ⪰ 0$ and $q (0) ⪰ 0$ , there always exists a unique (and nonnegative) fluid trajectory $q (\cdot)$ corresponding to $λ$ and initialized at $q (0)$ (Markakis et al. 2018, Sharifnassab et al. 2020).

4.2. Adding the Jumps

We now introduce JF trajectories. Given a vector $n = (n_{1}, \dots, n_{ℓ})$ of nonnegative integers, an ϵ-JF $(n)$ trajectory is a fluid trajectory with jumps with n_j positive jumps in its jth component, allowing for ϵ-changes in the arrival rate $λ$ . More concretely, we have the following definition.

Definition 3

(ϵ-Jumping Fluid Trajectories). Let us fix a network with $ℓ$ nodes and a set $M$ of possible service vectors. We are given an arrival rate vector $λ^{*} ⪰ 0$ , a nonnegative integer vector n, and some $ϵ \geq 0$ . An ϵ-JF $(n)$ trajectory corresponding to $λ^{*}$ is a nonnegative $ℓ$ -dimensional function $q (\cdot)$ , which is right-continuous with left limits and right-differentiable initialized with $q (t) = 0$ for all t < 0 and with the following properties:

Each component $q_{j} (\cdot)$ has n_j points of discontinuity.
If $q_{j} (\cdot)$ has a discontinuity at some time t, then $q_{j} (t) > q_{j} (t^{-})$ .
For every $t \geq 0$ ,
$\dot{q} (t) \in {\bar{D}}_{λ (t)} (q (t)),$ (10)
for some nonnegative function $λ (\cdot)$ that is right-continuous and piecewise constant with finitely many points of discontinuity and satisfies $‖ λ (t) - λ^{*} ‖ \leq ϵ$ for all t.

Finally, an ϵ-JF(n) trajectory with⁴ $γ^{T} n = \sum_{j = 1}^{ℓ} γ_{j} n_{j} \leq 1$ , is called an ϵ-JF( $γ$ ) trajectory.

Note that we allow jumps at time zero, in which case $q (0) \neq 0$ . Note also that, if ϵ = 0 and jumps can only happen at time zero, then an ϵ-JF trajectory is just a fluid trajectory with the jumps of the JF trajectory determining the initial conditions of the fluid trajectory.

More generally, an ϵ-JF trajectory consists of a concatenation of fluid trajectories over the intervals in which $λ (\cdot)$ stays constant together with a finite number of jumps. For this reason, once the jump times, jump sizes, and function $λ (\cdot)$ are specified, the results for fluid models extend and establish the existence and uniqueness of the ϵ-JF(n) trajectory.

Our next definition formalizes the requirement that a certain queue must stay at zero under all ϵ-jumping fluid trajectories.

Definition 4

(JF Conditions). Let us fix a network with $ℓ$ nodes and a set $M$ of possible service vectors. We are given an arrival rate vector $λ^{*} ⪰ 0$ and a particular queue, m, of interest.

Given a vector $γ$ with components in $[0, \infty]$ and some $ϵ \geq 0$ , we say that the ϵ-JF $(γ)$ condition holds for queue m and $λ^{*}$ if, for every ϵ-JF( $γ$ ) trajectory corresponding to $λ^{*}$ and every $t \geq 0$ , we have $q_{m} (t) = 0$ .
Given a vector $γ$ with components in $[0, \infty]$ , we say that the robust jumping fluid condition (RJF( $γ$ )) holds for queue m and $λ^{*}$ if there exists some $ϵ > 0$ such that the ϵ-JF $(γ)$ condition holds for queue m and $λ^{*}$ .

Note the restriction on the number of jumps in terms of the tail exponents: the heavier the arrival processes (i.e., the smaller the tail exponents γ_j), the larger the number n_j of jumps that we allow. As an example, if $γ_{1} \leq 1$ and queue 1 is the only heavy-tailed queue, then we allow up to $⌊ 1 / γ_{1} ⌋$ jumps at queue 1 and no jumps at the other queues.

5. Main Result

Our main result provides a necessary and sufficient condition for robust delay stability in terms of JF trajectories. The proof is given in Sections 7 and 8.

Theorem 1.

Let us fix a network with $ℓ$ nodes; a set $M$ of possible service vectors; an arrival rate vector $λ^{*} ⪰ 0$ ; a particular queue, m, of interest; and a vector $γ$ of tail exponents with components in $(0, \infty]$ . The queue m is RDS if and only if the RJF $(γ)$ condition holds for queue m and $λ^{*}$ .

5.1. Some Intuition

We provide here a high-level explanation of our result. Some more refined intuition is provided by the proof outlines in Sections 7.1 and 8.1.

Let M be a large constant. We say that the stochastic process $A_{j} (t)$ has a jump whenever it is larger than (approximately) M. Let N_j be the number of jumps of $A_{j} (t)$ during the interval $[0, M]$ and let $N = (N_{1}, \dots, N_{ℓ})$ . It turns out that, for any nonnegative integer vector n, the probability that the event $N = n$ scales (approximately) like $M^{- γ^{T} n}$ . The latter quantity is “significant” (in the sense that it makes an unbounded contribution to certain expected values) if and only if $γ^{T} n \leq 1$ . Thus, over an interval of length M, we can focus on sample paths for which the realized vector n of jump counts satisfies $γ^{T} n \leq 1$ and examine whether such sample paths can cause the queue of interest to become large. We then argue that these sample paths are well-approximated by the ϵ-JF(n) trajectories involved in the ϵ-JF $(γ)$ condition.

5.2. Remarks

We continue with some remarks on the scope of our result.

5.2.1. Heavy-Tailed Queues.

If queue m is heavy-tailed, that is, $γ_{m} \leq 1$ , then the condition $γ^{T} n \leq 1$ allows n to be the mth unit vector. With such a vector n, we can have an ϵ-JF(n) trajectory with a positive jump in the mth component, resulting in a positive value of $q_{m} (t)$ . Thus, the RJF( $γ$ ) condition does not hold, and queue m is not RDS. This is just a variation of the well-known fact that a queue with heavy-tailed arrivals is not delay stable.

5.2.2. Unstable Systems.

Theorem 1 makes no stability assumptions. For unstable (or marginally stable) systems, some components of ϵ-JF trajectories can grow arbitrarily large. On the other hand, these components do not necessarily have a substantial effect on the queue, m, of interest. As long as the mth component of all ϵ-JF trajectories stays at zero, queue m is RDS.

5.2.3. Comparison with the ZF Condition.

If $γ_{j} > 1 / 2$ for all j, then an ϵ-JF trajectory can have at most one jump. For stable systems, the RJF condition boils down to a robust version of the ZF condition introduced in Section 3. In other words, for this case, a robust version of the ZF condition is a necessary and sufficient condition for robust delay stability. On the other hand, because the (robust version of) the ZF condition is strictly weaker than the RJF condition, it does not provide necessary and sufficient conditions for general $γ$ .

5.2.4. The Light-Tailed Case.

Suppose that $γ_{j} > 1$ for all j so that all arrival processes are light-tailed. In this case, no jumps are allowed, and the RJF condition boils down to considering ordinary fluid trajectories with slightly perturbed arrival rates. We have the following possibilities:

If $λ^{*}$ is in the interior of the capacity region $\bar{M}$ , then $0$ is in the interior of ${\bar{D}}_{λ^{*}} (0) = λ^{*} - \bar{M}$ . It then turns out that $0$ is an attracting fixed point of the fluid dynamics, the RJF condition holds, and we have RDS for all queues. This is in line with existing results (e.g., see theorem 4.5 of Georgiadis et al. 2006).
If $λ^{*}$ is on the boundary or outside the capacity region and, similar to our earlier discussion of unstable systems, queue m could be either RDS or non-RDS, depending on whether (perturbed) fluid trajectories cause q_m to become positive or not.

5.2.5. The Case of Zero Tail Exponents.

Our definitions in Sections 2 and 4 are formulated for a nonnegative vector $γ$ . However, our result is restricted to the case in which this vector is positive. We comment on the reasons for this.

When $γ_{j} = 0$ , we are dealing with an arrival process for which $E [A_{j} (t)]$ is finite, whereas $E [A_{j} {(t)}^{1 + γ}]$ may be infinite for every $γ > 0$ . Our proofs involve, at certain places, a division by γ_j and break down if $γ_{j} = 0$ . It is not clear whether a similar result is possible when some of the tail exponents are zero.

5.2.6. Computational Issues.

As already hinted in the introduction, checking the RJF condition algorithmically appears to be a hard computational problem, amenable only to impractical Tarski-like elimination algorithms. In one possible simplification, we might just consider ϵ-JF trajectories with ϵ = 0 so that $λ (t) = λ^{*}$ for all times t. However, this would still leave the indeterminacy of the jump times and sizes to be reckoned with. Even worse, a restriction to this limited class of trajectories does not seem to lead to necessary and sufficient conditions for any suitably modified notion of stability. See Appendix A.2 for further discussion. A related question is whether we could, without loss of generality, require all the jumps to occur at the same time, for example, at time zero, thus eliminating the need to consider all possible values of the jump times. Unfortunately, this is not the case: there exist examples in which ϵ-JF trajectories can drive a queue m to a positive value, but this can happen only if we allow the jumps to occur at different times; see Appendix A.5 for an example.

5.3. Our Example, Revisited

On the positive side, Theorem 1 elucidates the precise mechanism that leads to delay instability through a coordination of multiple abnormally large arrival vectors at possibly different times and queues. Furthermore, it allows us to analyze simple problems, such as the one discussed in Section 3, which we do next.

Recall the network of three queues in Section 3, in which $γ = (γ, γ, \infty)$ . For γ in the range $(0, 0.5]$ , consider a jumping fluid trajectory $q (\cdot)$ with $n = (1, 1, 0)$ in which both q₁ and q₂ undergo unit jumps at time 0. With this trajectory, q₃ immediately starts to grow positive. Because $γ^{T} n = 2 γ \leq 1$ , it follows that the RJF $(γ)$ condition fails to hold. Theorem 1 then establishes that queue 3 is not robustly delay stable.

On the other hand, when γ is in the range $(0.5, 1]$ , the constraint $γ^{T} n \leq 1$ allows at most one jump in either q₁ or q₂. Without loss of generality, we can assume that this jump takes place at time 0. It turns out that the fluid trajectories that start from either $q (0) = (1, 0, 0)$ or $q (0) = (0, 1, 0)$ keep $q_{3} (\cdot)$ at zero if and only if $λ \leq 0.5$ . Therefore, for γ in range $(0.5, 1]$ and also taking robustness into account, queue 3 is RDS if and only if $λ < 0.5$ .

6. Robust Delay Stability via Lyapunov Functions

Lyapunov functions are a powerful tool for the stability analysis of queueing networks (Tassiulas and Ephremides 1992, Bertsimas et al. 2001, Maguluri et al. 2016), for example, in throughput optimality proofs for the MW policy (Tassiulas and Ephremides 1992, Neely 2010). Markakis (2013) and Markakis et al. (2018) provide a sufficient condition for delay stability based on a class of piecewise linear Lyapunov functions and use it to derive a sharp characterization of delay stability for a special class of networks, namely, networks with disjoint schedules. Nevertheless, the Lyapunov approach in Markakis (2013) and Markakis et al. (2018) has some drawbacks: (a) the condition provided therein is, in general, only sufficient for delay stability; (b) it does not take into account the tail exponents even though they play an essential role in delay stability as already discussed in Sections 3 and 5.3; (c) the Lyapunov functions considered are piecewise linear, which is perhaps inadequate for the purpose of tight delay stability conditions.

In this section, we explore the power of Lyapunov functions for the case in which the tail exponents of the heavy-tailed queues may be arbitrarily close to zero so that the RJF condition allows an arbitrarily large number of jumps.

For the remainder of this section, we assume that queues $1, \dots, h$ can be heavy-tailed, where $h < ℓ$ , whereas the remaining queues are light-tailed. Formally, we consider the set Γ of tail coefficients, defined by

Γ = {γ ≻ 0 : γ_{j} > 1, for j = h + 1, \dots, ℓ} .

We also fix an arrival rate vector $λ^{*} ⪰ 0$ and a light-tailed queue m > h of interest.

Definition 5

(Special Lyapunov Function). For any $ϵ > 0$ , we say that a function $V : R_{+}^{ℓ} \to R_{+}$ is a special ϵ-Lyapunov function if

V is Lipschitz continuous with Lipschitz constant one.
$\dot{V} (q (t)) \leq - ϵ$ whenever $V (q (t)) > 0$ for all fluid trajectories $q (\cdot)$ corresponding to arrival rate $λ^{*}$ .
We have $V (0) = 0$ . Furthermore, if $q_{m} > 0$ , then $V (q_{m}) > 0$ .
V is nonincreasing along the coordinates associated with heavy-tailed queues; that is, for $j = 1, \dots, h$ , for any $q \in R_{+}^{n}$ , and any $α > 0$ , we have $V (q + α e_{j}) \leq V (q)$ , where $e_{j}$ is the jth unit vector.

Special ϵ-Lyapunov functions as defined are quite similar to the functions considered in theorem 2 of Markakis et al. (2018). However, in contrast to Markakis et al. (2018), our special Lyapunov functions need not be piecewise linear. Our next result establishes a strong connection between the ϵ-JF $(γ)$ condition and the existence of special ϵ-Lyapunov functions. The proof is given in Section 9.

Theorem 2.

For any $ϵ > 0$ , there exists a special ϵ-Lyapunov function if and only if the ϵ-JF $(γ)$ condition holds for every $γ \in Γ$ .

The special ϵ-Lyapunov functions constructed in the proof of Theorem 2 are not piecewise linear. We do not know whether Theorem 2 remains valid if we restrict to piecewise linear functions.

Combining Theorems 1 and 2, we can establish a strong connection between robust delay stability and special ϵ-Lyapunov functions. In what follows, we say that queue m is ϵ-RDS $(γ)$ if it is delay stable under all arrival processes in the class $A_{ϵ} (γ; λ^{*})$ in Definition 1(a).

Corollary 1.

Let us fix a network with $ℓ$ nodes, a set $M$ of possible service vectors, an arrival vector $λ^{*} ⪰ 0$ , the number h of heavy-tailed queues, and a light-tailed queue m > h.

If there exists a special ϵ-Lyapunov function for some $ϵ > 0$ , then queue m is RDS for all tail exponents $γ \in Γ$ .
Suppose that there exists some $ϵ > 0$ such that, for all $γ \in Γ$ , queue m is ϵ-RDS $(γ)$ . Then, there exists an ϵ-special Lyapunov function.

The proof is provided in Section 9.3. We conjecture that Corollary 1(b) can be strengthened to provide a converse to part (a).

Hypothesis 1.

If queue m is RDS for all tail exponents $γ \in Γ$ , then for some $ϵ > 0$ , there exists a special ϵ-Lyapunov function.

If Hypothesis 1 is true, we have a tight characterization: a queue is RDS for all tail exponents $γ \in Γ$ if and only if there exists a special ϵ-Lyapunov function that testifies to this. Establishing the conjecture appears to be difficult. Technically it amounts to reversing the order of the quantifiers in the clause “there exists $ϵ > 0$ such that, for all $γ \in Γ \dots$ ” in Corollary 1(b) and showing equivalence with the statement “for every $γ \in Γ$ , there exists some $ϵ > 0$ …”

Our Lyapunov-based results are relevant to the case in which nothing is known about the tail exponents of the heavy-tailed queues other than the fact that they are positive. On the other hand, Lyapunov functions are unlikely to provide useful characterizations of robust delay stability for specific values of the tail exponents because there is no apparent way of accounting for the number of jumps through Lyapunov functions.

7. Proof of the “if” Direction of Theorem 1 (RJF $\Rightarrow$ RDS)

In this section, we provide the proof of the “if” direction of Theorem 1, that is, that the RJF condition implies RDS.

Throughout this section, we consider a network with $ℓ$ nodes and a set $M$ of possible service vectors. We fix an arrival rate vector $λ^{*} ⪰ 0$ ; a particular queue, m, of interest; a vector $γ$ of tail exponents with components in $(0, \infty]$ ; and some $ϵ > 0$ for which the ϵ-JF $(γ)$ condition holds for $λ^{*}$ and queue m. Our goal is to establish robust delay stability for queue m.

The proof is organized in a sequence of lemmas whose proofs are collected in Appendix B. However, before proceeding to the formal arguments, it is helpful to provide an overview of the proof.

7.1. Outline of the Proof

Let us fix an arbitrary time T. We aim at upper bounds for the probability $P (Q_{m} (T) \geq M)$ as M gets large. As long as these bounds are a summable function of M and independent of T it follows that $E [Q_{m} (T)]$ is finite and a bounded function of T, which is our goal. Let us now fix some M and keep it fixed throughout except for the end of the proof.

The proof relies on various probabilistic bounds as well as on deterministic properties of the MW dynamics. Let us start with the probabilistic part, which is focused on showing that the stochastic system mostly follows the deterministic fluid dynamics except for certain jumps caused by the heavy tails of the arrival processes. We define a threshold for what constitutes a jump and then develop a probabilistic bound on the numbers of jumps. A difficulty here is that, if we use a fixed threshold and because T is arbitrary, a bound on the number of jumps is not possible. We handle this issue by using a threshold θ_t that increases almost linearly as we move further to the past of the form

θ_{t} = \frac{M + T - t}{η \log (M + T - t)}, t = 0, 1, \dots, T .

(11)

This is for some positive constant η to be defined later. At any time $t \leq T$ and for any index j, we say that $A_{j} (t)$ is a jump if $A_{j} (t) > θ_{t}$ . Ignoring logarithmic factors, we show that the jump probability $P (A_{j} (t) > θ_{t})$ is of order at most $1 / {(M + T - t)}^{1 + γ_{j}^{'}}$ , where $γ_{j}^{'}$ is slightly smaller than the tail exponent γ_j. By summing over t and after some elementary calculations, we then obtain that $P (N_{j} = n_{j})$ is of order at most $1 / M^{n_{j} γ_{j}^{'}}$ , where N_j is the number of jumps of the jth arrival process during the interval $[0, T]$ . Then, a further calculation shows that, if $N = (N_{1}, \dots, N_{ℓ})$ , then $P (γ^{T} N > 1)$ is of order at most $1 / M^{β}$ for some constant $β \in (1, 2)$ and is, therefore, a summable function of M; see Lemma 1.

We then consider stochastic fluctuations in the arrival process other than jumps. We argue that they average out so that the cumulative arrival process follows its fluid counterpart. We refer to this as the “small fluctuations event,” and show that it occurs with probability at least $1 - ℓ / M^{2}$ ; see (21) and Lemma 3.

Having completed the probabilistic analysis, we then switch to deterministic (sample path) considerations. Let $W (n)$ be the set of all points q that can be reached by some ϵ-JF $(n)$ trajectory (see Definition 6). As a first step, we exploit some special properties of the MW dynamics and show that $W (n)$ is ϵ-attracting; that is, any fluid trajectory that starts outside $W (n)$ moves toward that set with rate at least ϵ (Lemma 4). We then consider a “nice” sample path, that is, a sample path for which the small fluctuations event occurs and for which the realized vector of jump counts n satisfies $γ^{T} n \leq 1$ . (As discussed earlier, nice sample paths have probability $1 - O (M^{- β})$ with $β > 1$ .) Our deterministic analysis, outlined in the next paragraph, shows that any such sample path stays within O(M) distance from the set $W (n)$ . Because $γ^{T} n \leq 1$ , the ϵ-JF( $γ$ ) condition implies that any point in $W (n)$ satisfies q_m = 0. Thus, for nice sample paths, we have $Q_{m} (T) = O (M)$ , and therefore, $P (Q_{m} (T) \geq M)$ is of order at most $1 / M^{β}$ for the constant $β \in (1, 2)$ mentioned earlier. This readily implies a uniform upper bound on $E [Q_{m} (T)]$ .

The analysis of the dynamics under nice sample paths has two parts. We first study the dynamics between jumps: we rely on the small fluctuations event and then make use of a result from Sharifnassab et al. (2020, theorem 4) to ensure that small fluctuations in the arrivals result into comparably small changes in the resulting stochastic trajectory; cf. Lemma 5. Second, to understand what happens at jump times, we recall that the jump vectors n associated to nice sample paths satisfy $γ^{T} n \leq 1$ . Such vectors n are allowed in ϵ-JF( $γ$ ) trajectories, and therefore, the jumps cannot take the ϵ-JF(n) trajectory away from $W (n)$ . This implies that a nice sample path stays “close” to an ϵ-JF(n) trajectory and, therefore, has a “small” Q_m.

7.2. Sensitivity of Max-Weight Dynamics

The proof for both directions of the theorem requires fairly precise bounding of the fluctuations of the stochastic trajectories. To this effect, we rely heavily on a fluctuation bound for the MW dynamics established in Sharifnassab et al. (2020).

Theorem 3

(Sharifnassab et al. 2020, theorem 2). Fix a network (i.e., the number of nodes and the set $M$ of possible service vectors) operating under the MW policy and let $Q (\cdot)$ be the corresponding queue length stochastic process. There exists a (deterministic) constant $C \geq 1$ such that, for any arrival rate vector $λ ⪰ 0$ , any $q (0) ⪰ 0$ , any $t \geq 0$ , and any sample path, if $q (0) = Q (0)$ , then

‖ Q (t) - q (t) ‖ \leq C (1 + ‖ λ ‖ + \max_{k < t} ‖ \sum_{τ = 0}^{k} (A (τ) - λ) ‖),

(12)

where

q (\cdot)

is the fluid trajectory corresponding to

λ

initialized at

q (0)

We actually use the following variant of Theorem 3, which allows for different initial conditions. The proof is given in Appendix B.1.

Theorem 4.

Under the same assumptions as in Theorem 3 except that we allow for $Q (0)$ and $q (0)$ to be different and for the same constant C, we have

‖ Q (t) - q (t) ‖ \leq ‖ Q (0) - q (0) ‖ + C (1 + ‖ λ ‖ + \max_{k < t} ‖ \sum_{τ = 0}^{k} (A (τ) - λ) ‖) .

7.3. Arrival Process and Jumps

We now return to the formal proof. Recall that, throughout this section, we fix the network, $λ^{*} ⪰ 0$ and the tail exponents $γ_{j} \in (0, \infty]$ . We assume that the ϵ-JF $(γ)$ condition holds for $λ^{*}$ , queue m, and some $ϵ > 0$ . We fix some positive integers M and T; these remain fixed throughout except for the end of the proof and for some additional assumptions that M is “large enough.”

We consider an arrival process $A (\cdot)$ in the class $A_{δ} (γ; λ^{*})$ introduced in Definition 1 with $δ = γ ϵ / 20 C$ , where C is the constant in Theorem 4 and

γ = \min_{j = 1, \dots, ℓ} γ_{j} .

(13)

In particular,

‖ E [A (t)] - λ^{*} ‖ \leq \frac{γ ϵ}{20 C}, \forall t \geq 0 .

(14)

Our goal is to derive an upper bound on $P (Q_{m} (T) \geq M)$ that holds uniformly for every arrival process $A (\cdot)$ in $A_{δ} (γ; λ^{*})$ , every T, and every large enough M. This is then used to conclude that the RDS property holds.

Because we assume that the tail exponents are nonzero, we have $γ > 0$ . Furthermore, in order to simplify the proof, it is convenient to assume that

γ \leq 1 .

(15)

Claim 1.

The assumption $γ \leq 1$ can be made without loss of generality.

Proof.

Given a system, call it S, consider a new system $S'$ in which we add one more queue, queue 0, with $γ_{0} < 1$ , which does not interact with the others; for example, for any allowed service vector $μ$ in system S, we introduce a corresponding service vector $(1, μ)$ in system $S'$ . This way, the dynamics of queues $1, \dots, ℓ$ are the same in the two systems S and $S'$ . It follows that, for $m \geq 1$ , queue m is RDS in system S if and only if it is RDS in system $S'$ . Furthermore, because of the lack of interaction, the RJF condition for queue m holds in system S if and only if it holds in system $S'$ .

Once we prove the result for the case $γ \leq 1$ , we apply it to system $S'$ and obtain the equivalence of RDS and RJF for the latter system. Based on this discussion, this also establishes the equivalence of RDS and RJF for the original system S. □

We define some more constants:

\bar{μ} = 1 + \max_{μ \in M} ‖ μ ‖ + ‖ λ^{*} ‖ + ϵ,

(16)

and

η = \frac{8000 C^{2} ℓ^{2} \bar{μ}}{{(γ ϵ)}^{2}} .

(17)

Having fixed M, T, and η, we finally define θ_t as in (11).

For $j = 1, \dots, ℓ$ and $t = 0, \dots, T - 1$ , we say that t is a jump time for $A_{j} (\cdot)$ if $A_{j} (t) > θ_{t}$ . For any j and any $τ \in [0, T)$ , we let $N_{j} (τ)$ be the (random) number of jumps in $A_{j} (\cdot)$ that occur during $[0, τ]$ and also define the corresponding vector $N (τ) = (N_{1} (τ), \dots, N_{ℓ} (τ))$ . To simplify notation and as long as T is fixed, we use N and N_j to refer to $N (T - 1)$ and $N_{j} (T - 1)$ , respectively. Note that, with this definition, $N = N (T - 1)$ includes all jumps that can affect $Q (T)$ .

We consider the event

E^{jump} (T, M) = {γ^{T} N \leq 1} .

(18)

We argue that it occurs with high probability.

Let F be the set of all $n ⪰ 0$ such that $1 < γ^{T} n \leq 2$ . If F is empty, then whenever $γ^{T} n > 1$ , we must have $γ^{T} n > 2$ . If F is nonempty, it has finitely many elements, which implies that the minimum $\min_{n \in F} γ^{T} n$ is attained and its value is greater than one. In either case, we obtain that there exists a constant $β > 1$ such that

if γ^{T} n > 1, then γ^{T} n \geq β^{3} .

(19)

Without loss of generality, we can take β to satisfy $1 < β < 2$ . In the next lemma, we show the probability that $γ^{T} N > 1$ decays at least as fast as $1 / M^{β}$ .

Lemma 1.

There exists a constant $M_{1} \geq 0$ independent of T such that, if $M \geq M_{1}$ , then

P (E^{jump} (T, M)) = P (γ^{T} N \leq 1) \geq 1 - M^{- β} .

The proof is given in Appendix B.2 and relies on the intuition that the probability of a jump in the jth arrival process scales at most as $M^{- γ_{j}}$ (approximately), which then implies that the probability of the event ${N = n}$ scales at most as $M^{- γ^{T} n}$ (approximately).

7.4. Fluctuations of the Arrival Process

We now study the remaining fluctuations of the cumulative arrival processes after we exclude the jumps. We begin with a concentration inequality for the sum of independent random variables. The proof of our next lemma is given in Appendix B.3 and is essentially a reformulation of the Bernstein inequality; see, for example, (1.21) in appendix 1 of Anthony and Bartlett (2009).

Lemma 2.

Suppose that $X_{1}, \dots, X_{n}$ are independent random variables that satisfy $X_{i} \in [0, b]$ and $E [X_{i}] \leq \bar{λ}$ for some $b, \bar{λ} > 0$ . Let $Y = X_{1} + \dots + X_{n}$ . Then, for any $z \geq 0$ ,

P (| Y - E [Y] | > z) \leq 2 \exp (- \frac{z^{2}}{2 b (\bar{λ} n + z / 3)}) .

(20)

We consider the truncated process $A^{*} (t) = \min {A (t), θ_{t}}$ , where the minimum is taken component-wise. We define the small fluctuations event

E^{fluc} (T, M) = {‖ \sum_{τ = t_{0}}^{t} (A^{*} (τ) - λ^{*}) ‖ \leq \frac{γ ϵ}{10 C} (M + T - t_{0}), for 0 \leq t_{0} \leq t < T},

(21)

where C is the constant in Theorem 4.

Lemma 3.

There exists a constant $M_{2} \geq 0$ independent of T such that, if $M \geq M_{2}$ , then $P (E^{fluc} (T, M)) \geq 1 - ℓ M^{- 2}$ .

The proof is somewhat long but straightforward. It relies on the concentration inequality in Lemma 2 and is given in Appendix B.4.

7.5. Deterministic Analysis of the Dynamics

We start with some definitions. It is useful to recall here that ϵ-JF trajectories start at zero just before time 0 but can become nonzero at time 0 or later because of jumps or unstable drifts.

Definition 6.

For any nonnegative integer vector n, we define $W (n)$ as the set of all points in $R_{+}^{ℓ}$ that can be reached by some ϵ-JF $(n)$ trajectory.

Because ϵ-JF trajectories are by definition nonnegative, $W (n)$ is a subset of $R_{+}^{ℓ}$ . Note that the set $W (n)$ depends on ϵ, but because ϵ is held fixed, we suppress this dependence from our notation.

Definition 7

(ϵ-Attracting and ϵ-Invariant Sets).

A subset W of $R_{+}^{ℓ}$ is ϵ-attracting if every fluid trajectory $q (\cdot)$ corresponding to $λ^{*}$ and initialized at some arbitrary $q (0) ⪰ 0$ satisfies
$\frac{d}{d t} d (q (t), W) \leq - ϵ,$ (22)
whenever $q (t) \notin W$ .
A subset W of $R_{+}^{ℓ}$ is ϵ-invariant if, for any $λ ⪰ 0$ that satisfies $‖ λ - λ^{*} ‖ \leq ϵ$ and any fluid trajectory $q (\cdot)$ corresponding to $λ$ and initialized at some arbitrary $q (0) \in W$ , we have $q (τ) \in W$ for all $τ \geq 0$ .

We observe that if W is ϵ-attracting and $0 \leq t_{0} < t_{1}$ , then every fluid trajectory satisfies

d (q (t_{1}), W) \leq \max {0, d (q (t_{0}), W) - (t_{1} - t_{0}) ϵ} .

(23)

It can be shown that an ϵ-attracting set is ϵ-invariant, but we do not need this fact. We are interested instead in the converse statement, that every ϵ-invariant set is ϵ-attracting, which we then apply to the set $W (n)$ ; this is the subject of the next lemma.

Lemma 4.

Every ϵ-invariant set is ϵ-attracting.
For any $n \in Z_{+}^{ℓ}$ , the set $W (n)$ in Definition 6 is ϵ-invariant and, a fortiori, ϵ-attracting.

The proof is given in Appendix B.5 and relies on the intuition that, for an ϵ-invariant set W, ϵ-perturbations of $λ^{*}$ cannot take the trajectories away from W. This means that there must be a drift toward W that locally counteracts such perturbations. The nonexpansive property of the max-weight dynamics then enables us to extend this local counteraction result into the desired global attraction property.

Let us fix a sample path of the arrival process. For any $t \in [0, T)$ , let $n (t)$ be the realized value of $N (t)$ , that is, $n (t)$ is the vector with the realized number of jumps in the process $A (\cdot)$ up to time t (see the paragraph preceding (18)). With a bit of abuse of notation, we let

W (t) = W (n (t - 1)) .

(24)

(The reason for the t – 1 term on the right-hand side is that we want to compare $W (\cdot)$ and $Q (\cdot)$ , but $Q (t)$ is only affected by jumps that happen before time t.)

7.6. Some More Intuition

The general idea is to show that $Q (\cdot)$ stays close to the set $W (\cdot)$ so that we can ultimately exploit the fact that q_m = 0 for every $q \in W (T)$ . There are two parts to the argument:

If, at a certain time, $Q_{j} (t)$ has a jump (i.e., a large increase), the set W(t) expands along the jth coordinate, and so the distance between $Q (t)$ and W(t) does not increase.
In between jumps, $Q (t)$ follows the fluid trajectory, plus some fluctuations, within the range allowed by Lemma 3. These fluctuations get “eliminated” because the fluid trajectory is attracted to W(t) (Lemma 4). Note that (21) allows for larger fluctuations in the far past (see the term $M + T - t_{0}$ ); however, for fluctuations in the far past, the motion in the direction of W(t) happens for a longer time period, enough to eliminate them. This explains the choice of the threshold θ_t in (11); the logarithmic term in the denominator is included for technical reasons.

7.7. The Distance from the Invariant Set

Given times that satisfy $0 \leq t_{0} < t_{1} \leq T$ , we say that the interval (t₀, t₁) is jump-free if $A_{j} (τ) \leq θ_{τ}$ for all j and all $τ \in (t_{0}, t_{1})$ . Note that the initial time t₀ and the end time t₁ are allowed to be jump times. Let $M_{3} \geq 1$ be a constant, independent of T, such that, for any $M' \geq M_{3}$ ,

\frac{γ ϵ M'}{10} + \frac{ℓ M'}{η \log M'} + ϵ + (‖ λ^{*} ‖ + 1) C + \bar{μ} \leq \frac{γ ϵ M'}{6} .

(25)

Lemma 5.

Suppose that $M \geq M_{3}$ and fix times that satisfy $0 \leq t_{0} < t_{1} \leq T$ . Consider a sample path under which the event $E^{fluc} (T, M)$ occurs and the interval (t₀, t₁) is jump-free. Then,

d (Q (t_{1}), W (t_{1})) \leq \max {0, d (Q (t_{0}), W (t_{0})) - (t_{1} - t_{0}) ϵ} + \frac{ϵ γ}{6} (M + T - t_{0}) .

(26)

Note that the lemma also applies when $t_{1} = t_{0} + 1$ so that the set of integers in (t₀, t₁) is empty. The proof, which is given in Appendix B.6, relies on the sensitivity bound in Theorem 4 and the fact that $W (t_{0})$ is ϵ-attractive. The first term on the right-hand side reflects the fact that a fluid trajectory is attracted to $W (\cdot)$ during the jump-free interval; the second term reflects the effect of the smaller fluctuations during the interval (t₀, t₁).

We then apply Lemma 5 and use strong induction on t for $t \leq T$ to establish the next lemma. Its proof is given in Appendix B.7.

Lemma 6.

Suppose that $M \geq M_{3}$ . Consider a sample path under which the events $E^{jump} (T, M)$ and $E^{fluc} (T, M)$ occur. Then, $d (Q (T), W (T)) \leq M ϵ / 2$ .

7.8. Bounding $E [Q_{m} (t)]$

Let $\bar{M} = \max {M_{1}, M_{2}, M_{3}}$ , where M₁, M₂, and M₃ are the constants in Lemma 1, Lemma 3, and (25), respectively. Because these constants are independent of T, the constant $\bar{M}$ is also independent of T.

Let us consider some $M \geq \bar{M}$ and a sample path under which the events $E^{jump} (T, M)$ and $E^{fluc} (T, M)$ occur. In particular, we have $γ^{T} n \leq 1$ , where n is the realized value of N, that is, the vector with the number of jumps until time T – 1 for that particular sample path. The ϵ-JF( $γ$ ) condition, which we assume to hold, implies that every ϵ-JF(n) trajectory with $γ^{T} n \leq 1$ keeps q_m at zero, and therefore, every vector $q \in W (T)$ has q_m = 0. Consequently, for every sample path in $E^{jump} (T, M) \cap E^{fluc} (T, M)$ , we have

Q_{m} (T) \leq d (Q (T), W (T)) \leq \frac{M ϵ}{2},

(27)

where the second inequality follows from Lemma 6. Therefore, for any

M \geq \bar{M}

, we have

\begin{array}{l} P (Q_{m} (T) > \frac{M ϵ}{2}) & \leq (1 - P (E^{jump} (T, M))) + (1 - P (E^{fluc} (T, M))) \\ \leq M^{- β} + ℓ M^{- 2}, \end{array}

(28)

where the first inequality follows from (27) and the union bound and the second inequality is due to Lemmas 1 and 3. Because β is by definition greater than one, the formula

E [Q_{m} (T)] = \int_{0}^{\infty} P (Q_{m} (t) > M) d M

implies that

E [Q_{m} (T)]

is bounded above by a constant that does not depend on T. Furthermore, note that this bound applies uniformly to all processes in the class

A_{δ} (γ; λ^{*})

for

δ = γ ϵ / 20 C

(see (14)). This shows that queue m is robustly delay stable and completes the proof of the first direction of Theorem 1.

8. Proof of the Reverse Direction of Theorem 1 (RDS $\Rightarrow$ RJF)

In this section, we prove the reverse (“only if”) direction of Theorem 1, that is, that RDS implies that the ϵ-JF $(γ)$ condition holds for some $ϵ > 0$ . The proof is organized in a sequence of lemmas whose proofs are collected in Appendix C.

We actually prove the contrapositive and start by assuming that, for every $ϵ > 0$ , there exists an ϵ-JF $(n)$ trajectory $q^{ϵ} (\cdot)$ with $γ^{T} n \leq 1$ and such that $q_{m}^{ϵ} (t) > 0$ at some positive time t.

We keep $λ^{*}, γ$ , and ϵ fixed throughout the proof and show that there exists an arrival processes in the class $A_{ϵ} (γ; λ^{*})$ for which queue m is not delay stable. Because ϵ can be arbitrarily small, this implies that there exists no δ such that queue m is delay stable for all arrival processes in the class $A_{δ} (γ; λ^{*})$ , and therefore, queue m is not RDS. However, before proceeding to the formal arguments, we overview informally the key ideas in the proof.

8.1. Outline of the Proof

The main idea is to construct a certain arrival process $A (\cdot)$ in the class $A_{ϵ} (γ; λ^{*})$ , whose arrival rate is a time-scaled version of the piecewise constant rate $λ (\cdot)$ associated with the ϵ-JF(n) trajectory. We then use the bounded sensitivity property of the MW dynamics (Theorem 4) to show that the resulting process $Q (\cdot)$ tracks a suitably scaled (by a factor of T) version of the ϵ-JF trajectory $q^{ϵ} (\cdot)$ with substantial probability. In particular, $Q (T)$ is (with substantial probability) comparable to $T q^{ϵ} (1)$ , leading to a large value of $E [Q_{m} (T)]$ . This part of the argument capitalizes on the fact that the number of jumps of the ϵ-JF(n) trajectory is limited by the condition $γ^{T} n \leq 1$ .

On the technical side, the tracking result involves two separate arguments:

Whenever the ϵ-JF(n) trajectory has a jump, at some time τ, there is substantial probability that the stochastic process also has a jump at some time near the scaled counterpart, $T τ$ , of τ; see Lemma 8.
In between jump times of the ϵ-JF(n) trajectory, we use concentration inequalities to show that there is a fairly large probability that the stochastic process stays close to its fluid counterpart.

8.2. Jumps of the ϵ-JF(n) Trajectory

We fix a network with $ℓ$ nodes; a set $M$ of possible service vectors; an arrival rate vector $λ^{*} ⪰ 0$ ; a particular queue, m, of interest; and a vector $γ$ of tail exponents with components in $(0, \infty]$ . We also fix some $ϵ > 0$ and assume that the ϵ-JF( $γ$ ) condition fails to hold.

We start with a few elementary observations, namely, that the ϵ-JF(n) trajectories of interest can be taken, without loss of generality, through scaling and perturbations, to have some convenient properties that allow us to simplify subsequent notation and arguments. The proof is given in Appendix C.1.

Lemma 7.

If the ϵ-JF( $γ$ ) condition fails to hold, then there exists some $n ⪰ 0$ with $γ^{T} n \leq 1$ and an ϵ-JF(n) trajectory $q^{ϵ} (\cdot)$ with the following properties:

$q_{m}^{ϵ} (1) > 0$ .
The times at which $q^{ϵ} (\cdot)$ is discontinuous (the jump times) all belong to the open interval (0, 1).
At each jump time, exactly one of the components of $q^{ϵ} (\cdot)$ is discontinuous.
The arrival rate associated to $q^{ϵ} (\cdot)$ satisfies $\inf_{t} λ_{j} (t) > 0$ for all j.

For the rest of the proof, we fix an ϵ-JF(n) trajectory with the properties in Lemma 7, together with the associated vector n and rate function $λ (\cdot)$ . We define $n = n_{1} + \dots + n_{ℓ}$ , which is the total number of jumps.⁵

We define $Θ_{0} = 0, Θ_{n + 1} = 1$ , and for $k = 1, \dots, n$ , let Θ_k be the kth jump time. In particular,

0 = Θ_{0} < Θ_{1} < \dots < Θ_{n} < Θ_{n + 1} = 1 .

(29)

We use j_k to refer to the queue at which the kth jump occurs and a_k to refer to the size of the kth jump. In particular, at time Θ_k, for $k = 1, \dots, n, q_{j} (\cdot)$ is continuous for every $j \neq j_{k}$ , and

q_{j_{k}}^{ϵ} (Θ_{k}) = q_{j_{k}}^{ϵ} (Θ_{k}^{-}) + a_{k} .

8.3. Defining Certain Constants

As in (16), we define

\bar{μ} = 1 + \max_{μ \in M} ‖ μ ‖ + ‖ λ^{*} ‖ + ϵ .

(30)

Moreover, similar to (13), we let

γ = \min_{j = 1, \dots, ℓ} γ_{j} > 0 .

(31)

By arguing as in Claim 1, we can and do assume, without loss of generality, that $γ \leq 1$ . We then define a positive constant

c = q_{m}^{ϵ} (1),

(32)

and also let

d = \frac{1}{2} \min {\frac{γ c}{4 (1 + 4 \bar{μ})}, \min_{k = 0, \dots, n} {Θ_{k + 1} - Θ_{k}}, \frac{\min_{k = 1, \dots, n} a_{k}}{1 + 2 \bar{μ}}} .

(33)

(In case n = 0, the last term inside the brackets is taken to be zero.)

8.4. Defining the Stochastic Arrivals

Let us fix some constants $t_{0} \geq 0$ and T > 0 and keep them fixed until the end of the proof in Section 8.7. In this section, we define a stochastic arrival process over an interval of the form $[t_{0}, t_{0} + T)$ , thus constructing what we call an episode of the overall process. Later, in Section 8.7, we concatenate multiple episodes to construct the stochastic process over the entire timeline $[0, \infty)$ .

Consider the arrival rate function $λ (\cdot)$ of the ϵ-JF(n) trajectory; in particular, $‖ λ (t) - λ^{*} ‖ \leq ϵ$ . The arrival rate vector for the stochastic process during the episode is a time-scaled (by a factor of T) and shifted (by t₀) version of $λ (\cdot)$ . More concretely, we let

\bar{λ} (t) = λ (\frac{t - t_{0}}{T}), t \in [t_{0}, t_{0} + T) .

(34)

Clearly,

‖ \bar{λ} (t) - λ^{*} ‖ \leq ϵ, \forall t \in [t_{0}, t_{0} + T) .

(35)

Furthermore, it follows from Lemma 7(d) that $\inf_{t} {\bar{λ}}_{j} (t) > 0$ for every j.

We now digress to introduce certain constants that are used to specify the exact form of the distribution of the arrival processes. For any $α > 0$ , we let

σ (α) = \int_{\bar{μ}}^{\infty} x^{- (1 + α)} \log (x + 1) d x,

(36)

where

\bar{μ}

is defined in (30). Then, for any

α > 0

, we have

\frac{σ (α)}{σ (1 + α)} = \frac{\int_{\bar{μ}}^{\infty} x \cdot x^{- (2 + α)} \log (x + 1) d x}{\int_{\bar{μ}}^{\infty} x^{- (2 + α)} \log (x + 1) d x} \geq \frac{\bar{μ} \int_{\bar{μ}}^{\infty} x^{- (2 + α)} \log (x + 1) d x}{\int_{\bar{μ}}^{\infty} x^{- (2 + α)} \log (x + 1) d x} = \bar{μ} .

(37)

Definition 8

(Arrivals During an Episode). Given T > 0 and $t_{0} \geq 0$ , we define the arrival process over the interval $t \in [t_{0}, t_{0} + T)$ (which we call an episode) as follows:

If $γ_{j} = \infty$ , then $A_{j} (t) = {\bar{λ}}_{j} (t)$ (deterministically).
If $γ_{j} < \infty$ , then $A_{j} (t)$ is a random variable with probability density function
$f_{A_{j} (t)} (x) = \frac{{\bar{λ}}_{j} (t)}{σ (γ_{j})} \cdot x^{- (2 + γ_{j})} \log (x + 1) 𝟙 (x \geq \bar{μ}) + (1 - \frac{{\bar{λ}}_{j} (t) σ (1 + γ_{j})}{σ (γ_{j})}) δ (x),$ (38)
where $𝟙 (\cdot)$ denotes the indicator function, $δ (\cdot)$ is Dirac’s delta function, and $σ (\cdot)$ is as defined in (36).
The $A_{j} (t)$ for different j and t are independent.

We refer to ${A (t) : t \in [t_{0}, t_{0} + T)}$ as an episode-adjusted arrival process.

From (37), we obtain $σ (γ_{j}) / σ (1 + γ_{j}) \geq \bar{μ} \geq {\bar{λ}}_{j} (t)$ , where the last inequality is due to (30) and (35). Therefore, the coefficient of the delta function is nonnegative, and we have a well-defined distribution. It is also easy to verify that $\int_{0}^{\infty} f_{A_{j} (t)} (x) d x = 1$ . Furthermore, from (35), $\bar{λ_{j}} (t)$ is bounded above. It follows that each $A_{j} (t)$ is dominated by a random variable ${\bar{A}}_{j}$ with tail exponent γ_j, and consequently, the process also has tail exponent γ_j. Moreover, $E [A_{j} (t)] = \int_{0}^{\infty} x f_{A_{j} (t)} (x) d x = {\bar{λ}}_{j} (t)$ . Therefore, using again (35), we have $‖ E [A (t)] - λ^{*} ‖ \leq ϵ$ , for all $t \in [t_{0}, t_{0} + T)$ . In particular, the process $A (\cdot)$ belongs to the class $A_{ϵ} (γ; λ^{*})$ as desired.

8.5. Probabilistic Analysis

We aim to show that, during an episode and with significant probability, the queue vector process $Q (\cdot)$ stays close to a scaled version of the ϵ-JF trajectory, that is, that

Q (t) \approx T q^{ϵ} (\frac{t - t_{0}}{T}), \forall t \in [t_{0}, t_{0} + T) .

To accomplish this, we consider each interval of the form $[t_{0} + Θ_{k} T, t_{0} + Θ_{k + 1} T)$ for $k = 0, 1, \dots, n$ and show that there is a substantial probability that the arrival process we have defined has the following properties: (a) as long as k > 0, it has a jump in a small segment in the beginning of the interval, and (b) it has small fluctuations in the rest of the interval. We define events that capture these two properties.

For $k = 1, \dots, n$ , let $B_{k}$ be the cumulative arrival vector over the interval $[t_{0} + Θ_{k} T, t_{0} + Θ_{k} T + d T)$ , that is,⁶

B_{k} = \sum_{t = t_{0} + Θ_{k} T}^{t_{0} + Θ_{k} T + d T - 1} A (t) .

(39)

We let $E_{k}^{jump}$ be the event that $B_{k}$ emulates the jump in $q^{ϵ} (\cdot)$ at time Θ_k scaled by T, that is,

E_{k}^{jump} = {‖ B_{k} - T a_{k} e_{j_{k}} ‖ \leq d T (1 + 2 \bar{μ})},

(40)

where j_k is defined as the index of the queue at which the kth jump takes place and d is the constant defined in (33).

Lemma 8.

There exist $ψ \in (0, 1)$ and ${\bar{T}}_{1} > 0$ such that, if $T \geq \bar{T_{1}}$ , then for $k = 1, \dots, n$ and for the episode-adjusted arrival process over an episode $[t_{0}, t_{0} + T)$ , we have $P (E_{k}^{jump}) \geq ψ T^{- γ_{j_{k}}} \log T$ .

The proof is given in Appendix C.2.

Recall now that the function $λ (\cdot)$ driving the trajectory $q^{ϵ} (\cdot)$ is piecewise constant with a finite number of pieces. We use r to denote the number of such pieces. We then proceed to define certain small fluctuation events. For $k = 0, 1, \dots, n$ , we let $E_{k}^{fluc}$ be the event that the cumulative fluctuations of $A (\cdot)$ over the interval $[t_{0} + Θ_{k} T + d T, t_{0} + Θ_{k + 1} T)$ are small, that is,

E_{k}^{fluc} = {‖ \sum_{τ = t_{0} + Θ_{k} T + d T}^{t} (A (τ) - \bar{λ} (τ)) ‖ \leq \frac{γ c T}{32 C r}, \forall t \in [t_{0} + Θ_{k} T + d T, t_{0} + Θ_{k + 1} T)} .

(41)

Lemma 9.

There exists a ${\bar{T}}_{2} \geq 0$ such that, if $T \geq {\bar{T}}_{2}$ , then for $k = 0, 1, \dots, n$ and for the episode-adjusted process over an episode $[t_{0}, t_{0} + T)$ , we have $P (E_{k}^{fluc}) \geq 1 / 2$ .

The proof is given in Appendix C.3.

Note that each one of the events $E_{k}^{jump}$ for $k = 1, \dots, n$ and $E_{k}^{fluc}$ for $k = 0, \dots, n$ is determined by the arrival process during a particular interval and all of these intervals are disjoint. Thus, the independence assumption on the arrival process implies that all of these events are independent. Therefore, if $T \geq \max {{\bar{T}}_{1}, {\bar{T}}_{2}}$ , then

\begin{array}{l} P (E_{1}^{jump}, \dots, E_{n}^{jump}, E_{0}^{fluc}, \dots, E_{n}^{fluc}) & = \prod_{k = 1}^{n} P (E_{k}^{jump}) \cdot \prod_{k = 0}^{n} P (E_{k}^{fluc}) \\ \geq \prod_{k = 1}^{n} ψ T^{- γ_{j_{k}}} \log T \cdot \prod_{k = 0}^{n} \frac{1}{2} \\ = \frac{1}{2} {(\frac{ψ \log T}{2})}^{n} T^{- \sum_{k = 1}^{n} γ_{j_{k}}} \\ = \frac{1}{2} {(\frac{ψ \log T}{2})}^{n} T^{- γ^{T} n} \\ \geq \frac{1}{2} {(\frac{ψ}{2})}^{n} T^{- γ^{T} n} \log T \\ \geq \frac{1}{2} {(\frac{ψ}{2})}^{1 / γ} T^{- γ^{T} n} \log T \\ \geq \frac{1}{2} {(\frac{ψ}{2})}^{1 / γ} T^{- 1} \log T, \end{array}

(42)

where the first inequality follows from Lemmas 8 and 9. The second inequality is because

n \geq 1

and (without loss of generality)

\log T \geq 1

. The third inequality is due to

n γ = \sum_{j = 1}^{ℓ} n_{j} γ \leq \sum_{j = 1}^{ℓ} n_{j} γ_{j} \leq 1

so that

n \leq 1 / γ

together with the fact

ψ \leq 1

(see Lemma 8). The last inequality is again because

γ^{T} n \leq 1

8.6. $E [Q_{m}]$ is Large at the End of an Episode

We are now ready to argue that, if the events $E_{1}^{jump}, \dots, E_{n}^{jump}$ and $E_{0}^{fluc}, \dots, E_{n}^{fluc}$ occur, then, over an episode $[t_{0}, t_{0} + T)$ , the process $Q (\cdot)$ stays close to the suitably scaled ϵ-JF trajectory. As a consequence, the value of Q_m at the end of the episode becomes of order cT, where $c = q_{m}^{ϵ} (1) > 0$ as in (32). This argument is entirely deterministic. It is carried out in the course of the proof of the next lemma (in Appendix C.4) and relies on the sensitivity bound in Theorem 4.

Lemma 10.

There exists a ${\bar{T}}_{3} \geq 0$ such that, if $T \geq {\bar{T}}_{3}$ , then the following holds. If a sample path of the episode-adjusted process over the episode $[t_{0}, t_{0} + T)$ satisfies the events $E_{1}^{jump}, \dots, E_{n}^{jump}$ and $E_{0}^{fluc}, \dots, E_{n}^{fluc}$ and if $‖ Q (t_{0}) ‖ \leq c T / 5$ , then $Q_{m} (t_{0} + T) \geq c T / 2$ .

Let $ρ = {(ψ / 2)}^{1 / γ} / 4$ and $\bar{T} = \max {2, {\bar{T}}_{1}, {\bar{T}}_{2}, {\bar{T}}_{3}}$ , where ${\bar{T}}_{1}, {\bar{T}}_{2}$ , and ${\bar{T}}_{3}$ are the constants in Lemmas 8–10, respectively. We note that all of these constants, ${\bar{T}}_{1}, {\bar{T}}_{2}, {\bar{T}}_{3}$ , and therefore $\bar{T}$ as well, are defined in terms of general parameters and properties of the particular ϵ-JF trajectory and are deterministic. Lemma 10 and (42) imply that, for an episode $(t_{0}, t_{0} + T)$ with $T \geq \bar{T}$ and initialized so that $‖ Q (t_{0}) ‖ \leq c T / 5$ , we have

P (Q_{m} (t_{0} + T) \geq \frac{c T}{2}) \geq P (E_{1}^{jump}, \dots, E_{n}^{jump}, E_{0}^{fluc}, \dots, E_{n}^{fluc}) \geq 2 ρ T^{- 1} \log T,

(43)

which implies that

E [Q_{m} (t_{0} + T)] \geq \frac{c T}{2} P (Q_{m} (t_{0} + T) \geq \frac{c T}{2}) \geq \frac{c T}{2} \cdot 2 ρ T^{- 1} \log T = ρ c \log T .

(44)

8.7. Concatenating Episodes over the Entire Timeline

So far, we have defined and studied an arrival process over an episode $[t_{0}, t_{0} + T)$ . We now concatenate a sequence of such episodes of increasing duration, which defines an arrival process over an infinite timeline.

We define times $T_{0}, T_{1}, T_{2}, \dots,$ and arrival processes for the intervals $[T_{i}, T_{i + 1})$ , recursively, as follows. We let $T_{0} = 0$ and $T_{1} = \bar{T}$ , where $\bar{T}$ is defined in the last paragraph of Section 8.6. We also let $A (\cdot)$ for $t \in [T_{0}, T_{1})$ be the corresponding episode-adjusted process as in Definition 8. Suppose now that we have defined T_i for some $i \geq 1$ as well as the arrival process for $t \in [0, T_{i})$ . We then let

T_{i + 1} = T_{i} + \max {T_{i}, \frac{10 E [‖ Q (T_{i}) ‖]}{c}} .

(45)

Finally, we define the arrival process over the episode $[T_{i}, T_{i + 1})$ to be the corresponding episode-adjusted process. With this recursion, the arrival process is now well-defined for all times $t \geq 0$ .

Note that $E [‖ Q (T_{i}) ‖] \leq E [\sum_{t = 0}^{T_{i} - 1} ‖ A (t) ‖] = \sum_{t = 0}^{T_{i} - 1} E [‖ A (t) ‖] \leq T_{i} ℓ (‖ λ^{*} ‖ + ϵ) < \infty$ . This guarantees that all T_i are finite and we have an infinite number of episodes. Moreover, note that, for $i = 1, 2, \dots$ , we have $T_{i + 1} \geq 2 T_{i}$ . As a result, $T_{i} \geq 2^{i - 1} \bar{T}$ , and also,

\log (T_{i + 1} - T_{i}) \geq \log T_{i} \geq i \log 2,

(46)

where the last inequality is because

T_{i} \geq 2^{i - 1} \bar{T} \geq 2^{i}

We define the event

B_{i} = {‖ Q (T_{i}) ‖ \leq \frac{(T_{i + 1} - T_{i}) c}{5}},

and use the Markov inequality to obtain

P (B_{i}) \geq 1 - \frac{E [‖ Q (T_{i}) ‖]}{(T_{i + 1} - T_{i}) c / 5} \geq 1 - \frac{E [‖ Q (T_{i}) ‖]}{(10 E [‖ Q (T_{i}) ‖]) / 5} = \frac{1}{2},

(47)

where the second inequality is due to (45).

Recall now that Inequality (44), which is about an episode of length T, makes use of the assumption $‖ Q (t_{0}) ‖ \leq c T / 5$ , where t₀ is the start time of the episode. According to (47), this assumption is satisfied at the start time of the episode $[T_{i}, T_{i - 1})$ with probability at least 1/2. By interpreting (44) as a statement about conditional expectations and with t₀ and $t_{0} + T$ replaced by T_i and $T_{i + 1}$ , respectively, we obtain

E [Q_{m} (T_{i + 1})] \geq P (B_{i}) \cdot E [Q_{m} (T_{i + 1}) | B_{i}] \geq \frac{1}{2} \cdot ρ c \log (T_{i + 1} - T_{i}) \geq \frac{ρ c i \log 2}{2},

where the last inequality follows from (46). Therefore,

E [Q_{m} (T_{i})]

grows unbounded as i increases. Consequently, under the arrival process that we constructed, queue m is not delay stable. This conclusion is obtained for any positive choice of ϵ, no matter how small, and establishes that queue m is not RDS. This completes the proof of the second direction of Theorem 1.

9. Proof of Theorem 2

9.1. Proof of the First Direction

Let us fix some $ϵ > 0$ . To establish one direction of the result, we assume that the ϵ-JF $(γ)$ condition holds for every $γ \in Γ$ . We show that there exists a special ϵ-Lyapunov function.

Let $N$ be the set of all nonnegative integer vectors n such that n_j = 0 for j > h; that is, we allow arbitrarily many jumps at the heavy-tailed queues and no jumps at the light-tailed ones. As in Definition 6, for any nonnegative integer vector n, let $W (n)$ be the set of all points in $R_{+}^{ℓ}$ that are reachable by ϵ-JF $(n)$ trajectories. Let $W = \cup_{n \in N} W (n)$ and consider a Lyapunov function $V (\cdot)$ equal to the distance from W, that is, $V (x) = d (x, W)$ , for any $x \in R_{+}^{ℓ}$ . We show that this Lyapunov function has the desired properties.

The distance function is clearly Lipschitz continuous with a Lipschitz constant equal to one, which implies the first property in the definition of special ϵ-Lyapunov functions.

For the second property, Lemma 4(b) applies and shows that each set $W (n)$ is ϵ-invariant. It can be seen that the union, W, of the ϵ-invariant sets $W (n)$ is also ϵ-invariant. It then follows from Lemma 4(a) that W is ϵ-attracting. This proves the second property in Definition 5.

Note that every $n \in N$ satisfies the inequality $γ^{T} n \leq 1$ for some $γ \in Γ$ . Because the ϵ-JF $(γ)$ condition holds for every $γ \in Γ$ , it follows that every ϵ-JF(n) trajectory with $n \in N$ satisfies $q_{m} (t) = 0$ for all t. Hence, q_m = 0 for all $q \in W$ . Furthermore, because $0 \in W$ , we have $V (0) = 0$ . This establishes the third property in Definition 5.

Finally, W is closed under jumps along coordinates associated with heavy-tailed arrivals. Therefore, $V (\cdot)$ is nonincreasing along those directions, and the fourth property in Definition 5 follows. Thus, $V (\cdot)$ has all the required properties of special ϵ-Lyapunov functions. This completes the proof of one direction of the theorem.

9.2. Proof of the Reverse Direction

We continue with the proof of the reverse direction. We fix some $ϵ > 0$ and assume that there exists a special ϵ-Lyapunov function $V (\cdot)$ and let $W = {x \in R_{+}^{ℓ} | V (x) = 0}$ . The argument rests on the ϵ-invariance of W, which, in turn, relies on some properties of the MW dynamics that we discuss next.

For any $λ, x \in R_{+}^{ℓ}$ , let

ξ_{λ} (x) = \dot{q} (0),

(48)

where

q (\cdot)

is the fluid trajectory corresponding to arrival rate

λ

and initialized with

q (0) = x

. In view of (9), we have

ξ_{λ} (x) \in {\bar{D}}_{λ} (x)

. Moreover, it is shown in lemma 2(a) of Sharifnassab et al. (2019) that

ξ_{λ} (x)

has the minimum norm among all vectors in

{\bar{D}}_{λ} (x)

, that is,

ξ_{λ} (x) = \underset{ν \in {\bar{D}}_{λ} (x)}{argmin} ‖ ν ‖, \forall x \in R_{+}^{ℓ} .

(49)

Here, the minimizer is unique.

Given a closed and convex set $A \subset R^{ℓ}$ and a point $x \in R^{ℓ}$ , we denote by $π_{A} (x)$ the projection of x on $A$ , defined as the point in $A$ that is closest to x. With this terminology, $ξ_{λ} (x)$ is the projection $π_{{\bar{D}}_{λ} (x)} (0)$ of the zero vector on the set ${\bar{D}}_{λ} (x)$ .

In what follows, we also make use of an elementary property of projections: if $A$ is a closed convex set, b is some vector, and $B = A + b$ , then

‖ π_{A} (x) - π_{B} (x) ‖ \leq ‖ b ‖ .

(50)

As a consequence of this, for any $λ_{1}, λ_{2}$ , and x in $R_{+}^{ℓ}$ ,

\begin{array}{l} ‖ ξ_{λ_{1}} (x) - ξ_{λ_{2}} (x) ‖ & = ‖ π_{{\bar{D}}_{λ_{1}} (x)} (0) - π_{{\bar{D}}_{λ_{2}} (x)} (0) ‖ \\ = ‖ π_{{\bar{D}}_{λ_{1}} (x)} (0) - π_{{\bar{D}}_{λ_{1}} (x) + λ_{2} - λ_{1}} (0) ‖ \\ \leq ‖ λ_{1} - λ_{2} ‖, \end{array}

(51)

where the second equality is because

{\bar{D}}_{λ_{2}} (x) = λ_{2} - \bar{M} (x) = λ_{1} - \bar{M} (x) + λ_{2} - λ_{1} = {\bar{D}}_{λ_{1}} (x) + λ_{2} - λ_{1},

and the inequality follows from (50).

Lemma 11.

The set W is ϵ-invariant.

Proof.

Because V is a special ϵ-Lyapunov function, it is Lipschitz continuous with Lipschitz constant one. Let $λ \in R_{+}^{ℓ}$ be such that $‖ λ - λ^{*} ‖ \leq ϵ$ and consider fluid trajectories $q (\cdot)$ and $p (\cdot)$ corresponding to arrival rates $λ$ and $λ^{*}$ , respectively, initialized with the same nonnegative vector $q (0) = p (0) \notin W$ . Then,

\begin{array}{l} \dot{V} (q (t)) |_{t = 0} & = \lim_{δ ↓ 0} \frac{V (q (δ)) - V (q (0))}{δ} \\ = \lim_{δ ↓ 0} \frac{V (p (δ)) + V (q (δ)) - V (p (δ)) - V (q (0))}{δ} \\ \leq \lim_{δ ↓ 0} \frac{[V (p (δ)) + ‖ q (δ) - p (δ) ‖] - V (q (0))}{δ} \\ = \lim_{δ ↓ 0} (\frac{V (p (δ)) - V (p (0))}{δ} + \frac{‖ q (δ) - p (δ) ‖}{δ}) \\ \leq \lim_{δ ↓ 0} (\frac{V (p (δ)) - V (p (0))}{δ} + ‖ λ - λ^{*} ‖) \\ = \dot{V} (p (t)) |_{t = 0} + ‖ λ - λ^{*} ‖ \\ \leq - ϵ + ϵ \\ = 0, \end{array}

(52)

where the first inequality is because V has a Lipschitz constant equal to one, the third equality is due to

q (0) = p (0)

, and the second inequality follows from (51). The last inequality follows from the second property of special ϵ-Lyapunov functions in Definition 5 and the assumption

‖ λ - λ^{*} ‖ \leq ϵ

This argument shows that, for any $λ \in R_{+}^{ℓ}$ with $‖ λ - λ^{*} ‖ \leq ϵ$ and for any fluid trajectory $q (\cdot)$ corresponding to arrival rate $λ$ , the distance from the set W cannot increase. In particular, if $q (\cdot)$ is initialized with $q (0) \in W$ , it must stay in W. Therefore, using the terminology in Definition 7, W is ϵ-invariant. □

From the third property in the definition of special ϵ-Lyapunov functions, we have $V (0) = 0$ and, therefore, $0 \in W$ . Thus, every ϵ-JF(n) trajectory starts in W. From the fourth property in the definition of special ϵ-Lyapunov functions, W is closed with respect to positive jumps along the coordinates associated with heavy-tailed arrivals. Using also Lemma 11 for the times between jumps, we see that, for any $n \in N$ , every ϵ-JF(n) trajectory stays in W. Equivalently, for any $γ \in Γ$ , every ϵ-JF( $γ$ ) trajectory stays in W. Finally, employing again the third property of special ϵ-Lyapunov functions, we have q_m = 0 for all $q \in W$ . This implies that $q_{m} (t) = 0$ for all ϵ-JF $(γ)$ trajectories $q (\cdot)$ with $γ \in Γ$ and all $t \geq 0$ . This establishes the ϵ-JF $(γ)$ condition for all $γ \in Γ$ and completes the proof of Theorem 2.

9.3. Proof of Corollary 1

For part (a), fix some $ϵ > 0$ and suppose that there exists a special ϵ-Lyapunov function. Theorem 2 implies that the ϵ-JF( $γ$ ) condition holds for every $γ \in Γ$ . It then follows from Theorem 1 that queue m is RDS for every $γ \in Γ$ .

For part (b), we fix some $ϵ > 0$ and assume that queue m is ϵ-RDS for all $γ \in Γ$ . In particular, queue m is RDS, and Theorem 1 implies that the $ϵ'$ -JF( $γ$ ) condition holds for some $ϵ' > 0$ . However, a close inspection of the proof of the reverse part of Theorem 1 reveals that we can in fact choose $ϵ'$ to be the same as ϵ. Thus, the ϵ-JF( $γ$ ) condition holds, and Theorem 2 implies that there exists a special ϵ-Lyapunov function.

10. Discussion

In this section, we summarize some key points and conclude with a few open questions.

10.1. Framing and Results

We have addressed the problem of delay stability for a class of queueing networks that operate under the max-weight scheduling policy when some arrival processes are heavy-tailed and some are light-tailed. The overall purpose was to develop conditions for delay stability in terms of fluid-like models. However, as illustrated by the example in Section 3, delay instability can be the result of multiple coordinated large jumps. The probabilities of such large jumps are, in turn, affected by the tail exponents of the arrival processes. Given that traditional fluid models are oblivious to the tail exponents, we introduce JF models, a generalization that allows for jumps along the coordinates associated with heavy-tailed flows subject to a budget on the number of jumps with the budget being determined by the tail exponents.

At the same time, it became clear that tight conditions for delay stability that do not depend on the details of the arrival distributions are only possible under a suitable robust formulation with respect to both the arrival rates and the arrival process distributions. With a careful choice of definitions, we were finally able to establish necessary and sufficient conditions for robust delay stability in terms of ϵ-JF models.

In Section 3, we also discuss a related, so-called ZF condition. The ZF condition essentially examines fluid trajectories that start at zero and involve a single jump and leads to a necessary condition for delay stability (Markakis et al. 2018), but the question whether it can also form the basis for a sufficient condition was open. Our results show that, in order to obtain necessary and sufficient conditions, we need to examine a richer set of trajectories that involve multiple jumps.

Finally, earlier works (Markakis 2013, Markakis et al. 2018) show that Lyapunov functions with certain structural properties can yield sufficient conditions for delay stability. But it was not clear if and when delay stability is equivalent to the existence of such Lyapunov functions. Our results make progress toward establishing the completeness of such a Lyapunov-based methodology for the regime in which the heavy-tailed flows can have arbitrarily small tail exponents.

Our RJF condition is difficult to test for general networks. In some sense, this reflects the intrinsic complexity of the (robust) delay stability problem. The Lyapunov-based condition also appears to be hard to test for general networks.

10.2. Alternative Formulations

Given the complexity of the RJF condition, it is natural to inquire about simpler alternatives. For example, is it possible to obtain tight delay stability results (without robustness) if we consider JF models with a constant rate $λ (\cdot)$ ? In the same spirit, can we restrict to the case in which all jumps in ϵ-JF models take place at the same time? Might it be easier to consider concrete arrival processes instead of focusing on delay stability for all arrival processes with given exponents?

For all three of these questions, the answer is negative. We discuss such variations and related (counter)examples in Appendix A.

10.3. Open Problems

We collect here a few open problems and possible future research directions.

Can the results be generalized to other scheduling policies, for example, MW-α policies (Neely 2010), an extension of the MW policy considered in this paper, or more generally to other stochastic networks whose stability has been studied using fluid models? One obstacle here is that our main result relies heavily on a particular fluctuation bound, which is established specifically for the MW dynamics (Sharifnassab et al. 2020). However, progress may be possible if we rely on alternative stochastic bounding techniques.
Can we identify some special classes of networks for which our criteria (either the RJF condition or the Luyapunov-based condition) can be tested in polynomial time?
Is the RJF condition, which involves time-varying $λ (t)$ with $λ (t) \approx λ^{*}$ equivalent to a similar condition in which we only consider ϵ-JF trajectories with time-invariant $λ (t) = λ$ with $λ \approx λ^{*}$ ? See Appendix A for further discussion and some conjectures.

Acknowledgments

We are grateful to Dr. Bert Zwart for providing us with pointers to relevant works in the literature and for informative technical discussions. Various stages of this work were completed while A. Sharifnassab was affiliated with the Massachusetts Institute of Technology, Sharif University of Technology, and the University of Alberta.

Appendix A. Exploring Alternative Formulations

Our formulation involves rate-robustness (robustness with respect to variations in the arrival rate) as well as distributional robustness (by considering the worst case over all distributions with given tail exponents). It would be preferable to develop conditions that characterize delay stability for specific systems (with fixed arrival rates and arrival distributions). However, this seems to be impossible for reasons that become clearer in this appendix. In particular, the distributional robustness aspect appears to be inevitable as long as we are aiming at conditions that are both necessary and sufficient; see Appendix A.4. For this reason, most of this appendix is devoted to exploring variants of rate robustness. In the interest of brevity, we keep the discussion informal without rigorous proofs.

A.1. Variations of Our Definitions

In this section, we present a number of variations to our definitions of RDS and the RJF condition. In later sections, we elaborate on their relations. Throughout this appendix, we assume that the tail exponent vector $γ$ is fixed with every γ_j in $(0, \infty]$ . We also fix some $λ^{*} > 0$ . The various definitions that we offer differ only with respect to the choice of allowed functions $λ (\cdot)$ .

Let $F$ be a class of discrete-time functions $λ (\cdot)$ . We say that queue m is $F$ -RDS if it obeys Definition 1 except that the allowed arrival rates $E [A (t)]$ are also required to belong to $F$ .

We consider the following choices for $F$ , leading to three alternative definitions of robust delay stability, namely, G-RDS, C-RDS, and 0-RDS:

G (general): Here, we impose no additional restrictions on $E [A (t)]$ . Thus, G-RDS is identical to the RDS condition that we study.
C (constant): Here, we require $E [A (t)]$ to be constant. Effectively, we are considering small but constant perturbations of $λ^{*}$ .
0 (zero): Here, we require $E [A (t)]$ to be equal to $λ^{*}$ for all times t.

We continue similarly to define variants of the RJF condition. Let $F$ be a class of continuous-time functions $λ (\cdot)$ . The $F$ -RJF condition is defined exactly as in Definition 4 except that we only consider ϵ-JF trajectories for which the rate function $λ (\cdot)$ is also required to belong to the class $F$ . We consider four possible choices for $F$ , leading to four variants of the RJF condition, namely, UC-RJF, PC-RJF, C-RJF, 0-RJF:

UC (uniformly continuous): Here, we remove the requirement in Definition 3 that $λ (\cdot)$ be piecewise constant. Instead, we require $λ (\cdot)$ to be (i) piecewise continuous with a finite number of discontinuities and (ii) uniformly continuous on any interval in which it is continuous.
PC (piecewise constant): Here, $λ (\cdot)$ is exactly as in Definition 3, and in particular, piecewise constant. Thus, the PC-RJF condition coincides with the RJF condition we are studying.
C (constant): Here, we require $λ (\cdot)$ to be constant. Effectively, we are considering small but constant perturbations of $λ^{*}$ .
0 (zero): Here, we require $λ (\cdot)$ to be equal to $λ^{*}$ for all times t.

A.2. Relations Between Alternative Definitions

In this section, we explore the relation between $F$ -RDS and $F$ -RJF conditions for different choices of F; see Figure A.1 for a visual summary.

**Figure A.1. Relation Between the Various Conditions**

It is clear that, when we restrict to a smaller class, the RDS or RJF conditions are easier to satisfy. Thus,

G-RDS \Rightarrow C-RDS \Rightarrow 0 -RDS,

and

UC-RJF \Rightarrow PC-RJF \Rightarrow C-RJF \Rightarrow 0 -RJF .

Furthermore, Theorem 1 establishes that G-RDS is equivalent to PC-RJF.

A.2.1. PC-RJF $\Rightarrow$ UC-RJF.

An arbitrary (continuous-time) function $λ (\cdot)$ in the class UC can be approximated by a piecewise constant function with finitely many pieces uniformly over a compact set. Furthermore, it can be shown that, if we perturb by ϵ the vector $λ (\cdot)$ that drives a JF-trajectory, the resulting trajectory is perturbed by at most ϵ over a time interval of length one. It follows that, if the UC-RJF condition fails, we can construct piecewise-constant approximations of $λ (\cdot)$ that demonstrate that the PC-RJF condition also fails. Therefore, PC-RJF $\Rightarrow$ UC-RJF.

Taking Theorem 1 also into account, we see that all three conditions, G-RDS, PC-RJF, and UC-RJF, are equivalent. An alternative path to the same conclusion consists of modifying the proof in Section 8 and showing that G-RDS $\Rightarrow$ UC-RJF. This is possible, but quite tedious.

A.2.2. (C-RDS $\Rightarrow$ C-RJF) and (0-RDS $\Rightarrow$ 0-RJF).

These two implications are true because the proof in Section 8 applies verbatim. Indeed, if we assume that C-RJF fails to hold, we start with a trajectory $q (\cdot)$ that is driven by a constant rate $λ$ and drives queue m to a positive value. The construction of the arrival process in Section 8.4 yields a process with a constant rate $\bar{λ}$ . Thus, the same proof establishes that failure of the C-RJF condition leads to failure of C-RDS; equivalently, C-RDS implies C-RJF. The argument that 0-RDS $\Rightarrow$ 0-RJF is the same.

A.2.3. 0-RJF $⇏$ 0-RDS.

This fact exemplifies the difficulty of obtaining necessary and sufficient conditions in the absence of robustness considerations with respect to the arrival rates.

The argument is simple. Consider a single queue that is served at unit rate and let $λ^{*} = 1$ . Suppose that the tail exponent is larger than one so that no jumps are allowed. In that case, there is only one possible JF trajectory, which obeys $\dot{q} = 1 - 1 = 0$ . When initialized at zero, the JF trajectory stays at zero. Thus, the 0-RJF condition holds for the single queue of interest. On the other hand, as long as the arrivals are not deterministic, the stochastic system is marginally unstable, the expected queue length grows to infinity, and the 0-RDS condition does not hold.

This example involves a system operating at the boundary of its capacity region (marginally unstable). We can also construct simple examples (involving two queues) in which the system operates in the interior of the stability region, is stable and satisfies the 0-RJF condition but is not 0-RDS. Such an example (which we omit) involves a system that operates at the threshold between robust delay stability and instability.

A.2.4. 0-RJF $⇏$ C-RJF.

This is again a simple observation. Consider the same single-queue system as in the previous paragraph with $λ^{*} = 1$ . As long as the rate is fixed at one, the JF trajectory stays at zero, and the 0-RJF condition holds. On the other hand, a small constant perturbation that results in $λ > 1$ yields a divergent JF trajectory, and therefore, the C-RJF condition does not hold.

A.3. Conjectures and Open Problems

We list here a number of questions and conjectures.

A.3.1. 0-RDS $\overset{?}{\Rightarrow}$ C-RDS.

We conjecture that, when $λ^{*} ≻ 0$ ,⁷ 0-RDS implies C-RDS. Ultimately, this amounts to showing that the set of positive arrival rate vectors $λ^{*}$ for which the system is delay stable (robustly over all distributions with given tail exponents) is open. The rationale behind this conjecture is that, in more standard settings (ordinary stability) the set of positive vectors $λ^{*}$ that lead to a stable system is open.

A.3.2. C-RJF $\overset{?}{\Rightarrow}$ PC-RJF.

We conjecture that this implication is true although we do not see how to establish it. If it is true, it would follow from the diagram in Figure A.1 that C-RJF and C-RDS are equivalent to G-RDS, UC-RJF, and PC-RJF.

An indirect approach to establishing the conjecture is to show that (i) C-RJF $\Rightarrow$ C-RDS and (ii) C-RDS $\Rightarrow$ G-RDS. However, this appears to be difficult. Our proof that PC-RJF implies G-RDS involves the set W of points reachable by ϵ-JF trajectories; see Definition 6. However, when we restrict $λ (\cdot)$ to be constant, this set is no longer ϵ-invariant, and Lemma 4(ii) fails to go through.

A.3.3. Generic Considerations.

A fundamental reason behind the mismatch between 0-RJF and G-RDS is that, at least for simple examples, the set of nonnegative nominal rates $λ^{*}$ for which 0-RJF holds is closed, whereas the set of positive nominal rates $λ^{*}$ for which G-RDS holds is open. It is conceivable, however, that one set is the closure of the other and the difference between the two sets is just a lower dimensional boundary. This leads us to the conjecture that 0-RJF and G-RDS are generically equivalent.

Hypothesis A.1.

Let us fix a network and some $γ$ . The set of nonnegative nominal arrival vectors $λ^{*}$ for which the 0-RJF condition holds but G-RDS does not hold has zero Lebesgue measure.

A.4. The Details of the Arrival Distribution May Matter

Our discussion so far has been about distributionally robust results, dealing with delay stability for all arrival distributions with the given tail exponents $γ$ . The reason for this was that JF models cannot take into account any further properties of these distributions.

Once we start inquiring about delay stability for a fixed, fully specified system, the situation is more complex: necessary and sufficient conditions for delay stability appear to be impossible. We illustrate the situation by stating a positive result and disciussing the obstacles in establishing a converse.

A.4.1. Delay Stability Implies the 0-RJF Condition Under a Regularity Assumption.

Suppose that a particular system (with a constant arrival rate $λ^{*}$ and given, i.i.d. arrival distributions) is delay stable. Suppose, furthermore, that the distribution of each $A_{j} (t)$ satisfies (3) with γ replaced by the appropriate γ_j. Then, it can be shown that the 0-RJF condition holds. The argument involves similar ideas as the proof in Section 8. That is, we can show that the stochastic system can track an ϵ-JF trajectory with significant probability.

Note, however, that the 0-RJF condition does not imply delay stability,even under such a regularity assumption. The argument is the same as in our earlier example that showed that the 0-RJF condition does not imply the 0-RDS condition.

A.4.2. Without a Regularity Assumption, Delay Stability Need Not Imply the 0-RJF Condition.

In contrast to the aforementioned result, we have strong reasons to conjecture that there exist systems that are delay stable, and yet, the 0-RJF condition fails to hold. The intuition behind this conjecture is as follows.

Consider a system with two heavy-tailed arrival streams together with some light-tailed ones. Suppose that the tail exponents of the heavy-tailed arrivals are larger than 1/3. We can arrange the system so that a JF trajectory drives the light-tailed queue of interest to a positive value if and only if we have one jump at each heavy-tailed queue, the two jump times are approximately equal, and the jump sizes are comparable (within a constant factor of each other). Such a system does not satisfy the 0-RJF condition.

As in the proof in Section 8, we might expect that the stochastic system can track this JF trajectory. However, we can arrange the arrival process distributions for the two heavy-tailed queues to be such that their supports are wide apart. For example, one distribution may be supported on integers of the form $10^{2 i}$ and the other on integers of the form $10^{2 i + 1}$ . In that case, equal-size jumps are essentially impossible. As a consequence, the stochastic system should be unable to emulate the JF trajectory, the instability mechanism suggested by the JF trajectory need not be present, and the queue of interest may turn out to be delay stable.

A.5. The Timing of the Jumps

The definition of ϵ-JF trajectories allows for jumps at different times. On the other hand, our examples so far rely on jumps that happen simultaneously. This raises the question whether the RJF condition is equivalent to an analogous condition in which we only consider trajectories with simultaneous jumps. It turns out that this is not possible. We give an example with four queues in which an ϵ-JF trajectory drives a certain queue to a positive value, but this is only possible if we allow jumps to occur at different times.

Consider the system in Figure A.2. The first queue receives heavy-tailed arrivals with $γ_{1} = 1 / 2$ , whereas the three other queues receive light-tailed arrivals. There are three possible service vectors as shown in the figure, and the arrival rate vector is $λ^{*} = (1, 2, 1, 1)$ . Note that the condition $γ^{T} n \leq 1$ allows up to two jumps at queue 1. If we restrict to simultaneous jumps, this essentially limits us to a single jump at queue 1.

Figure A.2. A Network with Three Light-Tailed Queues and One Heavy-Tailed Queue, Which Demonstrates the Importance of the Timing of Multiple Jumps
*Notes*. In the example of this figure, if q₁ undergoes a single jump at time 0, q₄ stays at zero. However, two jumps in q₁ at suitably arranged times can result in a positive q₄.

Suppose that q₁ has a jump of size 27 at time 0. Then, the 0-JF trajectory is piecewise linear with breakpoints $q (0) = (27, 0, 0, 0), q (3) = (6, 6, 0, 0), q (5) = (0, 2, 2, 0)$ , and $q (9) = 0$ . This can be easily verified by noticing that the MW policy chooses service vector $μ^{1}$ for $t \in [0, 3)$ and service vector $μ^{2}$ for $t \in (3, 5)$ , and the service capacity is split between $μ^{2}$ and $μ^{3}$ with ratios 5/8 and 3/8 for $t \in (5, 9)$ . It then follows from the form of this piecewise linear fluid trajectory that, given a single jump at time 0, q₄ stays at zero for all subsequent times. We now argue that this is not the case if q₁ undergoes two jumps at different times.

Suppose that q₁ has a jump of size 27 at time 0 and a jump of size 2 at time 5. Let $p (\cdot)$ be the associated jumping fluid trajectory. Then, right before time 5, we have $p (5^{-}) = q (5^{-}) = (0, 2, 2, 0)$ . Therefore, after the second jump, we have $p (5) = (2, 2, 2, 0)$ . In this case, $μ^{3}$ is the dominant service vector for some positive time interval starting from time 5. Because q₄ receives no service under $μ^{3}$ , it starts to build up and become positive.

In this example, the 0-RJF condition fails to hold, and the system is not 0-RDS. On the other hand, if we were to restrict to simultaneous jumps, we would not be able to tell that this is the case. Finally, using the same example, we see that we should also consider nonsimultaneous jumps when examining the C-RJF or PC-RJF conditions.

Appendix B. Proofs of Lemmas for the First Direction of Theorem 1 (ϵ-JF $\Rightarrow$ RDS)

B.1. Proof of Theorem 4

We compare the fluid trajectory $q (\cdot)$ , which is initialized with $q (0) \neq Q (0)$ , with another fluid trajectory $\tilde{q} (\cdot)$ , initialized with $\tilde{q} (0) = Q (0)$ . From the triangle inequality,

‖ Q (t) - q (t) ‖ \leq ‖ Q (t) - \tilde{q} (t) ‖ + ‖ \tilde{q} (t) - q (t) ‖ .

We apply Theorem 3 to bound the first term on the right-hand side. Because of the nonexpansive property of the MW dynamics, we also have

‖ \tilde{q} (t) - q (t) | \leq ‖ \tilde{q} (0) - q (0) ‖ = ‖ Q (0) - q (0) ‖,

and the result follows.

B.2. Proof of Lemma 1

Let us fix T throughout this proof. Recall the constant $β \in (1, 2)$ introduced in the context of (19). For $j = 1, \dots, ℓ$ , we define

γ_{j}^{'} = {\begin{cases} γ_{j} / β & if γ_{j} < β^{3}, \\ β^{2} & if γ_{j} \geq β^{3} . \end{cases}

We then let $γ' = (γ_{1}^{'}, \dots, γ_{ℓ}^{'})$ , and $γ' = \min_{j} γ_{j}^{'}$ . Note that, in all cases, we have $γ_{j}^{'} \leq γ_{j} / β$ so that $γ' \leq γ / β$ . As argued in Claim 1, we can and do (without loss of generality) assume that $γ = \min_{j} γ_{j} \leq 1$ , and thus, $γ' < 1$ . Finally, in view of (19), it can be seen that, for any nonnegative integer vector n,

if γ^{T} n > 1, then {(γ')}^{T} n \geq β^{2} > 1 .

(B.1)

For every j and according to our definition (5) of the tail exponent of an arrival process, there is a random variable ${\bar{A}}_{j}$ that dominates $A_{j} (t)$ for all $t \geq 0$ and for which all moments of order less than $1 + γ_{j}$ are finite; see (3). We define

Γ_{j} = E [{\bar{A}}_{j}^{1 + γ_{j}^{'}}], j = 1, \dots, ℓ,

(B.2)

which is finite because

γ_{j}^{'} < γ_{j}

For $t = 0, \dots, T - 1$ , and $j = 1, \dots, ℓ$ , let

p_{j, t} = P (A_{j} (t) > θ_{t}) .

For any j and $t \leq T - 1$ , the Markov inequality yields

\begin{array}{l} p_{j, t} & = P (A_{j} (t) > θ_{t}) \\ \leq P ({\bar{A}}_{j} > θ_{t}) \\ = P ({\bar{A}}_{j}^{1 + γ_{j}^{'}} > θ_{t}^{1 + γ_{j}^{'}}) \\ \leq \frac{E [{\bar{A}}_{j}^{1 + γ_{j}^{'}}]}{θ_{t}^{1 + γ_{j}^{'}}} \\ = \frac{Γ_{j} η^{1 + γ_{j}^{'}} \log^{1 + γ_{j}^{'}} (M + T - t)}{{(M + T - t)}^{1 + γ_{j}^{'}}}, \end{array}

(B.3)

where the last equality is due to the definitions of θ_t and Γ_j in (11) and (B.2), respectively.

Let $ϕ = 1 - 2^{- 1 / ℓ}$ and note that $0 < ϕ < 1$ . Because $0 < (1 - 1 / β) γ' < γ_{j}^{'}$ for every j, there exists some $M_{1} \geq 1$ such that, if $M \geq M_{1}$ , then

Γ_{j} η^{1 + γ_{j}^{'}} \log^{1 + γ_{j}^{'}} M \leq ϕ \cdot (γ_{j}^{'} - (1 - 1 / β) γ') \cdot M^{(1 - 1 / β) γ'}, j = 1, \dots, ℓ .

(B.4)

We fix such an M₁. Then, for $M \geq M_{1}$ , we have

\begin{array}{l} \sum_{t = 0}^{T - 1} p_{j, t} & \leq Γ_{j} η^{1 + γ_{j}^{'}} \sum_{t = 0}^{T - 1} \frac{\log^{1 + γ_{j}^{'}} (M + T - t)}{{(M + T - t)}^{1 + γ_{j}^{'}}} \\ \leq Γ_{j} η^{1 + γ_{j}^{'}} \sum_{τ = 1}^{\infty} \frac{\log^{1 + γ_{j}^{'}} (M + τ)}{{(M + τ)}^{1 + γ_{j}^{'}}} \\ \leq ϕ (γ_{j}^{'} - (1 - 1 / β) γ') \sum_{τ = 1}^{\infty} \frac{{(M + τ)}^{(1 - 1 / β) γ'}}{{(M + τ)}^{1 + γ_{j}^{'}}} \\ \leq ϕ (γ_{j}^{'} - (1 - 1 / β) γ') \int_{M}^{\infty} \frac{x^{(1 - 1 / β) γ'}}{x^{1 + γ_{j}^{'}}} d x \\ = ϕ M^{(1 - 1 / β) γ' - γ_{j}^{'}}, \end{array}

(B.5)

where the first inequality is due to (B.3) and the third inequality follows from (B.4).

Let $X_{j, t}$ be the event ${A_{j} (t) > θ_{t}}$ . Recall that N_j stands for the number of jumps at queue j during the interval $[0, T)$ . For any j and any nonnegative integer n, we have

\begin{array}{l} P (N_{j} = n) & \leq \sum_{0 \leq τ_{1} < \dots < τ_{n} \leq T - 1} P (X_{j, τ_{1}} \cap \dots \cap X_{j, τ_{n}}) \\ = \sum_{0 \leq τ_{1} < \dots < τ_{n} \leq T - 1} p_{j, τ_{1}} \dots p_{j, τ_{n}} \\ \leq {(\sum_{t = 0}^{T - 1} p_{j, t})}^{n} \\ \leq ϕ^{n} M^{(1 - 1 / β) γ' n - γ_{j}^{'} n} . \end{array}

(B.6)

Here, the first inequality follows from the union bound, the equality is from the independence of the events $X_{j, τ_{1}}, \dots, X_{j, τ_{n}}$ , and the last inequality is due to (B.5). Therefore, for any $n \in Z_{+}^{ℓ}$ ,

\begin{array}{l} P (N = n) & = \prod_{j = 1}^{ℓ} P (N_{j} = n_{j}) \\ \leq \prod_{j = 1}^{ℓ} ϕ^{n_{j}} M^{(1 - 1 / β) γ' n_{j}} M^{- γ_{j}^{'} n_{j}} \\ = ϕ^{| n |} M^{(1 - 1 / β) γ' | n | - n^{T} γ'}, \end{array}

(B.7)

where

| n | = n_{1} + \dots + n_{ℓ} .

γ^{T} n > 1

, then (B.1) asserts that

n^{T} γ' \geq β^{2}

. As a result and because

γ'

is the smallest component of

γ'

(1 - \frac{1}{β}) γ' | n | - n^{T} γ' \leq (1 - \frac{1}{β}) n^{T} γ' - n^{T} γ' = - \frac{n^{T} γ'}{β} \leq - β .

(B.8)

Hence,

\begin{array}{l} P (γ^{T} N > 1) & = \sum_{n : γ^{T} n > 1} P (N = n) \\ \leq \sum_{n : γ^{T} n > 1} ϕ^{| n |} M^{(1 - 1 / β) γ' | n | - n^{T} γ'} \\ \leq M^{- β} \sum_{n : γ^{T} n > 1} ϕ^{| n |} \\ \leq M^{- β} ((\sum_{n_{1}, \dots, n_{ℓ} \geq 0} ϕ^{| n |}) - 1) \\ = M^{- β} ((\prod_{j = 1}^{ℓ} \sum_{n_{j} = 0}^{\infty} ϕ^{n_{j}}) - 1) \\ = M^{- β} (\frac{1}{{(1 - ϕ)}^{ℓ}} - 1) \\ = M^{- β}, \end{array}

(B.9)

where the first inequality is from (B.7) and the second inequality follows from (B.8). The last equality is because

1 - ϕ = 2^{- 1 / ℓ}

. This completes the proof of Lemma 1.

B.3. Proof of Lemma 2

The Bernstein inequality (see, e.g., (1.21) in appendix 1 of Anthony and Bartlett 2009) asserts that, for any $z \geq 0$ , we have

P (| Y - E [Y] | > z) \leq 2 \exp (- \frac{z^{2} / 2}{\sum_{i = 1}^{n} E [{(X_{i} - E [X_{i}])}^{2}] + b z / 3}) .

(B.10)

We note that $E [{(X_{i} - E [X_{i}])}^{2}] \leq E [X_{i}^{2}] \leq E [X_{i} b] \leq b \bar{λ}$ . Plugging this into (B.10), we obtain

P (| Y - E [Y] | > z) \leq 2 \exp (- \frac{z^{2} / 2}{n b \bar{λ} + b z / 3}) .

(B.11)

This implies (20) and completes the proof of Lemma 2.

B.4. Proof of Lemma 3

For every j, and according to our definition (5) of the tail exponent of an arrival process, there is a random variable ${\bar{A}}_{j}$ that dominates $A_{j} (t)$ for all $t \geq 0$ and for which all moments of order less than $1 + γ_{j}$ are finite; see (3). Because $γ_{j} > 0$ , we have $\int_{0}^{\infty} P ({\bar{A}}_{j} > x) d x = E [{\bar{A}}_{j}] < \infty$ . Therefore, there exists a constant $M_{2} > 0$ such that, for any $M > M_{2}$ and every j,

\int_{M / η \log M}^{\infty} P ({\bar{A}}_{j} > x) d x \leq \frac{γ ϵ}{60 C \sqrt{ℓ}} .

(B.12)

Note that the choice of M₂ is independent of T. For the rest of the proof, we fix M₂ and assume that $M > M_{2}$ .

Recall that $A^{*} (τ) = \min {A (τ), θ_{τ}}$ so that

A_{j} (τ) - A_{j}^{*} (τ) = \max {0, A_{j} (τ) - θ_{τ}} .

(B.13)

Therefore,

\begin{array}{l} E [A_{j} (τ)] - E [A_{j}^{*} (τ)] & = E [\max {0, A_{j} (τ) - θ_{τ}}] \\ = \int_{0}^{\infty} P (\max {0, A_{j} (τ) - θ_{τ}} > x) d x \\ = \int_{0}^{\infty} P (A_{j} (τ) - θ_{τ} > x) d x \\ = \int_{θ_{τ}}^{\infty} P (A_{j} (τ) > x) d x \\ \leq \int_{M / η \log M}^{\infty} P (A_{j} (τ) > x) d x \\ \leq \int_{M / η \log M}^{\infty} P ({\bar{A}}_{j} > x) d x \\ \leq \frac{γ ϵ}{60 C \sqrt{ℓ}}, \end{array}

(B.14)

where the fourth equality uses a change of variables from

x + θ_{τ}

to x, the first inequality uses the fact

θ_{τ} \leq M / η \log M

for all

τ \geq 0

(see the definition (11) of θ_t), the second inequality is because

{\bar{A}}_{j}

dominates

A_{j} (τ)

, and the last inequality follows from (B.12). Therefore, for any

τ \geq 0

‖ E [A (τ)] - E [A^{*} (τ)] ‖ \leq \sqrt{ℓ} \max_{j = 1, \dots, ℓ} {E [A_{j} (τ)] - E [A_{j}^{*} (τ)]} \leq \frac{γ ϵ}{60 C} .

(B.15)

We now introduce some “simpler” events whose occurrence are shown to imply the event $E^{fluc} (T, M)$ that is introduced in (21):

E^{*} = {‖ \sum_{τ = t_{0}}^{t} (A^{*} (τ) - E [A^{*} (τ)]) ‖ \leq \frac{γ ϵ}{30 C} (M + T - t_{0}), for 0 \leq t_{0} \leq t < T},

(B.16)

and for every j,

E_{j}^{*} = {| \sum_{τ = t_{0}}^{t} (A_{j}^{*} (τ) - E [A_{j}^{*} (τ)]) | \leq \frac{γ ϵ}{30 C ℓ} (M + T - t_{0}), for 0 \leq t_{0} \leq t < T},

(B.17)

and we observe that the events

E_{1}^{*}, \dots, E_{ℓ}^{*}

imply the event

E^{*}

. Then, the union bound, applied to the complements of these events, implies that

P (E^{*}) \geq 1 - \sum_{j = 1}^{ℓ} (1 - P (E_{j}^{*})) .

(B.18)

We now argue that the event $E^{*}$ implies the event $E^{fluc} (T, M)$ defined in (21). Indeed, suppose that event $E^{*}$ occurs. Then, as long as $0 \leq t_{0} \leq t < T$ , we obtain

\begin{array}{l} ‖ \sum_{τ = t_{0}}^{t} (A^{*} (τ) - λ^{*}) ‖ \\ = ‖ \sum_{τ = t_{0}}^{t} (A^{*} (τ) - E [A^{*} (τ)]) + \sum_{τ = t_{0}}^{t} (E [A^{*} (τ)] - E [A (τ)]) + \sum_{τ = t_{0}}^{t} (E [A (τ)] - λ^{*}) ‖ \\ \leq ‖ \sum_{τ = t_{0}}^{t} (A^{*} (τ) - E [A^{*} (τ)]) ‖ + \sum_{τ = t_{0}}^{t} ‖ E [A^{*} (τ)] - E [A (τ)] ‖ + \sum_{τ = t_{0}}^{t} ‖ E [A (τ)] - λ^{*} ‖ \\ \leq \frac{γ ϵ}{30 C} (M + T - t_{0}) + \frac{γ ϵ}{60 C} (t - t_{0}) + \frac{γ ϵ}{20 C} (t - t_{0}) \\ \leq \frac{γ ϵ}{10 C} (M + T - t_{0}), \end{array}

(B.19)

where, in the second inequality, we use the definition of the event

E^{*}

to bound the first term, (B.15) to bound the second term, and (14) to bound the third term. The last inequality is because t < T. Thus, the event

E^{*}

implies the event

E^{fluc} (T, M)

, and

P (E^{fluc} (T, M)) \geq P (E^{*}) \geq 1 - \sum_{j = 1}^{ℓ} (1 - P (E_{j}^{*})),

(B.20)

where the second inequality follows from (B.18).

To complete the proof, we derive an upper bound for $1 - P (E_{j}^{*})$ . As a first step, we obtain a relation between various constants, which reflects the fact that η is chosen large enough. We have

\begin{array}{l} \frac{γ^{2} ϵ^{2} η}{2 {(30 C ℓ)}^{2} \bar{μ} + 20 γ ϵ C ℓ} & \geq \frac{γ^{2} ϵ^{2} η}{1800 C^{2} ℓ^{2} \bar{μ} + 20 \bar{μ} C ℓ} \\ \geq \frac{γ^{2} ϵ^{2}}{1820 C^{2} ℓ^{2} \bar{μ}} \cdot η \\ = \frac{γ^{2} ϵ^{2}}{1820 C^{2} ℓ^{2} \bar{μ}} \cdot \frac{8000 C^{2} ℓ^{2} \bar{μ}}{γ^{2} ϵ^{2}} \\ > 4, \end{array}

(B.21)

where the first inequality is due to the assumptions

γ \leq 1

and the fact

ϵ < \bar{μ}

, which is evident from the definition (16) of

\bar{μ}

; the second inequality is because

C \geq 1

, and the equality follows from the definition of η in (17).

Finally, we note that, for any $τ \geq 0$ and every j,

E [A_{j}^{*} (τ)] \leq E [A_{j} (τ)] \leq ‖ E [A (τ)] ‖ \leq ‖ λ^{*} ‖ + ϵ \leq \bar{μ} .

(B.22)

For any t₀ and t with $0 \leq t_{0} \leq t < T$ , using the fact that $A_{j}^{*} (τ)$ is bounded above by $θ_{τ}$ , we have

\begin{array}{l} P & (| \sum_{τ = t_{0}}^{t} (A_{j}^{*} (τ) - E [A_{j}^{*} (τ)]) | > \frac{γ ϵ}{30 C ℓ} (M + T - t_{0})) \\ \leq 2 \exp (- \frac{{(\frac{γ ϵ}{30 C ℓ} (M + T - t_{0}))}^{2}}{2 \frac{M + T - t_{0}}{η \log (M + T - t_{0})} \cdot (\bar{μ} (t - t_{0} + 1) + \frac{γ ϵ}{90 C ℓ} (M + T - t_{0}))}) \\ \leq 2 \exp (- \frac{γ^{2} ϵ^{2} η \log (M + T - t_{0})}{2 {(30 C ℓ)}^{2} \bar{μ} + 20 γ ϵ C ℓ}) \\ \leq 2 \exp (- 4 \log (M + t - t_{0})) \\ = \frac{2}{{(M + T - t_{0})}^{4}}, \end{array}

(B.23)

where the first inequality follows from Lemma 2 with

z = (γ ϵ / 30 C ℓ) (M + T - t_{0}), b = (M + T - t_{0}) / η \log (M + T - t_{0}) = θ_{t_{0}} \geq θ_{τ}, \bar{λ} = \bar{μ}

(see (B.22)), and

n = t - t_{0} + 1

; the second inequality is because

t - t_{0} + 1 < M + T - t_{0}

, and the third inequality is due to (B.21). Therefore, for every j,

\begin{array}{l} 1 - P (E_{j}^{*}) & \leq \sum_{t_{0} = 0}^{T - 1} \sum_{t = t_{0}}^{T - 1} P (| \sum_{τ = t_{0}}^{t} (A_{j}^{*} (τ) - E [A_{j}^{*} (τ)]) | > \frac{γ ϵ}{30 C ℓ} (M + T - t_{0})) \\ \leq \sum_{t_{0} = 0}^{T - 1} \sum_{t = t_{0}}^{T - 1} \frac{2}{{(M + T - t_{0})}^{4}} \\ = \sum_{t_{0} = 0}^{T - 1} \frac{2 (T - t_{0})}{{(M + T - t_{0})}^{4}} \\ \leq \sum_{t_{0} = 0}^{T - 1} \frac{2}{{(M + T - t_{0})}^{3}} \\ \leq \sum_{τ = 1}^{\infty} \frac{2}{{(M + τ)}^{3}} \\ \leq \int_{0}^{\infty} \frac{2}{{(M + x)}^{3}} d x \\ = M^{- 2}, \end{array}

(B.24)

where the first inequality is from the union bound and the second inequality is due to (B.23). Plugging (B.24) into (B.20), we obtain

P (E^{fluc} (T, M)) \geq 1 - ℓ M^{- 2}

. This completes the proof of Lemma 3.

B.5. Proof of Lemma 4

We start with the proof of the first part and assume that W is ϵ-invariant. We prove the result under the additional assumption that the set W is closed. This is without loss of generality for the following reason. Given an ϵ-invariant set W, let $\bar{W}$ be its closure. Because fluid trajectories under the MW policy are continuous functions of their initial conditions, it is not hard to see that $\bar{W}$ is also ϵ-invariant. Once we show the result for closed sets, we have established that $\bar{W}$ is ϵ-attracting. Finally, because $d (x, W) = d (x, \bar{W})$ for all x, we can conclude that W is also ϵ-attracting.

Having assumed that W is closed, we now consider a fluid trajectory $q (\cdot)$ initialized with $q (0) = q_{0} ⪰ 0$ with $q_{0} \notin W$ . Then, there exists some $x_{0} \in W$ , which is closest to $q_{0}$ . Let $x (\cdot)$ be a fluid trajectory initialized at $x (0) = x_{0}$ and corresponding to the rate vector

λ = λ^{*} + ϵ \frac{q_{0} - x_{0}}{‖ q_{0} - x_{0} ‖} .

(B.25)

Because W is ϵ-invariant and $‖ λ - λ^{*} ‖ = ϵ$ , we have $x (t) \in W$ for all $t \geq 0$ . In particular, $d (q (t), W) \leq ‖ q (t) - x (t) ‖$ for all $t \geq 0$ . Furthermore, equality holds at time t = 0. This implies that

\frac{d}{d t} d (q (t), W) |_{t = 0} \leq \frac{d}{d t} ‖ q (t) - x (t) ‖ |_{t = 0} .

(B.26)

From the fluid equations (see Definition 2), we have $\dot{x} (0) \in {\bar{D}}_{λ} (x_{0}) = λ - \bar{M} (x_{0})$ . Equivalently, there exist coefficients $α_{μ} \geq 0$ for $μ \in M (x_{0})$ such that $\sum_{μ \in M (x_{0})} α_{μ} = 1$ and

\dot{x} (0) = λ - \sum_{μ \in M (x_{0})} α_{μ} μ .

(B.27)

Similarly, let $β_{ν} \geq 0$ for $ν \in M (q_{0})$ be a set of coefficients such that $\sum_{ν \in M (q_{0})} β_{ν} = 1$ and

\dot{q} (0) = λ^{*} - \sum_{ν \in M (q_{0})} β_{ν} ν .

(B.28)

For any $μ \in M (x_{0})$ and any $ν \in M (q_{0})$ because, by definition, $μ$ is a maximizer of $u^{T} x_{0}$ over all $u \in M$ , we have ${(μ - ν)}^{T} x_{0} \geq 0$ . Similarly, ${(ν - μ)}^{T} q_{0} \geq 0$ . Combining these two inequalities, we obtain

{(μ - ν)}^{T} (x_{0} - q_{0}) = {(μ - ν)}^{T} x_{0} + {(ν - μ)}^{T} q_{0} \geq 0, \forall μ \in M (x_{0}), ν \in M (q_{0}) .

(B.29)

Because $\sum_{μ \in M (x_{0})} α_{μ} = \sum_{ν \in M (q_{0})} β_{ν} = 1$ , it follows that

{(\sum_{μ \in M (x_{0})} α_{μ} μ - \sum_{ν \in M (q_{0})} β_{ν} ν)}^{T} (x_{0} - q_{0}) = \sum_{μ \in M (x_{0})} \sum_{ν \in M (q_{0})} α_{μ} β_{ν} {(μ - ν)}^{T} (x_{0} - q_{0}) \geq 0 .

(B.30)

Using (B.26), we have

\begin{array}{l} \frac{d}{d t} d (q (t), W) |_{t = 0} & \leq \frac{d}{d t} ‖ q (t) - x (t) ‖ |_{t = 0} \\ = \frac{{(\dot{x} (0) - \dot{q} (0))}^{T} (x_{0} - q_{0})}{‖ x_{0} - q_{0} ‖} \\ = \frac{1}{‖ x_{0} - q_{0} ‖} {(λ - \sum_{μ \in M (x_{0})} α_{μ} μ - λ^{*} + \sum_{ν \in M (q_{0})} β_{ν} ν)}^{T} (x_{0} - q_{0}) \\ = \frac{{(λ - λ^{*})}^{T} (x_{0} - q_{0})}{‖ x_{0} - q_{0} ‖} - {(\sum_{μ \in M (x_{0})} α_{μ} μ - \sum_{ν \in M (q_{0})} β_{ν} ν)}^{T} \frac{x_{0} - q_{0}}{‖ x_{0} - q_{0} ‖} \\ \leq \frac{{(λ - λ^{*})}^{T} (x_{0} - q_{0})}{‖ x_{0} - q_{0} ‖} \\ = {(\frac{ϵ (q_{0} - x_{0})}{‖ x_{0} - q_{0} ‖})}^{T} \frac{(x_{0} - q_{0})}{‖ x_{0} - q_{0} ‖} \\ = - ϵ . \end{array}

(B.31)

Here, the second equality follows from (B.27) and (B.28), the second inequality is due to (B.30), and the fourth equality uses the definition of $λ$ in (B.25). Thus, W is ϵ-attracting.

For the proof of the second part, we fix some n and consider the set $W (n)$ as in Definition 6. Let x be an element of $W (n)$ . Then, there exists an ϵ-JF(n) trajectory $x (\cdot)$ that reaches x. Consider now a fluid trajectory $q (\cdot)$ , corresponding to $λ$ , for some $λ$ with $‖ λ - λ^{*} ‖ \leq ϵ$ and initialized at x. The concatenation of the trajectories $x (\cdot)$ and $q (\cdot)$ is an ϵ-JF(n) trajectory, and therefore, any point that it can reach also belongs to $W (n)$ . Thus, $W (n)$ is ϵ-invariant as claimed.

B.6. Proof of Lemma 5

As in the statement of the lemma, we fix some $M \geq M_{3}$ and times that satisfy $0 \leq t_{0} < t_{1} \leq T$ . We also fix a sample path under which the event $E^{fluc} (T, M)$ occurs, and the interval (t₀, t₁) is jump-free.

Let $q (\cdot)$ be a fluid trajectory corresponding to the arrival rate $λ^{*}$ and initialized with $q (t_{0} + 1) = Q (t_{0} + 1)$ . Then,

\begin{array}{l} ‖ Q (t_{1}) - q (t_{1}) ‖ & \leq C (1 + ‖ λ^{*} ‖ + \max_{t \in (t_{0}, t_{1})} ‖ \sum_{τ = t_{0} + 1}^{t} (A (τ) - λ^{*}) ‖) \\ = C (1 + ‖ λ^{*} ‖ + \max_{t \in (t_{0}, t_{1})} ‖ \sum_{τ = t_{0} + 1}^{t} (A^{*} (τ) - λ^{*}) ‖) \\ \leq C (1 + ‖ λ^{*} ‖ + \frac{γ ϵ}{10 C} (M + T - t_{0})) \\ = \frac{γ ϵ}{10} (M + T - t_{0}) + (‖ λ^{*} ‖ + 1) C, \end{array}

(B.32)

where the first inequality follows from Theorem 4, the first equality is because (t₀, t₁) is jump-free, and as a result,

A^{*} (τ) = A (τ)

for all

τ \in (t_{0}, t_{1})

, and the second inequality is because of the occurrence of the event

E^{fluc} (T, M)

.⁸

Moreover, from Lemma 4, $W (t_{0} + 1)$ is ϵ-attracting. Furthermore, because (t₀, t₁) is a jump-free interval, we have $W (t_{1}) = W (t_{0} + 1)$ . Therefore,

\begin{array}{l} d (q (t_{1}), W (t_{1})) & = d (q (t_{1}), W (t_{0} + 1)) \\ \leq \max {0, d (q (t_{0} + 1), W (t_{0} + 1)) - (t_{1} - t_{0} - 1) ϵ}, \end{array}

(B.33)

where the inequality rests on (23).

We now consider the effect of possible jumps at time t₀. Let $k \geq 0$ be the number of jumps that occur at time t₀ in different entries of $A (t_{0})$ . Without loss of generality, suppose that $A_{1}, \dots, A_{k}$ undergo a jump at time t₀. Let J be an $ℓ$ -dimensional vector with the first k entries equal to $A_{1} (t_{0}), \dots, A_{k} (t_{0})$ and all other entries equal to zero. Then, $A (t_{0}) - J$ is an $ℓ$ -dimensional vector whose first k entries are zero; all other entries are jump-free and are, therefore, bounded by $θ_{t_{0}} = (M + T - t_{0}) / (η \log (M + T - t_{0}))$ . Thus,

‖ A (t_{0}) - J ‖ \leq \frac{(M + T - t_{0}) ℓ}{η \log (M + T - t_{0})} .

(B.34)

A key consequence of our definition of the sets W(t) is that if t₀ is a jump time, then some components of the vector $N (t_{0})$ are larger than those of $N (t_{0} - 1)$ so that the set $W (t_{0} + 1)$ is larger than the set $W (t_{0})$ . In particular, if $x \in W (t_{0})$ , then $x + J \in W (t_{0} + 1)$ . Now, let x be a point in the closure of $W (t_{0})$ that is closest to $Q (t_{0})$ , that is,

x \in \underset{y \in closure (W (t_{0}))}{argmin} ‖ y - Q (t_{0}) ‖ .

Because $x \in closure (W (t_{0}))$ , it follows that $x + J \in closure (W (t_{0} + 1))$ . Therefore,

\begin{array}{l} d (q (t_{0} + 1), W (t_{0} + 1)) & \leq d (q (t_{0} + 1), x + J) \\ = d ({[Q (t_{0}) - μ (t_{0})]}^{+} + A (t_{0}), x + J) \\ \leq d ({[Q (t_{0}) - μ (t_{0})]}^{+} + A (t_{0}), Q (t_{0}) + A (t_{0})) \\ + d (Q (t_{0}) + A (t_{0}), Q (t_{0}) + J) + d (Q (t_{0}) + J, x + J) \\ = d ({[Q (t_{0}) - μ (t_{0})]}^{+}, Q (t_{0})) + d (A (t_{0}), J) + d (Q (t_{0}), x) \\ \leq ‖ μ (t_{0}) ‖ + d (A (t_{0}), J) + d (Q (t_{0}), x) \\ \leq \bar{μ} + \frac{(M + T - t_{0}) ℓ}{η \log (M + T - t_{0})} + d (Q (t_{0}), x) \\ = \bar{μ} + \frac{(M + T - t_{0}) ℓ}{η \log (M + T - t_{0})} + d (Q (t_{0}), W (t_{0})), \end{array}

(B.35)

where the first inequality is because

x + J \in closure (W (t_{0} + 1))

, the first equality is from the evolution formula for the MW dynamics in (1) and the initialization

q (t_{0} + 1) = Q (t_{0} + 1)

for

q (\cdot)

, the last inequality follows from (B.34), and the last equality is from the definition of x.

Combining (B.32), (B.33), and (B.35), we obtain, for $M \geq M_{3}$ ,

\begin{array}{l} d (Q (t_{1}), W (t_{1})) & \leq d (Q (t_{1}), q (t_{1})) + d (q (t_{1}), W (t_{1})) \\ \leq \frac{γ ϵ (M + T - t_{0})}{10} + (‖ λ^{*} ‖ + 1) C + d (q (t_{1}), W (t_{1})) \\ \leq \frac{γ ϵ (M + T - t_{0})}{10} + (‖ λ^{*} ‖ + 1) C + \max {0, d (q (t_{0} + 1), W (t_{0} + 1)) - (t_{1} - t_{0} - 1) ϵ} \\ \leq \frac{γ ϵ (M + T - t_{0})}{10} + (‖ λ^{*} ‖ + 1) C \\ + \max {0, [d (Q (t_{0}), W (t_{0})) + \bar{μ} + \frac{(M + T - t_{0}) ℓ}{η \log (M + T - t_{0})}] - (t_{1} - t_{0} - 1) ϵ} \\ \leq \max {0, d (Q (t_{0}), W (t_{0})) - (t_{1} - t_{0} - 1) ϵ} \\ + \bar{μ} + \frac{(M + T - t_{0}) ℓ}{η \log (M + T - t_{0})} + \frac{γ ϵ (M + T - t_{0})}{10} + (‖ λ^{*} ‖ + 1) C \\ \leq \max {0, d (Q (t_{0}), W (t_{0})) - (t_{1} - t_{0}) ϵ} + ϵ \\ + \frac{(M + T - t_{0}) ℓ}{η \log (M + T - t_{0})} + \frac{γ ϵ (M + T - t_{0})}{10} + (‖ λ^{*} ‖ + 1) C + \bar{μ} \\ \leq \max {0, d (Q (t_{0}), W (t_{0})) - (t_{1} - t_{0}) ϵ} + \frac{γ ϵ (M + T - t_{0})}{6}, \end{array}

(B.36)

where the second inequality is from (B.32), the third inequality is due to (B.33), the fourth inequality follows from (B.35), and the last inequality is because M has been chosen large enough as in (25). This completes the proof of Lemma 5.

B.7. Proof of Lemma 6

Let N(t) be the cardinality of $N (t)$ , that is, the number of jumps up to time t. We also use the convention $N (- 1) = 0$ . When the sample path is such that the event $E^{jump} (T, M)$ occurs, then $N (t) \leq 1 / γ$ for all t < T. Thus, for any $t \in [0, T)$ ,

\frac{(N (t) + 2) γ ϵ}{3} \leq \frac{(1 / γ + 2) γ ϵ}{3} = \frac{(1 + 2 γ) ϵ}{3} \leq ϵ,

(B.37)

where we have used our assumption that

γ \leq 1

We fix a sample path of the arrival process $A (\cdot)$ under which both events $E^{jump} (T, M)$ and $E^{fluc} (T, M)$ occur. We use strong induction to prove that, for any $t \in [0, T]$ ,

d (Q (t), W (t)) \leq \frac{(N (t - 1) + 2) γ ϵ}{6} (M + T - t) .

(B.38)

To establish the base case of the induction, we use the assumption $Q (0) = 0$ and the fact that $0 \in W (0)$ (because ϵ-JF trajectories are zero for negative times). Thus, $d (Q (0), W (0)) = 0$ , and therefore, (B.38) holds for t = 0.

For the induction step, we consider a time $t_{1} \in (0, T]$ and assume that (B.38) holds for all $t < t_{1}$ . We show that (B.38) also holds for time t₁. Let

t_{0} = \max {0, T - (2 (T - t_{1}) + M)} = \max {0, 2 t_{1} - T - M} .

(B.39)

Note that either $t_{0} = 0 < t_{1}$ or $t_{0} = 2 t_{1} - T - M \leq t_{1} - M < t_{1}$ so that we always have $t_{0} < t_{1}$ . We consider two cases.

Case B.1.

For the first case, we assume that the interval $(t_{0}, t_{1})$ is jump-free.⁹ We consider two subcases. If $t_{0} = 0$ , then

\begin{array}{l} d (Q (t_{1}), W (t_{1})) & \leq \max {0, d (Q (t_{0}), W (t_{0})) - (t_{1} - t_{0}) ϵ} + \frac{ϵ γ}{6} (M + T - t_{0}) \\ = \frac{ϵ γ}{6} (M + T - t_{0}) \\ \leq \frac{ϵ γ}{6} (M + T - (2 t_{1} - T - M)) \\ = \frac{2 ϵ γ}{6} (M + T - t_{1}), \end{array}

(B.40)

where the first inequality is due to Lemma 5 and the first equality is because of the assumptions

t_{0} = 0

and

Q (0) = 0

together with the observation that

0 \in W (0)

. The second inequality follows from (B.39), which implies that

0 = t_{0} \geq 2 t_{1} - T - M

. In particular, (B.38) holds for t = t₁.

For the second subcase, we assume that $t_{0} > 0$ , in which case, $t_{0} = 2 t_{1} - T - M$ . Then,

\begin{array}{l} d (Q (t_{1}), W (t_{1})) \\ \leq \max {0, d (Q (t_{0}), W (t_{0})) - (t_{1} - t_{0}) ϵ} + \frac{ϵ γ}{6} (M + T - t_{0}) \\ \leq \max {0, \frac{(N (t_{0} - 1) + 2) γ ϵ}{6} (M + T - t_{0}) - (t_{1} - t_{0}) ϵ} + \frac{ϵ γ}{6} (M + T - t_{0}) \\ = \max {0, \frac{(N (t_{0} - 1) + 2) γ ϵ}{6} (2 M + 2 T - 2 t_{1}) - (M + T - t_{1}) ϵ} + \frac{ϵ γ}{6} (2 M + 2 T - 2 t_{1}) \\ = \max {0, (\frac{(N (t_{0} - 1) + 2) γ ϵ}{3} - ϵ) (M + T - t_{1})} + \frac{2 ϵ γ}{6} (M + T - t_{1}) \\ = \frac{2 ϵ γ}{6} (M + T - t_{1}), \end{array}

(B.41)

where the first inequality follows from Lemma 5, the second inequality is due to the induction hypothesis (B.38), the first equality uses the substitution

t_{0} = 2 t_{1} - T - M

, and the last equality is due to (B.37). Thus, (B.38) again holds for t=t₁. This completes the induction step for Case B.1.

Case B.2.

For the second case, we assume that there is a jump in the interval $(t_{0}, t_{1})$ . Let ${\hat{t}}_{0}$ be the last jump time in the interval $(t_{0}, t_{1})$ so that $({\hat{t}}_{0}, t_{1})$ is jump-free. Because ${\hat{t}}_{0}$ is the last jump time, we have

N (t_{1} - 1) = N ({\hat{t}}_{0}) \geq N ({\hat{t}}_{0} - 1) + 1,

(B.42)

where the inequality is because there is at least one jump at

{\hat{t}}_{0}

(we say “at least” because, at time

{\hat{t}}_{0}

, we could have jumps at multiple components of

A (\cdot)

. Consequently,

\begin{array}{l} d (Q (t_{1}), W (t_{1})) \\ \leq \max {0, d (Q ({\hat{t}}_{0}), W ({\hat{t}}_{0})) - (t_{1} - {\hat{t}}_{0}) ϵ} + \frac{ϵ γ}{6} (M + T - {\hat{t}}_{0}) \\ \leq \max {0, \frac{(N ({\hat{t}}_{0} - 1) + 2) γ ϵ}{6} (M + T - {\hat{t}}_{0}) - (t_{1} - {\hat{t}}_{0}) ϵ} + \frac{ϵ γ}{6} (M + T - {\hat{t}}_{0}) \\ = \max {\frac{ϵ γ}{6} (M + T - {\hat{t}}_{0}), \frac{(N ({\hat{t}}_{0} - 1) + 3) γ ϵ}{6} (M + T - {\hat{t}}_{0}) - (t_{1} - {\hat{t}}_{0}) ϵ} \\ \leq \max (\frac{ϵ γ}{6} (M + T - {\hat{t}}_{0}), \frac{(N (t_{1} - 1) + 2) γ ϵ}{6} (M + T - {\hat{t}}_{0}) - (t_{1} - {\hat{t}}_{0}) ϵ) \\ \leq \max {\frac{ϵ γ}{6} (2 M + 2 T - 2 t_{1}), \frac{(N (t_{1} - 1) + 2) γ ϵ}{6} (M + T - t_{1}) \\ + (\frac{(N (t_{1} - 1) + 2) γ ϵ}{6} - ϵ) (t_{1} - {\hat{t}}_{0})} \\ \leq \max {\frac{2 ϵ γ}{6} (M + T - t_{1}), \frac{(N (t_{1} - 1) + 2) γ ϵ}{6} (M + T - t_{1})} \\ = \frac{(N (t_{1} - 1) + 2) γ ϵ}{6} (M + T - t_{1}), \end{array}

(B.43)

where the first inequality follows from Lemma 5 and the assumption that

({\hat{t}}_{0}, t_{1})

is jump-free, the second inequality is due to the induction hypothesis (B.38), the third inequality is from (B.42), the fourth inequality is because

{\hat{t}}_{0} \geq t_{0} \geq 2 t_{1} - T - M

, and the last inequality is due to (B.37). Therefore, the induction step goes through for Case B.2 as well. This completes the proof of the induction and implies (B.38) for all

t \in [0, T]

Finally, letting t = T, (B.38) becomes

d (Q (T), W (T)) \leq \frac{(N (T - 1) + 2) γ ϵ}{6} M \leq \frac{M ϵ}{2},

(B.44)

where the last inequality is due to (B.37). This completes the proof of Lemma 6.

Appendix C. Proof of Lemmas for the Second Direction of Theorem 1: RDS $\Rightarrow ϵ$ -JF Condition

C.1. Proof of Lemma 7

Suppose that the ϵ-JF( $γ$ ) condition fails to hold. Then, there exists a nonnegative integer vector $n'$ with $γ^{T} n' \leq 1$ , an ϵ-JF( $n'$ ) trajectory $q' (\cdot)$ , and some time T such that $q_{m}^{'} (T) > 0$ . If T = 0, we can use right-continuity of trajectories to see that, without loss of generality, we can take T to be positive. We then define a new trajectory $q (\cdot)$ by letting $q (t) = q' (t T) / T$ for all $t \geq 0$ . It is not hard to verify that $q (\cdot)$ is also an ϵ-JF( $n'$ ) trajectory, and satisfies $q_{m} (1) = q_{m}^{'} (T) / T > 0$ so that the first property is satisfied.

Suppose now that some of the jumps of $q (\cdot)$ happen after time 1. Let n be the vector that counts the number of jumps that take place until time 1. Starting with $q (\cdot)$ , we eliminate the jumps that happen after time 1 to obtain an ϵ-JF(n) trajectory with $γ^{T} n \leq γ^{T} n' \leq 1$ , all jumps in $[0, 1]$ , and $q_{m} (1) > 0$ . By slightly perturbing the jump times and using a continuity argument, we can ensure that no two components have simultaneous jumps and also that all jump times belong to (0, 1) so that properties (b) and (c) in Lemma 7 are satisfied.

Finally, we can replace the arrival rates $λ_{j} (t)$ that drive the ϵ-JF trajectory by $\max {λ_{j} (t), ϵ'}$ , where $ϵ' < ϵ$ is a small positive constant. This ensures that $\inf_{t} λ_{j} (t) > 0$ . Furthermore, using a continuity argument and as long as $ϵ'$ is small enough, the property $q_{m} (1) > 0$ is preserved. This proves condition (d) in Lemma 7 and concludes the proof of the lemma.

C.2. Proof of Lemma 8

For simplicity, and without loss of generality, we present the proof for the case in which $t_{0} = 0$ . We let ${\bar{T}}_{1}$ be a large enough constant such that, for all $T \geq {\bar{T}}_{1}$ and all k, we have

(a_{k} - d) T \geq \max {\bar{μ}, \sqrt{T}} .

(C.1)

This is possible because, according to the definition of d in (33), we have $\min_{k} a_{k} > d$ .

We fix some $T \geq {\bar{T}}_{1}$ as well as some $k \in {1, \dots, n}$ . We aim to show that the process has a substantial probability of a jump of size approximately a_kT during the interval $[Θ_{k} T, (Θ_{k} + d) T)$ . Within the proof of this lemma, we use the symbol j (instead of j_k) to denote the index of the queue at which the kth jump took place.

From (C.1), we have $\log ((a_{k} - d) T) \geq (1 / 2) \log T$ . We then obtain, for any t, any j, and any $x \in [(a_{k} - d) T, a_{k} T]$ ,

\begin{array}{l} f_{A_{j} (t)} (x) & = \frac{{\bar{λ}}_{j} (t)}{σ (γ_{j})} \cdot x^{- (2 + γ_{j})} \log (x + 1) 𝟙 (x \geq \bar{μ}) \\ = \frac{{\bar{λ}}_{j} (t)}{σ (γ_{j})} \cdot x^{- (2 + γ_{j})} \log (x + 1) \\ \geq \frac{{\bar{λ}}_{j} (t)}{σ (γ_{j})} \cdot {(a_{k} T)}^{- (2 + γ_{j})} \log ((a_{k} - d) T) \\ \geq ζ T^{- (2 + γ_{j})} \log T, \end{array}

(C.2)

where the second equality is due to (C.1) and ζ is a positive constant chosen so that

ζ \leq {\bar{λ}}_{i} (t) / 2 σ (γ_{i}) a_{k}^{(2 + γ_{j})}

for every i, k, and t. Note that ζ can be taken positive because, according to Lemma 7, we can assume that

\inf_{t} {\bar{λ}}_{j} (t) > 0

We define $ϕ = ζ d$ . Then,

\begin{array}{l} P (A_{j} (t) \in [(a_{k} - d) T, a_{k} T]) & = \int_{(a_{k} - d) T}^{a_{k} T} f_{A_{j} (t)} (x) d x \\ \geq (d T) \cdot ζ T^{- (2 + γ_{j})} \log T \\ = ϕ T^{- (1 + γ_{j})} \log T . \end{array}

(C.3)

As in (39), but with $t_{0} = 0$ , we define

B_{k} = \sum_{t = Θ_{k} T}^{Θ_{k} T + d T - 1} A (t),

(C.4)

and for

t \in [Θ_{k} T, (Θ_{k} + d) T)

U_{t} = B_{k} - A_{j} (t) e_{j},

(C.5)

and note that

‖ U_{t} ‖ \leq ‖ B_{k} ‖

for every k and t. We have

\begin{array}{l} P (‖ U_{t} ‖ \geq 2 d T \bar{μ}) & \leq \frac{E [‖ U_{t} ‖]}{2 d T \bar{μ}} \\ \leq \frac{\sum_{τ = Θ_{k} T}^{(Θ_{k} + d) T - 1} E [‖ A (τ) ‖]}{2 d T \bar{μ}} \\ \leq \frac{d T (‖ λ^{*} ‖ + ϵ)}{2 d T \bar{μ}} \\ \leq \frac{1}{2} . \end{array}

(C.6)

where the first step makes use of the Markov inequality, and the last step uses the fact

‖ λ^{*} ‖ + ϵ \leq \bar{μ}

For $t \in [Θ_{k} T, (Θ_{k} + d) T)$ , let $E_{t}$ be the event that $A_{j} (t) \in [(a_{k} - d) T, a_{k} T]$ and $‖ U_{t} ‖ \leq 2 d T \bar{μ}$ . Note that the term $A_{j} (t)$ is omitted from U_t, and therefore, $A_{j} (t)$ and U_t are independent. Thus, using (C.3) and (C.6), we obtain

P (E_{t}) \geq \frac{ϕ}{2} T^{- (1 + γ_{j})} \log T, \forall t \in [Θ_{k} T, (Θ_{k} + d) T) .

(C.7)

In light of the definition of d in (33), we have $d (1 + 2 \bar{μ}) < a_{k}$ . Therefore,

2 d T \bar{μ} < (a_{k} - d) T .

(C.8)

Thus, for any $t, τ \in [Θ_{k} T, (Θ_{k} + d) T)$ with $τ \neq t$ , if $‖ U_{t} ‖ \leq 2 d T \bar{μ}$ , then

A_{j} (τ) \leq ‖ U_{t} ‖ \leq 2 d T \bar{μ} < (a_{k} - d) T .

(C.9)

Here, the first inequality follows because, if $τ \neq t$ , then $A_{j} (τ)$ is one of the summands in the definition (C.5) of $U_{t}$ , and the last inequality is due to (C.8). Consequently, if $‖ U_{t} ‖ \leq 2 d T \bar{μ}$ , then $E_{τ}$ holds for no $τ \in [Θ_{k} T, (Θ_{k} + d) T)$ with $τ \neq t$ , that is, for $t \neq τ$ , the events $E_{t}$ and $E_{τ}$ are disjoint.

We now claim that, for any $t \in [Θ_{k} T, Θ_{k} T + d T)$ , the event $E_{t}$ implies the event $E_{k}^{jump}$ defined in (40). Indeed, when event $E_{t}$ occurs, we obtain

\begin{array}{l} ‖ B_{k} - a_{k} T e_{j} ‖ & = ‖ U_{t} + A_{j} (t) e_{j} - a_{k} T e_{j} ‖ \\ \leq ‖ U_{t} ‖ + ‖ A_{j} (t) e_{j} - a_{k} T e_{j} ‖ \\ = ‖ U_{t} ‖ + | A_{j} (t) - a_{k} T | \\ \leq 2 d T \bar{μ} + d T, \end{array}

(C.10)

where the last inequality follows from the definition of

E_{t}

. This shows that the event

E_{k}^{jump}

occurs as claimed. Let

ψ = \min {1, d ϕ / 2}

. We then have

\begin{array}{l} P (E_{k}^{jump}) & \geq P (\underset{t \in [Θ_{k} T, Θ_{k} T + d T)}{\cup} E_{t}) \\ = \sum_{t = Θ_{k} T}^{(Θ_{k} + d) T - 1} P (E_{t}) \\ \geq d T \cdot \frac{ϕ T^{- (1 + γ_{j})} \log T}{2} \\ \geq ψ T^{- γ_{j}} \log T, \end{array}

(C.11)

where the first equality is because, for

t \neq τ

, the events

E_{t}

and

E_{τ}

are disjoint; the second inequality is due to (C.7), and the last one uses the definition of ψ. This completes the proof of Lemma 8.

C.3. Proof of Lemma 9

This proof is similar to the proof of Lemma 3 in Appendix B.4 although some of the details are different.

Recall that r is the number of piecewise constant pieces in the trajectory $q^{ϵ} (\cdot)$ . We fix some $k \in {0, 1, \dots, n}$ and let

α = \frac{γ c}{32 C r},

(C.12)

and

η = \frac{8 ℓ (6 ℓ \bar{μ} + α)}{3 α^{2}} .

(C.13)

Claim C.1.

There exists a ${\bar{T}}_{2} \geq 8 ℓ$ such that, for any $T \geq {\bar{T}}_{2}$ , any j, and any $t \in [Θ_{k} T + d T, Θ_{k + 1} T)$ , we have

\int_{T / η \log T}^{\infty} P (A_{j} (t) \geq x) d x \leq \frac{α}{2 \sqrt{ℓ}},

(C.14)

and

P (A_{j} (t) > \frac{T}{η \log T}) \leq \frac{1}{4 ℓ T} .

(C.15)

Proof.

Because all γ_j are positive, we can fix some δ such that $0 < δ < γ_{j}$ for every j. Then, the density $f_{A_{j} (t)} (\cdot)$ in Definition 8 decays at least as fast as $x^{- (2 + δ)}$ . More concretely, there exists a constant χ such that for all j and t, we have

f_{A_{j} (t)} (x) \leq χ x^{- (2 + δ)}, \forall x \geq \bar{μ} .

We then have, for any time $t \in [Θ_{k} T + d T, Θ_{k + 1} T)$ , any $y \geq 1$ , and any j,

P (A_{j} (t) \geq y) \leq χ \int_{y}^{\infty} x^{- (2 + δ)} d x = \frac{χ}{1 + δ} \cdot x^{- (1 + δ)} .

(C.16)

It then follows that, as T goes to infinity, both $T \cdot P (A_{j} (τ) > T / η \log T)$ and $\int_{T / η \log T}^{\infty} P (A_{j} (τ) \geq x) d x$ go to zero uniformly over all j and τ. Therefore, there exists a ${\bar{T}}_{2}$ such that, for any $T \geq {\bar{T}}_{2}$ , (C.15) and (C.14) hold, and the claim follows. □

For the rest of the proof, we fix such a ${\bar{T}}_{2}$ and assume that $T \geq {\bar{T}}_{2}$ . For any $τ \in [Θ_{k} T + d T, Θ_{k + 1} T)$ and every j, let

A_{j}^{*} (τ) = \min {A_{j} (τ), \frac{T}{η \log T}} .

(C.17)

We define the “no jumps” event $E^{=}$ as follows:

E^{=} = {A (τ) = A^{*} (τ), forall τ \in [Θ_{k} T + d T, Θ_{k + 1} T)} .

(C.18)

Then,

\begin{array}{l} 1 - P (E^{=}) & \leq \sum_{j = 1}^{ℓ} \sum_{τ = Θ_{k} T + d T}^{Θ_{k + 1} T} P (A_{j} (τ) > \frac{T}{η \log T}) \\ \leq \sum_{j = 1}^{ℓ} \sum_{τ = Θ_{k} T + d T}^{Θ_{k + 1} T} \frac{1}{4 ℓ T} \\ \leq \frac{1}{4}, \end{array}

(C.19)

where the first inequality uses the union bound, the second inequality follows from (C.15), and the last inequality uses the fact that d as defined in (33) is no larger than one. Moreover, for any j and any

τ \in [Θ_{k} T + d T, Θ_{k + 1} T)

and using the same steps as in (B.14),

E [A_{j} (τ)] - E [A_{j}^{*} (τ)] \leq \int_{T / η \log T}^{\infty} P (A_{j} (τ) > x) d x \leq \frac{α}{2 \sqrt{ℓ}},

where the last inequality follows from (C.14). Therefore, for any

τ \geq 0

‖ E [A (τ)] - E [A^{*} (τ)] ‖ \leq \sqrt{ℓ} \max_{j} (E [A_{j} (τ)] - E [A_{j}^{*} (τ)]) \leq \frac{α}{2} .

(C.20)

Consider now the following events:

E^{*} = {‖ \sum_{τ = Θ_{k} T + d T}^{t} (A^{*} (τ) - E [A^{*} (τ)]) ‖ \leq \frac{α T}{2}, \forall t \in [Θ_{k} T + d T, Θ_{k + 1} T)},

(C.21)

and for

j = 1, \dots, ℓ

E_{j}^{*} = {| \sum_{τ = Θ_{k} T + d T}^{t} (A_{j}^{*} (τ) - E [A_{j}^{*} (τ)]) | \leq \frac{α T}{2 ℓ}, \forall t \in [Θ_{k} T + d T, Θ_{k + 1} T)} .

(C.22)

Note that, if all of the events $E_{1}^{*}, \dots, E_{ℓ}^{*}$ occur, then $E^{*}$ also occurs. By applying the union bound to the complement of these events, we have

1 - P (E^{*}) \leq \sum_{j = 1}^{ℓ} (1 - P (E_{j}^{*})) .

(C.23)

Consider a sample path under which the events $E^{=}$ and $E^{*}$ occur. Then, for any $t \in [Θ_{k} T + d T, Θ_{k + 1} T)$ ,

\begin{array}{l} ‖ \sum_{τ = Θ_{k} T + d T}^{t} & (A (τ) - E [A (τ)]) ‖ = ‖ \sum_{τ = Θ_{k} T + d T}^{t} (A^{*} (τ) - E [A (τ)]) ‖ \\ = ‖ \sum_{τ = Θ_{k} T + d T}^{t} (A^{*} (τ) - E [A^{*} (τ)]) + \sum_{τ = Θ_{k} T + d T}^{t} (E [A^{*} (τ)] - E [A (τ)]) ‖ \\ \leq ‖ \sum_{τ = Θ_{k} T + d T}^{t} (A^{*} (τ) - E [A^{*} (τ)]) ‖ + \sum_{τ = Θ_{k} T + d T}^{t} ‖ E [A^{*} (τ)] - E [A (τ)] ‖ \\ \leq ‖ \sum_{τ = Θ_{k} T + d T}^{t} (A^{*} (τ) - E [A^{*} (τ)]) ‖ + \frac{α T}{2} \\ \leq \frac{α T}{2} + \frac{α T}{2} \\ = \frac{γ c T}{32 C r}, \end{array}

(C.24)

where the first equality is due to

E^{=}

, the second inequality is due to (C.20), the third inequality follows from

E^{*}

, and the last equality is from the definition of α in (C.12). Therefore, the events

E^{=}

and

E^{*}

imply the event

E_{k}^{fluc}

. Using the union bound on the complements of these events,

1 - P (E_{k}^{fluc}) \leq (1 - P (E^{=})) + (1 - P (E^{*})) \leq \frac{1}{4} + \sum_{j = 1}^{ℓ} (1 - P (E_{j}^{*})),

(C.25)

where the last inequality follows from (C.19) and (C.23). To complete the proof, we develop an upper bound on

1 - P (E_{j}^{*})

For any $τ \geq 0$ and every j and as in (B.22), we have

E [A_{j}^{*} (τ)] \leq E [A_{j} (τ)] \leq ‖ E [A (τ)] ‖ \leq \bar{μ} .

(C.26)

For any $t \in [Θ_{k} T + d T, Θ_{k + 1} T)$ , we have

\begin{array}{l} P & (| \sum_{τ = Θ_{k} T + d T}^{t} (A_{j}^{*} (τ) - E [A_{j}^{*} (τ)]) | > \frac{α T}{2 ℓ}) \\ \leq 2 \exp (- \frac{{(α T / 2 ℓ)}^{2}}{2 \frac{T}{η \log T} \cdot (\bar{μ} (t - Θ_{k} T - d T + 1) + α T / 6 ℓ)}) \\ \leq 2 \exp (- \frac{3 α^{2} η \log T}{4 ℓ (6 ℓ \bar{μ} + α)}) \\ = 2 \exp (- 2 \log T) \\ = 2 T^{- 2}, \end{array}

(C.27)

where the first inequality follows from Lemma 2 with the identifications

z = α T / 2 ℓ, b = T / η \log T, \bar{λ} = \bar{μ}

(see (C.26)), and

n = t - Θ_{k} T - d T + 1

; the second inequality is because

n \leq T

; the equality follows from the definition of η in (C.13). Therefore, for every j,

\begin{array}{l} 1 - P (E_{j}^{*}) & \leq \sum_{t = Θ_{k} T + d T}^{T} P (| \sum_{τ = Θ_{k} T + d T}^{t} (A_{j}^{*} (τ) - E [A_{j}^{*} (τ)]) | > \frac{α T}{2 ℓ}) \\ \leq \sum_{t = 1}^{T} \frac{2}{T^{2}} \\ = 2 T^{- 1} \\ \leq \frac{1}{4 ℓ}, \end{array}

(C.28)

where the first inequality is from the union bound, the second inequality is due to (C.27), and the last inequality is from the condition

T \geq {\bar{T}}_{2} \geq 8 ℓ

in Claim C.1. Plugging (C.28) into (C.25), we obtain

P (E_{k}^{fluc}) \geq 1 / 2

. This completes the proof of Lemma 9.

C.4. Proof of Lemma 10

The proof of this lemma is essentially a continuity argument together with an induction that pieces together different time segments.

For simplicity and without loss of generality, we assume that the constant t₀ in the statement of the lemma is equal to zero. Note, however, that, with this convention $Q (0)$ is, in general, nonzero.

For any $λ, x \in R_{+}^{ℓ}$ , let

ξ_{λ} (x) = \dot{q} (0),

(C.29)

where

q (\cdot)

is the fluid trajectory corresponding to arrival rate

λ

and initialized with

q (0) = x

. In view of (9), we have

ξ_{λ} (x) \in {\bar{D}}_{λ} (x)

. From the definition (8) of the set

{\bar{D}}_{λ} (x)

of possible drifts and the standing assumption

‖ λ (τ) - λ^{*} ‖ \leq ϵ

for all τ, we have

‖ ξ_{λ (τ)} (q^{ϵ} (τ)) ‖ \leq \bar{μ}, \forall τ \geq 0,

(C.30)

where

\bar{μ}

is defined in (30). By taking into account the jump

a_{k} e_{j_{k}}

at time Θ_k and then integrating the drift

ξ_{λ (τ)}

over the jump-free interval

[Θ_{k}, Θ_{k} + d]

, we have

‖ q^{ϵ} (Θ_{k} + d) - q^{ϵ} (Θ_{k}) ‖ \leq d \bar{μ} .

Noting also that $q^{ϵ} (Θ_{k}) = q^{ϵ} (Θ_{k}^{-}) + a_{k} e_{j_{k}}$ , we obtain

‖ q^{ϵ} (Θ_{k} + d) - q^{ϵ} (Θ_{k}^{-}) - a_{k} e_{j_{k}} ‖ \leq d \bar{μ} .

(C.31)

For $τ \geq 0$ , let ${\bar{μ}}^{a} (τ) = \min {μ (τ), Q (τ)}$ , where the minimum is taken component-wise; thus, $μ^{a} (τ)$ is the actual service received at time τ. It then follows from (1) that

Q (τ + 1) = Q (τ) - μ^{a} (τ) + A (τ) .

(C.32)

We start by considering the intervals $[Θ_{k} T, (Θ_{k} + d) T]$ associated with jumps for $k = 1, \dots, n$ . We are working with a sample path for which the event $E_{k}^{jump}$ occurs, Therefore,

\begin{array}{l} ‖ Q (( & Θ_{k} + d) T) - (Q (Θ_{k} T) + T a_{k} e_{j_{k}}) ‖ \\ = ‖ (Q (Θ_{k} T) + \sum_{τ = Θ_{k} T}^{Θ_{k} T + d T - 1} (A (τ) - μ^{a} (τ))) - (Q (Θ_{k} T) + T a_{k} e_{j_{k}}) ‖ \\ \leq ‖ (\sum_{τ = Θ_{k} T}^{Θ_{k} T + d T - 1} A (τ)) - T a_{k} e_{j_{k}} ‖ + ‖ \sum_{τ = Θ_{k} T}^{Θ_{k} T + d T - 1} μ^{a} (τ) ‖ \\ \leq d T (1 + 2 \bar{μ}) + d T \bar{μ} \\ = d T (1 + 3 \bar{μ}), \end{array}

(C.33)

where the first equality is due to (C.32). The second inequality follows from the definition (40) of

E_{k}^{jump}

and the fact

‖ μ^{a} (τ) ‖ \leq ‖ μ (τ) ‖ \leq \bar{μ}

for all

τ \geq 0

Combining (C.33) with (C.31), it follows that, for $k = 1, \dots, n$ ,

\begin{array}{l} ‖ Q & ((Θ_{k} + d) T) - T q^{ϵ} (Θ_{k} + d) ‖ \\ \leq ‖ Q ((Θ_{k} + d) T) - (Q (Θ_{k} T) + T a_{k} e_{j_{k}}) ‖ + ‖ (Q (Θ_{k} T) + T a_{k} e_{j_{k}}) - T (q^{ϵ} (Θ_{k}^{-}) + a_{k} e_{j_{k}}) ‖ \\ + ‖ T (q^{ϵ} (Θ_{k}^{-}) + a_{k} e_{j_{k}}) - T q^{ϵ} (Θ_{k} + d) ‖ \\ \leq ‖ Q ((Θ_{k} + d) T) - (Q (Θ_{k} T) + T a_{k} e_{j_{k}}) ‖ + ‖ Q (Θ_{k} T) - T q^{ϵ} (Θ_{k}^{-}) ‖ + d T \bar{μ} \\ \leq d T (1 + 3 \bar{μ}) + ‖ Q (Θ_{k} T) - T q^{ϵ} (Θ_{k}^{-}) ‖ + d T \bar{μ} \\ = ‖ Q (Θ_{k} T) - T q^{ϵ} (Θ_{k}^{-}) ‖ + d T (1 + 4 \bar{μ}), \end{array}

where the second and third inequalities are due to (C.31) and (C.33), respectively. Using the definition (33) of d, we conclude that

‖ Q ((Θ_{k} + d) T) - T q^{ϵ} (Θ_{k} + d) ‖ \leq ‖ Q (Θ_{k} T) - T q^{ϵ} (Θ_{k}^{-}) ‖ + \frac{γ c T}{8} .

(C.34)

We have so far established that, if the two trajectories $Q (\cdot)$ and $q^{ϵ} (\cdot)$ are close at the beginning of an interval $[Θ_{k} T, (Θ_{k} + d) T]$ , they are also close at the end. We now need to establish a similar conclusion over intervals of the form $[(Θ_{k} + d) T, Θ_{k + 1} T]$ . We wish to capitalize on Theorem 4. That result, however, refers to fluid models with constant arrival rates. In contrast, our stochastic process has a piecewise constant arrival rate, and the same holds for the associated JF trajectory. We deal with this issue by applying Theorem 4 repeatedly for each one of the piecewise constant segments.

Let us fix some $k \in {1, \dots, n}$ and recall that r is an upper bound on the number of subintervals in $[Θ_{k} + d, Θ_{k + 1})$ during which $λ (\cdot)$ stays constant. Let us then fix some times θ_i for $i = 1, \dots, r + 1$ such that

Θ_{k} + d = θ_{1} < \dots < θ_{r} < θ_{r + 1} = Θ_{k + 1},

and such that

λ (\cdot)

is constant during each one of the intervals

(θ_{i}, θ_{i + 1})

for

i = 1, \dots, r

Under our assumption that the sample path satisfies the event $E_{k}^{fluc}$ , we see that, during each one of the intervals $[θ_{i} T, θ_{i + 1} T]$ and for $i = 1 \dots, r$ , we have

\begin{array}{l} \max_{t \in [θ_{i} T, θ_{i + 1} T)} ‖ \sum_{τ = θ_{i} T}^{t} (A (τ) - \bar{λ} (τ)) ‖ & \leq 2 \max_{t \in [Θ_{k} T + d T, Θ_{k + 1} T)} ‖ \sum_{τ = Θ_{k} T + d T}^{t} (A (τ) - \bar{λ} (τ)) ‖ \\ \leq \frac{γ c T}{16 C r} . \end{array}

(C.35)

We now note that the function ${\tilde{q}}^{ϵ} (\cdot)$ defined by ${\tilde{q}}^{ϵ} (t) = T q^{ϵ} (t / T)$ is also an ϵ-JF trajectory and, in particular, is a fluid trajectory during each interval $[θ_{i} T, θ_{i + 1} T)$ . We apply Theorem 4 over this interval. Using also the fact that $1 + ‖ λ (τ) ‖ \leq \bar{μ}$ , we obtain

‖ Q (θ_{i + 1} T) - T q^{ϵ} (θ_{i + 1}^{-}) ‖ \leq ‖ Q (θ_{i} T) - T q^{ϵ} (θ_{i}^{-}) ‖ + C \bar{μ} + C \cdot \frac{γ c T}{16 C r} .

(C.36)

By summing such inequalities, for $i = 1, \dots, r$ and using the facts $θ_{1} = Θ_{k} + d, θ_{r + 1} = Θ_{k + 1}$ , and $q^{ϵ} ({(Θ_{k} + d)}^{-}) = q^{ϵ} (Θ_{k} + d)$ , we obtain

‖ Q (Θ_{k + 1} T) - T q^{ϵ} (Θ_{k + 1}^{-}) ‖ \leq ‖ Q ((Θ_{k} + d) T) - T q^{ϵ} (Θ_{k} + d) ‖ + r C \bar{μ} + \frac{γ c T}{16} .

(C.37)

We now add (C.34) and (C.37) to obtain

‖ Q (Θ_{k + 1} T) - T q^{ϵ} (Θ_{k + 1}^{-}) ‖ \leq ‖ Q (Θ_{k} T) - T q^{ϵ} (Θ_{k}^{-}) ‖ + r C \bar{μ} + \frac{3 γ c T}{16} .

We finally sum these bounds for $k = 1, \dots, n$ and use the fact $q^{ϵ} (Θ_{n + 1}^{-}) = q^{ϵ} (1^{-})$ to conclude that

\begin{array}{l} ‖ Q (T) - T q^{ϵ} (1^{-}) ‖ & \leq ‖ Q (Θ_{1} T) - T q^{ϵ} (Θ_{1}^{-}) ‖ + nrC \bar{μ} + \frac{3 n γ c T}{16} \\ \leq ‖ Q (Θ_{1} T) - T q^{ϵ} (Θ_{1}^{-}) ‖ + nrC \bar{μ} + \frac{3 c T}{16}, \end{array}

(C.38)

where we also make use of the property

n γ \leq γ^{T} n \leq 1

The interval $[0, Θ_{1})$ is to be treated a little differently as $Θ_{0} = 0$ is not a jump time. Even so, the argument leading to (C.37) applies verbatim and shows that

‖ Q (Θ_{1} T) - T q^{ϵ} (Θ_{1}^{-}) ‖ \leq ‖ Q (0) - T q^{ϵ} (0) ‖ + r C \bar{μ} + \frac{γ c T}{16} \leq ‖ Q (0) ‖ + r C \bar{μ} + \frac{c T}{16},

where we make use of the fact that the fluid trajectory is initialized at zero and the inequality

γ \leq 1

. Combining with (C.38), we finally conclude that

‖ Q (T) - T q^{ϵ} (1) ‖ \leq ‖ Q (0) ‖ + (n + 1) r C \bar{μ} + \frac{c T}{4} .

We now let ${\bar{T}}_{3}$ be large enough so that, for any $T \geq {\bar{T}}_{3}$ , we have $(n + 1) r C \bar{μ} + (c T / 4) + (c T / 5) \leq c T / 2$ . As long as $T \geq {\bar{T}}_{3}$ and using the assumption $‖ Q (0) ‖ \leq c T / 5$ , we obtain

‖ Q (T) - T q^{ϵ} (1) ‖ \leq \frac{c T}{5} + (n + 1) r C \bar{μ} + \frac{c T}{4} \leq \frac{c T}{2} .

Finally, using $c = q_{m}^{ϵ} (1)$ , we have

\begin{array}{l} Q_{m} (T) & = T q_{m}^{ϵ} (1) - (T q_{m}^{ϵ} (1) - Q_{m} (T)) \\ \geq T q_{m}^{ϵ} (1) - ‖ T q^{ϵ} (1) - Q (T) ‖ \\ \geq T q_{m}^{ϵ} (1) - \frac{c T}{2} \\ = c T - \frac{c T}{2} \\ = \frac{c T}{2}, \end{array}

(C.39)

and this completes the proof of Lemma 10.

Endnotes

¹ This phenomenon can arise under other scheduling policies as well, for example, the generalized processor sharing policy (Borst et al. 2003).

² Strictly speaking, the results in Section 5 only establish the absence of robust delay stability, not delay instability for the specific arrival process distributions of our example. However, a slight modification of the proof in Section 8 shows that queue 3 is indeed delay unstable.

³ Recall our assumption in Section 2 that, for any $μ \in M$ , the set $M$ also contains all vectors that result from setting some entries of $μ$ equal to zero. It is shown in proposition 2 of Sharifnassab et al. (2020) that, under this assumption, the fluid model of Definition 2 is equivalent to the more standard, albeit more complicated, definitions of fluid models based on differential equations with boundary conditions.

⁴ If $γ_{j} = \infty$ and n_j = 0, we use the convention $\infty \cdot 0 = 0$ .

⁵ We note that n may be equal to zero. For example, for a single unstable queue, an ϵ-JF trajectory becomes positive even in the absence of jumps. Such a system is not RDS, consistent with our result.

⁶ To avoid notation clutter, we present the proof as if $Θ_{k} T$ or $Θ_{k} T + d T$ were integer, which is not necessarily the case. Everything goes through, with occasional trivial modifications, if a sum of the form $\sum_{t = a}^{b} c_{t}$ is interpreted as $\sum_{t = ⌊ a ⌋}^{⌊ b ⌋} c_{t}$ .

⁷ The reason for the condition $λ^{*} ≻ 0$ is that, if $λ^{*} = 0$ , then a system is trivially 0-RDS, but a small perturbation that leads to positive arrival rates can result in a delay unstable system.

⁸ In case $t_{1} = t_{0} + 1$ , the interval (t₀, t₁) is empty, and the term involving a maximum over $t \in (t_{0}, t_{1})$ is interpreted as zero.

⁹ This includes the case in which $t_{1} = t_{0} + 1$ so that the interval (t₀, t₁) is empty.

References

Anantharam V (1989) How large delays build up in a GI/G/1 queue. Queueing Systems 5(4):345–367.Google Scholar
Anthony M, Bartlett PL (2009) Neural Network Learning: Theoretical Foundations (Cambridge University Press, New York).Google Scholar
Asmussen S (1996) Rare events in the presence of heavy tails. Glasserman P, Sigman K, Yao DD, eds. Stochastic Networks (Springer, New York), 197–214.Google Scholar
Bertsimas D, Gamarnik D, Tsitsiklis JN (2001) Performance of multiclass Markovian queueing networks via piecewise linear Lyapunov functions. Ann. Appl. Probab. 11(4):1384–1428.Google Scholar
Borst S, Boxma O, Jelenković P (2003) Reduced-load equivalence and induced burstiness in GPS queues with long-tailed traffic flows. Queueing Systems 43(4):273–306.Google Scholar
Chen B, Blanchet J, Rhee CH, Zwart B (2019) Efficient rare-event simulation for multiple jump events in regularly varying random walks and compound Poisson processes. Math. Oper. Res. 44(3):919–942.Link, Google Scholar
Durrett R (1980) Conditioned limit theorems for random walks with negative drift. Zeitschrift Wahrscheinlichkeitstheorie Verwandte Gebiete 52(3):277–287.Google Scholar
Foss S, Korshunov D (2012) On large delays in multi-server queues with heavy tails. Math. Oper. Res. 37(2):201–218.Link, Google Scholar
Foss S, Korshunov D, Zachary S (2011) An Introduction to Heavy-Tailed and Subexponential Distributions (Springer, New York).Google Scholar
Georgiadis L, Neely MJ, Tassiulas L (2006) Resource Allocation and Cross-Layer Control in Wireless Networks (Now Publishers Inc., Hanover, MA).Google Scholar
Jelenković P, Momčilović P (2003) Asymptotic loss probability in a finite buffer fluid queue with hetergeneous heavy-tailed on–off processes. Ann. Appl. Probab. 13(2):576–603.Google Scholar
Maguluri ST, Burle SK, Srikant R (2016) Optimal heavy-traffic queue length scaling in an incompletely saturated switch. ACM SIGMETRICS Performance Evaluation Rev. 44(1):13–24.Google Scholar
Markakis MG (2013) Scheduling in switched queueing networks with heavy-tailed traffic. Unpublished PhD thesis, Massachusetts Institute of Technology, Cambridge, MA.Google Scholar
Markakis MG, Modiano E, Tsitsiklis JN (2014) Max-weight scheduling in queueing networks with heavy-tailed traffic. IEEE/ACM Trans. Networking 22(1):257–270.Google Scholar
Markakis MG, Modiano E, Tsitsiklis JN (2018) Delay analysis of the max-weight policy under heavy-tailed traffic via fluid approximations. Math. Oper. Res. 43(2):460–493.Link, Google Scholar
Nair J, Jagannathan K, Wierman A (2015) When heavy-tailed and light-tailed flows compete: The response time tail under generalized max-weight scheduling. IEEE/ACM Trans. Networking 24(2):982–995.Google Scholar
Nair J, Wierman A, Zwart B (2020) The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation (Cambridge University Press, Cambridge, UK).Google Scholar
Neely MJ (2010) Stochastic Network Optimization with Application to Communication and Queueing Systems. Ying L, ed., Synthesis Lectures on Communication Networks and Algorithms 3.1 (Springer Nature, Cham, Switzerland).Google Scholar
Pakes A (1975) On the tails of waiting-time distributions. J. Appl. Probab. 12(3):555–564.Google Scholar
Shah D, Wischik D (2012) Switched networks with maximum weight policies: Fluid approximation and multiplicative state space collapse. Ann. Appl. Probab. 22(1):70–127.Google Scholar
Sharifnassab A, Tsitsiklis JN, Golestani SJ (2019) Sensitivity to cumulative perturbations for a class of piecewise constant hybrid systems. IEEE Trans. Automatic Control 65(3):1057–1072.Google Scholar
Sharifnassab A, Tsitsiklis JN, Golestani SJ (2020) Fluctuation bounds for the max-weight policy with applications to state space collapse. Stochastic Systems 10(3):223–250.Link, Google Scholar
Tassiulas L, Ephremides A (1992) Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. IEEE Trans. Automatic Control 37(12):1936–1948.Google Scholar
Veraverbeke N (1977) Asymptotic behaviour of Wiener-Hopf factors of a random walk. Stochastic Processes Appl. 5(1):27–37.Google Scholar
Zwart B, Borst S, Mandjes M (2004) Exact asymptotics for fluid queues fed by multiple heavy-tailed on-off flows. Ann. Appl. Probab. 14(2):903–957.Google Scholar

Volume 13, Issue 4

December 2023

Pages 399-495

Article Information

Metrics

Information

Received:November 16, 2021
Accepted:March 06, 2023
Published Online:April 14, 2023

Cite as

Arsalan Sharifnassab, John Tsitsiklis (2023) Jumping Fluid Models and Delay Stability of Max-Weight Dynamics Under Heavy-Tailed Traffic. Stochastic Systems 13(4):399-437.

https://doi.org/10.1287/stsy.2023.0110

Keywords

Acknowledgments

PDF download

Available Issues

Available Issues

Jumping Fluid Models and Delay Stability of Max-Weight Dynamics Under Heavy-Tailed Traffic

Abstract

1. Introduction

1.1. Our Contributions

1.2. Related Works on Multiple Big Jumps

1.3. Outline

1.4. Notation

2. The Model

2.1. Network Model and the Max-Weight Policy

2.2. Light- and Heavy-Tailed Arrivals

2.3. Robust Delay Stability

3. A Counterexample and the Insufficiency of Fluid Models

4. Jumping Fluid Models

4.1. The Fluid Model

4.2. Adding the Jumps

5. Main Result

5.1. Some Intuition

5.2. Remarks

5.2.1. Heavy-Tailed Queues.

5.2.2. Unstable Systems.

5.2.3. Comparison with the ZF Condition.

5.2.4. The Light-Tailed Case.

5.2.5. The Case of Zero Tail Exponents.

5.2.6. Computational Issues.

5.3. Our Example, Revisited

6. Robust Delay Stability via Lyapunov Functions

7. Proof of the “if” Direction of Theorem 1 (RJF⇒RDS)

7.1. Outline of the Proof

7.2. Sensitivity of Max-Weight Dynamics

7.3. Arrival Process and Jumps

7.4. Fluctuations of the Arrival Process

7.5. Deterministic Analysis of the Dynamics

7.6. Some More Intuition

7.7. The Distance from the Invariant Set

7.8. Bounding E[Qm(t)]

8. Proof of the Reverse Direction of Theorem 1 (RDS⇒ RJF)

8.1. Outline of the Proof

8.2. Jumps of the ϵ-JF(n) Trajectory

8.3. Defining Certain Constants

8.4. Defining the Stochastic Arrivals

8.5. Probabilistic Analysis

8.6. E[Qm] is Large at the End of an Episode

8.7. Concatenating Episodes over the Entire Timeline

9. Proof of Theorem 2

9.1. Proof of the First Direction

9.2. Proof of the Reverse Direction

9.3. Proof of Corollary 1

10. Discussion

10.1. Framing and Results

10.2. Alternative Formulations

10.3. Open Problems

Appendix A. Exploring Alternative Formulations

A.1. Variations of Our Definitions

A.2. Relations Between Alternative Definitions

A.2.1. PC-RJF ⇒ UC-RJF.

A.2.2. (C-RDS ⇒ C-RJF) and (0-RDS ⇒ 0-RJF).

A.2.3. 0-RJF ⇏ 0-RDS.

A.2.4. 0-RJF ⇏ C-RJF.

A.3. Conjectures and Open Problems

A.3.1. 0-RDS ⇒? C-RDS.

A.3.2. C-RJF ⇒? PC-RJF.

A.3.3. Generic Considerations.

A.4. The Details of the Arrival Distribution May Matter

A.4.1. Delay Stability Implies the 0-RJF Condition Under a Regularity Assumption.

A.4.2. Without a Regularity Assumption, Delay Stability Need Not Imply the 0-RJF Condition.

A.5. The Timing of the Jumps

Appendix B. Proofs of Lemmas for the First Direction of Theorem 1 (ϵ-JF ⇒RDS)

B.1. Proof of Theorem 4

B.2. Proof of Lemma 1

B.3. Proof of Lemma 2

B.4. Proof of Lemma 3

B.5. Proof of Lemma 4

B.6. Proof of Lemma 5

B.7. Proof of Lemma 6

Appendix C. Proof of Lemmas for the Second Direction of Theorem 1: RDS⇒ϵ-JF Condition

C.1. Proof of Lemma 7

C.2. Proof of Lemma 8

C.3. Proof of Lemma 9

7. Proof of the “if” Direction of Theorem 1 (RJF $\Rightarrow$ RDS)

7.8. Bounding $E [Q_{m} (t)]$

8. Proof of the Reverse Direction of Theorem 1 (RDS $\Rightarrow$ RJF)

8.6. $E [Q_{m}]$ is Large at the End of an Episode

A.2.1. PC-RJF $\Rightarrow$ UC-RJF.

A.2.2. (C-RDS $\Rightarrow$ C-RJF) and (0-RDS $\Rightarrow$ 0-RJF).

A.2.3. 0-RJF $⇏$ 0-RDS.

A.2.4. 0-RJF $⇏$ C-RJF.

A.3.1. 0-RDS $\overset{?}{\Rightarrow}$ C-RDS.

A.3.2. C-RJF $\overset{?}{\Rightarrow}$ PC-RJF.

Appendix B. Proofs of Lemmas for the First Direction of Theorem 1 (ϵ-JF $\Rightarrow$ RDS)

Appendix C. Proof of Lemmas for the Second Direction of Theorem 1: RDS $\Rightarrow ϵ$ -JF Condition