Chattering and congestion collapse in an overload switching control

Routing mechanisms for stochastic networks are often designed to produce state space collapse (SSC) in a heavy-traffic limit, i.e., to confine the limiting process to a lower-dimensional subset of its full state space. In a fluid limit, a control producing asymptotic SSC corresponds to an ideal sliding mode control that forces the fluid trajectories to a lower-dimensional sliding manifold. Within deterministic dynamical systems theory, it is well known that sliding-mode controls can cause the system to chatter back and forth along the sliding manifold due to delays in activation of the control. For the prelimit stochastic system, chattering implies fluid-scaled fluctuations that are larger than typical stochastic fluctuations. In this paper we show that chattering can occur in the fluid limit of a controlled stochastic network when inappropriate control parameters are used. The model has two large service pools operating under the fixed-queue-ratio with activation and release thresholds (FQR-ART) overload...


Introduction.
In this paper we study the fluid limit of a stochastic system comprised of two service pools, each having its own arrival process and own queue. The system is operating under the fixed-queue-ratio with activation and release thresholds (FQR-ART) overload control (specified in §2.1 below), which was developed in [28,32]. The control is designed to automatically switch on sharing (serving some customers from the other pool) when an unexpected overload occurs, and switch off sharing when the overload incident is over, based only on the observed queue lengths. While the control is switched on, it aims to hold the two queues nearly fixed at some pre-specified ratio. From a many-server asymptotic perspective, the fixed-queue-ratio goal is to produce state space collapse (SSC). It is significant that when the control parameters are chosen appropriately, the control is both effective and robust. In particular, it is successful in automatically switching on and off as needed, and in producing the desired SSC (again, automatically), even under unrealistically-extreme conditions; see [31] for key theory, involving an averaging principle, and §4.3 in [33] for important examples.
Nevertheless, in some extreme cases a performance degradation was demonstrated via simulation in [32]. More specifically, with highly inefficient sharing (the service rate is much less for the other customers), if the control parameters are badly-chosen, a system that is recovering from an overload incident, and is no longer overloaded, may get stuck in an oscillatory behavior that is due to unintended on-and-off switchings of the control. We now provide mathematical analysis that establishes key properties of this oscillatory behavior and provides a way to approximately quantify it. In [32] we developed a fluid approximation that can be used, in addition to simulation, to ensure that the bad behavior does not occur. Nevertheless, it is important to carefully study the limitations of controls. The insights gained should be useful for studying other overload controls.
A switching control. Most of the literature on control of queueing networks deals with ongoing operations, in which the control is operating continuously. Typically, it is also assumed that the arrival rates and total service capacity are known. However, here we are considering the control in [32] which automatically switches on and off, as was briefly described above. The fluid analysis we perform thus falls within the settings of (deterministic) switching dynamical systems [23].
A simple example of a switching system is the description of heated space. If the target temperature is set to be T • F , then a thermostat should turn the heating on whenever the temperature drops below level T , and off when it reaches the target again. The ideal dynamic system's description then has two phases: The reaching (transient) phase, which describes the system until the temperature reaches level T , and the "sliding" phase, in which the temperature remains fixed ("slides") at T . Since heat is lost continuously, a true sliding phase requires that the thermostat switch the heating on and off infinitely-many times during any time interval. In reality, a hysteresis control is employed, namely, the thermostat turns the heat on when the temperature reaches some level T < T , and off when the temperature hits some level T h > T . Hence, a more realistic description of the corresponding dynamical system has the temperature chatter about level T , with this chattering being faster and smaller the more accurate the thermostat is. If the chattering is sufficiently small and fast, then the hysteresis dynamics approximate the ideal sliding phase quite well.
The queueing system we consider here has important similarities with the heating system describe above in that it is designed to "slide" on some region of its state space. More importantly, as we explain below, many other control mechanisms for queueing systems are designed for this purpose. The difference between the "ideal" sliding and the dynamics that are experienced in practice must be taken into account so as to avoid the harmful phenomena described here.
State space collapse, sliding motion and chattering. Asymptotic SSC in heavy-traffic limits is often a key step in developing effective (e.g., asymptotically optimal) controls for multidimensional stochastic networks; e.g., [4,14,15,31,34,41,43,49]. (Related ideas date back to [46], but the systems there are uncontrolled.) As the term suggests, SSC means that the limit process is of a lower dimension than the prelimit process. More precisely, if SSC holds, then the limit process "collapses" (i.e., is confined) to a lower dimensional subset of its full state space. It is significant that SSC is often not only a mathematical tool that is employed to simplify asymptotic analysis, but rather, as in [31], SSC may be a goal of the control. We elaborate on the relation of SSC to optimal control in §9 below. See also page 136 in [1].
In the context of a functional weak law of large numbers (FWLLN) or fluid limit, asymptotic SSC corresponds to the limiting deterministic fluid process exhibiting a sliding motion, i.e., all the fluid trajectories "slide" on a lowerdimensional subspace, called a sliding manifold ; see, e.g., §14.1 in [21] and §1.2.3 in [23]. In such cases, the fluid limit often has discontinuous dynamics in its full state space; i.e., it is governed by an ordinary differential equation (ODE) with a discontinuous right-hand side. The discontinuous dynamics is often avoided by assuming that the initial condition is asymptotically on the sliding manifold and restricting attention to the behavior of the limit on that region of the state space. However, if the initial condition of the fluid limit is not on the sliding manifold, the fluid trajectory must first go through a transient period before reaching the manifold; see Theorem 3 in [4] and the explanation preceding it. (We remark that sliding manifolds should not be confused with the invariant manifolds in [4], which are defined to be the fixed points of the fluid limit. In particular, on the invariant manifold, the fluid trajectories are constant functions, whereas on a sliding manifold, the fluid limits may exhibit a transitory behavior.) An effective SSC control must therefore (i) pull the system to the sliding manifold without undue delay and (ii) ensure that the system remains on the sliding manifold thereafter. For queueing networks, this may require specifying different routing rules for different regions of the state space -on and off the sliding manifold. For example, suppose that the state space S can be partitioned into three disjoint subsets M, M + and M − , where M is a sliding manifold, while M + and M − are "above" and "below" M. A sliding-mode control will direct trajectories starting in M − upwards toward M, and downwards toward M from M + . Ideally, a sliding-mode control that starts in M − will switch immediately once the fluid trajectory hits M, aiming to keep that trajectory sliding on M after that hitting time. In reality, however, there may be a delay period until the control switches, so that the trajectory will cross immediately into M + after hitting M. Once the control finally switches, the trajectory is in M + and the trajectory reverses its direction towards M, but may again cross M, this time into M − , because of delays in switching the control. This is the chattering phenomenon in the control literature; see §14.1 in [21]. When this chattering occurs, the sliding manifold M becomes a switching manifold, because the system switches its dynamics each time it crosses M. Figure 1 depicts a schematic representation of chattering about a manifold M, denoted by the dashed line, in the two-dimensional plane. The queueing context. Within queuing theory, the current paper should be considered in the context of instability of subcritical queueing networks, and in particular, instability caused by the control. Subcritical queueing networks that become overloaded due to exercising a bad control are said to experience congestion collapse, as in [39]; see §1 in [32] and §2.2 in [33].
The first studies of unstable subcritical queueing networks are the Lu-Kumar [25] and Rybko-Stolyar [37] networks. These networks were constructed as special, "atypical", examples with the purpose of demonstrating that sub-criticality is not sufficient for stability of queueing systems. It was only later acknowledged that these two counterexamples are indicative of general phenomena (see the discussion in §3 in [6]). For example, a twostation multiclass stochastic network, operating under FIFO, was shown to be unstable in [5]. The analysis there was carried out under the assumption that the number of classes is very large, namely, jobs move through station 2 a large number of times (e.g., 1,600 times) before exiting the system; see §2 in this reference. Nevertheless, a simulation experiment in [9] shows that with only four visits to station 2, the system is already unstable. Similarly, we consider extreme parameters for our instability analysis so as to permit qualitative and quantitative analysis of the fluid model, and use simulation to show that the oscillatory behavior can occur for the stochastic system with realistic parameters.
The setting. In this paper we illustrate the chattering phenomenon in a queueing network. Specifically, we consider a deterministic fluid approximation arising in the many-server heavy-traffic limit for a system with two service pools, each having its own arrival process and designated queue, that is operating under the FQR-ART overload control which we suggested in [32]. Normally, the two pools process work from their designated queues only. However, when an overload occurs due to an unexpected shift in the arrival rates, the control automatically identifies which queue should receive help and sharing begins, so that jobs from the overloaded queue are routed to both service pools, according to a routing rule that will be specified below.
The overload control was created for two call centers that normally operate separately, but might benefit by assisting each other to respond to unexpected overloads by temporarily serving some of the other customers. Given this call center motivation, we refer to pools of agents and the customers served in the other (not designated) pool as shared customers. When sharing is activated, the goal is to maintain the two queues nearly fixed at a pre-specified ratio during overload periods that is optimal with an appropriate cost formulation; see [28].
We showed that sharing can be effective even if sharing is inefficient, i.e., the shared customers are served at a slower rate. Since there is the possibility of performance degradation if there is too much sharing, it is necessary to choose the control parameters appropriately. The root cause of the chattering discussed here is indeed the combination of excessive inefficient sharing and poorly chosen control parameters. To avoid excessive simultaneous sharing of customers in both directions ("two-way sharing,"see §4.1 in [28]), sharing with pool 1 helping queue 2 is activated only if the number of shared customers in pool 2 is below a certain (small) threshold, and similarly in the other direction. This latter restriction can cause delays in activating sharing when the direction of overload switches. Once activated, the control aims to produce asymptotic SSC by confining the queues to a certain region of the state space in the fluid limit [31]. In the fluid limit, this SSC translates to sliding motion on one of two sliding manifolds, each associated with one direction of sharing. We elaborate in §2 below.
When sharing is inefficient and the control parameters are not chosen appropriately, delays in activating the control can cause so much chattering that the fluid trajectory hits both sliding manifolds, without remaining in either, leading to complex chattering behavior. Here the chattering manifests itself in periodic oscillations, which lead to inefficient utilization of the service capacity. In turn, the inefficient utilization of agents creates severe overloads, even though the arrival rates we consider are smaller than the potential service capacity.
Chattering in sliding-mode controls is a well-known phenomenon in deterministic control theory. Indeed, chattering is considered to be the natural "state of affairs", whereas perfect sliding motion is considered ideal and unrealistic; e.g., §14.1 in [21]. Accordingly, even though we focus on a single system that operates under a specific control, our results have broader relevance. In particular, similar phenomena should be expected to occur with other SSC-inducing controls when there are deviations from ideal modeling assumptions, such as stationarity, or "convenient" initial conditions and control settings.
Switching dynamical systems. The chattering found in the fluid model implies that the ODE governing the evolution of the fluid trajectories switches whenever the control is activated or released. Therefore, the appropriate fluid model x := {x(t) : t ≥ 0} for the stochastic system is a switching dynamical systemẋ = f σ(x) (x), where σ(x) achieves a finite set of values, f i is a continuous function for each value i of σ, but the function f σ is discontinuous [23]. As the notation suggests, the switching epochs are state dependent (depending only on the value of the solution x), so that the ODE is autonomous (time-homogeneous).
The framework of switching systems in general, and of systems with sliding motion in particular, is outside the classical ODE and dynamical-systems theory, because the right-hand side function f σ is not continuous, and so it is not locally Lipshcitz. Hence, the conditions of the Picard-Lindelöf theorem, ensuring the existence of a unique solution to the ODE, are not satisfied. In general, the existence of a unique solution to a switching system with no sliding motion can only hold in the Carathéodory sense, namely, such a solution is an absolutely-continuous function that satisfies the ODE almost everywhere; see [23]. A solution with a sliding motion is generally considered to hold in the Filippov sense [12], In [30,31] we have proved that a unique solution exists for the fluid limit of our system during sliding motion via a stochastic averaging principle (i.e., the control achieves the desired asymptotic SSC).
Analytical contribution. In addition to exposing the chattering behavior discussed above, our current work has important analytical contributions. We emphasize at the outset that the derivation of the fluid model, which will also be shown to be the FWLLN in §I.2, and its quantitative analysis is relatively standard. The stronger analytical contributions lie in the nontrivial qualitative analysis of the fluid model. Specifically, we provide sufficient conditions for chattering to lead to endless oscillations, and prove the existence of a periodic equilibrium. Furthermore, we provide a simple algorithm to efficiently analyze the system for any given initial condition.
It is known that even seemingly simple switching systems can experience chaotic-like behavior, e.g., have infinitely-many periodic equilibria that are dense in the state space, and exhibit high sensitivity to perturbations of the initial condition (popularly known as "the butterfly effect"); see, e.g., [8,11]. Such systems are clearly unamenable to long-run analysis. Even fluid models of uncontrolled systems can have uncountably-many periodic equilibria [24]. However, numerical experiments suggest that our system has at most one periodic equilibrium, and that it is bi-stable, i.e., any fluid trajectory can have long-run behavior of only two kinds: either it converges to the periodic equilibrium, or else it converges to the unique stationary point (which is therefore asymptotically stable).
To conduct a more complete study of the (bi)stability properties of the fluid model, we create an approximation to the fluid system. (Note that "stability" here does not refer to the prelimit queueing system which is always stable due to assumed abandonment.) For that approximating dynamical system we show that all oscillating solutions must converge to the unique periodic equilibrium (of the approximating system), while all other solutions converge to the unique stationary point, which is the same as that of the fluid limit. In particular, the approximating system is bistable. We conjecture that the same is true for the fluid limit (Conjecture 5.1 below), and support this conjecture by numerical experiments in §7.
To summarize, we develop and analyze two layers of approximations, one being the fluid limit, which approximates the stochastic system, and the other being an approximating dynamical system which serves as a simplified approximation to the fluid limit, whose qualitative behavior is easier to characterize.
Implications of the fluid analysis to the stochastic system. A straightforward implication of our result that the fluid limit may oscillate indefinitely is that the prelimit stochastic systems can experience congestion collapse. Moreover, the fluid limit may oscillate, even though the stochastic system in the pre-limit is an ergodic continuous-time Markov chain (CTMC) and is therefore necessarily aperiodic with a unique equilibrium (stationary) distribution. Since the CTMC converges to its unique stationary distribution also for initial conditions that are associated with oscillatory fluid limits, one concludes that the convergence rate of the CTMC to stationarity must be prohibitively slow. We elaborate in §8.
Our fluid analysis also has indirect implications to the stochastic system. Specifically, stochastic noise, which is not captured by the fluid approximation, may eventually push the system into the oscillatory behavior, even if the system is unambiguously initialized in the attraction region of the stationary point. This suggests that stochastic fluctuations can lead to fluid-scaled fluctuations. In addition, oscillations can occur in the stochastic system even if its fluid limit does not possess a periodic equilibrium, and never oscillates. Therefore, studying the relatively simple fluid model is important for gaining insight into the dynamics of the stochastic system. See the examples in §7.3 below.
Organization. The rest of the paper is organized as follows. We describe the stochastic model and the control in §2. In §2.2 we explain how to construct a direct fluid model to approximate the system's dynamics. The switching fluid model is derived in §3. Qualitative analysis, including relevant equilibrium and stability notions for dynamical systems, are rigorously defined and analyzed in §4. In §5 we show that the fluid model can oscillate indefinitely and when it does we show there exists a periodic equilibrium. The approximating dynamical system to the fluid model is developed in §6 and is shown to be bi-stable. Numerical examples and simulation experiments are provided in §7. In §8 we study the implications of the fluid analysis to the stochastic system, and in particular, to the long-run behavior of the underlying CTMC. General takeaways from our results, applicable to other systems and controls, are discussed in §9. We conclude in §10. Many of the results are proved in the appendix, and additional results appear in Sections F-J. 2. The model. We start by reviewing the stochastic model which is assumed Markovian, and in particular, it can be described as a CTMC. In §2.2 we quickly develop the deterministic fluid model to the stochastic system, which will be our focus in this paper. We defer the proof that the fluid model is indeed a rigorous approximation via a FWLLN to the appendix; see §I.2. The model has two large service pools of many homogeneous agents in a call center, each with its own arrival stream and designated queue for waiting customers. We assume that customers have finite patience, and will abandon if their wait time in queue exceeds their patience. The two pools are designed to operate independently when both are normally loaded, i.e., to serve their own arrivals only, but all the agents can help both customer classes.
Sharing of customers (namely, routing customers from one pool to be served in the other pool) may be beneficial if one of the pools is overloaded, even if sharing makes the second service pool overloaded as well, because abandonment keep the two queues stable. Indeed, in [28] we showed that sharing of customers may be optimal during overload periods in a deterministic fluid approximation, assuming a convex holding cost is incurred on the two queues. However, as we showed in Proposition 2 in [28], when agents are less efficient in serving the other class, i.e., agents serve shared customers slower on average than their designated customers, it is never optimal to share in both directions simultaneously. Nevertheless, since sharing of customers in either direction takes place sometimes, and some sharing in both directions simultaneously may also take place, the routing graph of the system has the letter X shape, and is therefore called the X model in the call-center literature. Figure 2 is a schematic portrayed of the X model; the circles represent the service pools, and the open-ended rectangle represent the buffers, and the arrows connecting the circles to the rectangles represent the allowed routing of customers to service. The solid arrows pointing to the buffers represent input (due to arrivals), and the dotted arrows represent output (due to abandonment from the buffers, and service completions from the service pools). We also show a figure of the N model in Figure 3 in which sharing of customers is possible in one direction only. We discuss related known results concerning the N model in §9 below. (Figure 3 has no arrows from the buffers since the N systems we discuss have no abandonment.) In general, there is a fluid-optimal amount of sharing for any given pair of arrival rates and so, to find how many agents in the helping pool should be assigned to shared customers requires knowing the exact arrival rates during the overload period. A simplification is achieved by observing that the exact amount of sharing does not need to be determined at the outset, since it can be achieved, at least approximately, if the two fluid queues are kept at a fixed ratio during overload periods. We again refer to [28]. There is a different optimal ratio for each direction of sharing, and the direction of sharing depends on which pool is overloaded. The above reasoning lead us to design the fixed queue ratio with thresholds (FQR-T) overload control, which (i) is activated automatically once the queue ratio exceeds a certain "activation threshold" (so that the system is considered overloaded); (ii) aims to maintain the two queues at a prespecified fixed ratio (in the many-server asymptotics); (iii) class-i customers are routed to pool j only if there are no class j customers in pool i, i = j.
In time-varying settings, the direction of overload may switch, so that the direction of sharing must switch as well. If the one-way sharing rule in Condition (iii) above is forced, then substantial delays in switching the direction of sharing may occur. We therefore modified FQR-T in [32] by introducing release thresholds for the service process. Specifically, in the modified fixed queue ratio with activation and release thresholds (FQR-ART) control the one-way sharing rule is relaxed as follows: class-1 customers can be routed to pool 2, provided that the number of class-2 customers in pool 1 is smaller than a release threshold τ 2,1 > 0, and similarly in the other direction. We elaborate in §2.1 below.
Cyclic routing graph. An important characteristic of the X model is that its (undirected) routing graph is cyclic. In particular, it is the most basic cyclic parallel server system (PSS). The X model is therefore easier to study than other cyclic PSS's but at the same time serves as a representative to problems that are associated with its cyclic structure. Indeed, in [28] we showed that the QIR control from [15] can produce severe congestion collapse if applied to the X model when the service rates of shared customers are slower than those of designated ones. This congestion collapse cannot occur in PSS's having a tree graph; see Theorem 3.1 in [15].
As was mentioned above, the FQR-ART control aims to avoid simultaneous sharing of customers as much as possible, and to reduce the system into an N model (as in Figure 3) at any given time that sharing take place, although the some simultaneous sharing is possible, and the direction of sharing may change with time. It is therefore compelling to compare our results regarding the X model to known results on the well-studied N model; see, e.g, [2,18,20,45] and references therein. In §9 we make such comparisons to indicate how our results here, as well as results from our previous work on the X model, provide important insights to other SSC-inducing controls, taking the N model as an example. We note that the N model has the most basic tree structure of a PSS with more than one class of arrivals and more than one service station, making it a "representative model" for PSS's with tree structures (in a similar manner to the X model being a "representative model" for cyclic PSS's). In particular, our insights are not restricted to PSS's with cyclic routing graphs.
2.1. The FQR-ART control. We will start by developing a deterministic fluid approximation for the stochastic system directly, but to fully describe the control, we must consider that fluid model from an asymptotic perspective. We therefore consider a sequence of X models indexed by superscript n, where system n has m n i agents in pool i and arrival rate λ n i of class-i customers, i = 1, 2. We assume that the arrival rates and number of agents in each pool grow proportionally to n as n → ∞, putting us in the many-server heavy-traffic framework.
The control of each system n ≥ 1 is based on two activation thresholds, k n 1,2 and k n 2,1 , two release thresholds, τ n 1,2 and τ n 2,1 , and two ratio parameters r 1,2 and r 2,1 . These ratios, which are independent of n, are chosen to be optimal in a fluid model of an overloaded system (here we will consider underloaded systems), as was mentioned above.
Let Q n i (t) denote the number of class-i customers waiting in their designated queue at time t, and let Z n i,j (t) denote the number of class-i customers being served in pool j at time t. The FQR-ART is an overload control, namely, it is designed to be activated and start customer sharing automatically when an overload occurs. To define overloads, we consider the difference processes. For t ≥ 0, As long as D n 1,2 < 0 and D n 2,1 < 0, the system is considered normally loaded. Once one of these difference processes hits 0, which corresponds to the ratio between the two queues hitting one of the activation thresholds, the system is deemed overloaded, and sharing begins, provided that there is only a small number of shared customers in the overloaded pool. By "small number" we mean that the number of shared customers in the overloaded pool is no larger than its associated release threshold. For example, if D n 1,2 (t) ≥ 0, then class 1 is judged to be overloaded (because then Q n 1 (t) − r 1,2 Q n 2 (t) ≥ k n 1,2 ) and it is desirable to send class-1 customers to be served in pool 2. However, sharing is allowed only if Z n 2,1 (t) ≤ τ n 2,1 . Similar rules apply to overloads in the other direction. (It is important that τ n i,j are taken to be small numbers, so that not too much harmful simultaneous sharing can occur. However, these threshold must be strictly positive; see (3.21) below and §3 in [32].) Once sharing is activated, say with class 1 receiving help from pool 2, the routing rule is as follows: Any agent, from either pool, that becomes available at any time t, will take his next customer from class 1 if D n 1,2 (t) > 0, and will take his next customer from his designated queue otherwise. Observe that this means that agents from pool 1 will only take customers from their own queue, but some class 1 customers will be routed to pool 2. The routing mechanism when class 2 is overloaded is similar, with D n 2,1 replacing D n 1,2 , and the labels of the thresholds switched.

A deterministic fluid model. If the arrival processes are independent
Poisson processes, and all service times and times to abandon are independent exponential random variables, then the six-dimensional process (2.2) X n (t) = (Q n i (t), Z n i,j (t); i, j = 1, 2), t ≥ 0, is a CTMC. Our goal is to develop and then analyze a fluid approximation for this CTMC, based on asymptotic considerations (which will be made rigorous in §I.2).
When sharing is active, the control aims to keep the two queues at the corresponding fluid-optimal ratio, either r 1,2 or r 2,1 , depending on the direction of sharing. Minor modifications to the statement and proof of Corollary 4.1 in [31] show that, if the system is overloaded and there is no sharing initially, then the control achieves asymptotic SSC in the fluid limit (or under any scaling of the appropriate process in (2.1) that is larger than log n). More general assumptions were considered in [32]. The mathematical support for the asymptotic SSC was a direct consequence of the aforementioned stochastic averaging principle.
The oscillatory performance and its resulting congestion collapse we analyze here does not involve the averaging principle, because there is no SSC.
Indeed, unlike the fluid models in [31] and [32], the fluid model we develop here has an explicit solution. The challenges are associated with proving that oscillations (and congested collapse) can be self-sustained and in studying the long-run behavior of the fluid model.
It is significant that the fluid approximation for X n is obtained as the FWLLN forX n ≡ X n /n, see §I.2. However, we start by deriving the fluid model directly. (We refer to the fluid model as fluid approximation or limit, depending on the context, as the terms are equivalent in our case.) For each of the six stochastic processes comprising X n in (2.2) there is a fluid counterpart, namely a deterministic and almost-everywhere differentiable function. We let x ≡ {x(t) : t ≥ 0} denote the fluid approximation of X n , where and call a time t "regular" if x(t) is differentiable at t. In our case, any compact interval will have at most a finite number of points that are not regular.
To derive the fluid equations, we simply replace the instantaneous rates of the stochastic processes at each time t with instantaneous rates of change of the derivatives of their fluid counterparts, e.g., the instantaneous rate of abandonment from queue 1 at time t in system n is θ 1 Q n 1 (t), which becomes θ 1 q 1 (t) in the fluid model. Similarly, the instantaneous rate of departure from service in pool j at time t is μ j,j Z n j,j (t) + μ i,j Z n i,j (t) in system n is replaced with the instantaneous processing rate μ j,j z j,j (t) + μ i,j z i,j (t) in the fluid model. Combining all these instantaneous rates gives the derivative of x(t) at a regular time t.
For example, if both queues are smaller than the activation thresholds at a time t, then any newly-available agent in pool 1 will take his next customer from queue 1 in the stochastic system. Similar reasonings applied to q 2 give that, if q 1 (t) < k 1,2 and q 2 (t) < k 2,1 , and t is regular, theṅ We derive the full set of differential equations for the fluid model during overload periods (due to congestion collapse) in §3.1 below.
The purpose of FQR-ART is to produce SSC in the fluid limit by sending customers from one queue to both pools according to the routing rules described above during overload periods. If the control is successful in achieving SSC, the six-dimensional fluid model is confined to one of the sliding manifolds S 1,2 ≡ {x ∈ S : q 1 − r 1,2 q 2 = k 1,2 , z 1,1 + z 2,1 = m 1 , z 1,2 + z 2,2 = m 2 }, is the domain of x. We note that with each of the two sliding manifolds in (2.2) there is an associated fixed point to which the "sliding" fluid solutions converge as t → ∞. In particular, letting x * 1,2 denote the stationary point on S 1,2 and assuming that x(0) ∈ S 1,2 during an overload period with sliding motion on S 1,2 , we have shown in [30] that x(t) → x * 1,2 as t → ∞, i.e., x * 1,2 is globally asymptotically stable. If x(0) = x * 1,2 , then x(t) = x * 1,2 for all t, so that the set {x * 1,2 } is the invariant manifold for the fluid model (sliding on S 1,2 ), as in [4].
The behavior of the fluid limit when sliding on one of these manifolds can be thought of as an infinitely-fast chattering with infinitely-small fluctuations of the queues about the corresponding activation threshold. This view can be justified rigorously via the aforementioned stochastic averaging principle; see §4 in [30] and Theorem 4.1 in [31].
Observe that the fluid model is essentially a three-dimensional process on either one of these sliding manifolds, because knowing x 3 ≡ (q 1 , z 1,2 , z 2,1 ) for example, is sufficient to determine the value of the remaining three processes. Here, however, we are interested in bad oscillatory behavior when the fluid model overshoots past the sliding manifold due to delay in activating the control, where a delay is caused if z j,i (t 0 ) > τ j,i , at the time t 0 in which S i,j is hit. If no SSC occurs, we must consider all six components of the fluid model and, as will become clear below, four different switching epochs for each cycle. We can obtain considerable simplification by considering a symmetric model. Symmetry reduces the amount of notation and, as will become clear later, allows us to focus attention on two switching times in each cycle instead of four.
A symmetric model. In order to expose the bad behavior that can result from poorly chosen controls, we consider a special case that is easier to analyze than the general model. In particular, we consider systems with the following parameters Observe that time is measured in terms of μ 1,1 and μ 2,2 (which are normalized to be equal to 1). In this model there are 5 parameters instead of 16 in the general case. There is the triple of model parameters (λ, μ, θ) and the pair of control parameters (κ, τ ). Note that each of the pools is underloaded if there is no sharing that slows its potential service capacity, because In this model, there is sharing with all class-2 fluid sent to pool 1 if q 2 (t) > q 1 (t) + κ and z 1,2 (t) ≤ τ ; there is sharing with all class-1 fluid sent to pool 2 if q 1 (t) > q 2 (t) + κ and z 2,1 (t) ≤ τ ; there is complex sharing, associated with sliding motion and described by the averaging principle if q 2 (t) = q 1 (t) + κ and z 1,2 (t) ≤ τ or if q 1 (t) = q 2 (t) + κ and z 2,1 (t) ≤ τ ; there is possibly sharing according to the spare capacity control described above if q 1 (t) ≥ κ and z 1,2 (t) + z 2,2 (t) < m or q 2 (t) ≥ κ and z 2,1 (t) + z 1,1 (t) < m. otherwise there is no sharing actively taking place.
We have assumed in (2.4) that λ < 1, so that either pool is underloaded if it serves its own class only (because μ i,i m i = 1, i = 1, 2). It will be convenient to assume that λ ≤ 1 − τ . In that case, if class i fluid is sent to pool j at time t, i = j, then z j,i (t) ≤ τ and the instantaneous service rate in pool i is implying that the instantaneous total service rate in pool i is larger than the arrival rate to that pool so that q i is decreasing; see also (2.3). In addition, to achieve explicit solutions to the ODE's we develop, we will assume that θ < μ. We summarize in the following assumption.
Assumption 1 is not necessary for chattering and oscillations to occur, and is taken in order to somewhat simplify the analysis. Since τ is small, the condition λ ≤ 1 − τ is a slight strengthening of the condition λ < 1 in (2.4), and implies that either pool is underloaded when its own class of customers receives help from the second pool, regardless of the value of μ. To see this, observe that the instantaneous total service rate at pool j at time t, if there is no routing of new class-i customers to that pool, is μz i,j (t) + (1 − z i,j (t)), and that The condition θ < μ simplifies the exposition of the fluid model. Specifically, As will be seen below, we provide closed-form solutions for the fluid model which are not defined for θ = μ, e.g., see (3.6). Therefore, one needs to solve for the case θ = μ separately. Letting x(·, θ, μ) := {x(t; θ, μ) : t ≥ 0} denote our solution for the fluid equations as a function of θ and μ, and defining x(t, θ, θ) and x(t, μ, μ) to be the solution for the fluid equations when θ = μ, it easily checked that our explicit solution x(·, θ, μ) is continuous in θ and in μ. We further remark that our solution remains valid if θ > μ, but the sign of some arguments changes.
Since the activation thresholds κ are strictly positive in the fluid model, there is no ambiguity about the translation of the FQR-ART control to the fluid model when there is no SSC. It is then entirely determined by the processes which are simply the fluid counterparts of (2.1). Due to the assumed symmetry, the state space of the fluid model is R 2 + × [0, 1] 4 and the sliding manifold are defined via If x(t) ∈ S i,j for all t over some interval I, then x is said to slide on the sliding manifold S i,j . Chattering corresponds to the fluid trajectory hitting and immediately crossing a sliding manifold, e.g., when it is moving from S − i,j to S + i,j (necessarily via S i,j ) without sliding on S i,j , and back from S + i,j to S − i,j . It will be clear that chattering about one sliding manifold is not sustainable unless the fluid trajectory makes it all the way to the second manifold. When both manifolds are hit, we say that the fluid oscillates. Since we will consider initial conditions in S + 2,1 , a full cycle is considered to end when the fluid trajectory first enters S + 2,1 after hitting S 1,2 . When chattering or oscillations occur, the sliding manifolds in (2.6) become switching surfaces, because the dynamics of the fluid model switches when it hits either of these subspaces.
The state space. It is easily seen from (2.3) thatq i (t) ≤ λ − θq i (t), and that this inequality holds for all t ≥ 0 regardless of the routing. It follows from the comparison principle for ODE's, e.g., Lemma 3.4 in [21], that for all t > 0, q i (t) ≤ max{q i (0), λ/θ}, i = 1, 2, and that, if q i (0) > λ/θ, then q i must be strictly decreasing as long as q i (t) > λ/θ. Furthermore, q i can never cross λ/θ from below, i.e., if q i (s) < λ/θ, then q i (t) < λ/θ for all t > s ≥ 0. We can therefore assume without any loss of generality that q i (0) < λ/θ so that the state space of the symmetric model is the compact and convex subset S ⊂ R 6 , where 3. The switching fluid model. Consider a system that has just recovered from an overload, in which class 1 was receiving help from pool 2. Suppose that λ 1 , which was greater than μ 1,1 m 1 = 1 during the preceding overload period, dropped to the value λ < 1 in (2.4) Since sharing was taking place with pool 2 helping, we necessarily had z 2,1 < τ and q 1 − q 2 = κ > 0 (x sliding on S 1,2 ) during the overload period.
Assuming that z 1,2 was larger than τ during the preceding overload period, we designate by 0 the first time that z 1,2 hits τ , so that sharing can begin with pool 1 helping queue 2 if d 2,1 (0) > κ. Formally, Assumption 2 (initial condition).
To describe the oscillatory behavior of the fluid model, we define the times Observe that T 1 and T 3 are the hitting times of the switching manifolds, and are in fact the crossing times from above of these manifolds. At those hitting times, ongoing sharing ends. The times T 2 and T 4 are the hitting times (again, crossing times from above) of the release thresholds, so that sharing can begin at those times. For example, if x(T 2 ) ∈ S + 1,2 , then sharing of class-1 fluid can begin at this time. See §3.2 below.
We refer to the times Σ i as switching times, and to T i as holding times (the times between switching). The length of each interval We will interchangeably write T 1 or Σ 1 , and T 1 + T 2 or Σ 2 , as convenient.
Clearly T 1 > 0 for the initial condition in Assumption 2, but it is possible that T i = 0 for i > 1. Observe that if at the end of the first cycle x(Σ 4 ) satisfies the same conditions specified for x(0) in Assumption 2, then x(Σ 4 ) can be taken as a new "initial condition" for the fluid model (which is time homogeneous, as will be shown below), and a new cycle begins. Furthermore, if both fluid queues are strictly positive on [0, Σ q ) and z 2,1 (Σ 1 ) > τ in addition to d 1,2 (Σ 2 ) > 0, then x(Σ 2 ) can be thought of as a "mirror image" of x(0) because we necessarily have 0 < z 1,2 (Σ 2 ) < τ. In particular x(Σ 2 ) satisfies the conditions in Assumption 2, but with the labels (subscripts) reversed. Similarly, if both queues remain positive throughout [0, Σ 3 ), then . This observation greatly simplifies the search for a periodic equilibrium since, on the trajectory of a periodic equilibrium, it holds that ., x s has the labels of x reversed). We can then focus on analyzing a half cycle [0, Σ 2 ] for the symmetric model.
Hence, we consider the fluid model as long as the conditions in Assumption 2 hold in the switching times, either for x or for x s . It will be seen below that, for any initial condition in S, 0 ≤ z i,j ≤ 1, i, j = 1, 2. However, the equations for q 1 and q 2 can become negative. We thus consider the fluid Since T 1 > 0 for any initial condition satisfying Assumption 2, we necessar- On the other hand, if x(Σ 4 ) satisfies the conditions in Assumption 2, then Σ q > Σ 4 . We then take x(Σ 4 ) as the initial condition for the second cycle, and start over. We will provide sufficient conditions for Σ q to be infinite, in which case cycle-end time Σ 4 is the beginning of a new full cycle, and the fluid model keeps oscillating indefinitely. Since both queues are strictly positive throughout (despite Assumption 1), we get congestion collapse that is due to self-sustained oscillations.

The switching fluid equations.
3.1.1. The equations on I 1 : Both pools serve queue 2 only. Recall that over the interval I 1 ≡ [0, Σ 1 ) sharing takes place with both pools accepting only fluid from queue 2 and no fluid from queue 1. For a given initial condition x(0) satisfying Assumption 2, and determined by specifying the triple (q 1 (0), q 2 (0), z 2,1 (0)), the fluid equations for the service process are thereforė (3.4) and the fluid equations for the queue processes arė (3.5) For the given initial condition x(0), we can calculate the interval termination time T 1 and the fluid performance functions in I 1 . Observe that by first solving for the service processes in (3.4), the autonomous (timehomogeneous) ODE for the queues becomes a nonhomogeneous first-order linear ODE. Under the condition θ < μ in Assumption 1, the explicit solution to the ODEs (3.5) over [0, T 1 ) is We see that q 1 (t) is strictly increasing in S and necessarily remains strictly positive in the interval I 1 . Given the initial conditions in Assumption 2 and the definition of Σ 1 ≡ T 1 in (3.1), this implies that both fluid queue lengths are necessarily strictly positive in the interval I 1 , so that Σ q > T 1 .

3.1.2.
The equations on I 2 : No active sharing. Given any initial condition (q 1 (0), q 2 (0), z 2,1 (0)), we can calculate T 1 and the 6-tuple (q i (T 1 ), z i,j (T 1 )); i, j = 1, 2). These provide the initial condition for the second interval I 2 ≡ [Σ 1 , Σ 2 ). We assume that z 2,1 (T 1 ) > τ so that sharing with pool 2 helping queue 1 did not begin at time T 1 and so T 2 > 0. The fluid equations for the service process for t ∈ I 2 arė As long as both queues remain positive, since there is no new sharing in this second interval I 2 , at time T 1 + t for t ∈ [0, T 2 ], the queues evolve as follows:q Paralleling (3.6), we can solve these ODE's explicitly: For all t ∈ [0, T 2 ) are derived similarly to the equations for the intervals I 1 and I 2 , assuming Σ q < Σ 4 . We summarize in the following definition of the direct fluid model. As was mentioned before, we consider the interval [0, Σ q ) and provide sufficient conditions for Σ q to be infinite. We further prove that oscillations must end at time Σ q when this time is finite.
For two real numbers a, b, let a ∧ b ≡ min{a, b}. We will later also use the notation a ∨ b for the maximum between the two numbers.  3 satisfies the equations of f 1 , but with the labels of the processes reversed, and f 4 satisfies and equations of f 2 , with the labels of the processes reversed. The switching times Σ i , 1 ≤ i ≤ 4, are determined by the value of the solution x(t) at time t and are defined in (3.2). Furthermore, all points t ∈ [0, Σ 4 ∧ Σ q ), except for the switching times, are regular.
We refer to any specific solution to (3.10) as a fluid solution or a trajectory. As was mentioned above, if x(Σ 4 ) satisfies Assumption 2, then it serves as an initial condition for the following cycle, so that (3.10) describes the fluid dynamics beyond the first cycle in an obvious way. In §I.2 we will show that the unique solution x to (3.10) with a given initial condition arises as the FWLLN ofX n in (2.2) as n → ∞ over any compact subinterval of [0, Σ q ).

The queue-difference process. Let
In that case, fluid from queue 2 stops flowing into pool 1. At some point t 0 > T 1 we may have that −Δ(t 0 ) = κ, in which case sharing should begin with pool 2 helping queue 1, unless z 2,1 (t 0 ) > τ, which means that x will cross the sliding manifold S 1,2 into S + 1,2 . We now study the difference process over [0, Σ 2 ). In terms of (3.6), Lemma 3.1 (derivative of Δ over I 1 ). The function Δ in (3.11) has a negative derivative on I 1 and is therefore strictly decreasing. In particular, Proof. The expression for the derivative (prior to time T 1 ) follows immediately from (3.5). The function Ψ(t) in (3.13) is strictly negative because We also have an explicit expression for the difference at time t in terms of its value at time 0. Specifically, (3.12) is a classic first-order ordinary differential equation, which is known to have the explicit solution where Ψ(s) is defined in (3.13) and is independent of Δ(0). Thus, Δ(t) is a strictly increasing function of the initial difference Δ(0) > 0. In addition, Ψ(s) and Δ(t) are increasing functions of z 2,1 (0) and τ . As a consequence, T 1 is strictly increasing function of Δ(0), z 2,1 (0) and τ . Moreover, for Ψ L and Ψ U in (3.14) and t ∈ I 1 , From (3.8), we immediately obtain an expression for the derivative of the queue difference: Therefore,Δ(t) < 0 for all t ∈ [0, Σ 2 ), so that Δ is strictly decreasing over that interval, implying that the time T 1 is well defined as the unique solution t to the equation Δ(t) = κ. Finally, (3.9) implies that the function Δ(t) can be expressed as In particular, Δ(T 1 + t) < κ for all t ∈ I 2 , so that there is no active sharing in I 2 .

Conditions for finiteness of the switching times.
From the definition of T 1 in (3.1) together with (3.16), we immediately get that T 1 < ∞. Given T 1 , we can apply (3.7) to obtain an equation for T 2 . If T 1 is sufficiently large so that z 2,1 (T 1 ) > τ, then where the last equality follows from the definition of T 2 . As an immediate consequence of (3.1), we have explicit formulas for T 2 : It is easy to check whether We can apply (3.9) to calculate satisfies the conditions of x(0) in Assumption 2 but with the labels of the processes reversed, then we can again apply (3.7) (with the labels reversed) to conclude that T 3 < ∞. If T 3 > 0, then T 4 satisfies a similar equation to (3.21), but with T 3 replacing T 1 and z 1,2 (T 3 ) replacing z 2,1 (T 1 ), provided that z 1,2 (T 3 ) > τ.

Qualitative analysis.
Just as for the stochastic system, it is important to identify the possible equilibrium behavior of the fluid models, as well as its long-run behavior. We start with formally defining the relevant equilibria for our fluid model and then stating the main results regarding fluid model.
Recall that the state space of the fluid model is S in (2.7). For the general discussion regarding the long-run behavior of the system, we consider all the possible initial conditions, and therefore Assumption 2 is not enforced in this section. Specifically, any γ ∈ S is allowed to be an initial condition.
Lyapunov stability of a stationary point. We will show that for any set of parameters, the fluid model in Definition 3.1 has a unique stationary point and that, in some cases, there also exists a unique periodic equilibrium. We will then study the stability properties of the fluid model. There are three types of stability notions corresponding to stationary points that are relevant for us.
For a stationary point We note that for our system with the state space S in (2.7), subsets of S R 6 are considered open in the relative topology induced on S by the topology of R 6 . In particular, open subsets can contain points on the boundary of S in R 6 .
Stability of a periodic equilibrium. When a periodic equilibrium u * exists, it is possible for the fluid model to oscillate indefinitely, at least when the initial condition is taken to be on the periodic equilibrium trajectory. However, we would like to know if the periodic equilibrium is also asymptotically stable in some sense, namely, if there exists a set S u * ⊆ S such that, if x(0) ∈ S u * , then x(t) converges to the periodic equilibrium. We note that convergence to periodic equilibrium cannot hold in the Lyapunov sense, as in Definition 4.3, because there would typically be a time shift between the converging solution and the periodic-equilibrium solution. We therefore say that a solution x converges to a periodic equilibrium u * if its image "spirals" toward the image of u * as time increases. (By spiraling we mean that the image of x keeps moving in the direction of u * and gets closer to it as time increases; see Lemma D.4 in the appendix.) Consider a switching dynamical systemẋ = f σ (x) (not necessarily (3.10)). The standard way of proving that a periodic equilibrium u * (assuming one exists) with period T is stable, is to consider the intersection pointũ of u * with a switching surface M, and show that any trajectory x that is initialized on M sufficiently close toũ, will reach M again after a time that is approximately equal to the period T of u * . If, in addition, the intersections of x with M converge toũ, then u * is asymptotically stable; see, e.g., page 121 in [40].
To rigorously define the above asymptotic stability notion, and show that it indeed implies the "spiraling motion" of solutions that are initialized sufficiently close to a periodic equilibrium, we first make a simple observation: When there are N > 1 switching surfaces M i , 1 < i ≤ N , that are intersected by a stable periodic equilibrium u * , the intersections of x with M i , as well as the values of x at those intersection points, will converge to the intersection points of u * with M i and the values of u * at these epochs, respectively, for each i ≤ N . Since this is the case for our system, we define asymptotic stability in term of all four switching surfaces and the corresponding switching times. To avoid introducing more notation, the definition is given for our system directly.
Let P u * denote the image of a periodic equilibrium u * having period T ; Since u * (0) = u * (T ), the set P u * is an invariant set, namely, if y 0 ∈ P u * and y is the unique solution toẏ = f σ (y) in (3.10) with initial condition y(0) = y 0 , then y(t) ∈ P u * for all t > 0. Let x be a solution to (3.10) with x(0) / ∈ P u * and Σ q = ∞ (so that x oscillates indefinitely; we will show in Theorem 5.5 below that such solutions exist). Note that if x is an oscillating solution to (3.10), then there exists a t 1 ≥ 0 such that x(t 1 ) satisfies the conditions in Assumption 2. Due to the time-homogeneity of x we can restart the ODE at the first time t 1   exists an open subset S u * of S which contains P u * such that, if x(0) ∈ S u * , then for 1 ≤ i ≤ 4 and any t > 0, 5. Asymptotic behavior of the fluid model. In this section we establish results about the asymptotic behavior of the switching fluid model in (3.10). We show that there always is the underloaded stationary point equilibrium, to which the fluid model converges if it does not oscillate indefinitely. We show that there exists an overloaded periodic equilibrium if it oscillates indefinitely, and provide sufficient conditions for endless oscillations. For the discussion of equilibria, we no longer assume initial conditions in Assumption 2; we allow arbitrary initial conditions in the state space S. We also consider the system after time Σ q in (3.3). The proofs of all the results in this section are relegated to the Appendix.

Existence and asymptotic stability of a unique stationary point.
If there is no sharing actively taking place on an interval [0, T ], then the stochastic system decomposes into denote the total number of customers in each of these systems andȲ n i := Y n i /n, i = 1, 2. Then the fluid model forȲ n in the symmetric case we consider is the solution of the ODEẏ where a + ≡ max{a, 0}. In this case we have the following elementary, but important, result.
Theorem 5.1. If q i (0) ≤ κ, then no sharing will ever begin in the fluid model and Hence, x * 0 is an asymptotically stable stationary point.
Remark 5.1. Having x * 0 in (5.1) be an asymptotically stable stationary point depends critically on the assumption that κ > 0. If, instead, κ = 0, then it is possible for x * 0 to be an unstable stationary point, so that x oscillates indefinitely for any initial condition x(0) = x * 0 . Instability of x * 0 has important consequences for the stochastic system X n , since stochastic fluctuations may trigger undesirable sharing even if the system is initialized at the neighborhood of x * 0 . Therefore, stochastic fluctuations can quickly lead to fluid-scaled fluctuations, namely, to an oscillatory behavior. See the simulations in §7.4 below. The moral is that there is a need to ensure that the activation thresholds in the (finite) stochastic system are large enough to be considered positive in fluid scale. The size of the stochastic fluctuations of critically-loaded pools with no sharing can be estimated from the established heavy-traffic limit approximations for the Erlang-A model in [13].
Ideally, x * 0 in (5.1) would be a globally asymptotically stable stationary point for the fluid model, since the system is underloaded (λ < 1) and we want no sharing to take place, and indeed that will be the case with appropriate controls. However, here we are interested in fluid models with poorly chosen controls. Then solutions to (3.10) need not converge to x * 0 , so that S c where, for a set A, A c denotes the complement of A and φ denotes the empty set.

5.3.
Existence of a periodic equilibrium. Theorem 5.3 shows that a solution x to (3.10) either converges to x * 0 or oscillates indefinitely. We now consider what happens if the solution oscillates indefinitely. (3.10). In particular, if O = φ, then there exists a initial state vector x(0) satisfying Assumption 2 such that x(0) ∈ O and, for that It is important that the condition in Theorem 5.4 can be satisfied, as the following theorem shows.
then there exists a unique periodic equilibrium u * and x converges to u * as in (4.1). Therefore, S x * 0 ∪ S u * = S, namely the fluid model is bi-stable with all fluid trajectories converging to one of the two equilibria as t → ∞.
Extensive numerical trials, some of which are presented in §7 below, indicate that Conjecture 5.1 holds. Moreover, we next derive an approximating system to (3.10) which is shown to be bi-stable.
6. Approximating dynamical system. Since we were unable to fully characterize the asymptotic behavior of our initial fluid model, we now develop an approximating fluid model that can be analyzed more easily; i.e., for which we can establish bistability and calculate the two equilibria. The approximating system is easier to analyze because it is essentially a onedimensional system at the switching times. However, there are discontinuities at some of the switching times, so the approximating fluid model is a dynamical system with jumps (alternatively, it can be represented as a hybrid system with jumps); see [38] and [42]. The latter reference provides a general framework for defining and analyzing solutions for dynamical systems with jumps (see §1.5 of [42]), but the relative simplicity of our approximation obviates the need for a general theory. Numerical examples confirm that the approximating system serves as a useful approximation for the original fluid model, allowing us to rapidly compute a periodic equilibrium.
The approximation is obtained in five steps: First, we approximate the solution x to (3.10) by a solution x a to for a given initial condition x a (0), where we supplement the argument x a of f σ in (3.10) by the abandonment rate θ a and the control parameter τ a of the approximating system. Second, we assume that there is no abandonment, i.e., we let θ a = 0. Third, approximate τ by 0 on the first and third subintervals, i.e., where the switching times Σ a i are defined analogously to (3.2), and are formally defined in (6.5) below. Fourth, we let the initial condition for the approximating system be defined by where x(0) is the initial condition in Assumption 2. Fifth, and finally, we primarily focus on the three-dimensional function x a 3 ≡ (Δ a , z a 1,2 , z a 2,1 ) that approximates the three-dimensional function x 3 ≡ (Δ, z 1,2 , z 2,1 ) obtained from (3.10), ignoring the queue lengths. We will be assuming that the queue lengths remain positive, which can be checked at the end. In general, our analysis is valid until a queue length becomes 0. First, we focus on the difference function because it is possible to do so and still have a bonafide dynamical system, which is easier to analyze. Second, we are motivated to ignore the queue lengths because we have less control over them without abandonment; e.g., they can easily explode (diverge to infinity). However, we will also state results for the full six-dimensional approximation x a .
Since the approximating queue lengths q a 1 and q a 2 can obtain any nonnegative value, the full state space S ≡ [0, λ/θ] 2 × [0, 1] 4 of the solutions to (3.10) , but we will show below that Δ is bounded from above.
Paralleling (3.1), the switching and holding times, and the intervals between switching times, are defined via where, with T a 0 ≡ Σ a 0 ≡ 0, Paralleling (3.3), we let Our analysis will be valid for the full six-dimensional system on the interval [0, Σ a q ], but we will not examine Σ a q until the end. In particular, we will show that the system quickly converges to the (unique) periodic equilibrium, when it exists, for any initial condition that is associated with an oscillating solution. We can therefore initialize the queues (which are unbounded) at large values so that there is no time for them to reach 0 by the time convergence to the periodic equilibrium is observed.
In examples we see that the approximating system approximates our original system very well when the parameters θ and τ are suitably small. For this approximating system, we establish the following two results (Theorems 6.1 and 6.2). Let Σ a,(k) 4 and Δ a,(k) be the values of the k th iteration, where we apply the approximation above in the k th subinterval after making Σ a,(k−1) 4 equal to time 0.
The first main result regarding the approximating system, Theorem 6.1 below, considers the case in which the approximating system converges to its unique fixed point.
(a) The unique stationary point x * 0 in (5.1) for the fluid model in §3 is also the unique stationary point in R 6 for the approximating system.
The next theorem considers the case in which the approximating system possesses a periodic equilibrium, in addition to its unique stationary point x * 0 .
Theorem 6.2. Consider the approximating system defined in (6.1)-(6.6). the condition in part (a) holds, and if Σ a q = ∞, then (i) there exists a unique periodic equilibrium u a * 3 to the three-dimensional approximating system and (ii) the approximating system is bistable: There are initial conditions for which x a (t) → x * 0 in R 6 for x * 0 in (5.1) (which may include having Σ a q < ∞); there are other initial conditions for which Σ a q = ∞ and x a (t) fails to converge in R 6 in the usual sense of pointwise convergence, but x a In particular, by Theorem 6.2 (b), when a periodic equilibrium exists to the approximating system, then the system is bi-stable; each solution must converge to one of the two equilibria x * 0 or u a * 3 . Otherwise, all solutions converge to x * 0 (which is therefore a globally asymptotically stable stationary point in this case).
The condition Σ a q = ∞ in part (b) of Theorem 6.2 is easy to check directly by solving the simple equations for the full six-dimensional equation (6.1). However, in Appendix F we show that, whether or not this condition holds can be determined a posteriori by a simple calculation that depends only on the periodic equilibrium, and does not depend on the transient behavior of the fluid model.
In §6.1 and §6.2 we derive the solution to the approximating system over the first and second intervals, [0, Σ a 1 ) and [Σ a 1 , Σ a 2 ), respectively. In §6.3 we construct the solution after Σ a 2 . In §6.4 we consider a simple heuristic to provide an approximate explicit formula for the switching time T a 1 to facilitate computations.
All the results in this section are proved in §D in the appendix. Furthermore, in §F we show how to apply the explicit formula in §6.4 to determine if there will be congestion collapse. We establish conditions for a stronger geometric rate of convergence and exponential stability in §H.
It follows from (6.7) that for Σ a 1 ≡ T a 1 , which is well-defined by Lemma 6.1.

The approximation over the second interval
. The equations for the service process over [Σ a 1 , Σ a 2 ) are obtained from (3.7), but with T a 1 replacing T 1 and z a i,j (T a 1 ) replacing z i,j (T 1 ), i, j = 1, 2. As in §3.1.2, it is possible to have Σ a 1 < Σ a q ≤ Σ a 2 , but we do not check that now. Since the process z 1,2 in (3.7) keeps decreasing and z a 1,2 (T a 1 ) = 0, it follows from (6.10) and (3.7) that (6.11) z a 1,2 (T a 1 + t) = 0 and z a 2,1 (T a Taking θ ↓ 0 and inserting the values of z a 1,2 (T a 1 ) and z a 2,1 (T a 1 ) from (6.10) in (3.17), we see that for 0 ≤ t < T a 2 , where Δ a (Σ a 1 ) = κ. By (6.4), T a 2 is the first time after Σ a 1 that z a 2,1 hits τ , so that, paralleling (3.21), Clearly, if τ ↓ 0 then T a 2 → ∞, which is why we cannot replace τ with 0 over the second interval [Σ a 1 , Σ a 2 ). Inserting the value of T a 2 into the solution to (6.12) we obtain where y(t−) ≡ lim s↑t y(s) denotes the left limit at time t of a function y. Hence, As before, we can use the symmetry of x a 3 and take x a 3 (Σ a 2 ) to be the "initial condition" by reversing the labels. This means that, as in (6.3), we take τ ↓ 0 in x a 3 (Σ a 2 ). It follows immediately from (6.14) that lim τ ↓0 x a 3 (Σ a 2 ) = x a 3 (Σ a 2 −). Hence, the approximation x a 3 , and therefore x a , has a jump at time Σ a 2 , since the values of Δ a (Σ a 2 −) and z 2,1 (Σ a 2 −) both depend on τ . However, we can easily avoid having jumps in the process Δ a , which we want to avoid because it causes ambiguities about the behavior of the queues at the jump times. To that end, we simply define As a consequence, only z 2,1 jumps at the second switching time Σ a 2 . That discontinuity makes our fluid model a switching dynamical system with jumps, as mentioned at the beginning of the section.
If Δ a (Σ a 2 ) > κ, then T a 3 > 0, and paralleling (6.9) and Lemma 6.1, T a 3 is the unique strictly positive solution to Furthermore, paralleling (6.13), and start over. The preceding shows that, just as for the original system, we can exploit the symmetry of the model and consider only the half cycle [0, Σ a 2 ). In particular, for a given initial condition Δ a (0) we solve up to time Σ a 2 and take to be a new initial condition to solve beyond time Σ a 2 . It immediately follows that It is significant that at the switching times, x a 3 depends only on the known control parameters (κ, τ ) and the one unknown T a 1 . Therefore, the approximating system is reduced to an essentially one-dimensional system at the switching times.
The approximating three-dimensional system. From the above, x a 3 = (Δ a , z 1,2 , z 2,1 ) is the unique solution over [0, Σ a q ), for Σ a q in (6.6), to with initial condition (6.3) and τ a in (6.2), where f 3 1 is defined in (3.4) and (6.8), f 3 2 is defined in (3.7) and (6.12), f 3 3 satisfies the equations of f 3 1 , but with the labels reversed, and f 3 4 satisfies the equations of f 3 2 , with the labels of the processes reversed.

6.4.
A simple heuristic approximation for computation. The approximating system we have developed in this section has been useful to estalbish the strong theoretical results in Theoreem 6.1, which supports what we see for the original system in numerical examples. However, it is still not easy to compute the periodic equilibrium of the approximating system. We must either numerically solve the ODE's or numerically solve for T a 1 in (6.9) in order to evaluate the values of x a at the switching times. Hence, in the present section we develop a simple heuristic approximation for T a 1 in (6.9).
In particular, our approximation is obtained by simply omitting the second exponential term on the right in (6.9), so that Approximation (6.18) can be justified by observing that equation (6.9) can be expressed abstractly as T a 1 = A + Be −T a 1 for A > 0 and 0 < B < 1. Since T a 1 > A and T a 1 − A < Be−A, T a 1 ≈ A whenever B is suitably small or A is suitably large. In particular, the error is asymptotically negligible as A increases. We remark that approximation (6.18) also coincides with − log (ξ) ξ ≡ ξ(Δ) in (D.7), which can provide another way to derive the approximation. We can combine (6.10) and (6.18) ot obtain an associated approximation for z a 2,1 (T a 1 ). With this heuristic approximation for z a 2,1 (T a 1 ), we have by (6.13) that so that (6.14) and (6.15) are respectively approximated by and x a (Σ a 2 ) serves as the initial condition for the following cycle. We can use this heuristic approximation to approximate the values of the fluid model at the switching times, using an iterative algorithm, which is described in §D.2.2. Furthermore, in §F we explain how this heuristic can be employed to evaluate whether a periodic equilibrium exists. A numerical example comparing the solution to the approximating system to the original fluid solution is presented in §7.1 below.

Numerical examples.
In this section we report the results of numerical experiments based on numerical algorithms (numerical solutions of the dynamical systems) and simulations. Throughout this section we consider symmetric systems with parameters as in (2.4). In all our examples, λ = 0.98, τ = 0.01 and κ = 0.1, but we vary the parameters θ and μ. The initial condition in the numerical examples is taken in accordance with Assumption 2.
We emphasize at the outset that μ in our numerical examples is taken to be extremely small. (We also consider systems with no abandonment, or with very small abandonment rate, but this is prevalent in modeling.) However, as our simulation experiments below demonstrate, the oscillating fluid models for systems with extreme parameters suggest possible bad oscillatory dynamics in systems with more realistic parameters. In these more realistic setting the behavior cannot be predicted analytically, since the stochastic system is too complicated. Moreover, oscillations may even be overlooked in practice, because sufficient abandonment keep the queues relatively small, so that congestion collapse may fail to be noticed. Thus, we obtain important practical insights by rigorously studying extreme cases.
The rest of this section is organized as follows. In §7.1 we consider a system with no abandonment (θ = 0) and compare the results to the heuristic approximating model in §6.4. We consider a similar system in §7.2 but increase μ to show that x * 0 is globally asymptotically stable, thus showing the dependence on μ of the long-run behavior of the fluid model, as was established in §6. We add abandonment in §7.3 in comparison to the system in §7.1 to numerically support the reasoning for the development of the approximating system in §6. Finally, in §7.4 we present simulations of stochastic systems for which the fluid limit has no oscillatory solutions, and show that stochasticity may lead to substantial oscillations. 7.1. A system with no abandonment. We start with a system that has no abandonment, i.e., θ = 0. The other parameters are λ = 0.98, τ = 0.01, κ = 0.1 and μ = 0.1. The initial condition is q 1 (0) = 1 and q 2 (0) = 1.2, so that d 2,1 (0) = 0.2 and Δ(0) = 0.1. We further take z 1,2 (0) = τ and z 2,1 (0) = τ /2 = 0.005.
The time-dependent behavior of Δ is shown in Figure 4, whereas Figure  5 plots the image of (z 2,1 , Δ) (with time suppressed). As can be easily seen from Figure 4, there are ten full cycles plotted in this example. However, there are four loops visible in Figure 5, with each loop being a full cycle, where a full cycle begins at a time t 0 when z 1,2 (t 0 ) hits τ from above, such that Assumption 2 is satisfied at that hitting time. In this example, the two variables (Δ, z 2,1 ) spiral outward to the periodic equilibrium, namely, the first cycle is the inner (smallest) loop, the second cycle is the second smallest loop, etc. The fact that only four cycles are clearly visible in Figure  5 suggests that convergence to the periodic equilibrium is extremely fast in terms of the number of periods. The fast convergence is also visible by in Figure 4 itself. See §H for theoretical support.
Of course, the stability of (Δ, z 1,2 , z 2,1 ) does not imply stability of system. Indeed, Figure 6 suggests that q 1 increases without bound, and by symmetry, so is q 2 . Figure 7 shows that a substantial proportion of each pool has fluid     from the other class for a non-negligible amount of time, which is the cause for the congestion collapse observed in Figure 6. See §F. Finally, in Table 1 we compare the numerical solution to the iterative algorithm in §C.3.1 (in the "original sys." row), to the heuristic approximations developed in §6. 4. We note that L ≈ 0.44 < λ = 0.98 for L in (F.2). 7.2. Bifurcation: μ = 0.3. The term "bifurcation" refers to a change in the equilibrium behavior of a dynamical system as the value of one of its parameters varies, while all other parameters remain unchanged. Following the analysis in §6, we now take the same system considered in §7.1 but change the value of μ. We do not carry out a full bifurcation analysis to find the bifurcation point in which the equilibrium behavior of the system changes, but instead consider a single value μ = 0.3. To see how the system converges to the stationary point with no sharing, we change the initial condition in §7.1 and take Δ(0) = 20. The trajectory of Δ is shown in Figure 8. (Note however, that we cut the vertical axis in this figure at the value 3 to make the oscillations more apparent.) Figure 9 shows the spiraling towards that equilibrium point in the (z 2,1 , Δ) plane. Unlike the case depicted in Figure  5, now spiraling is "inward", i.e., the largest loop corresponds to the first cycle, and each of the four cycles is shorter than the previous one. we remark that the heuristic approximation in §6.4 was stopped in the fifth iterations since Δ (5) < 0.  Observe that even though the convergence to the stationary point is fast in terms of the number of oscillations, it is very slow in continuous time. In particular, the system oscillates for more than a hundred time units before it ceases to oscillate. 7.3. Adding abandonment. For a numerical depiction of the approximating solution, we now consider a system with μ = 0.1 as in §7.1 but add abandonment, taking θ = 0.01. As can be seen by comparing Figures 10  and 11 to Figures 4 and 5, the system with no abandonment serves as a reasonable approximation for the a system with a small abandonment rate, but the oscillations are smaller, as is intuitively expected.  It is significant that for a given stochastic system X n which is approximated by a fluid model x, there is freedom in how to choose the limiting thresholds. For example, if n = 100, then activation thresholds k n i,j = 10 can be considered as being √ n or as 0.1n. In the latter case, the stochastic fluctuations are considered negligible with respect to the activation thresholds, and κ = 0.1. However, in the first case, κ = 0, and so the stochastic fluctuations are significant. Specifically, if κ = 0, then oscillations are much more likely to occur because S 1,2 = S 2,1 in that case; see Remark 5.1.
System with a practically unstable stationary point. We simulated a system with similar parameters to those in §7.1 taking n = 100, so that there are 100 agents in each pool and λ n = 98. As above, θ = 0.01. Since κ n = 0.1n, we take κ n = 10, which we can also think of as being √ n, i.e., κ = 0. Figures 12 and 13 show a single sample path of the Q n 1 process and the shared-customers processes for a system starting empty. Due to symmetry of the parameters and the initial condition of the two pools, the fluid model will unambiguously move through x * 0 . Once x * 0 is hit, and since there is no sharing at that hitting time, the fluid model must remain at that point. However, random noise in the stochastic system causes sharing to begin, leading to extreme oscillations. From the fluid model perspective, this suggests that To show that O = φ we solve the fluid model for an extreme example with q 1 (0) = 1 and q 2 (0) = 1000, z 1,2 = τ and z 2,1 = 0. In the simulation however, we have Z n 2,1 = 20 and Z n 1,2 = 0, which is a likely initial condition for a system recovering from an overload in queue 2. (The initial conditions of the stochastic system and the fluid model do not match because we want to show that the fluid model does not oscillate, and has no periodic equilibrium.) Figure 14 shows a single sample path of the shared-customers processes from a single simulation run, and Figure 15 shows the fluid model of the system with the initial condition specified above. We only show figures of the shared customers service process, because both queues monotonically decrease to 0 in the fluid model, whereas customer abandonment make the oscillations of the queue processes unobservable in the simulation. From the practical point of view, this means that oscillations may be hard to detect in real time, unless one knows to look for them. Specifically, if the control parameters are chosen in accordance with the fluid model so as to ensure that the system will not oscillate, and if only the queues are observed, then the oscillations may not be captured by the controller. Indeed, the controller may assume that sharing is initiated and stopped due to "legitimate" activations of the control to respond to changes in the arrival rates We note that Figure 14 shows only the time interval [0, 100] for clarity, but that the oscillations continued for the full run time of the simulation, which lasted 1500 time units. (As before, time here is measured in service time units μ i,i = 1, i = 1, 2.) In ending we remark that the bad behavior shown here can be easily avoided by increasing k n i,j , as was discussed in Remark 5.1. A numerical example, related to the one given here, is given in Section 4.1 in [32]; see Figures 5 and 6 in that reference.

Implications of the fluid analysis for stochastic systems.
In this section we consider the implications of our fluid analysis to the (finite) stochastic system which they approximate. These implications rely on the fact that the fluid model can be achieved as a fluid limit in the many-server heavy traffic limiting regime. That is, if X n (0) ⇒ x(0) in R 6 , then, uniformly on compact intervals,X n ⇒ x, where ⇒ denotes convergence in distribution. See Theorem I.1 in §I.2 below for the precise statement and proof of this result. Now, since for each fixed n ≥ 1, X n is clearly an irreducible and positive recurrent CTMC, it possesses a unique stationary distribution which is also its limiting distribution. In particular, for some random variable X n (∞) with values in R 6 , it holds that, regardless of the initial condition, X n (t) ⇒ X n (∞) as t → ∞.
Given that the fluid limit ofX n may oscillate indefinitely, and never converge to its stationary point, it is not a-priori clear whetherX n (∞) can be approximated by x * for large n. The following weak law of large numbers (WLLN) for the sequence {X n (∞) : n ≥ 1}, whose proof appears in §E, shows that the sequence of "fluid-scaled" stationary distributions converges to the stationary point x * 0 with no sharing, even if O = φ, i.e., the fluid limit may not converge to its stationary point x * 0 .
Theorem 8.1 (WLLN for stationary distributions).X(∞) ⇒ x * 0 , i.e., for each continuous and bounded function f : Note that taking the limits in Theorem 8.1 in the reverse order, namely, first taking n → ∞ and then taking t → ∞, is not possible when O is not empty, because the limit of x(t) as t → ∞ does not exist for all initial conditions. We therefore cannot prove Theorem 8.1 using standard arguments, as were laid out in the proof of Theorem 4 in [16].

8.1.
On the rate of convergence to stationarity. The fact that X n ⇒ x uniformly on compact intervals together with Theorems 5.5 and 8.1 suggest that the state space of the irreducible CTMC X n is nearly decomposable into two regions when O = φ. In particular, the chain may spend a long time in one region before eventually moving to the second region. For example, ifX n (0) ≈ x(0) ∈ O for n large, then X n will approximately track the fluid trajectory with that initial condition. The oscillations ofX n can continue for arbitrarily large time periods as n increases.
On the other hand, if X n is initialized with no sharing and no queues, then hitting the activation thresholds is a rare event asymptotically, and oscillations will not begin for a long time. However, the chain being irreducible, must eventually visit a state in an "oscillating region" for the CTMC, triggering oscillations that, as explained in the paragraph above, will take a long time before finally ending, if n is large.
To make this discussion rigorous, consider a sequence of initial conditions {X n (0) : n ≥ 1} such thatX n (0) ⇒ x(0) ∈ O as n → ∞. SinceX n ⇒ x uniformly over compact intervals, and x is oscillating, we see that for any fixed t > 0 we can find N large enough, such that (8.1) X n (t) −X n (∞) tv > , for all n > N and for some > 0, where · tv denotes the total-variation norm (here given in terms of the random variables instead of their distributions); see, e.g., [10]. In particular, despite the fact thatX n (t) ⇒X n (∞) as t → ∞ for any given n, and moreover, the convergence rate to stationarity is exponentially fast as we show below, the convergence rate to stationarity can be arbitrarily slow for a sufficiently large system. To see that (8.1) indeed holds for all n large enough, note that convergence in total variation implies convergence in distribution (the two notions of convergence are in fact equivalent on countable state spaces). We can use the Lévy metric to measure distances between random variables corresponding to convergence in distribution. Specifically, we let the distance between two random variables X and Y with respective cumulative distribution functions F X and F Y , be Then, for random variables Y and {Y n : Now, take the contradictory assumption to (8.1), namely assume that there exists a time t > 0, such that X n (t) −X n (∞) tv < for all n ≥ 1 and > 0.
Then for this specific time t and for all n large enough, we have by the triangular inequality that where the second inequality follows from Theorem I.1, our contradictory assumption and Theorem 8.1, and the above holds for any fluid trajectory, regardless of the initial condition. Hence, x * 0 is globally asymptotically stable, in contradiction to the assumption that x(0) ∈ O.
The fact that X n may converge extremely slowly to stationarity for large n is not entirely straightforward, because X n is an exponentially ergodic CTMC, for each n ≥ 1, and therefore considered to converge "fast". The proof of the following theorem can be found in §E.
Theorem 8.2. Fix n ≥ 1. Then for any initial condition k ∈ Z 6 + , there exist positive constants M k and α (where M k depends on the initial state k and α does not), such that

Important takeaways for SSC-inducing controls.
An SSC-inducing control for a stochastic system is commonly designed in order to achieve an optimal control for a diffusion approximation of the system. In particular, stochastic networks can rarely be analyzed exactly, even under a fixed control, and finding a "good" control that is nearly optimal (in an appropriate sense), and is furthermore implementable, is a prohibitively hard problem. The most prevalent approach to solving an optimal control problem, initiated by Harrison [17], is to solve a related Diffusion Control Problem (DCP) for the system in heavy traffic, and then translate the optimal control for the DCP into a control for the system it approximates. As reviewed in [7], a key step in this procedure is to formulate an equivalent workload formulation (EWF), in which the dimension of the workload process is reduced. Thus, an optimal control for the approximating diffusion process found via the DCP scheme, dictates an SSC-inducing control for the stochastic system, which forces the asymptotic workload process to a lowerdimensional "boundary" subset of its state space. In fact, the FQR-ART control considered in this paper was developed by an analogous approach: In [28] we optimized a fluid model of the stochastic system, and then proposed a control to achieve the optimal fluid solution.
Before describing general insights obtained from our analysis here, we mention that the FQR-ART control in the prelimit, and its limiting counterpart, share some of the general characteristics of SSC-inducing control and the resulting limiting control. First note that an optimal control for an EWF is necessarily "bang-bang" with a singular part in the sense of [19], namely, such a control uses the maximum "pushing" towards the boundary set when the queue is away from it, and switches instantaneously between the directions of pushing to maintain the queue on the boundary. Similarly, the control process in both FQR-ART and the resulting limiting control use the maximum force to push the queues toward the threshold and the boundary set, respectively. See also the paragraph below the display of S 1,2 and S 2,1 in §2.2, page 145.
We further observe that, in addition to the bang-bang portion, FQR-ART also has a "no-action" zone associated with states in which both difference processes are below the activation thresholds, or when one of the difference process is above its corresponding activation threshold, but the release threshold prevents the control from being activated. Our analysis here revealed that the no-action zone is responsible for the delayed activation of the control, and to the resulting oscillations when the control parameters are not chosen correctly. In that regard, we make the following general observation: While much attention had been given in the literature to the difficulties in translating an optimal control for limiting approximations (typically diffusion approximations) to a control for the underlying sequence of stochastic systems, little attention was given to the difficulty in translating the asymptotic control parameters (when such a control is found) to a fixed system. Our analysis sheds light to possible problems that can arise in this second layer of translation. (Recall Remark 5.1 and the example in Figures 12-13. We discuss this issue further in §J in the appendix.) We next demonstrate with specific examples how our insights apply to other models. 9.1. A comparison with the N model. To make the discussion concrete, we now compare some conceptual similarities between our model and the N model which was studied extensively in the single-server stations setting; see, e.g., [18,20] and [2]. We refer to these references for other works on the N model in the conventional heavy traffic, and to [45] for a many-server heavy-traffic study of an N system. (We further remark that the N model is covered by the model considered in [15].) The N model, depicted in Figure 3 above, has two single-server stations, with the server in station 1 dedicated to serving class-1 jobs, while the server in station 2 can handle both queues. (note that there are no arrows representing abandonment from the queues.) A (non-idling) service policy determines which queue server 2 should handle when both queues are non empty. Similarly to our simulation experiments in [28], which showed that the QIR control applied to a critically-loaded X system can lead to congestion collapse (see §2 above), a simulation in [18] demonstrated that a naive implementation of the cμ rule, under which server 2 always prioritizes the class 1 queue, may lead to congestion collapse in a critically-loaded N model.
A solution to a DCP for the N model led the authors in [2] to design a threshold policy and prove that this policy is asymptotically optimal and leads to SSC. Under that policy, server 2 helps queue 1 only when that queue is larger than a threshold, and otherwise this server only serves its own queue. It is further assumed that server 2 preempts the job it processes if it needs to switch the class it serves. Note that this control uses the maximum force to push queue 1 towards the threshold, and has a no-action zone associated with queue 1 being below that threshold. Since the threshold is taken to be O(log n) as n → ∞, asymptotic SSC about the threshold implies that queue 1 is null in the diffusion limit. In particular, it is shown that the two-dimensional queue process has all its mass on the class-2 queue, and the idleness process has all its mass on server 2 (i.e., the class 1 queue is asymptotically null, and server 1 never idle). Observe that the no-action zone shrinks to 0 as n increases, so that it does not appear in the limit.
It is significant that the asymptotics in [2] were achieved under the assumption that server 2 can switch in zero time between the two queues. This assumption is reasonable to make if the switching times in reality are sufficiently small relative to the service time, but it is intuitively clear that even short switching times will lead to diffusion-scaled chattering of queue 1 about the threshold. In fact, since server 2 switches infinitely-often between the two queues in the limit over any finite time interval, non-zero switching times may push the system to an overload, because switching times of server 2 essentially increase its idle period. Moreover, if the O(log n) threshold is not chosen appropriately for a given stochastic system, and in particular, if the the threshold is chosen to be too small, then, as in the example in [18], server 2 will again spend too much time helping queue 1, leading to congestion of its own queue.
Finally, the authors in [2] remark that a hysteresis control with two thresholds can be employed to avoid chattering of the queue about the threshold (see the remark at the bottom of p. 621 in this reference). Under this modified control, only once queue 1 crosses an upper threshold will server 2 switch to help this queue, and that help is turned off only once the queue returns to the lower threshold. It is again intuitively clear that, if the thresholds are not chosen appropriately for the fixed system, namely, if the lower threshold is too small and the upper one is too large, then server 2 may end up spending too much time helping queue 1, leading queue 2 to increase without bound in an oscillatory manner. Indeed, the two-thresholds control increases the "no-action" zone, in turn, leading to increased delays in activating the control.

Summary.
In this paper we considered the FQR-ART overload control applied to the cyclic X model, when the control parameters are badly chosen. For the dynamical-system (fluid) limit, the purpose of the control is to attract any fluid trajectory to one of two sliding manifolds during overload periods, so as to maintain a pre-specified ratio between the two queues.
Switching fluid limit. We have shown that possible delays in activation and release of the control can lead to chattering and resulting oscillations, which translates to fluid-scaled fluctuations in the underlying stochastic system. The pathological oscillatory behavior can be analyzed via a switching dynamical system, as in Definition 3.1, within the framework of the manyserver heavy-traffic FWLLN (Theorem I.1 in §I.2). Theorems 5.2 and 5.4, respectively, prove that the fluid limit has a unique stationary point and a non-trivial periodic equilibrium that is associated with the oscillatory motion. Sufficient conditions for endless oscillations were provided in Theorem 5.5.
Fluid stability. In Theorem 5.3 it was shown that any fluid trajectory that ceases to oscillate must converge to the unique stationary point. A convenient approximating dynamical system to the fluid limit was developed and shown to be bi-stable in §6. Specifically, all the trajectories of the approximating system were shown to converge to one of the two equilibriathe stationary point x * 0 in (5.1), or a unique non-trivial periodic equilibrium. Finally, a simple heuristic construction in §6.4 can be used to approximate the values of the solutions to (3.10) at the switching times, and in particular, the values of the periodic equilibrium at the switching times, when it exists.

Implications. Numerical examples in §7
show the effectiveness of the approximating system. The simulation experiment in §7.4 demonstrates that our fluid model provides important insights into the untractable behavior of the underlying stochastic system, even when the fluid approximation itself is not oscillating.
From the theoretical stochastic perspective, the results in §8 demonstrate that, despite the fact that the stochastic system is an ergodic CTMC, and is even exponentially ergodic by Theorem 8.2, an oscillatory behavior of the fluid model implies that it may take very long time for the system to converge to stationarity. In particular, exponential ergodicity should does not necessarily imply "fast" convergence to stationarity.
From the practical perspective, the most important conclusion is that the control parameters must be chosen with caution. For example, the bad oscillatory behavior presented in §7.4 (which may be hard to detect in real time) can be avoided by choosing appropriate activation thresholds. We again refer to [32] for a further discussion.

APPENDIX A: OVERVIEW
This appendix contains supplementary material for the main paper. First, in §B we give notation for sets used in the paper, as well as sets that are used in the appendix. The proofs of the results in Section §5 appear in §C, and the proofs of the results in §6 are presented in §D. Both § §C and D include supporting results with their proofs, and efficient algorithms to compute the respective ODE's in switching times. The proofs of the theorems in §8 appear in §E. In §F we show how the approximating system can be employed to check whether congestion collapse occurs, and in §G we present an algorithm to compute the solution to the heuristic approximation suggested in §6. 4. In §H we establish stronger forms of convergence of solutions to the approximating system to their equilibrium behavior. In §I we show that the fluid model we considered in the main paper arises as the fluid limit in a many-server heavy-traffic fluid limit of the underlying model. The proof of the FWLLN is given in §I.2, after a brief expansion on the stochastic model and many-server scaling in §I.1. Finally, in §J we discuss implications of our results here for the control of the stochastic system.

APPENDIX B: NOTATION OF SETS
Below is a list of the different sets that appear in the paper. Their first appearance is in parenthesis.
• M -switching (or sliding) manifold in a general system ( §1). In this section we provide the proofs of the results in §5.
Then both z 1,2 and z 2,1 converge to 0, and it is easy to see from (2.3) (recall that there is no new sharing taking place) that both queues will reach 0 in finite time. Then, after q i reaches 0, all arriving fluid moves immediately into service, so thatż 2,2 = λ − z 2,2 , and we see that z 2,2 (t) → λ as t → ∞. Now suppose that x ∈ S 1,2 over an interval I. If z 2,1 > τ over I, then no fluid flows from q 1 to pool 2, so that both queues evolve independently according to (2.3). Since z 1,2 and z 2,1 are strictly decreasing over I, the same arguments given above apply in this case. Therefore, assume that z 2,1 ≤ τ over an interval J ⊆ I so that sharing is allowed. By Assumption 1, q 1 is strictly decreasing on J, and the sliding motion implies thatq 1 (t)−q 2 (t) = 0, so that q 2 is strictly decreasing as well (at exactly the same rate as q 1 ). Now, some of the service capacity of pool 2 is given to queue-1 fluid at any point, so that, for t ∈ J,q andq 2 (t) > λ − z 2,2 (t) − μz 1,2 (t) − θq 2 (t).
Recalling that q 1 (t) = q 2 (t) + κ and z i, so that z 1,2 (t) < z 2,1 (t). It follows that z 1,2 (t) ≤ τ and is decreasing on J. In particular, both queues continue decreasing after the sliding motion is over.
The same arguments give that, if x ever slides on S 2,1 , then both queues are strictly increasing to 0. Hence, the processes z 1,2 and z 2,1 never increase above τ during sliding motion, so that both queues are strictly decreasing to 0. After q i hits 0, z j,i decreases monotonically to 0 and z i,i converges to λ.

C.2. Bounds to guarantee oscillations.
We now provide auxiliary results, needed for the proof of Theorem 5.5, providing sufficient conditions for endless oscillations of solutions to (3.10) and congestion collapse. In § §C.2.1 and C.2.2 we construct simple bounds on T 1 and x(T 1 ), and bounds on T 2 and the values of x over [Σ 1 , Σ 2 ), respectively. Universal bounds on the solution x and the holding times, and a numerical example, are given in §C.2.3.

C.2.2. Bounds on T 2 and {x(t)
: . For bad oscillatory behavior, we will want to see that q 2 (T 1 + t) remains positive and, furthermore that d 2,1 < 0. to ensure that the initial conditions in Assumption 2 hold at the switching time Σ 2 ≡ T 1 +T 2 with the index labels reversed. From Corollary C.2, we obtain the following Corollary C.4 (lower bounds on the queue lengths on [T 1 , T 1 + T 2 )).
so that, for i = 1, 2, which is a strictly decreasing function of t. As a consequence, a sufficient condition for both q 1 (t) and q 2 (t) to remain positive throughout [T 1 , for which a sufficient condition is for T U 1 in Corollary C.1.

C.2.3. Universal bounds.
We now consider the performance over a range of initial conditions. First, we introduce lower and upper bounds on the initial difference Δ(0) ≡ q 2 (0) − q 1 (0). We assume that uniformly enforcing Assumption 2. We also assume that the smaller queue length is bounded below and above by Proof. Apply Corollary C.1.
If a periodic equilibrium exists, then the value of z 1,2 (Σ 2 ) will equal to z 2,1 (σ 2 ) on that equilibrium, as explained below (3.2) in §3. See also (5.2) in Theorem 5.4. We put the results above together to obtain bounds on z 1,2 (T 1 + T 2 ), which will serve as the new value of z 2,1 (0) in a continuation of the algorithm beyond time Σ 2 = T 1 + T 2 .
Proof. Apply (3.22) together with the lemmas above.
Next we consider the queue lengths at time T 1 + T 2 .
Finally, we obtain lower and upper bounds on the queue difference at time Lemma C.5 (universal bounds on the queue difference at time T 1 + T 2 ).

C.3. Proofs of Theorems 5.4 and 5.5.
To establish these results, we exploit an algorithm for efficiently computing a solution to the switching model in (3.10) and efficiently calculating the periodic equilibrium if it exists. The algorithm improves on the piecewise numerical solution of the piecewise ODE in (3.10) by exploiting the exact formulas in §3. We can recursively calculate the values at the switching times Σ i and then afterwards calculate the trajectory in between. By iterating, we can easily determine numerically if the solution converges to the stationary point or not. Numerical experience indicates that if the solution oscillates indefinitely, then it rapidly converges to a periodic equilibrium. In particular, the algorithm identifies the periodic equilibrium. However, more is required to provide a mathematical proof of existence, uniqueness and convergence.
(Observe that the labels of the processes in the second equality in (5.2) are reversed.) If indeed we can establish the closure property in (C.8), then we will have proved that there exists a periodic equilibrium.
If at some iteration we obtain an unreasonable value for x 3 , e.g., q i < 0, i = 1 or i = 2, or Δ ≤ κ, then the algorithm is stopped and we conclude that the solution corresponding to the initial condition we chose converges to x * 0 (due to Theorem 5.3). However, a pathological case has Δ > κ for all iterations, but Δ → κ. Let Δ * and T * 1 denote the limit of Δ and T 1 when the algorithm is iterated indefinitely. Observe that Δ * = κ implies T * 1 = 0, so that the corresponding limiting solution u * is necessarily a constant function.
This case is clearly a pathology, due to the uniqueness of the stationary point x * 0 . The following lemma ensures that such a pathological behavior of the algorithm is not possible. In particular, if at some iteration of the algorithm Δ is too close to κ, then this is also the last iteration.
Let Δ (k) be the value of Δ at the k th iteration of the algorithm. It follows from Lemma C.6 that

C.3.2. Proof of Theorem 5.5.
Proof. We first impose conditions on the model parameters and initial conditions so that the iterative algorithm in §C.3.1 mapping the initial state vector x 3 (0) ≡ (q 1 (0), q 2 (0), z 2,1 (0)) into the state vector x 3 (Σ 2 ) ≡ (q 1 (Σ 2 ), q 2 (Σ 2 ), z 1,2 (Σ 2 )) and then iterated again to map For that purpose, we introduce lower and upper bounds on the initial queue difference Δ(0), and assume that the smaller queue length q 1 (0) is bounded below as well as above by both consistent with Assumption 2. We can apply (3.16) to establish upper and lower bounds on T 1 , as shown in Corollary C.1. Those bounds are (C.11) where Δ L (0) and Δ U (0) come from (C.9) and Ψ U and Ψ L are upper bounds on Ψ in (3.13) and (3.14). We then impose an upper bound on τ by requiring τ < 1 − e −T L 1 , which imposes an upper bound on T 2 , i.e., If, in addition, then the two queue lengths both remain positive throughout the interval [0, T 1 + T 2 ] and q 1 (T 1 + T 2 ) ≥ q L 1 (T 1 + T 2 ) in (C.13), as shown in Lemma C.5. (If necessary, we redfine q L 1 (0) so that q L 1 (T 1 + T 2 ) ≥ q L 1 as well as (C.10).) Finally, if then we can iterate without limit, with Σ q = ∞. Condition (C.14) can be checked after the first iteration. However, sufficient conditions for (C.14) to hold without performing the first iteration are given in Lemma C.5. Numerical examples confirm that all these conditions can be satisfied, thus proving Theorem 5.5.
Corollary C.5 implies that for this two-dimensional process x 2 , the algorithm acts as a map from the space where κ > 0. The explicit solution to the ODE (3.10) over [0, Σ 4 ] and to Δ in (3.15) and (3.19) shows that this map is continuous. Hence, by Brouwer's fixed point theorem (e.g., Theorem 5.28 in [36]) there exists a fixed point to this map in the set S κ . That fixed point cannot be also a fixed point of (3.10), due to Theorem 5.2, i.e., due to the uniqueness of x * 0 . It follows that there exists a solution to (3.10) satisfying (C.8) which is not a constant. Necessarily, such a solution is a non-trivial periodic equilibrium.
so that T is strictly increasing in Δ.
In passing we note that the point (Δ 0 , T 0 ) ≡ (1 − μ + κ, 0) satisfies F (Δ 0 , T 0 ) = 0. However, this point is not in B, so there is no contradiction to the claim that there exists a function T (Δ) as in the proof of Lemma 6.1.  (6.17). The next lemma shows that, in this case, the solution will converge to x * 0 and will therefore cease to oscillate.

D.2. Proof of
Note that the lemma considers the full six-dimensional approximation x a , and not only the three-dimensional restriction x a 3 .
Proof. The initial condition has z a 1,2 (0) = z a 2,1 (0) = 0, so that z a 1,1 (0) = z a 2,2 (0) = 1. Hence, both pools serve only their own fluid queues, as long as q i (t) − q j (t) < κ, for both (i, j) = (1, 2) and (i, j) = (2, 1). Therefore (see (2.3))q 1 (t) =q 2 (t) = λ − 1 < 0, 0 ≤ t < Σ a q , so thatΔ a (t) = 0 on [0, Σ a q ), and no sharing can begin during that interval. At time Σ a q at least one of the queues hits 0, say q a i . If the other queue is still positive at that time, then it continues to decrease at the same constant rate as before. Since |q a i (Σ a q ) − q a j (Σ a q )| = q a j (Σ a q ) < κ, j = i, the difference between the two queues can never become larger than κ, so that the positive queue must also hit 0 at a finite time after Σ q . Therefore, letting t j denote the time at which queue j hits 0, i = 1, 2, we have It follows from (6.16) and Lemma D.1 that, if at the end of cycle we have −Δ a (Σ a 2 ) ≤ κ, then Σ a q < ∞ and x a (t) → x * 0 as t → ∞. In addition, Δ a (t) was just shown to reach 0 in finite time, and z a 1,2 and z a 2,1 each reach 0 in finite time by construction. Therefore, x a 3 (t) reaches (0, 0, 0) in finite time. Using similar arguments to those in Theorem 5.2, we can prove that An iterative algorithm for the approximating system. In the iterative algorithm each (half) cycle of x a corresponds to an iteration. We use a superscript (k) denote the k th iteration of the algorithm, and drop the superscript "a" for ease of notation, e.g., T 1 is the value of T a 1 in (6.9) in the first cycle of x a , or equivalently, the first iteration of the algorithm.
We start by choosing a value Δ (0) ≡ Δ(0) > κ and use it to numerically compute T (1) 1 via (6.9). The obtained value of T a 1 is then used to compute (6.15). We continue iterating this way until one of two things occur: either we see Δ (k) > κ for all k or else we observe Δ (k) ≤ κ for some k ≥ 1, in which case the algorithm is stopped.
Similar to Lemma C.6 and Corollary C.5 we can show that there exists a κ > 0 such that, if the algorithm can be iterated indefinitely, then Δ (k) > κ + a κ for all k ≥ 1. Of course, for the approximating system we can characterize a κ explicitly, and its value can serve as an approximation for the value of κ in Corollary C.5.
As was mentioned above, the approximating fluid model is a switching dynamical system with jumps. In this new setting, the approximating fluid solutions are elements in the space D ≡ D[0, ∞) of real-valued right-continuous functions with limits everywhere, which we endow with the Skorohod J 1 topology, which we denote by d t . Specifically, we consider the topological space (D, J 1 ), as in §3.3 of [48]. We have x k → x in (D, J 1 ) as k → ∞ if, for each t that is a continuity point of x, and || · || t is the uniform norm applied to functions on the finite interval [0, t]. Note that convergence in J 1 reduces to uniform convergence over bounded intervals whenever the limit function is continuous, as is the case for all the solutions of (3.10).
We generalize Definition 4.4 by replacing the uniform metric in (4.1) with the Skorohod metric. We then say that a solution x a spirals towards u a * if (4.1) holds for x a and u a * , but with the Skorohod J 1 metric replacing the uniform metric. In our application we will let λ k (Σ (k) 0 ) = Σ * (k) 0 . After making that small perturbation of the switching times, so that they are aligned, we have uniform convergence over [0, t].
The next lemma shows that spiraling of a solution x a to u a * follows from the first limit in (4.1) and convergence of x a to u a * at the four switching times. Its elementary proof is omitted.
Lemma D.4. Suppose that a periodic equilibrium u a * , having period T , exists for (6.17). If for some solution x a = u a * , then x a spirals towards u a * . In particular, We are now prepared to prove Theorem 6.2 (a) and (b).
Next, we introduce a mapping taking Δ(0) = Δ into a function of T a 1 , where T a 1 ≡ T a 1 (Δ) is the unique positive solution to (6.9); specifically, let For fixed μ ∈ (0, μ 1 ) and 0 < δ μ < Δ M μ − Δ m μ to be specified below, let Note that the end points of S μ depend on μ, and that μ S μ = [1 + κ, ∞), where the union is taken over all the values of μ ∈ (0, μ 1 ), for μ 1 in (D.3). In particular, the left end point of S μ is bounded from below whereas its right end point is unbounded as μ ↓ 0. Nevertheless, S μ is compact for any fixed μ ∈ (0, μ 1 ).
Proof. Observe that by (D.1) and (D.4) so that T (Δ) > Δ m μ whenever the following inequality holds To see that (D.7) does hold for all μ ≤ μ * , for some μ * as in the statement of the lemma, observe that ξ(Δ) decreases to 0 as Δ increases to ∞ and that the right-hand side of (D.7) is bounded from below by 1 − τ as μ decreases to 0. Since Δ m μ → 1 + κ and Δ M μ → ∞ as μ ↓ 0, we can find μ * small enough and Δ large enough such that, for all μ ≤ μ * and Δ m μ < Δ < Δ M μ , (D.7) holds for that Δ.
We use Lemma D.5 to obtain geometric rates of convergence to equilibrium in §H.

APPENDIX E: PROOF OF THE RESULTS IN SECTION 8
Proof of Theorem 8.1. For each n ≥ 1 consider the CTMC X n initialized with its stationary distribution, namely, X n (0) d = X n (∞), n ≥ 1. The sequence X n (∞) is tight in R 6 because each sequence of elements in the vectorX n is tight in R. This follows immediately forZ n i,j (0), which are bounded from below by 0 and from above by some c > 1, i, j = 1, 2. Tightness ofQ n 1 (0) andQ n 2 (0) follows from the infinite-server stochasticorder bound on the queues in Lemma A.5 in [31]. In particular,Q n i ≤ stQ n i,bd pathwise, where Q n i,bd is the number-in-system process in an M/M/∞ queue with arrival rate λ n i and service rate θ. See also the proof of Theorem 8.2 where a similar bound is constructed.
By Theorem I.1, the sequence of processes {X n : n ≥ 1} is tight in D 6 , and we can therefore consider a converging subsequence of processes, whose initial conditionsX n (0) d =X n (∞) also converge to some limit Since the initial condition is distributed according to the stationary distribution ofX n , each of the CTMC's in the prelimit is stationary, and it follows that any limit ofX n must also be stationary process. In particular, First observe that, ifZ 1,2 (0) =Z 2,1 (0) = 0 andQ i (0) < κ w.p.1, then the two pools and their associated queues operate as two independent underloaded M/M/m i systems and thereforeX(0) = x * 0 w.p.1, implying thatX n (∞) ⇒ x * 0 . It follows from the routing rules of FQR-ART that for any sample path for which bothZ 1,2 (0) andZ 2,1 (0) are strictly positive, at least one of these processes must be strictly decreasing over some interval (0, ), > 0, contradicting the stationarity ofX. Therefore, ifZ i,j (0) > 0, thenZ j,i (0) = 0, i = j w.p.1.
It follows that, ifQ i (0) > 0, thenQ i must be strictly decreasing on some right neighborhood of 0, becauseZ i,i (0) = m i . Hence,Q i (0) = 0. Then the X model is asymptotically two independent M/M/n + M systems with service rate equals to 1 and arrival rate λ n < n. Paralleling (C.1), we conclude thatX(0) = x * 0 w.p.1, so thatX n (∞) ⇒ x * 0 as n → ∞. The statement of the theorem follows because the converging subsequence we considered was arbitrary.
Proof of Theorem 8.2. Consider the queue process Q n bd := {Q n bd (t) : t ≥ 0} in an M/M/∞ system that has arrival rate 2λ n and service rate θ. Then Q n bd is distributed the same as the sum of the two queues in the X system in which the service process is "shut off" so that all the output from the two queues is due to abandonment. Specifically, we construct the X model and the M/M/∞ system on the same probability space by giving both the same initial condition and the same Poisson arrival processes (exploiting the fact that a superposition of two independent Poisson processes is a Poisson process with the sum of the rates). If Q n Σ (t) = Q n bd (t) and there is an abandonment from Q n Σ , then we can generate an abandonment from Q n bd ; see, e.g., [47]. Therefore, Q n bd is never below Q n Σ . It is well-known that the Markovian infinite-server queue is exponentially ergodic, see, e.g., Proposition 7.2 in [35]. However, we need to show that this implies that the same holds for X n . We thus use the exponential drift condition on the generator of X n whose state space is . For x ∈ Ξ, let V (x) := (1 + γ) x 1 +x 2 , for some γ > 0 which is characterized below. For Q n bd we consider the corresponding function U (q) = (1 + γ) q , q = x 1 + x 2 . Then V : is a norm-like Lyapunov function for the generator of Q n bd . Due to the sample-path stochastic order relation between Q n Σ and Q n bd , we have QV ≤ Q bd U , where Q denotes the generator matrix of X n and Q bd denotes the generator matrix of Q n bd . Now, if we show that, for some compact set C ⊂ Ξ, the following exponential drift condition holds for strictly positive constants c and d and γ, then the statement of the theorem will follow from Theorem 2.5 in [22], because QV ≤ Q bd U.
To that end, we recall that the off-diagonal components of Q bd are given by q i,i+1 = 2λ n , q i,i−1 = kθ, and q i,j = 0 for |i − j| > 1, i≥ 1.
Remark E.1. In general, the exponential drift condition in the above proof should hold for a "small set" C; see, e.g., [22]. In a discrete state space, as is the case here, any compact set is small.

APPENDIX F: CHECKING FOR CONGESTION COLLAPSE
When there is no abandonment, we cannot expect that the queues in an oscillating system will remain finite as time increases. Indeed, if then the queues are not rate stable, i.e., the long-run average input rate λ is larger than the long-run average throughput rate, so that the queues will increase without bound. We now show how to estimate whether (F.1) holds.
In particular, we now show that the simplified heuristic approximation in §6.4 facilitates verification of (F.1) for a system that is known to converge to the unique periodic equilibrium. Let Σ * i and T * i denote the switching and holding times of the periodic equilibrium, 1 ≤ i ≤ 4. Without loss of generality, consider pool 1. (Due to the symmetry, it is sufficient to check whether (F.1) holds for one of the pools.) Then, for where the first equality follows from the asymptotic periodicity of the solution, and the second equality follows from the symmetry of the model. Recall also that z a 2,1 ≡ 0, so that z a 1,1 = 1 over [Σ * 2 , Σ * 4 ], which gives the last term in the square brackets. We can use the last value of ξ (k) obtained from the algorithm above to serve as our approximation for ξ * ≡ ξ(Δ a * ), for ξ(·) in (D.7), together with (6.7) and (6.11) to approximate L.
Using the fact that Σ * with the approximation following by, first noting that Σ * 2 = T * 1 + T * 2 and, second, replacing T * 1 and T * 2 with (6.18) and (6.19), respectively. Note that, unlike the original system (3.10), in the approximating system we can first compute the periodic equilibrium, when it exists, via the iterative algorithm, and then check whether the system goes through congestion collapse. The heuristic approximation given here facilitates this inspection, via the computation in (F.2). More specifically, if a periodic equilibrium of (6.17) is found, and if this periodic equilibrium is associate with congestion collapse, then the queues necessarily increase to infinity as time increases, provided that x a 3 converges to u a * before either queue hits 0. We can then make sure that Σ a q = ∞ simply by initializing the two queues of the sixdimensional vector x a (0) at sufficiently large values, so that either queue does not reach state 0 during the first few cycles (i.e., before x a 3 is sufficiently close to u a * ). Here, congestion collapse means that the queues will have an increasing trend in the sense that each queue will be larger at the beginning of a cycle than its value at the beginning of the previous cycle.
On the other hand, if the periodic equilibrium is not associated with congestion collapse, i.e., the total average service rate during the periodic cycle is smaller than the arrival rate, then the queues will have a decreasing trend, so that they must eventually reach 0, regardless of their initial condition. We conclude that there is no need to actually determine the exact values of the initial queue lengths, or to check wether Σ a q = ∞, but only to check wether a periodic equilibrium is associated with congestion collapse. We start by choosing a value Δ(0) such that ξ (1) ≡ ξ in (D.7) is sufficiently small (e.g., ξ (1) < 0.05) and T 2 ) in order to compute ξ (2) via (D.7). As before, we continue iterating until we see convergence to a legitimate value, i.e., Δ (k) converges to some Δ a * > κ and ξ (k) converges to a value ξ * < 1, or we obtain an illegitimate value at some iteration, i.e., Δ (k) < κ or ξ (k) > 1 for some k ≥ 1. In the latter case, the algorithm is stopped. The latter case indicates that the solution x a converges to x * 0 . If the initial condition for the algorithm is extreme, i.e., Δ (0) is taken to be very large, then stopping the algorithm suggests that a periodic equilibrium does not exist.

APPENDIX H: STRONGER NOTIONS OF CONVERGENCE AND STABILITY
In Lemma D.5 we showed that for any κ and τ we can find μ * , such that the iterative algorithm for the approximating system acts as a map from the space S μ in (D.5) into itself, thus ensuring that the algorithm can be iterated indefinitely. We now use Lemma D.5 and its proof to show that the iterative algorithm in §D.2.2 converges geometrically fast to the point Δ a * on the periodic equilibrium, when u a * ∈ S μ . The fast monotone convergence to equilibrium is seen also in the numerical experiments in §7.
Note that the statement of the theorem implies that there exists a unique asymptotically-stable periodic equilibrium in S μ , as we already know.
By Lemma D.4, the three-dimensional solution x a 3 to (6.17) "spirals" toward u a * . Using Theorem H.1, we next prove a stronger result, stating that the rate of convergence of an oscillating solution to the approximating system (in continuous time) is exponential.
Let P a * denote the image of the periodic equilibrium u a * ; where S a in §6 is the state space of the approximating system. Recall that the convergence of x a 3 to u a * holds under the Skorohod metric defined in §D. 2  Proof. It follows from Lemma D.4 and Theorem H.1 that, for all k ≥ 1 and t > Σ (k) * , Since x a 3 and u a * are uniformly bounded from above by Δ M μ in (D.2), the upper bound in (D.1) together with (6.13) give 0 < 2R, for all k ≥ 1. In particular, the length of any full cycle of any possible solution, including the periodic equilibrium, is smaller than 2R. Since x a 3 (Σ (0) 0 ) − u a * (0) ≤ δ μ , for δ μ in (D.8), the statement of the theorem follows by taking ϑ ≡ δ μ /(1 − ρ) and β ≡ − log(ρ)/2R.
In ending we remark that the exponential bound on the rate of convergence to u a * should in general depend on the initial condition, as seen in the proof of Theorem H.2. In particular, exponential stability should in general be defined via x a 3 (t) − u a * (t) < ϑ x a 3 (0) − u a * (0) e −βt for β, ϑ > 0. However, we obtain the bound in the statement of the theorem since all the solutions we consider have values in S μ , and are therefore uniformly bounded.

APPENDIX I: THE FLUID MODEL AS A LIMIT
The focus of the paper is on a fluid approximation for the stochastic X model under FQR-ART. In this section we prove that the switching fluid model arises as a many-server heavy-traffic fluid limit when a fluid-scaled sequence of these stochastic systems is considered. The proof of the functional weak law of large numbers (FWLLN) is given in §I.2, but we first expand on the stochastic model and many-server scaling in §I.1. We emphasize that, unlike the fluid limit proved in [31], the proof of the FWLLN here is standard because it does not include the stochastic averaging principle.

I.1. More on the stochastic model and heavy-traffic scaling.
We now briefly expand on the review of the stochastic model, which was described in §2, and the heavy-traffic scalings. We consider a Markovian model, i.e., we assume that both arrival processes are independent (timehomogeneous) Poisson processes, and that service times, as well as patience times of customers waiting in queue, are exponentially distributed. Specifically, we assume that the class-i arrival rate in system n is λ n i , a class-i customer receives an exponentially-distributed service time in pool j with mean 1/μ i,j , and a class-i customer has exponentially distributed patience with mean 1/θ i , i, j = 1, 2. Customers who do not enter service before running out of patience will abandon the queue. (There is no abandonment from service.) All random variables are independent of each other and of the two arrival processes. Since FQR-ART is a Markovian control, in that the routing and scheduling decisions are a function of the state of the system and are independent of its history, it is easy to see that X n in (2.2) is a six-dimensional time-homogeneous CTMC.
Due to abandonment of waiting customers, defining overloads is not entirely straightforward because a service pool can be considered normally loaded even if the traffic intensity to that pool is larger than 1. Our definition of overloads is taken from an asymptotic perspective. In particular, pool i is considered overloaded if ρ i > 1, where On the other hand, we can have ρ i ≤ 1 with class i overloaded because there are many shared customers in pool i. This latter type of overload may be intentional, if sharing is deemed beneficial and is employed to alleviate an overload in the other class, or it may be caused by a harmful execution of the control, namely it is due to congestion collapse. For any fixed n we must take k n 1,2 to be sufficiently large so as to ensure that sharing begins only when the corresponding pool is genuinely overloaded due to a high arrival rate. In addition, τ n 1,2 should be sufficiently small to ensure that there is only a negligible amount of simultaneous twoway sharing. (Simultaneous sharing can occur because the direction of overload switches.) On the other hand, τ n 1,2 must be sufficiently large to be hit in a reasonable time. We refer to § §2.2 and 3.2 in [32] for elaborations on the reasonings behind the way we choose the thresholds. For our purposes here we simply enforce the following scaling assumption: Assumption 3 (scaling parameters). For strictly positive numbers m i , where x is a deterministic element of C 6 and is the unique solution to the switching ODEẋ = f σ (x), for f σ in (3.10). Moreover, with +∞ being a possible value as a limit of these stopping times.
Note that, if x(0) satisfies Assumption 2, then necessarily Σ q > 0. If, in addition, the fluid model is in the invariant set O, then the convergence can be extended in an obvious way to any compact interval of [0, ∞) because Σ q ≡ ∞. Otherwise, Σ q < ∞ and since λ < 1, class-i fluid will stop flowing to pool j, i = j. Since P (|Σ n q − Σ q | > ) → 0 as n → ∞ (recall that convergence in distribution is equivalent to convergence in probability when the limit is deterministic), this show that sharing of customers will end at approximately time Σ q in a large stochastic system.
The proof of Theorem I.1 follows standard pre-compactness arguments, combined with applications of the continuous-mapping theorem. We again refer to [48] for the general framework. We therefore start by representing the sample paths of X n in terms of independent Poisson processes; see [27].
To simplify notation, let queues) the abandonment rate is strictly smaller than the service rate of shared customers (and both are small). For a given stochastic system there is freedom in choosing how to model the scaling of the thresholds. It is important that this freedom leads to ambiguities that must be accounted for. For example, if for n = 100, m n 1 = m n 2 = 100 and we take k n 1,2 = k n 2,1 = 10, then we can think of the activation thresholds as being equal to 0.1n or √ n. From the fluid perspective, there are important difference between the two scalings. If the latter holds, then κ = 0 so that S 1,2 = S 2,1 and the fluid model can cross from S − 1,2 to S + 2,1 , and vice versa, in zero time. In this case, chattering and oscillations, as defined above, coincide, and are clearly more likely to occur. In particular, this suggests that oscillations can occur in the stochastic system even if a fluid approximation with κ > 0 does not oscillate at all, because a more appropriate approximation for the given system would be to assume that κ = 0; see Remark 5.1 below.
Implications to the release thresholds. There are important inconsistencies regarding the rescaling of the release thresholds. For example, in a system having 100 agents in each pool and arrival rate λ n = 98, we may take τ n i,j = 3. With these parameters, and regardless of the value of μ, pool j is clearly not overloaded at time t if Z n i,j (t) ≤ τ n , and the fluctuations of the queue must therefore be considered to be of order o(n). However, the fluctuations of the queue will often be larger than τ n , which is considered to be asymptotically positive under fluid scaling. Specifically, whereas Q n T /τ n ⇒ 0 as n → ∞, for all T > 0, where Q n T ≡ sup 0≤t≤T Q n (t), we have Q n T >> τ n for any reasonable value of n (which is not unrealistically large) and over intervals [0, T ], with T = O(1) (e.g, T ≈ 1/μ 1,1 .) It follows that, relative to the stochastic fluctuations, it is appropriate to think of the release thresholds as being o(n) (even O(1)!). On the other hand, from a fluid-limit perspective, τ n must satisfy Assumption 3, namely be strictly positive asymptotically in fluid scale, since otherwiseZ n i,j := Z n i,j /n will not be hit this threshold in finite time when it is strictly decreasing; see §3.2 in [32].
We can think of the release thresholds as having a duality property in the fluid model: When z i,j ≤ τ their affect on the system's performance is negligible, and we can consider them to be 0, i.e., τ n i,j = o(n). Whenever z i,j > τ and is decreasing, we must think of τ as being strictly positive, so that τ n is as in Assumption 3, to ensure that z i,j can hit τ in finite time. We take advantage of this duality property when constructing an approximation for the fluid model in §6.4.