A Particle System with Mean-Field Interaction: Large-Scale Limit of Stationary Distributions

Published Online:https://doi.org/10.1287/stsy.2023.0108

Abstract

We consider a system consisting of n particles, moving forward in jumps on the real line. System state is the empirical distribution of particle locations. Each particle “jumps forward” at some time points, with the instantaneous rate of jumps given by a decreasing function of the particle’s location quantile within the current state (empirical distribution). Previous work on this model established, under certain conditions, the convergence, as n, of the system random dynamics to that of a deterministic mean-field model (MFM), which is a solution to an integro-differential equation. Another line of previous work established the existence of MFMs that are traveling waves, as well as the attraction of MFM trajectories to traveling waves. The main results of this paper are: (a) We prove that, as n, the stationary distributions of (recentered) states concentrate on a (recentered) traveling wave; (b) we obtain a uniform across n moment bound on the stationary distributions of (recentered) states; and (c) we prove a convergence-to-MFM result, which is substantially more general than that in previous work. Results (b) and (c) serve as “ingredients” of the proof of (a), but also are of independent interest.

1. Introduction

We consider a system consisting of n particles, moving forward on the real line. The particles move in jumps. The system state at a given time is the current empirical distribution of particle locations. Each particle gets “urges to jump” as an independent Poisson process of constant rate. However, a particle getting a jump urge actually jumps with the probability given by a decreasing function of the particle’s location quantile within the current state (i.e., empirical distribution); hence, this is a mean-field type of particles’ interaction with each other. When a particle does jump, the jump size is independent, distributed as a random variable Z > 0. We are interested in the system behavior when n is large.

This model was introduced in Greenberg et al. (1995, 1996) as an idealized model of distributed parallel simulation. In this case, n particles represent n processors (“sites”) simulating different part of some large system, and a particle location is the current “local simulation time” of the corresponding processor. The following types of questions are of interest, as n becomes large: how the local times of the processors progress over time; whether local times “stay closely together”; whether the evolution of the empirical distribution of local times becomes that of a traveling wave; etc. This model, and similar models, are motivated by other applications as well (including recent applications, such as blockchains), where, roughly speaking, a distributed synchronization of a large number of sites is of interest; cf. Manita and Shcherbakov (2005), Malyshev and Manita (2006), Manita (2006, 2009, 2014), Malyshkin (2006), Balazs et al. (2014), and references therein for examples of synchronization models.

There are two lines of work on the particle system described above. The first one is in Greenberg et al. (1995), where it is shown that, under certain additional conditions, as n, the system random dynamics converges to that of a deterministic mean-field model (MFM), which is a solution to an integro-differential equation. There are several additional assumptions made in Greenberg et al. (1995), one of which is especially restrictive, in that the proof technique crucially relies on it—a particle jump probability depends on the current locations of K other particles chosen uniformly at random, where K > 0 is a fixed parameter, same for all n. We will refer to this additional assumption as the finite-dependence assumption, and it substantially restricts the more general model of this paper.

The second line of work is represented by Greenberg et al. (1996) and Stolyar (2023), where formally defined mean-field models (solutions to an integro-differential equation) are studied. The results of Greenberg et al. (1996) prove, in particular, that if an MFM that is a traveling wave exist, then “typically,” the traveling wave is unique, and MFM trajectories are attracted to it as time increases; the question of a traveling-wave existence under general assumptions was left open in Greenberg et al. (1996). Stolyar (2023) proves the existence of a traveling wave, under very general assumptions. (We also note that, for some MFMs that are different from the one in this paper, the existence and explicit forms of the traveling waves are obtained in Balazs et al. 2014, Hongler 2015, and Hongler and Filliger 2019, in some special cases of the jump-size distribution.)

The combination of the results of Greenberg et al. (1995, 1996) and Stolyar (2023) strongly suggests the following asymptotic property of the stationary distributions: As n, the stationary distributions of (recentered) states concentrate on a (recentered) traveling wave. This property is stated as conjecture 7.1 in Stolyar (2023, section 7).

Our main results are:

  • (a) We prove (Theorem 1) that, as n, the stationary distributions of (recentered) states concentrate on a (recentered) traveling wave. (This proves conjecture 7.1 in Stolyar (2023, section 7), under a slightly stronger assumption on the jump size, namely, EZ2+χ<,χ>0, as opposed to EZ2<.)

  • (b) As a key “ingredient” of the proof of (a), we obtain (Theorem 2) a uniform across n moment bound on the stationary distributions of (recentered) states. This result is also of independent interest.

  • (c) We prove (Theorem 3) the convergence-to-MFM result in our general setting, without the additional finite-dependence assumption. (This substantially generalizes the result of Greenberg et al. 1995. The proof largely follows the approach used in Balazs et al. 2014, for a different model. The approach is more generic than that in Greenberg et al. 1995; in particular, it does not rely on the finite-dependence assumption.) This result is another ingredient of the proof of (a), but is also of independent interest.

The proof of (a) relies on (b) and (c) and on the results in Greenberg et al. (1996) and Stolyar (2023) on the existence/uniqueness of—and attraction to—traveling waves.

1.1. Outline of the Rest of the Paper

In Section 2, we formally define the model and informally state the main results. Section 3 gives some basic notation, definitions, and conventions used throughout the paper. In Section 4, we state our main results, Theorems 1, 2, and 3, formally. Sections 5, 6, and 7 contain the proofs of Theorems 2, 3, and 1, respectively. A brief discussion of our results is in Section 8.

2. The Model and Main Results

The particle system model is as follows. There are n particles, moving in the positive direction (“right”) on the real axis R. Each particle moves in jumps, as follows. For each particle, there is an independent Poisson process of rate μ>0 of “jump urges.” When a particle gets an urge to jump, it actually jumps, to the right, with probability ηn(ν), where ν is its quantile in the current empirical distribution of the particles’ locations; that is, ν=/n when the particle location is -th from the left. With complementary probability 1ηn(ν), the particle does not jump. To have the model well-defined, assume that quantile-ties between colocated particles are broken uniformly at random. We adopt the convention that function ηn(ν) is defined for a continuous argument ν[0,1], by assuming that it is constant in each interval ((1)/n,/n] for =1,,n and ηn(0)=1. Assume that, for each n, function ηn(ν), 0ν1, is nonincreasing and that, as n, it uniformly converges to a continuous, strictly decreasing function η(ν), 0ν1, with η(0)=1,η(1)=0. The jump sizes, when a particle does jump, are given by independent and identically distributed (i.i.d.) nonnegative random variables (r.v.) with cumulative distribution function (CDF) J(y),y0; we denote by J¯(y)=1J(y) the complementary CDF; a generic jump size is given by the r.v. Z0. Without loss of generality, we can and will assume that Z > 0—that is, J(0)=0. We will use notation

m()0ydJ(y)=EZ,  0,(1)
for the -th moment of a jump size. (So, m(1)=EZ is the mean jump size.)

For some (not all) of our results, we will need the following two additional conditions on the jump-size distribution:

m(2+χ)<  for some χ>0,(2)
and
J(·)isabsolutelycontinuouswithdensityJ(y),boundedawayfrom0oncompactsubsetsofR+.(3)

Note that, without loss of generality (WLOG), we can and do assume μ = 1; otherwise, we can achieve this condition by rescaling time. Also, if m(1)=EZ<, we can and do assume, WLOG, that m(1)=1; otherwise, m(1)=1 is achieved by rescaling space.

Let fn(t)=(fxn(t), xR) be the (random) empirical distribution of the particle locations at time t; namely, fxn(t) is the fraction of particles located in (,x] at time t.

As n, it is very intuitive that fxn(t) converges (in appropriate sense, under appropriate conditions) to a deterministic function fx(t), such that f(t)=(fx(t), xR) is a distribution function for each t, and the following equation holds:

tfx(t)=xdyf(y,t)η(fy(t))J¯(xy),(4)
where dy means the differential in y (and recall that μ = 1, WLOG). We call a function fx(t) satisfying (4) a mean-field model. (The formal meaning of (4) and the definition of a mean-field model will be given in Definition 1 in Section 6.3.) The intuition for (4) is as follows. For each t, the distribution f(t)=(fx(t), xR) approximates the distribution of particles fn(t) when n is large. Because particles move right, fx(t) is nonincreasing in t for each x. So, the partial derivative (/t)fx(t) is nonpositive, and it should be equal to the right-hand side (RHS) of (4), which gives the instantaneous rate (scaled by 1/n and taken with minus sign) at which particles jump over point x at time t.

It is known (Greenberg et al. 1996, Stolyar 2023) that, as long as m(1)< (or, m(1)=1, WLOG), for any mean-field model, the speed at which the mean xdxfx(t) of the distribution f(t) moves right, is equal to v=m(1)μ01η(ν)dν=01η(ν)dν. A distribution function ϕ=(ϕx, xR) is called a traveling wave shape (TWS) if fx(t)=ϕxvt is a mean-field model. By substituting into (4), we see that any TWS ϕ must satisfy equation

vϕx=xϕyη(ϕy)J¯(xy)dy.(5)

It is known (Stolyar 2023) that a TWS ϕ exists as long as m(2)< (no other assumption on the jump-size distribution is required), and it is unique (up to a shift) if, in addition, (3) holds.

We now informally state the main results of this paper. (The formal results will be given later, in Section 4, after introducing more notation.) Let f°n(t) denote the distribution fn(t), recentered so that the distribution mean is at 0.

Informal Statement of Theorem 1

(in Section 4). Assume (2) and (3). Then, as n, the stationary distribution of the process f°n(·) converges to the (Dirac) distribution concentrated on the unique TWS ϕ, centered so that its mean is at zero. Moreover, ϕ has finite absolute (1+χ)-th moment: |y|1+χdϕy<.

Informal Statement of Theorem 2

(in Section 4). Assume (2) and (3). Then, for all sufficiently large n, the Markov process f°n(·) is stable (positive Harris recurrent), and its stationary distribution is such that the expected absolute (1+χ)-th moment of f°n(·) is bounded uniformly in n:

E|y|1+χdyfyn°(t)C¯.

Informal Statement of Theorem 3

(in Section 4). Assume m(1)< (or, m(1)=1, WLOG) and (3). Suppose that, as n, the initial conditions fn(0) converge to a deterministic proper distribution f(0). (Nothing else about f(0) is assumed, not even the existence of a finite mean.) Then, the process fn(·) converges to the unique mean-field model f(·) with initial condition f(0).

3. Basic Notation

The set of real numbers is denoted by R and is viewed as the usual Euclidean space. As a measurable space, R is endowed with Borel σ-algebra. For scalar functions h(x) of a real x: h1=x|h(x)|dx is L1-norm; h(x) is called c-Lipschitz if it is Lipschitz with constant c0. Let Cb be the set of continuous bounded functions on R, which are constant outside a closed interval (one constant value to the “left” of it and possibly another constant value to the right of it.)

For functions h(x) of a real x: h(x+) and h(x) are the right and left limits; a function h(x) is RCLL if it is right-continuous and has left limits at each x.

A function h of x may be written as either h(x) or hx. Notation dxh(x,t) for a multivariate function h(x, t), where xR, means the differential in x.

Denote by M the set of scalar RCLL nondecreasing functions f=(f(x), xR), which are (proper) probability distribution functions—that is, such that f(x)[0,1],limxf(x)=0 and limxf(x)=1. For elements fM, we use the terms distribution function and distribution interchangeably. Space M is endowed with the Levy-Prohorov metric (cf. Ethier and Kurtz 1986) and the corresponding topology of weak convergence (which is equivalent to the convergence at every point of continuity of the limit); the weak convergence in M is denoted w. Note that, for f,ϕM, the L1-norm of their difference, fϕ1, is equal to the Wasserstein W1-distance between the corresponding two distributions. The inverse (ν-th quantile) of fM is f1(ν)inf{y|f(y)ν},ν[0,1]; γ1(1)= when f(y) < 1 for all y.

Unless explicitly specified otherwise, we use the following conventions regarding random elements and random processes. A measurable space is considered equipped with a Borel σ-algebra, induced by the metric, which is clear from the context. A random process Y(t), t0, always takes values in a complete separable metric space (clear from the context) and has RCLL sample paths. For a random process Y(t), t0, we denote by Y() the random value of Y(t) in a stationary regime (which will be clear from the context). Symbol signifies convergence of random elements in distribution; P means convergence in probability. W.p.1 or a.s. means with probability one. For a condition/event A, I{A}=1 if A holds, and I{A}=0 otherwise.

Space D([0,),R) (respectively (resp.) D([0,),M)) is the Skorohod space of RCLL functions on [0,) taking values in R (resp. M), with the corresponding Skorohod (J1) metric and topology (cf. Ethier and Kurtz 1986); J1 denotes the convergence in these spaces.

For a distribution fM and scalar function h(x),xR, fhRh(x)dfx.

For scalar functions h(x),xX, with some domain X,h=supxX|h(x)| is the sup-norm. When Gk,G are operators mapping the space of such functions into itself, limGkh=Gh and GkhGh mean the uniform convergence: GkhGh0.

Suppose we have a Markov process with state space X and transition function Pt(x,H), t0. A measurable set XX is called small (cf. Bramson 2008) if there exist constants T > 0 and δ>0 and probability distribution α(·) on X, such that for any xX and any measurable HX,PT(x,H)δα(H). Pt, as an operator, is Pth(x)yPt(x,dy)h(y), where h is a scalar function with domain X; I=P0 is the identity operator. The process (infinitesimal) generator B is

Bhlimt0(1/t)[PtI]h.

Function h is within the domain of the generator B if Bh is well-defined. We say that a Markov process is stable if it is positive Harris recurrent (cf. Bramson 2008); if it is, it has unique stationary distribution.

For real numbers a and b, we use notations: sign(a)I(a0)I(a<0), abmin{a,b}. RHS and LHS mean right-hand side and left-hand side, respectively. Abbreviation w.r.t. means with respect to; a.e. means almost everywhere w.r.t. Lebesgue measure.

4. Formal Statements of Main Results

4.1. Uniform Moment Bound for Stationary Distributions. Limit of Stationary Distributions

For an element fM, denote by f¯ the mean of the corresponding distribution,

f¯=xdfx=0(1fx)dx0fxdx,
with the usual convention that the mean is well-defined and finite when both integrals in the RHS are finite. Denote
M°={fM|f¯=0}.

When f¯ is finite, denote by f°=(fx°, xR)M° the centered version of f, namely,

fx°=fx+f¯, xR.

If we denote by M(n)M the state space of the process fn(·), then M°(n)=M(n)M° is the state space of f°n(·). If we use notation

Φ(f)=|x|dfx,
for the -th absolute moment of f, then we obviously have
Φ(f)<,   fM(n), n, 0.

Theorem 1.

Suppose Conditions (2) and (3) hold. Then, as n,

f°n()ϕ,(6)
where ϕ is the unique TWS with ϕ¯=0. Moreover, ϕ is such that
Φ1+χ(ϕ)<,(7)

and a stronger form of convergence (6) holds:

f°n()ϕ10.(8)

The proof of Theorem 1 is in Section 7.

Theorem 2.

Suppose Conditions (2) and (3) hold. Then, there exist C¯>0 and n¯ such that, for all nn¯, the Markov process f°n(·) is stable, and we have

EΦ1+χ(f°n())C¯.(9)

The proof of Theorem 2 is in Section 5.

4.2. Transient Behavior: Convergence to a Mean-Field Model

Essentially all our asymptotic results on the transient behavior of the processes are for the noncentered processes fn(·). The result for the centered processes (Theorem 3(ii)) is obtained essentially as a corollary.

Denote by L(n) the generator of the process fn(·). For any hCb, function fnh of fn is within the domain of L(n) (where we use the fact each function in Cb is constant outside a closed interval), and

L(n)[fnh]=01dν[fn]1(ν)ηn([fn]1(ν))E[h([fn]1(ν)+Z)h([fn]1(ν))],
where the expectation is over the distribution of the random jump size Z. We also formally define the “limit” of L(n) as
L[fh]=01dνf1(ν)η(f1(ν))E[h(f1(ν)+Z)h(f1(ν))].

Theorem 3.

Suppose fn(0)wf(0), where {fn(0)} is the deterministic sequence of fn(0)M(n), and f(0)M. (Note that we do not assume that f(0) has well-defined mean f¯(0).) Assume that conditions m(1)=EZ< (or m(1)=EZ=1, WLOG) and (3) hold. Then, we have:

  • (i) fn(·)f(·) in D([0,),M), where f(·)D([0,),M) is deterministic, uniquely determined by f(0). Moreover, f(·) is a continuous element of D([0,),M), which satisfies

    f(t)hf(0)h0tLf(s)hds=0,  hCb, t0.(10)

    The dependence of f(·) (as an element of space D([0,),M) with J1-convergence topology) on f(0) (as an element of M with weak convergence topology) is continuous.

  • (ii) If, in addition, f¯(0)=0, then f¯(t)=vt, and, consequently, f°(·) is a continuous element of D([0,),M), uniquely determined by f°(0)=f(0)M°, with f°(t)M° for all t0. The dependence of f°(·) on f°(0) is continuous. And if, in addition, f¯n(0)=0 for all n, then f°n(·)f°(·) in D([0,),M).

The proof of Theorem 3 is in Section 6. Also in Section 6, we show (Theorem 8) that solutions to (10) are exactly the mean-field models (Definition 1). We note that many of the supplementary results in Section 6, which may be of independent interest, require assumptions on the jump-size distribution that are much weaker than conditions m(1)=EZ< and (3). In particular, some of those results assume nothing about the jump-size distribution besides it being a proper distribution.

5. Proof of Theorem 2

5.1. Equivalent View of Process f°n(·)

State f°n(t) can be equivalently described as wn(t)=(w1(t),w2(t),,wn(t))Rn, where w1(t),w2(t),,wn(t) are the locations of the n particles w.r.t. the mean f¯n(t), listed in a nondecreasing order. (So, the average (1/n)iwi(t)=0 at all times.) From now on, for each f°nM°(n), we will consider the corresponding vector wn=(w1,w2,,wn), and vice versa. Any function of f°nM°(n) may be expressed via the corresponding wn, and vice versa. In particular,

Φ(f°n(t))=1ni=1n|wi(t)|.

Note that the topology on M°(n), induced by the (weak convergence) topology on M, is equivalent to the usual topology of component-wise convergence of the corresponding vectors wn.

The evolution of wn(t) is as follows. Between the times of the jump urges, wn(t) remains constant. At a time t of a jump urge, the following occurs. Let κi(t) be the actual jumps size of particle i, in the system without recentering, upon this urge; κi(t)0 and can be nonzero for at most one particle. Then, in the recentered system, particle i jump size (i.e., the increment of wi(t)) at t is ζi=κi(t)sκs(t)/n (which may be positive or negative). After the jumps (if any) at t occur, the particles indices i are changed, if necessary, to keep wi(t) nondecreasing in i.

5.2. Informal Intuition for the Proof

The proof of stability, in Subsection 5.3, uses fluid limit technique and is fairly straightforward. Let us discuss the intuition for the proof of the bound (9) in Subsection 5.4.

At a very high level, the bound (9) is due to the fundamental property of the system, which can be called the “egalitarian trend”: In a recentered system, the particles at high quantiles (large i) will have a negative drift, whereas particles at low quantiles (small i) will have a positive drift, thus preventing the centered empirical distribution f°n from “spreading out.”

To obtain the bound on the expected (1+χ)-th moment of f°n, we need the finite (2+χ)-th moment on a jump size. Informally speaking, we use Φ2+χ(f°n) as a Lyapunov function. If B(n) is the generator of process f°n(·), then, “generally speaking,” we have EB(n)Φ2+χ(f°n())=0; in other words, the expected drift of Φ2+χ(f°n(t)) in steady-state is zero. Function

G1+χ(n)(f°n)=(2+χ)isign(wi)|wi|1+χEζi(f°n),  f°nM°(n),
where ζi(f°n) is the random jump size of particle i (in recentered system) upon a jump urge when the state is f°n, can be thought of as the “first-order approximation of the generator B(n), applied to function Φ2+χ(f°n)”; note that the derivative |wi2+χ|=(2+χ)sign(wi)wi1+χ. We can show that, when Φ1+χ(f°n) is large,
G1+χ(n)(f°n)ϵΦ1+χ(f°n),
for some constant ϵ>0; this is where we use the egalitarian trend property, which ensures that Eζi(f°n) is negative (resp., positive) for particles at high (resp., low) quantiles. From here, we can obtain, informally speaking,
G1+χ(n)(f°n)ϵΦ1+χ(f°n)+K,
which holds for some K > 0 and all f°n. Taking into account the fact that G1+χ(n)(f°n) is not B(n), but only its first-order approximation, and doing the corresponding estimates, we obtain, informally speaking,
B(n)Φ2+χ(f°n)(ϵ/2)Φ1+χ(f°n)+K.

Taking expectation w.r.t. f°n(), we obtain, informally speaking,

0(ϵ/2)EΦ1+χ(f°n())+K,
which yields (9).

In the actual proof, instead of Φ2+χ(f°n), we use its truncated version Φ2+χ(C1)(f°n)=Φ2+χ(f°n)C1 as the Lyapunov function because the latter is certainly within the domain of generator B(n). And then let C1.

We note that this “program” for proving a property of the type of (9) is likely applicable to other models having the egalitarian trend property, though the technical details may differ.

5.3. Stability

The stability (positive Harris recurrence) is easily established using the fluid-limit technique (Rybko and Stolyar 1992, Dai 1995, Stolyar 1995, Bramson 2008). We note that the stability proof only uses Conditions (3) and m(1)< (as opposed to stronger Condition (2)).

To prove stability, we will use the equivalent representation wn(·) of the process f°n(·), given in Subsection 5.1. The state space of wn(·) is

W°(n)={wn=(w1,,wn)Rn|(1/n)iwi=0},
that is, the set of those vectors in Rn, corresponding to f°nM°(n). The norm of a state wn=(w1,,wn)W°(n) is wn=maxi|wi|. Using (3), it is straightforward to see that, for any fixed a > 0, the closed set W°(n)(a)={wnW°(n)|wna} is small (see the definition in Section 3).

Pick any two numbers, 0<ν1<ν2<1, and choose n¯ large enough so that for any nn¯, ηn(ν1)ηn(ν2)>(η(ν1)η(ν2))/2. (All we need here is that the jump probabilities of the “left-most” and “right-most” particle are separated by a positive constant.) Consider any fixed nn¯.

Consider a sequence of versions of the process wn(·), namely, processes wn,k(·) with increasing norm of the initial state, wn,k(0)=ck,k, with wn,k(0)/ckw(0)=(w1(0),,wn(0)). Given that W°(n)(a) is a closed small set for any a > 0, to establish stability, it suffices to show that, for some fixed T > 0, wn,k(T)/ck0. This, in turn, follows from the fact that the limit (in appropriate sense) w(·)=(w1(·),,wn(·)) of the sequence of processes wn,k(ckt)/ck, t0, has trajectories such that w(t)W°(n) for all t0, w(0)=1, and (d/dt)maxwi(t)ϵ<0, as long as maxwi(t)>0; therefore, w(t)=0 for all t1/ϵ. We omit further details, which are straightforward. □

5.4. Proof of (9)

At some level, this proof is similar to the proof of an analogous result in Stolyar (2022) for a different particle system. However, the difference of our model from that in Stolyar (2022) is substantial, so we give full details of the proof for our model.

Consider the following function

G1+χ(n)(f°n)=(2+χ)isign(wi)|wi|1+χEζi(f°n),  f°nM°(n),
where ζi(f°n) is the random jump size (which can have any sign) of particle i upon a jump urge when the state is f°n. (The sizes ζi(f°n) are dependent across i, of course.)

As we will see, function G1+χ(n)(f°n) can be thought of as the “first-order approximation of the generator B(n) of process f°n(·), applied to function Φ2+χ(f°n)”; but we do not even claim that Φ2+χ(f°n) is within the generator B(n) domain. Note that, for each n, G1+χ(n)(f°n) is continuous in f°nM°(n).

Note that each Eζi(f°n) is the quantity of the order O(1/n), which motivates the definition

ζ¯i(f°n)nEζi(f°n).(11)

We observe that, if particle i location wi is the -th from the left, and it is not colocated with any other particle, then

ζ¯i(f°n)=ηn(/n)vn,
where
vnηn(/n)01ηn(ν)dν,(12)
is the average drift of the mean of the noncentered particle system. (Clearly, limnvn=v.) In the more general case, when exactly k particles are colocated particles—namely, -th, (+1)-th, …, (+k1)-th left-most particles are colocated—and particle i is one of them, we have
ζ¯i(f°n)=ηn(/n)++ηn((+k1)/n)kvn.(13)

We define the function ζ¯(f°n)(x), xR, as follows: ζ¯(f°n)(x)=ζ¯i(f°n), where i is the particle whose location wi is the closest to x on the left; we also adopt a convention that, if wi is the location of the left-most particle, then ζ¯(f°n)(x)=ζ¯i(f°n) for all x<wi. Clearly, function ζ¯(f°n)(x) is a piece-wise constant nonincreasing function.

We can write:

G1+χ(n)(f°n)=(2+χ)ζ¯(f°n)(x)sign(x)|x|1+χdfxn°,(14)
or, by “integrating over the values ν of fxn°” and using (13),
G1+χ(n)(f°n)=(2+χ)01dν[ηn(ν)vn]sign([f°n]1(ν))|[f°n]1(ν)|1+χ0.(15)

The inequality in (15) follows by the following argument. Denote

ν0ninf{ν|ηn(ν)<vn},
and observe that, as n,
limnν0n=ν0, where η(ν0)=v.

To be specific, consider the case when [f°n]1(ν0n)0. (The case [f°n]1(ν0n)0 is treated analogously.) We have ν*nf0n°ν0n. Note that

0ν0n[ηn(ν)vn]dν=ν0n1[ηn(ν)vn]dν.

Then, we have

ν*n1dν[ηn(ν)vn]sign([f°n]1(ν))|[f°n]1(ν)|1+χ=ν*n1dν[ηn(ν)vn]|[f°n]1(ν)|1+χ0,
and
0ν*ndν[ηn(ν)vn]sign([f°n]1(ν))|[f°n]1(ν)|1+χ=0ν*ndν[ηn(ν)vn]|[f°n]1(ν)|1+χ=0ν0ndν[ηn(ν)vn]|[f°n]1(ν)|1+χν0nν*ndν[ηn(ν)vn]|[f°n]1(ν)|1+χ0,
where the last inequality holds because |[f°n]1(ν)|1+χ is nonincreasing in [0,ν*n]. Thus, (15) is proved.

Next, we claim the following property: There exists a sufficiently large C > 0 and some ϵ>0, such that, uniformly in all sufficiently large n and all f°nM°(n) with Φ1+χ(f°n)C,

G1+χ(n)(f°n)ϵΦ1+χ(f°n).(16)

The proof of (16) is given in Section 5.5.

From (16) and (15), we obtain that, uniformly in all sufficiently large n,

G1+χ(n)(f°n)ϵΦ1+χ(f°n)+ϵC.(17)

Denote by Φ2+χ(C1)(f°n)=Φ2+χ(f°n)C1 the function Φ2+χ truncated at level C1>0. Given that this is a continuous bounded function of f°nM°(n), and the process of jump urges is Poisson, it is not hard to see that Φ2+χ(C1)(f°n) is within the domain of the generator B(n) of process f°n(·).

Next, we claim the following fact: There exists C2>0 such that for any fixed C1>0, uniformly in all large n and f°n such that Φ2+χ(f°n)C1, we have

B(n)Φ2+χ(C1)(f°n)G1+χ(n)(f°n)+C2Φχ(f°n)+C2,(18)
and then
B(n)Φ2+χ(C1)(f°n)ϵΦ1+χ(f°n)+C2Φχ(f°n)+C3,(19)
with C3=ϵC+C2. The proof of (18) is given in Section 5.6.

From (19) and inequality Φχ(f°n)[Φ1+χ(f°n)]χ/(1+χ)C4+(ϵ/(2C2))Φ1+χ(f°n), which holds for a sufficiently large fixed C4>0, we obtain

B(n)Φ2+χ(C1)(f°n)(ϵ/2)Φ1+χ(f°n)+C5,(20)
where C5=C4C2+C3.

Bound (20), in turn, implies that for any fixed C1>0,

B(n)Φ2+χ(C1)(f°n)[(ϵ/2)Φ1+χ(f°n)+C5]I{Φ2+χ(f°n)C1},(21)
because, obviously, B(n)Φ2+χ(C1)(f°n)0 when Φ2+χ(f°n)>C1.

Recalling that f°n() is the random value of f°n(t) in the stationary regime, we have for all large n:

0=EB(n)Φ2+χ(C1)(f°n())E[((ϵ/2)Φ1+χ(f°n())+C5)I{Φ2+χ(f°n())C1}](ϵ/2)E[Φ1+χ(f°n())I{Φ2+χ(f°n())C1}]+C5,
and then
E[Φ1+χ(f°n())I{Φ2+χ(f°n())C1}]2C5/ϵ.

Letting C1, we finally obtain that

EΦ1+χ(f°n())2C5/ϵ,
for all sufficiently large n, and then
EΦ1+χ(f°n())C¯,
holds for all n for some large C¯>0. □

5.5. Proof of (16)

The definition of ζ¯i=ζ¯i(f°n) in (11) can be interpreted as follows: ζ¯i is the expected jump size of particle i, conditioned on this particle receiving the jump urge, and then centered by the expected jump size of any particle upon a jump urge in the system.

The proof is by contradiction. Suppose Property (16) does not hold. Then, we can and do choose a subsequence of n, and corresponding f°n, so that along this subsequence Φ1+χ(f°n) and

G1+χ(n)(f°n)/Φ1+χ(f°n)0.(22)

Consider separately two cases: (a) liminfnΦ1(f°n)=c< and (b) limnΦ1(f°n)=.

Case (a).

Consider a subsequence of f°n such that limnΦ1(f°n)=c< and, moreover, f°nwf, where f is a proper distribution.

For a fixed 0<δ<1/2, denote

Φ1+χ(δ)(f°n)=[0,δ][1δ,1]sign([f°n]1(ν))|[f°n]1(ν)|1+χdν.

Using the facts that f is proper and Φ1+χ(fn), we easily see that, for any fixed δ>0,

limΦ1+χ(δ)(f°n)/Φ1+χ(f°n)=1.

Pick a sufficiently small δ>0 such that, for some δ1>0 and all large n, ηn(δ)vn>δ1 and ηn(1δ)vn<δ1. Then, we have

liminf|G1+χ(n)(f°n)|/Φ1+χ(f°n)(2+χ)δ1,
which contradicts (22).

Case (b).

Denote cn=Φ1(f°n), and consider the sequence of rescaled versions of f°n, namely,

f˜xn=f°cnxn,  wR.

Note that

G1+χ(n)(f°n)/Φ1+χ(f°n)=G1+χ(n)(f˜n)/Φ1+χ(f˜n).(23)

Consider two subcases: (b.1) limnΦ1+χ(f˜n)= and (b.2) liminfnΦ1+χ(f˜n)=c<. In the subcase (b.1), on account of (23), we can obtain a contradiction in the same way as in the Case (a).

So, the remaining case to consider is (b.2).

For an fM, let us formally define a “limiting version” of the functional G1+χ(n)(f°n) defined in the LHS of the inequality in (15):

G1+χ(f)=(2+χ)01dν[η(ν)v]sign(f1(ν))|f1(ν))|1+χ.(24)

Note that Φ1+χ(f˜n)1, so that 0<c<. Consider a subsequence of f˜n such that limnΦ1+χ(f˜n)=c and f˜nwfM. Distribution f cannot be concentrated at a single point. (Otherwise, because Φ1(f˜n)=1 for all n, Φ1+χ(f˜n) could not remain bounded.) Therefore, |G1+χ(f)|>0, and then

liminf|G1+χ(n)(f˜n)||G1+χ(f)|>0,
and then liminf|G1+χ(n)(f˜n)|/Φ1+χ(f˜n)>0, which contradicts (22). □

5.6. Proof of (18)

We will use the following inequality, which holds, for some constant C6>0, for any numbers y and δ:

|y+δ|2+χ|y|2+χ(2+χ)sign(y)|y|1+χδ+C6|y|χδ2+C6|δ|2+χ.(25)

Indeed,

|y+δ|2+χ|y|2+χ(2+χ)sign(y)|y|1+χδ+(1/2)(2+χ)(1+χ)|y˜|χδ2,
for some y˜[y|δ|,y+|δ|]. Using |y˜|χ(|y|+|δ|)χ(2|y|)χ+(2|δ|)χ, we obtain (25).

Consider a fixed state f°n and consider the expected increment Δ of Φ2+χ(f°n) upon a jump urge, which occurs in this state. Then, using (25),

Δ=E1ni|wi+ζi|2+χ1ni|wi|2+χE1ni[(2+χ) sign(wi)|wi|1+χζi+C6|wi|χζi2+C6|ζi|2+χ],
where expectation E is with respect to the uniform selection of the particle receiving the jump urge, the random event of it actually jumping, and the randomness of the jump size (if it occurs).

For ζi, we have:

ζi=κi1nsκs,
where κi=κi(f°n) the (random) jump size of particle i. Note that Eζi is the quantity of the order O(1/n), because sκs is of order O(1) and Eκi is of order O(1/n) (because 1/n is the probability of particle i receiving the jump urge). Therefore, ζ¯i=nEζi=O(1), and we can write:
Eisign(wi)|wi|1+χζi=1nisign(wi)|wi|1+χζ¯i.

Next,

E(ζi/2)2+χ(1/2)Eκi2+χ+(1/2)1n2+χE(sκs)2+χ,
and, therefore,
Eζi2+χC7/n,(26)
for some C7>0 because E(sκs)2+χ is upper bounded by the (2+χ)-th moment of a jump size and Eκi2+χ is upper bounded by 1/n times the (2+χ)-th moment of a jump size. Note that (26) holds for χ = 0 as well, with possibly different C7. Therefore, by choosing C7 sufficiently large, we have both (26) and
Eζi2C7/n.(27)

Assembling these bounds, we obtain

nΔ(2+χ)1nisign(wi)|wi|1+χζ¯i+C6C71ni|wi|χ+C6C7=G1+χ(n)(f°n)+C2Φχ(f°n)+C2,(28)
where, recall, Δ is the expected increment of Φ2+χ upon a jump urge occurring in a fixed state f°n, and C2=C6C7.

Now, consider the value of the generator B(n)Φ2+χ(C1)(f°n) at point f°n such that Φ2+χ(f°n)C1. For that, consider the expected increment of Φ2+χ(C1)(f°n(t)) over a small interval [0,t/n], with f°n(0)=f°n. First, note that, as t0, the contribution into this expected increment of the event that more than one jump urge occurs, is o(t) (because jump urges follow a Poisson process of rate n, and Φ2+χ(C1) is bounded). With probability t+o(t), there will be exactly one jump urge in [0,t/n], which, therefore, occurs into the state f°n (such that Φ2+χ(f°n)C1); then, the expected increment of Φ2+χ(C1) will not exceed that of Φ2+χ. Using these observations and the Estimate (28), we obtain (18). We omit the remaining straightforward ϵ/δ formalities. □

6. Proof of Theorem 3

This proof largely follows the approach used in Balazs et al. (2014), for a different model. Unlike in Balazs et al. (2014), we work with the Levy-Prohorov metric (inducing the weak convergence topology) on M, as opposed to the stronger Wasserstein W1-metric. This, in fact, simplifies some parts of the proof in our case; we will point out those parts as they appear. However, some other parts of our proof of Theorem 3 are completely different from (or not present in) the development in Balazs et al. (2014). They are Section 6.3 and Theorem 8, which establish the equivalence between solutions to (10) and mean-field models; and Theorem 9, which establishes the uniqueness of a mean-field model and its continuous dependence on the initial state.

We note that many of the supplementary results in this section, which may be of independent interest, require assumptions on the jump-size distribution that are much weaker than conditions m(1)=EZ< and (3). In particular, some of the results assume nothing about the jump-size distribution besides it being a proper distribution. We will emphasize such weaker assumptions, where applicable, in the results’ statements.

6.1. C-relative Compactness of the Processes

A sequence of random processes with sample path in the Skorohod space D([0,),R) (resp., D([0,),M)) is called C-relatively compact (see Perkins 2002 and Balazs et al. 2014) if it is: (a) relatively compact—that is, its any subsequence has a further subsequence converging in distribution to some limiting process; and (b) any such limiting process has continuous sample paths, a.s.

Theorem 4.

Suppose fn(0)wf(0), where {fn(0)} is the deterministic sequence of fn(0)M(n), and f(0)M. Then, for any hCb, the sequence of processes {fn(·)h} is C-relatively compact in the Skorohod space D([0,),R). (Note that we do not assume that f(0) has well-defined mean f¯(0), or (3), or (2), or even m(1)=EZ<. Jump-size distribution only needs to be proper.)

Proof.

This result is analogous to theorem 6.9 in Balazs et al. (2014). Note that, although the model in Balazs et al. (2014) is different from ours, the only property that is used in the proof of theorem 6.9 in Balazs et al. (2014) is that the rate of jumps of each particle is upper bounded by a finite constant a at all times. The latter property, obviously, holds for our model as well, with a=μ=1. Therefore, the proof of theorem 6.9 in Balazs et al. (2014) applies essentially verbatim, with the following adjustments.

What in the proof of theorem 6.9 in Balazs et al. (2014) are f,μn,a, in our notation, are h,fn,μ=1, respectively. In the proof, xi(t) denote the locations of the particles, uniquely determined by the process state at time t, and vice versa. The notation x˜i(t) is used for the locations of particles in an artificial system, with the same initial particle locations x˜i(0)=xi(0), but such that each particle jumps every time it gets a jump urge; the artificial system is coupled to the original one in the natural way, so that the corresponding particles have common processes of jump urges and common jump sizes (if the particles in the original system happen to jump at a jump urge). Clearly, with this coupling, xi(t)x˜i(t) and xi(t)xi(s)x˜i(t)x˜i(s) at all times st. Finally, in our case, In(t)=fn(t)h=h(x)dxfxn(t)=(1/n)ih(xi(t)). □

Theorem 5.

Suppose fn(0)wf(0), where {fn(0)} is the deterministic sequence of fn(0)M(n), and f(0)M. Then, the sequence of processes {fn(·)} is C-relatively compact in the Skorohod space D([0,),M). (Note that we do not assume that f(0) has well-defined mean f¯(0), or (3), or (2), or even m(1)=EZ<. Jump-size distribution only needs to be proper.)

Proof.

This result is analogous to corollary 6.10 in Balazs et al. (2014), with essentially the same proof. In fact, in our case, the proof is simpler. We need to verify conditions (i) and (ii) of theorem II.4.1 in Perkins (2002), which in our case take the following form.

  • (i) For any T > 0 and ϵ>0, there exists K > 0 such that

    supnP(suptT[fKn(t)+1fKn(t)]>ϵ)<ϵ.

  • (ii) For any hCb, the sequence of processes {fn(·)h} is C-relatively compact in the Skorohod space D([0,),R). (Note that the class of functions hCb is separating, which means that a probability measure g is uniquely determined by the values of gh for hCb.)

Condition (ii) is verified by Theorem 4. The verification of condition (i) repeats the proof of corollary 6.10 in Balazs et al. (2014), essentially verbatim, up to and including the display where Markov inequality is used for the first time. At that point, it remains to observe that the probability in the RHS of the display can be made arbitrarily small by making K sufficiently large. The measure μn(t,·) in the proof of corollary 6.10 in Balazs et al. (2014) is in our notation the measure (distribution) fn(t) on R; the particle locations xi(t) and x˜i(t) in the coupled original and artificial systems, are as described above in the proof of our Theorem 5. □

6.2. Trajectories of a Limit Satisfy (10)

For trajectories fn(·)D([0,),M) with fn(t)M(n) for all t0, let us define the following functional for each hCb and t0:

At,hn(fn(·))fn(t)hfn(0)h0tL(n)fn(s)hds.

We will also formally define a “limit version” of At,hn for trajectories f(·)D([0,),M), for each hCb and t0:

At,h(f(·))f(t)hf(0)h0tLf(s)hds.

Theorem 6.

Suppose fn(0)wf(0), where {fn(0)} is deterministic sequence of fn(0)M(n), and f(0)M. Then, the sequence of processes {fn(·)} is such that, for every t0 and any hCb,

sup0st|As,hn(fn(·))|0, asn.
(Note that we do not assume that f(0) has well-defined mean f¯(0), or (3), or (2), or even m(1)=EZ<. Jump-size distribution only needs to be proper.)

Proof.

The proof repeats the proof of theorem 6.11 in Balazs et al. (2014), in which we replace: L by L(n); f by h; μn by f n; a by μ = 1. In particular, in our case, In(t)=fn(t)h=h(x)dxfxn(t)=(1/n)ih(xi(t)), so that LIn(t) is replaced by

L(n)In(t)=L(n)fn(t)h=01E[h(fν1(t)+Z)h(fν1(t))]ηn(ν)dν,
where the expectation in the integrand is over a random jump size Z. The martingale Mn(t), t0, in our case is
Mn(t)=At,hn(fn(·)),
so that LMn2(t) in our case is L(n)Mn2(t). The last line of the last display of the proof in Balazs et al. (2014) can be removed, and the final estimate L(n)Mn2(t)4h2/n for hCb can be observed without that line (because the jump urge rate of each particle is μ = 1).

Finally, note that in our theorem, we only need to consider hCb. We do not need to consider the identity test function h(x) = x, and that is why in Theorem 6, we do not need condition EZ2<—it suffices that Z has a proper distribution. □

Theorem 7.

Suppose fn(0)wf(0), where {fn(0)} is deterministic sequence of fn(0)M(n), and f(0)M. Suppose the sequence of processes {fn(·)} is such that, fn(·)f(·) as n, in D([0,),M). Then, for every t0 and any hCb,

At,hn(fn(·))At,h(f(·)), as n.
(Note that we do not assume that f(0) has well-defined mean f¯(0), or (3), or (2), or even m(1)=EZ<. Jump-size distribution only needs to be proper.)

Proof.

The statement of this theorem is analogous to that of theorem 6.12 in Balazs et al. (2014). However, the proof in our case is simpler and is as follows. We know that, by Theorem 5, a.s. the limiting process f(·) has continuous trajectories in D([0,),M). We can use Skorohod representation to construct all processes on a common probability space so that, w.p.1, fn(·)J1f(·) as n; moreover, by the continuity of f(·), we see that fn(t)f(t) uniformly on compact sets of t; we also see that, w.p.1, fx(t) is nonincreasing in t (because so are fxn(t)). Then, it is easy to see (using, in particular, the facts that convergence ηn(·)η(·) is uniform, and η(·) is strictly decreasing continuous) that At,hn(fn(·))At,h(f(·)) w.p.1. □

Corollary 1.

Suppose fn(0)wf(0), where {fn(0)} is the deterministic sequence of fn(0)M(n), and f(0)M. Then, the sequence of processes {fn(·)} is such that its any distributional limit f(·) in D([0,),M) is such that, a.s., f(·) is continuous, and it satisfies At,h(f(·))=0 (i.e., (10)) for every t0 and any hCb. (Note that we do not assume that f(0) has well-defined mean f¯(0), or (3), or (2), or even m(1)=EZ<. Jump-size distribution only needs to be proper.)

Proof.

By Theorem 5, there exists a subsequence of {fn(·)}, which converges in distribution to a process f(·) with a.s. continuous trajectories. By Theorems 6 and 7, this process must satisfy At,h(f(·))=0 for every t0 and any hCb. □

6.3. Equivalent Characterization of Solutions to (10) as Mean-Field Models

Definition 1.

A function fx(t), xR, tR+, will be called a mean-field model if it satisfies the following conditions.

  • (a) For any t, f(t)=(fx(t),xR)M.

  • (b) For any x, fx(t) is nonincreasing c-Lipschitz in t, with constant c independent of x.

  • (c) For any x, for any t where the partial derivative (/t)fx(t) exists (which is almost all t w.r.t. Lebesgue measure, by the Lipschitz property), equation

    tfx(t)=01dνf1(ν)η(f1(ν))I{f1(ν)x}J¯(xf1(ν)),(29)
    holds.

Note that, by the change of variable y=f1(ν) in the integral in the RHS of (29), Equation (29) can be equivalently written as

tfx(t)=xdyfy(t)η¯(y,f(t))J¯(xy),(30)
where we use notations
η¯(y,f(t)){η(fy(t)),whenfu(t)iscontinuousatpointu=y(ν2ν1)1ν1ν2η(ν)dν,otherwise,
where ν2=fy(t),ν1=fy(t). Equation (30) (or (29)) is a more general form of (4), allowing fx(t) to be RCLL in x, rather than continuous. If fu(t) is continuous at u = y, then η¯(y,f(t))=η(fy(t)); if fu(t) has a jump at u = y, then η¯(y,f(t)) is η(ν) averaged over ν[fy(t),fy(t)]. In paper Stolyar (2023), the equation in Form (30) is used to define a mean-field model.

Theorem 8.

For any initial condition f(0)M,f(·) satisfies (10) if and only if it is a mean-field model. (Note that we do not assume that f(0) has well-defined mean f¯(0), or (3), or (2), or even m(1)=EZ<. Jump-size distribution only needs to be proper.)

Proof.

“Only if.” Let h=(h(u), uR) be the left-continuous step-function h(u)=I(ux), jumping from one to zero at point x. Let hϵ,ϵ>0 be a continuous approximation of h, which is linearly decreasing from one to zero in [x,x+ϵ]. We see that

L[f(t)hϵ]L[f(t)h],  t.

Indeed, |L[f(t)hϵ]L[f(t)h]| is upper bounded by the probability that a random jump of size Z of a particle randomly located according to f(t) is such that the jump either originates or lands in (x,x+ϵ); this probability vanishes as ϵ0. Also, because both L[f(t)hϵ] and L[f(t)hϵ] are within [1,0], for all t and ϵ, we have a universal bound |L[f(t)hϵ]L[f(t)h]|1. Then, for any fixed t, by taking the ϵ0 limit in f(t)hϵf(0)hϵ0tL[f(s)hϵ]ds=0, we obtain

f(t)hf(0)h0tL[f(s)h]ds=0.

This means that f(t)h=fx(t) is absolutely continuous in t, with the derivative equal to (/t)fx(t)=L[f(t)h] a.e. in t. This, in particular, implies that, for any fixed y, the fy(t)fy(t) is continuous in t (in fact, Lipschitz); this, in turn, means that, possible discontinuity points y of fy(t) “cannot move” in time t. We can then conclude that, for any x, the derivative (/t)fx(t)=L[f(t)h] is, in fact, continuous in t. Therefore, (/t)fx(t)=L[f(t)h] at every t. It remains to observe that L[f(t)h] is exactly the RHS of (29).

“If.” The definition of a mean-field model f(·) implies that (10) holds for the defined above step-function h for any x. Then, we have (10) for any h, which is piece-wise constant with finite number of pieces; and the set of such functions h is tight within the space of test functions hCb, equipped with uniform metric. Then, (10) holds for any hCb. □

6.4. Uniqueness of Solution to (10) and Continuity in Initial State. Proof of Theorem 3(i)

Theorem 9.

Assume that m(1)=EZ< (or, m(1)=EZ=1, WLOG) and Condition (3) holds. Then, for any initial condition f(0)M, a solution f(·) of (10) (i.e., a mean-field model) is unique. (Note that we do not assume that f(0) has well-defined mean f¯(0), or (2).)

Proof.

We know that solutions to (10) are mean-field models. Papers Greenberg et al. (1996) and Stolyar (2023) study the properties of mean-field models. It is easy to check that the proofs of all results in sections 4.4 and 4.5 of Greenberg et al. (1996), including Theorem 2, stating that the Wasserstein W1-distance (i.e., the L1-norm of the difference) between any two mean-field models f(1)(·) and f(2)(·) is nonincreasing, never use the fact that the means f¯(1)(0) and f¯(2)(0) are well-defined and equal to zero. Those proofs only use the fact that f(1)(0),f(2)(0)M and

[fw(1)(0)fw(2)(0)]dw=0.
(The conditions EZ< and (3) are used there. Technically, those proofs in Greenberg et al. (1996) assume that the jump-size distribution J(·) is exponential—but only the Property (3) is actually used.)

For any f(0)M, a mean-field model f(·) is such that (see Stolyar 2023)

[fw(0)fw(t)]dw=vt,  t0.(31)

Now, if f(1)(·) and f(2)(·) are two different solutions with the same initial condition, f(1)(0)=f(2)(0)=f(0), then, from (31),

[fw(1)(t)fw(2)(t)]dw=0,  t0.

Therefore, the Wasserstein W1-distance between f(1)(t) and f(2)(t) cannot increase, which implies the uniqueness. □

Proof of Theorem 3(i).

We have established that the family of distributions of the processes is C-tight, any subsequential distributional limit is concentrated on solutions f(·) to (10) (i.e., mean-field models), and the solution f(·) with initial condition f(0) is unique. This implies the convergence fn(·)f(·).

The continuity of f(·) in f(0) is then a consequence of convergence and uniqueness. Indeed, consider a sequence of initial states f(k)(0)M, converging to some f(0)M; namely, f(k)(0)wf(0) as k. Fix ϵ>0 and any δ>0. We can choose an increasing sequence n=n(k) and corresponding f(k),n(0)M(n), such that f(k),n(0)wf(0) as k, and for each k, the process f(k),n(·) is within distance ϵ from the deterministic trajectory f(k)(·) with probability at least 1δ. Then, for all large k, f(k),n(·) is within distance ϵ from both f(k)(·) and f(·) with probability at least 12δ. Because this is true for any ϵ and δ, the only option is that f(k)(·)J1f(·). □

6.5. Corollaries for the Centered Processes: Proof of Theorem 3(ii)

Suppose f¯(0)=0. Then, from (31), we have f¯(t)=vt. Then, the continuity and uniqueness of f°(·), as well as the continuity of its dependence on f°(0), follow from the corresponding properties of noncentered f(·) in Theorem 3(i).

Suppose, in addition, that f¯n(0)=0 for all n. Because ηn converges to η, we easily see directly that f¯n(t)Pvt for any t. Combining these observations with Theorem 3(i), we obtain the convergence f°n(·)f°(·); for example, we can use Skorohod representation to obtain a.s. convergence to the limit. □

7. Proof of Theorem 1

7.1. Basic Properties of a Limit of Stationary Distributions

Lemma 1.

The sequence of stationary distributionsthat is, the distributions of f°n()is tight. Any subsequential distributional limit f°() is such that:

f°() is concentrated on M°;

EΦ1+χ(f°())C¯,(32)
where C¯ is the constant in Theorem 2;
EΦ1(f°())1+C¯.(33)

Proof.

By Theorem 2 and Markov inequality, for any δ>0, there exists a sufficiently large number C(δ)>0, such that for all large n, with probability at least 1δ,Φ1+χ(f°n())C(δ). The set

S(δ)={fM°|Φ1+χ(f)C(δ)},
is compact in M (equipped with the weak convergence topology). And we know that, for all large n, P{f°n()S(δ)}1δ. Therefore, the sequence of distributions of f°n() is tight. Consider any subsequential distributional limit f°()—that is, f°n()f°() for a subsequence of n. Then, P{f°()S(δ)}1δ, and then the mean f°¯()=0 w.p.1. Moreover, the limit f°() must be such that (32) holds. (This is by the Fatou lemma because, using Skorohod representation, we can construct the sequence of f°n() and f°() on a common probability space such that the convergence f°n()wf°() is w.p.1.) Finally, (33) is from
EΦ1(f°())E[[Φ1+χ(f°())]1/(1+χ)]1+EΦ1+χ(f°())1+C¯.

7.2. Characterization of a Limit of Stationary Distributions

Suppose f(0)M°—that is, f(0)M and f¯(0)=0. If f(·) is a mean-field model with initial state f(0) (i.e., the solution to (10)), then the corresponding centered trajectory f°(·) will be called the centered mean-field model.

Lemma 2.

The distribution of any subsequential limit f°() is a stationary distribution of the deterministic process evolving along centered mean-field limits.

Proof.

By Theorem 3(ii), the dependence of the deterministic trajectory f°(·) on the initial state f°(0) is continuous. Then, we can apply theorem 8.5.1 in Liptser and Shiryaev (1989), adapted to our setting. Or, the proof is also easy to obtain directly as follows. We need to show that for any test function hCb and any t0, we have

Ef°(0)h=Ef°(t)h,(34)
where f(0) is equal in distribution to f(). We obtain (34) by taking the limit of the equality
Ef°n(0)h=Ef°n(t)h,(35)
where f°n(0) is equal in distribution to f°n(); (35) clearly holds for all n. Clearly, Ef°n(0)hEf°(0)h. To demonstrate
Ef°n(t)hEf°(t)h,(36)
we can use Skorohod representation, so that the convergence f°n(0)wf°(0) is a.s. For any deterministic converging sequence f°n(0)wf°(0), we have, by Theorem 3(ii), f°n(t)f°(t) (which is equivalent to f°n(t)Pf°(t)), and then Ef°n(t)hEf°(t)h. Thus, we obtain (36), and then (34). □

7.3. Completion of the Proof of Theorem 1

Consider any subsequential distributional limit f°(). Its distribution is a stationary distribution of the deterministic process evolving along centered mean-field limits f°(·).

First Proof.

Consider any two different initial conditions f°(1)(0)M° and f°(2)(0)M° of the deterministic process f°(·). The Wasserstein W1-distance f°(1)(t)f°(2)(t)1 must strictly decrease with t, by theorem 2 in Greenberg et al. (1996). Then, the only option is that f°() is concentrated on a single element ϕ, which then must be a traveling wave shape, and then a unique traveling wave shape. Otherwise, if we consider two independent stationary versions of the process f°(·), say, f°(1)(0) and f°(2)(0), then Ef°(1)(t)f°(2)(t)1 is finite for t = 0 (which follows from EΦ1+χ(f°())C¯), and it is strictly decreasing in t; this contradicts the stationarity. Finally, we obtain (7)—that is, Φ1+χ(ϕ)<—because EΦ1+χ(f°())C¯. □

Second Proof.

By theorem 3.2 in Stolyar (2023), we have the existence of the unique traveling wave shape ϕM°. Then, by theorem 2 in Greenberg et al. (1996), for any f°(0)M°, the Wasserstein W1-distance f°(t)ϕ1 is strictly decreasing when it is nonzero. Then, f°() must be concentrated at ϕ. Property (7) follows from EΦ1+χ(f°())C¯. □

Remark 1.

We have given two proofs that complete the proof of Theorem 1. They both require additional Assumptions (2) and (3). Note that, as a byproduct of the first proof, we also obtain the existence of the unique traveling wave shape, but only under these additional conditions. As far as the existence of the traveling wave shape is concerned, it is established in theorem 3.1 in Stolyar (2023) under weaker conditions, only requiring the finite second moment of jump size, m(2)=EZ2<.

8. Discussion

The main results of this paper, in a sense, complete the “program” represented by previous work by Greenberg et al. (1995, 1996) and Stolyar (2023) on the specific model in this paper. Greenberg et al. (1995), informally speaking, prove the convergence to a deterministic mean-field model as n. (In our paper, we generalize that result to a more general model, without the finite-dependence assumption.) Greenberg et al. (1996), informally speaking, prove that if a traveling wave exists, then each mean-field model trajectory is attracted to that traveling wave, as t; Stolyar (2023) shows that a traveling wave does exist under very general assumptions. This paper proves that the convergence to the traveling wave also holds if we “interchange the limits”: first consider the stationary distribution (take limit in t) and then consider the n limit of stationary distribution; if we take limits in this order, the limit is the same—a traveling wave. Thus, the results of this program answer essentially “all” questions about the behavior of the system when n is large—both about its transient behavior and about its stationary distribution.

There are many other well-motivated large-scale particle systems (cf. Manita and Shcherbakov 2005; Malyshev and Manita 2006; Manita 2006, 2009, 2014; Malyshkin 2006, and Balazs et al. 2014), for which implementing a similar program would be of interest.

References

  • Balazs M, Racz MZ, Toth B (2014) Modeling flocks and prices: Jumping particles with an attractive interaction. Ann. Institut Henri Poincare Probab. Statist. 50(2):425–454.Google Scholar
  • Bramson M (2008) Stability of Queueing Networks (Springer Berlin Heidelberg, Berlin).Google Scholar
  • Dai JG (1995) On the positive Harris recurrence for multiclass queueing networks: A unified approach via fluid models. Ann. Appl. Probab. 5:49–77.Google Scholar
  • Ethier S, Kurtz T (1986) Markov Processes: Characterization and Convergence (Wiley, Hoboken, NJ).Google Scholar
  • Greenberg A, Malyshev V, Popov S (1995) Stochastic model of massively parallel computation. Markov Processes Related Fields 1(4):473–490.Google Scholar
  • Greenberg A, Shenker S, Stolyar A (1996) Asynchronous updates in large parallel systems. Proc. ACM SIGMETRICS Internat. Conf. Measurement Modeling Comput. Systems (ACM, New York), 91–103.Google Scholar
  • Hongler MO (2015) Exact soliton-like probability measures for interacting jump processes. Math. Sci. 40(1):62–66.Google Scholar
  • Hongler MO, Filliger R (2019) When do redundant requests reduce latency? Methodology Comput. Appl. Probab. 21:753–764.Google Scholar
  • Liptser RS, Shiryaev AN (1989) Theory of Martingales (Kluwer Academic Publishers Dordrecht, Dordrecht, Netherlands).Google Scholar
  • Malyshev V, Manita A (2006) Phase transitions in the time synchronization model. Theory Probab. Appl. 50:134–141.Google Scholar
  • Malyshkin A (2006) Limit dynamics for stochastic models of data exchange in parallel computation networks. Problems Inform. Transmission 42:234–250.Google Scholar
  • Manita A (2006) Markov processes in the continuous model of stochastic synchronization. Russian Math. Surveys 61:993–995.Google Scholar
  • Manita A (2009) Stochastic synchronization in a large system of identical particles. Theory Probab. Appl. 53:155–161.Google Scholar
  • Manita A (2014) Clock synchronization in symmetric stochastic networks. Queueing Systems 76:149–180.Google Scholar
  • Manita A, Shcherbakov V (2005) Asymptotic analysis of a particle system with mean-field interaction. Markov Processes Related Fields 11:489–518.Google Scholar
  • Perkins EE (2002) Dawson-Watanabe superprocesses and measure-valued diffusions. Bernard P, ed. Lectures on Probability Theory and Statistics: Ecole d’Eté de Probabilités de Saint-Flour XXIX, 1999, Lecture Notes in Mathematics, vol. 1781 (Springer, Berlin), 125–324.Google Scholar
  • Rybko A, Stolyar A (1992) Ergodicity of stochastic processes describing the operation of open queueing networks. Problems Inform. Transmission 28:199–220.Google Scholar
  • Stolyar AL (1995) On the stability of multiclass queueing networks: A relaxed sufficient condition via limiting fluid processes. Markov Processes Related Fields 1(4):491–512.Google Scholar
  • Stolyar AL (2022) Parallel server systems with cancel-on-completion redundancy. Stochastic Systems 12(4):340–372.LinkGoogle Scholar
  • Stolyar AL (2023) Large-scale behavior of a particle system with mean-field interaction: Traveling wave solutions. Adv. Appl. Probab. 55(1):245–274.Google Scholar