Open Access

Estimating Customer Impatience in a Service System With Unobserved Balking

Yoshiaki Inoue
Yoshiaki Inoue
[email protected]
https://orcid.org/0000-0002-2483-7652
Department of Information and Communications Technology, Graduate School of Engineering, Osaka University, Suita, Osaka 565-0871, Japan;
Search for more papers by this author
,
Liron Ravner
Corresponding Author
Liron Ravner
[email protected]
https://orcid.org/0000-0002-7359-9793
Department of Statistics, University of Haifa, Haifa 3498838, Israel;Korteweg-de Vries Institute for Mathematics, University of Amsterdam, 1098 XG Amsterdam, Netherlands;
Search for more papers by this author
,
Michel Mandjes
Michel Mandjes
[email protected]
https://orcid.org/0000-0001-6783-4833
Korteweg-de Vries Institute for Mathematics, University of Amsterdam, 1098 XG Amsterdam, Netherlands;EURANDOM, Eindhoven University of Technology, 5612 AZ Eindhoven, Netherlands;Amsterdam Business School, Faculty of Economics and Business, University of Amsterdam, 1018 WB Amsterdam, Netherlands
Search for more papers by this author

Department of Information and Communications Technology, Graduate School of Engineering, Osaka University, Suita, Osaka 565-0871, Japan;

Search for more papers by this author

Liron Ravner

Corresponding Author

Liron Ravner

[email protected]

https://orcid.org/0000-0002-7359-9793

Department of Statistics, University of Haifa, Haifa 3498838, Israel;Korteweg-de Vries Institute for Mathematics, University of Amsterdam, 1098 XG Amsterdam, Netherlands;

Search for more papers by this author

Michel Mandjes

[email protected]

https://orcid.org/0000-0001-6783-4833

Korteweg-de Vries Institute for Mathematics, University of Amsterdam, 1098 XG Amsterdam, Netherlands;EURANDOM, Eindhoven University of Technology, 5612 AZ Eindhoven, Netherlands;Amsterdam Business School, Faculty of Economics and Business, University of Amsterdam, 1018 WB Amsterdam, Netherlands

Search for more papers by this author

Published Online:13 Dec 2022https://doi.org/10.1287/stsy.2022.0101

Abstract

This paper studies a service system in which arriving customers are provided with information about the delay they will experience. Based on this information, they decide to wait for service or leave the system. Specifically, every customer has a patience threshold, and they balk if the observed delay is above the threshold. The main objective is to estimate the parameters of the customers’ patience-level distribution and the corresponding potential arrival rate, using knowledge of the actual queue-length process only. The main complication and distinguishing feature of our setup lies in the fact that customers who decide not to join are not observed, and remarkably, we manage to devise a procedure to estimate the underlying patience and arrival rate parameters. The model is a multiserver queue with a Poisson stream of customers, enabling evaluation of the corresponding likelihood function of the state-dependent effective arrival process. We establish strong consistency of the MLE and derive the asymptotic distribution of the estimation error. Several applications and extensions of the method are discussed. The performance is further assessed through a series of numerical experiments. By fitting parameters of hyperexponential and generalized hyperexponential distributions, our method provides a robust estimation framework for any continuous patience-level distribution.

Funding: The research of Y. Inoue is supported in part by JSPS KAKENHI [Grant JP18K18007]. The research of L. Ravner and M. Mandjes is partly funded by NWO Gravitation Project Networks [Grant 024.002.003].

Supplemental Material: The e-companion is available at https://doi.org/10.1287/stsy.2022.0101.

1. Introduction

In many service systems, potential customers are provided with real-time information on the delay they will experience (Ibrahim 2018, Dong et al. 2019). This information, typically in terms of anticipated delay or current workload, is used by the potential customers to decide to either join the system or balk. In general, one could view balking as an implicit form of admission control; the load on the service system does not explode as a consequence of the fact that during busy periods some of the potential customers decide to leave, thus mitigating the overall congestion. This means that the system is in effect self-regulating even if the potential service demand is unknown to the administrator. To make sound decisions in designing the service system, however, it is crucial to have knowledge of the volume of the potential demand and the mechanism based on which potential customers decide to join or to leave. With this knowledge, the service provider can, for example, determine the optimal admission price or decide whether it is economically viable to increase the service capacity.

This paper is motivated by the fact that it is challenging to extract this knowledge from data when balking customers cannot be observed by the administrator. Such complication arises naturally in many service systems. The most obvious case is a physical service facility, where a visible queue (e.g., shops, parking lots, museums), sometimes even outside of the facility, will deter some of the potential customers. This has become even more prevalent recently, with social distancing measures limiting the number of people that can be inside a public space at a given time. This form of data censorship occurs also in systems that provide real-time delay data via electronic means such as dedicated apps for the service. Some examples are expected travel times in transportation services (e.g., Google Maps and Uber) and expected delivery times in food delivery services (e.g., UberEats in the United States or Deliveroo in Europe). Note that expected delay is often listed for the possible services (e.g., estimated arrival time of a taxi or delivery times for a list of Italian restaurants) even if a user does not indicate the specific service he or she is interested in, and therefore, the system cannot know whether a user balked from the service because of the delay information.

In this paper, we model such service systems with unobservable balking as an M/G/s + H queue, where the last symbol stands for the cumulative distribution function of the customer patience. More specifically, we consider a first-come, first-served (FCFS) queue with s ( $s ⩾ 1$ ) servers and potential customers arriving according to a Poisson process with rate $λ > 0$ . Potential customers have independent and identically distributed (i.i.d.) patience levels that are distributed as the nonnegative random variable $Y \sim H$ and bring i.i.d. service requirements that are distributed as the nonnegative random variable $B \sim G$ . Customer i joins the system if Y_i is greater or equal to the virtual waiting time at the moment of his or her arrival and otherwise balks without being observed. Note that although customers are homogeneous in a statistical sense, they have individual (thus heterogeneous) patience levels, so this model can clearly represent the types of customers that will effectively join the system.

We set up an estimation problem by parametrizing the patience distribution as $H_{θ}$ ( $θ \in Θ$ ), where Θ denotes a compact set satisfying some regularity conditions. The goal of this work is to estimate the pair $(λ, θ)$ based on observation of the effective queueing process, which is constructed from records of arrival and departure times not including balking customers. As some of the customers balk, the corresponding effective interarrival times are not the usual i.i.d. exponentially distributed random variables. The effective arrival process is in fact a nonhomogeneous Poisson process whose rate depends on the virtual waiting time process. In particular, the effective arrival rate to the system when the virtual waiting time $v ⩾ 0$ is $λ (1 - H (v))$ . Note that the marginal effective arrival process is not a nonhomogeneous Poisson process with respect to time. Therefore, the dependence structure of the observations of interarrival times is directly determined by that of the virtual waiting time process. The workload-dependent arrival dynamics are utilized to derive a maximum-likelihood estimator (MLE) based on the effective interarrival observations. Specifically, the density of an interarrival time A conditional on initial virtual waiting time $v ⩾ 0$ is given by

f_{A | v} (t) = λ (1 - H (v - t)) \exp (- λ \int_{0}^{t} (1 - H (v - u)) ​d u), t ⩾ 0 .

The Markovian dynamics of the queue further imply that the likelihood of the interarrival times conditional on the waiting times and job sizes is given by a product of densities with the above form for any sample of observations.

We further study the asymptotic properties of the proposed estimator. Importantly, the asymptotic performance of the estimator is not given by standard results due to the intricate underlying dependence structure. Indeed, so as to prove consistency and asymptotic normality, a subtle reasoning is required to make sure that the situation at hand fulfills specific regularity conditions. It is further shown that for homogeneous (deterministic) patience levels, that is, $P (Y = θ) = 1$ for some $θ > 0$ , the asymptotic distribution of the errors is not normal but rather exponential (with rate n and not $\sqrt{n}$ ). The main contribution of this paper is the development of procedures for consistent estimation of the total arrival rate (that is, corresponding to customers joining the system and balking customers) and the patience-level distribution, observing the queue-length process only. Hence, somewhat counterintuitively, we can estimate load that has never been observed.

The main purpose of this paper is to lay foundations for statistical inference for congested service systems with unobserved balking. Although the queueing model considered in this paper is of course not a perfect replica of any specific real system, it does provide a methodological statistical approach for service systems where unobserved balking due to congestion is a key feature. Furthermore, the framework presented here can be extended to capture additional features such as noisy waiting time information, as will be discussed in the concluding remarks.

It is also important to stress that the framework we develop in this paper is essentially different from the extensively studied problem of estimating patience parameters from observed abandonment data (see Brown et al. 2005, Mandelbaum and Zeltyn 2013). If balking customers are observed, then the data are comprised of waiting times and corresponding indicator variables stating whether a customer joined or balked. Specifically, when it is observed that a customer i joined (resp. balked) given the waiting time W_i, the observation is of the form $Y ⩾ W_{i}$ (resp. $Y < W_{i}$ ). These data can be directly applied to construct an estimator for θ, for example, by applying an MLE method. If abandonment times are directly observed, then even more information is available, namely the realization of Y_i for every abandoning customer, and $Y ⩾ W_{i}$ for customers who obtained service. Note that in the above scenarios estimation of the total arrival rate is straightforward because all arrivals are observed. However, when balking is unobserved, the only information available is that of interarrival and waiting times, and therefore, indirect estimation methods are called for. Of course, there may be applications where some of the balking or reneging customers are observed, whereas others are not. For such systems, it may be reasonable to adopt a hybrid approach that uses classical patience estimation methods together with those presented in this paper. For the sake of brevity, we assume throughout that balking happens immediately and is not observed by the system operator so that no data regarding the balking customers is available for estimation purposes.

1.1. Contributions

We proceed by detailing our paper’s contributions. The focus is on developing estimators with provable performance guarantees. Notably, setting up our estimator is straightforward in the sense that it relies on a closed-form expression for the likelihood and does not require any queueing-theoretic analysis. As a result, even for models for which stationary performance analysis is involved (or intractable), parameter inference can be done in a relatively direct manner. Our contributions relate to (i) our estimator and its asymptotic properties, (ii) results and experiments for specific patience-level distributions, and (iii) extensions and ramifications.

Framing the system described above as a queueing model with impatient customers, we can evaluate the corresponding likelihood function (in terms of the observed quantities). This means that, in a parametric context, we can estimate the unknown parameters — that is, λ and the parameter(s) θ corresponding to the patience level Y — relying on a maximum-likelihood procedure. The estimation method has several attractive features, the most prominent one being that it does not require simultaneous estimation of the service requirement distribution or even making any distributional assumption about it. In addition, our methodology uses the fundamental queueing dynamics and does not, as is common in the literature, rely on any fluid and/or diffusion approximations.
In the case of a continuous patience-level distribution, we prove, under appropriate regularity conditions, strong consistency and asymptotic normality of the resulting estimation errors (scaled by $\sqrt{n}$ ). The proof of the asymptotic normality relies on an application of a suitable version of the martingale central limit theorem.
In the case that the patience level Y is constant (i.e., $Y = θ$ for some θ), more refined results can be obtained. Most notably, we show that the estimation error (scaled by n) converges to an exponential distribution with a parameter that depends on the stationary workload density at θ and the loss probability of customers.
A number of other special cases are dealt with, covering exponentially and (generalized) hyperexponentially distributed patience levels; we present closed-form expressions for the asymptotic variance of the estimation error. The class of generalized hyperexponentially distributions is highly relevant for various reasons, most importantly because these can be used to approximate any positive continuous distribution arbitrarily well (Botta and Harris 1986). In addition, in the call center literature (Roubos and Jouini 2013), it has been observed that such distributions yield a good fit for observed customer patience in terms of the time until abandonment. We provide several numerical examples that illustrate the robustness of our methodology. Indeed, our experiments show that even if the patience-level distribution is misspecified (i.e., is not generalized as hyperexponential itself), the distribution function of the estimated generalized hyperexponential distribution is still remarkably close to the true one.
We also discuss several ramifications and model extensions that can be handled using our framework. These include extending the framework to a system with noisy delay information, estimating the proportion of impatient customers in the population, and estimating specific utility parameters such as service value and waiting time sensitivity.

1.2. Related literature

There is a substantial body of literature on queues with impatient customers. Without attempting to provide an exhaustive overview, we include a brief account of the various research lines. Naor’s influential paper (Naor 1969) can be seen as a pioneering contribution to the field of “behavioral queues.” It presented a powerful stylized model for a queueing resource in which potential customers decide to join or to balk based on delay information. Since then, performance and economic analysis of queueing systems with balking customers has been studied extensively. In particular, the effect of providing workload information to strategic customers was investigated in, for example, Guo and Zipkin (2007, 2008). Many other model variations are reviewed in Hassin and Haviv (2003) and Hassin (2016). More background on applied probability and queueing systems in a general context is provided in, for example, Assmussen (2003) and Dębicki and Mandjes (2015).

For various specific types of queueing systems with impatient customers, explicit performance results are known; see, for example, Boxma et al. (2010, 2011), Liu and Kulkarni (2006, 2008), and Mandelbaum and Shimkin (2000, 2004). Importantly, however, the estimation techniques presented in our paper do not require knowledge of, for example, the queue’s stationary distribution. It is also worth stressing that despite the fact that queues with impatient customers are well studied, to the best of our knowledge, hardly any workload-based estimation techniques have been developed. A notable exception is Gorst-Rasmussen and Hansen (2009), featuring an asymptotic analysis of statistical inference of waiting times truncated by abandonments; our setting, however, is crucially different from the one in Gorst-Rasmussen and Hansen (2009), because in the latter, abandoning customers and their sojourn times are observed.

There are various papers on estimation problems in the setting where all customers join but leave when getting impatient; examples are Courcoubetis et al. (1995), Akşin et al. (2013), Mandelbaum and Zeltyn (2013), and Yefenof et al. (2022). Importantly, this setting is crucially different from ours, in which the balking customers are not observed. In this branch of the literature, the challenge lies in the fact that for customers who are served, we have just a lower bound on their impatience time, making this a “right-censored” estimation problem.

Another related research direction concerns the so-called queue inference engine, a setting in which one does not observe the arrival process of the customers but rather the ordered service entry and service completion times (so-called “transactional data”); see, for example, Larson (1990) and Daily and Servi (1998). One comes across this situation when considering, for example, automatic teller machines, where the queue of customers waiting for the machine (and in particular their arrival times) is not observable, but the transactional data are recorded. In Daley and Servi (1998), a multiserver queue is considered, but with the specific feature that balking is assumed to occur when all servers are busy upon arrival, with a fixed probability independent of the queue length, making the underlying queueing dynamics fundamentally different from ours.

Strictly speaking, Negahban (2019, 2022) did not deal with transactional data; it is assumed that the interarrival times corresponding to the nonbalking customers are observed (so that its scope is somewhat similar to ours). At the methodological level, however, Negahban (2019, 2022) crucially differed from our approach; the proposed estimation procedure relies on discrete-event simulation rather than a rigorously backed maximum-likelihood procedure.

Our work fits in the larger branch of the literature on inference for queues. This domain focuses on estimation problems where the queue is observed (for instance periodically), with the objective to estimate its input parameters. We refer to Asanjarani et al. (2021) for a broad recent overview of this area. Examples are the procedures for the M/G/1 queue, as presented in Hansen and Pitts (2006) and den Boer and Mandjes (2017). We also refer to the Poisson-probing based approach (Ravner et al. 2019) for Lévy-driven queues (Dębicki and Mandjes 2015) and the general framework presented in Duffie and Glynn (2004), as well as the corresponding hypothesis-testing problem (Mandjes and Ravner 2020). Some approaches rely on exploiting knowledge of the queue’s tail probabilities (Courcoubetis et al. 1995, Glynn and Zeevi 2000, Duffy and Meyn 2011). A related line of research concerns the detection of stability or instability of queueing networks; see the Monte Carlo-based methods of Leahu et al. (2017) and Mandjes et al. (2017).

The main conclusion from the above is that, despite the mature literature on queues with impatient customers and related estimation problems, there is a lack of methodologies that can deal with the situation in which balking customers are not observed.

1.3. Paper Organization

This paper is organized as follows. In Section 2 we describe the model featuring in this paper and introduce the notation used. In addition, we present a number of known results on queues with customer impatience that are relied on later. Then, in Section 3, we construct and analyze our maximum-likelihood estimator. For continuous patience-level distributions, we prove strong consistency and asymptotic normality of the estimation error. The case of deterministic patience is treated separately, leading to a similar conclusion regarding strong consistency, but with the notable difference that in this case the limiting distribution of the estimation error is exponential. Section 4 provides detailed analysis of the MLE for exponential, hyperexponential, and generalized hyperexponential patience distributions, the latter cases being relevant because they can be used to approximate the distributions of arbitrary nonnegative random variables. This includes explicit analysis of the likelihood function and its corresponding asymptotic variance as well as simulation experiments assessing the performance of the estimators. Section 5 illustrates how the method can be utilized for several applications. In Section 6, we provide concluding remarks and directions for followup research, including a brief discussion of a nonparametric estimation procedure.

2. Model and Preliminaries

In this section, we provide a detailed model description, introduce the notation used throughout this paper, state the objective of the paper, and present preliminaries.

2.1. Model Description

We consider a service system with s servers ( $s \in N$ ). Potential customers arrive at the service system according to a Poisson process with rate λ. Each customer has a service requirement with cumulative distribution function (cdf) $G (\cdot)$ and a patience level with cdf $H (\cdot)$ . We define B and Y as the generic random variables corresponding to service requirements and impatience levels, respectively. The interarrival times ${(T_{i})}_{i = 1, 2 \dots}$ , the service requirements ${(B_{i})}_{i = 1, 2, \dots}$ , and patience levels ${(Y_{i})}_{i = 1, 2, \dots}$ are independent sequences of i.i.d. distributed random variables. Customer i joins the queue if Y_i is smaller than or equal to the waiting time he or she will experience. More specifically, let V(t) (for $t ⩾ 0$ ) denote the virtual waiting time at time t, which is defined as the time it would take before at least one server becomes idle if no customers have joined the system after time t:

V (t) = \inf {u ⩾ 0 : Q (t) - (N_{D} (t + u) - N_{D} (t)) ⩽ s - 1},

(1)

where Q(t) denotes the number of customers in the system (including those in service) at time t and

N_{D} (t)

(for

t ⩾ 0

) denotes the total number of customers finishing their service in a time interval

(0, t]

. For the single-server case (i.e., if s = 1), the virtual waiting time equals to the workload in system, that is, the sum of remaining service requirements of customers. Note that the virtual waiting time in the multiserver case (

s = 2, 3, \dots

) does not coincide with the workload in system, unlike in the single-server case:

(i) When $i \in {0, 1, \dots, s - 1}$ servers are busy, the virtual waiting time equals zero, whereas the workload equals some positive value and decreases with slope i as time passes without effective arrivals.
(ii) When all s servers are busy, the virtual waiting time and the workload decrease with slope 1 and s, respectively, as time passes without effective arrivals.

Figure 1 illustrates an example of the realization of the virtual waiting time process for s = 2. The actual waiting time of a customer arriving at time t equals $V (t -)$ so that a customer balks if the virtual waiting time just before the arrival instant exceeds the customer’s patience threshold. The resulting system can be considered as an M/G/s queue with impatient customers.

**Figure 1. An Example of the Virtual Waiting Time Process (s = 2)**

Because some customers will balk, there will be a difference between the interarrival times of the customers (covering both the balking and nonbalking customers) on one hand, and the effective interarrival times (covering just the nonbalking customers) on the other hand. Throughout this paper, the sequence of effective interarrival times is denoted by ${(A_{i})}_{i = 1, 2, \dots}$ . Importantly, observe that the random variables A_i are generally neither identically distributed nor independent. We can reconstruct them from the sequences ${(T_{i})}_{i = 1, 2, \dots}$ , ${(B_{i})}_{i = 1, 2, \dots}$ , and ${(Y_{i})}_{i = 1, 2, \dots}$ in the following manner. For the following construction, we let a nonbalking customer arrive at time t = 0, as will be further motivated in Remark 1. Denoting the virtual waiting time process at time t by V(t), we have that $A_{1} = \sum_{k = 1}^{j_{1}} T_{k}$ , where

j_{1} ≔ \inf {j ⩾ 1 : V (\sum_{k = 1}^{j} T_{k} -) ⩽ Y_{j}} .

Similarly, $A_{i} = \sum_{k = j_{i - 1} + 1}^{j_{i}} T_{k}$ , where

j_{i} ≔ \inf {j ⩾ j_{i - 1} + 1 : V (\sum_{k = 1}^{j} T_{k} -) ⩽ Y_{j}} .

Observe that j_i has the interpretation of being the index of the i-th nonbalking customer. Let ${\tilde{A}}_{i} ≔ \sum_{k = 1}^{i} A_{k}$ denote the i-th effective arrival time. We denote the upward-jump size in the virtual waiting time caused by the i-th joining customer by $X_{i} ≔ V ({\tilde{A}}_{i}) - V ({\tilde{A}}_{i} -)$ . In the single-server case (s = 1), we have $X_{i} = B_{j_{i}}$ . In the multiserver case ( $s ⩾ 2$ ), on the other hand, X_i takes a complicated form, depending on both the queue-length and the residual service times of customers in service seen on the arrival instant. For a single-server system (s = 1), Figure 2 illustrates the workload process during a single busy period for the case of deterministic patience, that is, $Y = θ$ for some $θ > 0$ . The building blocks of our framework, viz. the interarrival and service requirements and their effective counterparts, are highlighted in the figure. It is stressed that the distribution of the effective interarrival times can significantly deviate from the (exponential) distribution of the interarrival times. Figure 3 illustrates this effect for two different patience-level distributions. In addition, as mentioned, the effective interarrival times are not independent, as opposed to the interarrival times.

Figure 2. The Virtual Waiting Time V(t) During a Busy Period for a Single-Server System and Deterministic Patience Levels
*Notes.* The dotted red lines mark arrival instants of new customers. Arrivals 3 and 5 observe a virtual waiting time level above θ and immediately balk, and thus the virtual waiting time continues to deplete as if there was no arrival. Although six customers have arrived during the busy period, only four effective interarrival times and service requirements are observed, namely $((A_{1}, X_{1}), (A_{2}, X_{2}), (A_{3}, X_{3}), (A_{4}, X_{4}))$ , being equal to $((T_{1}, B_{1}), (T_{2}, B_{2}), (T_{3} + T_{4}, B_{4}), (T_{5} + T_{6}, B_{6}))$ .

Figure 3. Logarithmic Scale Tail Distribution (Smooth Black) of the Potential Interarrival Time T Is Exponential with Rate λ = 1
*Notes.* Simulation of the tail distributions of the stationary effective interarrival times is illustrated for two cases: *A^d* corresponds to deterministic patience levels with θ = 3 (dotted red), whereas *A^e* corresponds to exponentially distributed patience levels with mean 3 (dashed blue). The service requirements in the example are Erlang distributed with parameters $(5, 1.5)$ . The plotted stationary effective interarrival distributions are given by the empirical distribution obtained by simulating $n = 10^{5}$ effective arrivals for each of the cases.

We focus on the case that the patience-level distribution is determined by a parameter θ that is a vector in $R^{p}$ for some (known) $p \in N$ , entailing the estimation problem is parametric. In addition, the arrival rate λ is to be estimated, which is evidently of a parametric nature as well. Alternatively, θ could be a function from some more general space, rendering the estimation problem nonparametric; in Section 6, we briefly discuss an approach that can be used in this setting. Throughout this paper, we assume that we observe the full queue-length process. This in particular means that we observe the arrival and departure epochs of the nonbalking customers (and that we do not observe the balking customers at all). The objective is to use this information to somehow learn the true arrival rate and the parameters of the patience-level distribution. More concretely, we wish to devise statistical procedures for estimating the arrival rate λ and the distribution of the patience Y (both corresponding to nonobservable quantities) with provable performance properties.

2.2. Stationary Distribution

We assume that the mean service time $E [B]$ is finite and that, with $ρ ≔ λ E [B] / s$ denoting the traffic intensity,

ρ \lim_{x \to \infty} (1 - H (x)) < 1 .

(2)

Under these assumptions, we have that the queue is stable and that a stationary queue-length $Q ≔ Q (\infty)$ and a stationary virtual waiting time $V ≔ V (\infty)$ exist (see Baccelli and Hebuterne 1981 and Baccelli et al. 1984). Note that $\lim_{x \to \infty} H (x) < 1$ holds only if there are customers with infinite patience, that is, $P (Y = \infty) > 0$ . If $P (Y < \infty) = 1$ , on the other hand, then we have $\lim_{x \to \infty} H (x) = 1$ , so that the system is stable irrespective of the value of the traffic intensity ρ. As we will see, the MLE method to be developed relies only on the transient dynamics of the system so that it can be applied even if the stability condition (2) is not satisfied. On the other hand, we will utilize the system stability to discuss the asymptotic properties of the proposed estimator.

For the M/G/s queue with impatient customers, exact expressions for the distributions of the stationary queue-length Q and virtual waiting time V are not known in the literature. However, we can derive the following relations they satisfy in a similar way to the single-server case (Kovalenko 1961). Let $q_{n} ≔ P (Q = n)$ ( $n = 0, 1, \dots$ ) denote the probability mass function of the stationary queue length. The stationary virtual waiting time distribution has probability mass $π_{0} ≔ P (V = 0) = \sum_{n = 0}^{s - 1} q_{n}$ at zero, and it is absolutely continuous on $(0, \infty)$ . Let $X_{| y}$ denote a generic random variable for the stationary upward-jump size X conditioned that it takes a positive value and that the immediately preceding virtual waiting time equals y:

P (X_{| y} ⩽ x) = \lim_{t \to \infty} P (V (t) - V (t -) ⩽ x | V (t -) = y, V (t) > y) .

With the level-crossing argument, we can verify that the density function v(x) of the stationary virtual waiting time satisfies the following integral equation (cf. Boxma et al. 2011):

v (x) = λ q_{s - 1} {\bar{J}}_{| 0} (x) + λ \int_{0}^{x} v (y) P (Y ⩾ y) {\bar{J}}_{| y} (x - y) ​d y, x ⩾ 0,

(3)

where

{\bar{J}}_{| y} (x) ≔ P (X_{| y} > x)

denotes the complementary cdf of the conditional upward-jump size

X_{| y}

Owing to the PASTA property, the virtual waiting time seen by an arriving customer has the same distribution as the stationary virtual waiting time V. The stationary waiting time W of a nonbalking customer is thus given by a conditional random variable $[V | V ⩽ Y]$ . We can thus identify $P (W = 0)$ and the probability density function (pdf) $w (\cdot)$ of W in terms of the stationary virtual waiting time distribution $v (\cdot)$ :

P (W = 0) = \frac{π_{0}}{1 - P_{ℓ}}, w (x) = \frac{v (x) P (Y ⩾ x)}{1 - P_{ℓ}}, x ⩾ 0,

(4)

where

P_{ℓ}

denotes the loss probability given by

P_{ℓ} = \int_{0}^{\infty} v (y) P (Y < y) ​d y .

(5)

For s = 1, (3) simplifies to

v (x) = λ π_{0} \bar{G} (x) + λ \int_{0}^{x} v (y) P (Y ⩾ y) \bar{G} (x - y) ​d y, x ⩾ 0,

(6)

where

\bar{G} (x) ≔ 1 - G (x)

denotes the complementary cdf of service requirements. A solution of the integral Equation (6) is, as can be found in Baccelli et al. (1984) and Inoue and Takine (2015), given in terms of the pdf

g_{e} (x) ≔ \bar{G} (x) / E [X]

of the equilibrium distribution (i.e., the residual lifetime distribution) corresponding to the service requirements:

\begin{array}{l} v (x) = π_{0} \sum_{n = 1}^{\infty} u_{n} (x), x ⩾ 0, \\ π_{0} = {(\sum_{n = 1}^{\infty} \int_{0}^{\infty} u_{n} (x) ​d x)}^{- 1}, \\ u_{1} (x) = ρ g_{e} (x), u_{n} (x) = ρ \int_{0}^{x} u_{n - 1} (y) P (Y ⩾ y) g_{e} (x - y) ​d y, n = 2, 3, \dots . \end{array}

(7)

In some special cases, the expression (7) for the virtual waiting time density $v (\cdot)$ can be further simplified, as shown in the examples below.

Example 1

(Constant Patience Levels). We consider the case that the patience levels take a constant value θ₀ (for some $θ_{0} > 0$ ). Let $g_{e}^{(n)} (\cdot)$ (for $x ⩾ 0, n = 1, 2, \dots$ ) denote the n-fold convolution of the pdf $g_{e} (\cdot)$ of the equilibrium distribution of the service times and $G_{e}^{(n)} (x)$ the corresponding cdf. Also, define $g_{e, θ_{0}}^{(n)} (\cdot)$ by

g_{e, θ_{0}}^{(n)} (x) ≔ {\begin{array}{l} g_{e}^{(n)} (x) / G_{e}^{(n)} (θ_{0}), & x ⩽ θ_{0}, \\ 0, & x > θ_{0} . \end{array}

We then have, by de Kok and Tijms (1985) and Inoue and Takine (2015), with as usual “ $⋆$ ” denoting the convolution operator,

π_{0} = {(1 + ρ + ρ \sum_{n = 1}^{\infty} ρ^{n} G_{e}^{(n)} (θ_{0}))}^{- 1},

(8)

v (x) = {\begin{array}{l} π_{0} \sum_{n = 1}^{\infty} ρ^{n} g_{e}^{(n)} (x), & 0 ⩽ x ⩽ θ_{0}, \\ π_{0} ρ g_{e} (x) + π_{0} \sum_{n = 2}^{\infty} ρ^{n} G_{e}^{(n - 1)} (θ_{0}) \cdot [g_{e, θ_{0}}^{(n - 1)} ⋆ g_{e}] (x), & x > θ_{0}, \end{array}

(9)

Example 2

(Exponential Service Times). If the service times follow an exponential distribution with mean $1 / μ$ , we have, by Stanford (1979) and Inoue and Takine (2015), that

\begin{array}{l} π_{0} = {(1 + λ \int_{0}^{\infty} \exp [- μ x + λ \int_{0}^{x} P (Y ⩾ y) ​d y] ​d x)}^{- 1}, \\ v (x) = π_{0} λ \exp [- μ x + λ \int_{0}^{x} P (Y ⩾ y) ​d y], x ⩾ 0 . \end{array}

Observe that if the patience distribution is also exponential with rate θ, then the expressions can be simplified by substituting $\int_{0}^{x} P (Y ⩾ y) d y = \exp (- θ x)$ .

3. Parametric Estimation Procedure

In the setting considered, we focus on the first nonbalking n + 1 customers. We set the time origin t = 0 to the time instant that a nonbalking customer joins the system. We record the effective arrival times ${\tilde{A}}_{1}, {\tilde{A}}_{2}, \dots, {\tilde{A}}_{n}$ and departure times. From this information, we can fully reconstruct the virtual waiting time process using (1); in particular, we obtain the sequence of virtual waiting times $V (0), V ({\tilde{A}}_{1}), \dots, V ({\tilde{A}}_{n})$ observed by the nonbalking customers and the sizes of upward jumps $X_{0}, X_{1}, \dots, X_{n}$ caused by them. In this section, the goal is to estimate in a parametric context the arrival rate and patience-level distribution from the data.

The crucial observation is that we can construct a likelihood function of the patience-level distribution using an observed sample of effective interarrival times $A_{1}, A_{2}, \dots, A_{n}$ , waiting times $W_{0} = V (0), W_{1} = V ({\tilde{A}}_{1}), W_{2} = V ({\tilde{A}}_{2}), \dots, W_{n} = V ({\tilde{A}}_{n})$ , and upward-jump sizes $X_{0}, X_{1}, \dots, X_{n}$ , which can be reconstructed from the observed sequences mentioned in the previous paragraph. We work under the natural assumption that the observed system is in stationarity, so that W₀ is the stationary waiting time of a nonbalking customer and X₀ is a stationary upward-jump size.

Remark 1.

To make the sequence of waiting times $W_{1}, W_{2}, \dots, W_{n}$ stationary, we should let the time origin coincide with an arrival instant of a nonbalking customer. The validity of this statement can be argued as follows.

We first point out that the sequence of waiting times is in general not a Markov process; only for s = 1 it is. We obtain a Markov process when considering the vector of residual service times at effective arrival instants. The waiting times and upward jump sizes are in fact measurable functions of this process. Therefore, in what follows, we let the underlying Markov process of residuals be stationary.

First, it is observed that assuming the virtual waiting time at time t = 0 being stationary does not imply that the waiting time W₁ of the first customer follows the stationary distribution W. To see this, consider (for simplicity) an ordinary M/G/s queue (that is, without customer impatience). If the virtual waiting time at time t = 0 follows the stationary virtual waiting time distribution V, then the first customer arriving after time t = 0 finds an idle server with probability

P (W_{1} = 0) = P (V = 0) + \int_{0}^{\infty} e^{- λ x} v (x) ​d x > P (V = 0) .

Because the stationary waiting time W has the same distribution as the stationary virtual waiting time V in the ordinary M/G/s queue (due to PASTA), we find the inequality

P (W_{1} = 0) > P (V = 0) = P (W = 0) .

This shows that the waiting time W₁ of the first-arriving customer is biased in that it is not distributed as W.

Intuitively, the bias is a consequence of the fact that a customer arriving after a long interarrival time is more likely chosen as the first arriving customer in our experiment (cf. the well-known inspection paradox in renewal theory), and such a customer tends to observe the system less congested than time average. It is easily seen that letting time t = 0 correspond to the arrival instant of a nonbalking customer makes the sequence of waiting times $W_{1}, W_{2}, \dots, W_{n}$ stationary. □

Let $\bar{H} (x) ≔ 1 - H (x)$ denote the complementary cdf of patience levels. Also, we denote by $\tilde{H} (x)$ the probability of a customer joining when the virtual waiting time equals $x ⩾ 0$ :

\tilde{H} (x) ≔ P (Y ⩾ x) = \bar{H} (x) + P (Y = x) .

Note that $\tilde{H} (x) = \bar{H} (x) = P (Y > x)$ for continuous patience distributions.

We next describe the construction of the likelihood function. We first recall that the effective interarrival times $A_{1}, A_{2}, \dots, A_{n}$ are not exponentially distributed because of the fact that between two effective arrival instants there may have been arriving customers who observed a virtual waiting time level exceeding their patience level (as depicted in Figure 3). Despite this, we can still characterize the effective interarrival time distribution in terms of the observed quantities. To see this, suppose that, for some $t ⩾ 0$ , there have been no effective arrivals in the interval $({\tilde{A}}_{i - 1}, {\tilde{A}}_{i - 1} + t]$ . An arrival in $({\tilde{A}}_{i - 1} + t, {\tilde{A}}_{i - 1} + t + Δ t]$ then occurs with probability $λ Δ t + o (Δ t)$ , as $Δ t ↓ 0$ . The corresponding customer joins the system (i.e., becomes effective) with probability $\tilde{H} (V ({\tilde{A}}_{i - 1}) - t)$ because (i) the virtual waiting time seen on the arrival equals $\max {0, V ({\tilde{A}}_{i - 1}) - t}$ and (ii) $\tilde{H} (x) = 1$ for x < 0. Therefore, the occurrence of the next (i.e., the i-th) effective arrival follows a time-inhomogeneous Poisson process with time-dependent intensity $λ \tilde{H} (V ({\tilde{A}}_{i - 1}) - t)$ . Noting that $V ({\tilde{A}}_{i - 1}) = W_{i - 1} + X_{i - 1}$ (for $i = 1, 2, \dots$ ), we thus conclude that

P (A_{i} > t | W_{i - 1} + X_{i - 1} = v) = \exp (- λ \int_{0}^{t} \tilde{H} (v - u) ​d u) .

(10)

This representation of the distribution of the effective interarrival times facilitates the evaluation of the likelihood. The (conditional) likelihood function for $A ≔ (A_{1}, A_{2}, \dots, A_{n})$ given $W ≔ (W_{1}, W_{2}, \dots, W_{n})$ and $X ≔ (X_{1}, X_{2}, \dots, X_{n})$ is given by the product of the conditional densities

f_{A_{i} | v} (t) = - \frac{​d}{​d t} P (A_{i} > t | W_{i - 1} + X_{i - 1} = v)

(11)

= λ \tilde{H} (v - t) \exp (- λ \int_{0}^{t} \tilde{H} (v - u) ​d u) .

(12)

This yields the conditional likelihood function

\begin{array}{l} L_{n} (H; A, W | X) = \prod_{i = 1}^{n} λ \tilde{H} (W_{i - 1} + X_{i - 1} - A_{i}) \exp (- λ \int_{0}^{A_{i}} \tilde{H} (W_{i - 1} + X_{i - 1} - u) ​d u) \\ = λ^{n} \prod_{i = 1}^{n} \tilde{H} (W_{i}) \exp (- λ \int_{0}^{A_{i}} \tilde{H} (W_{i - 1} + X_{i - 1} - u) ​d u), \end{array}

where we used a Lindley-type recursion, generalized to the multiserver queue:

W_{k} = \max {W_{k - 1} + X_{k - 1} - A_{k}, 0}, k = 1, 2, \dots .

(13)

In the rest of this section, we assume that the patience-level distribution is characterized by a finite-dimensional vector of parameters $θ \in Θ \subseteq R^{p}$ . For $θ \in Θ$ , let $H_{θ} (\cdot), {\bar{H}}_{θ} (\cdot)$ , and ${\tilde{H}}_{θ} (\cdot)$ denote $H (\cdot), \bar{H} (\cdot)$ , and $\tilde{H} (\cdot)$ given parameter θ, respectively. Let any $H_{θ} (\cdot)$ , with $θ \in Θ$ , be identifiable in the conventional Kullback-Leibler sense. The maximum-likelihood estimator (MLE) of $(λ, θ)$ is then given by

({\hat{λ}}_{n}, {\hat{θ}}_{n}) \in \underset{_{(λ, θ) \in R_{+} \times Θ}}{arg max} L_{n} (λ, θ; A, W | X),

where

L_{n} (λ, θ; A, W | X) ≔ λ^{n} \prod_{i = 1}^{n} {\tilde{H}}_{θ} (W_{i}) \exp (- λ \int_{0}^{A_{i}} {\tilde{H}}_{θ} (W_{i - 1} + X_{i - 1} - u) ​d u) .

(14)

In the sequel, we provide an asymptotic analysis describing the performance of this MLE. We first focus, in Sections 3.1 and 3.2, on the parameters pertaining to the patience only; that is, the estimation does not cover the arrival rate λ. The object of study is, in self-evident notation, for $λ > 0$ given,

{\hat{θ}}_{n} \in \underset{_{θ \in Θ}}{arg max} L_{n} (θ; A, W | X) .

Extending this to a procedure to also include estimation of the arrival rate is relatively straightforward; we get back to it in Section 3.3.

Remark 2.

Observe that L is a conditional likelihood and not a full likelihood. This is due to the fact that the upward jump-sizes X_i have an elaborate distribution both marginally and jointly with the waiting times. The exception is the single-server case where $X_{i} \sim B$ and is independent of the previous waiting and arrival times. Hence, for s = 1, the maximization of $L_{n} (H; A, W | X)$ with respect to them is equivalent to that of the unconditional likelihood function for $(A, W, X)$ . It is noted that the procedure to estimate θ does not require the simultaneous estimation of the service requirement or upward-jump size distributions. $⋄$

In more detail, the remainder of this section is organized as follows. First, we establish (in Section 3.1) conditions for strong consistency and asymptotic normality of our estimator of the patience parameters, with the errors scaled by $\sqrt{n}$ , focusing on continuous patience-level distributions. We then consider (in Section 3.2) the case of deterministic patience and establish consistency (independently) for this case as well. It is noted that in the latter case the asymptotic errors are not normally but rather exponentially distributed, with the errors scaled by n. This is due to the fact that for deterministic patience the MLE is obtained on the boundary of the sample data, much like in the well-known case of estimating the parameter θ of a uniform distribution on $[0, θ]$ . As mentioned, Section 3.3 discusses the estimation of the arrival rate.

3.1. Continuous Patience-Level Distribution

This subsection covers the asymptotic performance of the MLE for the case that the patience-level distribution is continuous and parametric. Let θ₀ denote the true parameter. Throughout the following analysis, the underlying probability measure is $P_{θ_{0}}$ , that is, the probability measure corresponding to the true patience-level distribution. For the asymptotic results, we make the following assumptions, which will be discussed in Section 3.1.1.

Assumption.

The following assumptions are imposed:

(A1) The parameter space $Θ \subset R^{p}$ is a compact set such that the true parameter lies in the interior of the set, that is, $θ_{0} \in Θ^{o}$ .
(A2) The observation period commences at t = 0, which corresponds to the stationary arrival instant of a nonbalking customer, and thus the sequence of waiting times $W_{1}, W_{2}, \dots, W_{n}$ is stationary.
(A3) Let ${\bar{H}}_{\inf} (x) ≔ \inf_{θ \in Θ} {\bar{H}}_{θ} (x)$ . Depending on whether the right endpoint
$h_{\sup} ≔ \inf {x ⩾ 0 : {\bar{H}}_{\inf} (x) = 0}$
takes a finite value or not, we assume one of the following properties: (i) If $h_{\sup} = \infty$ , then there exists a positive nondecreasing function $f (\cdot)$ such that $\lim_{x \to \infty} f (x) \in (0, \infty]$ and for some constants $c_{1}, c_{2} \in [0, \infty)$ ,
$\lim_{x \to \infty} \frac{e^{- f (x)}}{{\bar{H}}_{\inf} (x)} = c_{1}, \lim_{x \to \infty} f (x) {\bar{H}}_{θ_{0}} (x) = c_{2} .$
(ii) If $h_{\sup} < \infty$ , then there exists a positive nondecreasing function $f (\cdot)$ such that $\lim_{x \to h_{\sup} -} f (x) \in (0, \infty]$ and for some constants $c_{1}, c_{2} \in [0, \infty)$ ,
$\lim_{x \to h_{\sup} -} \frac{e^{- f (x)}}{{\bar{H}}_{\inf} (x)} = c_{1}, \lim_{x \to h_{\sup} -} f (x) {\bar{H}}_{θ_{0}} (x) = c_{2} .$
(A4) The gradient vector and Hessian matrix of $H_{θ}$ are continuous with respect to θ. With $Ψ_{1} (θ)$ denoting the Hessian matrix of the log-likelihood corresponding to a single waiting time $($ in stationarity $), E Ψ_{1} (θ)$ has finite elements in all coordinates for any $θ \in Θ$ .
(A5) The collection of functions ${H_{θ} (x) : θ \in Θ}$ is equicontinuous; for any $x ⩾ 0$ and $ϵ > 0$ there exists a $δ > 0$ such that $| H_{θ} (x) - H_{θ} (y) | < ϵ$ for any y such that $| x - y | < δ$ uniformly over all $θ \in Θ$ .

We now state the main results of this subsection. In the sequel, $N (μ, Σ)$ denotes a normally distributed random variable with mean vector μ and covariance matrix $Σ .$

Theorem 1.

If Assumptions (A1)–(A3) and (A5) are satisfied, then, as $n \to \infty$ ,

{\hat{θ}}_{n} \to_{as} θ_{0} .

(15)

Theorem 2.

If Assumptions (A1)–(A5) hold, then, because $n \to \infty$ ,

\sqrt{n} ({\hat{θ}}_{n} - θ_{0}) \to_{d} N (0, I^{- 1} (θ_{0})),

(16)

where

I (θ_{0}) ≔ - E Ψ_{1} (θ_{0})

, and

I^{- 1} (θ_{0})

denotes the inverse of

I (θ_{0})

The proofs of the above theorems are provided in Section 3.1.2 after the detailed discussion of (A1)–(A5) that we give in Section 3.1.1.

3.1.1. Discussion of Assumptions.

We next provide more background on the assumptions imposed. In addition, we discuss their possible relaxation.

(1) Assumption (A1) is natural, because it requires that the parameter space is big enough so that it contains the true parameter. In practice, the parameter space can be adjusted on the fly, for example, if for large n we obtain boundary solutions for the first-order conditions.
(2) Assumption (A2) facilitates the use of known results on the convergence of stationary dependent sequences. Note that, as long as a stationary virtual waiting time distribution exists (for which we have given the condition as Equation (2) in Section 2), the same results should hold without making this assumption because of the regenerative nature of the process. However, without this assumption, the conditions for both consistency and asymptotic normality are harder to verify. In particular, the crucial step for consistency is the uniform convergence of the log-likelihood, established in Lemma 1 below. More elaborate conditions for uniform convergence are detailed in, for example, Ranga Rao (1962), Andrews (1987), and Pötscher and Prucha (1989).
(3) If there are patient customers who do not balk regardless of the waiting time, that is, $\lim_{x \to \infty} {\bar{H}}_{θ_{0}} (x) > 0$ , then Assumption (A3) requires that the parameter space Θ is chosen so that the existence of patient customers ( $\lim_{x \to \infty} {\bar{H}}_{θ} (x) > 0$ ) is assumed for all $θ \in Θ$ , which is a reasonable assumption in modeling a service system where both patient and impatient customers exist. In the case of $\lim_{x \to \infty} {\bar{H}}_{θ_{0}} (x) = 0$ , on the other hand, Assumption (A3) requires that the decay rate of ${\bar{H}}_{θ} (x)$ does not vary too strongly among the distributions $H_{θ}$ , with θ in the parameter space Θ. In practice, this assumption is seldom violated. For example, if the true patience-level distribution decays exponentially, that is, $\lim_{x \to \infty} e^{ν x} {\bar{H}}_{θ_{0}} (x) = c$ for some $ν > 0$ and c > 0, then (A3) is satisfied if the infimum tail function ${\bar{H}}_{\inf} (x)$ does not decay faster than the doubly exponential function $e^{- e^{ν x}}$ (which evidently decays exceptionally fast). As another example, supposing that the true patience-level distribution obeys a power law, that is, $\lim_{x \to \infty} x^{k} {\bar{H}}_{θ_{0}} (x) = c$ for some k > 0 and c > 0, then (A3) is satisfied if ${\bar{H}}_{\inf} (x)$ does not decay faster than $e^{- x^{k}}$ .
(4) Assumption (A4) enables the construction of a standard martingale CLT for the asymptotic distribution of the estimation error of the MLE. It is not a necessary condition, but in cases where the assumption is not satisfied, one is typically required to apply ad hoc analysis to derive an asymptotic distribution. In Section 3.2, we show that for a deterministic patience level that does not satisfy (A4), the asymptotic distribution of the error is exponential and not normal. Assumption (A2) implies that the expectation of the Hessian matrix of the log-likelihood $Ψ_{1} (θ)$ is the covariance matrix of the gradient, and thus it is always positive definite. As a consequence, one just needs to verify that the coordinates are finite for (A4) to hold.
(5) The equicontinuity assumption (A5) enables concise analysis and can be verified for many continuous patience-level distributions. For example, if there exists a bounded density for every $H_{θ} (\cdot)$ , then it is Lipschitz continuous, and a uniform bound is given by the supremum of the constants in the compact set Θ. The assumption does not hold if the support of the distribution depends on θ. If $H_{θ} (x)$ has some discontinuities (with respect to x), one could pursue replacing (A5) by an appropriate upper semicontinuity assumption; see theorem 16b in Ferguson (1996) and Pötscher and Prucha (1989). We will present such an example in Section 5.1. Our asymptotic results may hold by replacing (A5) with (A4) and verifying additional measurability conditions on $\inf_{η \in B_{θ}} {\bar{H}}_{η} (x)$ for a neighborhood $ℬ_{θ}$ of any $θ \in Θ$ (see Andrews 1987, corr. 2).

3.1.2. Proofs.

From (14), first note that we can express the log-likelihood $ℓ_{n} (θ; A, W | X) ≔ \log L_{n} (θ; A, W | X)$ by

n \log λ + \sum_{i = 1}^{n} (\log {\bar{H}}_{θ} (W_{i}) - λ \int_{0}^{A_{i}} {\bar{H}}_{θ} (W_{i - 1} + X_{i - 1} - u) ​d u) 1_{{{\bar{H}}_{θ} (W_{i}) > 0}} .

(17)

Observe that we replaced ${\tilde{H}}_{θ} (\cdot)$ with ${\bar{H}}_{θ} (\cdot)$ because $H_{θ} (\cdot)$ is continuous.

With $W ≔ \lim_{i \to \infty} W_{i}$ existing, because $n \to \infty$ , the continuous mapping theorem yields

\frac{1}{n} ℓ_{n} (θ; A, W | X) \to_{as} E [ℓ_{1} (θ; A_{1}, W_{0}, W_{1} | X_{0})],

(18)

where A₁ is the effective interarrival time when the initial virtual waiting time is

W_{0} + X_{0}

and

W_{0} =_{d} W

, that is, the stationary waiting time. From now on, we use the compact notations

ℓ_{n} (θ) ≔ ℓ_{n} (θ; A, W | X)

and

ℓ (θ) ≔ E [ℓ_{1} (θ; A_{1}, W_{1}, W_{0} | X_{0})]

. The density of the effective interarrival times (11) is uniquely determined by the function

H_{θ} (\cdot)

, and thus so is the likelihood (14). This is because the function

H_{θ} (\cdot) ≔ {H_{θ} (x) ≔ \int_{[0, x]} ​d ν_{θ} (u) : x ⩾ 0}

is uniquely determined by θ as a consequence of the fact that for every

θ \in Θ

the measure

ν_{θ} (\cdot)

corresponds to a different distribution (that is, in the almost-everywhere sense). As a consequence,

{H_{θ} (\cdot) : θ \in Θ}

is a parametric collection of distribution functions such that there is no pair

θ_{1}, θ_{2} \in Θ

for which

H_{θ_{1}} (\cdot) = H_{θ_{2}} (\cdot)

almost everywhere. This entails that the model is identifiable in the Kullback-Leibler sense (see, e.g., Ferguson 1996, chapter 17), and hence,

ℓ (θ) - ℓ (θ_{0}) < 0, \forall θ \neq θ_{0} .

(19)

The key step in the proof of Theorem 1 is establishing a uniform version of (18); then, strong consistency follows by the methodology of Wald (Ferguson 1996, chapters 16 and 17). The virtual waiting time observations are not independent, but by (A2) they are stationary so that we can apply a uniform law for stationary sequences that is commonly used in the econometrics literature. We rely specifically on a theorem taken from Ranga Rao (1962) and its extensions, in particular the ones developed in Andrews (1987) and Pötscher and Prucha (1989). The uniform convergence is established in Lemma 1 (proven in the Appendix).

Lemma 1.

If Assumptions (A1)–(A3) and (A5) hold, then, because $n \to \infty$ ,

\sup_{θ \in Θ} | \frac{1}{n} ℓ_{n} (θ) - ℓ (θ) | \to_{as} 0 .

(20)

Proof of Theorem 1.

Observe that the strong consistency follows from the identifiability property (19) and from Lemma 1. Note in particular that the proof of Ferguson (1996) (theorem 17) does not rely on i.i.d. observations once uniform convergence has been established, and so the same steps can be applied here. In particular, a sufficient condition (relying on uniform convergence for strong consistency when observations are dependent) can also be found in Heijmans and Magnus (1986); for every $θ \neq θ_{0}$ , there exists a neighborhood $ℬ_{δ} (θ)$ such that

\lim_{n \to \infty} \sup_{γ \in ℬ_{δ} (θ)} (\frac{1}{n} ℓ_{n} (γ) - \frac{1}{n} ℓ_{n} (θ_{0})) < 0,

(21)

almost surely. This follows directly from (19) and (20). □

We continue by proving asymptotic normality that was stated in Theorem 2. This amounts to showing that the estimation error $\sqrt{n} ({\hat{θ}}_{n} - θ_{0})$ is asymptotically normal because $n \to \infty$ , assuming that (A1)–(A5) are satisfied. To this end, we apply the well-known delta method and an appropriate version of the martingale CLT (see, e.g., Heyde 1997, theorem 12.6).

Proof of Theorem 2.

For any given v, we let $\nabla {\bar{H}}_{θ} (v) \in R^{p}$ and $\nabla^{2} {\bar{H}}_{θ} (v) \in R^{p \times p}$ denote the gradient and Hessian, respectively, of ${\bar{H}}_{θ}$ . These are both continuous with respect to any coordinate θ_k as a consequence of (A4). Thus, both the gradient and the Hessian of the log-likelihood are continuous functions. These are given by $\nabla ℓ_{n} (θ) ≔ {\dot{ℓ}}_{n} (θ)$ and $\nabla^{2} ℓ_{n} (θ) ≔ \sum_{i = 1}^{n} Ψ_{i} (θ)$ , respectively, where, for $k = 1, \dots, p$ ,

{({\dot{ℓ}}_{n} (θ))}_{k} = \sum_{i = 1}^{n} (\frac{\nabla_{k} {\bar{H}}_{θ} (W_{i})}{{\bar{H}}_{θ} (W_{i})} - λ \int_{0}^{A_{i}} \nabla_{k} {\bar{H}}_{θ} (W_{i - 1} + X_{i - 1} - u) ​d u) 1_{{{\bar{H}}_{θ} (W_{i}) > 0}},

(22)

and, for

k, l = 1, \dots, p

\begin{array}{l} {(Ψ_{i} (θ))}_{k l} & = (\frac{{\bar{H}}_{θ} (W_{i}) \nabla_{k l}^{2} {\bar{H}}_{θ} (W_{i}) - \nabla_{k} {\bar{H}}_{θ} (W_{i}) \nabla_{l} {\bar{H}}_{θ} (W_{i})}{{\bar{H}}_{θ} {(W_{i})}^{2}} - λ \int_{0}^{A_{i}} \nabla_{k l}^{2} {\bar{H}}_{θ} (W_{i - 1} + X_{i - 1} - u) ​d u) 1_{{{\bar{H}}_{θ} (W_{i}) > 0}} . \end{array}

(23)

Following the lines of the standard delta method (van der Vaart 1998, chapter 3), we consider the expansion of ${\dot{ℓ}}_{n} (\cdot)$ at the MLE ${\hat{θ}}_{n}$ around the true parameter:

\frac{1}{n} {\dot{ℓ}}_{n} ({\hat{θ}}_{n}) = \frac{1}{n} {\dot{ℓ}}_{n} (θ_{0}) + ({\hat{θ}}_{n} - θ_{0}) \int_{0}^{1} \frac{1}{n} \sum_{i = 1}^{n} Ψ_{i} (α {\hat{θ}}_{n} + (1 - α) θ_{0}) ​d α .

(24)

If $θ_{0} \in Θ$ , then because of the strong consistency that we found in Theorem 1, we have, because $n \to \infty$ , that

{\hat{θ}}_{n} \to_{as} θ_{0} .

Furthermore, because $θ_{0} \in Θ$ by assumption (A1), the smoothness assumption (A4) implies that, as n grows large, $ℓ_{n} (θ)$ converges to a concave function, and the MLE is given by a sequence of roots ${\hat{θ}}_{n}$ satisfying ${\dot{ℓ}}_{n} ({\hat{θ}}_{n}) = 0$ ; that is, from some N on there is no boundary solution for any $n ⩾ N$ with probability one. Therefore, (24) yields, with

B_{n} ≔ - \int_{0}^{1} \frac{1}{n} \sum_{i = 1}^{n} Ψ_{i} (α {\hat{θ}}_{n} + (1 - α) θ_{0}) ​d α,

that

\lim_{n \to \infty} \sqrt{n} ({\hat{θ}}_{n} - θ_{0}) B_{n} \overset{d}{=} \lim_{n \to \infty} \frac{1}{\sqrt{n}} {\dot{ℓ}}_{n} (θ_{0}),

(25)

that is, both sides of (25) converge to the same distribution (if the limits exist). In Lemma 2, we apply the stationarity of W_i in (A2) to show that

B_{n} \to_{as} I (θ_{0})

, where

I (θ_{0}) = - E Ψ_{1} (θ_{0})

. Furthermore,

{\dot{ℓ}}_{n} (θ_{0}) / \sqrt{n}

is shown to satisfy a martingale CLT with asymptotic variance

I (θ_{0})

. Because

E \nabla ℓ_{1} (θ_{0}) = 0

, Assumption (A2) implies that the asymptotic variance equals the stationary covariance of the gradient

I (θ_{0}) = E \nabla^{2} (ℓ_{1} (θ_{0})) = E [\nabla (ℓ_{1} (θ_{0})) \nabla {(ℓ_{1} (θ_{0}))}^{⊤}],

(26)

and hence,

I (θ_{0})

is a positive definite matrix (see Ferguson 1996, chapter 18, for more details). Assumption (A4) further demands that the elements of

I (θ_{0})

are all finite, and then combining the above and applying Slutsky’s theorem to (25) yields Theorem 2. □

We are thus left with showing Lemma 2 below; its proof is given in the Appendix.

Lemma 2.

If (A1)–(A5) hold and $I (θ_{0}) = - E Ψ_{1} (θ_{0})$ is a positive definite matrix with finite elements, then (a) as $n \to \infty, B_{n} \to_{as} I (θ_{0})$ , and (b) as $n \to \infty$ ,

\frac{1}{\sqrt{n}} {\dot{ℓ}}_{n} (θ_{0}) \to_{d} N (0, I (θ_{0})) .

(27)

3.2. Constant Patience-Level MLE

We next consider the case where all customers have the same patience level θ₀, that is, $H (y) = 1_{{y ⩾ θ_{0}}}$ ; observe that in this case the stability condition (2) is always satisfied, but the continuity assumptions of Section 3.1 do not apply. As we will see, in this case the properties of the MLE are markedly different from those identified in Section 3.1.

In this setting, the likelihood function (14) reduces to, with the event $ℰ_{i}$ denoting ${W_{i - 1} + X_{i - 1} ⩽ θ}$ and $ℰ_{i}^{c}$ its complement,

\begin{array}{l} L (θ; A, W | X) = λ^{n} \prod_{i = 1}^{n} 1_{{W_{i} ⩽ θ}} (1_{{ℰ_{i}}} e^{- λ \int_{0}^{A_{i}} ​d u} + 1_{{ℰ_{i}^{c}}} e^{- λ \int_{W_{i - 1} + X_{i - 1} - θ}^{A_{i}} ​d u}) \\ = λ^{n} \prod_{i = 1}^{n} 1_{{W_{i} ⩽ θ}} (1_{{ℰ_{i}}} e^{- λ A_{i}} + 1_{{ℰ_{i}^{c}}} e^{- λ (A_{i} + θ - W_{i - 1} - X_{i - 1})}) . \end{array}

Observe that $L (θ; A, W | X)$ (i) equals zero for $θ \in [0, \max_{i = 1, 2, \dots, n} W_{i})$ , (ii) has an upward discontinuity at $θ = \max_{i = 1, 2, \dots, n} W_{i}$ and decreases for

θ \in [\max_{_{i = 1, 2, \dots, n}} W_{i}, \max_{_{i = 1, 2, \dots, n}} {W_{i} + X_{i}}),

and (iii) takes a constant value for

θ ⩾ \max_{i = 1, 2, \dots, n} {W_{i} + X_{i}}

. We thus find the intuitively appealing property that the MLE is given by

{\hat{θ}}_{n} = \max_{_{i = 1, 2, \dots, n}} W_{i},

(28)

that is, the maximum virtual waiting time at jump times. Note that this estimator resembles the MLE of the parameter θ when the observations are uniformly distributed on

[0, θ]

Although the estimator ${\hat{θ}}_{n}$ is clearly biased (which follows from $E_{θ_{0}} [{\hat{θ}}_{n}] < θ_{0}$ for all n), we show that it converges almost surely to θ₀ as $n \to \infty$ . In addition, we prove that the estimation error scaled at rate n converges to an exponential random variable.

Theorem 3.

Because $n \to \infty$ ,

{\hat{θ}}_{n} \to_{as} θ_{0},

(29)

and

P (n (θ_{0} - {\hat{θ}}_{n}) ⩾ x) \to e^{- v (θ_{0}) x / (1 - P_{ℓ})},

(30)

where

P_{ℓ}

and

v (θ_{0})

denote the stationary loss probability and the virtual waiting time density at level θ₀. In addition, the asymptotic variance of the estimation error agrees with the variance of the limiting exponential distribution and is given by

\lim_{n \to \infty} n^{2} Var [θ_{0} - {\hat{θ}}_{n}] = \lim_{n \to \infty} {E [{n (θ_{0} - {\hat{θ}}_{n})}^{2}] - {(E [n (θ_{0} - {\hat{θ}}_{n})])}^{2}}

(31)

= {(\frac{1 - P_{ℓ}}{v (θ_{0})})}^{2} .

(32)

The proof of Theorem 3, which is given in the Appendix, constructs lower and upper bounds for the MLE and establishes that they both converge to θ₀. The same bounds are also used to characterize the asymptotic distribution of the estimation error.

3.3. Estimating the Arrival Rate

Where in the preceding subsections we focused on estimating the patience-level distribution (in a parametric context) for a given value of the arrival rate λ, we now discuss the estimation of λ. Let λ₀ denote the true value of λ. There are several ways to estimate it, the most basic one relying on the idle period observations. Denote the duration of the k-th idle period by I_k, where in the multiserver setting (s > 1) an idle period refers to time intervals such that at least one server is idle. Denote by E_k the total number of effective arrivals during idle period $k = 1, 2, \dots$ (meaning, for instance, E_k = 1 for all $k = 1, 2, \dots$ when s = 1). Let C_n denote the number of idle periods observed up until (and including) the n-th effective arrival. Then we propose the estimator

{\hat{λ}}_{n} = \frac{\sum_{k = 1}^{C_{n}} E_{k}}{\sum_{k = 1}^{C_{n}} I_{k}} .

(33)

The rationale behind this estimator is that there is no balking during idle periods because the virtual waiting time is zero. Hence, the arrival process during these idle times is homogeneous Poisson with rate λ. Therefore, the estimator ${\hat{λ}}_{n}$ is a standard MLE of the rate parameter of an exponentially distributed random variable and satisfies all desired asymptotic properties.

Of course, the above procedure does not exploit a substantial amount of potentially useful data that is collected during busy periods. For the case of a deterministic patience level θ₀, the above estimator is easily improved upon. Recall that ${\hat{θ}}_{n}$ was defined as $\max {W_{1}, W_{2}, \dots, W_{n}}$ , with the immediate consequence that ${\hat{θ}}_{n} ⩽ θ_{0}$ . Supposing we observe an arrival such that immediately after this arrival the virtual waiting time level is still below the current value of the estimator, then the next arrival is an effective arrival and hence, occurs after an exponentially distributed random variable with rate λ. In this way, we generate more observations, thus allowing us to estimate λ with a better precision. Observe that the new observations and the idle times form an i.i.d. sequence.

For the case of a continuous patience-level distribution, one may use (33) or, alternatively, set up a joint MLE for λ and θ. The latter has clear advantages, in particular for small samples or heavily loaded systems. Because the log-likelihood function (17) is a smooth and concave function with respect to λ, the MLE of the arrival rate for any estimator ${\overset{˘}{θ}}_{n}$ of θ is

{\hat{λ}}_{n} ({\overset{˘}{θ}}_{n}) = {\overset{˘}{θ}}_{n} {[\frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{A_{i}} {\bar{H}}_{{\overset{˘}{θ}}_{n}} (W_{i - 1} + X_{i - 1} - u) ​d u 1_{{{\bar{H}}_{{\overset{˘}{θ}}_{n}} (W_{i}) > 0}}]}^{- 1} .

(34)

The asymptotic results of Section 3.1 can, therefore, be extended in a straightforward manner so as to cover the joint estimation of $(λ, θ)$ . We demonstrate this in Section 4.1, where we detail a procedure for jointly estimating the arrival rate and the patience parameter for the case of an exponentially distributed patience level.

4. Exponential and Generalized Hyperexponential Patience

In this section, we discuss a robust and practical approach for estimating continuous patience distributions. In our approach, this is achieved by fitting the MLE of a generalized hyperexponential (GHE) distribution. This approach is attractive because the class of GHE distributions is known to be dense in the space of nonnegative continuous distributions (see, for instance, Botta and Harris 1986), which, in practical terms, means that any nonnegative continuous distribution can be approximated arbitrarily closely by a GHE distribution. In our simulation-based experiments, highly accurate estimates are obtained, even when the baseline patience distribution itself is not GHE.

We first provide a detailed analysis of the joint MLE for the arrival rate and the single parameter of an exponential patience distribution to then move to the cases of hyperexponential and generalized hyperexponential patience distributions. We also present a heuristic search method that fits an approximate GHE distribution for any continuous patience distribution.

4.1. Exponentially Distributed Patience

Suppose that the arrival rate λ₀ is unknown and that the patience-level distribution is exponential, that is, $H_{θ_{0}} (x) = 1 - e^{- θ_{0} x}$ for an unknown θ₀ (and $x ⩾ 0$ ). Because of (17), it can be verified, by distinguishing between the cases $W_{0} + X_{0} ⩾ A_{1}$ and $W_{0} + X_{0} < A_{1}$ , that the log-likelihood for a single observation is

\begin{array}{l} ℓ_{1} (λ, θ) & = \log λ - θ W_{1} - λ \int_{0}^{A_{1}} e^{- θ {(W_{0} + X_{0} - u)}^{+}} ​d u \\ = \log λ - θ W_{1} - \frac{λ}{θ} (e^{- θ W_{1}} - e^{- θ (W_{0} + X_{0})} + θ {(A_{1} - W_{0} - X_{0})}^{+}); \end{array}

here the Lindley recursion (13) has been used.

Our objective is to analyze the MLE of both λ and θ for a sample of size n, which we denote by $({\hat{λ}}_{n}, {\hat{θ}}_{n})$ . We assume that (A1) and (A2) hold, that is, compact parameter space and stationary W₀. In addition, (A3) holds because $H_{θ}$ is exponential for all $θ \in Θ$ . The log-likelihood $ℓ_{1} (λ, θ)$ is Lipschitz continuous (with respect to the observations), and so (A5) holds. We thus have that Theorem 1 holds and the MLE for $(λ, θ)$ is strongly consistent.

Clearly, this is a smooth function. We proceed by computing the gradient and Hessian with respect to $(θ, λ)$ . It takes some elementary calculus to verify that gradient is given by

\nabla {(ℓ_{1} (λ, θ))}_{λ} = \frac{1}{λ} - \frac{1}{θ} (e^{- θ W_{1}} - e^{- θ (W_{0} + X_{0})} + θ {(A_{1} - W_{0} - X_{0})}^{+}),

(35)

\nabla {(ℓ_{1} (λ, θ))}_{θ} = - W_{1} - \frac{λ}{θ^{2}} [(θ (W_{0} + X_{0}) + 1) e^{- θ (W_{0} + X_{0})} - (θ W_{1} + 1) e^{- θ W_{1}}] .

(36)

For any given value of θ, $ℓ_{1} (λ, θ)$ is a concave function in λ, and therefore, the MLE of (34) is given by

{\hat{λ}}_{n} (θ) = {(\frac{1}{n} \sum_{i = 1}^{n} (e^{- θ W_{i}} - e^{- θ (W_{i - 1} + X_{i - 1})} + θ {(A_{i} - W_{i - 1} - X_{i - 1})}^{+}))}^{- 1} θ .

(37)

The MLE ${\hat{θ}}_{n}$ now follows by maximizing $\frac{1}{n} \sum_{i = 1}^{n} ℓ_{1} (θ, {\hat{λ}}_{n} (θ))$ for $θ \in Θ$ . The optimizing θ is obtained either on the boundary of the parameter space Θ or by solving the first-order condition $\nabla (ℓ_{1} {(θ, {\hat{λ}}_{n} (θ))}_{θ} = 0$ . Because $n \to \infty$ , we are guaranteed to find an interior solution as long as $θ_{0} \in Θ^{o}$ .

Taking second derivatives, we find that the entries of the Hessian matrix are given by

\begin{array}{l} \nabla^{2} {(ℓ_{1} (λ, θ))}_{λ, λ} = - \frac{1}{λ^{2}}, \\ \nabla^{2} {(ℓ_{1} (λ, θ))}_{θ, λ} = \nabla^{2} {(ℓ_{1} (λ, θ))}_{λ, θ} = - \frac{1}{θ^{2}} [(θ (W_{0} + X_{0}) + 1) e^{- θ (W_{0} + X_{0})} - (θ W_{1} + 1) e^{- θ W_{1}}], \\ \nabla^{2} {(ℓ_{1} (λ, θ))}_{θ, θ} = - \frac{λ}{θ^{3}} [(1 + {(θ W_{1} + 1)}^{2}) e^{- θ W_{1}} - (1 + {(θ (W_{0} + X_{0}) + 1)}^{2}) e^{- θ (W_{0} + X_{0})}] . \end{array}

Because $w^{k} e^{- θ w}$ is a bounded function on $w \in [0, \infty)$ for any $θ > 0$ and $k \in N$ , we conclude that $- E \nabla^{2} (ℓ_{1} (λ_{0}, θ_{0})) < \infty$ , and therefore, (A4) applies, which implies that Theorem 2 holds. In particular, because $n \to \infty$ ,

\sqrt{n} (\begin{matrix} {\hat{λ}}_{n} - λ_{0} \\ {\hat{θ}}_{n} - θ_{0} \end{matrix}) \to_{d} N (0, I (λ_{0}, θ_{0})),

and the asymptotic covariance is given by the inverse of the Hessian,

I (λ_{0}, θ_{0}) = - E \nabla^{2} (ℓ_{1} (λ_{0}, θ_{0})) .

We performed numerical experiments to assess the performance of the estimation procedure. Table 1 presents approximated confidence intervals for the maximum-likelihood estimators $({\hat{λ}}_{n}, {\hat{θ}}_{n})$ ; in addition, we also evaluated the estimator of the arrival rate based on idle periods, denoted by ${\tilde{λ}}_{n}$ , as discussed in Section 3.3. The confidence intervals are evaluated for three congestion levels, namely $ρ = λ E X \in {0.5, 1, 2}$ . In all experiments, the sample size (of observed waiting times) was $n = 1000$ . Evidently, however, the heavier the system load, the fewer the number of idle periods; on average, $C_{n} = C_{1000}$ equals 665, 457, and 267 for the three congestion levels.

Table 1. Confidence Intervals for the MLE s $({\hat{θ}}_{n}, {\hat{λ}}_{n})$ as Well as the Arrival Rate Estimated from Idle Periods ${\tilde{λ}}_{n}$ for Different Confidence Levels

Table 1. Confidence Intervals for the MLE s $({\hat{θ}}_{n}, {\hat{λ}}_{n})$ as Well as the Arrival Rate Estimated from Idle Periods ${\tilde{λ}}_{n}$ for Different Confidence Levels

Estimator \| Load		80%	90%	95%	99%
${\hat{θ}}_{n}$	$ρ = 0.5$	[0.423,0.592]	[0.403,0.620]	[0.386,0.645]	[0.352,0.691]
	ρ = 1	[0.454,0.555]	[0.441,0.571]	[0.431,0.586]	[0.407,0.612]
	ρ = 2	[0.469,0.536]	[0.460,0.546]	[0.453,0.555]	[0.440,0.573]
${\hat{λ}}_{n}$	$ρ = 0.5$	[0.958,1.048]	[0.948,1.061]	[0.937,1.074]	[0.918,1.099]
	ρ = 1	[0.955,1.053]	[0.944,1.067]	[0.933,1.081]	[0.912,1.104]
	ρ = 2	[0.951,1.061]	[0.936,1.079]	[0.924,1.096]	[0.901,1.128]
${\tilde{λ}}_{n}$	$ρ = 0.5$	[0.953,1.052]	[0.941,1.069]	[0.929,1.082]	[0.909,1.110]
	ρ = 1	[0.944,1.063]	[0.928,1.083]	[0.915,1.101]	[0.890,1.134]
	ρ = 2	[0.928,1.085]	[0.908,1.110]	[0.892,1.134]	[0.862,1.182]

Notes. A total of $n = 1 000$ waiting observations were generated, $M = 10 000$ times for each parameter setting. The parameter of the exponential patience parameter was $θ = 0.5$ , the arrival rate was λ = 1, and the service requirements followed a gamma distribution with parameters $(η, μ)$ with μ = 1 and $η \in {0.5, 1, 2}$ .

The numerical output is summarized in Table 1. A first observation is that the MLE for ${\hat{θ}}_{n}$ is more accurate as the system load increases. This is because the patience threshold is reached more often, and therefore, the data are more informative. Furthermore, for a lightly loaded system, the MLE ${\hat{λ}}_{n}$ and idle-period based estimator ${\tilde{λ}}_{n}$ yield similar (accurate) results, as opposed to the high-load regime, in which the two estimators behave quite differently:

As a consequence of the fact that idle periods are observed considerably less frequently for a high load, ${\tilde{λ}}_{n}$ becomes substantially less accurate.
The accuracy of ${\hat{λ}}_{n}$ , however, is only slightly reduced in the high-load regime. This is potentially because of the better accuracy of the estimation of θ, being jointly estimated with λ in the MLE procedure.

In the most heavily loaded example (ρ = 2, that is), the 95% confidence interval of the MLE is very similar to the 80% confidence interval of the idle period-based estimator. We conclude that even though the observations of the waiting times and the effective interarrival observations are highly dependent, the MLE ${\hat{λ}}_{n}$ provides substantially better confidence intervals than the idle period-based estimator ${\tilde{λ}}_{n}$ .

Table 2 presents empirical confidence intervals for the maximum-likelihood estimators of $(λ, θ)$ for a multiserver system with s = 5. The confidence intervals are wider than in the single-server case, even though the sample size was taken twice as big, being indicative of the fact that the variance of the estimation error in a system with multiple servers is higher. As in the single-server case, lower system load increases the variance of the estimation error for the patience parameter θ. This effect is even stronger in the multiserver setting because of the fact that balking occurs only when all five servers are working, and such a state is not frequently observed. For higher load, the estimation of the patience parameter is much more accurate, as was the case for the single-server system. For the arrival rate, we observe high accuracy for all load levels.

Table 2. Empirical Confidence Intervals for the MLEs $({\hat{θ}}_{n}, {\hat{λ}}_{n})$ for Different Confidence Levels

Table 2. Empirical Confidence Intervals for the MLEs $({\hat{θ}}_{n}, {\hat{λ}}_{n})$ for Different Confidence Levels

Estimator \| Load		80%	90%	95%	99%
${\hat{θ}}_{n}$	$ρ = 0.5$	$[0.282, 0.568]$	$[0.248, 0.618]$	$[0.221, 0.664]$	$[0.172, 0.771]$
	ρ = 1	$[0.364, 0.443]$	$[0.354, 0.455]$	$[0.345, 0.466]$	$[0.326, 0.486]$
	ρ = 2	$[0.382, 0.421]$	$[0.376, 0.427]$	$[0.371, 0.432]$	$[0.363, 0.441]$
${\hat{λ}}_{n}$	$ρ = 0.5$	$[0.972, 1.032]$	$[0.963, 1.040$	$[0.957, 1.048]$	$[0.945, 1.061]$
	ρ = 1	$[0.970, 1.035]$	$[0.961, 1.045]$	$[0.953, 1.053]$	$[0.939, 1.071]$
	ρ = 2	$[0.964, 1.041]$	$[0.953, 1.053]$	$[0.945, 1.063]$	$[0.929, 1.084]$

Notes. A total of $n = 2 000$ waiting observations were generated, $M = 10 000$ times for each parameter setting. The parameter of the exponential patience parameter was $θ = 0.4$ , and the arrival rate was λ = 1. The data were simulated for a system with s = 5 servers, and the service requirements followed a gamma distribution with parameters $(η, μ)$ with $μ = 0.8$ and $η \in {2, 4, 8}$ , corresponding to loads of $ρ = \frac{η}{s μ} \in {0.5, 1, 2}$ .

4.2. (Generalized) Hyperexponential Patience

This subsection focuses on the case of the patience-level distribution being generalized hyperexponential (GHE), meaning that the corresponding cdf can be written as a mixture of exponential cdfs with weights that sum to 1 but that are not necessarily positive; this is in contrast with the standard hyperexponential (HE) distribution, where the weights are assumed to be positive. This case is particularly relevant because, as argued in Roubos and Jouini (2013), this distribution has a good empirical fit-to-patience data. Another motivation for considering this distribution lies in the known fact that the cdf of any continuous positive random variable can be approximated arbitrarily accurately (in terms of a suitably defined metric) by a GHE cdf (Botta and Harris 1986).

For convenience, we now assume the arrival rate λ to be known, but it is noted that it can be estimated in a similar manner as in the exponential case discussed in Section 4.1. Suppose the degree of the GHE distribution is $p \in N$ :

{\bar{H}}_{θ} (x) = \sum_{k = 1}^{p} α_{k} e^{- β_{k} x}, x ⩾ 0,

where

θ = (α_{1}, \dots, α_{p}, β_{1}, \dots, β_{p}) \in R^{2 p}, \sum_{k = 1}^{p} α_{k} = 1

(where, importantly, we do not assume the α_k to be positive) and

β_{k} > 0

for

k = 1, \dots, p

. Without loss of generality, we assume that

α_{1} > α_{2} > \dots > α_{p}

. Denote the coordinates of the true parameter θ₀ by

(α_{0 k}, β_{0 k})

for

k = 1, \dots, p

. It means that we are to identify the 2p-dimensional parameter vector

θ_{0} = (α_{01}, \dots, α_{0 p}, β_{01}, \dots, β_{0 p}) .

Similarly to the previous example of the exponential distribution, Assumptions (A1)–(A3) and (A5) are satisfied when assuming stationary waiting times and a compact parameter space Θ. Therefore, the MLE is strongly consistent by Theorem 1. One should be cautious when fitting a GHE distribution because of further conditions to be imposed on the parameters to make sure that ${\bar{H}}_{θ} (\cdot)$ is a proper cdf. Even though convergence to the true parameters of the distribution is eventually guaranteed, for finite samples the estimated parameters may not yield a proper distribution. For an in-depth discussion on these conditions and the identifiability of GHE distributions, we refer to Botta and Harris (1986) (Section 3).

Using the representation (17), the log-likelihood pertaining to a single observation is calculated in a similar manner as in the exponential case, yielding

ℓ_{1} (θ) = \log λ + \log (\sum_{k = 1}^{p} α_{k} e^{- β_{k} W_{1}}) - λ \sum_{k = 1}^{p} \frac{α_{k}}{β_{k}} (e^{- β_{k} W_{1}} - e^{- β_{k} (W_{0} + X_{0})}) - λ {(A_{1} - W_{0} - X_{0})}^{+} .

As follows with some elementary algebra, the gradient is then given by, for $j = 1, \dots, p$ ,

\begin{array}{l} \nabla {(ℓ_{1} (θ))}_{α_{j}} = \frac{e^{- β_{j} W_{1}}}{\sum_{k = 1}^{p} α_{k} e^{- β_{k} W_{1}}} - \frac{λ}{β_{j}} (e^{- β_{j} W_{1}} - e^{- β_{j} (W_{0} + X_{0})}), \\ \nabla {(ℓ_{1} (θ))}_{β_{j}} = - \frac{W_{1} α_{j} e^{- β_{j} W_{1}}}{\sum_{k = 1}^{p} α_{k} e^{- β_{k} W_{1}}} - \frac{λ α_{j}}{β_{j}^{2}} [(β_{j} (W_{0} + X_{0}) + 1) e^{- β_{j} (W_{0} + X_{0})} - (β_{j} W_{1} + 1) e^{- β_{j} W_{1}}] . \end{array}

The Hessian can be derived by computing the matrix of second derivatives. For the evaluation of the asymptotic covariance, a convenient alternative is to apply (26), that is,

I (θ_{0}) = - E \nabla^{2} (ℓ_{1} (θ_{0})) = E [\nabla (ℓ_{1} (θ_{0})) \nabla {(ℓ_{1} (θ_{0}))}^{⊤}],

so as to avoid the symbolic evaluation of the matrix of second derivatives. The entries of

I (θ_{0})

are finite because these are combinations of terms of the following types:

Products of polynomials (of degree at most 2) and exponentials. For example, we come across a term that is, up to a multiplicative constant, $W_{0}^{2} e^{- β_{j} W_{0}}$ . Observe that the mapping $x \mapsto x^{k} e^{- x}$ is bounded for $x ⩾ 0 .$
Ratios of the form, for example, up to a multiplicative constant
$| \frac{W_{1} e^{- β_{j} W_{1}}}{\sum_{k = 1}^{p} α_{k} e^{- β_{k} W_{1}}} | ⩽ W_{1} .$

In addition, it is verified that if the patience-level distribution is light-tailed (which is the case for the GHE distribution), then the stationary waiting time W is also light-tailed and has finite moments. Indeed, note that (3) implies

v (x) ⩽ λ π_{0} + λ \int_{0}^{\infty} v (y) \bar{H} (y) ​d y = λ (1 - P_{ℓ}),

so that we have from (4) that the density of W is bounded as

w (x) ⩽ λ \bar{H} (x)

; that is, if the patience level distribution is light-tailed, so is W. As a consequence, from the above and Theorem 2, the estimation errors (that is, scaled by

\sqrt{n}

) converge to a multivariate normal distribution.

Although there are theoretical guarantees for the asymptotic performance of the MLE, computation of the MLE is not straightforward, even for small parameter spaces such as p = 3. It requires maximizing $ℓ_{n} (θ) / n$ , that is, solving a nonlinear and nonconcave program in $p \times (p - 1)$ variables, with both equality and inequality constraints. This is computationally highly challenging, and standard optimization methods may lead to local maxima. In particular, observe that $\nabla {(ℓ_{1} (θ))}_{β_{j}}$ is very small for large values of β_j, which implies that $ℓ_{n} (θ) / n$ displays very “flat” behavior for large values of these β_j. Our experiments revealed that a direct implementation using standard optimization packages often led to points that were even not local maxima. As a consequence, we decided to write dedicated code to compute the MLE.

In light of the inherent complexity of maximizing the likelihood function, in our numerical experiments we have applied the following nested two-step heuristic optimization method.

(a) For each vector of $α = (α_{1}, \dots, α_{p})$ , the objective function $ℓ_{n} (θ) / n$ was maximized with respect to $β = (β_{1}, \dots, β_{p})$ using a conventional coordinate descent algorithm. This step was typically fast and accurate; as for given $α$ , the objective function behaves nicely. Now we have reduced the problem to an optimization over $α .$
(b) Then, a standard L-BFGS quasi-Newton method (see e.g., Byrd et al. 1995) is applied to $ℓ_{n} (θ) / n$ , so as to search for the optimal vector $α$ , with $β$ being parameterized by $α$ . The optimization is carried out with the following constraints that ensure that the parameters yield a proper distribution (Botta and Harris 1986):
${α : α_{1} > α_{2} > \dots > α_{p}, \sum_{i = 1}^{p} α_{i} = 1, \sum_{i = 1}^{p} α_{i} β_{i} > 0} .$

Note that there is no firm guarantee that this heuristic method converges to the optimal parameters. To overcome this, the search for the optimizing vector $α$ was conducted multiple times for different initial values. The resulting method turned out to be time-consuming, especially for a relatively large sample size n and/or a relatively large dimension p of the parameter space. However, it typically returns considerably more robust results than off-the-shelf optimization routines (that in addition tended to converge very slowly).

Table 3 presents the marginal confidence intervals obtained by the normal approximation for an example of a HE cdf $H_{θ} (\cdot)$ with p = 2. Observe that the variance of the estimation error is large, even for a substantial sample size. The accuracy is much higher for the low rate of $β_{1} = 0.25$ than the higher rate of $β_{2} = 1$ . This may be explained by the fact that the likelihood function is almost flat for high values of β₂.

Table 3. Confidence Intervals for the MLEs $({\hat{α}}_{1 n}, {\hat{β}}_{1 n}, {\hat{β}}_{2 n})$ Based on a Normal Approximation for a Sample Size of $n = 100 000$ for Different Confidence Levels

Estimator \| Load		80%	90%	95%	99%
${\hat{α}}_{1 n}$	$ρ = 0.5$	[0.500,0.908]	[0.500,0.967]	[0.500,1.000]	[0.500,1.000]
	ρ = 1	[0.609,0.791]	[0.583,0.817]	[0.560,0.840]	[0.516,0.884]
	ρ = 2	[0.660,0.740]	[0.649,0.750]	[0.639,0.761]	[0.620,0.780]
${\hat{β}}_{1 n}$	$ρ = 0.5$	[0.183,0.317]	[0.164,0.336]	[0.147,0.353]	[0.115,0.385]
	ρ = 1	[0.223,0.277]	[0.216,0.284]	[0.209,0.291]	[0.196,0.304]
	ρ = 2	[0.240,0.260]	[0.237,0.262]	[0.235,0.265]	[ 0.230,0.270]
${\hat{β}}_{2 n}$	$ρ = 0.5$	[0.539,1.460]	[0.408,1.592]	[0.295,1.705]	[0.073,1.930]
	ρ = 1	[0.768,1.232]	[0.702,1.298]	[0.645,1.355]	[0.533,1.467]
	ρ = 2	[0.870,1.130]	[0.833,1.167]	[0.801,1.199]	[0.739,1.261]

Note. The true values of the parameters $(α_{1}, α_{2}, β_{1}, β_{2})$ of the HE patience were $(0.7, 0.3, 0.25, 1)$ , the arrival rate was λ = 1, and the service requirements followed a gamma distribution with parameters $(η, μ)$ with μ = 1 and $η \in {0.5, 1, 2}$ .

The results presented in Table 3 show that the estimates are concentrated around the true values. The variance of the estimation error, however, is high, even more so given the large number of observations used. It should be realized, though, that HE distributions are not always easily distinguishable, in the sense that seemingly different HE distributions may lead to a very similar cdf. Clearly, from a practical standpoint, the main question is not whether the correct parameters are recovered but rather whether the cdf corresponding to the estimated parameters provides a good fit with the true cdf. In this respect, the main conclusion of our experiments is that our MLE procedure performs remarkably well. For example, in Figure 4, we illustrate this by presenting the true cdf and the estimated cdf based on four random samples (using $n = 10 000$ observations). We have used the same parameters as in Table 3, with ρ = 1. We observe that, although the fitted parameters greatly differ from the correct ones, the fit of the (complementary) cdf is highly accurate.

Figure 4. Tail Distribution of HE Customer Patience Compared with the Estimated Counterparts
*Notes.* The true values of the parameters $(α_{1}, α_{2}, β_{1}, β_{2})$ are given by $(0.7, 0.3, 0.25, 1)$ . The fitted distributions $H_{{\hat{θ}}_{n}^{(i)}} (t)$ , with $i = 1, 2, 3, 4$ , are based on independent samples of $n = 10 000$ observations with the HE MLE for p = 2. The MLE parameters of the fitted functions are ${\hat{θ}}_{n, 1} = (0.529, 0.471, 0.193, 0.736),$ ${\hat{θ}}_{n, 2} = (0.884, 0.116, 0.297, 1.969), {\hat{θ}}_{n, 3} = (0.75, 0.25, 0.269, 1.031)$ , and ${\hat{θ}}_{n, 4} = (0.573, 0.427, 0.212, 0.746)$ . The arrival rate in the example is λ = 1, and the service requirements are gamma distributed with parameters (3, 2).

In the remainder of this subsection, we further study the performance of the HE MLE. Our experiments lead to the conclusion that HE MLE provides a good fit even in cases where the true distribution is not HE (i.e., in misspecified scenarios). The examples focus mainly on fitting the target distribution by a (conventional) HE distribution, rather than a GHE distribution, by computing the MLE. The motivation behind this choice is that, when working with a GHE distribution, there is the additional complication that the estimated parameters may not yield a proper cdf. More specifically, the space of valid parameters is hard to characterize, let alone to be coded in terms of constraints of an optimization problem. Nevertheless, we would still like to exploit the extra versatility that the GHE class offers so as to improve the fit (when compared with the HE MLE). This is especially relevant for scenarios in which the model is misspecified, bearing in mind that, as mentioned earlier, in principle any distribution can be arbitrarily accurately approximated by a GHE distribution.

With the above considerations in mind, we implemented the following (seemingly naïve) heuristic model selection method. We generate various random weight vectors $α$ (equipped with a random size $p \in N$ ). Then, for each of them, we optimize the likelihood function over $β$ using only step (a) above. The best model is then chosen by comparing the various combinations of $α$ and p, relying on the Akaike Information Criterion (AIC). The AIC encompasses both the log-likelihood and, in order to avoid overfitting, a penalty for the number of parameters (Burnham and Anderson 2002). Because step (a) can be performed highly efficiently, a main advantage of this heuristic is that it works very fast, even for bigger values of p, thus providing us with a technique to fit general continuous distributions. Recall from Botta and Harris (1986) that if we order the components of the weight vector (i.e., $α_{1} > α_{2} > \dots > α_{p}$ ), then a sufficient condition for $(α, β)$ to yield a cdf is that $α_{1} > 0$ and $\sum_{i = 1}^{p} α_{i} β_{i} > 0$ ; we use this principle to select feasible solutions produced by the above heuristic.

Figure 5, (a) and (c), illustrates the estimated survival function of using the HE MLE with varying weights for three examples of patience-level distributions. In the first example, the patience-level distribution is indeed HE with p = 4, whereas the second and third examples are misspecified (corresponding to a lognormal and gamma patience-level distribution, respectively). For each distribution, the MLE was computed using $p = 1,$ 2, and 4 (where p = 3 has been left out because it is barely distinguishable from p = 4).

As was the case in the setting of Figure 4, the experiments corresponding to the HE patience-level distribution show that, even for p = 4 and as many as $n = 10 000$ waiting-time observations, the MLE does not accurately capture the true parameters. Nevertheless, the fit in terms of the cdf, as displayed in Figure 5(a), is remarkably good. We in addition performed the GHE fitting heuristic, which also provides a highly accurate fit and is considerably faster to compute than the MLE for p = 4.
The case of lognormal patience, as illustrated by Figure 5(b), shows that the fit is quite good for all estimators, except for p = 1 (i.e., an exponential distribution).
Figure 5(c) presents the fitted distributions for gamma patience. The fit of the HE MLE is decent, but importantly, in this case the GHE heuristic performs considerably better than the HE MLE (for p = 1, 2, 4). The GHE heuristic selects a model with p = 10 weights.

The above observations are replicated for a multiserver system with s = 10. Figure 6, (a) and (b), plots the true and estimated patience distribution for lognormal and gamma distributions, respectively, using the misspecified hyperexponential MLE. As in the single-server case, the GHE approximation yields an accurate fit.

Figure 5. Tail Distribution of the Customer Patience Compared with its Estimated Counterparts
*Notes.* The true patience distributions $H_{θ_{0}} (t)$ are hyperexponential, lognormal, and gamma. The fitted distributions $H_{{\hat{θ}}_{n}^{(p)}} (t)$ are based on a sample of $n = 10 000$ observations, with the HE MLE for $p \in {1, 2, 4}$ and the GHE heuristic with p = 10. The arrival rate is λ = 1, and the service requirements follow a gamma distribution with parameters (3, 2). (a) HE (0.5, 0.3, 0.15, 0.05, 0.4, 1.5, 0.1, 1); (b) Lognormal (0.5, 1); (c) gamma (1.5, 0.5).

Figure 6. Tail Distribution of the Customer Patience Compared with its Estimated Counterparts
*Notes.* The true patience distributions $H_{θ_{0}}$ are lognormal and gamma. The fitted distributions $H_{{\hat{θ}}_{n}^{(p)}} (t)$ are based on a sample of $n = 30 000$ observations, with the HE MLE for $p \in {1, 3}$ and the GHE heuristic. The system has s = 10 has servers with an arrival rate of λ = 10, and the service requirements follow a gamma distribution with parameters (3, 2). (a) Lognormal (0.5, 1); (b) gamma (1.5, 0.5).

5. Applications

In this section, we discuss a series of applications in which our estimation methodology can be directly used.

5.1. Unknown Proportion of Impatient Customers

In this subsection, we consider a system where an unknown proportion $θ \in (0, 1)$ of the arriving customers has a known deterministic patience threshold w > 0. The other customers are patient and always join the queue. Suppose one wishes to estimate the fraction θ from data.

To this end, first observe that

H_{θ} (x) = θ 1_{{x ⩾ w}} .

Assumption (A4) is not satisfied because $H_{θ} (x)$ is not continuous (in x) at x = w. However, with respect to θ, we do have that $H_{θ} (x)$ is continuous. In addition, $ℓ_{n} (θ)$ is concave, and therefore, the MLE is given by the first-order condition $ℓ_{n}^{'} ({\hat{θ}}_{n}) = 0$ . In this case, we can apply results from Heijmans and Magnus (1986) to establish strong consistency. In particular, the smoothness of $ℓ_{n}^{'} (θ)$ enables direct verification of the sufficient condition (21). Asymptotic normality then follows by verifying the conditions of Theorem 2 directly.

5.2. Noisy Delay Messages

Suppose that the patience threshold is a constant θ, but the customers do not observe their exact waiting time but rather some noisy estimate $W_{e} = e (W)$ for some random function e such that $E [W_{e} | W] = W$ . We assume that customers join based on this “perturbed delay” W_e, that is, if $W_{e} ⩽ θ$ . Suppose we are in the context that the parameters underlying the noise distribution are known, but the threshold θ is not.

The probability of joining at virtual waiting time level W can be computed given the specific noise distribution. A few examples are as follows:

Additive perturbations. In this case, $W_{e} = W + ϵ$ . One could, for instance, consider normally distributed perturbations: $ϵ \sim N (0, σ^{2})$ , independent of W, with $σ > 0$ . Let customers facing $W_{e} < θ$ join the system. We thus have, with $Φ (\cdot)$ the cdf of the standard normal distribution,
${\bar{H}}_{θ} (v) = P (W_{e} ⩽ θ | W = v) = P (ϵ ⩽ θ - v) = Φ (\frac{θ - v}{σ}) .$
The asymptotic properties of Section 3.1 hold in this case because $Φ (\cdot)$ satisfies the regularity conditions.
Multiplicative perturbations. Now, W_e = WG, with G nonnegative unit-mean and independent of W. In this case,
${\bar{H}}_{θ} (v) = P (W_{e} ⩽ θ | W = v) = P (G ⩽ \frac{θ}{v}) .$

Now, if the random variable G is such that the regularity conditions of Section 3.1 are met, then it follows that the asymptotic properties hold in this case as well (with an asymptotic variance that can be expressed in terms of G).

5.3. Admission Pricing

Our estimation procedure can be exploited in the context of various problems rooted in operations research. Evidently, when having estimates of the arrival rate and the patience-level distribution, one could consider the option of increasing the service rate so as to potentially raise profits. Thus, the operational decision to be made is whether the increased revenues outweigh the cost of speeding up the service rate. In this subsection, we consider another operations research-related problem.

Suppose that the queue has an admission price of p and that customers are homogenous with a utility function as featured in the Naor (1969) model. More concretely, let there be constants r and c such that a customer will join the queue only if the virtual waiting time w upon the customer’s arrival satisfies

r - p - c w ⩾ 0,

or, alternatively, w is smaller than the threshold value

(r - p) / c .

Now observe that for any fixed price p the threshold

θ (p) = (r - p) / c

can be estimated using the MLE procedure presented in Section 3.2. If one of the cost function parameters r and c is known, then the other parameter can be estimated directly. If both are unknown, then their estimation can be performed by an exploration procedure; set two prices, say, p₁ and p₂, and observe the system for each price. Supposing each of these two experiments is done with n clients, we let

{\hat{θ}}_{n} (p_{i})

be the MLE for price i = 1, 2. Then the estimators for the cost function parameters r and c are given by, respectively,

{\hat{r}}_{n} = \frac{{\hat{θ}}_{n} (p_{1}) p_{2} - {\hat{θ}}_{n} (p_{2}) p_{1}}{{\hat{θ}}_{n} (p_{1}) - {\hat{θ}}_{n} (p_{2})}, {\hat{c}}_{n} = \frac{{\hat{r}}_{n} - p_{1}}{{\hat{θ}}_{n} (p_{1})} .

Theorem 3, in combination with the continuous mapping theorem, implies that, because $n \to \infty$ , both ${\hat{r}}_{n} \to_{as} r$ and ${\hat{c}}_{n} \to_{as} c$ . Furthermore, for large n, confidence intervals for the stationary average revenue per unit of time can be approximated using (30). In particular, observing that the loss probability depends on the price, in self-evident notation,

n (θ (p_{i}) - {\hat{θ}}_{n} (p_{i})) \to_{d} Exp [\frac{v_{i} (θ (p_{i}))}{1 - P_{ℓ} (p_{i})}], i = 1, 2,

because

n \to \infty .

This opens the possibility of approximating, for n large, the (joint) distribution of

n ({\hat{r}}_{n} - r_{0}, {\hat{c}}_{n} - c_{0}))

. Note that the loss probability

P_{ℓ}

needs to be estimated as well. For the single-server case, this can be done by estimating the idle probability upon arrival for a given price and using (5).

6. Concluding Remarks

This paper has considered a service system in which clients potentially balk based on the virtual waiting time level they face at arrival. The main objective concerned the development of a framework for estimating the arrival rate and patience-level distribution. Our approach resolves the complication that in our setup only nonbalking clients are observed. Distinguishing between the case of a continuous patience-level distribution and constant patience, we developed MLE estimators and quantified their asymptotic properties. Through a sequence of examples and ramifications, we have illustrated the performance and broad applicability of our findings.

An important next step could concern the extension to a nonstationary arrival process. For example, one may assume that the potential arrival process is a nonhomogeneous Poisson process with an arrival rate function that depends on time, that is, ${λ (t) : t ⩾ 0}$ . This means that when the virtual waiting time is v, the effective arrival rate at time t is given by $λ (t) (1 - H (v))$ , and the likelihood function in Section 2 can be updated accordingly. If the arrival rate function is known, then our estimation procedure for θ essentially carries over, including its performance guarantees. If the arrival rate function is unknown and parametric assumptions about it are made, then the joint estimation of the arrival rate and patience parameters is possible. Furthermore, in the practically relevant case that the time-dependent arrival rate is cyclic (i.e, $λ (t) = λ (t + s)$ for all $t ⩾ 0$ and some cycle length $s ⩾ 0$ ; think of daily or weekly patterns), then the queueing process will still have regenerative dynamics that can be exploited for asymptotic analysis. It is thus anticipated that the framework laid out in this paper can serve as a basis for developing estimation techniques for a wide class of nonstationary models.

Although we have developed estimation procedures assuming the independence between the customers’ service requirements and patience levels, one could consider an extension that allows dependence. Such an extension could, for example, model the situation in which customers with larger service requirements can be assumed to have more patience. Multiclass queueing models can be used to capture such dependence in service requirements and patience levels (Sakuma and Takine 2017). More specifically, suppose that there are two customer classes 1 and 2, where class k (k = 1, 2) customers arrive with rate λ_k and have service requirements (resp. patience levels), i.i.d. according to a class-dependent cdf $G_{k} (\cdot)$ (resp. $H_{k} (\cdot)$ ). It is natural to assume that the customer class is unobservable by the estimator, thus yielding a sequence of dependent service requirements and patience levels. In this setting, we have to estimate the service-requirement cdf $G_{k} (\cdot)$ jointly with the patience-level cdf $H_{k} (\cdot)$ (and the arrival rate λ_k) to compute the likelihood function corresponding to an observed sequence $(A, W, X)$ .

Another possible direction for future work concerns the situation in which each customer decides to balk based on his or her sojourn time, that is, the virtual waiting time seen by this customer just after (instead of before) his or her arrival. If the customer precisely knows his or her service requirement, this problem is easily reduced to the one considered in this paper. The other obvious option is that the customer balks if the virtual waiting time just before his or her arrival increased by a “guess” of his or her service requirement exceeds the customer’s patience threshold.

An alternative to the approach followed in the paper would be to pursue nonparametric estimation, cf. the results in Hansen and Pitts (2006) for the conventional case without balking. In the single-server case, it may be of help that we can write, with W(x) and D(x) denoting the cdf s of the waiting times and sojourn times of nonbalking customers, respectively,

\bar{H} (x) = \frac{w (x)}{λ (W (x) - D (x))}

(38)

if w(x) > 0 and 0 else; the validity of (38) follows from combining (4) with a level-crossing identity. Whereas it is clear how to estimate W(x) and D(x) using the empirical distribution, estimation of w(x) is more challenging; kernel-based techniques may turn out useful in this context.

A final issue to consider is that in many systems customers do not observe exact waiting times but rather queue lengths. Specifically, this can be modeled as an M/G/s + H system with customers that join or balk after having observed the number of customers in the system. In this case, the patience level Y is a discrete random variable indicating at what queue length a customer is willing to join. Therefore, the interarrival process at any given queue length $q ⩾ 0$ is a Poisson process with rate $λ_{q} = λ {\bar{H}}_{θ} (q)$ . As was done in this paper, and MLE for the arrival rates (and the corresponding parameter θ) can be derived from a sample of interarrival times and the respective queue lengths.

Acknowledgments

The authors are grateful to the associate editor and two referees for their helpful comments and feedback. The authors also thank Onno Boxma (Eindhoven University of Technology, The Netherlands), Avi Mandelbaum (Technion, Haifa, Israel), Ran Snitkovsky (Columbia University & CUHK Shenzhen), and Galit Yom-Tov (Technion, Haifa, Israel) for providing useful feedback.

Appendix: Proofs

Proof of Lemma 1.

Assuming (A1)–(A3) and (A5), we verify the conditions for (20) as given in Ranga Rao (1962), (theorem 6.4). These are as follows: (i) Θ is compact, (ii) ${(A_{n}, W_{n}, X_{n})}_{n ⩾ 0}$ is a stationary ergodic sequence, (iii) the function $\frac{1}{n} ℓ_{n} (θ)$ converges almost surely pointwise to $ℓ (θ)$ , (iv) the set ${ℓ_{i} (θ) : θ \in Θ}$ is equicontinuous with respect to (A, X, W), and (v) there exists a function $K : {[0, \infty)}^{4} \to R$ such that

ℓ_{1} (θ; a, z, y, x) ⩽ K (a, z, y, x), \forall θ \in Θ; E [K (A_{1}, W_{0}, W_{1}, X_{0})] < \infty .

(A.1)

Condition (i) is assumed in (A1). Assumption (A2) states that W_n is a stationary ergodic sequence, which also implies that the same is true for A_n and X_n, thus yielding the validity of (ii). Note that conditional on $W_{i} = z$ the distribution of A_i is determined by z and the external arrival process. Moreover, the upward-jump size is determined by the new service requirement together with the vector of residual service requirements in the system, which are assumed to be stationary and ergodic. Condition (iii) is (18). The equicontinuity assumption (A5) on $H_{θ}$ implies that the same holds for the set of log-likelihood functions because both the log and the integral terms in (17) are continuous functions with respect to the observations $(A, X, W)$ so that we have established (iv). In the remainder of the proof, we verify (v).

Recall, from (A3), the definition ${\bar{H}}_{\inf} (y) = \inf_{θ \in Θ} {\bar{H}}_{θ} (y)$ for any $y ⩾ 0$ . Because Θ is compact and all functions $H_{θ}$ are continuous, we have that ${\bar{H}}_{\inf} (\cdot)$ is a continuous monotone nonincreasing function such that ${\bar{H}}_{\inf} (0) \in (0, 1]$ . Defining

K (a, y) ≔ | \log λ | + | \log {\bar{H}}_{\inf} (y) | 1_{{{\bar{H}}_{\inf} (y) > 0}} + λ a,

by recalling that

H_{θ} (y) ⩽ 1

for any

y ⩾ 0

, we conclude from (17) and the triangle inequality that

| ℓ_{1} (θ; a, z, y, x) | ⩽ K (a, y), \forall a, z, y, x ⩾ 0 .

Clearly, $E A_{1} < \infty$ , and so to complete the proof we need to show that $E | \log {\bar{H}}_{\inf} (W_{1}) | < \infty$ . Recall that $W_{1} \sim W_{0}$ is the stationary virtual waiting time seen by an effective arrival, whose pdf $w (\cdot)$ is given by (cf. (4))

w (y) = \frac{v (y) {\bar{H}}_{θ_{0}} (y)}{1 - P_{ℓ}} .

As mentioned in Section 3.1.1, (A5) ensures that the support of the distribution $H_{θ}$ does not depend on θ. The above equation thus implies $W_{0} < h_{\sup}$ with probability one.

Note that we have

E | \log {\bar{H}}_{\inf} (W_{0}) | = P (W_{0} = 0) (- \log ({\bar{H}}_{\inf} (0))) + P (W_{0} > 0) E [- \log ({\bar{H}}_{\inf} (W_{0})) | W_{0} > 0],

where the first term on the right-hand side is always finite. We thus provide a proof of

E [- \log ({\bar{H}}_{\inf} (W_{0})) | W_{0} > 0] < \infty

below.

Regardless of whether $h_{\sup} < \infty$ or $h_{\sup} = \infty$ , it follows from (A3) that for any $ϵ \in (0, \infty)$ , there exist $0 < y_{1}^{⋆} < h_{\sup}$ and $0 < y_{2}^{⋆} < h_{\sup}$ such that

\frac{e^{- f (y)}}{{\bar{H}}_{\inf} (y)} < c_{1} + ϵ, y ⩾ y_{1}^{⋆}, 0 < f (y) {\bar{H}}_{θ_{0}} (y) < c_{2} + ϵ, y ⩾ y_{2}^{⋆} .

Using these inequalities, with $y^{⋆} ≔ \max {y_{1}^{⋆}, y_{2}^{⋆}}$ , we obtain

\begin{array}{l} (1 - P_{ℓ}) E [- \log ({\bar{H}}_{\inf} (W_{0})) | W_{0} > 0] \\ = (1 - P_{ℓ}) \int_{0}^{h_{\sup}} (- \log ({\bar{H}}_{\inf} (y))) w (y) ​d y \\ = \int_{0}^{h_{\sup}} (- \log ({\bar{H}}_{\inf} (y))) {\bar{H}}_{θ_{0}} (y) v (y) ​d y \\ < \int_{0}^{y^{⋆}} (- \log ({\bar{H}}_{\inf} (y))) {\bar{H}}_{θ_{0}} (y) v (y) ​d y + \int_{y^{⋆}}^{h_{\sup}} (- \log (\frac{e^{- f (y)}}{c_{1} + ϵ})) \frac{c_{2} + ϵ}{f (y)} \cdot v (y) ​d y \\ \leq - \log ({\bar{H}}_{\inf} (y^{⋆})) \int_{0}^{y^{⋆}} v (y) ​d y + \int_{y^{⋆}}^{h_{\sup}} (f (y) + | \log (c_{1} + ϵ) |) \cdot \frac{c_{2} + ϵ}{f (y)} \cdot v (y) ​d y \\ \leq - \log ({\bar{H}}_{\inf} (y^{⋆})) + (c_{2} + ϵ) \int_{y^{⋆}}^{h_{\sup}} v (y) ​d y + \frac{| \log (c_{1} + ϵ) | (c_{2} + ϵ)}{f (y^{⋆})} \int_{y^{⋆}}^{h_{\sup}} v (y) ​d y \\ \leq - \log ({\bar{H}}_{\inf} (y^{⋆})) + (c_{2} + ϵ) (1 + \frac{| \log (c_{1} + ϵ) |}{f (y^{⋆})}) < \infty, \end{array}

which implies

E [- \log ({\bar{H}}_{\inf} (W_{0})) | W_{0} > 0] < \infty

. □

Proof of Lemma 2.

(a) With Lemma 1 and Theorem 1 at our disposal, the proof essentially follows standard arguments. For instance, see the proof of Ferguson (1996) (theorem 18) for i.i.d. observations or the proof of Ravner et al. (2019) (lemma 14) that applies a martingale CLT for an MLE based on dependent workload observations in the M/G/1-queue context.

(b) We construct a martingale CLT for stationary sequences (Billingsley 1999, theorem 18.3) and verify that the corresponding conditions are satisfied. Let $Z_{i} ≔ {\dot{ℓ}}_{i} (θ_{0}) - {\dot{ℓ}}_{i - 1} (θ_{0})$ for $i \in {1, 2, \dots}$ , where ${\dot{ℓ}}_{0} (θ_{0}) ≔ 0$ . The identifiability condition (19) implies that the smooth function $ℓ (θ)$ is maximized at θ₀, and hence, $\dot{ℓ} (θ_{0}) ≔ \nabla ℓ (θ_{0}) = 0$ , that is, all coordinates equal zero. Therefore, if $W_{i - 1}$ is stationary, we have that

E [{(Z_{i})}_{k} | W_{i - 1}] = E [{({\dot{ℓ}}_{1} (θ_{0}))}_{k} | W_{0}] = {(\dot{ℓ} (θ_{0}))}_{k} = 0, \forall k \in {1, \dots, p},

and

Z_{n i} ≔ K_{n}^{- 1} Z_{i}, i \in {1, \dots, n},

is a martingale difference, where

K_{n} K_{n}^{⊤} = Var [{\dot{ℓ}}_{n} (θ_{0})]

is the Cholesky decomposition of the covariance matrix of

{\dot{ℓ}}_{n} (θ_{0})

. Furthermore,

\frac{1}{n} Var [{\dot{ℓ}}_{n} (θ_{0})] = Var [\frac{1}{\sqrt{n}} {\dot{ℓ}}_{n} (θ_{0})] \to_{as} - E Ψ_{1} (θ_{0}) = I (θ_{0}),

because

n \to \infty

, and because

I (θ_{0}) < \infty

, by (A4) we have that

\lim_{n \to \infty} {(K_{n})}_{k} = \infty

for all

k = 1, \dots, p

, and in addition,

\lim_{n \to \infty} \frac{K_{n}}{\sqrt{n}} = K,

where K is defined through

K K^{⊤} = I (θ_{0})

. Therefore by Heyde (1997), (theorem 12.6; which is the multi-dimensional counterpart of Billingsley 1999, theorem 18.1) we conclude that

\sum_{i = 1}^{n} Z_{n i}

converges to a standard normal random variable, and thus

\frac{1}{\sqrt{n}} {\dot{ℓ}}_{n} (θ_{0}) = \frac{1}{\sqrt{n}} K_{n} \sum_{i = 1}^{n} Z_{n i} \to_{d} K N (0, 1),

which is equivalent to (27). □

Proof of Theorem 3.

This proof consists of a lower bound and an upper bound.

$▹$ Lower bound. Note that for $θ_{0} > W_{k - 1}$ , we have

A_{k} \sim {\begin{array}{l} E_{λ, k}, & W_{k - 1} + X_{k - 1} ⩽ θ_{0}, \\ W_{k - 1} + X_{k - 1} - θ_{0} + E_{λ, k}, & W_{k - 1} + X_{k - 1} > θ_{0}, \end{array}

(A.2)

where

{(E_{λ, k})}_{k = 1, 2, \dots}

denotes a sequence of i.i.d. exponentially distributed random variables with parameter λ. Let

N_{n}^{+} \subset {1, 2, \dots, n}

denote the set of indices of observed virtual waiting times such that

N_{n}^{+} = {k \in {1, 2, \dots, n} : W_{k - 1} + X_{k - 1} ⩾ θ_{0}} .

We see from (13) and (A.2) that given $k \in N_{n}^{+}$ , the observed virtual waiting time W_k is stochastically identical to $\max {0, θ_{0} - A_{k}^{'}}$ , where ${(A_{k}^{'})}_{k = 0, 1, \dots}$ denotes a sequence of i.i.d. random variables that are exponentially distributed with parameter λ. Therefore, ${(W_{k})}_{k \in N_{n}^{+}}$ are i.i.d. with

P (W_{k} ⩽ x | k \in N_{n}^{+}) = e^{- λ (θ_{0} - x)}, x \in [0, θ_{0}],

(A.3)

so that

P (\max_{_{i \in N_{n}^{+}}} W_{i} ⩽ x) = E [\prod_{i \in N_{n}^{+}} 1_{{W_{i} ⩽ x}}] = E [e^{- λ (θ_{0} - x) \cdot | N_{n}^{+} |}],

(A.4)

where

| N_{n}^{+} |

denotes the number of elements in

N_{n}^{+}

Obviously, we have $θ_{0} ⩾ {\hat{θ}}_{n}$ and

{\hat{θ}}_{n} ⩾ \max_{_{i \in N_{n}^{+}}} W_{i}, a . s .

(A.5)

Let, as before, $v (\cdot)$ and $w (\cdot)$ denote the pdfs of the stationary virtual waiting time and the stationary waiting time of nonbalking customers, and let $P_{ℓ}$ denote the stationary loss probability. From the ergodicity, we have

\frac{| N_{n}^{+} |}{n} \to_{as} \frac{q_{s - 1}}{1 - P_{ℓ}} \cdot {\bar{J}}_{| 0} (θ_{0}) + \int_{0}^{θ_{0}} w (y) {\bar{J}}_{| y} (θ_{0} - y) ​d y = \frac{v (θ_{0})}{λ (1 - P_{ℓ})},

(A.6)

because

n \to \infty

, where the second equality follows from (3) and (4).

Because (A.6) implies $| N_{n}^{+} | \to_{as} \infty$ , we have, for any $ϵ > 0$ , using (A.4) and (A.5),

\begin{array}{l} P (| {\hat{θ}}_{n} - θ_{0} | > ϵ) = P ({\hat{θ}}_{n} < θ_{0} - ϵ) \leq P (\max_{_{i \in N_{n}^{+}}} W_{i} ⩽ θ_{0} - ϵ) \\ = E [e^{- λ ϵ \cdot | N_{n}^{+} |}] \to 0 \end{array}

because

n \to \infty

. Therefore, we have that

{\hat{θ}}_{n} \to_{P} θ_{0}

because

n \to \infty

. Convergence in probability implies that there exists a subsequence

{({\hat{θ}}_{n_{m}})}_{m = 1}^{\infty}

such that

{\hat{θ}}_{n_{m}} \to_{as} θ_{0}

m \to \infty

, and because

{\hat{θ}}_{n}

is a monotone, nondecreasing sequence, we conclude (29), that is, the MLE is strongly consistent.

Furthermore, it follows from (A.4) and (A.6) that

\begin{array}{l} P (n {θ_{0} - \max_{_{i \in N_{n}^{+}}} W_{i}} ⩾ x) = P (\max_{_{i \in N_{n}^{+}}} W_{i} ⩽ θ_{0} - \frac{x}{n}) \\ = {\begin{matrix} E [e^{- λ x \cdot | N_{n}^{+} | / n}], & 0 ⩽ x < n θ_{0}, \\ 0, & x ⩾ n θ_{0}, \end{matrix} \end{array}

(A.7)

so that for each

x ⩾ 0

, we obtain from the continuous mapping theorem,

\lim_{n \to \infty} P (n {θ_{0} - \max_{_{i \in N_{n}^{+}}} W_{i}} ⩾ x) = e^{- v (θ_{0}) x / (1 - P_{ℓ})} .

(A.8)

$▹$ Upper bound. Let $N_{n}^{⋆} \subset {1, 2, \dots, n}$ denote the set of indices of observed virtual waiting times such that

N_{n}^{⋆} = {k \in {1, 2, \dots, n}; W_{k - 1} + X_{k - 1} > \max_{_{i \in {1, 2, \dots, k - 1}}} W_{i}} .

By definition, we have

{\hat{θ}}_{n} = \max_{_{i \in N_{n}^{⋆}}} W_{i} .

(A.9)

For $k = 1, 2, \dots, n$ , we define

\begin{array}{l} Z_{k} ≔ \max {0, θ_{0} - [A_{k} - \max {0, W_{k - 1} + X_{k - 1} - θ_{0}}]} \\ = {\begin{array}{l} \max {0, θ_{0} - A_{k}}, & W_{k - 1} + X_{k - 1} ⩽ θ_{0}, \\ \max {0, W_{k - 1} + X_{k - 1} - A_{k}}, & W_{k - 1} + X_{k - 1} > θ_{0} . \end{array} \end{array}

From (13) and (A.2), it follows that $W_{i} ⩽ Z_{i}$ a.s. for $i = 0, 1, \dots, n$ . We then have from (A.9),

{\hat{θ}}_{n} ⩽ \max_{_{i \in N_{n}^{⋆}}} Z_{i}, a . s .

(A.10)

Furthermore, Z_k is stochastically identical to $\max {0, θ_{0} - E_{λ, k}}$ , where ${(E_{λ, k})}_{k = 1, 2, \dots}$ are i.i.d. exponential random variables with parameter λ, as defined above. ${Z_{k}}_{k \in N_{n}^{⋆}}$ thus forms a sequence of i.i.d. nonnegative random variables with the same cdf as (A.3) so that (cf. (A.4))

P (\max_{_{i \in N_{n}^{⋆}}} Z_{i} ⩽ x) = E [e^{- λ (θ_{0} - x) \cdot | N_{n}^{⋆} |}],

which implies (cf. (A.7))

P (n {θ_{0} - \max_{_{i \in N_{n}^{⋆}}} Z_{i}} ⩾ x) = {\begin{matrix} E [e^{- λ x \cdot | N_{n}^{⋆} | / n}], & 0 ⩽ x < n θ_{0}, \\ 0, & x ⩾ n θ_{0} . \end{matrix}

(A.11)

Let $ϕ_{θ_{0}} (x)$ ( $x \in (0, θ_{0}]$ ) denote the stationary probability that the virtual waiting time just after an acceptance of a customer takes a value in $(θ_{0} - x, θ_{0})$ :

ϕ_{θ_{0}} (x) = π_{0} P (θ_{0} - x < X_{| 0} < θ_{0}) + \int_{0}^{θ_{0}} v (y) P (θ_{0} - y - x < X_{| y} < θ_{0} - y) ​d y .

Because we have $\int_{0}^{θ_{0}} v (y) d y < \infty$ , it follows from the dominated convergence theorem that

\lim_{x \to 0 +} ϕ_{θ_{0}} (x) = 0,

so that we have

\forall δ > 0, \exists d_{0} > 0 : ϕ_{θ_{0}} (d_{0}) < \frac{δ}{4} .

(A.12)

Now define $N_{n}^{⋆⋆} ≔ N_{n}^{⋆} ∖ N_{n}^{+}$ , which concretely means that

N_{n}^{⋆⋆} = {k \in {1, 2, \dots, n}; \max_{_{i = 1, 2, \dots, k - 1}} W_{i} < W_{k - 1} + X_{k - 1} < θ_{0}} .

Because $N_{n}^{+}$ and $N_{n}^{⋆⋆}$ are two disjoint sets whose union is $N_{n}^{⋆}$ , we have that

| N_{n}^{⋆} | = | N_{n}^{+} | + | N_{n}^{⋆⋆} | .

(A.13)

Note that (29) implies

\exists n_{0} \in {0, 1, \dots}, P ({\hat{θ}}_{n} < θ_{0} - d_{0}) < \frac{δ}{4} for n = n_{0}, n_{0} + 1, \dots .

(A.14)

Because ${\hat{θ}}_{n}$ is nondecreasing in n a.s., we have for $n = n_{0}, n_{0} + 1, \dots$ ,

\begin{array}{l} E [| N_{n}^{⋆⋆} |] = E [| N_{n_{0}}^{⋆⋆} | + | N_{n}^{⋆⋆} ∖ N_{n_{0}}^{⋆⋆} |] \\ = E [| N_{n_{0}}^{⋆⋆} |] + P ({\hat{θ}}_{n_{0}} < θ_{0} - d_{0}) \cdot E [| N_{n}^{⋆⋆} ∖ N_{n_{0}}^{⋆⋆} | | {\hat{θ}}_{n_{0}} < θ_{0} - d_{0}] \\ + P ({\hat{θ}}_{n_{0}} ⩾ θ_{0} - d_{0}) \cdot E [| N_{n}^{⋆⋆} ∖ N_{n_{0}}^{⋆⋆} | | {\hat{θ}}_{n_{0}} ⩾ θ_{0} - d_{0}] \\ \leq E [| N_{n_{0}}^{⋆⋆} |] + P ({\hat{θ}}_{n_{0}} < θ_{0} - d_{0}) \cdot (n - n_{0} + 1) \\ + P ({\hat{θ}}_{n_{0}} ⩾ θ_{0} - d_{0}) \cdot E [| {k \in {n_{0}, n_{0} + 1, \dots, n}; d_{0} < W_{k - 1} + X_{k - 1} < θ_{0}} |] . \end{array}

It then follows from (A.12) and (A.14) that

\begin{array}{l} E [\frac{| N_{n}^{⋆⋆} |}{n}] \leq E [\frac{| N_{n_{0}}^{⋆⋆} |}{n}] + \frac{δ}{4} \cdot \frac{n - n_{0} + 1}{n} \\ + \frac{n - n_{0} + 1}{n} E [\frac{| {k \in {n_{0}, n_{0} + 1, \dots, n}; d_{0} < W_{k - 1} + X_{k - 1} < θ_{0}} |}{n - n_{0} + 1}] \\ \to \frac{δ}{4} + ϕ_{θ_{0}} (d_{0}) \end{array}

because

n \to \infty

, which is majorized by

δ / 2 .

We conclude that we have established that

E [| N_{n}^{⋆⋆} | / n] ⩽ δ

for sufficiently large n. Because

δ > 0

is arbitrary, we find

\lim_{n \to \infty} E [\frac{| N_{n}^{⋆⋆} |}{n}] = 0 .

(A.15)

Using the Markov Inequality, we have for any $ϵ > 0$ ,

P (\frac{| N_{n}^{⋆⋆} |}{n} > ϵ) \leq \frac{1}{ϵ} \cdot E [\frac{| N_{n}^{⋆⋆} |}{n}],

so that (A.15) immediately implies that

| N_{n}^{⋆⋆} | / n \to_{P} 0

because

n \to \infty

. Upon combining the above, we have from (A.6) and (A.13),

\frac{| N_{n}^{⋆} |}{n} \to_{P} \frac{v (θ_{0})}{λ (1 - P_{ℓ})} (n \to \infty) .

(A.16)

Therefore, we obtain from (A.11),

\lim_{n \to \infty} P (n {θ_{0} - \max_{_{i \in N_{n}^{⋆}}} Z_{i}} ⩾ x) = e^{- v (θ_{0}) x / (1 - P_{ℓ})} .

(A.17)

Finally, noting that (A.5) and (A.10) imply

P (n {θ_{0} - \max_{_{i \in N_{n}^{⋆}}} Z_{i}} ⩾ x) \leq P (n (θ_{0} - {\hat{θ}}_{n}) ⩾ x) \leq P (n {θ_{0} - \max_{_{i \in N_{n}^{+}}} W_{i}} ⩾ x),

we have from (A.8) and (A.17),

\lim_{n \to \infty} P (n (θ_{0} - {\hat{θ}}_{n}) ⩾ x) = e^{- v (θ_{0}) x / (1 - P_{ℓ})} .

Also, (31), (A.5), and (A.10) yield the following bounds for the scaled variance of the estimation error:

\begin{array}{l} n^{2} Var [θ_{0} - {\hat{θ}}_{n}] ⩾ E [{n (θ_{0} - \max_{_{i \in N_{n}^{⋆}}} Z_{i})}^{2}] - E {[{n (θ_{0} - \max_{_{i \in N_{n}^{+}}} W_{i})}]}^{2}, \\ n^{2} Var [θ_{0} - {\hat{θ}}_{n}] ⩽ E [{n (θ_{0} - \max_{_{i \in N_{n}^{+}}} W_{i})}^{2}] - E {[n (θ_{0} - \max_{_{i \in N_{n}^{⋆}}} Z_{i})]}^{2} . \end{array}

We can (i) explicitly calculate the right-hand sides of these inequalities using (A.7) and (A.11) and (ii) verify (with (A.6) and (A.16)) that their limits (because $n \to \infty$ ) coincide and are given by ${((1 - P_{ℓ}) / v (θ_{0}))}^{2}$ ; we omit the details of these straightforward calculations. □

References

Akşin Z, Ata B, Emadi SM, Su CL (2013) Structural estimation of callers’ delay sensitivity in call centers. Management Sci. 59(12):2727–2746.Link, Google Scholar
Andrews D (1987) Consistency in nonlinear econometric models: A generic uniform law of large numbers. Econometrica 55(6):1465–1471.Google Scholar
Asanjarani A, Nazarathy Y, Taylor P (2021) A survey of parameter and state estimation in queues. Queueing Syst. 97:39–80.Google Scholar
Asmussen S (2003) Applied Probability and Queues, 2nd ed. (Springer, New York).Google Scholar
Baccelli F, Hebuterne G (1981) On queues with impatient customers. Doctoral dissertation, INRIA.Google Scholar
Baccelli F, Boyer P, Hebuterne G (1984) Single-server queues with impatient customers. Adv. in Appl. Probab. 16:887–905.Google Scholar
Billingsley P (1999) Convergence of Probability Measures. (Wiley, Chichester, UK)Google Scholar
Botta R, Harris C (1986) Approximation with generalized hyperexponential distributions: Weak convergence results. Queueing Syst. 1:169–190.Google Scholar
Boxma O, Perry D, Stadje W (2011) The M/G/1+G queue revisited. Queueing Syst. 67:207–220.Google Scholar
Boxma O, Perry D, Stadje W, Zacks S (2010) The busy period of an M/G/1 queue with customer impatience. J. Appl. Probab. 47:130–145.Google Scholar
Brown L, Gans N, Mandelbaum A, Sakov A, Shen H, Zeltyn S, Zhao L (2005) Statistical analysis of a telephone call center: A queueing-science perspective. J. Amer. Statist. Assoc. 100 (469):36–50.Google Scholar
Burnham K, Anderson D (2002) Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (Springer, New York).Google Scholar
Byrd RH, Lu P, Nocedal J, Zhu C (1995) A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 16:1190–1208.Google Scholar
Castellanos A, Yom-Tov GB, Goldberg Y (2020) Silent abandonment in contact centers: Estimating customer patience from uncertain data. Working paper, https://gality.net.technion.ac.il/files/2019/12/Castellanos_Silent_Abandonment_Paper.pdf.Google Scholar
Courcoubetis C, Kesidis G, Ridder A, Walrand J, Weber R (1995) Admission control and routing in ATM networks using inferences from measured buffer occupancy. IEEE Trans. Commun. 43(2/3/4):1778–1784.Google Scholar
Daley D, Servi L (1998) Moment estimation of customer loss rates from transactional data. J. Appl. Math. Stochastic Anal. 11(3):301–310.Google Scholar
de Kok A, Tijms H (1985) A queueing system with impatient customers. J. Appl. Probab. 22:688–696.Google Scholar
Dębicki K, Mandjes M (2015) Queues and Lévy Fluctuation Theory. (Springer, New York).Google Scholar
den Boer A, Mandjes M (2017) Convergence rates of Laplace-transform based estimators. Bernoulli 23(4A):2533–2557.Google Scholar
Duffie D, Glynn P (2004) Estimation of continuous-time Markov processes sampled at random time intervals. Econometrica 72(6):1773–1808.Google Scholar
Duffy K, Meyn S (2011) Estimating Loynes’ exponent. Queueing Syst. 68(285):285–293.Google Scholar
Dong J, Yom-Tov E, Yom-Tov G (2019) The impact of delay announcements on hospital network coordination and waiting times. Management Sci. 65(5):1969–1994.Abstract, Google Scholar
Ferguson T (1996) A Course in Large Sample Theory (Chapman & Hall, Routledge, London).Google Scholar
Glynn P, Zeevi A (2000) Estimating tail probabilities in queues via extremal statistics. Analysis of communication networks: Call centres, traffic, and performance. AMS Fields Inst. Comm. 28:135–158.Google Scholar
Gorst-Rasmussen A, Hansen M (2009) Asymptotic inference for waiting times and patiences in queues with abandonment. Comm. Statist. Simulation Comput. 38:318–334.Google Scholar
Guo P, Zipkin P (2007) Analysis and comparison of queues with different levels of delay information. Management Sci. 53(6):962–970.Link, Google Scholar
Guo P, Zipkin P (2008) The effects of information on a queue with balking and phase-type service times. Naval Res. Logist. 55(5):406–411.Google Scholar
Hansen M, Pitts S (2006) Nonparametric inference from the M/G/1 workload. Bernoulli. 12(4):737–759.Google Scholar
Hassin R (2016) Rational Queueing (CRC Press, Boca Raton, FL).Google Scholar
Hassin R, Haviv M (2003) To Queue or Not to Queue: Equilibrium Behavior in Queueing Systems (Springer, New York).Google Scholar
Heijmans R, Magnus J (1986) Consistent maximum-likelihood estimation with dependent observations: The general (non-normal) case and the normal case. J. Econom. 32(2):253–285.Google Scholar
Heyde C (1997) Quasi-Likelihood and Its Application: A General Approach to Optimal Parameter Estimation (Springer Science & Business Media, New York).Google Scholar
Ibrahim R (2018) Sharing delay information in service systems: A literature survey. Queueing Syst. 89:49–79.Google Scholar
Inoue Y, Takine T (2015) Analysis of the loss probability in the M/G/1+G queue. Queueing Syst. 80:363–386.Google Scholar
Kovalenko I (1961) Some queueing problems with restrictions. Theory Probab. Appl. 6:205–208.Google Scholar
Larson R (1990) The queue inference engine: Deducing queue statistics from transactional data. Management Sci. 36(5):586–601.Link, Google Scholar
Leahu H, Mandjes M, Oprescu A (2017) A numerical approach to stability of multiclass queueing networks. IEEE Trans. Automat. Control 62:5478–5484.Google Scholar
Liu L, Kulkarni V (2006) Explicit solutions for the steady state distributions in M/PH/1 queues with workload dependent balking. Queueing Syst. 52:251–260.Google Scholar
Liu L, Kulkarni V (2008) Busy period analysis for M/PH/1 queues with workload dependent balking. Queueing Syst. 59(37):37–51.Google Scholar
Mandelbaum A, Shimkin N (2000) A model for rational abandonments from invisible queues. Queueing Syst. 36:141–173.Google Scholar
Mandelbaum A, Shimkin N (2004) Rational abandonment from tele-queues: Nonlinear waiting costs with heterogeneous preferences. Queueing Syst. 47:117–146.Google Scholar
Mandelbaum A, Zeltyn D (2013) Data-stories about (im)patient customers in tele-queues. Queueing Syst. 75:115–146.Google Scholar
Mandjes M, Ravner L (2020) Hypothesis testing for a Lévy-driven storage system by Poisson sampling. Stochastic Process. Appl. 133:41–73.Google Scholar
Mandjes M, Patch B, Walton N (2017) Detecting Markov chain instability: A Monte Carlo approach. Stoch. Syst. 7(2):289–314.Link, Google Scholar
Naor P (1969) The regulation of queue size by levying tolls. Econometrica 37(1):15–24.Google Scholar
Negahban A (2019) Simulation-based estimation of the real demand in bike-sharing systems in the presence of censoring. European J. Oper. Res. 277(1):317–332.Google Scholar
Negahban A (2022) Estimating the true arrival, balking, and reneging processes from censored transactional data: A simulation-based approach. Simulation 98(7):597–614.Google Scholar
Pötscher B, Prucha I (1989) A uniform law of large numbers for dependent and heterogeneous data processes. Econometrica 57(3):675–683.Google Scholar
Ranga Rao R (1962) Relations between weak and uniform convergence of measures with applications. Ann. Math. Statist. 33(2):659–680.Google Scholar
Ravner L, Boxma O, Mandjes M (2019) Estimating the input of a Lévy-driven queue by Poisson sampling of the workload process. Bernoulli 25(4B):3734–3761.Google Scholar
Roubos A, Jouini O (2013) Call centers with hyperexponential patience modeling. Int. J. Prod. Econom. 141(1):307–315.Google Scholar
Stanford R (1979) Reneging phenomena in single channel queues. Math. Oper. Res. 4(2):162–178.Link, Google Scholar
Sakuma Y, Takine T (2017) Multi-class M/PH/1 queues with deterministic impatience times. Stoch. Models. 33(1):1–29.Google Scholar
Yefenof J, Goldberg Y, Wiler J, Mandelbaum A, Ritov Y (2022) Self-reporting and screening: Data with right-censored, left-censored, and complete observations. Statist. Medicine. 41(18):3561–3578.Google Scholar
van der Vaart A (1998) Asymptotic Statistics, vol. 3 (Cambridge University Press, Cambridge, UK).Google Scholar

Volume 13, Issue 2

June 2023

Pages 181-319

Article Information

Supplemental Material

Metrics

Information

Received:November 30, 2021
Accepted:August 18, 2022
Published Online:December 13, 2022

Cite as

Yoshiaki Inoue, Liron Ravner, Michel Mandjes (2022) Estimating Customer Impatience in a Service System With Unobserved Balking. Stochastic Systems 13(2):181-210.

https://doi.org/10.1287/stsy.2022.0101

Keywords

Acknowledgments

PDF download

Available Issues

Available Issues

Available Issues

Estimating Customer Impatience in a Service System With Unobserved Balking

Abstract

1. Introduction

1.1. Contributions

1.2. Related literature

1.3. Paper Organization

2. Model and Preliminaries

2.1. Model Description

2.2. Stationary Distribution

3. Parametric Estimation Procedure

3.1. Continuous Patience-Level Distribution

3.1.1. Discussion of Assumptions.

3.1.2. Proofs.

3.2. Constant Patience-Level MLE

3.3. Estimating the Arrival Rate

4. Exponential and Generalized Hyperexponential Patience

4.1. Exponentially Distributed Patience

4.2. (Generalized) Hyperexponential Patience

5. Applications

5.1. Unknown Proportion of Impatient Customers

5.2. Noisy Delay Messages

5.3. Admission Pricing

6. Concluding Remarks

Appendix: Proofs

References

Volume 13, Issue 2

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News