Amazon Locker Capacity Management
Abstract
Amazon Locker is a self-service delivery or pickup location where customers can pick up packages and drop off returns. A basic first-come-first-served policy for accepting package delivery requests to lockers results in lockers becoming full with standard shipping speed (3- to 5-day shipping) packages, leaving no space for expedited packages, which are mostly next-day or two-day shipping. This paper proposes a solution to the problem of determining how much locker capacity to reserve for different ship-option packages. Yield management is a much-researched field with popular applications in the airline, car rental, and hotel industries. However, Amazon Locker poses a unique challenge in this field because the number of days a package will wait in a locker (package dwell time) is, in general, unknown. The proposed solution combines machine learning techniques to predict locker demand and package dwell time with linear programming to maximize throughput in lockers. The decision variables from this optimization provide optimal capacity reservation values for different ship options. This resulted in a year-over-year increase of 9% in Locker throughput worldwide during the holiday season of 2018, impacting millions of customers.
History: This paper was refereed.
Introduction
Amazon Locker is a self-service parcel locker that accepts package deliveries and customer returns. Lockers are conveniently located in highly visited locations such as shopping centers, business parks, gas stations, and convenience stores. On the Amazon.com order-placement page, customers have the option of getting a package delivered to a locker of their choice instead of their home, at no additional cost. The selection of lockers presented to a customer includes those that have available space and are situated close to the customer’s residence. After a locker is allocated for the order, the package is delivered there, and the customer receives a notification along with a code to access a specific bin within the locker. The notification also indicates the timeframe within which the package must be picked up. If the package is not retrieved within this period, a return is initiated, and a credit is issued to the customer. The number of days the package remains in the locker is the random variable dwell time. Lockers provide unique advantages such as protection from package theft and the option of pickup or drop-off at a time and location convenient to the customer. Lockers have become increasingly popular among Amazon customers because of these unique features.
With increased demand, capacity management has become an essential component of the operations of Amazon Locker. Locker capacity management performs two major functions:
Delivery request evaluation: The time lag between when a customer places an order to when it is delivered can vary between a few hours and a few days, depending on the ship option chosen. Customers have the option of choosing one out of multiple ship options, which determines the shipping speed of the package. For example, in the United States, order ship options can be Same-Day, Next-Day, Two-day, or Standard (three to five days). Customers can also choose to return their orders to a locker, which can be considered as a special type of ship option. When a customer requests an order to be delivered to a locker, the capacity management system predicts (1) when the package will be delivered and (2) the occupancy of the locker at the time when the package will be delivered. Based on these predictions, the request is accepted if capacity is estimated to be available or rejected otherwise.
Ship option capacity reservations: Typically, among the order requests for delivery to a locker on a particular day, orders with a slower ship option are placed before orders with a faster ship option. Therefore, a first-come-first-served service discipline for accepting delivery requests to lockers results in the locker running out of space and the rejection of packages with a faster ship option. As we will see next, this is not desirable if the goal is throughput maximization, because the average number of days a package stays in the locker (the package dwell time) is usually lower for packages with a faster ship option. Therefore, accepting a higher number of packages with a faster ship option results in higher locker throughput and, subsequently, a higher number of satisfied customers.
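The contrast between first-come-first-served acceptance and per-ship-option reservations can be illustrated with a minimal sketch. The request sequence, the four-slot capacity, and the reservation values below are hypothetical, and the acceptance rule (accept a request while its ship option has reserved slots left) is an assumption made for illustration, not the production logic.

```python
def accept_fcfs(requests, capacity):
    """Accept requests in arrival order until the locker is full."""
    accepted = []
    for ship_option in requests:
        if len(accepted) < capacity:
            accepted.append(ship_option)
    return accepted


def accept_with_reservations(requests, reserved):
    """Accept a request only while its ship option has reserved slots left.

    `reserved` maps ship option -> number of slots set aside for it
    (hypothetical values; in the paper these come from the optimization).
    """
    used = {option: 0 for option in reserved}
    accepted = []
    for ship_option in requests:
        if used.get(ship_option, 0) < reserved.get(ship_option, 0):
            used[ship_option] = used.get(ship_option, 0) + 1
            accepted.append(ship_option)
    return accepted


# Slower ship options are ordered earlier, so they arrive first in the request stream.
requests = ["standard"] * 5 + ["two_day"] * 2

fcfs_accepted = accept_fcfs(requests, capacity=4)
reserved_accepted = accept_with_reservations(requests, {"standard": 2, "two_day": 2})

print(fcfs_accepted)      # all four slots go to standard packages
print(reserved_accepted)  # two standard and two two-day packages
```

Under these assumed inputs, first-come-first-served rejects both two-day requests, whereas the reservation rule admits them at the cost of two standard requests.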
Figure 1 shows average dwell time in days for two-day, standard, and customer return packages over a 10-week period in 2018 for lockers in the zip code 98109 (Seattle, Washington). The term dwell time is the duration (in days) that a package remains in the locker until it is picked up. In this figure, dwell time in days for packages picked up on the same day as delivery is considered to be zero. The data show that the dwell time of packages with a two-day ship option is consistently lower than that of standard, which in turn is lower than the dwell time of customer returns. Therefore, it is necessary to reserve space for packages with a faster ship option. However, reserving more space than required for packages with a faster ship option results in the unnecessary rejection of requests with a slower ship option and therefore in wasted locker space. Figure 2 shows three examples of Locker capacity management disciplines. In the time axes of this figure, DD is the delivery day (i.e., the day the package is placed in the locker), and DD-i indicates i days before the delivery date. In the three examples, the yellow rectangle represents a locker with four slots. The timeline shows when orders were made. In the top example, excessive capacity is reserved for possible customers with a next-day or two-day ship option. In this example, only the standard-shipped ping-pong racket order, made five days prior to the delivery date, and the next-day shoe order are allocated a slot in the Locker, whereas the mobile phone, the book, and the ice skate are rejected. In this example, two locker slots are left unused. In the other extreme, the example in the middle of the figure, a first-come-first-served discipline is shown, in which the orders are assigned a slot as the orders are placed. In this discipline, expedited orders are usually rejected because the locker is usually full with standard shipping orders when a new expedited order is made.
These two extreme cases motivate our solution, shown in the bottom of the figure. In an optimal-capacity reservation approach, next-day or two-day orders are accepted without the need to use the excessive-capacity reservation that forces standard shipping orders to be rejected. This leads to great customer satisfaction because both expedited- and standard shipping orders are accepted by the Locker.


Notes. In the top figure, excess capacity is reserved for expedited shipments and the locker is underused. In the middle figure, the first-come-first-served (FCFS) rule is implemented and customers ordering expedited shipping cannot ship to the locker. An optimal capacity reservation scheme is shown in the bottom figure.
This paper describes the algorithm developed by us at Amazon to enable the second function of locker capacity management described previously. We answer the following question for each locker: How much capacity should be reserved for each ship option on each day to maximize the throughput of the locker? In the context of lockers, especially delivery or parcel lockers, throughput generally refers to the number of packages (or transactions) that a locker can handle or process within a given period of time. Throughput can be used as a metric to gauge the efficiency and capacity of a locker system.
At a high level, this problem falls into the widely researched field of yield management. Yield management principles have been applied widely in travel industries such as airlines and hotels. One of the first publications on applied airline yield management is the seminal paper by Littlewood (1972, 2005), which introduces Littlewood’s rule for assigning capacity reservations to two fare classes in single-leg flights. This was extended to multileg flights with multiple fare classes by Wang (1983) in what later became known as expected marginal seat revenue (EMSR). Optimization and estimation problems in airline yield management are addressed in a PhD thesis by McGill (1989). The 1991 Franz Edelman Award was won by American Airlines. The award-winning work is published by Smith et al. (1992), who present a complete implementation of an end-to-end yield management system at American Airlines. Jacobs et al. (2008) present scalable yield management models that are currently used by modern-day airlines. Yield management at Hub Group, a North American freight rail transportation company, is discussed by Gorman (2010). The paper describes an integrated decision support system to enhance yield management and container allocation. In its first year of use by Hub Group in 2008, the system increased revenue per load by 3% and container velocity by 5% and generated $11 million in cost savings, yielding a 22-fold return on the initial investment. McGill and van Ryzin (1999) review more than 190 references on revenue management prior to 1999. Weatherford and Kimes (2003) used real data to test a variety of forecasting methods in a hotel revenue management system. Bitran and Caldentey (2003) survey research on dynamic pricing policies and their relation to revenue management up to 2002.
Guillen et al. (2019) describe Opticar, a decision support system that uses advanced algorithms to forecast demand, optimize revenue, and manage Europcar’s car rental fleet capacity for up to six months in advance. The system, adopted by managers in nine countries, serves as a foundation for daily operations, aiding in pricing and vehicle management. Talluri et al. (2009) provide an introduction to revenue management from the perspective of simulation. Gopalakrishnan and Rangaraj (2010) present a linear programming model to optimally allocate train capacity among different travel segments on an Indian Railways route. Applying this model to 17 trains resulted in significant increases in revenue, load factors, and passengers carried. Recognizing the distinct needs of the cruise industry, Beck et al. (2021) describe the yield optimization and demand analytics (YODA) system, a revenue management system at Carnival Corporation. This system uses a unique model and machine learning to determine cruise pricing and inventory allocation, driving a 1.5%–2.5% rise in net ticket revenue. Clough et al. (2015) address the network revenue management problem under a MultiNomial Logit discrete choice model with applications in airline revenue management. Feldman and Topaloglu (2017) study revenue management problems in which customers select from a group of offered products according to the Markov chain choice model. Ferreira et al. (2018) consider a price-based network revenue management problem in which the goal of a retailer is to maximize revenue over a finite selling season from multiple products with limited inventory. Wang et al. (2018) explore revenue management in online advertisement. Kunnumkal and Talluri (2019) propose a new Lagrangian relaxation method for discrete-choice network revenue management based on an extended set of multipliers.
Shihab et al. (2019) consider the seat inventory control and overbooking problem of airline revenue management, formulated as a Markov decision process, and solved with deep reinforcement learning to find a policy that maximizes the revenue of each flight. Pandey et al. (2017) review revenue-management research in the broadcasting and online advertisement industries, focusing on strategies and techniques to maximize advertising revenue. They also highlight mobile advertising as an emerging area in revenue management and identify potential gaps for future research. The book by Phillips (2005) provides a comprehensive review of practical yield management in industry.
Parcel lockers have been studied in the literature since the mid-2010s. A literature review of parcel lockers can be found in a paper by van Duin et al. (2020). In that paper, four literature focus areas are identified: (1) parcel locker use from the customer’s perspective; (2) location of parcel lockers; (3) cost of parcel lockers; and (4) environmental economics impact. A case study of PostNL, the market segment leader (70%) in the Netherlands, illustrates the four focus areas. Iwan et al. (2016) concentrate on evaluating the usability and efficiency of parcel lockers using the Polish InPost Company system as a case study. Orenstein et al. (2019) introduce a logistic model for small parcel delivery using multiple service points (SPs) and showcase solution methods like the savings heuristic, the petal method, and tabu search. The model, demonstrating cost and time efficiencies, especially when recipients are flexible with delivery locations, is validated through numerical and simulation studies against traditional methods.
Rohmer and Gendron (2020, p. 13) explore various locker station delivery concepts, address related challenges and decision problems, and outline potential research directions in operations research (OR) to foster advancements in this burgeoning domain. In section 3.3 of their monograph, Locker Assignment and Scheduling, they conclude with the recommendation, “More research is therefore needed, focusing on the capacity management of the locker stations at the operational level, while taking into account uncertainties and the dynamic operating environment.” Our paper is a first step in this direction.
Yield management in Amazon Locker offers some unique challenges not common in the travel industry. The major challenge arises from unknown package dwell times—that is, the time a package stays in the locker (which is known at the time of booking for airlines, car rentals, and hotels). Furthermore, the sparsity of data (due to lockers being deployed in areas with different demographics, where pickup and ordering characteristics vary in volume and timing) and the superlinear growth rate of Amazon Locker locations offer additional challenges to traditional assumptions on demand distribution. These factors forge the need for disruptive technologies for locker capacity management.
Apart from unique modeling challenges, the need for such disruptive technologies is reinforced by the impact of Amazon Locker capacity management on customers. The capacity management system previously in place at Amazon (the legacy system) resulted in many locker requests being rejected even when capacity was available in the locker. Such rejected requests are called unjustified rejections. These rejections happen because of poor locker demand and dwell time forecasts, resulting in poor ship option capacity reservations. In 2017, 20% of Locker requests were rejected. Out of these 20%, between 18% and 35% were unjustified rejections. Forty percent of these unjustified rejections were attributed to poor ship option capacity reservations. Customer surveys showed that these rejections were the biggest pain point for Amazon Locker customers.
As the Amazon Locker business geared up for phenomenal growth, it was important to invest in efficient capacity management to ensure the best possible utilization of resources and guarantee a great customer experience.
The remainder of the paper is organized as follows. In the section titled Legacy Practice, we review the methodology previously applied at Amazon for locker capacity management. The section titled Locker Capacity Management Model focuses on the newly proposed Locker Capacity Management model composed of a locker demand forecasting module, a package dwell time probability estimation module, and a module for ship option capacity reservation optimization. In the section titled Model Evaluation by Simulation System, we describe a data-driven simulation system used to evaluate our proposed model and determine the impact on the locker capacity management system of any changes made to inputs, including changes in ship option capacity reservations. In the section titled Experimental Results, we report experimental results from the implementation of the Locker Capacity Management Model both in the simulation system and in real-world production. Finally, conclusions are drawn in the section titled Conclusion.
Legacy Practice
The legacy methodology for assigning capacity reservations for faster ship options was based on the following algorithm. For each locker, in each week, evaluate the proportion of different ship option packages delivered to homes in the previous year, in that week, in the zip code of the locker. Locker capacity was then reserved for each ship option in these proportions, scaled to the capacity of the locker. We refer to this algorithm in the remainder of the paper as the proportion rule.
As a hypothetical example of the proportion rule, consider the locker Ruby (located in Amazon’s Ruby building in zip code 98109 in Seattle, Washington). Assume that there are only two ship options to consider: two-day and standard. If the number of packages delivered to homes in the zip code 98109, in week 26 of 2017, was 3,000 two-day packages and 1,000 standard packages, and if the locker capacity is 100 slots, then in week 26 of 2018 (one year later), 75 slots would be reserved for two-day packages and 25 for standard packages.
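The arithmetic of this hypothetical example can be written as a short sketch. The function name and dictionary keys are illustrative choices, not part of the legacy system.

```python
def proportion_rule(home_deliveries, locker_capacity):
    """Split locker capacity across ship options in proportion to
    last year's home deliveries in the locker's zip code."""
    total = sum(home_deliveries.values())
    return {
        ship_option: locker_capacity * count / total
        for ship_option, count in home_deliveries.items()
    }


# Week 26 of 2017, zip code 98109 (numbers from the hypothetical example in the text).
reservations = proportion_rule({"two_day": 3000, "standard": 1000}, locker_capacity=100)
print(reservations)  # {'two_day': 75.0, 'standard': 25.0}
```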
This approach was an intuitive first attempt at estimating locker demand. However, it had the following limitations:
It assumes that the same ratio is maintained between the number of package deliveries for different ship options in homes and lockers, which is not true in general. For example, Figure 3 shows the ratio between the number of standard and two-day ship option packages delivered to homes in zip code 98109 and to locker Ruby over a 10-week period in 2018. The ratio for Locker packages has a higher variability than the ratio for home packages. The two ratios can differ substantially.
Space is reserved in the ratio of the number of packages expected to be delivered on a particular day. However, package dwell time is not considered. The number of packages delivered is not equal to the number of packages in a locker on a given day. There are packages in the locker that were delivered in the previous few days and were not picked up by the customer. Therefore, space needs to be reserved while taking those packages into account. Figure 1 shows nonzero average dwell time for different ship options over a 10-week period for lockers in zip code 98109. A package that is picked up on the same day as delivery is counted as having a dwell time of zero.
It does not maximize locker throughput. It is possible to achieve higher throughput by allocating protection limits based on demand as well as dwell time of ship options. As mentioned previously, the dwell time of packages belonging to different ship options follows a distinctive trend, with packages with a faster ship option having a lower dwell time (as shown in Figure 1). Therefore, a model that considers demand as well as dwell time is required to maximize throughput through the locker.

Amazon Locker influences customer experience directly, thus giving rise to a high-impact application with unique challenges. The work presented in this paper analyzes the strengths of traditional operations research methods and machine learning techniques and demonstrates how to combine the best of both worlds, thus establishing the importance to Amazon of the rich field of yield management.
We next describe our model for reserving capacity for ship options in Amazon Locker.
Locker Capacity Management Model
The Locker Capacity Management Model for assigning capacity reservations to different ship options consists of three modules (Figure 4). We describe each module in detail in sections titled Locker Demand Forecast, Package Dwell Time Probability Estimation, and Ship Option Capacity Reservation Optimization.

Locker Demand Forecast
The first module forecasts the expected number of packages delivered in each locker, of each ship option, and on each day over the next seven days (Module I in Figure 4). In the section titled Ship Option Capacity Reservation Optimization, we refer to the outputs from this model as $d_{st}$, the demand forecast for ship option $s$ on day $t$. This problem poses some unique challenges to traditional time-series forecasting models:
Though data on orders that were placed and delivered are available, data for requests that were rejected because of a lack of locker capacity are noisy and cannot be used for demand prediction. Therefore, historical demand is constrained by locker capacity and previously defined protection limits. Most traditional demand-unconstraining methods require some portion of historical data to be unconstrained, which is not possible in the case of lockers.
Time series models require data for the previous few years to account for seasonality. However, the number of lockers can increase by up to 50% year over year. Therefore, only a few months of data are available for many lockers during peak season.
In the new method we propose, random forest regression is used to estimate locker demand for each locker, ship option, and day, for the next seven days. We train seven different random forest models, one for each day in the next seven days because there is more information available for predicting demand for the next day than for predicting demand for seven days out. The six features of the random forest regression model are as follows:
Locker deliveries for that ship option on that weekday of the previous four weeks
Home deliveries in the Locker zip code in the same week of the previous year to account for seasonality in demand
Time of first rejection for a ship option to account for unseen demand
Delivery day of week
Delivery day of month
Ship option
The training time period is 16 weeks of data before the model run date, along with 4 weeks of data from the peak of the previous year. This gives the model a broader range of possible scenarios, which could allow it to pick up on sudden peaks that might occur within the planning horizon. Random forest regression has the advantage of being robust in the face of sparse data. We say data are sparse because lockers are implemented in a number of locations with different demographics (e.g., on university campuses and around suburban neighborhoods), so the pickup and ordering characteristics vary in volume and timing. Campuses, for example, have peaks at the start and end of the semester, times when suburban neighborhoods may not experience peaks. Furthermore, machine learning techniques allow for additional features, such as the first observed rejection time for unconstraining demand, that is, the first point in time when we reject a request for a delivery to the Locker because of a lack of capacity. Because we do not have complete visibility into lost demand, this time directionally indicates how much demand exceeds capacity. An early time indicates demand is much higher than capacity, whereas a later time indicates that the difference between demand and capacity is smaller. Seasonality is incorporated by using home deliveries in the previous year in the same week as one of the features.
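The one-model-per-horizon setup can be sketched with scikit-learn's RandomForestRegressor. Everything about the data here is an assumption made for illustration: the rows are synthetic, the feature encodings (for example, four separate columns for the previous four weeks and an hour-of-day value for the first rejection time) are hypothetical, and the hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
HORIZON_DAYS = 7  # one model per day of the seven-day horizon


def make_synthetic_rows(n_rows):
    """Synthetic stand-in for one row per (locker, ship option, delivery day)."""
    prev_four_weeks = rng.poisson(5, size=(n_rows, 4))   # locker deliveries, same weekday
    home_last_year = rng.poisson(300, size=n_rows)       # home deliveries, same week last year
    first_reject_hour = rng.uniform(0, 24, size=n_rows)  # hypothetical encoding of first rejection time
    day_of_week = rng.integers(0, 7, size=n_rows)
    day_of_month = rng.integers(1, 29, size=n_rows)
    ship_option = rng.integers(0, 3, size=n_rows)        # integer-coded ship option
    features = np.column_stack([
        prev_four_weeks, home_last_year, first_reject_hour,
        day_of_week, day_of_month, ship_option,
    ])
    # Synthetic target: recent average deliveries plus noise.
    target = prev_four_weeks.mean(axis=1) + rng.normal(0, 1, size=n_rows)
    return features, target


# Train a separate forest for each forecast horizon (1 to 7 days out).
models = {}
for horizon in range(1, HORIZON_DAYS + 1):
    X_train, y_train = make_synthetic_rows(200)
    model = RandomForestRegressor(n_estimators=50, random_state=horizon)
    model.fit(X_train, y_train)
    models[horizon] = model

# Forecast demand for three new rows at every horizon.
X_new, _ = make_synthetic_rows(3)
forecasts = {horizon: model.predict(X_new) for horizon, model in models.items()}
```

In a real pipeline each horizon would have its own training set built from the information available that many days ahead; the synthetic generator above only mimics the shape of such data.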
To evaluate model performance, the prediction accuracy of the model is compared with that of the legacy proportion rule described in the Legacy Practice section.
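The error metric we use is not the traditional mean absolute percentage error (MAPE). This is a distinguishing feature of our model. The MAPE we use is defined as the average of $|\text{forecast} - \text{actual}| / \text{locker capacity}$, that is, the absolute difference between the forecast and the actual number of deliveries, divided by the locker capacity.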
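A capacity-normalized error of this kind can be sketched as follows, assuming it is the absolute difference between forecast and actual deliveries divided by locker capacity, which is how the percent error is described with the simulation results. The function names and the record layout are illustrative.

```python
def capacity_normalized_error(forecast, actual, capacity):
    """Absolute forecast error expressed as a fraction of locker capacity."""
    return abs(forecast - actual) / capacity


def mean_capacity_normalized_error(records):
    """Average the error over (forecast, actual, capacity) records."""
    errors = [capacity_normalized_error(f, a, c) for f, a, c in records]
    return sum(errors) / len(errors)


# The same one-package miss matters far more in a small locker.
large_locker_error = capacity_normalized_error(forecast=1, actual=2, capacity=100)
small_locker_error = capacity_normalized_error(forecast=1, actual=2, capacity=3)
print(large_locker_error)  # 0.01
```

Dividing by capacity rather than by the actual value keeps the metric defined when no packages were delivered and expresses the error in the same units as locker space.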

Package Dwell Time Probability Estimation
Module II of the Locker Capacity Management Model (Figure 4) is package dwell time probability estimation. In this module, we estimate the probability that a package stays in the locker for zero, one, two, …, six days for each shipping speed and delivery day in the time horizon of the next seven days. In the section titled Ship Option Capacity Reservation Optimization, we refer to the output dwell time probabilities from this model as $p_{svt}$, the probability that a package belonging to ship option $s$ and delivered on day $v$ will be present on day $t$, where $t > v$.
Customers are advised to pick up their packages within three days of delivery. If the package is not picked up in three days, a call tag is generated and a carrier picks up the package for return to Amazon within the next two to three days. Therefore, the possible values for dwell time are $\{0, 1, 2, \ldots, 6\}$. Dwell time probability is currently calculated in the locker capacity management system for a separate function unrelated to the capacity reservations for ship options. This probability is calculated as the proportion of packages that stayed in the locker for zero, one, two, …, six days in the previous four weeks. However, this method often leads to overfitting due to sparse data. For example, suppose that in the previous four weeks there were two Standard deliveries, with one package picked up on the same day as delivery and the other staying in the locker for six days. With the prior method, the probability of a package being picked up on day 0 is 0.5, and the probability that it stays in the locker for six days is 0.5, whereas the probability that it stays for one, two, three, four, or five days is 0, which is not descriptive of reality. In reality, the probability that a package stays in the locker for one day is not zero and is higher than the probability that it stays in the locker for six days.
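The sparse-data problem with the proportion-based estimate can be reproduced in a few lines. This is a minimal sketch of the prior method as described in the text; the function name is illustrative.

```python
from collections import Counter

MAX_DWELL_DAYS = 6  # dwell time takes values 0, 1, ..., 6


def empirical_dwell_probabilities(observed_dwell_days, max_dwell=MAX_DWELL_DAYS):
    """Proportion of past packages that stayed 0, 1, ..., max_dwell days."""
    counts = Counter(observed_dwell_days)
    n = len(observed_dwell_days)
    return [counts.get(days, 0) / n for days in range(max_dwell + 1)]


# Two Standard deliveries in the previous four weeks:
# one picked up on the delivery day, one that stayed six days.
probabilities = empirical_dwell_probabilities([0, 6])
print(probabilities)  # [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]
```

With only two observations, all probability mass lands on the two observed values, which is the overfitting behavior the classification model is meant to avoid.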
In this module, random forest classification is used to estimate dwell time probabilities of a package with the following four features:
Average, minimum, and maximum dwell time of packages for that ship option, delivered on that day of week in the previous four weeks
Ship option
Delivery day of week
Delivery day of month
Random forest classification allows data from multiple similar lockers to be used to estimate the dwell time probability of a package, thus reducing the effects of sparse data. To improve the quality of dwell time probabilities, we calibrate them using isotonic regression, as implemented in the scikit-learn Python library, with the technique proposed by Zadrozny and Elkan (2002).
Isotonic regression is a type of regression analysis used for fitting a nondecreasing function to data. In the context of machine learning, isotonic regression is often used to calibrate the probabilities output by a classification model, making them more interpretable and reliable.
The scikit-learn Python library provides an implementation of isotonic regression that can be used for various applications. The technique proposed by Zadrozny and Elkan (2002) is specifically geared toward probability calibration. In their method, they aim to adjust the predicted probabilities such that they better represent the true probabilities. The idea is to fit a nondecreasing function to the output of a model in a way that minimizes some measure of error with respect to the true, observed outcomes. The implementation in scikit-learn often follows a “pool adjacent violators” algorithm for this fitting process.
Using isotonic regression for calibration makes sense when aiming for the output probabilities of a model to be well-calibrated, meaning that a predicted probability of x% should correspond to the event happening x% of the time. This is especially useful in risk-sensitive applications in which the predicted probabilities are used to make decisions.
To use isotonic regression in scikit-learn, you typically first train your classification model and obtain predicted probabilities. Then, you fit an isotonic regression model to these predicted probabilities and the true labels, thereby obtaining a calibrated set of probabilities.
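To make the pool adjacent violators idea concrete, the following is a minimal pure-Python sketch of isotonic calibration for a single binary outcome (for example, whether a package is picked up on the delivery day). It is not the scikit-learn implementation: the step-function lookup is a simplification, the scores and labels are made up, and all function names are illustrative.

```python
from bisect import bisect_right


def pool_adjacent_violators(values):
    """Least-squares nondecreasing fit to a sequence (unit weights)."""
    blocks = []  # each block is [mean, number_of_points]
    for value in values:
        blocks.append([float(value), 1])
        # Merge backwards while a block's mean exceeds its successor's.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, n_b = blocks.pop()
            mean_a, n_a = blocks.pop()
            n = n_a + n_b
            blocks.append([(mean_a * n_a + mean_b * n_b) / n, n])
    fitted = []
    for mean, n in blocks:
        fitted.extend([mean] * n)
    return fitted


def fit_calibrator(scores, labels):
    """Sort by model score, then fit a nondecreasing map from score to outcome rate."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    sorted_scores = [scores[i] for i in order]
    calibrated = pool_adjacent_violators([labels[i] for i in order])
    return sorted_scores, calibrated


def calibrate(score, sorted_scores, calibrated):
    """Step-function lookup: value at the largest training score <= score."""
    index = bisect_right(sorted_scores, score) - 1
    return calibrated[max(index, 0)]


# Made-up uncalibrated scores and observed binary outcomes.
scores = [0.1, 0.3, 0.5, 0.7, 0.9]
labels = [0, 1, 0, 1, 1]
sorted_scores, calibrated = fit_calibrator(scores, labels)
print(calibrated)  # [0.0, 0.5, 0.5, 1.0, 1.0]
```

The second and third points violate monotonicity (an outcome of 1 followed by 0), so they are pooled to their mean of 0.5. For the seven-valued dwell time, such a map would have to be fitted per outcome and the results renormalized, which this sketch does not attempt.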
Similar to the error metric in the section titled Locker Demand Forecast, the error metric used to measure the quality of probabilities also uses the same units in its model performance metric as used in the business performance metric (i.e., locker throughput). Let A # PPU and E # PPU be, respectively, the actual and expected number of packages picked up. The performance metric is the average of
Ship Option Capacity Reservation Optimization
In the ship option capacity reservation module (Module III of Figure 4), a linear programming formulation is proposed to find optimal capacity reservations for ship options in a locker with the goal of maximizing throughput. This linear program is run at the end of day 0, and the time horizon is assumed to be the next seven days, with the count starting with day 1. Based on the customer or carrier pickup time windows described in the Package Dwell Time Probability Estimation section, it is assumed that packages delivered up to six days before the first day of the time horizon might still be in the locker at the end of day 0. The actual number of packages in the locker is known in real time and is a parameter. However, the date on which the package will be picked up is not known.
Let $T$ be the number of days in the time horizon and $S$ be the number of ship options. Let $C$ be the locker capacity and $p_{svt}$ be the probability that a package of ship option $s$ delivered on day $v$ is present in the locker on day $t$. This probability is computed in Module II of Figure 4. Furthermore, let $e_{sv}$ be the number of packages of ship option $s$ delivered to the locker on day $v$ and present in the locker at the end of day 0, and let $d_{st}$ be the demand forecast for package deliveries of ship option $s$ on day $t$. This demand is computed in Module I of Figure 4.
There are two classes of decision variables. Decision variable $x_{st}$ is the number of slots to reserve for ship option $s$ on day $t$. Variable $y_{st}$ is the number of packages accepted for arrival for ship option $s$ on day $t$.
A linear program whose objective is to maximize throughput—that is, the total number of packages delivered to the locker in the time horizon—is given in Equations (A.1)–(A.5) in the appendix.
In the linear program, decision variables $x_{st}$ are used as capacity reservations for ship option $s$ on day $t$ of the planning horizon.
In this formulation, packages with a faster ship option are not explicitly prioritized using specific weight parameters. However, because the average dwell times of packages with a faster ship option is lower than that of packages with a slower ship option (Figure 1), the objective of throughput maximization automatically results in prioritization of packages with a faster ship option.
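For orientation, one plausible linear program consistent with the definitions in this section is sketched below. It is an assumption, not a reproduction of Equations (A.1)–(A.5): in particular, the range $v = -5, \ldots, 0$ for packages already in the locker and the way reservations and expected carryover share capacity are guesses based on the description above. The symbols $x_{st}$, $y_{st}$, $d_{st}$, $p_{svt}$, $e_{sv}$, $C$, $S$, and $T$ are those defined in this section.

```latex
\begin{align}
\max_{x,\,y}\quad & \sum_{s=1}^{S}\sum_{t=1}^{T} y_{st} \\
\text{s.t.}\quad & y_{st} \le d_{st} && \forall s,\ t \\
& y_{st} \le x_{st} && \forall s,\ t \\
& \sum_{s=1}^{S} x_{st}
  + \sum_{s=1}^{S}\sum_{v=-5}^{0} p_{svt}\, e_{sv}
  + \sum_{s=1}^{S}\sum_{v=1}^{t-1} p_{svt}\, y_{sv} \le C && \forall t \\
& x_{st} \ge 0,\quad y_{st} \ge 0 && \forall s,\ t
\end{align}
```

In this sketch, the first constraint caps accepted packages at forecast demand, the second at the reservation, and the third requires that the slots reserved on day $t$, plus the expected number of packages still present from before the horizon and from earlier days of the horizon, fit within the locker. Because ship options with lower $p_{svt}$ consume less future capacity per accepted package, maximizing the objective favors them without explicit weights.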
Model Evaluation by Simulation System
To best serve Amazon customers, it is essential to quantify the impact of any improvement to the capacity management system. However, locker capacity management is a complex system, and it is difficult to predict the effect of any change to inputs on locker throughput. To enable this task, a simulation system was deployed that allowed the determination of the impact on the locker capacity management system of any changes made to inputs, including changes in ship option capacity reservations.
This simulation system was built to mirror what happens in production, based on different inputs. Millions of package events (e.g., order requests, arrivals, order pickups, etc.) that happened in a locker over a 15-day period can be replayed in this simulation system in less than six minutes. This tool has enabled us to perform many types of what-if analyses. Prior to building the simulation system, the only way to do these analyses was to apply changes directly to production and measure the result in real life. This is now done through the simulation system, thus reducing the time needed to arrive at data-driven decisions and eliminating the risk of negatively impacting customer experience. Accuracy of the simulation system is measured as the percentage of acceptance/rejection decisions of locker requests that were the same in production and in the simulation system. Figure 6 shows the high level of simulation system accuracy for four sample lockers over a 14-day time period.
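The accuracy measure described above, the share of accept/reject decisions that match between production and simulation, can be sketched as follows. The decision lists are made-up illustrations, and the function name is hypothetical.

```python
def decision_agreement_percent(production_decisions, simulated_decisions):
    """Percentage of locker requests with the same accept/reject outcome
    in production and in the simulation replay."""
    if len(production_decisions) != len(simulated_decisions):
        raise ValueError("Both runs must cover the same requests.")
    matches = sum(
        p == s for p, s in zip(production_decisions, simulated_decisions)
    )
    return 100.0 * matches / len(production_decisions)


# True = request accepted, False = request rejected (illustrative values).
production = [True, True, False, True]
simulation = [True, False, False, True]
agreement = decision_agreement_percent(production, simulation)
print(agreement)  # 75.0
```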

Experimental Results
In this section, we report experimental results from the implementation of the Locker Capacity Management Model both in the simulation system and in real-world production. Results are presented in terms of change in locker throughput, the adopted business metric for measuring the success of the new model.
Simulations
The ship option capacity reservations from the model proposed in the Locker Capacity Management Model section were first tested on the simulation system for the two weeks from April 15 to 28, 2018. A set of 30 lockers was used in the experiment, chosen randomly as a mix of high-, medium-, and low-throughput lockers in the United States. We used actual data about the lockers and historical data. Two sets of simulations were run for each locker. The first set of simulations used the ship option capacity reservations previously deployed in production (i.e., the legacy system). The second set of simulations used the ship option capacity reservations obtained through the algorithm described in this paper. Historical locker requests for the 30 lockers were replayed, and the numbers of accepted requests in the two sets of simulations were compared against each other. Simulation results are shown in Figures 7–9. Figures 7 and 8 compare, respectively, the percent error (in log scale) in the forecasts made with random forest regression and with the legacy proportion method on the lockers used in the study for the two-day ship option and standard ship option. The percent error is computed as the absolute difference between the forecast and what actually occurred divided by the locker capacity. For example, if the forecast was one and the actual was two while the locker capacity was 100, then the percent error would be $|1 - 2|/100 = 1\%$. However, if the locker capacity was three, then the percent error would be $|1 - 2|/3 \approx 33\%$.
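The capacity-normalized percent error defined above can be sketched as follows. This is a minimal illustration of the stated definition; the function name is hypothetical, and the inputs reproduce the forecast of one and actual of two used in the example in the text.

```python
def capacity_normalized_error_pct(forecast: float, actual: float, capacity: float) -> float:
    """Absolute forecast error expressed as a percentage of locker capacity."""
    if capacity <= 0:
        raise ValueError("locker capacity must be positive")
    return 100.0 * abs(forecast - actual) / capacity


# Same forecast miss of one package, evaluated against two locker sizes.
large_locker = capacity_normalized_error_pct(forecast=1, actual=2, capacity=100)
small_locker = capacity_normalized_error_pct(forecast=1, actual=2, capacity=3)
```

Normalizing by capacity rather than by the actual value means that a one-package miss matters far more in a small locker than in a large one.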


Random forest regression forecasts are better than those made with the legacy approach for both ship options, but the difference is more noticeable for the two-day ship option. On the standard ship option, both methods do well, with random forests still outperforming the legacy method.
Figure 9 shows percent throughput increase with the new capacity management algorithm compared with the legacy algorithm. There is an average 6% increase in throughput for the 30 lockers, with a maximum increase of 23%. In 4 of the 30 lockers, there was no increase or decrease in locker throughput. This is because those lockers had low demand compared with locker capacity, so any model, including first-come-first-served, would result in the same throughput. These simulations provided the needed confidence in the model to launch on actual lockers in production.

Note. The 30 lockers are sorted in decreasing order of throughput increase.
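The per-locker comparison behind a chart such as Figure 9 can be sketched as a percent change in accepted requests between the two simulation runs. This is an illustrative sketch only; the function name and the accepted-request counts are hypothetical and are not the experimental data.

```python
def throughput_increase_pct(legacy_accepted: int, new_accepted: int) -> float:
    """Percent change in accepted requests relative to the legacy run."""
    if legacy_accepted <= 0:
        raise ValueError("legacy run must have at least one accepted request")
    return 100.0 * (new_accepted - legacy_accepted) / legacy_accepted


# Hypothetical (legacy, new) accepted-request counts for three lockers.
runs = {"locker_a": (150, 156), "locker_b": (200, 230), "locker_c": (80, 80)}
increases = {name: throughput_increase_pct(old, new) for name, (old, new) in runs.items()}

# Sort in decreasing order of throughput increase, as in the figure note.
ranked = sorted(increases.items(), key=lambda item: item[1], reverse=True)
average_increase = sum(increases.values()) / len(increases)
```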
Lockers in Production
The model was put into production on live lockers in a phased deployment, with 100% of Amazon Lockers worldwide using model outputs by November 2018. This resulted in an increase in throughput in all countries. However, if a locker has very low demand, then our model is not expected to make much of a difference, because the first-come-first-served policy performs as well as any model. Therefore, the results were evaluated on Amazon Locker locations having a demand-to-capacity ratio above the 60th-percentile mark. Here, the percentile cutoffs are calculated based on all Amazon Lockers in the country for the five weeks during peak. Figure 10 shows percent improvement results in the United States, France, and Italy. This allowed Amazon to say “yes” to millions of requests that it would otherwise have rejected.
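The selection of capacity-constrained lockers described above can be sketched as a percentile filter on the demand-to-capacity ratio. This is a minimal sketch under stated assumptions: the percentile is computed with linear interpolation (the `inclusive` method of Python's `statistics.quantiles`), which may differ from the definition used in production, and the function name and ratios are hypothetical.

```python
from statistics import quantiles
from typing import Dict, List


def capacity_constrained_lockers(ratio_by_locker: Dict[str, float], percentile: int = 60) -> List[str]:
    """Return lockers whose demand-to-capacity ratio exceeds the given percentile."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # quantiles(..., n=100) returns the 99 cut points between percentiles;
    # index percentile - 1 holds the requested percentile cutoff.
    cut_points = quantiles(ratio_by_locker.values(), n=100, method="inclusive")
    cutoff = cut_points[percentile - 1]
    return sorted(name for name, ratio in ratio_by_locker.items() if ratio > cutoff)


# Hypothetical demand-to-capacity ratios for five lockers in one country.
ratios = {"locker_1": 0.2, "locker_2": 0.5, "locker_3": 0.8, "locker_4": 1.1, "locker_5": 1.4}
selected = capacity_constrained_lockers(ratios)
```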

Conclusion
Amazon Locker influences customer experience directly, thus giving rise to a high-impact yield management application with unique challenges due to data sparsity and unknown package dwell times. The major pain point for Amazon Locker customers is requests rejected because of capacity algorithm estimations. The largest share (40%) of unjustified rejections of requests for deliveries to lockers in 2017 was due to poor ship option capacity reservations. Amazon built and implemented an algorithm that computes optimal ship option capacity reservations to maximize throughput (i.e., minimize rejections) and therefore enhance customer experience at Amazon Lockers. This allowed Amazon to say “yes” to millions of additional Locker requests.
The methodology introduced in this paper for managing parcel locker capacity has versatile applications across various business scenarios, particularly in optimizing space and resource allocation. For instance, in a grocery store setting, this approach could enhance the management of shelf space, especially for perishable goods, for which shelf life and demand prediction are crucial. The model can dynamically adjust inventory levels and shelf allocation based on real-time demand forecasts, reducing waste and maximizing efficiency. Extending this idea further, such methods could improve the management of computational resources in cloud services like AWS. Here, the approach could be tailored to allocate CPU services more effectively. Customers often request compute time without precise knowledge of the duration needed for their computations. By using a system that intelligently bundles and allocates CPU resources across the cloud based on demand and cost considerations, efficiency and cost-effectiveness can be significantly improved.
One significant limitation inherent in these systems is their dependency on accurate forecasting and an understanding of customer behavior. This reliance becomes particularly challenging because of the dynamic and often unpredictable nature of consumer preferences and market trends. If given more time and resources, a more nuanced model could be developed. This model would not only factor in general buying trends but also delve into customer behavior specific to each product (identified by its ASIN or SKU). Such a granular approach could potentially yield more precise predictions and efficient allocation of resources.
Furthermore, an intriguing avenue for exploration could be the strategic overbooking of space, akin to practices in the airline and hotel industries. This approach could potentially maximize utilization and revenue, especially in scenarios in which demand outstrips supply. However, it is crucial to balance this strategy with the risk of overcommitment, which could lead to customer dissatisfaction and operational challenges. Assessing the threshold for overbooking and developing mechanisms to manage potential overcommitment situations are key areas for further research and development.
Considering future market trends, we have identified the interplay between locker deliveries and last-mile delivery services as a promising area for exploration that has not yet been thoroughly investigated. Understanding customer preferences and behavior in this domain could yield valuable insights. Specifically, it would be intriguing to conduct a study in which customers are given a choice between immediate locker delivery and next-day home delivery. This investigation could reveal customer preferences for convenience versus immediacy and how these choices align with their expectations and satisfaction levels.
Moreover, the financial aspect of this comparison is equally important. We hypothesize that home deliveries, although offering direct-to-door convenience, likely incur higher costs due to factors such as fuel, labor, and vehicle maintenance. In contrast, locker deliveries, by centralizing drop-offs, could significantly reduce these expenses. Assessing the cost implications of each delivery method would not only inform our operational strategies but also help in tailoring our services to be both cost-effective and customer-centric.
The potential environmental impact of each delivery method could also be explored. For instance, locker deliveries might reduce carbon emissions by reducing the distance covered by delivery vehicles. This aspect aligns with growing consumer awareness and concern about environmental sustainability.
The work of S. Sethuraman, T. Mardan, and M.G.C. Resende was mainly done when they were at Amazon Transportation Services in Bellevue, Washington. The work of A. Bansal was mainly done when he was at Access Point Technology, Amazon Delivery Technologies, in Delhi, India. On December 13, 2022, a U.S. patent was issued for the capacity management system described in this paper (Sethuraman et al. 2022; U.S. Patent 11 526 838, Dec. 2022).
Appendix. Linear Programming Model for Throughput Maximization
This appendix describes a linear program whose objective is to maximize throughput, that is, the total number of packages delivered to the locker in the time horizon. Let $S$ be the number of ship options and $T$ the number of days in the planning horizon.
Let $C$ be the locker capacity and $p_{svt}$ be the dwell time probability of a package delivered for ship option $s$ on day $v$ being present in the locker on day $t$. Furthermore, let $e_{sv}$ be the number of packages of ship option $s$ delivered to the locker on day $v$ and present in the locker at the end of day 0, and let $d_{st}$ be the demand forecast for package deliveries of ship option $s$ on day $t$.
In this model, there are two classes of decision variables. Decision variable $x_{st}$ is the number of slots to reserve for ship option $s$ on day $t$. Variable $y_{st}$ is the number of packages accepted for arrival for ship option $s$ on day $t$.
Constraints (A.2) guarantee that locker capacity is not violated in any period of the planning horizon. Constraints (A.3) ensure, for any period of the planning horizon, that the number of packages of any ship option accepted for delivery is less than or equal to demand for that ship option. Constraints (A.4) define the relationship between the number of packages accepted for delivery and the number of slots reserved for the ship option. Finally, the decision variables are constrained to be nonnegative by Constraints (A.5).
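The displayed formulation does not appear in this text, so the following is only a sketch of one linear program consistent with the verbal descriptions above, not a reproduction of the original equations. The tags indicate which description each line is meant to match. Two assumptions are made: the reservation $x_{st}$ is taken to equal the expected number of slots occupied by ship option $s$ on day $t$, and for packages delivered before the horizon ($v \le 0$), $p_{svt}$ is read as the probability of still being present on day $t$ given presence at the end of day 0.

```latex
\begin{align}
\max \quad & \sum_{s=1}^{S} \sum_{t=1}^{T} y_{st} \tag{A.1} \\
\text{s.t.} \quad & \sum_{s=1}^{S} x_{st} \le C, && t = 1, \dots, T \tag{A.2} \\
& y_{st} \le d_{st}, && s = 1, \dots, S,\ t = 1, \dots, T \tag{A.3} \\
& x_{st} = \sum_{v \le 0} p_{svt}\, e_{sv} + \sum_{v=1}^{t} p_{svt}\, y_{sv}, && s = 1, \dots, S,\ t = 1, \dots, T \tag{A.4} \\
& x_{st} \ge 0,\ y_{st} \ge 0, && s = 1, \dots, S,\ t = 1, \dots, T \tag{A.5}
\end{align}
```

Under this reading, (A.4) ties each reservation to the packages already in the locker plus those accepted earlier in the horizon, each weighted by the probability that it is still occupying a slot on day $t$.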
References
- (2021) Carnival optimizes revenue and inventory across heterogenous cruise line brands. INFORMS J. Appl. Analytics 51(1):26–41.
- (2003) An overview of pricing models for revenue management. Manufacturing Service Oper. Management 5:203–229.
- (2019) Personal communication.
- (2015) New formulations for price and ticket availability decisions in choice-based network revenue management. Proc. AGIFORS 55th Annual Sympos. Analytics Efficiency Customer Centric Optimization (AGIFORS, Atlanta).
- (2017) Revenue management under the Markov Chain choice model. Oper. Res. 65:1322–1342.
- (2018) Online network revenue management using Thompson sampling. Oper. Res. 66:1586–1602.
- (2010) Capacity management on long-distance passenger trains of Indian Railways. Interfaces 40(4):291–302.
- (2010) Hub Group implements a suite of OR tools to improve its operations. Interfaces 40(5):368–384.
- (2019) Europcar integrates forecasting, simulation, and optimization techniques in a capacity and revenue management system. INFORMS J. Appl. Analytics 49(1):40–51.
- (2016) Analysis of parcel lockers’ efficiency as the last mile delivery solution: The results of the research in Poland. Proc. 10th Internat. Conf. City Logistics (Elsevier, Amsterdam), 644–655.
- (2008) Incorporating network flow effects into the airline fleet assignment process. Transportation Sci. 42(4):514–529.
- (2019) A strong Lagrangian relaxation for general discrete-choice network revenue management. Comput. Optim. Appl. 73:275–310.
- (1972) Forecasting and control of passenger bookings. Proc. 12th AGIFORS Sympos. (American Airlines Incorporated, Fort Worth, TX), 95–117.
- (2005) Forecasting and control of passenger bookings. J. Revenue Pricing Management 4(2):111–123.
- (1989) Optimization and estimation problems in airline yield management. PhD thesis, University of British Columbia, Vancouver.
- (1999) Revenue management: Research overview and prospects. Transportation Sci. 33:233–256.
- (2019) Flexible parcel delivery to automated parcel lockers: Models, solution methods and analysis. EURO J. Transportation Logist. 8:683–711.
- (2017) Survey on revenue management in media and broadcasting. Interfaces 47(3):195–213.
- (2005) Pricing and Revenue Optimization (Stanford University Press, Stanford, CA).
- (2020) A guide to parcel lockers in last mile distribution: Highlighting challenges and opportunities from an OR perspective. Technical report, CIRRELT, Université de Montréal, Montreal.
- (2022) Capacity management system for delivery lockers. U.S. Patent 11 526 838.
- (2019) Autonomous airline revenue management: A deep reinforcement learning approach to seat inventory control and overbooking. Proc. 36th Internat. Conf. Machine Learn. (Machine Learning Research Press, Online).
- (1992) Yield management at American Airlines. Interfaces 22(1):8–31.
- (2009) Revenue management: Models and methods. Proc. Winter Simulation Conf. (IEEE Press, Piscataway, NJ), 148–161.
- (2020) From home delivery to parcel lockers: A case study in Amsterdam. Transportation Res. Proc. 46:37–44.
- (1983) Optimum seat allocation for multi-leg flights with multiple fare types. Proc. 23rd Annual Sympos. AGIFORS (Eastern Airlines, Inc., Miami).
- (2018) Learning theory and algorithms for revenue management in sponsored search. Technical report, Alibaba Group, Hangzhou, China.
- (2003) A comparison of forecasting methods for hotel revenue management. Int. J. Forecasting 19(3):401–415.
- (2002) Transforming classifier scores into accurate multiclass probability estimates. Proc. 8th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (ACM, New York), 694–699.
Verification Letter
Ekta Mittal, Software Development Manager, Access Points Capacity Optimization, Amazon Development Center India Pvt. Ltd., 26/1, Brigade Gateway, World Trade Centre, 10th Floor, Dr. Rajkumar Road, Malleshwaram (W) Bangalore 560 055, Karnataka, India, writes:
“This letter is to certify that the methodology indicated in the paper entitled ‘Lockers Capacity Management’ by ‘Samyukta Sethuraman, Ankur Bansal, Setareh Mardan, Mauricio G. C. Resende, and Timothy L. Jacobs’ has been implemented in production in all marketplaces worldwide in Nov 2019. My team owns engineering for Amazon Locker capacity Optimization for all marketplaces including NA (US, CA), and EU5 (UK, IT, FR, DE, ES). We had deployed an interim implementation of the solution as proposed in the paper in Dec 2018, and the end to end automated solution in Oct 2019.
“In Dec 2018, we implemented a semiautomated solution and observed a year-over-year increase in locker throughput of 9% worldwide during the 2018 holiday season, impacting millions of customers. In Oct 2019, we started launching the automated solution in all countries. The 3 year NPV of using the new proposed algorithm is estimated to be $24.3M. Post the launch, we have seen improvement in unjustified rejections (Rejecting orders when we actually have space in lockers) on retail website by 60% YoY. We have also observed an increase in throughput (number of packages accepted per slot per locker) by 15% in US and UK to +50% in smaller countries like IT and ES.
“The solution implemented is a generic platform for all forecasting and optimization needs and horizontally scalable. Please see the comparison metrics for increase in throughput of capacity constrained lockers:
| Country | Throughput (Nov 2019) | Throughput (Nov 2018) |
|---|---|---|
| CA | 37.2% | 26.4% |
| US | 43.2% | 37.1% |
| FR | 38.5% | 26.6% |
| DE | 48.1% | 34.3% |
| IT | 44.9% | 29.7% |
| ES | 52.8% | 26.3% |
| UK | 52.8% | 46.1% |
“I would like to submit that the methodology is running in production since Dec 2018 in semiautomated way and since Oct 2019 in an end-to-end automated scalable technology.
“If you have any questions regarding the information provided, please feel free to reach me at ekta@amazon.
Samyukta Sethuraman is an applied science manager at Amazon in Palo Alto, California, focusing on machine learning and operations research. She joined Amazon in 2015 and has made notable contributions, including developing dynamic pricing models for Amazon Ads, capacity management systems for Amazon Lockers, and truck routing models. She earned her PhD in industrial engineering from Texas A&M University.
Ankur Bansal is the senior vice president of tech at Kotak Mahindra Bank and has more than 13 years in software development, including roles at Amazon and Adobe. He excels in innovation, team management, and architecture. He is a BITS Pilani graduate and leads product improvement, strategy, and microservices at Kotak. He oversees debt collection platform engineering, focusing on innovation and efficiency, and is keen on new technologies and public health awareness.
Setareh Mardan is a senior science manager at Amazon and heads a team of machine learning scientists and economists developing analytics for Amazon’s pricing. Her team addresses complex research questions impacting pricing strategies, blending economic theory, statistics, machine learning, and optimization. She has a PhD in operations research from the University of Minnesota.
Mauricio G. C. Resende is an affiliate professor at the University of Washington and is renowned for his work in metaheuristics and interior point methods. He has authored five books and more than 200 papers and holds 15 U.S. patents. His 40-year career spans various industries, including power systems in Brazil, semiconductor manufacturing in Silicon Valley, telecommunications in New Jersey, and retail logistics at Amazon. He is an INFORMS Fellow.
Timothy L. Jacobs is the director of research science at Amazon and leads middle-mile product research and optimization. His team focuses on optimization, generative artificial intelligence, machine learning, and strategic initiatives for global air, ground, rail, and maritime logistics. Their innovations have saved billions for Amazon’s transport network. He has authored many publications, holds several U.S. patents in the airline and supply chain sectors, and is a licensed professional engineer.