Open Access

A Theory of Auto-Scaling for Resource Reservation in Cloud Services

Konstantinos Psychas
Konstantinos Psychas
[email protected]
https://orcid.org/0000-0002-5655-4868
Department of Electrical Engineering, Columbia University, New York, New York 10027
Search for more papers by this author
,
Javad Ghaderi
Javad Ghaderi
[email protected]
https://orcid.org/0000-0001-8038-550X
Department of Electrical Engineering, Columbia University, New York, New York 10027
Search for more papers by this author

Konstantinos Psychas

[email protected]

https://orcid.org/0000-0002-5655-4868

Department of Electrical Engineering, Columbia University, New York, New York 10027

Search for more papers by this author

Javad Ghaderi

[email protected]

https://orcid.org/0000-0001-8038-550X

Department of Electrical Engineering, Columbia University, New York, New York 10027

Search for more papers by this author

Published Online:1 Feb 2022https://doi.org/10.1287/stsy.2021.0091

References

Aceto G, Botta A, De Donato W, Pescapè A (2013) Cloud monitoring: A survey. Comput. Networking 57(9):2093–2115.Google Scholar
Amazon (2021) Amazon EC2 auto scaling with EC2 spot instances. Accessed January, 17, 2022, https://aws.amazon.com/getting-started/hands-on/ec2-auto-scaling-spot-instances/.Google Scholar
Amazon On-Demand Instances (2021) Amazon on-demand instances. Accessed January, 17, 2022, https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-on-demand-instances.html.Google Scholar
Amazon Spot Instances (2021) Amazon spot instances. Accessed January, 17, 2022, https://aws.amazon.com/ec2/spot/.Google Scholar
Amazon Web Services (2020a) Amazon AWS containers. Accessed January, 17, 2022, https://aws.amazon.com/containers/.Google Scholar
Amazon Web Services (2020b) Amazon EC2 auto-scaler. Accessed January, 17, 2022, https://docs.aws.amazon.com/autoscaling/.Google Scholar
Amazon Web Services (2020c) Amazon web Services (AWS). Accessed January, 17, 2022, https://aws.amazon.com/.Google Scholar
Amazon Web Services (2020d) AWS service level agreements (SLAs). Accessed January, 17, 2022, https://aws.amazon.com/legal/service-level-agreements/.Google Scholar
Andonov R, Poirriez V, Rajopadhye S (2000) Unbounded knapsack problem: Dynamic programming revisited. Eur. J. Oper. Res. 123(2):394–407.Google Scholar
Apache Software Foundation (2019) Apache hadoop yarn. Accessed January, 17, 2022, http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html.Google Scholar
Bean NG, Gibbens RJ, Zachary S (1995) Asymptotic analysis of single resource loss systems in heavy traffic, with applications to integrated networks. Adv. Appl. Probability 27(1):273–292.Google Scholar
Billingsley P (2008) Probability and Measure (John Wiley & Sons, Hoboken, NJ).Google Scholar
Bolch G, Greiner S, Meer H, Trivedi K (2006) Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications, 2nd ed., vol. 95 (John Wiley & Sons, Hoboken, NJ).Google Scholar
Boyd S, Vandenberghe L (2004) Convex Optimization (Cambridge University Press, New York).Google Scholar
Cohen C, Rouhling D (2017) A formal proof in Coq of LaSalle’s invariance principle. Ayala-Rincón M, Muñoz CA, eds. Interactive Theorem Proving (Springer International Publishing, Cham, Switzerland), 148–163.Google Scholar
Corradi A, Fanelli M, Foschini L (2014) VM consolidation: A real case based on openstack cloud. Future Generation Comput. Systems 32:118–127.Google Scholar
Cygan M, Jeż Ł, Sgall J (2016) Online knapsack revisited. Theory Comput. Systems 58(1):153–190.Google Scholar
Ghaderi J, Zhong Y, Srikant R (2014) Asymptotic optimality of bestfit for stochastic bin packing. Performance Evaluation Rev. 42(2):64–66.Google Scholar
Ghobaei-Arani M, Jabbehdari S, Pourmina MA (2018) An autonomic resource provisioning approach for service-based cloud applications: A hybrid approach. Future Generation Comput. Systems 78:191–210.Google Scholar
Google Cloud (2020a) Google Cloud. Accessed January, 17, 2022, https://cloud.google.com/.Google Scholar
Google Cloud (2020b) Google compute engine pricing. Accessed January, 17, 2022, https://cloud.google.com/compute/all-pricing.Google Scholar
Google Cloud (2020c) Google kubernetes. Accessed January, 17, 2022, https://cloud.google.com/kubernetes/.Google Scholar
Guo Y, Stolyar A, Walid A (2018) Online VM auto-scaling algorithms for application hosting in a cloud. IEEE Trans. Cloud Comput. 8(3):889–898.Google Scholar
Gupta V, Radovanovic A (2012) Online stochastic bin packing. Preprint, submitted November 12, 2012. Accessed January, 17, 2022, https://arxiv.org/abs/1211.2687.Google Scholar
Han R, Guo L, Ghanem MM, Guo Y (2012) Lightweight resource scaling for cloud applications. Proc. EEE/ACM Internat. Sympos. on Cluster, Cloud and Grid Comput. (IEEE Computer Society, Los Alamitos, CA), 644–651.Google Scholar
Hunt P, Kurtz T (1994) Large loss networks. Stochastic Processing Appl. 53(2):363–378.Google Scholar
Hunt P, Laws C (1997) Optimization via trunk reservation in single resource loss systems under heavy traffic. Ann. Appl. Probability 7(4):1058–1079.Google Scholar
Ibarra OH, Kim CE (1975) Fast approximation algorithms for the knapsack and sum of subset problems. J. ACM 22(4):463–468.Google Scholar
Iwama K, Taketomi S (2002) Removable online knapsack problems. Proc. Internat. Colloquium on Automata, Languages, and Programming (Springer, Berlin), 293–305.Google Scholar
Jiang J, Lu J, Zhang G, Long G (2013) Optimal cloud resource auto-scaling for web applications. Proc. IEEE/ACM Internat. Sympos. on Cluster, Cloud, and Grid Comput. (IEEE Press, Delft, Netherlands), 58–65.Google Scholar
Jo C, Kim H, Egger B (2020) Instant virtual machine live migration. Proc. Internat. Conf. on the Econom. of Grids, Clouds, Systems, and Services (Springer, Berlin), 155–170.Google Scholar
Karthik A, Mukhopadhyay A, Mazumdar RR (2017) Choosing among heterogeneous server clouds. Queueing Systems 85(1-2):1–29.Google Scholar
Kellerer H, Pferschy U, Pisinger D (2004) Multidimensional knapsack problem. Knapsack Problems (Springer, Berlin), 235–283.Google Scholar
Kelly FP (1991) Loss networks. Ann. Appl. Probability 1(3):319–378.Google Scholar
Key PB (1990) Optimal control and trunk reservation in loss networks. Probability Engrg. Inform. Sci. 4(2):203–242.Google Scholar
LaSalle J (1960) Some extensions of Liapunov’s second method. IRE Trans. Circuit Theory 7(4):520–527.Google Scholar
Le T (2020) A survey of live virtual machine migration techniques. Comput. Sci. Rev. 38:100304.Google Scholar
Maguluri ST, Srikant R, Ying L (2012) Stochastic models of load balancing and scheduling in cloud computing clusters. Proc. IEEE INFOCOM (IEEE, New York), 702–710.Google Scholar
Maguluri ST, Srikant R, Ying L (2014) Heavy traffic optimal resource allocation algorithms for cloud computing clusters. Performance Evaluation 81:20–39.Google Scholar
Mao M, Li J, Humphrey M (2010) Cloud auto-scaling with deadline and budget constraints. Proc. IEEE/ACM Internat. Conf. on Grid Comput. (IEEE Computer Society, Los Alamitos, CA), 41–48.Google Scholar
Marchetti-Spaccamela A, Vercellis C (1995) Stochastic on-line knapsack problems. Math. Programming 68(1-3):73–104.Google Scholar
Martello S, Toth P (1990) An exact algorithm for large unbounded knapsack problems. Oper. Res. Lett. 9(1):15–20.Google Scholar
Microsoft Azure (2020) Microsoft Azure. Accessed January, 17, 2022, https://azure.microsoft.com/.Google Scholar
Mukhopadhyay A, Karthik A, Mazumdar RR, Guillemin F (2015) Mean field and propagation of chaos in multi-class heterogeneous loss models. Performance Evaluation 91:117–131.Google Scholar
Psychas K, Ghaderi J (2017) On non-preemptive VM scheduling in the cloud. Proc. ACM on Measurement and Analysis of Comput. Systems, vol. 1 (ACM, New York), 1–29.Google Scholar
Psychas K, Ghaderi J (2018) Randomized algorithms for scheduling multi-resource jobs in the cloud. IEEE/ACM Trans. Networks 26(5):2202–2215.Google Scholar
Qu C, Calheiros RN, Buyya R (2018) Auto-scaling web applications in clouds: A taxonomy and survey. ACM Comput. Survey 51(4):1–33.Google Scholar
Rampersaud S, Grosu D (2014) A sharing-aware greedy algorithm for virtual machine maximization. Proc. IEEE 13th Internat. Sympos. on Network Comput. and Applications (IEEE Computer Society, Los Alamitos, CA), 113–120.Google Scholar
RedHat (2021) Cloud native applications: Stateful vs stateless. Accessed January, 17, 2022, https://www.redhat.com/en/topics/cloud-native-apps/stateful-vs-stateless.Google Scholar
Roy N, Dubey A, Gokhale A (2011) Efficient autoscaling in the cloud using predictive models for workload forecasting. Proc. IEEE 4th Internat. Conf. on Cloud Comput., (IEEE Computer Society, Los Alamitos, CA), 500–507.Google Scholar
Shi J, Luo J, Dong F, Jin J, Shen J (2018) Fast multi-resource allocation with patterns in large scale cloud data center. J. Comput. Sci. 26:389–401.Google Scholar
Song W, Xiao Z, Chen Q, Luo H (2013) Adaptive resource provisioning for the cloud using online bin packing. IEEE Trans. Comput. 63(11):2647–2660.Google Scholar
Stillwell M, Vivien F, Casanova H (2012) Virtual machine resource allocation for service hosting on heterogeneous distributed platforms. Proc. IEEE Internat. Parallel Distributed Processing Sympos. (IPDPS) 2012, Shanghai, China, 786–797.Google Scholar
Stolyar AL (2013) An infinite server system with general packing constraints. Oper. Res. 61(5):1200–1217.Link, Google Scholar
Stolyar AL (2017) Large-scale heterogeneous service systems with general packing constraints. Adv. Appl. Probability 49(1):61–83.Google Scholar
Stolyar AL, Zhong Y (2013) A large-scale service system with packing constraints: Minimizing the number of occupied servers. ACM SIGMETRICS Performance Evaluation Rev. 41(1):41–52.Google Scholar
Stolyar AL, Zhong Y (2015) Asymptotic optimality of a greedy randomized algorithm in a large-scale service system with general packing constraints. Queueing Systems 79(2):117–143.Google Scholar
Verma A, Pedrosa L, Korupolu M, Oppenheimer D, Tune E, Wilkes J (2015) Large-scale cluster management at Google with Borg. Proc. 10th Eur. Conf. on Comput. Systems (Association for Computing Machinery, New York), 1–17.Google Scholar
Whitt W (1985) Blocking when service is required from several facilities simultaneously. ATT Tech. J. 64(8):1807–1856.Google Scholar
Wilkes J (2011) Google cluster data. Accessed January 17, 2022, https://github.com/google/cluster-data.Google Scholar
Xie Q, Dong X, Lu Y, Srikant R (2015) Power of d choices for large-scale bin packing: A loss model. Performance Evaluation Rev. 43(1):321–334.Google Scholar
Zhao Y, Huang Y, Chen K, Yu M, Wang S, Li D (2015) Joint VM placement and topology optimization for traffic scalability in dynamic datacenter networks. Comput. Networks 80:109–123.Google Scholar

Volume 12, Issue 3

September 2022

Pages 227-317

Article Information

Supplemental Material

Metrics

Information

Received:January 05, 2021
Accepted:November 17, 2021
Published Online:February 01, 2022

Cite as

Konstantinos Psychas, Javad Ghaderi (2022) A Theory of Auto-Scaling for Resource Reservation in Cloud Services. Stochastic Systems 12(3):227-252.

https://doi.org/10.1287/stsy.2021.0091

Keywords

PDF download

Available Issues

Available Issues

Available Issues

A Theory of Auto-Scaling for Resource Reservation in Cloud Services

References

Volume 12, Issue 3

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News