Abstract
Cloud datacenters are increasingly hosting business workloads. Such long-running, on-demand workloads raise important challenges in datacenter operation, requiring efficient online scheduling of workloads with unprecedented characteristics under strict service level agreements (SLAs). In this work, we propose an approach to manage the risk of not meeting SLAs. Our approach is based on portfolio scheduling, which is an online scheduling technique that dynamically selects a scheduling algorithm from a set (portfolio), subject to a possibly changing utility function. Ours is the first datacenter-scheduling approach to consider operational and disaster-recovery risks. Using trace-based simulation with traces collected from a commercial multi-datacenter environment, we give evidence that portfolio scheduling is able to mitigate risks significantly better than its constituent scheduling algorithms and better than datacenter engineers.
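As an illustration only, the selection step of portfolio scheduling described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the job model, the three policies, the greedy `simulate` helper, and the two utility functions are all hypothetical, and the utilities are simple placeholders rather than the operational or disaster-recovery risk metrics studied in the paper.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical job model: (name, cores requested).
Job = Tuple[str, int]
Policy = Callable[[List[Job]], List[Job]]
Utility = Callable[[List[Job]], float]


def fcfs(queue: List[Job]) -> List[Job]:
    """First-come, first-served: keep arrival order."""
    return list(queue)


def smallest_first(queue: List[Job]) -> List[Job]:
    """Order jobs by ascending core demand (stable for ties)."""
    return sorted(queue, key=lambda job: job[1])


def largest_first(queue: List[Job]) -> List[Job]:
    """Order jobs by descending core demand (stable for ties)."""
    return sorted(queue, key=lambda job: -job[1])


# The portfolio: a named set of candidate scheduling algorithms.
PORTFOLIO: Dict[str, Policy] = {
    "fcfs": fcfs,
    "smallest_first": smallest_first,
    "largest_first": largest_first,
}


def simulate(order: List[Job], capacity: int) -> List[Job]:
    """Greedily place jobs in the given order; return the jobs that fit."""
    placed: List[Job] = []
    free = capacity
    for name, cores in order:
        if cores <= free:
            placed.append((name, cores))
            free -= cores
    return placed


def jobs_placed(placed: List[Job]) -> float:
    """Placeholder utility: number of jobs started."""
    return float(len(placed))


def cores_used(placed: List[Job]) -> float:
    """Placeholder utility: total cores occupied."""
    return float(sum(cores for _, cores in placed))


def select_policy(
    portfolio: Dict[str, Policy],
    queue: List[Job],
    capacity: int,
    utility: Utility,
) -> Tuple[str, Dict[str, float]]:
    """Simulate every policy on the current queue and pick the best-scoring one.

    Ties go to the policy listed first in the portfolio.
    """
    scores = {
        name: utility(simulate(policy(queue), capacity))
        for name, policy in portfolio.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

Calling `select_policy` again at each scheduling round, possibly with a different `utility`, is what makes the selection dynamic: the same queue can favor a different policy once the utility function changes.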
Original language | English |
---|---|
Title of host publication | 2019 18th International Symposium on Parallel and Distributed Computing (ISPDC 2019) |
Subtitle of host publication | [Proceedings] |
Editors | Alexandru Iosup, Radu Prodan, Alexandru Uta, Florin Pop |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 94-102 |
Number of pages | 9 |
ISBN (Electronic) | 9781728138008, 9781728138015 |
ISBN (Print) | 9781728138022 |
Publication status | Published - 2019 |
Event | 18th International Symposium on Parallel and Distributed Computing, ISPDC 2019 - Amsterdam, Netherlands. Duration: 5 Jun 2019 → 7 Jun 2019 |
Conference
Conference | 18th International Symposium on Parallel and Distributed Computing, ISPDC 2019 |
---|---|
Country/Territory | Netherlands |
City | Amsterdam |
Period | 5/06/19 → 7/06/19 |
Keywords
- Portfolio Scheduling, Datacenter Resource Management, Risk Management, Risk Tolerance, Operational Risk, Disaster Recoverability Risk