POSUM: A Portfolio Scheduler for MapReduce Workloads

Maria A. Voinea, Alexandru Uta, Alexandru Iosup

Research output: Chapter in Book / Report / Conference proceeding > Conference contribution > Academic > peer-review

Abstract

MapReduce ecosystems are (still) widely popular for big data processing in data centers. To address the diverse non-functional requirements arising from many and increasingly sophisticated users, the community has developed many scheduling policies for MapReduce workloads. Although some individual policies can dynamically optimize for single and stable performance objectives, such as minimizing runtime or cost, or meeting deadlines for real-time jobs, it seems unlikely that individual policies will remain competitive for increasingly dynamic workloads and objectives. In contrast, in this work we investigate the ability of a portfolio scheduler for MapReduce workloads to dynamically balance performance and cost. To this end, we design and implement a portfolio scheduling technique, that is, a system capable of adapting to the current workload characteristics and target objectives by periodically evaluating its set of potential policies, and of switching to "the best" policy that targets the current system state. We implement and evaluate our system with real-world experiments on a workload containing a mixture of real-time and batch jobs, with the purpose of minimizing deadline violations while keeping batch job slowdown in check. Our results show that POSUM is a promising alternative: it can outperform the individual policies of its portfolio for the combined optimization goal, even without precise predictions.
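The selection step described in the abstract (evaluate each policy in the portfolio against the current state, then switch to the best one) can be illustrated with a small sketch. This is not POSUM's implementation: the `Job` fields, the three policies, the single-slot simulation, and the cost function (weighted deadline violations plus mean batch slowdown, with a hypothetical `violation_weight`) are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Job:
    name: str
    runtime: float             # estimated runtime in seconds (assumed to be known)
    deadline: Optional[float]  # absolute deadline in seconds; None marks a batch job


# A policy orders the pending jobs. All policies below are illustrative.
Policy = Callable[[List[Job]], List[Job]]


def fifo(jobs: List[Job]) -> List[Job]:
    return list(jobs)


def shortest_first(jobs: List[Job]) -> List[Job]:
    return sorted(jobs, key=lambda j: j.runtime)


def earliest_deadline_first(jobs: List[Job]) -> List[Job]:
    # Jobs with a deadline come first, ordered by deadline; batch jobs go last.
    return sorted(jobs, key=lambda j: (j.deadline is None, j.deadline or 0.0))


def simulate_cost(order: List[Job], violation_weight: float = 10.0) -> float:
    """Run jobs back to back on one slot and score the outcome (lower is better)."""
    clock = 0.0
    violations = 0
    slowdowns: List[float] = []
    for job in order:
        clock += job.runtime
        if job.deadline is not None:
            violations += clock > job.deadline
        else:
            # Slowdown: completion time relative to the job's own runtime.
            slowdowns.append(clock / job.runtime)
    mean_slowdown = sum(slowdowns) / len(slowdowns) if slowdowns else 0.0
    return violation_weight * violations + mean_slowdown


def select_policy(portfolio: Dict[str, Policy], pending: List[Job]) -> str:
    """Evaluate every policy on the current queue and return the cheapest one."""
    return min(portfolio, key=lambda name: simulate_cost(portfolio[name](pending)))


PORTFOLIO: Dict[str, Policy] = {
    "fifo": fifo,
    "shortest_first": shortest_first,
    "earliest_deadline_first": earliest_deadline_first,
}
```

A portfolio scheduler in this style would call `select_policy(PORTFOLIO, pending_jobs)` at a fixed interval and install the returned policy until the next evaluation. The selected policy changes with the queue contents, which is the adaptivity the abstract describes.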

Language: English
Title of host publication: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
Editors: Yang Song, Bing Liu, Kisung Lee, Naoki Abe, Calton Pu, Mu Qiao, Nesreen Ahmed, Donald Kossmann, Jeffrey Saltz, Jiliang Tang, Jingrui He, Huan Liu, Xiaohua Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 351-357
Number of pages: 7
ISBN (Electronic): 9781538650356
DOIs: https://doi.org/10.1109/BigData.2018.8622215
Publication status: Published - 22 Jan 2019
Event: 2018 IEEE International Conference on Big Data, Big Data 2018 - Seattle, United States
Duration: 10 Dec 2018 to 13 Dec 2018

Conference

Conference: 2018 IEEE International Conference on Big Data, Big Data 2018
Country: United States
City: Seattle
Period: 10/12/18 to 13/12/18

Fingerprint

Scheduling
Ecosystems
Costs
Experiments
Big data

Cite this

Voinea, M. A., Uta, A., & Iosup, A. (2019). POSUM: A Portfolio Scheduler for MapReduce Workloads. In Y. Song, B. Liu, K. Lee, N. Abe, C. Pu, M. Qiao, N. Ahmed, D. Kossmann, J. Saltz, J. Tang, J. He, H. Liu, ... X. Hu (Eds.), Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018 (pp. 351-357). [8622215] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2018.8622215
@inproceedings{83f9ab9b5b954e978161b43a1404e6bb,
title = "POSUM: A Portfolio Scheduler for MapReduce Workloads",
author = "Voinea, {Maria A.} and Alexandru Uta and Alexandru Iosup",
year = "2019",
month = "1",
day = "22",
doi = "10.1109/BigData.2018.8622215",
language = "English",
pages = "351--357",
editor = "Yang Song and Bing Liu and Kisung Lee and Naoki Abe and Calton Pu and Mu Qiao and Nesreen Ahmed and Donald Kossmann and Jeffrey Saltz and Jiliang Tang and Jingrui He and Huan Liu and Xiaohua Hu",
booktitle = "Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - POSUM

T2 - A Portfolio Scheduler for MapReduce Workloads

AU - Voinea, Maria A.

AU - Uta, Alexandru

AU - Iosup, Alexandru

PY - 2019/1/22

Y1 - 2019/1/22

UR - http://www.scopus.com/inward/record.url?scp=85062613491&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062613491&partnerID=8YFLogxK

U2 - 10.1109/BigData.2018.8622215

DO - 10.1109/BigData.2018.8622215

M3 - Conference contribution

SP - 351

EP - 357

BT - Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018

A2 - Song, Yang

A2 - Liu, Bing

A2 - Lee, Kisung

A2 - Abe, Naoki

A2 - Pu, Calton

A2 - Qiao, Mu

A2 - Ahmed, Nesreen

A2 - Kossmann, Donald

A2 - Saltz, Jeffrey

A2 - Tang, Jiliang

A2 - He, Jingrui

A2 - Liu, Huan

A2 - Hu, Xiaohua

PB - Institute of Electrical and Electronics Engineers Inc.

ER -
