Many scientists perform extensive computations by executing large bags of similar tasks (BoTs) in mixtures of computational environments, such as grids and clouds. Although the reliability and cost may vary considerably across these environments, no tool exists to assist scientists in the selection of environments that can both fulfill deadlines and fit budgets. To address this situation, we introduce the Expert BoT scheduling framework. Our framework systematically selects from a large search space the Pareto-efficient scheduling strategies, that is, the strategies that deliver the best results for both make span and cost. Expert chooses from them the best strategy according to a general, user-specified utility function. Through simulations and experiments in real production environments, we demonstrate that Expert can substantially reduce both make span and cost in comparison to common scheduling strategies. For bioinformatics BoTs executed in a real mixed grid + cloud environment, we show how the scheduling strategy selected by Expert reduces both make span and cost by 30%-70%, in comparison to commonly-used scheduling strategies. © 2012 IEEE.
|Title of host publication||Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2012, Shanghai, China, May 21-25, 2012|
|Number of pages||12|
|Publication status||Published - 2012|
|Event||2012 IEEE 26th International Parallel and Distributed Processing Symposium, IPDPS 2012 - Shanghai, China|
Duration: 21 May 2012 → 25 May 2012
|Conference||2012 IEEE 26th International Parallel and Distributed Processing Symposium, IPDPS 2012|
|Period||21/05/12 → 25/05/12|