TY - JOUR
T1 - Dynamic Load Balancing and Job Replication in a Global-Scale Grid Environment: A Comparison
AU - Dobber, A.M.
AU - van der Mei, R.D.
AU - Koole, G.M.
PY - 2009
Y1 - 2009
AB - Global-scale grids offer a massive source of processing power and thereby the means to support processor-intensive parallel applications. The burstiness and unpredictability of the available processing and network resources create a strong need to make applications robust against the dynamics of grid environments. The two main techniques for coping with the dynamic nature of the grid are dynamic load balancing (DLB) and job replication (JR). In this paper, we analyze and compare the effectiveness of these two approaches by means of trace-driven simulations. We observe that there exists an easy-to-measure statistic Y and a corresponding threshold value Y* such that DLB consistently outperforms JR when Y > Y*, whereas the reverse is true for Y < Y*. Based on this observation, we propose a simple and easy-to-implement approach, referred to throughout as the DLB/JR method, that decides dynamically whether to use DLB or JR. Extensive simulations based on a large set of real data monitored in a global-scale grid show that the DLB/JR method consistently performs at least as well as both DLB and JR in all circumstances, which makes it highly robust against the unpredictable nature of global-scale grids. © 2009, IEEE. All rights reserved.
U2 - 10.1109/TPDS.2008.61
DO - 10.1109/TPDS.2008.61
M3 - Article
SN - 1045-9219
VL - 20
SP - 207
EP - 218
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
ER -