TY - JOUR
T1 - Learning to select operators in meta-heuristics
T2 - An integration of Q-learning into the iterated greedy algorithm for the permutation flowshop scheduling problem
AU - Karimi-Mamaghan, M.
AU - Mohammadi, M.
AU - Pasdeloup, B.
AU - Meyer, P.
PY - 2022
Y1 - 2022
N2 - © 2022 Elsevier B.V.This paper aims at integrating machine learning techniques into meta-heuristics for solving combinatorial optimization problems. Specifically, our study develops a novel efficient iterated greedy algorithm based on reinforcement learning. The main novelty of the proposed algorithm is its new perturbation mechanism, which incorporates Q-learning to select appropriate perturbation operators during the search process. Through an application to the permutation flowshop scheduling problem, comprehensive computational experiments are conducted on a wide range of benchmark instances to evaluate the performance of the proposed algorithm. This evaluation is done against non-learning versions of the iterated greedy algorithm and seven state-of-the-art algorithms from the literature. The experimental results and statistical analyses show the better performance of the proposed algorithm in terms of optimality gaps, convergence rate, and computational overhead.
AB - © 2022 Elsevier B.V.This paper aims at integrating machine learning techniques into meta-heuristics for solving combinatorial optimization problems. Specifically, our study develops a novel efficient iterated greedy algorithm based on reinforcement learning. The main novelty of the proposed algorithm is its new perturbation mechanism, which incorporates Q-learning to select appropriate perturbation operators during the search process. Through an application to the permutation flowshop scheduling problem, comprehensive computational experiments are conducted on a wide range of benchmark instances to evaluate the performance of the proposed algorithm. This evaluation is done against non-learning versions of the iterated greedy algorithm and seven state-of-the-art algorithms from the literature. The experimental results and statistical analyses show the better performance of the proposed algorithm in terms of optimality gaps, convergence rate, and computational overhead.
UR - http://www.scopus.com/inward/record.url?scp=85132649463&partnerID=8YFLogxK
U2 - 10.1016/j.ejor.2022.03.054
DO - 10.1016/j.ejor.2022.03.054
M3 - Article
SN - 0377-2217
VL - 304
SP - 1296
EP - 1330
JO - European Journal of Operational Research
JF - European Journal of Operational Research
IS - 3
ER -