TY - GEN
T1 - Distributed DNN serving in the network data plane
AU - Razavi, K.
AU - Karlos, G.
AU - Nigade, V.
AU - Mühlhäuser, M.
AU - Wang, L.
PY - 2022/12
Y1 - 2022/12
AB - Programmable networks have received tremendous attention recently. Apart from exciting network innovations, in-network computing has been explored as a means to accelerate a variety of distributed systems concerns, by leveraging programmable network devices. In this paper, we extend in-network computing to an important class of applications called deep neural network (DNN) serving. In particular, we propose to run DNN inferences in the network data plane in a distributed fashion and make our programmable network a powerful accelerator for DNN serving. We demonstrate the feasibility of this idea through a case study with a real-world DNN on a typical data center network architecture.
UR - http://www.scopus.com/inward/record.url?scp=85145596391&partnerID=8YFLogxK
U2 - 10.1145/3565475.3569079
DO - 10.1145/3565475.3569079
M3 - Conference contribution
SP - 67
EP - 70
BT - EuroP4 '22
PB - Association for Computing Machinery, Inc
T2 - 5th International Workshop on P4 in Europe, EuroP4 2022, co-located with ACM CoNEXT 2022
Y2 - 9 December 2022
ER -