Cluster communication protocols for parallel-programming systems

K. Verstoep, R.A.F. Bhoedjang, T. Rühl, H.E. Bal, R. Hofman

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Clusters of workstations are a popular platform for high-performance computing. For many parallel applications, efficient use of a fast interconnection network is essential for good performance. Several modern System Area Networks include programmable network interfaces that can be tailored to perform protocol tasks that otherwise would need to be done by the host processors. Finding the right trade-off between protocol processing at the host and the network interface is difficult in general. In this work, we systematically evaluate the performance of different implementations of a single, user-level communication interface. The implementations make different architectural assumptions about the reliability of the network and the capabilities of the network interface. The implementations differ accordingly in their division of protocol tasks between host software, network-interface firmware, and network hardware. Also, we investigate the effects of alternative data-transfer methods and multicast implementations, and we evaluate the influence of packet size. Using microbenchmarks, parallel-programming systems, and parallel applications, we assess the performance of the different implementations at multiple levels. We use two hardware platforms with different performance characteristics to validate our conclusions. We show how moving protocol tasks to a relatively slow network interface can yield both performance advantages and disadvantages, depending on specific characteristics of the application and the underlying parallel-programming system.
Original language: English
Pages (from-to): 281-325
Journal: ACM Transactions on Computer Systems
Volume: 22
Issue number: 3
DOIs: 10.1145/1012268.1012269
Publication status: Published - 2004

Fingerprint

Parallel programming
Interfaces (computer)
Computer systems
Network protocols
Firmware
Computer workstations
Data transfer
Computer networks
Computer hardware
Hardware
Communication
Processing

Bibliographical note

1012269

Cite this

Verstoep, K. ; Bhoedjang, R.A.F. ; Rühl, T. ; Bal, H.E. ; Hofman, R. / Cluster communication protocols for parallel-programming systems. In: ACM Transactions on Computer Systems. 2004 ; Vol. 22, No. 3. pp. 281-325.
@article{31530268a4934b44acf0f78943cc7a19,
title = "Cluster communication protocols for parallel-programming systems",
abstract = "Clusters of workstations are a popular platform for high-performance computing. For many parallel applications, efficient use of a fast interconnection network is essential for good performance. Several modern System Area Networks include programmable network interfaces that can be tailored to perform protocol tasks that otherwise would need to be done by the host processors. Finding the right trade-off between protocol processing at the host and the network interface is difficult in general. In this work, we systematically evaluate the performance of different implementations of a single, user-level communication interface. The implementations make different architectural assumptions about the reliability of the network and the capabilities of the network interface. The implementations differ accordingly in their division of protocol tasks between host software, network-interface firmware, and network hardware. Also, we investigate the effects of alternative data-transfer methods and multicast implementations, and we evaluate the influence of packet size. Using microbenchmarks, parallel-programming systems, and parallel applications, we assess the performance of the different implementations at multiple levels. We use two hardware platforms with different performance characteristics to validate our conclusions. We show how moving protocol tasks to a relatively slow network interface can yield both performance advantages and disadvantages, depending on specific characteristics of the application and the underlying parallel-programming system.",
author = "K. Verstoep and R.A.F. Bhoedjang and T. R{\"u}hl and H.E. Bal and R. Hofman",
note = "1012269",
year = "2004",
doi = "10.1145/1012268.1012269",
language = "English",
volume = "22",
pages = "281--325",
journal = "ACM Transactions on Computer Systems",
issn = "0734-2071",
publisher = "Association for Computing Machinery (ACM)",
number = "3",
}

TY - JOUR
T1 - Cluster communication protocols for parallel-programming systems
AU - Verstoep, K.
AU - Bhoedjang, R.A.F.
AU - Rühl, T.
AU - Bal, H.E.
AU - Hofman, R.
N1 - 1012269
PY - 2004
Y1 - 2004
AB - Clusters of workstations are a popular platform for high-performance computing. For many parallel applications, efficient use of a fast interconnection network is essential for good performance. Several modern System Area Networks include programmable network interfaces that can be tailored to perform protocol tasks that otherwise would need to be done by the host processors. Finding the right trade-off between protocol processing at the host and the network interface is difficult in general. In this work, we systematically evaluate the performance of different implementations of a single, user-level communication interface. The implementations make different architectural assumptions about the reliability of the network and the capabilities of the network interface. The implementations differ accordingly in their division of protocol tasks between host software, network-interface firmware, and network hardware. Also, we investigate the effects of alternative data-transfer methods and multicast implementations, and we evaluate the influence of packet size. Using microbenchmarks, parallel-programming systems, and parallel applications, we assess the performance of the different implementations at multiple levels. We use two hardware platforms with different performance characteristics to validate our conclusions. We show how moving protocol tasks to a relatively slow network interface can yield both performance advantages and disadvantages, depending on specific characteristics of the application and the underlying parallel-programming system.

U2 - 10.1145/1012268.1012269
DO - 10.1145/1012268.1012269
M3 - Article
VL - 22
SP - 281
EP - 325
JO - ACM Transactions on Computer Systems
JF - ACM Transactions on Computer Systems
SN - 0734-2071
IS - 3
ER -