Abstract
Accelerating sequential algorithms in order to achieve high performance is often a nontrivial task. However, there are certain properties that can complicate this process and make it particularly daunting. For example, building an efficient parallel solution for a data-intensive algorithm requires a deep analysis of the memory access patterns and data reuse potential. Attempting to scale out the computations on clusters of machines introduces further complications due to network speed limitations. In this context, the optimization landscape can be extremely complex owing to the large number of trade-off decisions. In this paper, we discuss our experience designing two parallel implementations of an existing data-intensive machine learning algorithm that detects overlapping communities in graphs. The first design uses a single GPU to accelerate the computations on small data sets. We employed a code generation strategy in order to test and identify the best-performing combination of optimizations. The second design uses a cluster of machines to scale out the computations for larger problem sizes. We used a mixture of MPI, RDMA, and pipelining in order to circumvent networking overhead. Both of these efforts bring us closer to understanding the complex relationships hidden within networks of entities.
Original language | English |
---|---|
Title of host publication | Proceedings - 2016 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016 |
Publisher | Institute of Electrical and Electronics Engineers, Inc. |
Pages | 175-178 |
Number of pages | 4 |
ISBN (Electronic) | 9781509024520 |
Publication status | Published - 18 Jul 2016 |
Event | 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016 - Cartagena, Colombia. Duration: 16 May 2016 → 19 May 2016 |
Conference
Conference | 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016 |
---|---|
Country/Territory | Colombia |
City | Cartagena |
Period | 16/05/16 → 19/05/16 |
Keywords
- Algorithms for Accelerators and Heterogeneous Systems
- Combinatorial and Data Intensive Application
- Performance Analysis
- Statistical Learning