Abstract
While modern CPUs offer an increasing number of cores with shared caches, prevailing execution engines for business processes, workflows, or Web service compositions have not been optimized to exploit the abundant processing resources of such CPUs. One factor limiting performance is inefficient thread scheduling by the operating system, which can result in suboptimal use of shared caches. In this paper we study the performance of the JOpera business process execution engine on a recent multicore machine. By analyzing the engine's architecture and binding threads that are likely to access shared data to cores with a common cache, we achieve speedups of up to 13% for a variety of workloads. Apart from binding threads to CPUs, this requires no changes to the engine's architecture or implementation. As the engine is implemented in Java, we provide a new Java library to manage thread bindings and hardware performance counters. We also use hardware performance counters in our performance analysis to explain the observed speedup.
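The core idea of the abstract, placing threads that share data on cores behind a common cache, can be illustrated with a minimal sketch. This is not the paper's library or JOpera code: the class `CacheAwarePlacement`, its `plan` method, the cache topology, and all thread and group names are hypothetical and chosen only for illustration. The sketch only computes a placement plan; actually binding a thread to a core is platform-specific (on Linux, for example, through a native call to `sched_setaffinity`) and is left out here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical planner (not from the paper): assigns each group of
 * data-sharing threads to the cores of one shared-cache domain.
 */
final class CacheAwarePlacement {

    /**
     * @param cacheDomains one inner list of core IDs per shared cache (assumed topology)
     * @param threadGroups group name mapped to the names of threads that share data
     * @return thread name mapped to the core ID it should be bound to
     */
    static Map<String, Integer> plan(List<List<Integer>> cacheDomains,
                                     Map<String, List<String>> threadGroups) {
        if (cacheDomains.isEmpty()) {
            throw new IllegalArgumentException("at least one cache domain is required");
        }
        Map<String, Integer> placement = new LinkedHashMap<>();
        int domainIndex = 0;
        for (List<String> group : threadGroups.values()) {
            // All threads of one group go to the same cache domain;
            // groups are spread over the domains round-robin.
            List<Integer> cores = cacheDomains.get(domainIndex % cacheDomains.size());
            if (cores.isEmpty()) {
                throw new IllegalArgumentException("cache domain without cores");
            }
            int slot = 0;
            for (String thread : group) {
                // Within a domain, threads are spread over its cores round-robin.
                placement.put(thread, cores.get(slot % cores.size()));
                slot++;
            }
            domainIndex++;
        }
        return placement;
    }
}
```

With two cache domains `[0, 1]` and `[2, 3]`, a group of three threads would be mapped to cores 0, 1, and 0, and a second group would start on core 2. Which threads actually share data has to come from an analysis of the engine's architecture, as the abstract describes; the grouping is an input to this sketch, not something it infers.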
Original language | English |
---|---|
Title of host publication | Proceedings - 2010 IEEE International Conference on Service-Oriented Computing and Applications, SOCA 2010 |
Publication status | Published - 2010 |
Externally published | Yes |
Event | 2010 IEEE International Conference on Service-Oriented Computing and Applications, SOCA 2010, Australia. Duration: 13 Dec 2010 → 15 Dec 2010 |
Conference
Conference | 2010 IEEE International Conference on Service-Oriented Computing and Applications, SOCA 2010 |
---|---|
Country/Territory | Australia |
Period | 13/12/10 → 15/12/10 |