Abstract
High-frequency memory checkpointing is an important technique in several application domains, such as automatic error recovery (where frequent checkpoints allow the system to transparently mask failures) and application debugging (where frequent checkpoints enable fast and accurate time-traveling support). Unfortunately, existing (typically incremental) checkpointing frameworks incur substantial performance overhead in high-frequency memory checkpointing applications, thus discouraging their adoption in practice.
This paper presents Speculative Memory Checkpointing (SMC), a new low-overhead technique for high-frequency memory checkpointing. Our motivating analysis identifies key bottlenecks in existing frameworks and demonstrates that the performance of traditional incremental checkpointing strategies in high-frequency checkpointing scenarios is not optimal. To fill the gap, SMC relies on working set estimation algorithms to eagerly checkpoint the memory pages that belong to the writable working set of the running program and only lazily checkpoint the memory pages that do not. Our experimental results demonstrate that SMC is effective in reducing the performance overhead of prior solutions, is robust to variations in the workload, and incurs modest memory overhead compared to traditional incremental checkpointing.
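The eager/lazy split described above can be illustrated with a toy model. The sketch below is not the paper's implementation: it assumes memory is a plain list of page values, models a lazy checkpoint as a counted copy-on-write "fault", and uses a deliberately simple estimator (the union of pages written during the last `history` checkpoint intervals). The class name, method names, and the `history` parameter are all hypothetical.

```python
class SpeculativeCheckpointer:
    """Toy model of speculative (eager + lazy) page checkpointing."""

    def __init__(self, num_pages, history=2):
        self.memory = [0] * num_pages   # one value per simulated page
        self.history = history          # assumption: intervals the estimator remembers
        self.recent_writes = []         # per-interval sets of written pages
        self.dirty = set()              # pages written in the current interval
        self.saved = {}                 # page -> value at the last checkpoint
        self.faults = 0                 # lazy copy-on-write saves
        self.eager_copies = 0           # speculative saves at checkpoint time

    def estimated_working_set(self):
        # Assumed estimator: every page written in the remembered intervals.
        pages = set()
        for written in self.recent_writes:
            pages |= written
        return pages

    def checkpoint(self):
        # Close the current interval and keep only the most recent ones.
        self.recent_writes.append(self.dirty)
        self.recent_writes = self.recent_writes[-self.history:]
        self.dirty = set()
        # Eagerly save the pages predicted to be written again.
        self.saved = {p: self.memory[p] for p in self.estimated_working_set()}
        self.eager_copies += len(self.saved)

    def write(self, page, value):
        # A page outside the prediction is saved lazily on its first write.
        if page not in self.saved:
            self.saved[page] = self.memory[page]
            self.faults += 1
        self.dirty.add(page)
        self.memory[page] = value

    def rollback(self):
        # Restore every saved page to its value at the last checkpoint.
        for page, value in self.saved.items():
            self.memory[page] = value
```

In this model, a page that is written in consecutive intervals is copied once at checkpoint time instead of triggering a fault on its first write, while a page written only occasionally still takes the lazy path. The trade-off is that a mispredicted page is copied eagerly without being needed, which corresponds to the extra memory overhead the abstract mentions.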
Original language | English |
---|---|
Title of host publication | Proceedings of the ACM/IFIP/USENIX Middleware Conference |
Publisher | ACM |
Pages | 197-209 |
ISBN (Print) | 978-1-4503-3618-5 |
Publication status | Published - 2015 |