TY - JOUR
T1 - ZWAL
T2 - Rethinking Write-ahead Logs for ZNS SSDs with Zone Appends
AU - Doekemeijer, Krijn
AU - Ren, Zebin
AU - Tehrany, Nick
AU - Trivedi, Animesh
N1 - Publisher Copyright:
© 2024 Association for Computing Machinery. All rights reserved.
PY - 2024/6
Y1 - 2024/6
N2 - KV-stores are extensively used databases that require performance stability. Zoned Namespace (ZNS) is an emerging interface for flash storage devices that provides such stability. Due to their sequential write access patterns, LSM trees, ubiquitous data structures in KV stores, present a natural fit for the append-only ZNS interface. However, LSM-trees achieve limited write throughput on ZNS. This limitation is because the largest portion of LSM-tree writes are small writes for the write-ahead log (WAL) component of LSM-trees, and ZNS has limited performance for small write I/O. The ZNS-specific zone append operation presents a solution, enhancing the throughput of small sequential writes. Still, zone appends are challenging to utilize in WALs. The storage device is allowed to reorder the data of zone appends, which is not supported by WAL recovery. Therefore, we need to change the WAL design to support such reordering. This paper introduces ZWALs, a new WAL design that uses zone appends to increase LSM-tree write throughput. They are resilient to reordering by adding identifiers to each append along with a novel recovery technique. We implement ZWALs in the state-of-the-art combination of RocksDB and ZenFS and report up to 8.56 times higher throughput on the YCSB benchmark. We open-source all our code at https://github.com/stonet-research/zwal.
AB - KV-stores are extensively used databases that require performance stability. Zoned Namespace (ZNS) is an emerging interface for flash storage devices that provides such stability. Due to their sequential write access patterns, LSM trees, ubiquitous data structures in KV stores, present a natural fit for the append-only ZNS interface. However, LSM-trees achieve limited write throughput on ZNS. This limitation is because the largest portion of LSM-tree writes are small writes for the write-ahead log (WAL) component of LSM-trees, and ZNS has limited performance for small write I/O. The ZNS-specific zone append operation presents a solution, enhancing the throughput of small sequential writes. Still, zone appends are challenging to utilize in WALs. The storage device is allowed to reorder the data of zone appends, which is not supported by WAL recovery. Therefore, we need to change the WAL design to support such reordering. This paper introduces ZWALs, a new WAL design that uses zone appends to increase LSM-tree write throughput. They are resilient to reordering by adding identifiers to each append along with a novel recovery technique. We implement ZWALs in the state-of-the-art combination of RocksDB and ZenFS and report up to 8.56 times higher throughput on the YCSB benchmark. We open-source all our code at https://github.com/stonet-research/zwal.
UR - http://www.scopus.com/inward/record.url?scp=85201512179&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85201512179&partnerID=8YFLogxK
U2 - 10.1145/3689051.3689060
DO - 10.1145/3689051.3689060
M3 - Article
AN - SCOPUS:85201512179
SN - 0163-5980
VL - 58
SP - 53
EP - 60
JO - Operating Systems Review (ACM)
JF - Operating Systems Review (ACM)
IS - 1
ER -