Advanced

LSM-Tree

Keywords

LSM-TreememtableSSTablesorted runflushcompactionleveled compactiontiered compactionwrite amplificationread amplificationspace amplificationAsk ChatGPT

Prerequisites

Progress

在前面的單元中，我們學過了 key-value store API 的抽象介面以及 Bloom filter 如何快速過濾不存在的查詢，現在我們要把這些概念組合起來，理解現代 write-intensive 儲存引擎的核心：LSM-Tree。LSM-Tree 的設計哲學是將隨機寫入轉換為循序寫入——我們先將資料寫入 memory 中的 memtable，滿了之後 flush 成 immutable 的 sorted run（SSTable）到磁碟上。隨著 SSTable 數量增加，我們透過 compaction 將它們合併，而 compaction 策略（leveled vs. tiered）決定了 write amplification、read amplification 與 space amplification 之間的取捨。在讀取路徑上，我們會利用先前學過的 Bloom filter 來跳過不包含目標 key 的 SSTable，大幅降低讀取延遲。

Key Concepts

我理解 LSM-Tree 的寫入路徑：資料先寫入 memory 中的 memtable，滿了之後 flush 成 immutable 的 SSTable 到磁碟上

我理解 leveled compaction 與 tiered compaction 兩種策略的差異，以及它們各自在 write amplification、read amplification、space amplification 之間的取捨

我理解三種 amplification（write、read、space）之間存在不可能三角的權衡關係，無法同時最小化三者

我理解讀取路徑中 Bloom filter 如何被用來跳過不包含目標 key 的 SSTable，避免不必要的 disk read

我了解為什麼 LSM-Tree 偏好 sequential write：將隨機寫入轉換為循序寫入的設計哲學，讓它在 write-intensive workload 中表現優異

LSM-Tree

Keywords

Prerequisites

Related Papers

Progress

Key Concepts

Recommended Resources

Test Your Understanding