Component Deep Dive: src/page_cache.rs
The page cache module implements a generic LRU cache that underpins both the uncompressed (UPC) and compressed (CPC) caches. It manages hot and cold copies of pages, tracks usage timestamps, and performs eviction when capacity is exceeded.
Source Overview
src/page_cache.rs
31 #[derive(Clone)] pub struct PageCacheEntry<T> { page: Arc<T>, used_time: u64 }
37 #[derive(Clone)] pub struct PageCacheEntryUncompressed { pub page: Page }
41 #[derive(Clone)] pub struct PageCacheEntryCompressed { pub page: Vec<u8> }
63 pub struct PageCache<T> { store, lru_queue }
69 impl<T> PageCache<T> { pub fn new() -> Self, add, has, get, evict }
Core Structures
PageCacheEntry
┌───────────────────────┐   ┌──────────────────────────────┐
│ PageCacheEntry<T>     │   │ PageCacheEntryUncompressed   │
│ ├─ page     : Arc<T>  │   │ └─ page : Page               │
│ └─ used_time: u64     │   └──────────────────────────────┘
└───────────────────────┘
                            ┌──────────────────────────────┐
                            │ PageCacheEntryCompressed     │
                            │ └─ page : Vec<u8> (blob)     │
                            └──────────────────────────────┘
Arc<T> allows multiple consumers to share a cached page without copying. used_time stores the access timestamp (epoch millis) driving the LRU ordering.
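The entry types can be reconstructed from the signatures in the source overview; a minimal sketch (the used_time type is taken from the diagram above, and Page is the crate's page type defined outside this module):

use std::sync::Arc;

// Generic cache entry: a shared handle to the cached value plus its last access time.
#[derive(Clone)]
pub struct PageCacheEntry<T> {
    page: Arc<T>,
    used_time: u64, // epoch millis recorded by add()
}

// Payload held by the uncompressed page cache (UPC).
// `Page` is the crate's page type, defined elsewhere in the codebase.
#[derive(Clone)]
pub struct PageCacheEntryUncompressed {
    pub page: Page,
}

// Payload held by the compressed page cache (CPC): an opaque compressed blob.
#[derive(Clone)]
pub struct PageCacheEntryCompressed {
    pub page: Vec<u8>,
}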
PageCache
PageCache<T>
├─ store : HashMap<PageId, PageCacheEntry<T>>
└─ lru_queue: BTreeSet<(used_time, PageId)>
store provides O(1) lookups by page ID. lru_queue maintains a sorted set of (used_time, PageId) pairs, enabling quick retrieval of the oldest entry (the first element). LRUsize is fixed at 10; when exceeded, the cache evicts the least recently used entry.
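A sketch of the container itself, assuming PageId is the crate's page-identifier type (the "p1"/"p2" keys in the diagrams below suggest a string-like ID) and modelling the fixed capacity as a module-level constant (spelled LRUsize in the source):

use std::collections::{BTreeSet, HashMap};

// Fixed capacity per cache instance; eviction triggers above this.
const LRU_SIZE: usize = 10; // LRUsize in the source

pub struct PageCache<T> {
    // O(1) lookup of entries by page ID.
    store: HashMap<PageId, PageCacheEntry<T>>,
    // Sorted (used_time, PageId) pairs; the first element is always the LRU victim.
    lru_queue: BTreeSet<(u64, PageId)>,
}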
Flow: Adding & Evicting Entries
PageCache::add(id, page)
│
├─ if id already present:
│ remove old (used_time, id) from lru_queue
│
├─ used_time := current_epoch_millis()
├─ store[id] := PageCacheEntry { page: Arc::new(page), used_time }
└─ lru_queue.insert((used_time, id))
│
└─ if lru_queue.len() > LRUsize:
      (_, oldest_id) := lru_queue.iter().next()
      evict(oldest_id)
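Building on the struct sketches above (same imports, plus std::time), the add flow could look roughly like this. It is a sketch rather than the verbatim source and assumes PageId is Clone + Ord + Hash:

use std::time::{SystemTime, UNIX_EPOCH};

impl<T> PageCache<T> {
    pub fn add(&mut self, id: PageId, page: T) {
        // If the page is already cached, drop its stale (used_time, id) key
        // so the entry cannot be evicted under an outdated timestamp.
        if let Some(old_time) = self.store.get(&id).map(|e| e.used_time) {
            self.lru_queue.remove(&(old_time, id.clone()));
        }

        // Record the new access time and (re)insert the entry.
        let used_time = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock before UNIX epoch")
            .as_millis() as u64;
        self.store.insert(
            id.clone(),
            PageCacheEntry { page: Arc::new(page), used_time },
        );
        self.lru_queue.insert((used_time, id));

        // Capacity check: evict the least recently used entry when over LRU_SIZE.
        if self.lru_queue.len() > LRU_SIZE {
            if let Some((_, oldest_id)) = self.lru_queue.iter().next().cloned() {
                self.evict(&oldest_id);
            }
        }
    }
}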
Eviction
evict(id)
├─ remove (used_time, id) from lru_queue
└─ remove id from store
Currently, eviction simply drops the entry; TODO comments indicate future hooks:
- UPC eviction should trigger compression & insertion into CPC.
- CPC eviction should flush to disk via direct IO.
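A corresponding sketch of evict, with the planned hooks kept as TODO comments; which hook applies depends on whether the instance is the UPC or the CPC:

impl<T> PageCache<T> {
    pub fn evict(&mut self, id: &PageId) {
        // Drop the entry, then remove its key from the LRU ordering.
        if let Some(entry) = self.store.remove(id) {
            self.lru_queue.remove(&(entry.used_time, id.clone()));
        }
        // TODO: UPC eviction should compress the page and insert it into the CPC.
        // TODO: CPC eviction should flush the blob to disk via direct IO.
    }
}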
ASCII Diagram: Dual Cache Setup
┌────────────────────────────────┐
│    Uncompressed Page Cache     │
│             (UPC)              │
│ store: { "p1": Arc<Page>,      │
│          "p3": Arc<Page> }     │
│ lru  : { (t=101,"p3"),         │
│          (t=150,"p1") }        │
└───────────────┬────────────────┘
                │  Arc<PageCacheEntryUncompressed>
                ▼
        Hot Page consumers

┌────────────────────────────────┐
│     Compressed Page Cache      │
│             (CPC)              │
│ store: { "p2": Arc<Vec<u8>> }  │
│ lru  : { (t=80,"p2") }         │
└───────────────┬────────────────┘
                │  Arc<PageCacheEntryCompressed>
                ▼
        PageHandler → Compressor → UPC
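Both caches are instantiations of the same generic container; only the payload type differs. The alias names below are illustrative, not taken from the source:

// Hot pages, ready to serve.
type UncompressedPageCache = PageCache<PageCacheEntryUncompressed>;
// Cold pages, held as compressed blobs until they are re-materialized.
type CompressedPageCache = PageCache<PageCacheEntryCompressed>;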
Access Patterns
get(id)
├─ lookup store[id]
└─ return Arc::clone(&entry.page)
has(id)
└─ store.contains_key(id)
Note that get does not update used_time; the caller must call add again to reflect fresh usage. This is intentional: PageHandler always goes through add after a page is (re)materialized, keeping the latest access time in sync.
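A sketch of the read path consistent with the description above; the exact signatures (for example, whether get returns an Option or takes the ID by value) may differ in the source:

impl<T> PageCache<T> {
    // Hands out a shared handle without touching used_time; callers re-add the
    // page to refresh its position in the LRU ordering.
    pub fn get(&self, id: &PageId) -> Option<Arc<T>> {
        self.store.get(id).map(|entry| Arc::clone(&entry.page))
    }

    pub fn has(&self, id: &PageId) -> bool {
        self.store.contains_key(id)
    }
}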
Integration Points
- PageHandler: Owns both UPC and CPC as Arc<RwLock<PageCache<_>>> and orchestrates movement of pages between them (see the sketch after this list).
- Compressor: Feeds PageCacheEntry objects directly, minimizing auxiliary allocations.
- Future WAL / Storage: Eviction hooks will eventually interact with persistent storage to flush dirty pages.
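How PageHandler might hold the two caches; only the Arc<RwLock<PageCache<_>>> ownership pattern is taken from the points above, and the field names are assumptions:

use std::sync::{Arc, RwLock};

pub struct PageHandler {
    // Shared, lock-protected handles so other components can read concurrently.
    upc: Arc<RwLock<PageCache<PageCacheEntryUncompressed>>>,
    cpc: Arc<RwLock<PageCache<PageCacheEntryCompressed>>>,
}

impl PageHandler {
    pub fn new() -> Self {
        Self {
            upc: Arc::new(RwLock::new(PageCache::new())),
            cpc: Arc::new(RwLock::new(PageCache::new())),
        }
    }
}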
Extensibility Notes
- Replace the fixed LRUsize with a configurable capacity once workload characteristics are known.
- Implement touch semantics to update used_time on read without re-inserting the entire entry (sketched below).
- Track dirty flags to avoid flushing clean pages unnecessarily during eviction.
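For the second point, a hypothetical touch method (not present in the source; time imports as in the add sketch) could refresh used_time in place and re-key only the LRU set, leaving the Arc untouched:

impl<T> PageCache<T> {
    // Hypothetical: refresh recency on read without rebuilding the entry.
    pub fn touch(&mut self, id: &PageId) {
        if let Some(entry) = self.store.get_mut(id) {
            // Remove the stale LRU key, bump the timestamp, re-insert the key.
            self.lru_queue.remove(&(entry.used_time, id.clone()));
            entry.used_time = SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .expect("system clock before UNIX epoch")
                .as_millis() as u64;
            self.lru_queue.insert((entry.used_time, id.clone()));
        }
    }
}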