Component Deep Dive: src/page_cache.rs

The page cache module implements a generic LRU cache that underpins both the uncompressed page cache (UPC) and the compressed page cache (CPC). It manages hot and cold copies of pages, tracks usage timestamps, and evicts entries once capacity is exceeded.

Source Overview

src/page_cache.rs
 31  #[derive(Clone)] pub struct PageCacheEntry<T> { page: Arc<T>, used_time: u64 }
 37  #[derive(Clone)] pub struct PageCacheEntryUncompressed { pub page: Page }
 41  #[derive(Clone)] pub struct PageCacheEntryCompressed { pub page: Vec<u8> }
 63  pub struct PageCache<T> { store, lru_queue }
 69  impl<T> PageCache<T> { pub fn new() -> Self, add, has, get, evict }

Core Structures

PageCacheEntry

┌─────────────────────────────┐   ┌──────────────────────────────┐
│ PageCacheEntry<T>           │   │ PageCacheEntryUncompressed   │
│  ├─ page     : Arc<T>       │   │  └─ page : Page              │
│  └─ used_time: u64          │   └──────────────────────────────┘
└─────────────────────────────┘
┌──────────────────────────────┐
│ PageCacheEntryCompressed     │
│  └─ page : Vec<u8> (blob)    │
└──────────────────────────────┘
  • Arc<T> allows multiple consumers to share a cached page without copying.
  • used_time stores the access timestamp (epoch millis) driving the LRU ordering.
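As a rough Rust sketch of these types (field names and derives follow the source overview; the Page type itself is only a placeholder here):

use std::sync::Arc;

// Placeholder for the project's real page type.
#[derive(Clone)]
pub struct Page;

// Generic entry: the cached payload plus the timestamp that drives LRU order.
#[derive(Clone)]
pub struct PageCacheEntry<T> {
    page: Arc<T>,   // shared handle; cloning the Arc does not copy the page
    used_time: u64, // epoch millis of the most recent add()
}

// Concrete payloads for the two caches.
#[derive(Clone)]
pub struct PageCacheEntryUncompressed {
    pub page: Page,    // fully materialized page (UPC)
}

#[derive(Clone)]
pub struct PageCacheEntryCompressed {
    pub page: Vec<u8>, // compressed blob (CPC)
}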

PageCache

PageCache<T>
├─ store    : HashMap<PageId, PageCacheEntry<T>>
└─ lru_queue: BTreeSet<(used_time, PageId)>
  • store provides O(1) lookups by page ID.
  • lru_queue maintains a sorted set of (used_time, PageId) pairs, enabling quick retrieval of the oldest entry (the first element).
  • The capacity constant LRUsize is fixed at 10; once the queue grows beyond it, the cache evicts the least recently used entry.
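A minimal sketch of the container, reusing the entry types above and assuming PageId is a string-like key (the constant is spelled LRU_SIZE here but corresponds to the source's LRUsize):

use std::collections::{BTreeSet, HashMap};

// Assumed key type; the real PageId may be numeric.
pub type PageId = String;

// Corresponds to LRUsize in the source.
const LRU_SIZE: usize = 10;

pub struct PageCache<T> {
    // O(1) lookup of the cached entry by page id.
    store: HashMap<PageId, PageCacheEntry<T>>,
    // Sorted (used_time, id) pairs; the first element is always the oldest.
    lru_queue: BTreeSet<(u64, PageId)>,
}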

Flow: Adding & Evicting Entries

PageCache::add(id, page)
   │
   ├─ if id already present:
   │      remove old (used_time, id) from lru_queue
   │
   ├─ used_time := current_epoch_millis()
   ├─ store[id] := PageCacheEntry { page: Arc::new(page), used_time }
   └─ lru_queue.insert((used_time, id))
           │
           └─ if lru_queue.len() > LRUsize:
                 (oldest_time, oldest_id) := lru_queue.iter().next()
                 evict(oldest_id)
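A sketch of how add could implement that flow, continuing the definitions above (not the literal source; the clock call and error handling are assumptions):

use std::sync::Arc;
use std::time::{SystemTime, UNIX_EPOCH};

impl<T> PageCache<T> {
    pub fn add(&mut self, id: PageId, page: T) {
        // If the page is already cached, drop its stale LRU key so the queue
        // never holds two entries for the same id.
        if let Some(old) = self.store.get(&id) {
            self.lru_queue.remove(&(old.used_time, id.clone()));
        }

        // Timestamp the access and (re)insert the entry.
        let used_time = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("clock before UNIX epoch")
            .as_millis() as u64;
        self.store.insert(
            id.clone(),
            PageCacheEntry { page: Arc::new(page), used_time },
        );
        self.lru_queue.insert((used_time, id));

        // Capacity check: evict the least recently used entry if needed.
        if self.lru_queue.len() > LRU_SIZE {
            let oldest = self.lru_queue.iter().next().cloned();
            if let Some((_, oldest_id)) = oldest {
                self.evict(&oldest_id);
            }
        }
    }
}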

Eviction

evict(id)
   ├─ remove (used_time, id) from lru_queue
   └─ remove id from store

Currently, eviction simply drops the entry; TODO comments indicate future hooks:

  • UPC eviction should trigger compression & insertion into CPC.
  • CPC eviction should flush to disk via direct IO.
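A sketch of what evict might look like today, continuing the definitions above and keeping those hooks as TODO markers (not the literal source):

impl<T> PageCache<T> {
    pub fn evict(&mut self, id: &PageId) {
        // Remove the stored entry and its (used_time, id) key from the queue.
        if let Some(entry) = self.store.remove(id) {
            self.lru_queue.remove(&(entry.used_time, id.clone()));
        }
        // TODO: UPC eviction should compress the page and insert it into the CPC.
        // TODO: CPC eviction should flush the blob to disk via direct IO.
    }
}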

ASCII Diagram: Dual Cache Setup

┌────────────────────────────┐
│ Uncompressed Page Cache    │
│ (UPC)                      │
│  store: { "p1": Arc<Page>, │
│           "p3": Arc<Page> }│
│  lru : { (t=101, "p3"),    │
│          (t=150, "p1") }   │
└────────────┬───────────────┘
             │ Arc<PageCacheEntryUncompressed>
             ▼
       Hot Page consumers

┌────────────────────────────────┐
│ Compressed Page Cache          │
│ (CPC)                          │
│  store: { "p2": Arc<Vec<u8>> } │
│  lru : { (t=80, "p2") }        │
└────────────┬───────────────────┘
             │ Arc<PageCacheEntryCompressed>
             ▼
          PageHandler → Compressor → UPC
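Purely as an illustration, the state shown in the diagram could be reproduced with the sketched API like this (keys and contents are the example values above, not real workload data):

fn main() {
    // Two independent caches built from the same generic type.
    let mut upc: PageCache<PageCacheEntryUncompressed> = PageCache::new();
    let mut cpc: PageCache<PageCacheEntryCompressed> = PageCache::new();

    // Hot, materialized pages land in the UPC...
    upc.add("p3".to_string(), PageCacheEntryUncompressed { page: Page });
    upc.add("p1".to_string(), PageCacheEntryUncompressed { page: Page });

    // ...while cold, compressed blobs sit in the CPC.
    cpc.add("p2".to_string(), PageCacheEntryCompressed { page: vec![0u8; 4096] });

    assert!(upc.has(&"p1".to_string()));
    assert!(cpc.has(&"p2".to_string()));
}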

Access Patterns

get(id)
   ├─ lookup store[id]
   └─ return Arc::clone(&entry.page)

has(id)
   └─ store.contains_key(id)

Note that get does not update used_time; the caller must call add again to reflect fresh usage. This is intentional: PageHandler always goes through add after a page is (re)materialized, keeping the latest access time in sync.
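A sketch of these two accessors, continuing the earlier definitions and returning an Option for the miss case (the exact signatures in the source may differ):

impl<T> PageCache<T> {
    // Read-only lookup; deliberately does not touch used_time.
    pub fn get(&self, id: &PageId) -> Option<Arc<T>> {
        self.store.get(id).map(|entry| Arc::clone(&entry.page))
    }

    pub fn has(&self, id: &PageId) -> bool {
        self.store.contains_key(id)
    }
}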

Integration Points

  • PageHandler: Owns both UPC and CPC as Arc<RwLock<PageCache<_>>> and orchestrates movement of pages between them (a rough sketch follows this list).
  • Compressor: Feeds PageCacheEntry objects directly into the caches, minimizing auxiliary allocations.
  • Future WAL / Storage: Eviction hooks will eventually interact with persistent storage to flush dirty pages.
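A rough sketch of that ownership; only the Arc<RwLock<PageCache<_>>> shape comes from this document, while the field names and the promote helper are illustrative:

use std::sync::{Arc, RwLock};

pub struct PageHandler {
    upc: Arc<RwLock<PageCache<PageCacheEntryUncompressed>>>,
    cpc: Arc<RwLock<PageCache<PageCacheEntryCompressed>>>,
    // ... WAL / storage handles would live here eventually.
}

impl PageHandler {
    // Hypothetical move of a page from the CPC back into the UPC; in the real
    // module the decompression step goes through the Compressor component.
    fn promote(&self, id: &PageId, decompress: impl Fn(&[u8]) -> Page) {
        let blob = match self.cpc.read().unwrap().get(id) {
            Some(entry) => entry,
            None => return,
        };
        let page = decompress(blob.page.as_slice());
        self.upc
            .write()
            .unwrap()
            .add(id.clone(), PageCacheEntryUncompressed { page });
    }
}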

Extensibility Notes

  • Replace fixed LRUsize with a configurable capacity once workload characteristics are known.
  • Implement touch semantics to update used_time on read without re-inserting the entire entry (one possible shape is sketched after this list).
  • Track dirty flags to avoid flushing clean pages unnecessarily during eviction.
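For the touch idea, one possible shape (not present in the source, reusing the imports from the add sketch) would be:

impl<T> PageCache<T> {
    // Hypothetical: refresh used_time on read without rebuilding the entry.
    pub fn touch(&mut self, id: &PageId) {
        if let Some(entry) = self.store.get_mut(id) {
            let now = SystemTime::now()
                .duration_since(UNIX_EPOCH)
                .expect("clock before UNIX epoch")
                .as_millis() as u64;
            // Re-key the LRU entry under the new timestamp.
            self.lru_queue.remove(&(entry.used_time, id.clone()));
            entry.used_time = now;
            self.lru_queue.insert((now, id.clone()));
        }
    }
}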