Component Deep Dive: src/ops_handler.rs

The ops_handler module exposes the high-level APIs for mutating and reading column data. It acts as the boundary between external operations and the internal caching/metadata machinery.

Source Summary

src/ops_handler.rs
 9  pub fn upsert_data_into_column(meta_store, handler, col, data) -> Result<bool>
26  pub fn update_column_entry(meta_store, handler, col, data, row) -> Result<bool>
45  pub fn range_scan_column_entry(meta_store, handler, col, l_row, r_row, commit_time_upper_bound) -> Vec<Entry>

Architectural Position

      External Client / API Layer
                    │
                    ▼
           ┌──────────────────────┐
           │    ops_handler       │
           │  (upsert/update/scan)│
           └──────────┬───────────┘
                      │
        ┌─────────────┴─────────────┐
        ▼                           ▼
TableMetaStore (read-lock)      PageHandler (cache + IO)

Each function coordinates metadata lookups with page retrieval/manipulation, keeping locks short and relying on PageHandler to manage caches.

Upsert Flow (upsert_data_into_column)

Input: column name, payload string

1) Acquire read-lock on `TableMetaStore`.
2) Call `get_latest_page_meta(col)` → &Arc<PageMetadata>; clone it to obtain an owned value that outlives the lock.
3) Release lock.
4) Fetch page via `PageHandler::get_page(page_meta)` → Arc<PageCacheEntryUncompressed>.
5) Clone the Arc target (`(*page_arc).clone()`) to get owned PageCacheEntryUncompressed.
6) Append new `Entry::new(data)` to `page.entries`.
7) Acquire the write-lock on the UPC (the uncompressed page cache).
8) Reinsert updated page with `PageCache::add(page_meta.id, updated)`.
9) Return `Ok(true)`.
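
The steps above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: every type here is a simplified stand-in, the `latest`, `pages`, and `upc` fields are assumed names, and `PageHandler::get_page` is reduced to a plain cache lookup with no compressed-cache or disk fallback. Unlike the current implementation, which reportedly unwraps, the sketch returns an error when the column or page is missing.

```rust
use std::collections::HashMap;
use std::error::Error;
use std::sync::{Arc, RwLock};

// Simplified stand-ins for the real types (assumed shapes).
#[derive(Clone, Debug, PartialEq)]
pub struct Entry { pub data: String }
impl Entry {
    pub fn new(data: &str) -> Self { Entry { data: data.to_string() } }
}

#[derive(Clone, Debug)]
pub struct PageMetadata { pub id: String }

#[derive(Clone, Debug, Default)]
pub struct PageCacheEntryUncompressed { pub entries: Vec<Entry> }

#[derive(Default)]
pub struct TableMetaStore { pub latest: HashMap<String, Arc<PageMetadata>> }
impl TableMetaStore {
    pub fn get_latest_page_meta(&self, col: &str) -> Option<&Arc<PageMetadata>> {
        self.latest.get(col)
    }
}

#[derive(Default)]
pub struct PageCache { pub pages: HashMap<String, Arc<PageCacheEntryUncompressed>> }
impl PageCache {
    pub fn add(&mut self, id: &str, page: PageCacheEntryUncompressed) {
        self.pages.insert(id.to_string(), Arc::new(page));
    }
}

pub struct PageHandler { pub upc: RwLock<PageCache> }
impl PageHandler {
    // Assumption: only the uncompressed cache is consulted in this sketch.
    pub fn get_page(&self, meta: &PageMetadata) -> Option<Arc<PageCacheEntryUncompressed>> {
        self.upc.read().unwrap().pages.get(&meta.id).cloned()
    }
}

pub fn upsert_data_into_column(
    meta_store: &RwLock<TableMetaStore>,
    handler: &PageHandler,
    col: &str,
    data: &str,
) -> Result<bool, Box<dyn Error>> {
    // Steps 1-3: hold the read-lock only long enough to clone the metadata.
    let page_meta: PageMetadata = {
        let guard = meta_store.read().map_err(|_| "meta store lock poisoned")?;
        let arc = guard.get_latest_page_meta(col).ok_or("unknown column")?;
        (**arc).clone()
    };
    // Steps 4-5: fetch the shared page and clone it into an owned copy.
    let page_arc = handler.get_page(&page_meta).ok_or("page not found")?;
    let mut page = (*page_arc).clone();
    // Step 6: append the new entry to the owned copy.
    page.entries.push(Entry::new(data));
    // Steps 7-8: take the write-lock and reinsert the updated page.
    handler
        .upc
        .write()
        .map_err(|_| "cache lock poisoned")?
        .add(&page_meta.id, page);
    // Step 9.
    Ok(true)
}
```

Because the page is cloned before mutation, readers holding the old Arc keep seeing the previous version until they fetch the page again.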

ASCII Sequence

Client → ops_handler::upsert_data_into_column
   │
   │ read-lock
   ▼
TableMetaStore ──► latest PageMetadata (id="pX")
   │
   │ PageHandler::get_page("pX")
   ▼
PageHandler ──► UPC/CPC/Disk (as needed) ──► Arc<PageCacheEntryUncompressed>
   │
   │ clone, add Entry::new(data)
   ▼
UPC write-lock ──► add("pX", updated_page)

TODO: Update (start_idx,end_idx) ranges and metadata when a page grows/splits. Currently, this function only mutates in-memory pages without reflecting changes in the metadata catalog.
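
One possible shape for the missing range bookkeeping is sketched below. It is purely hypothetical: `PageMeta`, `Page`, the fixed `capacity` parameter, and the `p<N>` id scheme are all assumptions, ranges are treated as half-open [start_idx, end_idx), and a full page rolls over to a fresh page rather than being split in the middle.

```rust
// Hypothetical metadata record; the real PageMetadata layout may differ.
#[derive(Clone, Debug, PartialEq)]
pub struct PageMeta {
    pub id: String,
    pub start_idx: usize, // first global row held by the page (inclusive)
    pub end_idx: usize,   // one past the last global row (exclusive)
}

pub struct Page {
    pub meta: PageMeta,
    pub entries: Vec<String>,
}

/// Appends `value` to the last page and keeps its range in sync.
/// When the last page is full (or no page exists yet), a new page is
/// opened that starts where the previous one ended.
pub fn append_tracking_ranges(pages: &mut Vec<Page>, capacity: usize, value: &str) {
    let needs_new_page = pages.last().map_or(true, |p| p.entries.len() >= capacity);
    if needs_new_page {
        let start = pages.last().map_or(0, |p| p.meta.end_idx);
        let id = format!("p{}", pages.len());
        pages.push(Page {
            meta: PageMeta { id, start_idx: start, end_idx: start },
            entries: Vec::new(),
        });
    }
    let page = pages.last_mut().expect("a page was ensured above");
    page.entries.push(value.to_string());
    // The metadata range grows together with the in-memory entries.
    page.meta.end_idx += 1;
}
```

With a capacity of 2, three appends would leave one page covering rows [0, 2) and a second covering [2, 3).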

Update Flow (update_column_entry)

Input: column name, payload string, row index

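Steps 1-5 are the same as in the upsert flow: look up the latest PageMetadata, fetch the page, and clone it.

After cloning the page:
   if row >= entries.len() -> return Ok(false)
   else entries[row] = Entry::new(data)

Insert the updated page back into the UPC and return Ok(true).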

The function relies entirely on PageHandler to fetch the latest committed version of the page, and uses the row index to overwrite the existing entry in place.
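
The bounds check and overwrite can be isolated in a small helper, sketched here under assumptions: `overwrite_row` is a hypothetical name, the types are simplified stand-ins, and `None` plays the role of the `Ok(false)` branch.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Entry { pub data: String }
impl Entry {
    pub fn new(data: &str) -> Self { Entry { data: data.to_string() } }
}

#[derive(Clone, Debug, Default)]
pub struct PageCacheEntryUncompressed { pub entries: Vec<Entry> }

/// Returns an updated copy of `page` with `entries[row]` replaced,
/// or `None` when `row` is out of bounds. The input page is left
/// untouched, mirroring the clone-then-reinsert pattern.
pub fn overwrite_row(
    page: &PageCacheEntryUncompressed,
    row: usize,
    data: &str,
) -> Option<PageCacheEntryUncompressed> {
    if row >= page.entries.len() {
        return None;
    }
    let mut updated = page.clone();
    updated.entries[row] = Entry::new(data);
    Some(updated)
}
```

Note that `row` here indexes into the single latest page, which matches the described behavior but not a table-wide row number.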

Range Scan Flow (range_scan_column_entry)

Input: column name, [l_row, r_row), commit_time_upper_bound

1) Read-lock TableMetaStore.
2) Call `get_ranged_pages_meta` to fetch MVCC-aware page metadata (Vec<Arc<PageMetadata>>).
3) Release lock.
4) Convert metadata arcs to owned `PageMetadata` clones.
5) Call `PageHandler::get_pages` with the vector.
6) Iterate through returned pages, concatenating their `Page::entries`.
7) Return Vec<Entry>.
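
Steps 4 and 6 are plain data reshaping and might look like the sketch below. The types are simplified stand-ins, and `to_owned_metas` and `concat_entries` are illustrative helper names, not functions known to exist in the module.

```rust
use std::sync::Arc;

#[derive(Clone, Debug, PartialEq)]
pub struct Entry { pub data: String }

#[derive(Clone, Debug, PartialEq)]
pub struct PageMetadata { pub id: String }

pub struct PageCacheEntryUncompressed { pub entries: Vec<Entry> }

/// Step 4: turn shared metadata handles into owned clones so the
/// metadata lock does not need to be held during page fetches.
pub fn to_owned_metas(metas: &[Arc<PageMetadata>]) -> Vec<PageMetadata> {
    metas.iter().map(|m| (**m).clone()).collect()
}

/// Step 6: concatenate the entries of all pages in the order returned.
pub fn concat_entries(pages: &[Arc<PageCacheEntryUncompressed>]) -> Vec<Entry> {
    pages
        .iter()
        .flat_map(|page| page.entries.iter().cloned())
        .collect()
}
```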

ASCII Diagram

┌─────────────────────────────┐
│ TableMetaStore::get_ranged… │
└───────────────┬─────────────┘
                ▼
   [Arc<PageMetadata{"p5"}>, Arc<PageMetadata{"p7"}>, …]
                │
                ▼
PageHandler::get_pages([PageMetadata{"p5"}, PageMetadata{"p7"}, …])
                │
                ▼
[Arc<PageCacheEntryUncompressed>, …]
                │
                ▼
Vec<Entry> (concatenate pages sequentially)

Note: The function currently returns every entry of each matching page, so the result may include rows outside [l_row, r_row). Precise row slicing within page boundaries is left for future implementation.
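
A possible approach to the missing slicing is sketched below. It assumes each page knows the global index of its first row (`start_idx`, a hypothetical field on a hypothetical `RangedPage` type) and clamps the requested half-open range to each page's own range.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Entry { pub data: String }

// Hypothetical pairing of a page's entries with its first global row index.
pub struct RangedPage {
    pub start_idx: usize,
    pub entries: Vec<Entry>,
}

/// Returns only the entries whose global row index lies in [l_row, r_row).
pub fn slice_rows(pages: &[RangedPage], l_row: usize, r_row: usize) -> Vec<Entry> {
    let mut out = Vec::new();
    for page in pages {
        let end_idx = page.start_idx + page.entries.len();
        // Intersect the requested range with this page's range.
        let lo = l_row.max(page.start_idx);
        let hi = r_row.min(end_idx);
        if lo < hi {
            // Convert global row indices to offsets within the page.
            out.extend_from_slice(&page.entries[lo - page.start_idx..hi - page.start_idx]);
        }
    }
    out
}
```

Pages that do not overlap the requested range contribute nothing, so an empty or inverted range yields an empty result instead of a panic.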

Error Handling

  • Upserts and updates declare a return type of Result<bool, Box<dyn std::error::Error>> but only ever return Ok(true) or Ok(false); failures surface instead as panics from internal unwraps when metadata or caches are not populated. Tightening error propagation is a future concern.
  • Range scans unwrap the metadata response; if the requested column is missing from the metadata store, the function will panic. Production code should gracefully return empty results.
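
As a rough illustration of the direction described above, the sketch below replaces an unwrap with a typed error and shows a scan-side wrapper that degrades to an empty result. `OpsError`, `page_ids_for`, `page_ids_or_empty`, and the column-to-page-ids map are all hypothetical stand-ins for the real metadata lookup.

```rust
use std::collections::HashMap;
use std::fmt;

// Hypothetical error type for the ops layer.
#[derive(Debug, PartialEq)]
pub enum OpsError {
    UnknownColumn(String),
}

impl fmt::Display for OpsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            OpsError::UnknownColumn(col) => {
                write!(f, "no page metadata for column `{}`", col)
            }
        }
    }
}

impl std::error::Error for OpsError {}

/// Fallible lookup: returns an error instead of panicking on a missing column.
pub fn page_ids_for<'a>(
    catalog: &'a HashMap<String, Vec<String>>,
    col: &str,
) -> Result<&'a [String], OpsError> {
    catalog
        .get(col)
        .map(|ids| ids.as_slice())
        .ok_or_else(|| OpsError::UnknownColumn(col.to_string()))
}

/// Scan-side wrapper: a missing column yields an empty result.
pub fn page_ids_or_empty(catalog: &HashMap<String, Vec<String>>, col: &str) -> Vec<String> {
    page_ids_for(catalog, col)
        .map(|ids| ids.to_vec())
        .unwrap_or_default()
}
```

Since `OpsError` implements `std::error::Error`, it converts into the existing `Box<dyn std::error::Error>` return type through the `?` operator without changing the public signatures.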

Integration Notes

  • The module expects the metadata store to be primed with at least one page per column. Bootstrapping logic (creating the first page) is not yet implemented.
  • All cache writes occur through the UPC; compression/flush responsibilities are deferred to cache eviction behavior.

Next Steps

  • Update metadata ranges (start/end indices) during upsert/update to reflect new row counts.
  • Handle page splits (when a page exceeds capacity) and update metadata store accordingly.
  • Add bounds checking for range scans to slice inside pages rather than returning full pages.
  • Replace unchecked unwraps with explicit error handling to make the API robust for client integration.