Sorted Inserts (ORDER BY)

Tables created with an ORDER BY clause maintain sorted order across all columns. The ops_handler module provides different insertion strategies depending on whether the ORDER BY is single-column or multi-column.

Table Creation

CREATE TABLE users (
    id INT,
    email TEXT,
    age INT
) ORDER BY age, email;

This creates a table with:

  • Sort key: [age, email] (composite)
  • Sort ordinals: [2, 1] (column positions)

Implementation Overview

╔══════════════════════════════════════════════════════════════╗
║                  Sorted Insert Strategy                      ║
╚══════════════════════════════════════════════════════════════╝
                          │
                          ▼
               ┌──────────────────────┐
               │ Check sort key size  │
               └──────────┬───────────┘
                          │
         ┌────────────────┴────────────────┐
         │                                 │
         ▼                                 ▼
  ┌──────────────┐                ┌──────────────────┐
  │ Single-column│                │ Multi-column     │
  │ ORDER BY     │                │ ORDER BY         │
  └──────┬───────┘                └────────┬─────────┘
         │                                 │
         │ binary_search_insert_index      │ find_insert_position
         │ O(log n)                        │ O(n)
         │                                 │
         └────────────────┬────────────────┘
                          │
                          ▼
              ┌────────────────────────┐
              │ Insert at found index  │
              │ across ALL columns     │
              └────────────────────────┘

Single-Column ORDER BY

Algorithm

Uses Rust’s binary_search_by for O(log n) insertion point discovery.

Code path: sorted_insert_single_column()

fn sorted_insert_single_column(handler, table, col, data) {
    // 1. Load latest page for the sort column
    let page_meta = handler.locate_latest_in_table(table, col)?;
    let page_arc = handler.get_page(page_meta.clone())?;

    // 2. Binary search for insertion index
    let insert_idx = binary_search_insert_index(&page.entries, data);

    // 3. Insert entry at found position
    page.entries.insert(insert_idx, Entry::new(data));

    // 4. Update metadata entry count
    handler.update_entry_count_in_table(table, col, new_count)?;

    // 5. Write back to cache
    handler.write_back_uncompressed(&page_meta.id, updated);
}

Binary Search Implementation

fn binary_search_insert_index(entries: &[Entry], value: &str) -> usize {
    match entries.binary_search_by(|entry| compare_entry_value(entry, value)) {
        Ok(idx) | Err(idx) => idx,
    }
}

Behavior:

  • Ok(idx): Exact match found → insert at that position (before existing)
  • Err(idx): No match → insert at position where value would maintain sort order

Example: Single-Column Insert

Existing page (sorted by age):
┌───────────┐
│ Entry: 18 │  idx=0
│ Entry: 25 │  idx=1
│ Entry: 42 │  idx=2
│ Entry: 50 │  idx=3
└───────────┘

Insert "30":
  binary_search_by(|e| compare_strs(e.data, "30"))

  Comparisons:
    18 vs 30 → Less    → continue search
    25 vs 30 → Less    → continue search
    42 vs 30 → Greater → found! insert before idx=2

Result:
┌───────────┐
│ Entry: 18 │  idx=0
│ Entry: 25 │  idx=1
│ Entry: 30 │  idx=2 ← inserted
│ Entry: 42 │  idx=3
│ Entry: 50 │  idx=4
└───────────┘

Complexity: O(log n) comparisons

Multi-Column ORDER BY

Algorithm

Uses linear scan with type-aware comparison across sort key columns.

Code path: sorted_insert_row(handler, table, row)

╔═══════════════════════════════════════════════════════════════╗
║               insert_sorted_row(row)                          ║
╚═══════════════════════════════════════════════════════════════╝
                          │
                          ▼
      ┌────────────────────────────────────────┐
      │ 1. Load TableCatalog                   │
      │    Get sort_ordinals: [ord_1, ord_2]   │
      └────────────┬───────────────────────────┘
                   │
                   ▼
      ┌────────────────────────────────────────┐
      │ 2. Validate sort key columns provided  │
      │    Ensure row contains values for      │
      │    all columns in ORDER BY             │
      └────────────┬───────────────────────────┘
                   │
                   ▼
      ┌────────────────────────────────────────┐
      │ 3. Build full row vector               │
      │    row_values[col] = provided value    │
      │                   OR ""                │
      └────────────┬───────────────────────────┘
                   │
                   ▼
      ┌────────────────────────────────────────┐
      │ 4. find_insert_position()              │
      │    Linear scan: 0..row_count           │
      │      compare_row_against_existing()    │
      └────────────┬───────────────────────────┘
                   │
                   ▼
      ┌────────────────────────────────────────┐
      │ 5. For each column:                    │
      │    - Load latest page                  │
      │    - Insert entry at insert_idx        │
      │    - Update metadata entry count       │
      │    - Write back to cache               │
      └────────────────────────────────────────┘

Linear Scan Details

fn find_insert_position(
    handler: &PageHandler,
    table: &str,
    columns: &[ColumnCatalog],
    sort_ordinals: &[usize],
    new_row: &[String],
    row_count: usize,
) -> Result<usize> {
    for idx in 0..row_count {
        let cmp = compare_row_against_existing(
            handler,
            table,
            columns,
            sort_ordinals,
            new_row,
            idx as u64,
        )?;
        if cmp == Ordering::Less {
            return Ok(idx);  // Insert before this row
        }
    }
    Ok(row_count)  // Append at end
}

Complexity: O(n) where n = number of existing rows

Why not binary search?

  • Multi-column comparison requires reading multiple pages per row
  • Each comparison involves PageHandler::read_entry_at() calls
  • Binary search would amplify cache misses and I/O

Row Comparison Logic

fn compare_row_against_existing(
    handler: &PageHandler,
    table: &str,
    columns: &[ColumnCatalog],
    sort_ordinals: &[usize],
    new_row: &[String],
    existing_idx: u64,
) -> Result<Ordering> {
    // Compare columns in sort key order
    for &ordinal in sort_ordinals {
        let column_name = &columns[ordinal].name;

        // Read existing row's value for this column
        let existing_entry = handler.read_entry_at(
            table,
            column_name,
            existing_idx
        )?;

        // Compare new vs existing
        let cmp = compare_strs(&new_row[ordinal], existing_entry.get_data());

        // Early return on first difference
        if cmp != Ordering::Equal {
            return Ok(cmp);
        }
    }

    // All sort columns equal
    Ok(Ordering::Equal)
}

Example: Multi-Column Insert

Table: users ORDER BY age, email
Existing rows:
┌─────┬─────────────────┐
│ age │ email           │
├─────┼─────────────────┤
│ 18  │ [email protected]  │  row 0
│ 25  │ [email protected]    │  row 1
│ 25  │ [email protected]│  row 2
│ 42  │ [email protected]   │  row 3
└─────┴─────────────────┘

Insert: {age: "25", email: "[email protected]"}

Linear scan comparisons:
  Row 0: compare("25", "18") → Greater → continue
  Row 1: compare("25", "25") → Equal
         compare("[email protected]", "[email protected]") → Greater → continue
  Row 2: compare("25", "25") → Equal
         compare("[email protected]", "[email protected]") → Less → FOUND!
         Insert at idx=2

Result:
┌─────┬─────────────────┐
│ age │ email           │
├─────┼─────────────────┤
│ 18  │ [email protected]  │  row 0
│ 25  │ [email protected]    │  row 1
│ 25  │ [email protected]  │  row 2 ← inserted
│ 25  │ [email protected]│  row 3 (shifted)
│ 42  │ [email protected]   │  row 4 (shifted)
└─────┴─────────────────┘

Type-Aware Comparison

All comparisons use compare_strs() which attempts numeric comparison first:

fn compare_strs(left: &str, right: &str) -> Ordering {
    match (left.parse::<f64>(), right.parse::<f64>()) {
        (Ok(l), Ok(r)) => l.partial_cmp(&r).unwrap_or(Ordering::Equal),
        _ => left.cmp(right),  // Lexicographic fallback
    }
}

Comparison Rules

╔═══════════════════════════════════════════════════════════╗
║                  compare_strs(left, right)                ║
╚═══════════════════════════════════════════════════════════╝
                          │
                          ▼
              ┌───────────────────────┐
              │ Try parse both as f64 │
              └───────────┬───────────┘
                          │
         ┌────────────────┴────────────────┐
         │                                 │
         ▼ Both parse OK                   ▼ At least one fails
  ┌──────────────┐                  ┌──────────────────┐
  │ Numeric      │                  │ Lexicographic    │
  │ comparison   │                  │ string compare   │
  │ 2 < 10 < 100 │                  │ "2" < "10" fails │
  └──────────────┘                  │ "10" < "2" (!)   │
                                    └──────────────────┘

Examples

Left Right Comparison Result Reason
"2" "10" Numeric Less 2.0 < 10.0
"3.14" "2.718" Numeric Greater 3.14 > 2.718
"alice" "bob" Lexicographic Less ‘a’ < ‘b’
"10" "abc" Lexicographic Less ‘1’ < ‘a’ (parse fails for “abc”)
"true" "false" Lexicographic Greater ‘t’ > ‘f’

Important: Mixing numeric and non-numeric values in the same ORDER BY column leads to inconsistent sort order (numeric values treated as strings).

Missing/Empty Values

Non-provided columns default to empty string "":

let mut row_values: Vec<Option<String>> = vec![None; columns.len()];

// ... populate from provided values ...

let final_row: Vec<String> = row_values
    .into_iter()
    .map(|opt| opt.unwrap_or_else(|| "".to_string()))
    .collect();

Behavior

Table: users(id, email, age) ORDER BY age, email

Insert: {email: "[email protected]"}  // age not provided

Actual row inserted:
  id:    ""  (empty)
  email: "[email protected]"
  age:   ""  (empty, used for sorting!)

Sort position: "" compares lexicographically
  "" < "18" → inserted at beginning

Warning: Empty strings sort before all non-empty values lexicographically, but this behavior may be surprising when mixing with numeric columns.

Metadata Synchronization

After inserting at index idx, ALL columns must be updated:

for (ordinal, column) in columns.iter().enumerate() {
    let descriptor = handler.locate_latest_in_table(table, &column.name)?;
    let page_arc = handler.get_page(descriptor.clone())?;

    let mut updated = (*page_arc).clone();
    let insert_pos = insert_idx.min(updated.page.entries.len());

    // Insert entry at same position across all columns
    updated.page.entries.insert(insert_pos, Entry::new(&final_row[ordinal]));

    handler.write_back_uncompressed(&descriptor.id, updated);

    // Update metadata entry count
    handler.update_entry_count_in_table(
        table,
        &column.name,
        new_count as u64
    )?;
}

Critical invariant: All columns must have the same entry count after insertion, or row alignment breaks.

UPDATE Operations with ORDER BY

When updating a row in an ORDER BY table:

UPDATE users SET age = 30 WHERE id = 42;

Decision Tree

╔═══════════════════════════════════════════════════════════╗
║         Does UPDATE modify ORDER BY columns?              ║
╚═══════════════════════════════════════════════════════════╝
                          │
         ┌────────────────┴────────────────┐
         │                                 │
         ▼ NO                              ▼ YES
  ┌──────────────┐                  ┌──────────────┐
  │ overwrite_row│                  │ delete_row + │
  │ (in-place)   │                  │insert_sorted │
  └──────────────┘                  │  _row        │
                                    └──────────────┘

Rationale: If ORDER BY columns change, the row’s position in the sort order changes. Rather than implementing in-place reordering, we delete and reinsert.

Implementation (from sql/executor.rs)

// Check if ORDER BY columns changed
let order_by_changed = catalog
    .sort_key()
    .iter()
    .any(|col| assignments.contains_key(&col.name));

if order_by_changed {
    // Delete old row
    delete_row(handler, table, row_idx)?;

    // Build new row with updates
    let mut new_values = old_values.clone();
    for (col_name, new_val) in assignments {
        new_values[col.ordinal] = new_val;
    }

    // Reinsert in sorted position
    insert_sorted_row(handler, table, &row_tuples)?;
} else {
    // In-place overwrite
    overwrite_row(handler, table, row_idx, &new_values)?;
}

DELETE Operations

Deletion removes entries at the same index from ALL columns:

pub fn delete_row(
    handler: &PageHandler,
    table: &str,
    row_idx: u64,
) -> Result<()> {
    let catalog = handler.table_catalog(table)?;

    // Iterate columns in REVERSE order (consistency during partial failures)
    for column in catalog.columns().iter().rev() {
        let descriptor = handler.locate_latest_in_table(table, &column.name)?;
        let page_arc = handler.get_page(descriptor.clone())?;

        let mut updated = (*page_arc).clone();
        updated.page.entries.remove(row_idx as usize);

        let new_len = updated.page.entries.len() as u64;
        handler.write_back_uncompressed(&descriptor.id, updated);

        handler.update_entry_count_in_table(table, &column.name, new_len)?;
    }

    Ok(())
}

Why reverse order? If deletion fails partway through, earlier columns retain the old row while later columns are already modified. Reverse order minimizes partial state visibility.

Performance Characteristics

Operation Single-Column Multi-Column Notes
Insert O(log n) O(n) Single-column uses binary search
Comparison O(1) O(k) k = number of columns in ORDER BY
Page reads 1 n × k Multi-column reads k columns for each of n rows
Cache pressure Low High Multi-column thrashes cache with random access

Bottlenecks

Multi-column sorted inserts are slow:

  • Linear scan: O(n) row comparisons
  • Each comparison: k column reads via PageHandler::read_entry_at
  • Each read: potential cache miss → CPC/disk fetch
  • Worst case: n × k page fetches for single insert

Example:

  • Table with 10,000 rows
  • ORDER BY on 3 columns
  • Insert requires up to 10,000 × 3 = 30,000 column reads

Module Location

  • Source: src/ops_handler.rs
  • SQL Integration: src/sql/executor.rs
  • Tests: tests/end_to_end_tests.rs