Traversal, Iteration, and Synchronization

← Back to SHAMap and NodeStore: Data Persistence and State Management


Introduction

With navigation and hashing understood, we now explore how SHAMap enables efficient tree traversal and network synchronization. These operations are critical for:

  • New nodes catching up to the network

  • Peers verifying they have identical state

  • Extracting state information for queries

  • Detecting and resolving ledger divergences

Traversal Algorithms

SHAMap provides multiple traversal strategies depending on the use case:

Depth-First Traversal: visitNodes

The visitNodes method provides complete tree traversal:

void SHAMap::visitNodes(
    std::function<void(SHAMapTreeNode*)> callback)
{
    std::stack<std::shared_ptr<SHAMapTreeNode>> toVisit;
    toVisit.push(mRoot);

    while (!toVisit.empty()) {
        auto node = toVisit.top();
        toVisit.pop();

        callback(node);

        // Process inner node's children
        if (auto inner = std::dynamic_pointer_cast<SHAMapInnerNode>(node)) {
            for (int i = 15; i >= 0; --i) {  // Reverse order for stack
                if (auto child = inner->getChild(i)) {
                    toVisit.push(child);
                }
            }
        }
    }
}

Use Cases:

  • Tree validation (verify all hashes)

  • Bulk operations on all nodes

  • Custom tree analysis

Leaf-Only Traversal: visitLeaves

Iterator-Based Traversal

Parallel Traversal: walkMapParallel

For performance-critical operations:

Use Cases:

  • Missing node detection at scale

  • High-throughput synchronization

Missing Node Detection

The core synchronization primitive identifies nodes needed for complete tree reconstruction:

Algorithm: getMissingNodes

Output:

Returns vector of (NodeID, Hash) pairs representing missing nodes, prioritized for network retrieval.

Full Below Optimization

An optimization preventing redundant traversal:

When a subtree is verified complete (all descendants present), skip traversing it again until a new sync starts.

Node Addition and Canonicalization

Adding the Root Node: addRootNode

Initializes or verifies the root node:

Adding Known Nodes: addKnownNode

Adds interior or leaf nodes during synchronization:

Canonicalization

Purpose:

Ensure nodes are unique in memory (one NodeObject per hash):

Benefits:

  1. Memory Efficiency: Identical nodes stored once

  2. Thread Safety: Cache handles concurrent insertion atomically

  3. Fast Equality: Compare pointers instead of content

  4. Shared Trees: Multiple SHAMaps can share nodes

Synchronization Scenario

The Complete Flow:

Performance Metrics:

State Reconstruction Guarantee

The transaction tree, persisted in NodeStore, ensures unique history:

Problem:

Without transaction history, many sequences could produce same state:

Solution:

The transaction tree proves the exact sequence:

NodeStore's Role:

Both tree nodes are persisted:

  • Query state tree to find current values

  • Query transaction tree to find history

  • Together: complete, verifiable record

Summary

Key Algorithms:

  1. Traversal: visitNodes, visitLeaves, iterators

  2. Missing Detection: getMissingNodes with Full Below cache

  3. Node Addition: addRootNode, addKnownNode with canonicalization

  4. Synchronization: Complete flow from root to leaves

  5. Verification: Hash chain validation

Performance Properties:

Critical Insight:

Synchronization works because of hash-based tree structure:

  • Compare root hashes: O(1) comparison

  • If differ: compare children: O(16) comparisons max

  • Recursive descent: O(log N) total comparisons

  • vs. comparing all N leaves directly

This logarithmic advantage is what makes blockchain synchronization practical.

Last updated