SHAMap Architecture and Node Hierarchy
← Back to SHAMap and NodeStore: Data Persistence and State Management
Introduction
Now that you understand the mathematical foundations of Merkle-Patricia tries, let's explore how XRPL actually implements them in the SHAMap data structure.
The SHAMap is responsible for:
Maintaining account and transaction state in a cryptographically-verified Merkle tree
Computing root hashes that represent entire ledger state
Enabling efficient navigation through key-based lookups
Supporting snapshots and immutability for historical ledgers
Providing proof generation for trustless state verification
This chapter covers the architectural overview and node types. Later chapters dive into specific algorithms for navigation, hashing, and synchronization.
Core Design Principles
The SHAMap architecture achieves three critical properties:
1. Cryptographic Integrity
Every change to data propagates through hashes up to the root:
Modify Account0 data:
  Account0 hash changes
  Parent hash changes (incorporates new Account0 hash)
  Grandparent hash changes
  ... up to root
New root hash is deterministically different from old root hashThis ensures that no data can be modified undetected. Changing even one bit in an account's balance changes the root hash.
2. Efficient Navigation
The Patricia trie structure uses account identifiers as navigation guides:
Account hash: 0x3A7F2E1B4C9D...
Tree navigation:
  Level 0: Extract digit 3 → go to child 3
  Level 1: Extract digit A → go to child A
  Level 2: Extract digit 7 → go to child 7
  ... follow 4-bit chunks down the treeNo binary search needed. The path is directly encoded in the key.
3. Optimized Synchronization
Hash-based comparison eliminates unnecessary data transfer:
Peer A claims: "Root is 0xABCD"
Peer B claims: "Root is 0xXYZW"
If 0xABCD == 0xXYZW:
  Both have identical ledger state
  No synchronization needed
If different:
  Compare children's hashes
  Identify exactly which subtrees differ
  Request only those subtreesThis allows new nodes to synchronize full ledgers from peers in minutes.
The SHAMap Instance
A SHAMap instance represents a complete tree of ledger state:
Core Data:
class SHAMap {
    // The root node (always an inner node)
    std::shared_ptr<SHAMapInnerNode> mRoot;
    // Tree state (Immutable, Mutable, or Synching)
    State mState;
    // For mutable trees: unique identifier
    std::uint32_t mCowID;  // Copy-on-write identifier
    // For navigating to nodes
    std::shared_ptr<Family> mFamily;
};Key Properties:
Root Node: Always an inner node, never a leaf (even if only one account, still has inner structure)
Depth: Exactly 64 levels (256-bit keys ÷ 4 bits per level)
Navigation Determinism: Any key uniquely determines its path from root to leaf
Node Hierarchy
The SHAMap consists of three conceptual layers:
Layer 1: Root
              [Inner Node - Root]
              (hashes all state)Always present
Always an inner node
Contains the root hash representing entire tree
Can have up to 16 children (one for each hex digit 0-F)
Layer 2: Internal Structure
              [Inner Node - Root]
             /    |    \    ...    \
       [Inner]  [Inner] [Inner] ... [Inner]
       / | \    / | \
   [Inner][Inner][Inner]Inner nodes serve as branch points
Each can have 0-16 children
Store only hashes of children, not actual data
No data items in inner nodes
Layer 3: Leaf Nodes
           [Inner Node]
           / | \ ... \
       [Leaf][Leaf][Leaf]
        |     |     |
    Account Account Account
     State    State   StateTerminal nodes containing actual data items
Types: Account state, Transaction, Transaction+Metadata
All leaves in a SHAMap tree are homogeneous (same type)
Inner Nodes
Inner nodes form the branching structure:
Structure:
class SHAMapInnerNode {
    // Up to 16 child slots (indexed 0-F)
    std::array<Branch, 16> mBranches;
    // Cryptographic hash of this node
    uint256 mHash;
    // Bitset: which children exist
    std::uint16_t mChildBits;
    // For copy-on-write: which SHAMap owns this node
    std::uint32_t mCowID;
    // Synchronization optimization: generation marker
    std::uint32_t mFullBelow;
};
// Each branch slot contains:
struct Branch {
    std::shared_ptr<SHAMapTreeNode> mNode;  // nullptr if empty
    uint256 mHash;  // Hash of child (or empty)
};Key Characteristics:
Do not store data items directly - Only hashing information
Maintain cryptographic commitments through child hashes
Variable occupancy - Not all 16 children present
Support both serialization formats - Compressed and full
Serialization Formats:
Inner nodes support two wire formats:
Compressed Format (used when most slots are empty):
Header: "Inner (compressed)"
Bitmap: Which branches exist
For each existing branch:
  Branch index
  Child hashSaves space by omitting empty branches.
Full Format (used when most slots are occupied):
Header: "Inner (full)"
For each of 16 branches:
  Child hash (or empty marker)Simpler structure despite larger size.
Format Selection Algorithm:
XRPL automatically chooses format based on branch count:
Branch count:
  0-8:   Use compressed (saves ~40 bytes)
  9+:    Use full (simpler, fewer bytes overall)Leaf Nodes
Leaf nodes store the actual blockchain data:
Base Properties:
class SHAMapLeafNode {
    // The data contained in this leaf
    std::shared_ptr<SHAMapItem> mItem;
    // Cryptographic hash
    uint256 mHash;
    // For copy-on-write
    std::uint32_t mCowID;
};SHAMapItem Structure:
class SHAMapItem {
    // 256-bit unique identifier (determines tree position)
    uint256 mTag;
    // Variable-length data (transaction, account state, etc.)
    Blob mData;
    // Memory management: intrusive reference counting
};Leaf Node Specializations:
Three distinct leaf node types exist, each with unique hashing:
1. Account State Leaves (hotACCOUNT_NODE)
Store account information
Include balances, settings, owned objects, trust lines
Type prefix during hashing prevents collision with other types
Updated when transactions affect accounts
2. Transaction Leaves (hotTRANSACTION_NODE)
Store transaction data
Do not include execution metadata
Immutable once added to ledger
Enable verification of transaction history
3. Transaction+Metadata Leaves (hotTRANSACTION_LEDGER_ENTRY)
Store transactions with execution metadata
Include results (success/failure, ledger entries modified)
Complete information for replaying or auditing
Support full transaction reconstruction
Why Multiple Types?
// Type prefix prevents collisions
uint256 hash_account_state = SHA512Half(
    PREFIX_ACCOUNT || key || data);
uint256 hash_transaction = SHA512Half(
    PREFIX_TRANSACTION || key || data);
// Even with identical (key, data), hashes differ
hash_account_state != hash_transactionEnsures that moving data between leaves would be immediately detected as invalid.
SHAMapNodeID: Node Identification
Every node in the tree is uniquely identified by its position:
Components:
class SHAMapNodeID {
    // Distance from root (0 = root)
    std::uint32_t mDepth;
    // Path from root to this node
    // Packed as 4-bit nibbles in a uint256
    uint256 mNodeID;
};Path Encoding:
The path is encoded as 4-bit chunks (nibbles) in a uint256:
Node at depth 3, path [3, A, 7]:
mNodeID = 0x3A7000...000
          ^^^     (significant nibbles)
            ^^^^^^ (zero padding for remaining levels)
Depth 0 (root):     mNodeID = 0x0000...000
Depth 1:            mNodeID = 0x3000...000
Depth 2:            mNodeID = 0x3A00...000
Depth 3:            mNodeID = 0x3A70...000
...
Depth 64:           mNodeID = complete (all 64 nibbles filled)Key Operations:
getChildNodeID(branch) - Compute child position:
SHAMapNodeID parent(depth=2, nodeID=0x3A00...);
SHAMapNodeID child = parent.getChildNodeID(7);
// Result: depth=3, nodeID=0x3A70...selectBranch(nodeID, key) - Determine which branch to follow:
uint256 key = 0x3A7F2E1B4C9D...;
int branch = key.nthNibble(depth);  // Extract nth 4-bit chunk
// For depth=2: branch = 7 (extract bits 8-11)This deterministic navigation ensures every key has exactly one path through the tree.
State Management
SHAMaps exist in different states reflecting their role in the system:
Immutable State
mState == State::Immutable;
mCowID == 0;Represents a finalized, historical ledger
Nodes cannot be modified
Multiple readers can access simultaneously
Critical limitation: cannot be trimmed (nodes loaded stay in memory)
Used for consensus history
Mutable State
mState == State::Modifying;
mCowID != 0;  // Unique identifier for this treeRepresents work-in-progress ledger state
Nodes can be modified through copy-on-write
Safe mutations without affecting other SHAMap instances
Single writer (typically), multiple readers possible
Used for constructing new ledger
Synching State
mState == State::Synching;Transitional state during network synchronization
Allows incremental tree construction
Transitions to Immutable or Modifying when complete
Used when receiving missing nodes from peers
State Transitions:
        New Tree
          |
          v
    [Synching] <-- Receiving nodes from peers
          |
          +--> [Immutable] <-- Finalized ledger
          |
          +--> [Modifying] <-- New transactions incomingCopy-on-Write Mechanism
The copy-on-write system enables safe node sharing and snapshots:
Principle:
Nodes are shared between SHAMaps using shared pointers. When a mutable SHAMap needs to modify a shared node, it creates a copy:
// Check if node is owned by this SHAMap
if (node->getCowID() != mCowID) {
    // Node is shared or immutable
    // Create a copy
    auto newNode = node->clone();
    newNode->setCowID(mCowID);  // Mark as owned by this tree
    return newNode;
} else {
    // Node already owned by this tree
    // Safe to modify in place
    return node;
}Node Sharing Rules:
cowID == 0:           Immutable (shared by all)
cowID == tree.mCowID: Owned by this tree (safe to modify)
cowID != tree.mCowID: Owned by another tree (must copy before modifying)Benefits:
Efficient Snapshots: New SHAMap instances share unmodified subtrees
Create snapshot before ledger close: New immutable SHAMap shares root and all unchanged nodes No deep copy requiredSafe Concurrent Access: Modifications don't affect other instances
SHAMap A modifies Account1: Clones nodes along path to Account1 Leaves rest of tree shared SHAMap B (different ledger) unaffectedMemory Efficiency: Identical subtrees stored once
100 ledgers share 99% of tree structure Only 1% of data duplicated for different accounts
Integration with NodeStore
SHAMap is designed for in-memory operation, but nodes can be persisted:
Persistent Nodes:
Some nodes are marked as "backed" - they exist in NodeStore:
// Node came from persistent storage
mNode->setBacked();
// When synchronizing, try to retrieve from storage
std::shared_ptr<SHAMapTreeNode> node = nodestore.fetch(hash);
if (node) {
    // Add to tree
    canonicalizeNode(node);  // Ensure uniqueness
}Canonicalization:
Ensures nodes are unique in memory:
// Check cache for existing node with same hash
std::shared_ptr<SHAMapTreeNode> cached = cache.get(hash);
if (cached) {
    return cached;  // Use existing node
} else {
    cache.insert(hash, newNode);  // Cache new node
    return newNode;
}This enables:
Safe node sharing across multiple SHAMap instances
Fast equality checking (compare pointers, not content)
Memory efficiency (identical nodes stored once)
Summary
Key Architectural Elements:
SHAMap Instance: Complete tree representing one ledger version
Inner Nodes: Branching structure with 0-16 children, storing hashes
Leaf Nodes: Terminal nodes containing account or transaction data
SHAMapNodeID: Identifies node position through depth and path
State Management: Immutable (historical), Mutable (work-in-progress), Synching (incomplete)
Copy-on-Write: Enables snapshots and safe sharing
NodeStore Integration: Persistent layer for large datasets
Design Properties:
Perfect balance: All leaves at approximately same depth (log_16 of account count)
Deterministic navigation: Key uniquely determines path
Immutable persistence: Historical ledgers safe from modification
Efficient sharing: Snapshots require minimal additional memory
Scalable: Handles millions of accounts with logarithmic operations
In the next chapter, we'll explore how nodes are navigated, hashes are computed, and the tree structure is maintained.
Last updated

