SHAMap Architecture and Node Hierarchy

← Back to SHAMap and NodeStore: Data Persistence and State Management


Introduction

Now that you understand the mathematical foundations of Merkle-Patricia tries, let's explore how XRPL actually implements them in the SHAMap data structure.

The SHAMap is responsible for:

  • Maintaining account and transaction state in a cryptographically-verified Merkle tree

  • Computing root hashes that represent entire ledger state

  • Enabling efficient navigation through key-based lookups

  • Supporting snapshots and immutability for historical ledgers

  • Providing proof generation for trustless state verification

This chapter covers the architectural overview and node types. Later chapters dive into specific algorithms for navigation, hashing, and synchronization.

Core Design Principles

The SHAMap architecture achieves three critical properties:

1. Cryptographic Integrity

Every change to data propagates through hashes up to the root:

This ensures that no data can be modified undetected. Changing even one bit in an account's balance changes the root hash.

2. Efficient Navigation

The Patricia trie structure uses account identifiers as navigation guides:

No binary search needed. The path is directly encoded in the key.

3. Optimized Synchronization

Hash-based comparison eliminates unnecessary data transfer:

This allows new nodes to synchronize full ledgers from peers in minutes.

The SHAMap Instance

A SHAMap instance represents a complete tree of ledger state:

Core Data:

Key Properties:

  • Root Node: Always an inner node, never a leaf (even if only one account, still has inner structure)

  • Depth: Exactly 64 levels (256-bit keys ÷ 4 bits per level)

  • Navigation Determinism: Any key uniquely determines its path from root to leaf

Node Hierarchy

The SHAMap consists of three conceptual layers:

Layer 1: Root

  • Always present

  • Always an inner node

  • Contains the root hash representing entire tree

  • Can have up to 16 children (one for each hex digit 0-F)

Layer 2: Internal Structure

  • Inner nodes serve as branch points

  • Each can have 0-16 children

  • Store only hashes of children, not actual data

  • No data items in inner nodes

Layer 3: Leaf Nodes

  • Terminal nodes containing actual data items

  • Types: Account state, Transaction, Transaction+Metadata

  • All leaves in a SHAMap tree are homogeneous (same type)

Inner Nodes

Inner nodes form the branching structure:

Structure:

Key Characteristics:

  1. Do not store data items directly - Only hashing information

  2. Maintain cryptographic commitments through child hashes

  3. Variable occupancy - Not all 16 children present

  4. Support both serialization formats - Compressed and full

Serialization Formats:

Inner nodes support two wire formats:

Compressed Format (used when most slots are empty):

Saves space by omitting empty branches.

Full Format (used when most slots are occupied):

Simpler structure despite larger size.

Format Selection Algorithm:

XRPL automatically chooses format based on branch count:

Leaf Nodes

Leaf nodes store the actual blockchain data:

Base Properties:

SHAMapItem Structure:

Leaf Node Specializations:

Three distinct leaf node types exist, each with unique hashing:

1. Account State Leaves (hotACCOUNT_NODE)

  • Store account information

  • Include balances, settings, owned objects, trust lines

  • Type prefix during hashing prevents collision with other types

  • Updated when transactions affect accounts

2. Transaction Leaves (hotTRANSACTION_NODE)

  • Store transaction data

  • Do not include execution metadata

  • Immutable once added to ledger

  • Enable verification of transaction history

3. Transaction+Metadata Leaves (hotTRANSACTION_LEDGER_ENTRY)

  • Store transactions with execution metadata

  • Include results (success/failure, ledger entries modified)

  • Complete information for replaying or auditing

  • Support full transaction reconstruction

Why Multiple Types?

Ensures that moving data between leaves would be immediately detected as invalid.

SHAMapNodeID: Node Identification

Every node in the tree is uniquely identified by its position:

Components:

Path Encoding:

The path is encoded as 4-bit chunks (nibbles) in a uint256:

Key Operations:

getChildNodeID(branch) - Compute child position:

selectBranch(nodeID, key) - Determine which branch to follow:

This deterministic navigation ensures every key has exactly one path through the tree.

State Management

SHAMaps exist in different states reflecting their role in the system:

Immutable State

  • Represents a finalized, historical ledger

  • Nodes cannot be modified

  • Multiple readers can access simultaneously

  • Critical limitation: cannot be trimmed (nodes loaded stay in memory)

  • Used for consensus history

Mutable State

  • Represents work-in-progress ledger state

  • Nodes can be modified through copy-on-write

  • Safe mutations without affecting other SHAMap instances

  • Single writer (typically), multiple readers possible

  • Used for constructing new ledger

Synching State

  • Transitional state during network synchronization

  • Allows incremental tree construction

  • Transitions to Immutable or Modifying when complete

  • Used when receiving missing nodes from peers

State Transitions:

Copy-on-Write Mechanism

The copy-on-write system enables safe node sharing and snapshots:

Principle:

Nodes are shared between SHAMaps using shared pointers. When a mutable SHAMap needs to modify a shared node, it creates a copy:

Node Sharing Rules:

Benefits:

  1. Efficient Snapshots: New SHAMap instances share unmodified subtrees

  2. Safe Concurrent Access: Modifications don't affect other instances

  3. Memory Efficiency: Identical subtrees stored once

Integration with NodeStore

SHAMap is designed for in-memory operation, but nodes can be persisted:

Persistent Nodes:

Some nodes are marked as "backed" - they exist in NodeStore:

Canonicalization:

Ensures nodes are unique in memory:

This enables:

  • Safe node sharing across multiple SHAMap instances

  • Fast equality checking (compare pointers, not content)

  • Memory efficiency (identical nodes stored once)

Summary

Key Architectural Elements:

  1. SHAMap Instance: Complete tree representing one ledger version

  2. Inner Nodes: Branching structure with 0-16 children, storing hashes

  3. Leaf Nodes: Terminal nodes containing account or transaction data

  4. SHAMapNodeID: Identifies node position through depth and path

  5. State Management: Immutable (historical), Mutable (work-in-progress), Synching (incomplete)

  6. Copy-on-Write: Enables snapshots and safe sharing

  7. NodeStore Integration: Persistent layer for large datasets

Design Properties:

  • Perfect balance: All leaves at approximately same depth (log_16 of account count)

  • Deterministic navigation: Key uniquely determines path

  • Immutable persistence: Historical ledgers safe from modification

  • Efficient sharing: Snapshots require minimal additional memory

  • Scalable: Handles millions of accounts with logarithmic operations

In the next chapter, we'll explore how nodes are navigated, hashes are computed, and the tree structure is maintained.

Last updated