The NodeStore



Introduction

Now that you understand SHAMap's in-memory structure and efficient algorithms, we turn to the critical question: how do you persist this state?

Without persistence, every validator restart would require replaying transactions from genesis, which means weeks of computation. With persistence, recovery takes minutes. But persistence introduces challenges:

  • Scale: Millions of nodes, gigabytes of data

  • Performance: A database read can be roughly 1000x slower than a memory access

  • Flexibility: Different operators need different storage engines

  • Reliability: Data must survive crashes without corruption

The NodeStore addresses each of these challenges through a small storage abstraction, pluggable backends, a standardized encoding, and a cache layer.

NodeStore's Role in XRPL Architecture

NodeStore sits at a critical junction in XRPL's architecture:

SHAMap's Dependency:

SHAMap needs to retrieve historical nodes by their hash.

But SHAMap doesn't know or care about:

  • How data is stored

  • Which database backend is used

  • Where the data is physically located

  • How caching is implemented

All that complexity is hidden behind NodeStore's interface.
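The separation described above can be sketched as follows. This is a minimal illustration, not rippled's actual API: `NodeSource`, `MapSource`, and `nodeSize` are hypothetical names, and a `std::string` stands in for a 256-bit hash. The tree-side function asks for a node by hash and never sees how the bytes are stored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

using Hash = std::string;                // stand-in for a 256-bit hash
using Blob = std::vector<std::uint8_t>;  // serialized node bytes

// Hypothetical interface: the only thing the tree code is allowed to see.
struct NodeSource {
    virtual ~NodeSource() = default;
    virtual std::optional<Blob> fetch(Hash const& hash) = 0;
};

// One possible implementation; callers cannot tell it from a disk store.
struct MapSource : NodeSource {
    std::map<Hash, Blob> nodes;

    std::optional<Blob> fetch(Hash const& hash) override {
        auto it = nodes.find(hash);
        if (it == nodes.end()) return std::nullopt;
        return it->second;
    }
};

// Tree-side code: depends only on the interface, not on any backend.
inline std::size_t nodeSize(NodeSource& source, Hash const& hash) {
    auto blob = source.fetch(hash);
    return blob ? blob->size() : 0;
}
```

Replacing `MapSource` with a disk-backed implementation would not require any change to `nodeSize`.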

Core Purpose

NodeStore provides four critical services:

1. Persistence

2. Consistent Interface

3. Performance Optimization

4. Lifecycle Management

NodeObject: The Fundamental Storage Unit

The atomic unit of storage in XRPL is the NodeObject:

Structure: each NodeObject carries three fields (type, hash, and serialized data).

Key Characteristics:

  1. Immutable Once Created: Cannot modify data after creation

  2. Hash as Key: Hash uniquely identifies the object

  3. Type Distinguishing: Type prevents hash collisions between different data types

  4. Serialized Format: Data is already in wire format

NodeObject Types:

| Type | Purpose | Numeric Value |
| --- | --- | --- |
| hotLEDGER | Ledger headers and metadata | 1 |
| hotACCOUNT_NODE | Account state tree nodes | 3 |
| hotTRANSACTION_NODE | Transaction tree nodes | 4 |
| hotUNKNOWN | Unknown/unrecognized types | 0 |
| hotDUMMY | Cache marker for missing entries | 512 |
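Putting the structure, the key characteristics, and the type table together, a NodeObject might be sketched like this. It is a simplified illustration rather than the real class: the accessor names, the `create` factory, and the `Hash256` alias are assumptions, while the enum values come from the table above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

using Blob = std::vector<std::uint8_t>;
using Hash256 = std::array<std::uint8_t, 32>;  // stand-in for a 256-bit hash

// Numeric values taken from the NodeObject Types table.
enum NodeObjectType : std::uint32_t {
    hotUNKNOWN = 0,
    hotLEDGER = 1,
    hotACCOUNT_NODE = 3,
    hotTRANSACTION_NODE = 4,
    hotDUMMY = 512
};

// Simplified sketch: all members are const and objects are only handed out
// as pointers-to-const, so nothing can change after creation.
class NodeObject {
public:
    static std::shared_ptr<NodeObject const>
    create(NodeObjectType type, Blob data, Hash256 const& hash) {
        return std::shared_ptr<NodeObject const>(
            new NodeObject(type, std::move(data), hash));
    }

    NodeObjectType type() const { return type_; }
    Hash256 const& hash() const { return hash_; }  // also the database key
    Blob const& data() const { return data_; }     // already serialized

private:
    NodeObject(NodeObjectType type, Blob data, Hash256 const& hash)
        : type_(type), hash_(hash), data_(std::move(data)) {}

    NodeObjectType const type_;
    Hash256 const hash_;
    Blob const data_;
};
```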

Type Prefix in Hashing:

Type fields prevent collisions:
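The idea can be illustrated with a small sketch. Everything here is an assumption made for the example: FNV-1a is a non-cryptographic stand-in for the real hash function, and the tags `"AN"` and `"TN"` are hypothetical, not the prefixes XRPL actually uses. The point is only that hashing a type tag together with the payload gives identical payloads of different kinds different digests.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FNV-1a (64-bit): a NON-cryptographic stand-in for the real hash function.
inline std::uint64_t fnv1a(std::string const& bytes) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Hypothetical type tags, one per kind of data.
constexpr char kAccountNodeTag[] = "AN";
constexpr char kTransactionNodeTag[] = "TN";

// Hash the tag followed by the payload.
inline std::uint64_t taggedHash(std::string const& tag,
                                std::string const& payload) {
    return fnv1a(tag + payload);
}
```

With this scheme, `taggedHash(kAccountNodeTag, p)` and `taggedHash(kTransactionNodeTag, p)` differ for the same payload `p`, so an account node can never be mistaken for a transaction node that happens to carry the same bytes.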

NodeObject Lifecycle

Creation

Storage

Caching

Retrieval

Archival

Backend Abstraction

The Backend class defines the minimal interface for any storage system:

Core Operations:

Status Codes:

Backend Independence:

NodeStore sits above the backends, so application logic stays unchanged when the backend is swapped.
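A minimal sketch of such an interface, with status codes and two interchangeable backends, is shown here. The names (`Status`, `Backend`, `MemoryBackend`, `NullBackend`, `saveAndReload`) and the signatures are assumptions chosen for illustration; the real Backend class has a larger surface.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Key = std::string;                 // stand-in for a 256-bit hash
using Blob = std::vector<std::uint8_t>;

// Illustrative status codes (names are assumptions).
enum class Status { ok, notFound, dataCorrupt, backendError };

// Minimal storage interface: fetch and store by key.
struct Backend {
    virtual ~Backend() = default;
    virtual std::string name() const = 0;
    virtual Status fetch(Key const& key, Blob& out) = 0;
    virtual Status store(Key const& key, Blob const& value) = 0;
};

// Keeps everything in a std::map; handy for tests.
struct MemoryBackend : Backend {
    std::map<Key, Blob> table;

    std::string name() const override { return "memory"; }
    Status fetch(Key const& key, Blob& out) override {
        auto it = table.find(key);
        if (it == table.end()) return Status::notFound;
        out = it->second;
        return Status::ok;
    }
    Status store(Key const& key, Blob const& value) override {
        table[key] = value;
        return Status::ok;
    }
};

// Accepts writes but discards them, so fetches never find anything.
struct NullBackend : Backend {
    std::string name() const override { return "none"; }
    Status fetch(Key const&, Blob&) override { return Status::notFound; }
    Status store(Key const&, Blob const&) override { return Status::ok; }
};

// Application-side code: identical for every backend.
inline bool saveAndReload(Backend& backend, Key const& key, Blob const& value) {
    if (backend.store(key, value) != Status::ok) return false;
    Blob loaded;
    return backend.fetch(key, loaded) == Status::ok && loaded == value;
}
```

`saveAndReload` compiles once and works against any `Backend`; only the object passed in decides where the bytes go.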

Supported Backends

RocksDB (Recommended for Most Cases)

  • Modern key-value store developed by Facebook

  • LSM tree (Log-Structured Merge tree) design

  • Excellent performance for XRPL workloads

  • Built-in compression support

  • Active maintenance

Characteristics:

  • Write throughput: ~10,000-50,000 objects/second

  • Read throughput: ~100,000+ objects/second

  • Compression: Reduces disk space by 50-70%

NuDB (High-Throughput Alternative)

  • Purpose-built for XRPL by Ripple

  • Append-only design optimized for SSD

  • Higher write throughput than RocksDB

  • Efficient space utilization

Characteristics:

  • Write throughput: ~50,000-200,000 objects/second

  • Read throughput: ~100,000+ objects/second

  • Better for high-volume systems

Testing Backends

Data Encoding Format

To enable backend independence, NodeStore uses a standardized encoding:

Encoded Blob Structure:

Encoding Process:

Decoding Process:
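The encode and decode steps can be sketched as below, assuming a layout of 8 reserved zero bytes, then a one-byte type tag, then the payload. That layout, and the names `encode`, `decode`, and `Decoded`, are assumptions for this example. The type check accepts only the values 0, 1, 3, and 4 from the NodeObject Types table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using Blob = std::vector<std::uint8_t>;

constexpr std::size_t kReservedBytes = 8;                 // assumed: zero-filled
constexpr std::size_t kHeaderBytes = kReservedBytes + 1;  // reserved + type byte

struct Decoded {
    std::uint8_t type;
    Blob payload;
};

// Encode: [8 reserved zero bytes][1 type byte][payload...]
inline Blob encode(std::uint8_t type, Blob const& payload) {
    Blob out(kReservedBytes, 0);
    out.push_back(type);
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Decode: reject blobs that are too short or carry an unrecognized type byte.
inline std::optional<Decoded> decode(Blob const& blob) {
    if (blob.size() < kHeaderBytes) return std::nullopt;
    std::uint8_t const type = blob[kReservedBytes];
    bool const known = type == 0 || type == 1 || type == 3 || type == 4;
    if (!known) return std::nullopt;
    auto const start = blob.begin() + static_cast<std::ptrdiff_t>(kHeaderBytes);
    return Decoded{type, Blob(start, blob.end())};
}
```

Because the header has a fixed size and the type sits at a fixed offset, a backend only ever has to store and return an opaque byte string.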

Benefits:

  1. Backend Agnostic: Any backend can store/retrieve encoded blobs

  2. Self-Describing: Type embedded, forward-compatible with unknown types

  3. Efficient: Small fixed overhead per object (8 reserved bytes plus a 1-byte type, 9 bytes in total)

  4. Validated: Type byte catches most corruption

Database Key as Hash

The database key is the object's hash, not a sequential ID.

Implications:

  1. Direct Retrieval: Any node retrievable by hash

  2. Deduplication: Identical content produces identical hash → same key

  3. Immutability: Hash never changes for given data

  4. Verification: Can verify data by recomputing hash
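These implications can be demonstrated with a toy content-addressed store. This is a sketch under stated assumptions: `ContentStore`, `contentHash`, and `verify` are hypothetical names, and 64-bit FNV-1a stands in for the real 256-bit cryptographic hash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// FNV-1a (64-bit): non-cryptographic stand-in for the real 256-bit hash.
inline std::uint64_t contentHash(std::string const& data) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

class ContentStore {
public:
    // Store data under its own hash and return that key.
    std::uint64_t put(std::string const& data) {
        std::uint64_t const key = contentHash(data);
        objects_.emplace(key, data);  // no-op if the key already exists
        return key;
    }

    // Direct retrieval by hash; nullptr if the key is absent.
    std::string const* get(std::uint64_t key) const {
        auto it = objects_.find(key);
        return it == objects_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::map<std::uint64_t, std::string> objects_;
};

// Verification: recompute the hash and compare it with the key.
inline bool verify(std::uint64_t key, std::string const& data) {
    return contentHash(data) == key;
}
```

Storing the same bytes twice yields the same key and a single stored copy, and any change to the bytes makes `verify` fail against the original key.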

Integration Architecture

NodeStore integrates with SHAMap through the Family pattern: a Family object gives each SHAMap access to caching and storage.
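One way to picture the pattern is the sketch below. The `Database`, `Family`, and `Tree` types are simplified, hypothetical stand-ins (the `reads` counter exists only to make caching visible), not the real classes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

using Hash = std::string;                // stand-in for a 256-bit hash
using Blob = std::vector<std::uint8_t>;

// Stand-in for the persistent store; counts reads to make caching visible.
struct Database {
    std::map<Hash, Blob> disk;
    int reads = 0;

    std::optional<Blob> fetch(Hash const& h) {
        ++reads;
        auto it = disk.find(h);
        if (it == disk.end()) return std::nullopt;
        return it->second;
    }
};

// Hypothetical Family: bundles the shared resources a tree needs.
struct Family {
    Database& db;
    std::map<Hash, Blob> treeNodeCache;
};

// Hypothetical tree: reaches storage only through its Family.
class Tree {
public:
    explicit Tree(Family& family) : family_(family) {}

    std::optional<Blob> fetchNode(Hash const& h) {
        auto cached = family_.treeNodeCache.find(h);
        if (cached != family_.treeNodeCache.end()) return cached->second;
        auto blob = family_.db.fetch(h);  // cache miss: go to storage
        if (blob) family_.treeNodeCache.emplace(h, *blob);
        return blob;
    }

private:
    Family& family_;
};
```

Two trees constructed from the same `Family` share one cache and one database, so a node fetched by the first tree is served from the cache when the second tree asks for it.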

Summary

Key Architectural Elements:

  1. NodeObject: Atomic storage unit (type, hash, data)

  2. Backend Interface: Minimal, consistent interface for storage

  3. Abstraction: Decouples application logic from storage implementation

  4. Encoding Format: Standardized format enables backend independence

  5. Key as Hash: Direct retrieval without index lookups

  6. Family Pattern: Provides access to caching and storage

Design Properties:

  • Backend Flexibility: Switch storage engines without code changes

  • Scale: Handles millions of objects efficiently

  • Persistence: Survives crashes and restarts

  • Verification: Data integrity through hashing

  • Simplicity: Minimal interface hides complexity

In the next chapter, we'll explore the critical Cache Layer that makes NodeStore practical for high-performance systems.
