Resource Management and Performance Characteristics

Introduction

Understanding SHAMap and NodeStore theoretically is one thing. Operating them in production is another.

This final chapter covers:

  • Real-world resource requirements

  • Performance measurement and optimization

  • Bottleneck identification

  • Tuning for different deployment scenarios

Lookup Complexity Analysis

SHAMap Lookup

Operation: Find account by ID

Worst case: O(64) node traversals
  Tree depth: 256 bits / 4 bits per level = 64 levels

Typical case: far shallower than the worst case
  Lookups stop as soon as the key's prefix is unique
  Average depth in a realistic ledger: ~25 levels

Expected time:
  Each traversal step: O(1) array access (branch[0..15])
  Total: proportional to the depth actually traversed,
         which grows only logarithmically with the account count

Cache hit: 1-10 microseconds
  Direct pointer access, no I/O

Cache miss: 1-10 milliseconds
  Database query required
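
The descent itself is easy to picture. A minimal sketch, assuming a hypothetical 16-way node type (Node, branch, and nibbleAt are illustrative, not rippled's actual SHAMap classes):

#include <array>
#include <cstdint>
#include <memory>

// Hypothetical 16-way tree node; rippled's real inner/leaf node classes differ.
struct Node {
    bool isLeaf = false;
    std::array<std::shared_ptr<Node>, 16> branch{};    // empty slots are nullptr
};

// Extract the i-th 4-bit nibble of a 256-bit key (most significant nibble first).
inline unsigned nibbleAt(std::array<std::uint8_t, 32> const& key, unsigned depth) {
    std::uint8_t const byte = key[depth / 2];
    return (depth % 2 == 0) ? (byte >> 4) : (byte & 0x0F);
}

// Walk at most 64 levels; each step is a single O(1) array index.
std::shared_ptr<Node> lookup(std::shared_ptr<Node> root, std::array<std::uint8_t, 32> const& key) {
    auto node = std::move(root);
    for (unsigned depth = 0; node && depth < 64; ++depth) {
        if (node->isLeaf)
            return node;                                // usually reached well before depth 64
        node = node->branch[nibbleAt(key, depth)];      // cache hit: pointer chase, no I/O
    }
    return node;                                        // nullptr if the key is absent
}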

Batch Fetch

N objects requested:

Naive (sequential):
  N × database_latency = N × 10ms
  Example: 100 objects = 1000ms

Batched:
  single_batch_latency + deserialize
  Example: 100 objects = 10ms + 5ms = 15ms
  Speedup: 66x
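
The arithmetic above is the whole point of batching: pay the ~10ms round trip once instead of N times. A rough sketch against a hypothetical backend interface (Backend, fetchOne, and fetchBatch are illustrative stand-ins, not the actual NodeStore API):

#include <array>
#include <cstdint>
#include <optional>
#include <vector>

using Key  = std::array<std::uint8_t, 32>;
using Blob = std::vector<std::uint8_t>;

struct Backend {                                          // hypothetical interface
    std::optional<Blob> fetchOne(Key const& key);         // one ~10ms round trip per call
    std::vector<std::optional<Blob>> fetchBatch(std::vector<Key> const& keys);  // one round trip total
};

// Naive: N sequential round trips -> N x database_latency (100 keys ~= 1000ms).
std::vector<std::optional<Blob>> fetchSequential(Backend& db, std::vector<Key> const& keys) {
    std::vector<std::optional<Blob>> out;
    out.reserve(keys.size());
    for (auto const& key : keys)
        out.push_back(db.fetchOne(key));
    return out;
}

// Batched: one round trip plus deserialization (100 keys ~= 10ms + 5ms).
std::vector<std::optional<Blob>> fetchBatched(Backend& db, std::vector<Key> const& keys) {
    return db.fetchBatch(keys);
}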

Write Throughput

Ledger Close Cycle

Timeline for typical ledger close (3-5 seconds):

1. Receive transactions: 2 seconds
   - Validate signatures
   - Check preconditions
   - Execute in SHAMap

2. Consensus: 1 second
   - Reach agreement on state
   - Sign ledger

3. Store phase: 0.5-1 second
   - Serialize modified nodes
   - Write to NodeStore
   - Update indexes

Total: 3.5-5 seconds

Node Object Volume

Typical ledger modification:

200-400 transactions per ledger
  Average 2-4 modified accounts per transaction
  = roughly 500-1000 modified leaf nodes

Plus structural nodes (parent rehashing):
  Depth of modified accounts: ~25 levels
  = up to ~25 ancestor nodes rehashed per modified leaf
    (ancestors are shared between leaves, so the net addition is modest)

Total objects created: ~600-1100 per ledger

At one ledger every ~4 seconds:
  ~150-275 objects/second sustained (bursts during sync are far higher)

Database requirement:
  RocksDB: Handles 10,000-50,000 obj/sec easily
  NuDB:    Handles 50,000-200,000 obj/sec
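
That headroom is easy to sanity-check with the section's own estimates (the figures below are those estimates, not measurements):

#include <cstdio>

int main() {
    double const objectsPerLedger  = 1000.0;   // ~600-1100 typical
    double const ledgerIntervalSec = 4.0;      // one ledger every ~3-5 seconds

    double const sustained  = objectsPerLedger / ledgerIntervalSec;   // ~250 obj/sec
    double const rocksDbLow = 10000.0;         // low end of the quoted RocksDB range

    std::printf("sustained: %.0f obj/sec, headroom: ~%.0fx\n",
                sustained, rocksDbLow / sustained);                   // ~40x headroom
}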

Write Latency

Store a NodeObject:

1. Cache update: 1-10 microseconds
2. Encode to blob: 100 microseconds
3. Database write: 100-1000 microseconds (SSD)
4. Batch accumulation: 10-100 milliseconds

Total for batch of 100: 10-100 milliseconds
Per object in batch: 0.1-1 millisecond
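
The batch accumulation step is where most of the latency lives, so it helps to see it as a small accumulator that pays the I/O cost once per flush. A minimal sketch; BatchWriter and Backend::writeBatch are hypothetical, not rippled's actual write path:

#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Blob = std::vector<std::uint8_t>;

struct Backend {
    void writeBatch(std::vector<Blob> const& blobs);    // one I/O for the whole batch
};

class BatchWriter {
public:
    BatchWriter(Backend& db, std::size_t maxBatch) : db_(db), maxBatch_(maxBatch) {}

    // Cache update and encoding (microseconds) happen before this point;
    // here we only accumulate and occasionally pay the batch I/O cost.
    void store(Blob blob) {
        pending_.push_back(std::move(blob));
        if (pending_.size() >= maxBatch_)                // e.g. 100 objects per flush
            flush();
    }

    void flush() {
        if (pending_.empty())
            return;
        db_.writeBatch(pending_);                        // 10-100ms for the whole batch
        pending_.clear();                                // ~0.1-1ms amortized per object
    }

private:
    Backend& db_;
    std::size_t maxBatch_;
    std::vector<Blob> pending_;
};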

Read Performance Characteristics

Cache Hit Scenario

Hit rate: 95% (well-tuned system)

1000 object requests:
  950 cache hits × 5 microseconds = 4.75 milliseconds
  50 cache misses × 10 milliseconds = 500 milliseconds
  Total: 504.75 milliseconds = 0.5 seconds

Average per request: 0.5 milliseconds

Cache Miss Scenario

Hit rate: 60% (poorly tuned system)

1000 object requests:
  600 cache hits × 5 microseconds = 3 milliseconds
  400 cache misses × 10 milliseconds = 4000 milliseconds = 4 seconds
  Total: 4.003 seconds

Average per request: 4 milliseconds

~8x slower due to cache misses!
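
Both scenarios are instances of the standard effective-access-time formula, average = hit_rate × hit_cost + (1 − hit_rate) × miss_cost. A minimal check using the latency figures above:

#include <cstdio>

// Average request latency in ms for a given hit rate.
double averageLatencyMs(double hitRate, double hitUs = 5.0, double missMs = 10.0) {
    return hitRate * (hitUs / 1000.0) + (1.0 - hitRate) * missMs;
}

int main() {
    std::printf("95%% hit rate: %.2f ms/request\n", averageLatencyMs(0.95));  // ~0.50 ms
    std::printf("60%% hit rate: %.2f ms/request\n", averageLatencyMs(0.60));  // ~4.00 ms
}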

Memory Requirements

NodeStore Memory

Cache layer:
  Size: Configurable (32MB - 4GB typical)
  Per object: ~100-500 bytes
  At 256MB cache: roughly 300,000-500,000 cached objects
    (per-entry bookkeeping overhead reduces the raw byte-count estimate)

Backend buffers:
  RocksDB: ~100-300MB for block cache
  NuDB: ~50-100MB

Thread pools:
  Each async thread: ~1-2MB stack
  10 threads: ~20MB

Total NodeStore memory: cache_size + backend_buffers + thread_stacks
  Typical: 256MB cache + 200MB backend = 500MB total
  Large: 1GB cache + 300MB backend = 1.3GB total
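
A quick way to estimate how many objects a cache budget holds is to divide by blob size plus per-entry overhead; the overhead figure below is an assumption for illustration, not a measured value:

#include <cstdio>

int main() {
    double const cacheBytes    = 256.0 * 1024 * 1024;   // 256MB cache
    double const avgBlobBytes  = 300.0;                  // ~100-500 bytes per object
    double const entryOverhead = 250.0;                  // assumed key + pointers + bookkeeping

    double const objects = cacheBytes / (avgBlobBytes + entryOverhead);
    std::printf("~%.0fk cached objects\n", objects / 1000.0);   // ~490k with these assumptions
}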

SHAMap Memory

In-memory tree of current + recent ledgers:

Active ledger: ~10-50MB
  Depends on account count and modification volume

Recent immutable ledgers (kept for quick access):
  2-3 most recent: ~30-150MB

Total SHAMap: 50-200MB typical

Plus cached nodes (shared with NodeStore cache):
  Counted above in NodeStore memory

Total Memory Budget

Minimal validator:
  SHAMap: 50MB
  NodeStore: 200MB
  Other rippled: 100MB
  Total: 350MB

Standard validator:
  SHAMap: 100MB
  NodeStore: 500MB
  Other rippled: 100MB
  Total: 700MB

Large validator:
  SHAMap: 200MB
  NodeStore: 2000MB
  Other rippled: 100MB
  Total: 2.3GB

Disk Space Requirements

Database Growth

Without rotation: Unbounded growth

Per ledger:
  ~600-1100 new objects per ledger
  ~200 bytes per object (with compression)
  = ~0.12-0.22MB per ledger

Per day:
  ~20,000 ledgers per day (one every ~4 seconds)
  = 2.4-4.4 GB per day

Per year:
  = 876GB - 1.6TB per year

Clearly unsustainable on typical validator disks (they fill within weeks to months)

With Rotation

Retention policy: Keep last 100,000 ledgers

Ledger creation rate: 1 ledger per ~3-5 seconds
100,000 ledgers = roughly 4-6 days of history

Database size:
  100,000 × 0.2MB = 20GB (stable)
  With overhead: 30-50GB typical

Bounded growth enables indefinite operation
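
The bounded size is just retention × per-ledger footprint plus backend overhead; a quick sketch using the figures above (the 2x overhead factor is an assumption):

#include <cstdio>

int main() {
    double const retainedLedgers = 100000.0;   // online deletion retention target
    double const mbPerLedger     = 0.2;        // ~600-1100 objects x ~200 bytes each
    double const overheadFactor  = 2.0;        // assumed indexes, write amplification, slack

    double const rawGb = retainedLedgers * mbPerLedger / 1024.0;
    std::printf("raw: ~%.0f GB, with overhead: ~%.0f GB\n",
                rawGb, rawGb * overheadFactor);                     // ~20 GB / ~40 GB
}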

Actual Sizes on Mainnet

Small validator (RocksDB, compressed):
  Database: 30-50GB
  With binaries/logs: 60GB total

Archive node (full history):
  Database: multiple terabytes, growing indefinitely
  With redundancy: plan additional headroom beyond the database itself

Growth per day (with rotation):
  Net growth ≈ zero once the retention window is full
  (~2-4GB of new data written per day, matched by deletion of old data)

File Descriptor Usage

File Descriptor Requirements

Each backend type requires different FDs:

RocksDB:
  - Main database: 1
  - WAL (write-ahead log): 1
  - SSTable files: 20-100 (depending on configuration)
  - Total: 25-100 FDs

NuDB:
  - Main data file: 1
  - Index file: 1
  - Total: 2-5 FDs

Operating system overhead:
  stdin, stdout, stderr: 3
  Socket listening: 2-5
  Network connections: ~50 typical

Total rippled process:
  - Without NodeStore: 50-100 FDs
  - With RocksDB: 100-200 FDs
  - Comfortable limit: 4096 FDs

Configuration:
  ulimit -n 4096    # Set FD limit
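
Besides setting the shell limit, the soft limit can be checked and raised toward the hard limit from inside a process. A minimal POSIX sketch (not rippled's own startup check):

#include <cstdio>
#include <sys/resource.h>

int main() {
    rlimit rl{};
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        std::perror("getrlimit");
        return 1;
    }
    std::printf("soft FD limit: %llu, hard FD limit: %llu\n",
                static_cast<unsigned long long>(rl.rlim_cur),
                static_cast<unsigned long long>(rl.rlim_max));

    // Raise the soft limit toward 4096 if the hard limit allows it.
    rlim_t const wanted = 4096;
    if (rl.rlim_cur < wanted && rl.rlim_max >= wanted) {
        rl.rlim_cur = wanted;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            std::perror("setrlimit");
    }
    return 0;
}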

Performance Tuning

Identifying Bottlenecks

Monitor these metrics:

// Illustrative checks against a hypothetical `metrics` snapshot;
// the thresholds are rules of thumb, not hard limits.
std::string problem;

// Cache hit rate - the most important signal
if (metrics.hitRate() < 0.90) {
    // Remedy: increase cache_size
    problem = "Cache too small";
}

// Write latency - the latency-sensitive path (ledger close)
if (metrics.writeLatencyMs() > 100) {
    // Remedy: switch to a faster backend or increase the batch size
    problem = "Backend I/O too slow";
}

// Fetch latency
if (metrics.fetchLatencyMs() > 50) {
    // Check cache hit rate and disk health
    problem = "Database queries too slow";
}

// Async queue depth
if (metrics.asyncQueueDepth() > 10000) {
    // Not keeping up with demand
    problem = "Async processing overwhelmed";
}

Tuning Parameters

[node_db]
type = RocksDB
path = /var/lib/rippled/db

# Cache tuning
cache_size = 256        # Increase if memory available
cache_age = 60          # Longer = better hit rate

# Threading
async_threads = 4       # Increase for I/O-bound systems

# Batch operations
batch_write_size = 256  # Larger batches, fewer transactions

# Online deletion (rotation) - configured in [node_db]
online_delete = 256000  # Keep ~256K recent ledgers (roughly 9-12 days at 3-4s per ledger)

Scenario 1: High-Traffic Validator

Problem: Write latency too high (ledgers close slowly)

Solution:
  - Increase cache_size to 1GB+
  - Switch to NuDB backend (higher throughput)
  - Increase async_threads to 8-16
  - Ensure SSD (not HDD)
  - Increase batch_write_size

Result: Write throughput 50K+ objects/sec

Scenario 2: Memory-Constrained

Problem: Only 512MB RAM available

Solution:
  - Set cache_size = 64MB (small)
  - Still runs, but slower
  - Increase cache_age for working set
  - Monitor hit rate (may drop to 80%)

Result: Functional but slower sync and queries

Scenario 3: Archive Node

Problem: Need complete history, very large disk

Solution:
  - No rotation (online_delete disabled)
  - RocksDB with compression
  - Smaller cache_size (historical data is accessed less frequently)
  - Parallel database with rotated copy

Result: Full history, multi-terabyte database

Performance Characteristics Summary

Lookup Performance:

Single object lookup:
  Cache hit:    1-10 microseconds
  Cache miss:   1-10 milliseconds
  95% hit rate: ~0.5 milliseconds average

Batch operation (100 objects):
  Sequential:   ~1000 milliseconds
  Batched:      ~10-15 milliseconds
  Speedup:      ~65-100x

Write Performance:

Per ledger:
  1000 objects per ledger
  Per-object: 0.1-1 millisecond
  Batch overhead: 10-100 milliseconds
  Total per ledger: 100-1100 milliseconds

Throughput:
  ~1000 objects/ledger at one ledger every ~4 seconds ≈ 250 obj/sec sustained
  Well within RocksDB/NuDB capacity (sync bursts are also easily absorbed)

Memory Usage:

Minimum:  300-400MB
Typical:  500-700MB
Large:    2-4GB
Depends on cache_size configuration

Disk Space:

With rotation:  30-50GB (several days of history, depending on retention)
Unbounded:      ~0.9-1.6TB per year (without rotation)
Growth rate:    ~2.4-4.4GB of new data per day (net growth ~zero with rotation)

Scalability Limits:

Current network:
  Hundreds of nodes, including 100+ validators
  100-400 transactions/ledger
  Proven sustainable

Theoretical limits:
  Cache hit rate: 80%+ maintainable at any size
  Write throughput: 100K obj/sec possible
  Read throughput: 1M obj/sec with cache

Practical limits:
  Memory: 4-16GB per validator typical
  Disk: 100GB-1TB per validator typical
  Network: Synchronization limits transaction volume
  Consensus: Agreement time limits throughput

Monitoring in Production

Key Metrics to Track

1. Cache Statistics:
   - Hit rate (target: >90%)
   - Size (should be close to configured max)
   - Eviction rate

2. Database Performance:
   - Write latency (target: <100ms per ledger)
   - Read latency (target: <50ms per request)
   - Queue depth (target: <1000)

3. Resource Usage:
   - Memory (should stabilize)
   - CPU (typically 20-50% on modern systems)
   - Disk I/O (peaks during sync)

4. Application:
   - Ledger close time (target: 3-5 seconds)
   - Synchronization lag (target: 0 when caught up)
   - Consensus proposal success (target: >95%)

Alerting Thresholds

Warning:
  - Hit rate < 80%
  - Write latency > 200ms
  - Queue depth > 5000

Critical:
  - Hit rate < 60%
  - Write latency > 500ms
  - Ledger close > 10 seconds
  - Disk space < 10GB free
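
These thresholds map directly onto a small severity check. A sketch, assuming a hypothetical Metrics snapshot struct with the fields used in this section:

struct Metrics {                    // hypothetical monitoring snapshot
    double hitRate;                 // 0.0 - 1.0
    double writeLatencyMs;
    double ledgerCloseSec;
    double freeDiskGb;
    long   asyncQueueDepth;
};

enum class Severity { Ok, Warning, Critical };

Severity classify(Metrics const& m) {
    // Critical thresholds take precedence over warnings.
    if (m.hitRate < 0.60 || m.writeLatencyMs > 500 ||
        m.ledgerCloseSec > 10 || m.freeDiskGb < 10)
        return Severity::Critical;
    if (m.hitRate < 0.80 || m.writeLatencyMs > 200 || m.asyncQueueDepth > 5000)
        return Severity::Warning;
    return Severity::Ok;
}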
