With millions of nodes and a new ledger closing every 3-5 seconds, any system that naively queries the database for every node access will fail to keep up.
The NodeStore's Cache Layer solves this with a multi-tier strategy that keeps hot data in memory while leaving cold data safely on disk.
The Performance Problem
Scenario: Synchronizing from the Network
A new node joins XRPL and must catch up to the current ledger. This requires:
Fetching missing ledgers (blocks of transactions)
For each ledger, fetching all state nodes
Verifying each node's hash
Storing nodes to disk
Naive Approach (No Cache):
Clearly infeasible.
With Caching (90% Hit Rate):
Still slow, but realistic with parallel processing.
The difference between possible and impossible is caching.
TaggedCache Architecture
The NodeStore's primary cache is the TaggedCache:
Purpose:
Structure:
Cache Tiers:
NodeStore implements a two-tier caching strategy:
Fetch Algorithm:
Cache Insertion Strategy
When Objects Enter Cache:
Dummy Objects:
Special marker objects prevent wasted lookups:
Benefit of Dummies:
Prevents a thundering herd of repeated failed lookups.
Cache Eviction
Cache capacity is limited. When full, old objects must be evicted.
Eviction Triggers:
LRU (Least Recently Used) Eviction:
Age-Based Eviction:
Configuration Parameters:
Impact of Configuration:
Cache Performance Metrics
NodeStore tracks cache effectiveness:
Example Metrics:
Synchronization Optimization
During network synchronization, special techniques optimize caching:
Prefetching
Batch Loading
Deferred Reads
Working Set Management
Different phases of operation have different access patterns:
During Normal Operation (Steady State)
During Synchronization
After Sync Completion
Thread Safety
The cache must be safe for concurrent access:
Concurrency Properties:
Advanced Caching: Full Below Optimization
During synchronization, special tracking prevents redundant work:
The Problem:
The Solution: Full Below Generation Counter
Benefit:
Summary
Key Concepts:
Multi-Tier Caching: Memory + disk strategy balances performance and capacity
LRU Eviction: Keeps frequently-accessed data, evicts cold data
Dummy Markers: Prevent repeated failed lookups
Metrics Tracking: Monitor cache effectiveness
Synchronization Optimization: Prefetch and batch loading
Full Below Cache: Avoid redundant traversal during sync
Thread Safety: Shared locks for multiple readers
Performance Impact:
The Cache Layer transforms NodeStore from theoretical to practical. Database queries are unavoidable, but caching hides their latency, allowing XRPL to operate at microsecond efficiency despite millisecond database performance.
for (each node in ledger) {
    backend.fetch(nodeHash);  // Database query - 10ms
}
For 10,000 nodes in a ledger:
10,000 × 10ms = 100 seconds per ledger
For 100,000 ledgers to catch up:
100 seconds × 100,000 = 10,000,000 seconds ≈ 116 days
for (each node in ledger) {
    if (cache.has(nodeHash)) {
        cache.get(nodeHash);      // 10 microseconds
    } else {
        backend.fetch(nodeHash);  // 10 milliseconds
    }
}
For 10,000 nodes:
9,000 × 10µs (cache hits) = 90ms
1,000 × 10ms (cache misses) = 10,000ms
Total: 10,090ms ≈ 10 seconds per ledger
For 100,000 ledgers:
100,000 × 10 seconds = 1,000,000 seconds ≈ 11.5 days
class TaggedCache {
// Keep frequently accessed NodeObjects in memory
// Minimize expensive database queries
// Provide thread-safe concurrent access
};
class TaggedCache {
private:
    // Key: NodeObject hash (uint256)
    // Value: shared_ptr<NodeObject>
    std::unordered_map<uint256, std::shared_ptr<NodeObject>> mCache;

    // Protect concurrent access
    std::mutex mCacheLock;

    // Configuration
    size_t mMaxSize;               // Maximum objects in cache
    std::chrono::seconds mMaxAge;  // Maximum age before eviction

public:
    // Retrieve from cache
    std::shared_ptr<NodeObject> get(uint256 const& hash);

    // Store in cache
    void insert(uint256 const& hash,
                std::shared_ptr<NodeObject> const& obj);

    // Remove from cache
    void remove(uint256 const& hash);

    // Evict old entries
    void evictExpired();
};
std::shared_ptr<NodeObject> fetch(uint256 const& hash) {
    // Step 1: Check cache
    {
        std::lock_guard<std::mutex> lock(mCacheLock);
        auto it = mCache.find(hash);
        if (it != mCache.end()) {
            // Found in cache
            recordHit(hash);
            // A cached dummy means "known absent" - report not found
            if (it->second->isDummy())
                return nullptr;
            return it->second;
        }
    }

    // Step 2: Cache miss - query backend
    recordMiss(hash);
    auto obj = backend->fetch(hash);
    if (!obj) {
        // Object not found anywhere
        // Cache a dummy marker to prevent repeated lookups
        cacheDummy(hash);
        return nullptr;
    }

    // Step 3: Update cache with newly fetched object
    {
        std::lock_guard<std::mutex> lock(mCacheLock);
        mCache.insert({hash, obj});
    }
    return obj;
}
1. On Database Fetch
   fetch() succeeds or fails → always cache the result
   (Success: cache the object, Failure: cache a dummy marker)
2. On Storage
   store() is called → immediately cache the object, since it is likely
   to be accessed again soon (see the sketch after this list)
3. Predictive Loading
   When traversing the SHAMap, prefetch likely-needed sibling nodes
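A minimal sketch of the storage path, assuming the backend and cache objects described earlier; the write-through-plus-cache behavior is the point, the exact signature is illustrative:

void store(std::shared_ptr<NodeObject> const& obj) {
    // Persist to the durable on-disk backend
    backend->store(obj);

    // Cache immediately: a just-written object is very likely to be
    // fetched again within the next few ledgers
    cache.insert(obj->mHash, obj);
}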
class NodeObject {
public:
    static std::shared_ptr<NodeObject> createDummy(uint256 const& hash) {
        auto obj = std::make_shared<NodeObject>();
        obj->mType = hotDUMMY;  // Type 512: "not found" marker
        obj->mHash = hash;
        obj->mData.clear();     // Empty data
        return obj;
    }

    bool isDummy() const {
        return mType == hotDUMMY;
    }
};

// In fetch:
std::shared_ptr<NodeObject> backend_result = backend->fetch(hash);
if (!backend_result) {
    // Not found - cache a dummy to avoid retrying the backend
    auto dummy = NodeObject::createDummy(hash);
    cache.insert(hash, dummy);
    return nullptr;
}
Scenario: Syncing, but a peer doesn't have a requested node
Without dummies:
Request node X from network
Peer: "I don't have it"
Local check: Try backend
Backend: "Not there"
(repeat this sequence multiple times)
With dummies:
Request node X from network
Peer: "I don't have it"
Local check: Cache hit on dummy
Immediately know: "Not available"
Don't retry
void insertWithEviction(uint256 const& hash,
                        std::shared_ptr<NodeObject> const& obj) {
    {
        std::lock_guard<std::mutex> lock(mCacheLock);

        // Check size limit
        if (mCache.size() >= mMaxSize) {
            evictLRU();  // Remove least recently used
        }
        mCache.insert({hash, obj});
    }

    // Check age limit (periodic)
    if (shouldEvictExpired()) {
        evictExpired();  // Remove objects older than mMaxAge
    }
}
void evictLRU() {
    // Find the entry with the oldest access time
    // (called with mCacheLock already held)
    auto oldest = findOldestAccess();

    // Remove it from the cache
    mCache.erase(oldest->hash);
}

// Track access times
// (with age tracking enabled, mCache maps hash → CacheEntry rather than
// a bare shared_ptr; evictExpired() below relies on that)
struct CacheEntry {
    std::shared_ptr<NodeObject> object;
    std::chrono::steady_clock::time_point lastAccess;
};

void evictExpired() {
    std::lock_guard<std::mutex> lock(mCacheLock);
    auto now = std::chrono::steady_clock::now();
    for (auto it = mCache.begin(); it != mCache.end();) {
        auto age = now - it->second.lastAccess;
        if (age > mMaxAge) {
            // Entry too old - remove it
            it = mCache.erase(it);
        } else {
            ++it;
        }
    }
}
Small cache:
cache_size = 32 MB
Hit rate: ~60% (more evictions)
Disk queries: 40% of lookups (slower)
Large cache:
cache_size = 1024 MB
Hit rate: ~95% (fewer evictions)
Disk queries: 5% of lookups (faster)
Memory usage: Higher
Operators choose a cache size based on available RAM and performance needs; the helper below makes the trade-off concrete.
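Here is a small illustrative helper (not part of NodeStore) that computes the expected average fetch latency for a given hit rate, using the same figures as earlier: ~10µs per cache hit and ~10ms per disk read:

double expectedLatencyMs(double hitRate) {
    constexpr double hitMs  = 0.01;  // ~10 microseconds per cache hit
    constexpr double missMs = 10.0;  // ~10 milliseconds per disk read
    return hitRate * hitMs + (1.0 - hitRate) * missMs;
}

// expectedLatencyMs(0.60) ≈ 4.0 ms per lookup  (small cache)
// expectedLatencyMs(0.95) ≈ 0.51 ms per lookup (large cache)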
From a running XRPL validator:
Period: 1 hour
Cache hits: 86,400 (hit on hot data)
Cache misses: 7,200 (query database)
Hit rate: 92.3% (excellent)
Latency impact:
Average: 0.923 × 10µs + 0.077 × 10ms ≈ 0.78 milliseconds per lookup
Without cache: average 10 milliseconds per lookup
Speedup: ~13x
Throughput impact:
Queries handled: 93,600 per hour
Time spent waiting on the database: 7,200 misses × 10ms ≈ 72 seconds
If every query hit disk: 93,600 × 10ms ≈ 936 seconds of I/O wait (~13x more)
Caching is what makes this workload practical.
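The recordHit()/recordMiss() calls in the fetch path can be backed by counters as simple as the following sketch (the hash argument is accepted only to match the call sites; per-key statistics are omitted):

struct CacheStats {
    std::atomic<std::uint64_t> hits{0};
    std::atomic<std::uint64_t> misses{0};

    void recordHit(uint256 const&)  { hits.fetch_add(1, std::memory_order_relaxed); }
    void recordMiss(uint256 const&) { misses.fetch_add(1, std::memory_order_relaxed); }

    // Lifetime hit rate, e.g. 0.923 for the 92.3% shown above
    double hitRate() const {
        auto const h = hits.load();
        auto const m = misses.load();
        return (h + m) == 0 ? 0.0 : double(h) / double(h + m);
    }
};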
void synchronizeNode(SHAMapNodeID nodeID,
                     uint256 const& nodeHash) {
    // Fetch this node (deserialized into a SHAMap tree node)
    auto node = fetch(nodeHash);

    if (auto inner = std::dynamic_pointer_cast<SHAMapInnerNode>(node)) {
        // This is an inner node, so the next accesses are likely
        // its children - prefetch them to warm the cache
        for (int i = 0; i < 16; ++i) {
            uint256 childHash = inner->getChildHash(i);
            if (childHash.isValid()) {
                // Asynchronously prefetch (see sketch below)
                asyncFetch(childHash);
            }
        }
    }
}
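The asyncFetch() used above can be as simple as handing the read to a background worker so the traversal thread never blocks on the backend; jobQueue is a placeholder for whatever thread pool is available:

void asyncFetch(uint256 const& hash) {
    jobQueue.post([this, hash] {
        // The result lands in the cache as a side effect of fetch(),
        // so the synchronization code later finds it already in memory
        fetch(hash);
    });
}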
// Fetch multiple nodes in single operation
std::vector<uint256> hashes = {hash1, hash2, hash3, ...};
auto results = backend->fetchBatch(hashes);
// Reduces backend overhead
// Populates cache efficiently
// Parallelizes I/O operations
// During synchronization, identify missing nodes
std::vector<uint256> missing = getMissingNodes(shamap);
// Request from peers asynchronously
// When they arrive, cache them
// Continue traversal without blocking on network
// This allows pipelining: request more while processing previous results
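A sketch of that pipelining using std::async; requestFromPeer and processNode are hypothetical placeholders for the network request and the verification step:

void syncDeferred(std::vector<uint256> const& missing) {
    // Issue all requests up front so network and disk I/O overlap
    std::vector<std::future<std::shared_ptr<NodeObject>>> pending;
    pending.reserve(missing.size());
    for (auto const& hash : missing)
        pending.push_back(std::async(std::launch::async, requestFromPeer, hash));

    // Process results as they arrive; caching each one means later
    // traversal steps find it in memory instead of re-requesting it
    for (std::size_t i = 0; i < pending.size(); ++i) {
        if (auto obj = pending[i].get()) {
            cache.insert(missing[i], obj);
            processNode(obj);
        }
    }
}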
Access pattern: Recent ledgers frequently accessed
Root hash: checked at consensus
Recent state nodes: queried for transactions
Old historical data: rarely accessed
Cache configuration:
Keep recent ledgers fully cached
Let old ledgers evict to make room
Result: Excellent hit rate for current operations
Access pattern: Missing nodes from network
Need to verify hash chain from root to leaf
Often fetching siblings (related nodes)
May access same node multiple times
Cache strategy:
Smaller cache acceptable (still beneficial)
Prefetch siblings when fetching parent
Use dummy markers to avoid retry storms
Result: Synchronization completes in hours vs days
Access pattern: Back to steady-state recent ledger access
Cache characteristics:
Most-accessed nodes pinned in cache
Hit rate quickly reaches 90%+
Warm cache from prior work
Multiple readers:
Many threads can fetch simultaneously
No contention for cache hits
Scaling: hundreds of concurrent fetches possible
Insert/evict operations:
Exclusive lock for modification
Short-lived (just map operations)
Background eviction: doesn't block fetches
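The earlier sketches used a plain std::mutex for brevity; one way to get these reader-scaling properties is a reader/writer lock. The sketch below uses std::shared_mutex (the exact primitive is an assumption; the text only specifies shared locks for readers):

class ConcurrentNodeCache {
    std::unordered_map<uint256, std::shared_ptr<NodeObject>> mCache;
    mutable std::shared_mutex mMutex;

public:
    // Readers share the lock: concurrent cache hits do not serialize
    std::shared_ptr<NodeObject> get(uint256 const& hash) const {
        std::shared_lock lock(mMutex);
        auto it = mCache.find(hash);
        return it != mCache.end() ? it->second : nullptr;
    }

    // Writers take the exclusive lock, but only for the map operation itself
    void insert(uint256 const& hash, std::shared_ptr<NodeObject> obj) {
        std::unique_lock lock(mMutex);
        mCache.emplace(hash, std::move(obj));
    }
};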
Synchronizing a large subtree:
1. Check root: hash differs (need to sync)
2. Check children: most hashes match, one differs
3. Recurse into different child
4. Check its children: all match
5. Later: receive more nodes
6. Check same grandparent again
7. Redundant check (we already knew it was full)
class SHAMapInnerNode {
public:
    // Generation in which this node was last verified complete
    // (0 = never verified)
    std::uint32_t mFullBelow = 0;
};

class FullBelowCache {
    // Current generation; starts at 1 so a fresh node (mFullBelow == 0)
    // is never mistaken for "known full"
    std::uint32_t mCurrentGeneration = 1;

public:
    bool isKnownFull(SHAMapInnerNode const* node) const {
        return node->mFullBelow == mCurrentGeneration;
    }

    void markFull(SHAMapInnerNode* node) {
        node->mFullBelow = mCurrentGeneration;
    }

    void invalidate() {
        ++mCurrentGeneration;  // Invalidate all prior markings
    }
};
When synchronizing:
Fetch a subtree and verify that all descendants are present
Mark it as "full below" with the current generation ID
A later sync pass checks the generation:
If it matches the current generation: skip this subtree (known complete)
If it differs: re-verify (a new sync has started)
Result: avoids re-traversing known-complete subtrees, a significant speedup in incremental sync scenarios (see the usage sketch below)
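A hypothetical traversal step showing the check in action (haveAllDescendants stands in for whatever completeness test the sync code performs):

void syncSubtree(SHAMapInnerNode* node, FullBelowCache& fullBelow) {
    if (fullBelow.isKnownFull(node))
        return;  // already verified complete in this generation - skip

    // ... fetch missing children and recurse into incomplete branches ...

    if (haveAllDescendants(node))
        fullBelow.markFull(node);  // never re-traverse this subtree during this sync
}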
Without caching: Synchronization would take months
With a ~90% hit rate: Days, or hours with parallel fetching
With a ~95% hit rate plus the optimizations above: Hours
Production systems: Carefully tuned caches running at 92-96% hit rates