Performance & Optimization

Introduction

Cryptography is essential for security, but it comes with computational cost. In a high-throughput blockchain like XRPL, cryptographic operations—signing, verifying, hashing—happen thousands of times per second. Understanding performance characteristics and optimization opportunities is crucial for building efficient systems.

This chapter explores the performance implications of different cryptographic choices and strategies for optimizing without compromising security.

Signature Algorithm Performance

Benchmark Results

// Approximate timings on modern hardware (2023-era CPU)

Operation              secp256k1    ed25519     Winner
─────────────────────────────────────────────────────────
Key generation         ~100 μs      ~50 μs      ed25519
Public key derivation  ~100 μs      ~50 μs      ed25519
Signing                ~200 μs      ~50 μs      ed25519 (4x faster)
Verification           ~500 μs      ~100 μs     ed25519 (5x faster)
Batch verification     N/A          Available   ed25519
─────────────────────────────────────────────────────────
Public key size        33 bytes     33 bytes    Tie
Signature size         ~71 bytes    64 bytes    ed25519

Why Ed25519 is Faster

1. Simpler mathematics:

// secp256k1:
// - Complex curve operations
// - Modular arithmetic with large primes
// - DER encoding/decoding overhead

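// ed25519:
// - Optimized curve (twisted Edwards form of Curve25519)
// - Simpler, complete point-addition formulas (no special cases)
// - Field prime 2^255 - 19 allows fast modular reduction
// - No encoding overhead (raw bytes)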

2. Better caching:

// ed25519 operations fit better in CPU cache
// Fewer memory accesses
// More predictable branching

3. Modern design:

// Ed25519 designed in 2011 with performance in mind
// secp256k1 designed in 2000 before modern optimizations

Verification is the Bottleneck

// In XRPL consensus:
// Every validator verifies EVERY transaction signature
// 1000 tx/s × 50 validators = 50,000 verifications/second network-wide
// (1,000 verifications/second on each validator)

// With secp256k1:
50,000 × 500 μs = 25,000,000 μs = 25 seconds of CPU time network-wide (0.5 s per validator)

// With ed25519:
50,000 × 100 μs = 5,000,000 μs = 5 seconds of CPU time network-wide (0.1 s per validator)

// Ed25519 saves 20 seconds of CPU time per second across the network
// (0.4 s per validator), leaving headroom for higher throughput or more validators
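
The arithmetic above can be reproduced with a small helper. This is a minimal sketch: `verificationCpuSeconds` and the constant names are hypothetical, and the 500 μs and 100 μs figures are the approximate timings from the benchmark table, not measurements.

```cpp
#include <cstdint>

// Hypothetical helper: CPU-seconds spent on signature verification per
// wall-clock second, given a verification rate and a per-signature cost.
constexpr double verificationCpuSeconds(
    std::uint64_t verificationsPerSecond,
    double microsPerVerification)
{
    return verificationsPerSecond * microsPerVerification / 1'000'000.0;
}

// Figures from the text: 1000 tx/s and 50 validators.
constexpr std::uint64_t txPerSecond = 1000;
constexpr std::uint64_t validators = 50;

// Network-wide load: every validator verifies every transaction.
constexpr double networkSecp =
    verificationCpuSeconds(txPerSecond * validators, 500.0);
constexpr double networkEd =
    verificationCpuSeconds(txPerSecond * validators, 100.0);

// Load on a single validator.
constexpr double perValidatorSecp = verificationCpuSeconds(txPerSecond, 500.0);
constexpr double perValidatorEd = verificationCpuSeconds(txPerSecond, 100.0);
```

Separating the network-wide total from the per-validator figure matters when sizing hardware: each validator only pays for its own share of the work.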

When to Use Each Algorithm

Use ed25519:

New accounts (recommended)
High-throughput applications
When performance matters
Modern systems

Use secp256k1:

Compatibility requirements
Existing accounts (can't change)
Cross-chain interoperability
Legacy systems

Hash Function Performance

Benchmark Results

// Throughput on modern 64-bit CPU

Algorithm        Throughput       Notes
────────────────────────────────────────────────────
SHA-512         ~650 MB/s        64-bit optimized
SHA-512-Half    ~650 MB/s        Same (just truncated)
SHA-256         ~450 MB/s        32-bit operations
RIPEMD-160      ~200 MB/s        Older algorithm
RIPESHA         ~200 MB/s        Limited by RIPEMD-160

Why SHA-512-Half?

// On 64-bit processors (software implementations):
SHA-512:  Uses 64-bit operations → fast
SHA-256:  Uses 32-bit operations → slower on 64-bit CPU, unless the CPU has SHA-256 hardware instructions

// SHA-512-Half gives us:
Performance of SHA-512 (~650 MB/s)
Output size of SHA-256 (32 bytes)

// Best of both worlds!
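
The "Half" step is just a truncation: SHA-512-Half keeps the first 32 bytes of the 64-byte SHA-512 digest. The sketch below shows only that step. The type aliases and `takeFirstHalf` are hypothetical names, and the SHA-512 computation itself is assumed to come from a crypto library.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

using Digest512 = std::array<std::uint8_t, 64>;  // full SHA-512 output
using Digest256 = std::array<std::uint8_t, 32>;  // SHA-512-Half output

// Keep the first 32 bytes of a SHA-512 digest. Computing the digest is
// left to a crypto library and is not shown here.
Digest256 takeFirstHalf(Digest512 const& full)
{
    Digest256 half{};
    std::copy_n(full.begin(), half.size(), half.begin());
    return half;
}
```

Because the truncation is a 32-byte copy, its cost is negligible next to the hash computation, which is why the benchmark table lists the same throughput for SHA-512 and SHA-512-Half.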

Hashing Performance Impact

// Transaction ID calculation:
Serialize transaction: ~1 KB
Hash with SHA-512-Half: ~1.5 μs

// Negligible compared to signature verification (100-500 μs)
// Not a bottleneck

Caching Strategies

Public Key Caching

// Problem: Looking up an account's public key in the ledger on every use is expensive
// Solution: Cache account ID → public key mappings

class PublicKeyCache
{
private:
    std::unordered_map<AccountID, PublicKey> cache_;
    std::shared_mutex mutex_;
    size_t maxSize_ = 10000;

public:
    std::optional<PublicKey> get(AccountID const& id)
    {
        std::shared_lock lock(mutex_);
        auto it = cache_.find(id);
        return it != cache_.end() ? std::optional{it->second} : std::nullopt;
    }

    void put(AccountID const& id, PublicKey const& pk)
    {
        std::unique_lock lock(mutex_);

        if (cache_.size() >= maxSize_)
            cache_.clear();  // Simple eviction

        cache_[id] = pk;
    }
};

// Usage:
PublicKey getAccountPublicKey(AccountID const& account)
{
    // Check cache first
    if (auto pk = keyCache.get(account))
        return *pk;

    // Not in cache - derive from ledger
    auto pk = deriveFromLedger(account);

    // Cache for next time
    keyCache.put(account, pk);

    return pk;
}

Benefits:

Avoids repeated derivation
Reduces ledger lookups
Especially beneficial for frequently-used accounts
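
The cache above evicts by clearing everything once it is full, which also throws away the hot entries. One common alternative is least-recently-used (LRU) eviction. The following is a minimal single-threaded sketch, not the implementation used by any XRPL server; `LruCache` is a hypothetical name and locking is left out for brevity.

```cpp
#include <cstddef>
#include <list>
#include <optional>
#include <unordered_map>
#include <utility>

// Minimal LRU cache sketch (hypothetical). Not thread-safe: add locking
// before sharing an instance between threads.
template <class Key, class Value>
class LruCache
{
    using Item = std::pair<Key, Value>;
    std::list<Item> items_;  // front = most recently used
    std::unordered_map<Key, typename std::list<Item>::iterator> index_;
    std::size_t maxSize_;

public:
    explicit LruCache(std::size_t maxSize) : maxSize_(maxSize) {}

    std::optional<Value> get(Key const& key)
    {
        auto it = index_.find(key);
        if (it == index_.end())
            return std::nullopt;
        // Move the hit to the front so it is evicted last.
        items_.splice(items_.begin(), items_, it->second);
        return it->second->second;
    }

    void put(Key const& key, Value const& value)
    {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = value;
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (maxSize_ == 0)
            return;
        if (items_.size() >= maxSize_) {
            // Evict only the least recently used entry.
            index_.erase(items_.back().first);
            items_.pop_back();
        }
        items_.emplace_front(key, value);
        index_[key] = items_.begin();
    }

    std::size_t size() const { return items_.size(); }
};
```

Note that `get` reorders the list, so a thread-safe version needs an exclusive lock for reads as well as writes. That is the trade-off against the simpler clear-on-full cache, whose reads can share a lock.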

Signature Verification Caching

// Problem: Same transaction verified multiple times
// Solution: Cache transaction hash → verification result

class VerificationCache
{
private:
    struct Entry {
        bool valid;
        std::chrono::steady_clock::time_point expiry;
    };

    std::unordered_map<uint256, Entry> cache_;
    std::shared_mutex mutex_;

public:
    std::optional<bool> check(uint256 const& txHash)
    {
        std::shared_lock lock(mutex_);

        auto it = cache_.find(txHash);
        if (it == cache_.end())
            return std::nullopt;

        // Check if expired
        if (std::chrono::steady_clock::now() > it->second.expiry) {
            return std::nullopt;  // Expired
        }

        return it->second.valid;
    }

    void store(uint256 const& txHash, bool valid)
    {
        std::unique_lock lock(mutex_);

        cache_[txHash] = Entry{
            valid,
            std::chrono::steady_clock::now() + std::chrono::minutes(10)
        };
    }
};

// Usage:
bool verifyTransaction(Transaction const& tx)
{
    auto txHash = tx.getHash();

    // Check cache
    if (auto cached = verifyCache.check(txHash))
        return *cached;

    // Not cached - verify
    bool valid = verify(tx.publicKey, tx.data, tx.signature, true);

    // Cache result
    verifyCache.store(txHash, valid);

    return valid;
}

Considerations:

Cache must expire (memory limits)
Expiry time vs hit rate trade-off
Thread safety required
Only cache results produced by an actual verification (never mark a transaction valid before checking it)
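
The first consideration deserves a closer look: the cache above treats expired entries as misses but never removes them, so memory still grows. A minimal sketch of one way to reclaim it follows. `ExpiringResultCache` and `sweep` are hypothetical names, the current time is passed in by the caller so the expiry logic can be exercised without sleeping, and locking is omitted.

```cpp
#include <chrono>
#include <cstddef>
#include <optional>
#include <unordered_map>

// Hypothetical TTL cache sketch. Not thread-safe.
template <class Key>
class ExpiringResultCache
{
public:
    using Clock = std::chrono::steady_clock;

    explicit ExpiringResultCache(Clock::duration ttl) : ttl_(ttl) {}

    void store(Key const& key, bool valid, Clock::time_point now)
    {
        entries_[key] = Entry{valid, now + ttl_};
    }

    std::optional<bool> check(Key const& key, Clock::time_point now)
    {
        auto it = entries_.find(key);
        if (it == entries_.end())
            return std::nullopt;
        if (now > it->second.expiry) {
            entries_.erase(it);  // drop the stale entry instead of keeping it
            return std::nullopt;
        }
        return it->second.valid;
    }

    // Remove every expired entry; intended to be called periodically.
    std::size_t sweep(Clock::time_point now)
    {
        std::size_t removed = 0;
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (now > it->second.expiry) {
                it = entries_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    std::size_t size() const { return entries_.size(); }

private:
    struct Entry
    {
        bool valid;
        Clock::time_point expiry;
    };

    Clock::duration ttl_;
    std::unordered_map<Key, Entry> entries_;
};
```

Erasing inside `check` means lookups mutate the map, so a locked version would need an exclusive lock there. Relying on a periodic `sweep` alone keeps lookups read-only at the cost of holding stale entries a little longer.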

Hash Caching in SHAMap

// Merkle tree nodes cache their hashes
class SHAMapNode
{
private:
    uint256 hash_;
    bool hashValid_ = false;

public:
    uint256 const& getHash()
    {
        if (!hashValid_) {
            hash_ = computeHash();
            hashValid_ = true;
        }
        return hash_;
    }

    void invalidateHash()
    {
        hashValid_ = false;
        // Parent nodes also invalidated (recursively)
    }
};

Benefits:

Avoids recomputing unchanged subtrees
Critical for Merkle tree performance
Cache invalidation on modification

Batch Operations

Batch Signature Verification (Ed25519 Only)

// Ed25519 supports batch verification
// Verify multiple signatures faster than individually

bool verifyBatch(
    std::vector<PublicKey> const& publicKeys,
    std::vector<Slice> const& messages,
    std::vector<Slice> const& signatures)
{
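    // Batch verification algorithm:
    // Combines multiple verification equations into one randomized check
    //
    // Cost still grows with N, but per-signature cost drops to roughly
    // half of an individual verification (about 2× for batches of 64),
    // not a single verification's worth of work for the whole batch
    std::size_t const n = messages.size();
    if (publicKeys.size() != n || signatures.size() != n)
        return false;

    // Assumes the ed25519-donna API, which takes parallel arrays of raw
    // pointers and lengths plus a per-signature result array
    std::vector<unsigned char const*> m(n), pk(n), rs(n);
    std::vector<std::size_t> mlen(n);
    std::vector<int> valid(n);
    for (std::size_t i = 0; i < n; ++i) {
        m[i] = messages[i].data();
        mlen[i] = messages[i].size();
        pk[i] = publicKeys[i].data() + 1;  // skip the 0xED prefix byte
        rs[i] = signatures[i].data();
    }

    // Returns 0 only if every signature in the batch is valid
    return ed25519_sign_open_batch(
        m.data(), mlen.data(), pk.data(), rs.data(), n, valid.data()) == 0;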
}

Benefits:

Roughly 2× lower per-signature cost for large batches
Ideal for transaction processing
Only available for Ed25519

Limitations:

Batch fails if ANY signature is invalid
Must verify individually to find which failed
Requires all same algorithm (ed25519)
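
When a batch fails, checking every signature individually is not the only option. If invalid signatures are rare, bisecting the batch narrows down the bad entries with fewer checks. The sketch below is library-agnostic: `BatchCheck` and `findInvalid` are hypothetical names, and the callback is assumed to batch-verify a half-open index range.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// batchOk(begin, end) is a hypothetical callback that batch-verifies the
// half-open index range [begin, end) and returns true if all entries pass.
using BatchCheck = std::function<bool(std::size_t, std::size_t)>;

// Collect the indices of invalid signatures by recursive bisection.
void findInvalid(
    std::size_t begin,
    std::size_t end,
    BatchCheck const& batchOk,
    std::vector<std::size_t>& invalid)
{
    if (begin >= end || batchOk(begin, end))
        return;  // empty range, or everything in it verified
    if (end - begin == 1) {
        invalid.push_back(begin);  // narrowed down to a single bad entry
        return;
    }
    std::size_t const mid = begin + (end - begin) / 2;
    findInvalid(begin, mid, batchOk, invalid);
    findInvalid(mid, end, batchOk, invalid);
}
```

Bisection pays off only when failures are uncommon. If many signatures in a batch are invalid, the repeated sub-batch checks can cost more than verifying each signature once.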

Batch Hashing

// For hashing multiple items
void hashMultiple(
    std::vector<Slice> const& items,
    std::vector<uint256>& hashes)
{
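    // resize (not reserve) so that hashes[i] below refers to a real element
    hashes.resize(items.size());

    // Option 1: Parallel hashing
    #pragma omp parallel for
    for (size_t i = 0; i < items.size(); ++i) {
        hashes[i] = sha512Half(items[i]);
    }

    // Option 2 (alternative to Option 1): Vectorized hashing, if available
    // Some crypto libraries support SIMD hashing; hashMultipleSIMD is a
    // placeholder for such a routine
    // hashMultipleSIMD(items, hashes);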
}

Parallel Processing

Multi-threaded Verification

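// Verify signatures in parallel
std::vector<bool> verifyParallel(
    std::vector<Transaction> const& transactions)
{
    // Write into std::vector<char>: std::vector<bool> packs bits, so
    // concurrent writes to neighboring elements would be a data race
    std::vector<char> results(transactions.size());

    // Use thread pool
    #pragma omp parallel for
    for (size_t i = 0; i < transactions.size(); ++i) {
        results[i] = verifyTransaction(transactions[i]);
    }

    return std::vector<bool>(results.begin(), results.end());
}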

Considerations:

Cryptographic operations are CPU-bound
Parallelism limited by number of cores
Thread synchronization overhead
Good for batch processing
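
Where OpenMP is not available, the same idea can be expressed with standard threads. The following is a minimal sketch, not a production thread pool: `checkInParallel` is a hypothetical helper that splits the index range into one contiguous chunk per worker, and the callback is assumed to be safe to call concurrently and not to throw.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Run check(i) for every i in [0, count) on up to workerCount threads.
// Results are stored as char (1 = valid, 0 = invalid) because
// std::vector<bool> packs bits and is unsafe for concurrent writes.
std::vector<char> checkInParallel(
    std::size_t count,
    std::function<bool(std::size_t)> const& check,
    std::size_t workerCount = std::thread::hardware_concurrency())
{
    std::vector<char> results(count, 0);
    workerCount = std::max<std::size_t>(1, std::min(workerCount, count));
    std::size_t const chunk = (count + workerCount - 1) / workerCount;

    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < workerCount; ++w) {
        std::size_t const begin = w * chunk;
        std::size_t const end = std::min(count, begin + chunk);
        if (begin >= end)
            break;
        // Each worker owns a disjoint index range, so no locking is needed.
        workers.emplace_back([&results, &check, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                results[i] = check(i) ? 1 : 0;
        });
    }
    for (auto& t : workers)
        t.join();
    return results;
}
```

Starting threads has a cost of its own, so this only helps when each chunk contains enough verifications to outweigh it. A long-lived pool avoids paying that cost on every batch.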

Async Processing

// Verify asynchronously
std::future<bool> verifyAsync(Transaction const& tx)
{
    return std::async(std::launch::async, [tx]() {
        return verifyTransaction(tx);
    });
}

// Usage:
std::vector<std::future<bool>> futures;
for (auto const& tx : transactions) {
    futures.push_back(verifyAsync(tx));
}

// Collect results
for (auto& future : futures) {
    bool valid = future.get();
    // ...
}

Memory Optimization

Signature Size

// Ed25519 signatures are smaller and fixed-size
Signature size:
    secp256k1: 70-72 bytes (variable, DER encoded)
    ed25519:   64 bytes (fixed, raw bytes)

// For 1,000,000 signatures:
secp256k1: ~71 MB
ed25519:   ~64 MB

// Savings: 7 MB (10%)
// Also: Fixed size easier to handle

Public Key Storage

// Compressed public keys
secp256k1: 33 bytes (compressed point)
ed25519:   33 bytes (32-byte key + 0xED prefix byte)

// Both are already stored in compact form
// No further optimization available

Performance Measurement

Profiling

// Measure cryptographic operations
auto measureSign = []() {
    auto [pk, sk] = randomKeyPair(KeyType::ed25519);
    std::vector<uint8_t> message(1000, 0xAA);

    auto start = std::chrono::high_resolution_clock::now();

    for (int i = 0; i < 1000; ++i) {
        auto sig = sign(pk, sk, makeSlice(message));
    }

    auto end = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - start);

    std::cout << "Average sign time: " << duration.count() / 1000.0 << " μs\n";
};
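
The same timing pattern can be factored into a reusable helper so that signing, verification, and hashing are all measured the same way. This is a minimal sketch: `averageMicros` is a hypothetical name, and it uses `std::chrono::steady_clock` because that clock is guaranteed to be monotonic.

```cpp
#include <chrono>
#include <cstddef>
#include <ratio>

// Average wall-clock time per call of op(), in microseconds.
// Hypothetical helper; a few warm-up calls run first to reduce
// cold-cache noise in the measurement.
template <class Op>
double averageMicros(Op&& op, std::size_t iterations, std::size_t warmup = 10)
{
    if (iterations == 0)
        return 0.0;
    for (std::size_t i = 0; i < warmup; ++i)
        op();

    auto const start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
        op();
    auto const elapsed = std::chrono::steady_clock::now() - start;

    return std::chrono::duration<double, std::micro>(elapsed).count() /
        iterations;
}
```

When benchmarking real operations, make sure each result is actually used (for example, accumulated into a value that is printed afterwards). Otherwise an optimizing compiler may remove the work being measured.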

Bottleneck Identification

// Use profiler to find hotspots
// Example output:

Function                     % Total   Notes
───────────────────────────────────────────────
verifyTransaction            45.2%     Critical
  ├─ ed25519_sign_open      42.1%     ← Bottleneck
  └─ sha512Half              2.8%
processLedger                35.1%
  ├─ computeMerkleRoot      20.3%
  └─ serializeTransactions  14.8%

Optimization Guidelines

✅ DO:

Use ed25519 for new accounts

// 4-5× faster than secp256k1
auto [pk, sk] = randomKeyPair(KeyType::ed25519);

Cache frequently-used data

// Public keys, verification results, hashes
cache.get(key);

Batch operations when possible

// Especially for ed25519 batch verification
verifyBatch(pks, messages, sigs);

Profile before optimizing

// Measure actual bottlenecks
// Don't optimize blindly

Use parallel processing for batches

// Utilize multiple cores
#pragma omp parallel for

❌ DON'T:

Don't sacrifice security for speed

// ❌ Skipping canonicality checks
// ❌ Using weak algorithms
// ❌ Reducing key sizes

Don't cache unverified data

// ❌ Caching before verification
// ✅ Cache after verification succeeds

Don't over-optimize negligible operations

// Hashing is fast (~1 μs)
// Focus on signatures (~100-500 μs)

Don't forget thread safety

// Caches need proper locking
// Crypto libraries might not be thread-safe

Real-World Performance

XRPL Mainnet Statistics

// Approximate numbers from XRPL mainnet:

Transactions per ledger: ~50-200
Ledger close time: ~3-5 seconds
Validators: ~35-40

Signature verifications per second (network-wide):
(150 tx/ledger × 40 validators) / 4 seconds = 1,500 verifications/second

With ed25519 (100 μs each):
1,500 × 0.0001s = 0.15 seconds of CPU time per second, network-wide
= 15% of one core in aggregate (~0.4% of one core per validator)

With secp256k1 (500 μs each):
1,500 × 0.0005s = 0.75 seconds of CPU time per second, network-wide
= 75% of one core in aggregate (~1.9% of one core per validator)

At these timings, ed25519 verifies 5× as many signatures with the same CPU!

Summary

Performance optimization in cryptography:

Algorithm choice matters: ed25519 is 4-5× faster than secp256k1
Verification is the bottleneck: Focus optimization here
Caching helps: Public keys, verification results, hashes
Batch operations: Especially for ed25519
Parallel processing: Utilize multiple cores
Profile first: Measure before optimizing
Never sacrifice security: Performance < Security

Key takeaways:

Use ed25519 for new accounts (faster, simpler)
Cache wisely (but verify first)
Batch when possible (ed25519 batch verification)
Profile to find real bottlenecks
Optimize hot paths only
Security always comes first
