Hash Functions in XRPL
Introduction
Hash functions are the workhorses of cryptographic systems. While signatures prove authorization and keys establish identity, hash functions ensure integrity and enable efficient data structures. In XRPL, hash functions are everywhere—transaction IDs, ledger object keys, Merkle trees, address generation, and more.
This chapter explores how XRPL uses hash functions, why specific algorithms were chosen, and how they provide the integrity guarantees the system depends on.
What is a Cryptographic Hash Function?
A cryptographic hash function takes arbitrary input and produces a fixed-size output:
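Input (any size) → Hash Function → Output (fixed size)
"Hello" → sha512Half → 0x7F83B165...
"Hello World!" → sha512Half → 0xA591A6D4...
[1 MB file] → sha512Half → 0x3C9F2A8B...
(The hex values here and in the examples below are illustrative placeholders, not real sha512Half digests.)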
Required Properties
1. Deterministic
sha512Half("Hello") == sha512Half("Hello") // Always true
// Same input always produces same output
2. Fast to Compute
// Hashes hundreds of megabytes per second on a typical server core
auto hash = sha512Half(largeData); // Microseconds for a transaction-sized input
3. Avalanche Effect
sha512Half("Hello") → 0x7F83B165...
sha512Half("Hello!") → 0xC89F3AB2... // Completely different!
// Even a single flipped input bit changes about half of the output bits on average
4. Preimage Resistance (One-Way)
// Given a hash, it is computationally infeasible to find an input that produces it
uint256 hash = 0x7F83B165...;
// No practical way to compute: input = reverse_hash(hash);
5. Collision Resistance
// Cannot find two inputs with same hash
// sha512Half(x) == sha512Half(y) where x != y
// Computationally infeasible
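The avalanche property (item 3 above) can be checked empirically by counting how many bit positions differ between two digests. The following is a minimal sketch, assuming a hypothetical `Digest256` byte array rather than rippled's `uint256`; the digests themselves would come from whatever hash implementation is under test.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 256-bit digest type for illustration (not rippled's uint256).
using Digest256 = std::array<std::uint8_t, 32>;

// Count the bit positions at which two digests differ (Hamming distance).
inline std::size_t differingBits(Digest256 const& a, Digest256 const& b)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        // XOR leaves a 1 in every bit position where the two bytes disagree.
        auto const diff = static_cast<unsigned long long>(a[i] ^ b[i]);
        count += std::bitset<8>(diff).count();
    }
    return count;
}
```

For a hash with good avalanche behavior, digests of two inputs that differ in one bit should give a distance close to 128 out of 256, with some statistical spread from pair to pair.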
SHA-512-Half: The Primary Workhorse
Why SHA-512-Half?
// Not SHA-256, but SHA-512 truncated to 256 bits
template <class... Args>
uint256 sha512Half(Args const&... args)
{
sha512_half_hasher h;
hash_append(h, args...);
return static_cast<typename sha512_half_hasher::result_type>(h);
}
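Why truncate SHA-512 instead of using SHA-256?
Performance on 64-bit processors (approximate figures, hardware-dependent):
SHA-512: Operates on 64-bit words → ~650 MB/s on modern CPUs
SHA-256: Operates on 32-bit words → ~450 MB/s on modern CPUs
SHA-512-Half = SHA-512 speed + SHA-256 output size
In a software implementation on a 64-bit system (which all modern servers are), SHA-512 is faster than SHA-256 despite producing more output: it processes each 128-byte block in 80 rounds, while SHA-256 needs 64 rounds for every 64-byte block. CPUs with dedicated SHA-256 instructions can reverse this ordering. Truncating to 256 bits keeps SHA-512's software throughput while producing the same compact output size as SHA-256.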
Implementation
// From src/libxrpl/protocol/digest.cpp
class sha512_half_hasher
{
private:
SHA512_CTX ctx_;
public:
using result_type = uint256;
sha512_half_hasher()
{
SHA512_Init(&ctx_);
}
void operator()(void const* data, std::size_t size) noexcept
{
SHA512_Update(&ctx_, data, size);
}
operator result_type() noexcept
{
// Compute full SHA-512 (64 bytes)
std::uint8_t digest[64];
SHA512_Final(digest, &ctx_);
// Return first 32 bytes (256 bits)
result_type result;
std::memcpy(result.data(), digest, 32);
return result;
}
};
Usage Throughout XRPL
Transaction IDs:
uint256 STTx::getTransactionID() const
{
    Serializer s;
    s.add32(HashPrefix::transactionID);
    add(s); // Full serialization: the ID covers the signature fields too
    return sha512Half(s.slice());
}
Ledger Object Keys:
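// Simplified: rippled wraps this hash in a Keylet and tags it with a
// 16-bit LedgerNameSpace value rather than a 32-bit HashPrefix
uint256 keylet::account(AccountID const& id)
{
    return sha512Half(
        static_cast<std::uint16_t>(LedgerNameSpace::ACCOUNT),
        id);
}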
Merkle Tree Nodes:
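uint256 SHAMapInnerNode::getHash() const
{
    if (hashValid_)
        return hash_;
    Serializer s;
    s.add32(HashPrefix::ledgerInner); // Domain-separate inner nodes from leaves
    for (auto const& child : children_)
        s.add256(child.getHash());
    // hash_ and hashValid_ must be declared mutable to be updated in a const method
    hash_ = sha512Half(s.slice());
    hashValid_ = true;
    return hash_;
}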
Secure Variant: sha512Half_s
// Secure variant that erases internal state
uint256 sha512Half_s(Slice const& data)
{
    // Note the _s hasher type: unlike sha512_half_hasher, it wipes its context
    sha512_half_hasher_s h;
    h(data.data(), data.size());
    auto result = static_cast<uint256>(h);
    // Hasher destructor securely erases internal state
    // This prevents sensitive data from lingering in memory
    return result;
}
When to use the secure variant:
Hashing secret keys or seeds
Deriving keys from passwords
Any operation involving sensitive data
Why it matters:
// Regular variant
auto hash1 = sha512Half(secretData);
// SHA512_CTX still contains secretData fragments in memory
// Secure variant
auto hash2 = sha512Half_s(secretData);
// SHA512_CTX is securely erased
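The erase-on-destruction idea can be sketched independently of any hash library. The helper and class names below (`secureErase`, `SecretBuffer`) are hypothetical and are not rippled's API; the sketch assumes that writing through a `volatile` pointer is enough to stop the compiler from eliding the stores, which is a common but not formally guaranteed technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: overwrite a buffer with zeros through a volatile
// pointer so the compiler is discouraged from optimizing the stores away.
inline void secureErase(void* data, std::size_t size) noexcept
{
    auto* p = static_cast<volatile unsigned char*>(data);
    for (std::size_t i = 0; i < size; ++i)
        p[i] = 0;
}

// Hypothetical RAII wrapper: owns secret bytes and wipes them on destruction,
// mirroring what the secure hasher does with its internal context.
class SecretBuffer
{
    std::vector<std::uint8_t> bytes_;

public:
    explicit SecretBuffer(std::vector<std::uint8_t> bytes)
        : bytes_(std::move(bytes))
    {
    }

    // Non-copyable so the secret is not silently duplicated.
    SecretBuffer(SecretBuffer const&) = delete;
    SecretBuffer& operator=(SecretBuffer const&) = delete;

    ~SecretBuffer()
    {
        secureErase(bytes_.data(), bytes_.size());
    }

    std::uint8_t const* data() const noexcept { return bytes_.data(); }
    std::size_t size() const noexcept { return bytes_.size(); }
};
```

Production code would normally rely on a vetted primitive such as OpenSSL's `OPENSSL_cleanse` instead of a hand-written loop. The wrapper only protects the copy it owns, so callers still need to wipe any other buffers the secret passed through.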
RIPESHA: Address Generation
The Double Hash
class ripesha_hasher
{
private:
openssl_sha256_hasher sha_;
public:
using result_type = ripemd160_hasher::result_type; // 20 bytes
void operator()(void const* data, std::size_t size) noexcept
{
// First: SHA-256
sha_(data, size);
}
operator result_type() noexcept
{
// Get SHA-256 result (32 bytes)
auto const sha256_digest =
static_cast<openssl_sha256_hasher::result_type>(sha_);
// Second: RIPEMD-160 of the SHA-256
ripemd160_hasher ripe;
ripe(sha256_digest.data(), sha256_digest.size());
return static_cast<result_type>(ripe); // 20 bytes
}
};
Why Two Hash Functions?
1. Defense in Depth
If SHA-256 is broken:
RIPEMD-160 provides second layer
If RIPEMD-160 is broken:
SHA-256 provides protection
Breaking both: requires defeating two independent algorithms
2. Compactness
Public Key: 33 bytes
↓ SHA-256
SHA-256 hash: 32 bytes
↓ RIPEMD-160
Account ID: 20 bytes (40% smaller than public key)
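3. Quantum Resistance (Partial)
Quantum computers may break elliptic curves:
PublicKey → SecretKey (vulnerable)
But no known quantum algorithm efficiently inverts these hashes:
AccountID ↛ PublicKey (still secure)
The protection lasts only while the public key stays unpublished, because an account's first signed transaction reveals its key. For accounts that have not yet signed anything, this provides time to upgrade the system if quantum computers emerge.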
Usage
// Calculate account ID from public key
AccountID calcAccountID(PublicKey const& pk)
{
ripesha_hasher h;
h(pk.data(), pk.size());
return AccountID{static_cast<ripesha_hasher::result_type>(h)};
}
// Calculate node ID from public key
NodeID calcNodeID(PublicKey const& pk)
{
ripesha_hasher h;
h(pk.data(), pk.size());
return NodeID{static_cast<ripesha_hasher::result_type>(h)};
}
SHA-256: Checksum and Encoding
Double SHA-256 for Base58Check
// From src/libxrpl/protocol/tokens.cpp
std::string encodeBase58Token(
    TokenType type,
    void const* token,
    std::size_t size)
{
    // void pointers do not support arithmetic, so view the token as bytes
    auto const* bytes = static_cast<std::uint8_t const*>(token);
    std::vector<std::uint8_t> buffer;
    buffer.push_back(static_cast<std::uint8_t>(type));
    buffer.insert(buffer.end(), bytes, bytes + size);
    // Compute checksum: first 4 bytes of SHA-256(SHA-256(data))
    auto const hash1 = sha256(makeSlice(buffer));
    auto const hash2 = sha256(makeSlice(hash1));
    // Append checksum
    buffer.insert(buffer.end(), hash2.begin(), hash2.begin() + 4);
    // Base58 encode
    return base58Encode(buffer);
}
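Why double SHA-256?
Historical reasons (inherited from early cryptocurrency designs):
Hashing twice blocks length-extension attacks, although a checksum does not strictly need that property
Standard pattern for checksums in Base58Check-style encodings
Well-tested over many years
Checksum properties:
4 bytes = 32 bits = 2^32 possible values
Probability of random corruption matching checksum: 1 in 4,294,967,296
This catches virtually all accidental typos and transcription errors. It does not protect against deliberate tampering, since anyone can recompute the checksum.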
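The checksum arithmetic above can be written out as a short calculation. This is a minimal sketch with hypothetical function names, assuming corruption produces a uniformly random checksum value, which is a modeling assumption and not a measured property.

```cpp
#include <cassert>
#include <cmath>

// Probability that a uniformly random corruption still matches an
// n-bit checksum: 2^-n.
inline double undetectedErrorProbability(int checksumBits)
{
    return std::ldexp(1.0, -checksumBits);  // 1.0 * 2^(-checksumBits)
}

// Expected number of corrupted strings that pass the check, out of `trials`
// independent corruptions.
inline double expectedUndetected(double trials, int checksumBits)
{
    return trials * undetectedErrorProbability(checksumBits);
}
```

With a 32-bit checksum, `undetectedErrorProbability(32)` is exactly 1/4,294,967,296, so roughly one in four billion random corruptions would go unnoticed.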
Hash Prefixes: Domain Separation
// From include/xrpl/protocol/HashPrefix.h
enum class HashPrefix : std::uint32_t
{
transactionID = 0x54584E00, // 'TXN\0'
txSign = 0x53545800, // 'STX\0'
txMultiSign = 0x534D5400, // 'SMT\0'
manifest = 0x4D414E00, // 'MAN\0'
ledgerMaster = 0x4C575200, // 'LWR\0'
ledgerInner = 0x4D494E00, // 'MIN\0'
ledgerLeaf = 0x4D4C4E00, // 'MLN\0'
accountRoot = 0x41525400, // 'ART\0'
};
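Each prefix value in the enum is three ASCII characters packed into the upper three bytes of a 32-bit integer, with the low byte left as zero. A minimal sketch of that packing, using a hypothetical helper name `makeHashPrefix`, lets the compiler verify the constants listed above:

```cpp
#include <cstdint>

// Hypothetical helper: pack three ASCII characters into the top three bytes
// of a 32-bit value, leaving the low byte zero.
constexpr std::uint32_t makeHashPrefix(char a, char b, char c)
{
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8);
}

// Compile-time checks against the values in the enum above.
static_assert(makeHashPrefix('T', 'X', 'N') == 0x54584E00, "transactionID");
static_assert(makeHashPrefix('S', 'T', 'X') == 0x53545800, "txSign");
static_assert(makeHashPrefix('M', 'I', 'N') == 0x4D494E00, "ledgerInner");
```

Keeping the prefixes as readable three-letter tags makes a raw hash input easy to identify in a hex dump, since the first bytes spell out the context.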
Why use prefixes?
Prevent cross-protocol attacks where a hash from one context is used in another:
// Without prefixes (BAD):
hash_tx = SHA512Half(tx_data)
hash_msg = SHA512Half(msg_data)
// If tx_data == msg_data, then hash_tx == hash_msg
// Could cause confusion/attacks
// With prefixes (GOOD):
hash_tx = SHA512Half(PREFIX_TX, tx_data)
hash_msg = SHA512Half(PREFIX_MSG, msg_data)
// Even if tx_data == msg_data, hash_tx != hash_msg
Example Usage
// Transaction ID (covers the whole signed transaction)
uint256 getTransactionID(STTx const& tx)
{
    Serializer s;
    s.add32(HashPrefix::transactionID); // Add prefix first
    tx.add(s); // All fields, signature included
    return sha512Half(s.slice());
}
// Signing data (different prefix, different hash)
uint256 getSigningHash(STTx const& tx)
{
Serializer s;
s.add32(HashPrefix::txSign); // Different prefix
tx.addWithoutSigningFields(s);
return sha512Half(s.slice());
}
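To see why a prefix separates domains, it helps to look at the bytes that actually reach the hash function. The sketch below uses a hypothetical `withPrefix` helper in place of the `Serializer`, and assumes the 32-bit prefix is written in big-endian order before the payload.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical stand-in for "add32(prefix), then append the payload":
// the prefix is written big-endian, followed by the payload bytes.
inline Bytes withPrefix(std::uint32_t prefix, Bytes const& payload)
{
    Bytes out;
    out.reserve(4 + payload.size());
    out.push_back(static_cast<std::uint8_t>(prefix >> 24));
    out.push_back(static_cast<std::uint8_t>(prefix >> 16));
    out.push_back(static_cast<std::uint8_t>(prefix >> 8));
    out.push_back(static_cast<std::uint8_t>(prefix));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}
```

Calling `withPrefix(0x54584E00, payload)` and `withPrefix(0x53545800, payload)` with the same payload yields two byte strings that differ in their first four bytes. Because the hash inputs differ, collision resistance makes matching digests computationally infeasible.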
Incremental Hashing
Hash functions can process data incrementally:
// Instead of hashing all at once:
auto hash = sha512Half(bigData); // Requires the whole input in memory
// Can hash incrementally:
sha512_half_hasher h;
h(chunk1.data(), chunk1.size());
h(chunk2.data(), chunk2.size());
h(chunk3.data(), chunk3.size());
auto streamedHash = static_cast<uint256>(h); // Equals hash when the chunks concatenate to bigData
Benefits:
Stream large files without loading into memory
Hash complex data structures field by field
Avoid copying large inputs into one contiguous buffer
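The key property of the streaming interface is that feeding data in chunks gives the same result as feeding it all at once. The sketch below demonstrates this with a toy hasher that has the same call shape as `sha512_half_hasher`. It uses 64-bit FNV-1a purely to stay dependency-free; FNV-1a is not cryptographic and must never stand in for SHA-512-Half in real code. `ToyHasher`, `hashWhole`, and `hashInChunks` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Toy streaming hasher (64-bit FNV-1a). NOT cryptographic: illustration only.
class ToyHasher
{
    std::uint64_t state_ = 14695981039346656037ULL;  // FNV offset basis

public:
    void operator()(void const* data, std::size_t size) noexcept
    {
        auto const* p = static_cast<unsigned char const*>(data);
        for (std::size_t i = 0; i < size; ++i)
        {
            state_ ^= p[i];
            state_ *= 1099511628211ULL;  // FNV prime
        }
    }

    explicit operator std::uint64_t() const noexcept { return state_; }
};

// Hash the whole input with a single call.
inline std::uint64_t hashWhole(std::string const& s)
{
    ToyHasher h;
    h(s.data(), s.size());
    return static_cast<std::uint64_t>(h);
}

// Hash the same input by feeding fixed-size chunks one after another.
inline std::uint64_t hashInChunks(std::string const& s, std::size_t chunk)
{
    assert(chunk > 0);  // a zero chunk size would never make progress
    ToyHasher h;
    for (std::size_t off = 0; off < s.size(); off += chunk)
    {
        std::size_t const n = std::min(chunk, s.size() - off);
        h(s.data() + off, n);
    }
    return static_cast<std::uint64_t>(h);
}
```

The same equivalence holds for SHA-512, which is what makes it safe to hash a file or a network stream piece by piece without buffering the whole input.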
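Example: Hashing a transaction
// Illustrative only: the field accessors are simplified, and rippled
// serializes all fields in canonical order rather than one by one
Serializer s;
s.add32(HashPrefix::transactionID);
s.addVL(tx.getFieldVL(sfAccount));
s.addVL(tx.getFieldVL(sfDestination));
s.add64(tx.getFieldU64(sfAmount));
// ... more fields ...
auto hash = sha512Half(s.slice());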
Hash Collisions: Why We Don't Worry
Birthday Paradox
The "birthday attack" on a 256-bit hash requires:
Number of hashes to find collision ≈ 2^(256/2) = 2^128
2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456
If you could compute 1 trillion hashes per second:
Time = 2^128 / (10^12) seconds
≈ 3.4 × 10^26 seconds
≈ 10^19 years
(Universe age ≈ 1.4 × 10^10 years)
Conclusion: Collision attacks on SHA-512-Half are not feasible with current or foreseeable technology.
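The birthday-bound estimate is easy to reproduce. This is a minimal sketch with a hypothetical function name, assuming a year of 365.25 days and a constant hash rate:

```cpp
#include <cassert>
#include <cmath>

// Years needed to perform 2^(bits/2) hash evaluations (the birthday bound
// for an n-bit hash) at a constant rate.
inline double yearsToCollision(int hashBits, double hashesPerSecond)
{
    double const work = std::ldexp(1.0, hashBits / 2);          // 2^(n/2)
    double const secondsPerYear = 365.25 * 24.0 * 60.0 * 60.0;  // about 3.156e7
    return work / hashesPerSecond / secondsPerYear;
}
```

`yearsToCollision(256, 1e12)` comes out near 1.08 × 10^19 years, which is about a billion times the age of the universe. Speeding up the attacker by a factor of a million still leaves roughly 10^13 years.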
Collision Resistance in Practice
// XRPL relies on collision resistance for:
// 1. Transaction IDs must be unique
uint256 txID = sha512Half(tx);
// 2. Ledger object keys must not collide (rippled tags them with a LedgerNameSpace value)
uint256 accountKey = sha512Half(LedgerNameSpace::ACCOUNT, accountID);
// 3. Merkle tree integrity
uint256 nodeHash = sha512Half(leftChild, rightChild);
A collision in any of these would be catastrophic, but the probability is negligible.
Performance Considerations
Hashing Speed
// Benchmark results (approximate, hardware-dependent):
SHA-512-Half: ~650 MB/s
SHA-256: ~450 MB/s
RIPEMD-160: ~200 MB/s
For 1 KB transaction:
SHA-512-Half: ~1.5 microseconds
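The per-transaction figure follows directly from the throughput numbers. A minimal sketch of the conversion, with a hypothetical function name, assuming 1 MB = 10^6 bytes and ignoring per-call setup cost:

```cpp
#include <cassert>

// Time to hash `bytes` at a sustained throughput given in MB/s.
// Since 1 MB/s = 10^6 bytes per 10^6 microseconds, a rate of X MB/s
// is the same as X bytes per microsecond.
inline double microsecondsToHash(double bytes, double megabytesPerSecond)
{
    double const bytesPerMicrosecond = megabytesPerSecond;
    return bytes / bytesPerMicrosecond;
}
```

At 650 MB/s, `microsecondsToHash(1024, 650)` is about 1.58 microseconds, in line with the ~1.5 microsecond estimate above. Real measurements for small inputs tend to be somewhat higher because initialization and finalization are not free.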
Caching Hashes
class SHAMapNode
{
private:
    // mutable: getHash() is const but fills the cache on first use
    mutable uint256 hash_;
    mutable bool hashValid_ = false;
public:
    uint256 getHash() const
    {
        if (hashValid_)
            return hash_; // Return cached value
        // Compute hash (expensive)
        hash_ = computeHash();
        hashValid_ = true;
        return hash_;
    }
    void invalidateHash()
    {
        hashValid_ = false; // Force recomputation next time
    }
};
Why cache?
Merkle tree nodes are hashed repeatedly
Caching avoids redundant computation
Invalidate when node contents change
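The caching pattern can be exercised in isolation to confirm that the expensive computation runs only when needed. The sketch below uses a hypothetical `CachedNode` class with a call counter; the multiplication inside `computeHash` is a placeholder for a real hash and has no cryptographic value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical node with a lazily computed, cached hash.
class CachedNode
{
    std::uint64_t payload_;
    // mutable so the const accessor can fill the cache
    mutable std::uint64_t hash_ = 0;
    mutable bool hashValid_ = false;
    mutable int computeCount_ = 0;

    // Placeholder for an expensive hash. NOT cryptographic.
    std::uint64_t computeHash() const
    {
        ++computeCount_;
        return payload_ * 0x9E3779B97F4A7C15ULL;
    }

public:
    explicit CachedNode(std::uint64_t payload) : payload_(payload) {}

    std::uint64_t getHash() const
    {
        if (!hashValid_)
        {
            hash_ = computeHash();
            hashValid_ = true;
        }
        return hash_;
    }

    // Any change to the contents invalidates the cached hash.
    void setPayload(std::uint64_t payload)
    {
        payload_ = payload;
        hashValid_ = false;
    }

    // Number of times the hash was actually computed.
    int computeCount() const { return computeCount_; }
};
```

Calling `getHash()` repeatedly computes the value once; after `setPayload`, the next `getHash()` recomputes it. In a Merkle tree the invalidation also has to propagate to every ancestor of the changed node, which this single-node sketch leaves out.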
Hash Function Summary
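| Algorithm | Output size | Throughput (approx.) | Used for |
|---|---|---|---|
| SHA-512-Half | 256 bits | ~650 MB/s | Transaction IDs, object keys, Merkle trees |
| SHA-256 | 256 bits | ~450 MB/s | Base58Check checksums |
| RIPEMD-160 | 160 bits | ~200 MB/s | Part of RIPESHA (address generation) |
| RIPESHA | 160 bits | ~300 MB/s | Account IDs, node IDs |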
Best Practices
✅ DO:
Use sha512Half for new protocols
uint256 hash = sha512Half(data); // Fast and standard
Use hash prefixes for domain separation
uint256 hash = sha512Half(HashPrefix::custom, data); // custom: placeholder for a prefix reserved for your data type
Cache computed hashes when appropriate
if (cached)
    return cachedHash;
cachedHash = sha512Half(data);
cached = true; // Mark the cache as filled
return cachedHash;
Use secure variant for sensitive data
uint256 hash = sha512Half_s(secretData);
❌ DON'T:
Don't use non-cryptographic hashes for security
std::hash<std::string>{}(data); // ❌ NOT SECURE
Don't implement your own hash function
uint32_t myHash(Slice const& data) { /* ... */ } // ❌ Don't do this
Don't assume hashes are unique without checking
// Even though collisions are infeasible, handle errors gracefully
if (hashExists(newHash))
    handleCollision(); // Paranoid but correct
Summary
Hash functions in XRPL provide:
Integrity: Detect any data modification
Identification: Unique IDs for transactions and objects
Efficiency: Fast computation on modern CPUs
Security: Collision and preimage resistance
Key algorithms:
SHA-512-Half: Primary hash (fast on 64-bit systems)
RIPESHA: Address generation (compact, defense in depth)
SHA-256: Checksums (standard, well-tested)
Usage patterns:
Always use hash prefixes for domain separation
Cache hashes when recomputed frequently
Use secure variants for sensitive data
Trust collision resistance but code defensively
In the next chapter, we'll explore Base58Check encoding and how XRPL makes binary data human-readable.