Base58Check Encoding

Introduction

Cryptographic data is fundamentally binary—sequences of bytes with values from 0 to 255. But humans don't work well with binary data. We mistype it, confuse similar characters, and struggle to verify it. Base58Check encoding solves this problem by converting binary data into human-friendly strings that are easier to read, type, and verify.

This chapter explores how XRPL uses Base58Check encoding to create readable addresses, why certain characters are excluded, and how checksums provide error detection.

The Problem with Raw Binary

Consider an account ID in different formats:

Binary (20 bytes):
10001011 10001010 01101100 01010011 00111111 ...

Hexadecimal:
8B8A6C533F09CA0E5E00E7C32AA7EC323485ED3F

Base58Check:
rN7n7otQDd6FczFgLdlqtyMVrn3LNU8B4C

Problems with hex:

Easy to mistype: 8B8A vs 8B8B
Visually similar characters: 0 (zero) vs O (letter O)
No error detection: One wrong character, wrong address
Not compact: 40 characters for 20 bytes

Base58Check solutions:

Excludes confusing characters
Includes checksum (detects errors)
More compact: 34 characters for 20 bytes + checksum
URL-safe (no special characters)

The Base58 Alphabet

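// Base58 alphabet - 58 unambiguous characters, shown in the standard
// (Bitcoin) ordering. XRPL uses the same 58 characters in its own order,
// "rpshnaf39wBUDNEGHJKLM4PQRST7VWXYZ2bcdeCg65jkm8oFqi1tvuAxyz",
// so the digit with value 0 is 'r' rather than '1'.
static const char* BASE58_ALPHABET =
    "123456789"                    // Digits (no 0)
    "ABCDEFGHJKLMNPQRSTUVWXYZ"    // Uppercase (no I, O)
    "abcdefghijkmnopqrstuvwxyz";  // Lowercase (no l)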

Excluded characters:

0 (zero)        - Looks like O (letter O)
O (letter O)    - Looks like 0 (zero)
I (letter I)    - Looks like l (lowercase L) or 1
l (lowercase L) - Looks like I (letter I) or 1

These exclusions prevent human transcription errors.

Included: 58 characters

Digits:     1 2 3 4 5 6 7 8 9 (9 characters)
Uppercase:  A B C D E F G H J K L M N P Q R S T U V W X Y Z (24 characters)
Lowercase:  a b c d e f g h i j k m n o p q r s t u v w x y z (25 characters)
Total:      58 characters
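
A cheap first check on user input is to confirm that every character belongs to the alphabet. The following is a minimal sketch, assuming C++17; the names kBase58Chars and isBase58String are hypothetical and not part of any XRPL library. It checks set membership only, so it applies equally to any ordering of the same 58 characters.

```cpp
#include <string_view>

// The same 58 characters as BASE58_ALPHABET (assumed name for this sketch).
constexpr std::string_view kBase58Chars =
    "123456789"
    "ABCDEFGHJKLMNPQRSTUVWXYZ"
    "abcdefghijkmnopqrstuvwxyz";

// Hypothetical helper: true if the string is non-empty and every
// character is one of the 58 allowed characters.
bool isBase58String(std::string_view s)
{
    if (s.empty())
        return false;
    for (char c : s) {
        if (kBase58Chars.find(c) == std::string_view::npos)
            return false;  // e.g. '0', 'O', 'I', 'l' or punctuation
    }
    return true;
}
```

Passing this check does not make a string a valid address; it only rules out characters that can never appear in one.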

Base58 Encoding Algorithm

Base58 encoding is a change of base, in the same way that hexadecimal is base 16. Using the alphabet above, the number 255 looks like this:

Decimal:    255 = 2×100 + 5×10 + 5×1
Hex:        FF  = 15×16 + 15×1
Base58:     5Q  = 4×58 + 23×1    (digit 4 → '5', digit 23 → 'Q')

The Mathematics

// Conceptually: treat byte array as big integer
std::vector<uint8_t> input = {0x8B, 0x8A, ...};

// Convert to big integer
BigInt value = 0;
for (uint8_t byte : input)
    value = value * 256 + byte;

// Convert to base58
std::string result;
while (value > 0) {
    int remainder = value % 58;
    result = BASE58_ALPHABET[remainder] + result;
    value = value / 58;
}
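
The loop above relies on a big-integer type. For inputs of at most eight bytes a uint64_t is enough, which makes the arithmetic easy to try out. This is a minimal sketch under that assumption; base58Small and kAlphabet are hypothetical names, the alphabet is the standard ordering shown earlier, and leading zero bytes are deliberately not handled here.

```cpp
#include <cstdint>
#include <string>
#include <vector>

static const char* const kAlphabet =
    "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Hypothetical helper: Base58 digits of an input of at most 8 bytes.
// Longer inputs would overflow the 64-bit accumulator.
std::string base58Small(std::vector<std::uint8_t> const& input)
{
    // Interpret the bytes as one big-endian integer
    std::uint64_t value = 0;
    for (std::uint8_t byte : input)
        value = value * 256 + byte;

    // Peel off base-58 digits, least significant first
    std::string result;
    while (value > 0) {
        result.insert(result.begin(), kAlphabet[value % 58]);
        value /= 58;
    }
    return result;
}
```

With this sketch, the single byte 0xFF (255 = 4×58 + 23) yields "5Q", and the three bytes 62 62 62 yield "a3gV".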

Handling Leading Zeros

// Special case: preserve leading zero bytes as '1' characters
for (uint8_t byte : input) {
    if (byte == 0)
        result = '1' + result;
    else
        break;
}

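Leading zero bytes do not change the integer value, so the conversion loop alone would silently drop them. Emitting one '1' (the alphabet's zero digit) per leading zero byte ensures the encoding is one-to-one: every distinct byte sequence produces a distinct string.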

Implementation

// From src/libxrpl/protocol/tokens.cpp (simplified)
std::string base58Encode(std::vector<uint8_t> const& input)
{
    // Skip leading zeros, but count them
    int leadingZeros = 0;
    for (auto byte : input) {
        if (byte == 0)
            ++leadingZeros;
        else
            break;
    }

    // Allocate output buffer (worst case size)
    std::vector<uint8_t> b58(input.size() * 138 / 100 + 1);

    // Process the bytes
    for (auto byte : input) {
        int carry = byte;
        for (auto it = b58.rbegin(); it != b58.rend(); ++it) {
            carry += 256 * (*it);
            *it = carry % 58;
            carry /= 58;
        }
    }

    // Convert to string: one '1' per leading zero byte of the input...
    std::string result(leadingZeros, '1');

    // ...then the significant digits, skipping the buffer's zero padding
    bool skipping = true;
    for (auto value : b58) {
        if (skipping && value == 0)
            continue;
        skipping = false;
        result += BASE58_ALPHABET[value];
    }

    return result;  // Empty input encodes to an empty string
}
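
Decoding reverses the same arithmetic: each character is mapped back to its digit value and folded into a base-256 accumulator. The following is a minimal sketch of that inverse, not the library's implementation; base58DecodeSketch and kAlphabet are hypothetical names, and the alphabet is the standard ordering shown earlier.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

static const std::string kAlphabet =
    "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Hypothetical helper: returns std::nullopt if the input contains a
// character outside the alphabet.
std::optional<std::vector<std::uint8_t>>
base58DecodeSketch(std::string const& s)
{
    // Each leading '1' stands for one leading zero byte
    std::size_t leadingOnes = 0;
    while (leadingOnes < s.size() && s[leadingOnes] == '1')
        ++leadingOnes;

    // Base-256 accumulator; n characters never need more than n bytes
    std::vector<std::uint8_t> bytes(s.size(), 0);
    for (char c : s) {
        auto const pos = kAlphabet.find(c);
        if (pos == std::string::npos)
            return std::nullopt;

        // bytes = bytes * 58 + digit
        unsigned carry = static_cast<unsigned>(pos);
        for (auto it = bytes.rbegin(); it != bytes.rend(); ++it) {
            carry += 58u * (*it);
            *it = static_cast<std::uint8_t>(carry % 256);
            carry /= 256;
        }
    }

    // Drop the accumulator's zero padding, then restore the leading zeros
    auto first = bytes.begin();
    while (first != bytes.end() && *first == 0)
        ++first;
    std::vector<std::uint8_t> result(leadingOnes, 0);
    result.insert(result.end(), first, bytes.end());
    return result;
}
```

Returning std::optional keeps "invalid input" distinct from "valid but empty", which a plain empty vector cannot express.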

Base58Check: Adding Error Detection

Base58 alone doesn't detect errors. Base58Check adds a checksum:

Structure (values shown for an account address):
[Type Byte] [Payload] [Checksum (4 bytes)]
     ↓          ↓           ↓
   0x00     20 bytes    first 4 bytes of SHA256(SHA256(type byte + payload))

Encoding Process

// From src/libxrpl/protocol/tokens.cpp
std::string encodeBase58Token(
    TokenType type,
    void const* token,
    std::size_t size)
{
    std::vector<uint8_t> buffer;
    buffer.reserve(1 + size + 4);

    // Step 1: Add type prefix
    buffer.push_back(static_cast<uint8_t>(type));

    // Step 2: Add payload
    auto const* tokenBytes = static_cast<uint8_t const*>(token);
    buffer.insert(buffer.end(), tokenBytes, tokenBytes + size);

    // Step 3: Compute checksum
    // First SHA-256
    auto const hash1 = sha256(makeSlice(buffer));
    // Second SHA-256
    auto const hash2 = sha256(makeSlice(hash1));

    // Step 4: Append first 4 bytes of second hash as checksum
    buffer.insert(buffer.end(), hash2.begin(), hash2.begin() + 4);

    // Step 5: Base58 encode everything
    return base58Encode(buffer);
}
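
The checksum logic can be isolated from both the hash and the Base58 step. The sketch below takes the hash as a parameter so that it stays self-contained; in Base58Check that hash is SHA-256, which a real program would take from a cryptographic library. The names Bytes, HashFn, appendChecksum and checksumMatches are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
// Assumption: the supplied hash returns at least 4 bytes (SHA-256 returns 32).
using HashFn = std::function<Bytes(Bytes const&)>;

// Hypothetical helper: returns type || payload || checksum, where the
// checksum is the first 4 bytes of hash(hash(type || payload)).
Bytes appendChecksum(std::uint8_t type, Bytes const& payload, HashFn const& hash)
{
    Bytes buffer;
    buffer.reserve(1 + payload.size() + 4);
    buffer.push_back(type);
    buffer.insert(buffer.end(), payload.begin(), payload.end());

    Bytes const digest = hash(hash(buffer));
    buffer.insert(buffer.end(), digest.begin(), digest.begin() + 4);
    return buffer;
}

// Hypothetical helper: recomputes the checksum over everything except
// the last 4 bytes and compares it with the stored checksum.
bool checksumMatches(Bytes const& data, HashFn const& hash)
{
    if (data.size() < 5)  // type byte + 4 checksum bytes at minimum
        return false;
    Bytes const body(data.begin(), data.end() - 4);
    Bytes const digest = hash(hash(body));
    return std::equal(digest.begin(), digest.begin() + 4, data.end() - 4);
}
```

Injecting the hash also makes the framing easy to unit-test with a deterministic stand-in before wiring up a real SHA-256.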

Token Types

enum class TokenType : std::uint8_t {
    None            = 1,
    NodePublic      = 28,   // Node public keys:  starts with 'n'
    NodePrivate     = 32,   // Node private keys
    AccountID       = 0,    // Account addresses: starts with 'r'
    AccountPublic   = 35,   // Account public keys: starts with 'a'
    AccountSecret   = 34,   // Account secret keys (deprecated)
    FamilySeed      = 33,   // Seeds: starts with 's'
};

The type byte, together with the fixed payload length for that type, determines the first character of the encoded result:

Type 0  (AccountID)     → starts with 'r'
Type 33 (FamilySeed)    → starts with 's'
Type 28 (NodePublic)    → starts with 'n'
Type 35 (AccountPublic) → starts with 'a'

This provides visual identification of what kind of data you're looking at.
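
The prefix table can be turned into a quick pre-filter that runs before a full decode. This is a minimal sketch, assuming C++17; expectedFirstChar and hasExpectedPrefix are hypothetical helpers, and the enum repeats only the four types listed in the table. A matching prefix is a hint, not a validation: the checksum still has to be verified.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Subset of the TokenType values listed above
enum class TokenType : std::uint8_t {
    AccountID = 0,
    NodePublic = 28,
    FamilySeed = 33,
    AccountPublic = 35,
};

// Hypothetical helper: first character expected for each token type
std::optional<char> expectedFirstChar(TokenType type)
{
    switch (type) {
        case TokenType::AccountID:     return 'r';
        case TokenType::FamilySeed:    return 's';
        case TokenType::NodePublic:    return 'n';
        case TokenType::AccountPublic: return 'a';
    }
    return std::nullopt;  // value outside the table
}

// Hypothetical helper: cheap check that the string could be of this type
bool hasExpectedPrefix(std::string_view s, TokenType type)
{
    auto const expected = expectedFirstChar(type);
    return expected && !s.empty() && s.front() == *expected;
}
```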

Decoding and Validation

std::string decodeBase58Token(
    std::string const& s,
    TokenType type)
{
    // Step 1: Decode from Base58
    auto const decoded = base58Decode(s);
    if (decoded.empty())
        return {};  // Invalid Base58

    // Step 2: Check minimum size (type + checksum = 5 bytes minimum)
    if (decoded.size() < 5)
        return {};

    // Step 3: Verify type byte matches
    if (decoded[0] != static_cast<uint8_t>(type))
        return {};  // Wrong type

    // Step 4: Verify checksum
    auto const dataEnd = decoded.end() - 4;  // Last 4 bytes are checksum
    auto const providedChecksum = Slice{dataEnd, decoded.end()};

    // Recompute checksum
    auto const hash1 = sha256(makeSlice(decoded.begin(), dataEnd));
    auto const hash2 = sha256(makeSlice(hash1));
    auto const computedChecksum = Slice{hash2.begin(), hash2.begin() + 4};

    // Compare
    if (!std::equal(
            providedChecksum.begin(),
            providedChecksum.end(),
            computedChecksum.begin()))
        return {};  // Checksum mismatch

    // Step 5: Return payload (skip type byte and checksum)
    return std::string(decoded.begin() + 1, dataEnd);
}

Error Detection

The 4-byte (32-bit) checksum provides strong error detection:

Probability of random error passing checksum:
1 / 2^32 = 1 / 4,294,967,296

Approximately: 1 in 4.3 billion

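Types of errors detected:

Single character typos
Transpositions
Missing characters
Extra characters
Random corruption

Because the checksum is a truncated hash, none of these is caught with certainty. Any given error slips through with a probability of about 1 in 2^32, so roughly 99.9999999767% of errors are detected.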
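
The figure follows directly from the checksum length. A minimal sketch of the arithmetic, assuming the hash output behaves like uniformly random bits; undetectedErrorProbability is a hypothetical name:

```cpp
#include <cmath>

// Probability that a corrupted token still matches an n-byte checksum,
// assuming every checksum value is equally likely: 2^(-8n).
double undetectedErrorProbability(int checksumBytes)
{
    return std::ldexp(1.0, -8 * checksumBytes);
}
```

For the 4-byte checksum used here this gives 1 / 4,294,967,296. Each additional checksum byte would lower the probability by a factor of 256 at the cost of one or two more encoded characters.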

Complete Example: Account Address

// Start with public key
PublicKey pk = /* ed25519 public key */;
// ED9434799226374926EDA3B54B1B461B4ABF7237962EEB1144C10A7CA6A9D32C64

// Step 1: Calculate account ID (RIPEMD-160 of the SHA-256 of the public key)
AccountID accountID = calcAccountID(pk);
// 8B8A6C533F09CA0E5E00E7C32AA7EC323485ED3F (20 bytes)

// Step 2: Encode as Base58Check address
std::string address = toBase58(accountID);
// rN7n7otQDd6FczFgLdlqtyMVrn3LNU8B4C

// Encoding breakdown:
// 1. Prepend type byte 0x00
//    008B8A6C533F09CA0E5E00E7C32AA7EC323485ED3F
//
// 2. Compute checksum:
//    First SHA-256 (of the bytes from step 1): 7C9B2F8F...
//    Second SHA-256 (of the first hash):       3D4B8E9C...
//    Take first 4 bytes: 3D4B8E9C
//
// 3. Append checksum:
//    008B8A6C533F09CA0E5E00E7C32AA7EC323485ED3F3D4B8E9C
//
// 4. Base58 encode:
//    rN7n7otQDd6FczFgLdlqtyMVrn3LNU8B4C

Seeds and Human Readability

Seeds can be encoded in two formats:

Base58Check Format

Seed seed = generateRandomSeed();
std::string b58 = toBase58(seed);
// Example: sp5fghtJtpUorTwvof1NpDXAzNwf5

Properties:

Compact (29 characters for a 16-byte seed, as in the example above)
Checksum for error detection
Safe to copy-paste

RFC 1751 Word Format

std::string words = seedAs1751(seed);
// Example: "MAD WARM EVEN SHOW BALK FELT TOY STIR OBOE COST HOPE VAIN"

Properties:

12 words from a dictionary
Easier to write down by hand
Easier to read aloud (for backup)
Two parity bits per six-word group, carried in the last word of each group

Practical Usage

Creating an Account

// Generate key pair
auto [publicKey, secretKey] = randomKeyPair(KeyType::ed25519);

// Derive account ID
AccountID accountID = calcAccountID(publicKey);

// Encode as address
std::string address = toBase58(accountID);

std::cout << "Your XRPL address: " << address << "\n";
// Your XRPL address: rN7n7otQDd6FczFgLdlqtyMVrn3LNU8B4C

Validating User Input

bool isValidAddress(std::string const& address)
{
    // Try to decode
    auto decoded = decodeBase58Token(address, TokenType::AccountID);

    // Valid if:
    // 1. Decoding succeeded
    // 2. Payload is correct size (20 bytes)
    return !decoded.empty() && decoded.size() == 20;
}

// Usage
if (!isValidAddress(userInput)) {
    std::cerr << "Invalid XRPL address\n";
    return;
}

Parsing Different Token Types

std::optional<PublicKey> parsePublicKey(std::string const& s)
{
    // Try AccountPublic type (starts with 'a')
    if (s[0] == 'a') {
        auto decoded = decodeBase58Token(s, TokenType::AccountPublic);
        if (!decoded.empty())
            return PublicKey{makeSlice(decoded)};
    }

    // Try NodePublic type (starts with 'n')
    if (s[0] == 'n') {
        auto decoded = decodeBase58Token(s, TokenType::NodePublic);
        if (!decoded.empty())
            return PublicKey{makeSlice(decoded)};
    }

    return std::nullopt;  // Invalid
}

Comparison with Other Encodings

Encoding      Characters  Case-Sensitive  Checksum       Compact      URL-Safe
Hex           16          No              No             No (2×)      Yes
Base64        64          Yes             No             Yes (1.33×)  No (+, /)
Base58        58          Yes             No             Yes (1.37×)  Yes
Base58Check   58          Yes             Yes (4 bytes)  Yes (1.37×)  Yes

Base58Check wins for:

Human readability (no confusing characters)
Error detection (checksum)
URL safety (no special characters)
Blockchain addresses
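
The expansion factors quoted in the comparison (2×, 1.33×, 1.37×) can be reproduced with a few lines of arithmetic. This is a minimal sketch with hypothetical function names; the Base58 figure is an upper bound, since the exact length depends on the value being encoded.

```cpp
#include <cmath>
#include <cstddef>

// Hex: two characters per byte
std::size_t hexLength(std::size_t n) { return 2 * n; }

// Base64 with '=' padding: four characters per group of three bytes
std::size_t base64Length(std::size_t n) { return 4 * ((n + 2) / 3); }

// Base58: at most ceil(n * log(256) / log(58)) characters,
// i.e. about 1.37 characters per byte
std::size_t base58MaxLength(std::size_t n)
{
    double const charsPerByte = std::log(256.0) / std::log(58.0);
    return static_cast<std::size_t>(std::ceil(n * charsPerByte));
}
```

For a 20-byte account ID this gives 40 hex characters and 28 Base64 characters. The Base58Check form encodes 25 bytes (type byte, 20-byte payload, 4-byte checksum), for which the bound is 35 characters.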

Common Pitfalls

❌ Typos Without Validation

// User types address wrong
std::string userAddress = "rN7n7otQDd6FczFgLdlqtyMVrn3LNU8B4D";  // Last char wrong

// Send funds without validation
sendPayment(userAddress, amount);  // WRONG ADDRESS!

Solution:

if (!isValidAddress(userAddress)) {
    throw std::runtime_error("Invalid address - check for typos");
}

❌ Assuming Anything That Starts with 'r' Is an Address

// ❌ WRONG
bool isAddress(std::string const& s) {
    return s[0] == 'r';  // Checks the prefix only, not the length or checksum
}

Solution:

// ✅ CORRECT
bool isAddress(std::string const& s) {
    return !decodeBase58Token(s, TokenType::AccountID).empty();
}

❌ Manual Base58 Implementation

// ❌ WRONG - Don't implement yourself
std::string myBase58Encode(/* ... */) {
    // Custom implementation - likely has bugs
}

Solution:

// ✅ CORRECT - Use library functions
std::string encoded = encodeBase58Token(type, data, size);

Performance Considerations

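// Base58 encoding is slower than hex: hex is a single linear pass over
// the bytes, while Base58 does work quadratic in the input length.
// Rough, illustrative figures for a short token:
// Hex encoding:     ~1 microsecond
// Base58 encoding:  ~10 microseconds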

// But this doesn't matter for user-facing operations:
// - Displaying addresses: once per UI render
// - Parsing user input: once per input
// - Not a bottleneck in practice

When performance matters:

// For internal storage and processing, use binary:
AccountID accountID;  // 20 bytes, fast comparisons

// Only encode to Base58 when presenting to users:
std::string address = toBase58(accountID);  // For display only

Summary

Base58Check encoding makes binary cryptographic data human-friendly:

Excludes confusing characters: No 0, O, I, l
Includes checksum: 4-byte SHA-256(SHA-256(...)) checksum
Type prefixes: Different first characters for different data types
Error detection: ~99.9999999767% of errors detected
URL-safe: No special characters
Compact: ~37% overhead vs 100% for hex

Usage in XRPL:

Account addresses: starts with 'r'
Seeds: starts with 's'
Public keys: starts with 'a' or 'n'

Best practices:

Always validate before using
Use library functions, don't implement yourself
Store binary internally, encode only for display
Provide clear error messages for invalid input

In the next chapter, we'll explore how cryptography secures peer-to-peer communication in the XRPL network.
