Overlay Network: Peer-to-Peer Networking Layer



Introduction

The Overlay Network is Rippled's peer-to-peer networking layer that enables distributed nodes to discover each other, establish connections, and communicate efficiently. Without the overlay network, the XRP Ledger would be a collection of isolated servers—the overlay network is what transforms individual nodes into a cohesive, decentralized system.

Understanding the overlay network is essential for debugging connectivity issues, optimizing network performance, and ensuring your node participates effectively in the XRP Ledger network. Whether you're running a validator, a stock server, or developing network enhancements, deep knowledge of the overlay network is crucial.


Network Topology and Architecture

Mesh Network Design

The XRP Ledger uses a mesh topology where nodes maintain direct connections with multiple peers. This differs from:

  • Star topology: Central hub (single point of failure)

  • Ring topology: Sequential connections (vulnerable to breaks)

  • Tree topology: Hierarchical structure (root node critical)

Mesh Advantages:

  • No single point of failure: Network remains operational if individual nodes fail

  • Multiple communication paths: Messages can route around failed nodes

  • Scalability: Network can grow organically as nodes join

  • Resilience: Network topology self-heals as nodes enter and exit

Network Layers

┌─────────────────────────────────────────────┐
│         Application Layer                   │
│  (Consensus, Transactions, Ledger)          │
├─────────────────────────────────────────────┤
│         Overlay Network Layer               │
│  (Peer Discovery, Connection Mgmt,          │
│   Message Routing)                          │
├─────────────────────────────────────────────┤
│         Transport Layer (TCP/TLS)           │
├─────────────────────────────────────────────┤
│         Internet Layer (IP)                 │
└─────────────────────────────────────────────┘

The overlay network sits between the application logic and the transport layer, abstracting away the complexities of peer-to-peer communication.

Connection Types

Rippled maintains three types of peer connections:

1. Outbound Connections

Definition: Connections initiated by your node to other peers

Characteristics:

  • Your node acts as client

  • You choose which peers to connect to

  • Configurable connection limits

  • Active connection management

Configuration:

[ips]
# DNS or IP addresses to connect to
r.ripple.com 51235
s1.ripple.com 51235
s2.ripple.com 51235

2. Inbound Connections

Definition: Connections initiated by other nodes to your server

Characteristics:

  • Your node acts as server

  • Must listen on public interface

  • Accept connections from unknown peers

  • Subject to connection limits

Configuration:

[port_peer]
port = 51235
ip = 0.0.0.0      # Listen on all interfaces
protocol = peer

3. Fixed Connections

Definition: Persistent connections to trusted peers

Characteristics:

  • High priority, always maintained

  • Automatically reconnect if disconnected

  • Bypass some connection limits

  • Ideal for validators and cluster peers

Configuration:

[ips_fixed]
# Always maintain connections to these peers
validator1.example.com 51235
validator2.example.com 51235
cluster-peer.example.com 51235

Target Connection Count

Rippled aims to maintain a target number of active peer connections:

Default Targets (based on node_size):

tiny:    10 peers
small:   15 peers
medium:  20 peers (default)
large:   30 peers
huge:    40 peers

Connection Distribution:

  • Approximately 50% outbound connections

  • Approximately 50% inbound connections

  • Fixed connections count toward total

  • System adjusts dynamically to maintain target
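The roughly even split described above can be sketched as a small helper. This is purely illustrative (rippled's PeerFinder applies its own tuning logic); the `splitTarget` function and `Slots` struct are names invented for this sketch:

```cpp
#include <cstddef>

// Illustrative split of a target peer count into outbound/inbound slots,
// roughly half each as described above. Not rippled's actual logic.
struct Slots
{
    std::size_t outbound;
    std::size_t inbound;
};

Slots splitTarget(std::size_t target)
{
    Slots s;
    s.outbound = target / 2;          // ~50% outbound
    s.inbound = target - s.outbound;  // remainder available for inbound
    return s;
}
```

For the default medium node_size (target 20), this yields 10 outbound and 10 inbound slots; odd targets give the extra slot to inbound.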


Peer Discovery Mechanisms

1. Configured Peer Lists

The most basic discovery method—manually configured peers:

[ips] Section: Peers to connect to automatically

[ips]
r.ripple.com 51235
s1.ripple.com 51235
validator.example.com 51235

[ips_fixed] Section: High-priority persistent connections

[ips_fixed]
critical-peer.example.com 51235

Advantages:

  • Reliable, known peers

  • Administrative control

  • Suitable for private networks

Disadvantages:

  • Manual maintenance required

  • Limited to configured peers

  • Doesn't scale automatically

2. DNS Seeds

DNS-based peer discovery for bootstrap:

How It Works:

  1. Node queries DNS for peer addresses

  2. DNS returns A records (IP addresses)

  3. Node connects to returned addresses

  4. Learns about additional peers through gossip

Configuration:

[ips]
# These resolve via DNS
r.ripple.com 51235
s1.ripple.com 51235

DNS Resolution Example:

$ dig +short r.ripple.com
54.186.73.52
54.184.149.41
52.24.169.78

Advantages:

  • Easy bootstrap for new nodes

  • Dynamic peer lists

  • Load balancing via DNS

Disadvantages:

  • Requires DNS infrastructure

  • Vulnerable to DNS attacks

  • Single point of failure for initial connection

3. Peer Gossip Protocol

Peers share information about other peers they know:

Message Type: Endpoint announcements (part of peer protocol)

Process:

  1. Peer A connects to Peer B

  2. Peer B shares list of other known peers

  3. Peer A considers these peers for connection

  4. Peer A may connect to some of the suggested peers

Gossip Information Includes:

  • Peer IP addresses

  • Peer public keys

  • Last seen time

  • Connection quality hints
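As a rough sketch, a gossiped endpoint record carrying the fields listed above might look like the following. The struct name and field layout are hypothetical; rippled's actual gossip uses a protobuf endpoint message with a different shape:

```cpp
#include <chrono>
#include <cstdint>
#include <string>

// Hypothetical shape of one gossiped endpoint record (illustrative only).
struct GossipEndpoint
{
    std::string ip;                    // peer IP address
    std::uint16_t port = 0;            // peer port
    std::string publicKey;             // peer node public key
    std::chrono::seconds lastSeen{0};  // time since last contact
    int hops = 0;                      // quality hint: gossip distance from us
};
```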

Advantages:

  • Network self-organizes

  • No central directory needed

  • Discovers new peers automatically

  • Network grows organically

Disadvantages:

  • Potential for malicious peer injection

  • Network topology influenced by gossip patterns

  • Initial bootstrapping still needed

4. Peer Crawler

Some nodes run peer crawlers to discover and monitor network topology:

What Crawlers Do:

  • Connect to known peers

  • Request peer lists

  • Recursively discover more peers

  • Map network topology

  • Provide public peer directories

Public Peer Lists:

  • Various community-maintained lists

  • Used by new nodes to bootstrap

  • Updated regularly


Connection Establishment and Handshake

Connection Lifecycle

┌──────────────┐
│ Disconnected │
└──────┬───────┘
       │ initiate()
       ▼
┌──────────────┐
│  Connecting  │ ← TCP handshake, TLS negotiation
└──────┬───────┘
       │ connected()
       ▼
┌──────────────┐
│  Connected   │ ← Protocol handshake in progress
└──────┬───────┘
       │ handshake complete
       ▼
┌──────────────┐
│    Active    │ ← Fully operational, exchanging messages
└──────┬───────┘
       │ close() or error
       ▼
┌──────────────┐
│   Closing    │ ← Graceful shutdown
└──────┬───────┘
       │
       ▼
┌──────────────┐
│    Closed    │
└──────────────┘
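The lifecycle above can be modeled as a small state machine. The enum and transition function below are an illustrative sketch, not rippled's actual types:

```cpp
// Sketch of the connection lifecycle as a state machine (names are
// illustrative, not rippled's actual implementation).
enum class ConnState { Disconnected, Connecting, Connected, Active, Closing, Closed };
enum class ConnEvent { Initiate, TcpEstablished, HandshakeComplete, Close, Error };

// Returns the next state for a (state, event) pair; events that are not
// part of the lifecycle leave the state unchanged.
ConnState next(ConnState s, ConnEvent e)
{
    switch (s)
    {
    case ConnState::Disconnected:
        if (e == ConnEvent::Initiate) return ConnState::Connecting;
        break;
    case ConnState::Connecting:
        if (e == ConnEvent::TcpEstablished) return ConnState::Connected;
        if (e == ConnEvent::Error) return ConnState::Closed;
        break;
    case ConnState::Connected:
        if (e == ConnEvent::HandshakeComplete) return ConnState::Active;
        if (e == ConnEvent::Error) return ConnState::Closing;
        break;
    case ConnState::Active:
        if (e == ConnEvent::Close || e == ConnEvent::Error) return ConnState::Closing;
        break;
    case ConnState::Closing:
        return ConnState::Closed;  // any event completes the shutdown
    case ConnState::Closed:
        break;
    }
    return s;
}
```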

Detailed Handshake Process

Step 1: TCP Connection

Standard TCP three-way handshake:

Client                              Server
  │                                   │
  │──────── SYN ──────────────────────>│
  │                                   │
  │<─────── SYN-ACK ──────────────────│
  │                                   │
  │──────── ACK ──────────────────────>│
  │                                   │
  │       TCP Connection Established  │

Configuration:

[port_peer]
port = 51235
ip = 0.0.0.0
protocol = peer

Step 2: TLS Handshake

The peers then negotiate TLS to establish an encrypted channel:

Client                              Server
  │                                   │
  │──────── ClientHello ──────────────>│
  │                                   │
  │<─────── ServerHello ──────────────│
  │<─────── Certificate ──────────────│
  │<─────── ServerHelloDone ──────────│
  │                                   │
  │──────── ClientKeyExchange ────────>│
  │──────── ChangeCipherSpec ─────────>│
  │──────── Finished ─────────────────>│
  │                                   │
  │<─────── ChangeCipherSpec ─────────│
  │<─────── Finished ─────────────────│
  │                                   │
  │    Encrypted Channel Established  │

Benefits of TLS:

  • Encrypted communication (privacy)

  • Peer authentication (security)

  • Protection against eavesdropping

  • Man-in-the-middle prevention

Step 3: Protocol Handshake

Rippled-specific handshake exchanges capabilities:

Hello Message (from initiator):

message TMHello {
    required uint32 protoVersion = 1;     // Protocol version
    required uint32 protoVersionMin = 2;  // Minimum supported version
    required bytes publicKey = 3;          // Node's public key
    optional bytes nodePrivate = 4;        // Proof of key ownership
    required uint32 ledgerIndex = 5;      // Current ledger index
    optional bytes ledgerClosed = 6;      // Closed ledger hash
    optional bytes ledgerPrevious = 7;    // Previous ledger hash
    optional uint32 closedTime = 8;       // Ledger close time
}

Response (from receiver):

// Same TMHello structure with receiver's information

Handshake Validation:

bool validateHandshake(TMHello const& hello)
{
    // Check protocol version compatibility
    if (hello.protoVersion() < minSupportedVersion)
        return false;
    
    if (hello.protoVersionMin() > currentVersion)
        return false;
    
    // Verify public key
    if (!isValidPublicKey(hello.publicKey()))
        return false;
    
    // Verify key ownership proof
    if (!verifySignature(hello.nodePrivate(), hello.publicKey()))
        return false;
    
    // Check we're on same network (same genesis ledger)
    if (!isSameNetwork(hello.ledgerClosed()))
        return false;
    
    return true;
}

Compatibility Check:

Node A: version 1.7.0, min 1.5.0
Node B: version 1.6.0, min 1.4.0

Check: max(1.5.0, 1.4.0) ≤ min(1.7.0, 1.6.0)
       1.5.0 ≤ 1.6.0 ✓ Compatible

Use protocol version: 1.6.0 (the lower of the two nodes' maximum supported versions)
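The negotiation rule above can be expressed compactly. In this sketch, versions are encoded as plain integers for comparison (e.g. 1.6.0 → 10600); the function name and encoding are illustrative only:

```cpp
#include <algorithm>
#include <optional>

// Illustrative version negotiation: compatible iff the higher of the two
// minimums does not exceed the lower of the two maximums; if compatible,
// both sides speak the newest version both support.
std::optional<int> negotiateVersion(int verA, int minA, int verB, int minB)
{
    int lowest = std::max(minA, minB);   // strictest minimum
    int highest = std::min(verA, verB);  // lowest maximum
    if (lowest > highest)
        return std::nullopt;             // no overlap: incompatible
    return highest;
}
```

With the example above, `negotiateVersion(10700, 10500, 10600, 10400)` yields 10600 (i.e. 1.6.0).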

Step 4: Connection Acceptance/Rejection

After handshake validation:

If Compatible:

  • Connection moves to Active state

  • Add to peer list

  • Begin normal message exchange

  • Log successful connection

If Incompatible:

  • Send rejection message with reason

  • Close connection gracefully

  • Log rejection reason

  • May add to temporary ban list

Rejection Reasons:

enum DisconnectReason
{
    drBadData,           // Malformed handshake
    drProtocol,          // Protocol incompatibility
    drSaturated,         // Too many connections
    drDuplicate,         // Already connected to this peer
    drNetworkID,         // Different network (testnet vs mainnet)
    drBanned,            // Peer is banned
    drSelf,              // Trying to connect to self
};
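When reading disconnect logs, it helps to map each reason to a loggable string. The helper below is hypothetical (not part of rippled's API) and the strings are illustrative:

```cpp
#include <string>

// Mirrors the DisconnectReason enum above (illustrative sketch).
enum DisconnectReason
{
    drBadData, drProtocol, drSaturated, drDuplicate,
    drNetworkID, drBanned, drSelf,
};

// Hypothetical helper: render a reason for log output.
std::string to_string(DisconnectReason r)
{
    switch (r)
    {
    case drBadData:   return "bad data";
    case drProtocol:  return "protocol incompatibility";
    case drSaturated: return "saturated";
    case drDuplicate: return "duplicate connection";
    case drNetworkID: return "wrong network";
    case drBanned:    return "banned";
    case drSelf:      return "self connection";
    }
    return "unknown";
}
```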

Connection Management

Connection Limits

Rippled enforces various connection limits:

Per-IP Limits

// Maximum connections from single IP
constexpr size_t maxPeersPerIP = 2;

// Prevents single entity from dominating connections
bool acceptConnection(IPAddress const& ip)
{
    auto count = countConnectionsFromIP(ip);
    return count < maxPeersPerIP;
}

Total Connection Limits

Based on node_size configuration:

tiny:    max 10 connections
small:   max 21 connections
medium:  max 40 connections
large:   max 62 connections
huge:    max 88 connections

The maximum exceeds the target for each size class, leaving headroom so the node can accept additional inbound peers without dropping below its target.

Fixed Peer Priority

Fixed peers bypass some limits:

bool shouldAcceptConnection(Peer const& peer)
{
    // Always accept fixed peers
    if (isFixed(peer))
        return true;
    
    // Check against limits for regular peers
    if (activeConnections() >= maxConnections())
        return false;
    
    return true;
}

Connection Quality Assessment

Rippled continuously monitors peer quality:

Metrics Tracked

Latency: Response time to ping messages

// Ping-pong protocol
void sendPing()
{
    auto ping = std::make_shared<protocol::TMPing>();
    ping->set_type(protocol::TMPing::ptPING);
    ping->set_seq(nextPingSeq_++);
    ping->set_timestamp(now());
    
    send(ping);
}

void onPong(protocol::TMPing const& pong)
{
    auto latency = now() - pong.timestamp();
    updateLatencyMetrics(latency);
}

Message Rate: Messages per second

void trackMessageRate()
{
    messagesReceived_++;
    
    auto elapsed = now() - windowStart_;
    if (elapsed >= 1s)
    {
        messageRate_ = messagesReceived_ / elapsed.count();
        messagesReceived_ = 0;
        windowStart_ = now();
    }
}

Error Rate: Protocol errors, malformed messages

void onProtocolError()
{
    errorCount_++;
    
    if (errorCount_ > maxErrorThreshold)
    {
        // Disconnect problematic peer
        disconnect(drBadData);
    }
}

Uptime: Connection duration

auto uptime = now() - connectionTime_;

Quality Scoring

Peers are scored based on metrics:

int calculatePeerScore(Peer const& peer)
{
    int score = 100;  // Start with perfect score
    
    // Penalize high latency
    if (peer.latency() > 500ms)
        score -= 20;
    else if (peer.latency() > 200ms)
        score -= 10;
    
    // Penalize low message rate (inactive peer)
    if (peer.messageRate() < 0.1)
        score -= 15;
    
    // Penalize errors
    score -= peer.errorCount() * 5;
    
    // Reward long uptime
    if (peer.uptime() > 24h)
        score += 10;
    
    return std::max(0, std::min(100, score));
}

Score Usage:

  • Low-scoring peers may be disconnected

  • High-scoring peers prioritized for reconnection

  • Informs peer selection decisions

Connection Pruning

When connection limits are reached, low-quality peers are pruned:

void pruneConnections()
{
    if (activeConnections() <= targetConnections())
        return;
    
    // Sort peers by score (lowest first)
    auto peers = getAllPeers();
    std::sort(peers.begin(), peers.end(),
        [](auto const& a, auto const& b)
        {
            return a->score() < b->score();
        });
    
    // Disconnect lowest-scoring non-fixed peers
    for (auto& peer : peers)
    {
        if (isFixed(peer))
            continue;  // Never disconnect fixed peers
        
        peer->disconnect(drSaturated);
        
        if (activeConnections() <= targetConnections())
            break;
    }
}

Reconnection Logic

After disconnection, Rippled may attempt to reconnect:

Exponential Backoff:

Duration calculateReconnectDelay(int attempts)
{
    // Exponential backoff: the delay doubles with each attempt
    auto delay = minDelay * std::pow(2, attempts - 1);
    delay = std::min(delay, maxDelay);
    
    // Add random jitter (±25%); random() returns a uniform value in [0, 1)
    auto jitter = delay * (0.75 + random() * 0.5);
    
    return jitter;
}

// Example progression:
// Attempt 1: ~5 seconds
// Attempt 2: ~10 seconds
// Attempt 3: ~20 seconds
// Attempt 4: ~40 seconds
// Attempt 5+: ~60 seconds (capped)

Fixed Peer Priority:

void scheduleReconnect(Peer const& peer)
{
    Duration delay;
    
    if (isFixed(peer))
    {
        // Aggressive reconnection for fixed peers
        delay = 5s;
    }
    else
    {
        // Exponential backoff for regular peers
        delay = calculateReconnectDelay(peer.reconnectAttempts());
    }
    
    scheduleJob(delay, [this, peer]()
    {
        attemptConnection(peer.address());
    });
}

Message Routing and Broadcasting

Message Types

Different message types require different routing strategies:

Critical Messages (Broadcast to All)

Validations (tmVALIDATION):

  • Must reach all validators

  • Broadcast to all peers immediately

  • Critical for consensus

Consensus Proposals (tmPROPOSE_LEDGER):

  • Must reach all validators

  • Time-sensitive

  • Broadcast widely

Broadcast Pattern:

void broadcastCritical(std::shared_ptr<Message> const& msg)
{
    for (auto& peer : getAllPeers())
    {
        // Send to everyone
        peer->send(msg);
    }
}

Transactions (Selective Relay)

Transaction Messages (tmTRANSACTION):

  • Should reach all nodes eventually

  • Don't need immediate broadcast to all

  • Use intelligent relay

Relay Logic:

void relayTransaction(
    std::shared_ptr<Message> const& msg,
    Peer* source)
{
    for (auto& peer : getAllPeers())
    {
        // Don't echo back to source
        if (peer.get() == source)
            continue;
        
        // Check if peer likely already has it
        if (peerLikelyHas(peer, msg))
            continue;
        
        // Send to peer
        peer->send(msg);
    }
}

Request/Response (Unicast)

Ledger Data Requests (tmGET_LEDGER):

  • Directed to specific peer

  • Response goes back to requester

  • No broadcasting needed

Unicast Pattern:

void requestLedgerData(
    LedgerHash const& hash,
    Peer* peer)
{
    auto request = makeGetLedgerMessage(hash);
    peer->send(request);  // Send only to this peer
}

Squelch Algorithm

Squelch prevents message echo loops:

Problem:

Node A → sends to B
Node B → receives from A
Node B → broadcasts to all (including A)
Node A → receives echo from B
Node A → broadcasts again...
(infinite loop)

Solution:

void onMessageReceived(
    std::shared_ptr<Message> const& msg,
    Peer* source)
{
    // Track message hash
    auto hash = msg->getHash();
    
    // Have we seen this before?
    if (recentMessages_.contains(hash))
        return;  // Ignore duplicate
    
    // Record that we've seen it
    recentMessages_.insert(hash);
    
    // Process message
    processMessage(msg);
    
    // Relay to others (excluding source)
    relayToOthers(msg, source);
}

Recent Message Cache:

  • Time-based expiration (e.g., 30 seconds)

  • Size-based limits (e.g., 10,000 entries)

  • LRU eviction policy
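A minimal cache with those three properties (time-based expiration, a size cap, and LRU eviction) could be sketched as follows. The class name, parameters, and use of a plain hash value are illustrative, not rippled's implementation:

```cpp
#include <chrono>
#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

// Sketch of a "recently seen" message cache: TTL expiration, size cap,
// and LRU eviction (illustrative only).
class SeenCache
{
    using Clock = std::chrono::steady_clock;
    std::chrono::seconds ttl_;
    std::size_t maxSize_;
    // LRU order: most recently seen entries at the front.
    std::list<std::pair<std::size_t, Clock::time_point>> lru_;
    std::unordered_map<std::size_t, decltype(lru_)::iterator> index_;

public:
    SeenCache(std::chrono::seconds ttl, std::size_t maxSize)
        : ttl_(ttl), maxSize_(maxSize) {}

    // Returns true if the hash is a fresh duplicate; otherwise records it
    // (replacing any expired entry) and returns false.
    bool seen(std::size_t hash, Clock::time_point now = Clock::now())
    {
        auto it = index_.find(hash);
        if (it != index_.end())
        {
            if (now - it->second->second < ttl_)
            {
                // Still fresh: refresh LRU position, report duplicate.
                lru_.splice(lru_.begin(), lru_, it->second);
                return true;
            }
            // Expired: drop the stale entry and re-record below.
            lru_.erase(it->second);
            index_.erase(it);
        }
        lru_.emplace_front(hash, now);
        index_[hash] = lru_.begin();
        if (lru_.size() > maxSize_)
        {
            // Over the size cap: evict the least recently seen entry.
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        return false;
    }
};
```

In practice the first call for a given hash admits the message and subsequent calls within the TTL suppress the duplicate relay.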

Message Priority Queues

Outbound messages are queued with priority:

enum MessagePriority
{
    priVeryHigh,    // Validations, critical consensus
    priHigh,        // Proposals, status changes
    priMedium,      // Transactions
    priLow,         // Historical data, maintenance
};

class PeerMessageQueue
{
private:
    std::map<MessagePriority, std::queue<Message>> queues_;
    
public:
    void enqueue(Message msg, MessagePriority priority)
    {
        queues_[priority].push(msg);
    }
    
    Message dequeue()
    {
        // std::map iterates keys in ascending order, so priVeryHigh (0)
        // is visited first: dequeue from the highest-priority non-empty queue
        for (auto& [priority, queue] : queues_)
        {
            if (!queue.empty())
            {
                auto msg = queue.front();
                queue.pop();
                return msg;
            }
        }
        
        throw std::runtime_error("No messages");
    }
};

Benefits:

  • Critical messages sent first

  • Prevents head-of-line blocking

  • Better network utilization


Network Health and Monitoring

Health Metrics

Connectivity Metrics

Active Peers: Current peer count

size_t activePeers = overlay.size();

Target vs Actual: Comparison to target

bool isHealthy = activePeers >= (targetPeers * 0.75);

Connection Distribution:

size_t outbound = countOutboundPeers();
size_t inbound = countInboundPeers();
float ratio = inbound > 0 ? float(outbound) / inbound : 0.0f;

// Healthy: ratio between 0.5 and 2.0
bool balancedConnections = (ratio > 0.5 && ratio < 2.0);

Network Quality Metrics

Average Latency:

auto avgLatency = calculateAverageLatency(getAllPeers());

// Healthy: < 200ms average
bool lowLatency = avgLatency < 200ms;

Message Rate:

auto totalRate = sumMessageRates(getAllPeers());

// Messages per second across all peers

Validator Connectivity:

auto validatorPeers = countValidatorPeers();
auto unlSize = getUNLSize();

// Should be connected to most of UNL
bool goodValidatorConnectivity = 
    validatorPeers >= (unlSize * 0.8);

RPC Monitoring Commands

peers Command

Get current peer list:

rippled peers

Response:

{
  "result": {
    "peers": [
      {
        "address": "54.186.73.52:51235",
        "latency": 45,
        "uptime": 3600,
        "version": "rippled-1.9.0",
        "public_key": "n9KorY8QtTdRx...",
        "complete_ledgers": "32570-75234891"
      }
      // ... more peers
    ]
  }
}

peer_reservations Command

View reserved peer slots:

rippled peer_reservations_add <public_key> <description>
rippled peer_reservations_list

connect Command

Manually connect to peer:

rippled connect 192.168.1.100:51235

Logging and Diagnostics

Enable detailed overlay logging:

[rpc_startup]
{ "command": "log_level", "partition": "Overlay", "severity": "trace" }

Log Messages to Monitor:

"Overlay": "Connected to peer 54.186.73.52:51235"
"Overlay": "Disconnected from peer 54.186.73.52:51235, reason: saturated"
"Overlay": "Handshake failed with peer: protocol version mismatch"
"Overlay": "Received invalid message from peer, closing connection"
"Overlay": "Active peers: 18/20 (target)"

Codebase Deep Dive

Key Files and Directories

Overlay Core:

  • src/ripple/overlay/Overlay.h - Main overlay interface

  • src/ripple/overlay/impl/OverlayImpl.h - Implementation header

  • src/ripple/overlay/impl/OverlayImpl.cpp - Core implementation

Peer Management:

  • src/ripple/overlay/Peer.h - Peer interface

  • src/ripple/overlay/impl/PeerImp.h - Peer implementation

  • src/ripple/overlay/impl/PeerImp.cpp - Peer logic

Connection Handling:

  • src/ripple/overlay/impl/ConnectAttempt.h - Outbound connections

  • src/ripple/overlay/impl/InboundHandoff.h - Inbound connections

Message Processing:

  • src/ripple/overlay/impl/ProtocolMessage.h - Message definitions

  • src/ripple/overlay/impl/Message.cpp - Message handling

Key Classes

Overlay Class

class Overlay
{
public:
    // Start/stop overlay network
    virtual void start() = 0;
    virtual void stop() = 0;
    
    // Peer management
    virtual void connect(std::string const& ip) = 0;
    virtual std::size_t size() const = 0;
    
    // Message broadcasting
    virtual void broadcast(std::shared_ptr<Message> const&) = 0;
    virtual void relay(
        std::shared_ptr<Message> const&,
        Peer* source = nullptr) = 0;
    
    // Peer information
    virtual Json::Value json() = 0;
    virtual std::vector<Peer::ptr> getActivePeers() = 0;
};

PeerImp Class

class PeerImp : public Peer
{
public:
    // Send message to this peer
    void send(std::shared_ptr<Message> const& m) override;
    
    // Process received message
    void onMessage(std::shared_ptr<Message> const& m);
    
    // Connection state
    bool isConnected() const;
    void disconnect(DisconnectReason reason);
    
    // Quality metrics
    std::chrono::milliseconds latency() const;
    int score() const;
    
private:
    // Connection management
    boost::asio::ip::tcp::socket socket_;
    boost::asio::ssl::stream<socket_t&> stream_;
    
    // Message queues
    std::queue<std::shared_ptr<Message>> sendQueue_;
    
    // Metrics
    std::chrono::steady_clock::time_point connected_;
    std::chrono::milliseconds latency_;
    int score_;
};

Code Navigation Tips

Finding Connection Logic

Search for connection establishment:

// In OverlayImpl.cpp
void OverlayImpl::connect(std::string const& ip)
{
    // Parse IP and port
    auto endpoint = parseEndpoint(ip);
    
    // Create connection attempt
    auto attempt = std::make_shared<ConnectAttempt>(
        app_,
        io_service_,
        endpoint,
        peerFinder_.config());
    
    // Begin async connection
    attempt->run();
}

Tracing Message Flow

Follow message from receipt to processing:

// PeerImp::onMessage (entry point)
void PeerImp::onMessage(std::shared_ptr<Message> const& msg)
{
    // Check for duplicates (squelch)
    if (app_.overlay().hasSeen(msg->getHash()))
        return;
    
    // Mark as seen
    app_.overlay().markSeen(msg->getHash());
    
    // Process based on type
    switch (msg->getType())
    {
        case protocol::mtTRANSACTION:
            onTransaction(msg);
            break;
        case protocol::mtVALIDATION:
            onValidation(msg);
            break;
        // ... other types
    }
    
    // Relay to other peers
    app_.overlay().relay(msg, this);
}

Hands-On Exercise

Exercise: Monitor and Analyze Network Topology

Objective: Understand your node's position in the network and analyze peer connections.

Part 1: Initial Network State

Step 1: Get current peer list

rippled peers > peers_initial.json

Step 2: Analyze the output

Count:

  • Total peers

  • Outbound vs inbound connections

  • Peer versions

  • Geographic distribution (if known)

Questions:

  • Do you have the target number of peers?

  • Is the outbound/inbound ratio balanced?

  • Are you connected to validators in your UNL?

Part 2: Connection Quality Analysis

Step 1: Enable overlay logging

rippled log_level Overlay debug

Step 2: Monitor for 5 minutes

tail -f /var/log/rippled/debug.log | grep -E "latency|score|disconnect"

Step 3: Identify patterns

Look for:

  • Average peer latency

  • Connection failures

  • Disconnection reasons

  • Reconnection attempts

Part 3: Connectivity Test

Step 1: Manually connect to a peer

# Connect to a well-known Ripple public hub
rippled connect r.ripple.com:51235

Step 2: Verify connection

rippled peers | grep "r.ripple.com"

Step 3: Observe handshake in logs

"Overlay": "Connected to r.ripple.com:51235"
"Overlay": "Handshake complete with peer n9KorY8..."
"Overlay": "Added peer n9KorY8... to active peers"

Part 4: Network Health Check

Step 1: Check peer count over time

# Run every minute for 10 minutes
for i in {1..10}; do
  echo "$(date): $(rippled peers | grep -c address) peers"
  sleep 60
done

Step 2: Monitor connection churn

# Count new connections and disconnections
grep -c "Connected to peer" /var/log/rippled/debug.log
grep -c "Disconnected from peer" /var/log/rippled/debug.log

Step 3: Assess stability

Calculate:

  • Connection churn rate (disconnections per hour)

  • Average peer lifetime

  • Reconnection frequency

Part 5: Peer Quality Distribution

Step 1: Extract peer metrics

From peers output, record for each peer:

  • Latency

  • Uptime

  • Complete ledgers range

Step 2: Create distribution charts

Latency distribution:

0-50ms:    |||||| (6 peers)
51-100ms:  |||||||||| (10 peers)
101-200ms: ||| (3 peers)
201+ms:    | (1 peer)

Step 3: Identify issues

  • Are any peers consistently high-latency?

  • Do any peers have incomplete ledger history?

  • Are there peers with low uptime?

Analysis Questions

Answer these based on your observations:

  1. What's your average peer latency?

    • Is it acceptable (<200ms)?

  2. How stable are your connections?

    • High churn may indicate network issues

  3. Are you well-connected to validators?

    • Check against your UNL

  4. What's your network position?

    • Are you mostly receiving or mostly sending connections?

  5. Do you see any problematic peers?

    • High latency, frequent disconnections?

  6. How does your node handle connection limits?

    • Does it maintain target peer count?


Key Takeaways

Core Concepts

Mesh Topology: Decentralized network with no single point of failure

Three Connection Types: Outbound, inbound, and fixed connections serve different purposes

Multi-Mechanism Discovery: DNS seeds, configured peers, and gossip protocol enable robust peer discovery

Connection Quality: Continuous monitoring and scoring of peer quality

Intelligent Routing: Message-specific routing strategies optimize network efficiency

Squelch Algorithm: Prevents message loops and duplicate processing

Priority Queuing: Ensures critical messages are transmitted first

Network Health

Target Peer Count: Based on node_size configuration

Balanced Connections: ~50% outbound, ~50% inbound

Quality Metrics: Latency, message rate, error rate, uptime

Connection Pruning: Low-quality peers replaced with better alternatives

Fixed Peer Priority: Critical connections maintained aggressively

Development Skills

Codebase Location: Overlay implementation in src/ripple/overlay/

Configuration: Understanding [ips], [ips_fixed], [port_peer] sections

Monitoring: Using RPC commands and logs to assess network health

Debugging: Tracing connection issues and message flow


Common Issues and Solutions

Issue 1: Low Peer Count

Symptoms: Active peers consistently below target

Possible Causes:

  • Firewall blocking inbound connections

  • ISP blocking port

  • Poor peer quality (all disconnect quickly)

Solutions:

# Check firewall
sudo iptables -L | grep 51235

# Verify port is accessible
telnet your-ip 51235

# Check if node is reachable
rippled server_info | grep pubkey_node

Issue 2: High Latency Peers

Symptoms: Average latency >200ms

Possible Causes:

  • Geographic distance to peers

  • Network congestion

  • Poor quality peers

Solutions:

# Manually connect to closer peers
rippled connect low-latency-peer.example.com:51235

# Add fixed peers in same region
[ips_fixed]
local-peer-1.example.com 51235
local-peer-2.example.com 51235

Issue 3: Frequent Disconnections

Symptoms: High connection churn rate

Possible Causes:

  • Network instability

  • Protocol incompatibility

  • Being saturated by other peers

Solutions:

# Check logs for disconnect reasons
grep "Disconnected" /var/log/rippled/debug.log

# Look for patterns
grep "Disconnected.*reason" /var/log/rippled/debug.log | \
  cut -d: -f4 | sort | uniq -c

Issue 4: No Validator Connections

Symptoms: Not connected to any UNL validators

Possible Causes:

  • Validators are unreachable

  • Validators' connection slots full

  • Network configuration issues

Solutions:

# Manually connect to validators
rippled connect validator.example.com:51235

# Use fixed connections for validators
[ips_fixed]
validator1.example.com 51235
validator2.example.com 51235

Additional Resources

Codebase References

  • src/ripple/overlay/ - Overlay network implementation

  • src/ripple/overlay/impl/PeerImp.cpp - Peer connection handling

  • src/ripple/overlay/impl/OverlayImpl.cpp - Core overlay logic


Next Steps

Now that you understand how the overlay network enables peer-to-peer communication, explore the complete journey of a transaction through the system.

➡️ Continue to: Transaction Lifecycle - Complete Transaction Journey

⬅️ Back to: Rippled II Overview


Get Started

Access the course: docs.xrpl-commons.org/core-dev-bootcamp

Got questions? Contact us here: Submit Feedback


© 2025 XRPL Commons - Core Dev Bootcamp
