0.2.7 Aug 16, 2022
0.2.6 Aug 13, 2022
0.2.2 Jul 22, 2022
0.1.7 Jul 9, 2022
0.1.0 May 28, 2022

Apache-2.0



Network

Overview

For more detailed info, see the AptosNet Specification.

AptosNet is the primary protocol for communication between any two nodes in the Aptos ecosystem. It is specifically designed to facilitate the consensus, shared mempool, and state sync protocols. AptosNet tries to maintain at most one connection with each remote peer; the application protocols to that remote peer are then multiplexed over the single peer connection.
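
As a rough illustration of "at most one connection per peer, with protocols multiplexed over it", the sketch below keys connections by peer and tags each message with its application protocol. All names here (`ProtocolId`, `Tagged`, `Connection`, `Connections`) and the use of a plain `u64` as a peer id are assumptions made for the example, not the crate's actual types.

```rust
use std::collections::HashMap;

/// Hypothetical protocol tags; the real identifiers are defined by AptosNet itself.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum ProtocolId {
    Consensus,
    Mempool,
    StateSync,
}

/// A message tagged with the application protocol it belongs to.
#[derive(Debug, PartialEq)]
pub struct Tagged {
    pub protocol: ProtocolId,
    pub payload: Vec<u8>,
}

/// Stand-in for one peer connection: an ordered queue of tagged messages.
#[derive(Default)]
pub struct Connection {
    pub outbound: Vec<Tagged>,
}

/// Keeps at most one connection per remote peer (peers are plain u64 ids here).
#[derive(Default)]
pub struct Connections {
    by_peer: HashMap<u64, Connection>,
}

impl Connections {
    /// Queue a message for `peer`, reusing the existing connection if there is one.
    pub fn send(&mut self, peer: u64, protocol: ProtocolId, payload: Vec<u8>) {
        self.by_peer
            .entry(peer)
            .or_default()
            .outbound
            .push(Tagged { protocol, payload });
    }

    /// Number of distinct peer connections currently held.
    pub fn connection_count(&self) -> usize {
        self.by_peer.len()
    }

    /// Number of messages queued for `peer` across all protocols.
    pub fn queued(&self, peer: u64) -> usize {
        self.by_peer.get(&peer).map_or(0, |c| c.outbound.len())
    }
}
```

Sending a consensus message and a mempool message to the same peer leaves one connection with two queued messages; the receiving side would use the protocol tag to hand each message to the right application.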

Currently, it provides application protocols with two primary interfaces:

  • DirectSend: for fire-and-forget style message delivery.
  • RPC: for unary Remote Procedure Calls.
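
The difference between the two interfaces can be sketched with standard-library channels: a direct send carries no reply path, while a unary RPC carries exactly one. The `Outbound` enum, `spawn_peer`, and `rpc` below are hypothetical names for this illustration, and the toy peer simply reverses each request; none of this mirrors the crate's real API.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical message shapes for the two interfaces.
pub enum Outbound {
    /// Fire-and-forget: no reply is expected.
    DirectSend(Vec<u8>),
    /// Unary RPC: one request, one response delivered on `reply`.
    Rpc { request: Vec<u8>, reply: Sender<Vec<u8>> },
}

/// Spawn a toy remote peer that answers each RPC with the request reversed
/// and counts direct sends. The join handle yields the direct-send count
/// once every sender has been dropped.
pub fn spawn_peer() -> (Sender<Outbound>, thread::JoinHandle<usize>) {
    let (tx, rx): (Sender<Outbound>, Receiver<Outbound>) = channel();
    let handle = thread::spawn(move || {
        let mut direct_sends = 0;
        for msg in rx {
            match msg {
                Outbound::DirectSend(_) => direct_sends += 1,
                Outbound::Rpc { mut request, reply } => {
                    request.reverse();
                    // The caller may have given up; ignore a closed reply channel.
                    let _ = reply.send(request);
                }
            }
        }
        direct_sends
    });
    (tx, handle)
}

/// Issue a unary RPC and block until the single response arrives.
pub fn rpc(peer: &Sender<Outbound>, request: Vec<u8>) -> Option<Vec<u8>> {
    let (reply, response) = channel();
    peer.send(Outbound::Rpc { request, reply }).ok()?;
    response.recv().ok()
}
```

A caller using `DirectSend` learns nothing about delivery, whereas `rpc` returns `None` if the peer has gone away before answering.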

The network component uses:

  • TCP for reliable transport.
  • NoiseIK for authentication and full end-to-end encryption.
  • On-chain NetworkAddress set for discovery, with optional seed peers in the NetworkConfig as a fallback.

Validators will only allow connections from other validators. Their identity and public key information is provided by the validator-set-discovery protocol, which updates the eligible member information on each consensus reconfiguration. Each member of the validator network maintains a full membership view and connects directly to all other validators in order to maintain a full-mesh network.

In contrast, Validator Full Node (VFN) servers prioritize connections from more trusted peers in the on-chain discovery set, but they still service any public client. Public Full Nodes (PFNs) connecting to VFNs always authenticate the VFN server using the available discovery information.

Validator health information, determined using periodic liveness probes, is not shared between validators; instead, each validator directly monitors its peers for liveness using the HealthChecker protocol.

This approach should scale up to a few hundred validators before requiring partial membership views, sophisticated failure detectors, or network overlays.
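
To make the scaling remark concrete: in a full mesh of n validators, each validator holds n - 1 connections (and monitors that many peers), and the mesh as a whole has n(n - 1)/2 distinct connections. The helper names below are made up for this arithmetic sketch.

```rust
/// Connections each validator maintains in a full mesh of `n` validators.
pub fn connections_per_validator(n: u64) -> u64 {
    n.saturating_sub(1)
}

/// Total distinct connections in the mesh: n * (n - 1) / 2.
pub fn total_connections(n: u64) -> u64 {
    n * n.saturating_sub(1) / 2
}
```

For 100 validators this gives 99 connections per validator and 4,950 in total; for 300 validators, 299 per validator and 44,850 in total. The per-validator cost grows linearly while the total grows quadratically.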

Implementation Details

System Architecture

                      +-----------+---------+------------+--------+
 Application Modules  | Consensus | Mempool | State Sync | Health |
                      +-----------+---------+------------+--------+
                            ^          ^          ^           ^
   Network Interface        |          |          |           |
                            v          v          v           v
                      +----------------+--------------------------+   +---------------------+
      Network Module  |                 PeerManager               |<->| ConnectivityManager |
                      +----------------------+--------------------+   +---------------------+
                      |        Peer(s)       |                    |
                      +----------------------+                    |
                      |                AptosTransport             |
                      +-------------------------------------------+

The network component is implemented in the Actor model — it uses message-passing to communicate between different subcomponents running as independent "tasks." The tokio framework is used as the task runtime. The primary subcomponents in the network module are:
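
The actor pattern described above can be sketched in a few lines. To keep the example dependency-free it swaps tokio tasks for standard-library threads and `std::sync::mpsc` channels; the `Request` enum and `spawn_actor` are hypothetical and only loosely inspired by what a peer-managing actor might accept.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical requests understood by a toy actor that tracks peers.
pub enum Request {
    Dial(String),
    Disconnect(String),
    /// Ask for the current peer list; the answer comes back on `reply`.
    ListPeers { reply: Sender<Vec<String>> },
}

/// Start the actor. It owns its state exclusively and handles one message at a time,
/// so no locks are needed around `peers`.
pub fn spawn_actor() -> (Sender<Request>, JoinHandle<()>) {
    let (tx, rx) = channel::<Request>();
    let handle = thread::spawn(move || {
        let mut peers: Vec<String> = Vec::new();
        // The loop ends when every sender has been dropped.
        for request in rx {
            match request {
                Request::Dial(addr) => {
                    if !peers.contains(&addr) {
                        peers.push(addr);
                    }
                }
                Request::Disconnect(addr) => peers.retain(|p| p != &addr),
                Request::ListPeers { reply } => {
                    // Ignore the error if the requester stopped waiting.
                    let _ = reply.send(peers.clone());
                }
            }
        }
    });
    (tx, handle)
}
```

Other components hold only a `Sender<Request>`; they never touch the actor's state directly, which is the property the message-passing design relies on.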

  • Network Interface — The interface provided to application modules using AptosNet.

  • PeerManager — Listens for incoming connections, and dials outbound connections to other peers. Demultiplexes and forwards inbound messages from Peers to appropriate application handlers. Additionally, notifies upstream components of new or closed connections. Optionally can be connected to ConnectivityManager for a network with Discovery.

  • Peer — Manages a single connection to another peer. It reads and writes NetworkMessages from/to the wire. Currently, it implements two protocols: DirectSend and Rpc.

  • AptosTransport — A secure, reliable transport. It uses NoiseIK over TCP to negotiate an encrypted and authenticated connection between peers. The AptosNet version and any Aptos-specific application protocols are negotiated afterward using the AptosNet Handshake Protocol.

  • ConnectivityManager — Establishes connections to known peers found via Discovery. Notifies PeerManager to make outbound dials or to disconnect, based on updates to the known peer set received from Discovery.

  • validator-set-discovery — Discovers the set of peers to connect to via on-chain configuration. These are the validator_network_addresses and fullnode_network_addresses of each ValidatorConfig in the ValidatorSet::validators set. Notifies the ConnectivityManager of updates to the known peer set.

  • HealthChecker — Performs periodic liveness probes to ensure the health of a peer/connection. It resets the connection with the peer if a configurable number of probes fail in succession. Probes currently fail on a configurable static timeout.
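
The HealthChecker rule (reset the connection after a configurable number of consecutive probe failures) can be sketched as a small counter. `ProbeTracker`, `Action`, and `max_failures` are hypothetical names, and triggering the reset exactly when the streak reaches the threshold is an assumption of this sketch, not something the description above pins down.

```rust
/// Outcome of feeding one probe result into the tracker.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Keep,
    ResetConnection,
}

/// Tracks consecutive probe failures for one peer.
pub struct ProbeTracker {
    max_failures: u32, // hypothetical name for the configurable threshold
    consecutive_failures: u32,
}

impl ProbeTracker {
    pub fn new(max_failures: u32) -> Self {
        Self { max_failures, consecutive_failures: 0 }
    }

    /// Record a probe result. A success clears the streak; a failure
    /// (for example, a probe that hit its timeout) extends it.
    pub fn record(&mut self, probe_ok: bool) -> Action {
        if probe_ok {
            self.consecutive_failures = 0;
            return Action::Keep;
        }
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.max_failures {
            self.consecutive_failures = 0; // start fresh after the reset
            Action::ResetConnection
        } else {
            Action::Keep
        }
    }
}
```

With a threshold of 3, two failures followed by a success leave the connection alone; only three failures in a row produce `ResetConnection`.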

How is this module organized?

network
├── benches                    # Network benchmarks
├── builder                    # Builds a network from a NetworkConfig
├── memsocket                  # In-memory socket interface for tests
├── netcore
│   └── src
│       ├── transport          # Composable transport API
│       └── framing            # Read/write length prefixes to sockets
├── network-address            # Network addresses and encryption
├── discovery                  # Protocols for peer discovery
└── src
    ├── peer_manager           # Manage peer connections and messages to/from peers
    ├── peer                   # Handles a single peer connection's state
    ├── connectivity_manager   # Monitor connections and ensure connectivity
    ├── protocols
    │   ├── network            # Application layer interface to network module
    │   ├── direct_send        # Protocol for fire-and-forget style message delivery
    │   ├── health_checker     # Protocol for health probing
    │   ├── rpc                # Protocol for remote procedure calls
    │   └── wire               # Protocol for AptosNet handshakes and messaging
    ├── transport              # The base transport layer for dialing/listening
    └── noise                  # Noise handshaking and wire integration
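
The `framing` entry in the tree above is described as reading and writing length prefixes on sockets. A minimal sketch of that general technique follows, assuming a 4-byte big-endian length prefix and a caller-supplied size limit; the actual prefix width, byte order, and function names used by netcore are not stated here, so treat `write_frame` and `read_frame` as illustrative only.

```rust
use std::convert::TryFrom;
use std::io::{self, Read, Write};

/// Write `payload` preceded by its length as a big-endian u32 (assumed prefix format).
pub fn write_frame<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    let len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "frame too large"))?;
    out.write_all(&len.to_be_bytes())?;
    out.write_all(payload)
}

/// Read one length-prefixed frame, rejecting frames larger than `max_len`
/// before allocating a buffer for them.
pub fn read_frame<R: Read>(input: &mut R, max_len: usize) -> io::Result<Vec<u8>> {
    let mut prefix = [0u8; 4];
    input.read_exact(&mut prefix)?;
    let len = u32::from_be_bytes(prefix) as usize;
    if len > max_len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds limit"));
    }
    let mut payload = vec![0u8; len];
    input.read_exact(&mut payload)?;
    Ok(payload)
}
```

Because both functions are generic over `Read` and `Write`, the same code works against a `TcpStream` or an in-memory buffer such as `std::io::Cursor`, which is convenient for tests.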
