Overview
This document describes the high-level architecture of unc-infra. The focus here is on the implementation of the blockchain protocol, not the protocol itself. For reference documentation of the protocol, please refer to nomicon
Some parts of our architecture are also covered in this video series on YouTube.
Bird's Eye View
If we put the entirety of unc-infra.onto one picture, we get something like this:
Don't worry if this doesn't yet make a lot of sense: hopefully, by the end of this document the above picture would become much clearer!
Overall Operation
unc-infra. is a blockchain node -- it's a single binary (unc-node`) which runs on
some machine and talks to other similar binaries running elsewhere. Together,
the nodes agree (using a distributed consensus algorithm) on a particular
sequence of transactions. Once transaction sequence is established, each node
applies transactions to the current state. Because transactions are fully
deterministic, each node in the network ends up with identical state. To allow
greater scalability, UNC protocol uses sharding, which allows a node to hold
only a small subset (shard) of the whole state.
unc-node is a stateful, restartable process. When unc-node starts, the node
connects to the network and starts processing blocks (block is a batch of
transactions, processed together; transactions are batched into blocks for
greater efficiency). The results of processing are persisted in the database.
RocksDB is used for storage. Usually, the node's data is found in the ~/.unc
directory. The node can be stopped at any moment and be restarted later. While
the node is offline it misses the block, so, after a restart, the sync process
kicks in which brings the node up-to-speed with the network by downloading the
missing bits of history from more up-to-date peer nodes.
Major components of unc-infra.
- 
JSON RPC. This HTTP RPC interface is how unc-nodecommunicates with non-blockchain outside world. For example, to submit a transaction, some client sends an RPC request with it to some node in the network. From that node, the transaction propagates through the network, until it is included in some block. Similarly, a client can send an HTTP request to a node to learn about current state of the blockchain. The JSON RPC interface is documented here.
- 
Network. If RPC is aimed "outside" the blockchain, "network" is how peer unc-nodenodes communicate with each other within the blockchain. RPC carries requests from users of the blockchain, while network carries various messages needed to implement consensus. Two directly connected nodes communicate by sending protobuf-encoded messages over TCP. A node also includes logic to route messages for indirect peers through intermediaries. Oversimplifying a lot, it's enough for a new node to know an IP address of just one other network participant. From this bootstrap connection, the node learns how to communicate with any other node in the network.
- 
Client. Somewhat confusingly named, client is the logical state of the blockchain. After receiving and decoding a request, both RPC and network usually forward it in the parsed form to the client. Internally, client is split in two somewhat independent components: chain and runtime. 
- 
Chain. The job of chain, in a nutshell, is to determine a global order of transactions. Chain builds and maintains the blockchain data structure. This includes block and chunk production and processing, consensus, and validator selection. However, chain is not responsible for actually applying transactions and receipts. 
- 
Runtime. If chain selects the order of transactions, Runtime applies transaction to the state. Chain guarantees that everyone agrees on the order and content of transactions, and Runtime guarantees that each transaction is fully deterministic. It follows that everyone agrees on the "current state" of the blockchain. Some transactions are as simple as "transfer X tokens from Alice to Bob". But a much more powerful class of transactions is supported: "run this arbitrary WebAssembly code in the context of the current state of the chain". Running such "smart contract" transactions securely and efficiently is a major part of what Runtime does. Today, Runtime uses a JIT compiler to do that. 
- 
Storage. Storage is more of a cross-cutting concern, than an isolated component. Many parts of a node want to durably persist various bits of state to disk. One notable case is the logical state of the blockchain, and, in particular, data associated with each account. Logically, the state of an account on a chain is a key-value map: HashMap<Vec<u8>, Vec<u8>>. But there is a twist: it should be possible to provide a succinct proof that a particular key indeed holds a particular value. To allow that internally the state is implemented as a persistent (in both senses, "functional" and "on disk") merkle-patricia trie.
- 
Parameter Estimator. One kind of transaction we support is "run this arbitrary, Turing-complete computation". To protect from a loop {}transaction halting the whole network, Runtime implements resource limiting: each transaction runs with a certain finite amount of "gas", and each operation costs a certain amount of gas to perform. Parameter estimator is essentially a set of benchmarks used to estimate relative gas costs of various operations.
Entry Points
unc-node/src/main.rs contains the main function that starts a blockchain node.
However, this file mostly only contains the logic to parse arguments and
dispatch different commands. start_with_config in unc-infra.src/lib.rs is the
actual entry point and it starts all the actors.
JsonRpcHandler::process in the jsonrpc crate is the RPC entry point. It
implements the public API of a node, which is documented
here.
PeerManagerActor::spawn in the network is an entry for the other point of
contract with the outside world -- the peer-to-peer network.
Runtime::apply in the runtime crate is the entry point for transaction
processing logic. This is where state transitions actually happen, after chain
decided, according to distributed consensus, which transitions need  to
happen.
Code Map
This section contains some high-level overview of important crates and data structures.
core/primitives
This crate contains most of the types that are shared across different crates.
core/primitives-core
This crate contains types needed for runtime.
core/store/trie
This directory contains the MPT state implementation. Note that we usually use
TrieUpdate to interact with the state.
chain/chain
This crate contains most of the chain logic (consensus, block processing, etc).
ChainUpdate::process_block is where most of the block processing logic
happens.
State update
The blockchain state of a node can be changed in the following two ways:
- Applying a chunk. This is how the state is normally updated: through
Runtime::apply.
- State sync. State sync can happen in two cases:
- A node is far enough behind the most recent block and triggers state sync to fast forward to the state of a very recent block without having to apply blocks in the middle.
- A node is about to become validator for some shard in the next epoch, but it
does not yet have the state for that shard. In this case, it would run state
sync through the catchuproutine.
 
chain/chunks
This crate contains most of the sharding logic which includes chunk creation,
distribution, and processing. ShardsManager is the main struct that
orchestrates everything here.
chain/client
This crate defines two important structs, Client and ViewClient. Client
includes everything necessary for the chain (without network and runtime) to
function and runs in a single thread. ViewClient is a "read-only" client that
answers queries without interfering with the operations of Client.
ViewClient runs in multiple threads.
chain/network
This crate contains the entire implementation of the p2p network used by UNC blockchain nodes.
Two important structs here: PeerManagerActor and Peer. Peer manager
orchestrates all the communications from network to other components and from
other components to network. Peer is responsible for low-level network
communications from and to a given peer (more details in
this article). Peer manager runs in one thread while each
Peer runs in its own thread.
Architecture Invariant: Network communicates to Client through
NetworkClientMessages and to ViewClient through NetworkViewClientMessages.
Conversely, Client and ViewClient communicates to network through
NetworkRequests.
chain/epoch_manager
This crate is responsible for determining validators and other epoch related information such as epoch id for each epoch.
Note: EpochManager is constructed in NightshadeRuntime rather than in
Chain, partially because we had this idea of making epoch manager a smart
contract.
chain/jsonrpc
This crate implements JSON-RPC API server to enable
submission of new transactions and inspection of the blockchain data, the
network state, and the node status. When a request is processed, it generates a
message to either ClientActor or ViewClientActor to interact with the
blockchain. For queries of blockchain data, such as block, chunk, account, etc,
the request usually generates a message to ViewClientActor. Transactions, on
the other hand, are sent to ClientActor for further processing.
runtime/runtime
This crate contains the main entry point to runtime -- Runtime::apply. This
function takes ApplyState, which contains necessary information passed from
chain to runtime, a list of SignedTransaction and a list of Receipt, and
returns a ApplyResult, which includes state changes, execution outcomes, etc.
Architecture Invariant: The state update is only finalized at the end of
apply. During all intermediate steps state changes can be reverted.
runtime/unc-vm-logic
VMLogic contains all the implementations of host functions and is the
interface between runtime and wasm. VMLogic is constructed when runtime
applies function call actions. In VMLogic, interaction with UNC blockchain
happens in the following two ways:
- VMContext, which contains lightweight information such as current block hash, current block height, epoch id, etc.
- External, which is a trait that contains functions to interact with blockchain by either reading some nontrivial data, or writing to the blockchain.
runtime/unc-vm-runner
run function in runner.rs is the entry point to the vm runner. This function
essentially spins up the vm and executes some function in a contract. It
supports different wasm compilers including wasmer0, wasmer2, and wasmtime
through compile-time feature flags. Currently we use wasmer0 and wasmer2 in
production. The imports module exposes host functions defined in
unc-vm-logic to WASM code. In other words, it defines the ABI of the
contracts on UNC.
unc-node
As mentioned before, unc-node is the crate that contains that main entry points.
All the actors are spawned in start_with_config. It is also worth noting that
NightshadeRuntime is the struct that implements RuntimeAdapter.
core/store/src/db.rs
This file contains the schema (DBCol) of our internal RocksDB storage - a good starting point when reading the code base.
Cross Cutting Concerns
Observability
The tracing crate is used for structured, hierarchical event output and logging. We also integrate Prometheus for light-weight metric output. See the style documentation for more information on the usage.
Testing
Rust has built-in support for writing unit tests by marking functions
with the #[test] directive.  Take full advantage of that!  Testing not
only confirms that what was written works the way it was intended to but
also helps during refactoring since it catches unintended behaviour
changes.
Not all tests are created equal though and while some may only need
milliseconds to run, others may run for several seconds or even
minutes.  Tests that take a long time should be marked as such with an
expensive_tests feature, for example:
#![allow(unused)] fn main() { #[test] #[cfg_attr(not(feature = "expensive_tests"), ignore)] fn test_catchup_random_single_part_sync() { test_catchup_random_single_part_sync_common(false, false, 13) } }
Such tests will be ignored by default and can be executed by using
--ignored or --include-ignored flag as in cargo test -- --ignored or by compiling the tests with expensive_tests feature
enabled.
Because expensive tests are not run by default, they are also not run
in CI.  Instead, they are run nightly and need to be explicitly
included in nightly/expensive.txt file; for example:
expensive --timeout=1800 unc-client unc_client tests::catching_up::test_catchup_random_single_part_sync
expensive --timeout=1800 unc-client unc_client tests::catching_up::test_catchup_random_single_part_sync --features nightly
For more details regarding nightly tests see nightly/README.md.
Note that what counts as a slow test isn’t exactly defined as of now. If it takes just a couple seconds then it’s probably fine. Anything slower should probably be classified as an expensive test. In particular, if libtest complains the test takes more than 60 seconds then it definitely is and expensive test.