Life cycle of a transaction

You may notice that I omit the description of some fields defined in the Antelope node header files (such as context-free actions) in order to keep the description concise and focused. My goal is to describe how things work, not what exactly those things consist of. Keep in mind that this is not a reference manual.
Further on, name refers to the Antelope data type eosio::name, which is basically a textual representation of a 64-bit unsigned integer. Names are used everywhere in Antelope software for specifying accounts, actions, tables, and permissions.
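As a rough illustration (a sketch of the well-known base-32 encoding scheme, not the actual library code), a name packs up to 12 characters from the alphabet `.a-z1-5` at 5 bits each, plus an optional 13th character in the remaining 4 bits:

```python
def char_to_symbol(c: str) -> int:
    """Map a name character to its 5-bit symbol."""
    if "a" <= c <= "z":
        return ord(c) - ord("a") + 6
    if "1" <= c <= "5":
        return ord(c) - ord("1") + 1
    return 0  # "." and padding map to zero

def name_to_u64(s: str) -> int:
    """Pack a name string into its 64-bit integer representation."""
    value = 0
    for i, c in enumerate(s[:12]):
        value |= (char_to_symbol(c) & 0x1F) << (64 - 5 * (i + 1))
    if len(s) > 12:
        value |= char_to_symbol(s[12]) & 0x0F  # the 13th char fits in 4 bits
    return value

# name_to_u64("eosio") → 6138663577826885632
```

Client libraries and cleos perform this conversion (and its reverse) transparently, so you normally only see the textual form.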

Composing a transaction

There are two ways for a transaction to appear in the blockchain: it is either pushed via the HTTP RPC interface, or generated by a smart contract as a deferred transaction. Deferred transactions are deprecated, and support for them will eventually be dropped.
The /v1/chain/send_transaction RPC call takes a serialized chain::packed_transaction object, which consists of the transaction header, actions, and signatures. You can look up the details of these objects in the libraries/chain/include/eosio/chain/transaction.hpp header file in the Antelope source code.
The transaction header contains several values that the client is supposed to retrieve before sending a transaction. The current eosjs library requests them before sending every transaction, without attempting to cache and reuse the values. Other client libraries optimize the RPC interaction and reuse the values that can be reused. Some important fields in the transaction header:
  • time_point_sec expiration -- absolute time after which the transaction expires if it's not placed in a block. The nodes will discard and stop propagating a transaction if it expires.
  • uint16_t ref_block_num -- the lowest 16 bits of the 32-bit number of the reference block. This means a transaction cannot refer to a block older than approximately 9 hours (at 7200 blocks per hour, 65536 blocks are produced in slightly more than 9 hours).
  • uint32_t ref_block_prefix -- a 32-bit excerpt of the reference block ID (a block ID is the sha256 hash of the block contents; the implementation takes 4 bytes of it at offset 8, interpreted as a little-endian integer).
In order to send a valid transaction, the client needs to retrieve a valid block number and ID. The most reliable way is to take the last irreversible block (available via the get_info RPC call), keeping in mind that the expiration time needs to be in the future. Some applications may want to rely on the head block instead, if the sequence of actions is important and the contract does not guarantee the ordering of actions.
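To make this concrete, here is a minimal sketch of deriving these header fields from a chosen reference block. The byte offset used for ref_block_prefix follows the nodeos implementation; treat it as an assumption to verify against your node version:

```python
import datetime

def transaction_header(block_num: int, block_id_hex: str,
                       ttl_seconds: int = 60) -> dict:
    """Build the TAPOS-related header fields from a reference block."""
    block_id = bytes.fromhex(block_id_hex)  # 32-byte sha256 block ID
    return {
        # lowest 16 bits of the 32-bit reference block number
        "ref_block_num": block_num & 0xFFFF,
        # 32-bit slice of the block ID at byte offset 8, little-endian
        "ref_block_prefix": int.from_bytes(block_id[8:12], "little"),
        # expiration must lie in the future; one minute is a common choice
        "expiration": (datetime.datetime.utcnow()
                       + datetime.timedelta(seconds=ttl_seconds)
                       ).strftime("%Y-%m-%dT%H:%M:%S"),
    }
```

A client would fill these fields from the last irreversible block returned by get_info, then append the actions and signatures.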
Actions are the essence of a transaction: they define what exactly the transaction is supposed to do. A transaction may contain several actions, and the whole transaction succeeds only if all of its actions succeed. It is also important to note that the first authorizer of the first action in a transaction is billed for the CPU and NET resources.
The header file libraries/chain/include/eosio/chain/action.hpp defines the action object and its properties. The most important fields in an action are:
  • name account -- this is the account that contains a WASM smart contract. Each action is simply an invocation of a corresponding WASM block.
  • name name -- somewhat confusingly, this is the name of the action, and it is of type name. Normally the WASM code has a dispatcher that calls different functions according to the action name.
  • vector<permission_level> authorization -- this defines the accounts and permissions that authorize this execution. The transaction must contain sufficient signatures to satisfy the authorization requirements.
  • bytes data -- a byte vector of action arguments. Normally this is a serialized structure that corresponds to the ABI declared by the contract account, but nodeos does not care what is in it at the moment of sending. It is solely the job of the WASM code to deserialize and interpret the data contents.
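Putting the fields together, a token transfer as a client would express it before ABI packing might look like the following (eosio.token is the standard token contract; the account names are hypothetical):

```python
# Hypothetical transfer action before serialization; a client packs the
# "data" map into bytes according to the contract's published ABI.
action = {
    "account": "eosio.token",   # contract account holding the WASM
    "name": "transfer",         # action name resolved by the WASM dispatcher
    "authorization": [
        {"actor": "alice", "permission": "active"},  # who must authorize it
    ],
    "data": {
        "from": "alice",
        "to": "bob",
        "quantity": "1.0000 EOS",
        "memo": "an example transfer",
    },
}
```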
So, before sending a transaction, we need to know what the actions are called and how to serialize their arguments. This information is published in the ABI that the contract account stores alongside its WASM code.
Keep in mind that there may be accounts with WASM but without an ABI (in this case, you would need to disassemble the WASM to understand what the arguments mean), and also accounts with an ABI but without WASM (they simply accept whatever you send to them).
A typical client, such as eosjs, takes the contract name, action name, and a map of arguments in its transact method. Under the hood, it retrieves the ABI of the specified contract account and packs the action arguments according to the ABI specification. In addition, it retrieves the reference block information. As you can see, a lot happens in the preparation of just a single transaction.
Next, a transaction needs elliptic curve signatures that correspond to the authorizations specified in each action. The client normally doesn't know in advance which keys are needed, so it sends yet another RPC request to find out which of its keys need to sign the transaction for it to go through.
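The preparation round trips can be summarized as a list of RPC endpoints (the paths are the real chain API endpoints; the host and the empty payloads are placeholders):

```python
API_HOST = "https://api.example.com"  # placeholder RPC endpoint

def preparation_calls(contract: str) -> list:
    """The typical sequence of chain API calls behind one transact() call."""
    return [
        # reference block and chain ID for the transaction header
        (f"{API_HOST}/v1/chain/get_info", {}),
        # ABI of the contract, needed to pack the action arguments
        (f"{API_HOST}/v1/chain/get_abi", {"account_name": contract}),
        # which of the client's keys must sign (takes the unsigned
        # transaction plus the list of available public keys)
        (f"{API_HOST}/v1/chain/get_required_keys", {}),
        # finally, the signed packed transaction itself
        (f"{API_HOST}/v1/chain/send_transaction", {}),
    ]
```

A caching client can skip the first two calls most of the time by reusing the chain info and ABI it already fetched.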
Each action may spawn additional actions that belong to the same transaction. There are two types of such actions: notifications caused by a require_recipient() call, and new actions generated by the eosio::action::send() method. Neither call causes any immediate execution of a smart contract from within the currently executing action. Instead, the new entries are inserted at different places in the action execution queue: notifications are placed at the head of the queue, so that they are executed first after the current action finishes, while actions generated by send() are pushed to the end of the queue.
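This scheduling can be modeled with a double-ended queue (a toy model of the ordering rules, not the nodeos data structure):

```python
from collections import deque

def run(first_action: str, spawns: dict) -> list:
    """Execute actions; spawns maps an action to (notifications, inline_sends)."""
    queue = deque([first_action])
    executed = []
    while queue:
        action = queue.popleft()
        executed.append(action)
        notifications, inline = spawns.get(action, ([], []))
        # require_recipient(): notifications go to the head of the queue
        for n in reversed(notifications):
            queue.appendleft(n)
        # action::send(): inline actions go to the end of the queue
        queue.extend(inline)
    return executed

# transfer notifies two accounts and sends one inline action:
order = run("transfer", {"transfer": (["on_notify_from", "on_notify_to"],
                                      ["inline_log"])})
# order == ["transfer", "on_notify_from", "on_notify_to", "inline_log"]
```

Note how the notifications run right after the spawning action, while the inline send runs only after everything already queued.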
Nodeos picks actions from the execution queue one by one and spawns a fresh WASM virtual machine for each of them. The virtual machine uses various caching and optimization techniques to speed up the execution, but the rule of thumb is that each execution starts in an empty virtual environment, and only one action is executed at a time.

Broadcasting a transaction

Once a transaction arrives at a node via the /v1/chain/send_transaction RPC call or via the p2p interface, it is evaluated. The node spawns a WASM virtual machine and executes the transaction against the current blockchain state. If all checks pass and the transaction finishes successfully, the changes in state are either discarded or temporarily preserved, depending on the read-mode parameter in the nodeos configuration.
The node broadcasts the transaction to its p2p neighbors if it succeeds, and discards it if it fails.
If the transaction arrives at the node via RPC and the immediate evaluation fails, the failure details are reported back to the client in the RPC response.
If a transaction arrives via the p2p interface and fails during evaluation, it is silently discarded, and the sender does not receive any immediate feedback.
The nodes propagate such speculative transactions across the whole p2p network, and they eventually reach the active block producer nodes.
These transactions are called speculative because, up to this moment, their evaluation was based on the optimistic assumption that the blockchain state and transaction attributes would allow the transaction to be placed in a block.

Block signing

Each block producer in the active schedule has a 6-second interval in which to sign 12 blocks. It collects the speculative transactions that arrived before the new block production starts, evaluates them again, and places them in a new block. A speculative transaction that arrives while a block is being produced may also be placed in that block.
The producer node has a number of configuration parameters that determine when a block is big enough and the node should stop adding transactions to it. Once it finishes, the producer node takes a sha256 hash of the block contents and adds an ECC signature of it, using the configured block signing key.
It is important to note that the block signing key does not need to be associated with any blockchain account, and the general security practice discourages using the same key for block signing and for an account permission.
Once an active producer signs a block, it sends it out to all its p2p neighbors, and those propagate the block further through the p2p network.

Forks and finality

One important factor in network optimization and blockchain stability is that the last block of the current producer should arrive at the next producer in the schedule before it is time for that producer to start creating a new block.
If the last block of the previous producer hasn't arrived at the current producer when it is time to start generating a new block, a microfork occurs: two versions of the same block number are flying across the p2p network. Nodes that have already received and processed the previous version of the block have to roll back their state and import the new one. As a result, blockchain clients may see a transaction that disappears a second later, or that remains in place but in a block with a different order of transactions.
There are a few other failure scenarios that cause microforks: for example, the server clocks at the producer nodes losing their time synchronization. The worst-case scenario is when a third of the active producers are offline, which prevents the last irreversible block (LIB) from advancing. In such a situation, all blocks after the LIB may potentially be discarded.
A block becomes final as soon as a supermajority of active producers has confirmed it twice. In other words, roughly two rounds of 2/3+1 producing windows, 6 seconds each, need to pass before a block is declared final and can no longer be rewritten by a microfork. This results in an irreversibility delay on the order of two to three minutes.
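The rough arithmetic behind this delay, assuming a 21-producer schedule and ignoring the overlap of the confirmation rounds (which makes the observed delay on a live network somewhat shorter):

```python
def finality_delay_seconds(num_producers: int = 21, window_s: int = 6) -> int:
    """Upper-bound estimate of the time from block production to finality."""
    supermajority = num_producers * 2 // 3 + 1   # 15 of 21 producers
    confirmation_rounds = 2                      # each block is confirmed twice
    return confirmation_rounds * supermajority * window_s

# finality_delay_seconds() → 180 seconds for a 21-producer schedule
```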
Work is in progress to reduce the finality time to 1-2 blocks, which will greatly improve the stability and usability of the blockchain.

Transaction failures

As described above, there are many different ways for a transaction to fail.
At the RPC node, the client knows immediately if a transaction fails, typically for one of the following reasons:
  • the smart contract rejected the transaction (by calling check() with a false predicate) because it does not satisfy the logic of the contract (for example, the token balance is too low or the action arguments are incorrect).
  • the transaction took longer than 30ms to execute. This is the hard limit on transaction execution time. The measurement also depends on the physical server, so if a transaction took 29ms on the RPC node, it has a high chance of failing along the way to the block producer.
  • the signatures are insufficient to authorize the transaction. For example, a token transfer can only be authorized by the paying account.
  • the RPC node may have subjective billing enabled. Subjective billing takes previously failed transactions into account and calculates the remaining CPU resource as if those transactions had succeeded for the sender. Thus, a noisy sender with many failing transactions will soon be blocked by subjective billing.
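A toy model of the idea behind subjective billing (a conceptual sketch, not the nodeos implementation):

```python
class SubjectiveBilling:
    """Locally bill an account's CPU, including its failed transactions."""

    def __init__(self, cpu_limit_us: int):
        self.cpu_limit_us = cpu_limit_us
        self.used_us = {}  # account -> locally billed CPU microseconds

    def bill(self, account: str, cpu_us: int) -> None:
        # charged even if the transaction failed and never reached a block
        self.used_us[account] = self.used_us.get(account, 0) + cpu_us

    def would_accept(self, account: str, cpu_us: int) -> bool:
        return self.used_us.get(account, 0) + cpu_us <= self.cpu_limit_us
```

A sender whose failed transactions have exhausted the local allowance is rejected before any WASM execution takes place, protecting the node from free-of-charge CPU abuse.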
Once the transaction is past the RPC node, it is transmitted across the cloud of p2p nodes. Each node may also drop the transaction for one of several reasons:
  • the transaction took longer than 30ms on the local node. The node may have a slower CPU, it could have the Optimized Compiler disabled (eos-vm-oc-enable = false), or its system resources could be overloaded with other tasks.
  • the smart contract logic may not accept the transaction in new conditions. For example, the smart contract table was updated by another transaction that arrived in a block, and our speculative transaction is no longer possible to execute.
  • subjective billing could be enabled for transactions arriving via p2p, so the node deducts the CPU resource from previous account activity, including failed transactions.
Once the transaction arrives at a producer node, there are again a number of possible conditions that lead to dropping it:
  • a producer node also receives transactions via the p2p interface, so all the reasons described above can occur on the producer node too.
  • one of the block producers may go offline, causing a 6-second gap between consecutive blocks. If a transaction's expiration time is too short, it may simply time out during this window.
  • a microfork happens in the blockchain, and the current producer needs to re-evaluate all transactions in the queue again. Depending on the depth of the fork, it may affect transactions which were sent seconds or tens of seconds ago, and they expire if evaluated again.
  • during a microfork, the speculative transactions are re-evaluated in a different order. The smart contract logic may reject transactions because they are evaluated in the wrong sequence. For example, Alice sends X tokens to Bob (who has less than X in his account prior to the transfer), and then Bob sends X tokens to Chris. A fork happens, and the producer tries to evaluate the second transaction first. Bob cannot execute the transfer because his balance is lower than X.
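The ordering failure in the last example can be reproduced in a few lines (a toy balance model; the contract's check() rejecting a transfer is modeled as an exception):

```python
def transfer(balances: dict, src: str, dst: str, amount: int) -> None:
    # models the contract's check(): reject an overdrawn transfer
    if balances.get(src, 0) < amount:
        raise ValueError(f"{src} is overdrawn")
    balances[src] -= amount
    balances[dst] = balances.get(dst, 0) + amount

# original order: both transfers succeed
ok = {"alice": 10, "bob": 0, "chris": 0}
transfer(ok, "alice", "bob", 10)
transfer(ok, "bob", "chris", 10)

# after a microfork, the producer may try Bob's transfer first and fail
forked = {"alice": 10, "bob": 0, "chris": 0}
try:
    transfer(forked, "bob", "chris", 10)
    reordered_failed = False
except ValueError:
    reordered_failed = True
```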

Handling the transaction lifecycle

The node software provides various means for tracking the transaction status and making sure that it's not lost.
Both state_history_plugin and trace_api_plugin export all transactions for each block and roll them back on microforks. They can be used for tracking the status of important transactions, although they require direct access to the blockchain node.
Leap release 3.1 introduces new RPC calls:
  • /v1/chain/send_transaction2 can retry sending a transaction until it appears in a block.
  • /v1/chain/get_transaction_status allows querying a transaction ID and returns its current status, such as UNKNOWN, FAILED, FORKED_OUT, LOCALLY_APPLIED, IN_BLOCK, or IRREVERSIBLE.