Mining and Consensus

Earlier, we viewed mining as a way to generate new bitcoin. More importantly, it is a decentralized security mechanism. Mining secures the bitcoin system and enables the emergence of network-wide consensus without a central authority.

To earn the reward, miners compete to solve a difficult mathematical problem based on a cryptographic hash algorithm. The solution is called Proof-of-Work and is included in the new block.

Decentralized Consensus

Satoshi Nakamoto’s main invention is the decentralized mechanism for emergent consensus. Emergent, because consensus is not achieved explicitly—there is no election or fixed moment when consensus occurs. Instead, it’s an emergent artifact of the asynchronous interaction of thousands of independent nodes, all following simple rules:

  • Independent verification of each transaction and block
  • Independent aggregation of those transactions into new blocks by mining nodes, coupled with demonstrated computation through a Proof-of-Work algorithm
  • Independent verification of the new blocks by every node and assembly into a chain
  • Independent selection, by every node, of the chain with the most cumulative computation demonstrated through Proof-of-Work

Independent Verification of Transactions

In transactions, we saw how a wallet creates and sends transactions.

However, before forwarding transactions to its neighbors, every bitcoin node that receives a transaction will first verify the transaction. This ensures that only valid transactions are propagated across the network.

Each node verifies every transaction against a long checklist of criteria. These conditions can be seen in detail in the functions AcceptToMemoryPool, CheckTransaction, and CheckInputs in Bitcoin Core.

Aggregating Transactions into Blocks

After validating transactions, a bitcoin node will add them to the mempool. In addition, mining nodes will aggregate these transactions into a candidate block.

A candidate block is not yet a valid block, because it doesn’t contain a valid Proof-of-Work.

While miners search for a solution to a block, they constantly listen for new transactions that might be included in the next block. When a new block is found, the other miners compare it against the transactions in their mempool and remove any that were included in that block.

The first transaction in any block is the coinbase transaction. It pays the block rewards and transaction fees to the miner. The coinbase transaction has a special format. Instead of a transaction input specifying a previous UTXO, it has a “coinbase” input.

The “Unlocking Script” (scriptSig) is replaced by coinbase data. This data field can be used by miners in any way they want, because the coinbase input doesn’t need a scriptSig.

Since BIP-34, however, version-2 blocks (blocks with the version field set to 2) must contain the block height as a script “push” operation at the beginning of the coinbase field.
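
The height push can be sketched in a few lines. This is a minimal illustration, not Bitcoin Core's code: the function name encode_height_push is made up for this example, and heights of 16 or less are left out to keep the sketch short. It assumes the usual script number encoding (little-endian, with an extra zero byte when the top bit is set), preceded by a one-byte length.

```python
def encode_height_push(height: int) -> bytes:
    """Serialize a block height as a script push: length byte + little-endian number."""
    if height <= 16:
        # Assumption for this sketch: very small heights are out of scope.
        raise ValueError("this sketch only handles heights above 16")
    body = bytearray()
    n = height
    while n:
        body.append(n & 0xFF)  # emit the least significant byte first
        n >>= 8
    # Script numbers are signed: if the top bit of the last byte is set,
    # append a zero byte so the value is still read as positive.
    if body[-1] & 0x80:
        body.append(0x00)
    return bytes([len(body)]) + bytes(body)


print(encode_height_push(277_316).hex())  # 03443b04
```

The first byte (03) tells the script interpreter to push the next three bytes, which hold the height 277,316 in little-endian order.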

Constructing a Block Header

As we learned earlier, the header contains six fields:

  • version
  • previous block hash
  • merkle root
  • timestamp
  • target
  • nonce

The miner fills in the information. The timestamp, encoded as a Unix “epoch” timestamp, is based on the number of seconds elapsed since midnight UTC, Thursday, January 1, 1970.

The target (which is simply a number) is stored in the block as a “target bits” metric, which is a mantissa-exponent encoding of the target. It contains two parts, an exponent and a coefficient.

The final field is the nonce, which is initialized to zero.

With all the other fields filled, the block header is now complete and the process of mining can begin. The goal is now to find a value for the nonce that results in a block header hash that is equal to or less than the target. The mining node will need to test billions or trillions of nonce values before a nonce is found that satisfies the requirement.
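
To make the six fields concrete, here is a small sketch that serializes an 80-byte header and hashes it with double SHA-256. The helper names are hypothetical, and the field values are the publicly known values of the bitcoin genesis block, used here only because its hash is easy to check. Note that hashes are displayed in the reverse byte order from how they are stored in the header.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def serialize_header(version, prev_hash_hex, merkle_root_hex, timestamp, bits, nonce) -> bytes:
    # All integers are 4-byte little-endian; the two hashes are byte-reversed
    # relative to their usual hex display.
    return (
        struct.pack("<L", version)
        + bytes.fromhex(prev_hash_hex)[::-1]
        + bytes.fromhex(merkle_root_hex)[::-1]
        + struct.pack("<LLL", timestamp, bits, nonce)
    )


def header_hash_hex(header: bytes) -> str:
    # Reverse the digest to get the conventional display order.
    return double_sha256(header)[::-1].hex()


genesis = serialize_header(
    version=1,
    prev_hash_hex="00" * 32,
    merkle_root_hex="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)
print(len(genesis), header_hash_hex(genesis))
```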

Mining the Block

Once the candidate block is created, it’s time to mine the block.

In the simplest terms, mining is the process of hashing the block header repeatedly, changing one parameter, until the resulting hash matches a specific target. The hash function’s result cannot be determined in advance, nor can a pattern be created that will produce a specific hash value. This feature of hash functions means that the only way to produce a hash result matching a specific target is to try again and again, randomly modifying the input until the desired hash result appears by chance.

The key characteristic of a cryptographic hash algorithm is that it is computationally infeasible to find two different inputs that produce the same fingerprint (known as a collision).

The block header contains a number called nonce, which is initially set to 0. This nonce is used to vary the output of a cryptographic function.

The Proof-of-Work must produce a hash that is equal to or less than the target. Although each attempt produces a random output, we can calculate the probability of each possible outcome. Therefore, we can adjust the difficulty by changing the target. A lower target equals higher difficulty.
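
The search itself can be sketched as a simple loop. This is a toy example under stated assumptions: the "header" is an arbitrary byte string rather than a real 80-byte header, the target is chosen to be very easy (roughly 1 in 4,096 hashes qualifies), and the function name mine is made up for illustration.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash, read as a number, is <= target."""
    for nonce in range(max_nonce):
        candidate = header_prefix + nonce.to_bytes(4, "little")
        digest = double_sha256(candidate)
        # The 32-byte digest is interpreted as a little-endian 256-bit integer.
        if int.from_bytes(digest, "little") <= target:
            return nonce, digest[::-1].hex()
    return None  # nonce space exhausted without a solution


toy_target = 2**244  # far easier than any real network target
result = mine(b"toy header without nonce", toy_target)
print(result)
```

Lowering toy_target (for example to 2**236) makes the loop run noticeably longer, which is the relationship between target and difficulty described above.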

Currently, you need to find a block header hash that is equal to or less than:

0000000000000000000ed0eb0000000000000000000000000000000000000000

For comparison, here is the target from 2009:

00000000ffff0000000000000000000000000000000000000000000000000000

As you can see, the current target has many more leading zeros than the 2009 target. More leading zeros mean a lower target, which corresponds to a higher difficulty.
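
A rough way to compare the two targets is to estimate how many hashes are needed on average to meet each one. The sketch below assumes every hash value is equally likely, so a single attempt succeeds with probability (target + 1) / 2^256. The two constants are the targets shown above written as shifted integers, and the function name is hypothetical.

```python
def expected_hashes(target: int) -> float:
    """Average number of attempts before a hash is <= target."""
    return 2**256 / (target + 1)


target_2009 = 0xFFFF << 208       # 00000000ffff0000...0000
target_current = 0x0ED0EB << 160  # 0000000000000000000ed0eb0000...0000

print(f"2009:    about {expected_hashes(target_2009):.3e} hashes per block")
print(f"current: about {expected_hashes(target_current):.3e} hashes per block")
print(f"ratio:   about {expected_hashes(target_current) / expected_hashes(target_2009):.3e}")
```

The 2009 target works out to roughly 2^32 attempts per block; the ratio between the two estimates shows how much harder the current target is.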

Target Representation

Earlier, we saw that the block contains the target in a notation called “target bits” or just “bits”. This notation expresses the Proof-of-Work target as a coefficient/exponent format, with the first two hexadecimal digits for the exponent and the next six hex digits as the coefficient.

This is how we calculate the target:

target = coefficient * 2^(8 * (exponent - 3))
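
As a quick check of this calculation, the sketch below decodes a “bits” value into the full target. The function name bits_to_target is made up for this example; the multiplication by a power of two is written as a bit shift, and the sign bit of the compact format is ignored.

```python
def bits_to_target(bits: int) -> int:
    """Decode the 4-byte 'bits' field: first byte exponent, last three bytes coefficient."""
    exponent = bits >> 24
    coefficient = bits & 0x00FFFFFF
    if exponent >= 3:
        # coefficient * 2^(8 * (exponent - 3)), written as a left shift
        return coefficient << (8 * (exponent - 3))
    # Very small exponents shift the coefficient to the right instead.
    return coefficient >> (8 * (3 - exponent))


# 0x1d00ffff: exponent 0x1d, coefficient 0x00ffff
print(f"{bits_to_target(0x1D00FFFF):064x}")
```

For 0x1d00ffff this prints the 2009 target: eight zeros, then ffff, then 52 zeros.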

The target is adjusted to keep the average block interval at 10 minutes. As the number of miners and their computational power rise, the target is lowered.

Retargeting to Adjust Difficulty

As we saw, the target determines the difficulty and therefore affects how long it takes to find a solution to the Proof-of-Work algorithm.

But how is such an adjustment made in a completely decentralized network?

Retargeting occurs automatically and on every node independently. Every 2,016 blocks, all nodes retarget the Proof-of-Work. The equation for retargeting measures the time it took to find the last 2,016 blocks and compares that to the expected time of 20,160 minutes.

The equation can be summarized as:

New Target = Old Target * (Actual Time of Last 2016 Blocks / 20160 minutes)

To avoid extreme volatility in the difficulty, the retargeting adjustment must be less than a factor of four (4) per cycle.
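
The retargeting rule, including the factor-of-four limit, can be sketched as follows. This is a simplified illustration: the function name is hypothetical, the limit is applied by clamping the measured time span, and details of the real implementation (such as the cap at the maximum allowed target) are left out.

```python
EXPECTED_SECONDS = 2016 * 10 * 60  # 20,160 minutes


def retarget(old_target: int, actual_seconds: int) -> int:
    """New target = old target * (actual time / expected time), limited to a factor of 4."""
    clamped = max(EXPECTED_SECONDS // 4, min(actual_seconds, EXPECTED_SECONDS * 4))
    return old_target * clamped // EXPECTED_SECONDS


old = 1_000_000
print(retarget(old, EXPECTED_SECONDS))       # blocks arrived on schedule: unchanged
print(retarget(old, EXPECTED_SECONDS // 2))  # blocks arrived twice as fast: target halves
print(retarget(old, EXPECTED_SECONDS * 10))  # very slow blocks: increase capped at 4x
```

Blocks that arrive too quickly shrink the target (raising difficulty); blocks that arrive too slowly enlarge it.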

The Extra Nonce Solution

Bitcoin mining has evolved to resolve a fundamental limitation in the structure of the block header. As difficulty and hash power increased, miners needed more space for nonce values than the 32-bit nonce field provides in order to find valid blocks.

The timestamp could be stretched a bit, but moving it too far into the future would cause the block to become invalid. A new source of “change” was needed in the block header. The solution was to use the coinbase transaction as a source of extra nonce values. Because the coinbase script can store between 2 and 100 bytes of data, miners started using that space as extra nonce space, allowing them to explore a much larger range of block header values to find valid blocks.
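
The reason this works is that the coinbase transaction feeds into the merkle root, and the merkle root is part of the header. The sketch below illustrates the effect with toy data: toy_coinbase is a made-up stand-in for a serialized coinbase transaction, and the other transaction IDs are arbitrary hashes.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids) -> bytes:
    """Pairwise-hash transaction IDs up to a single root, duplicating the last one if odd."""
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [double_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def toy_coinbase(extra_nonce: int) -> bytes:
    # Hypothetical placeholder: a real coinbase carries the extra nonce in its script.
    return b"toy coinbase|" + extra_nonce.to_bytes(8, "little")


other_txids = [double_sha256(b"tx-a"), double_sha256(b"tx-b")]


def root_for(extra_nonce: int) -> bytes:
    coinbase_txid = double_sha256(toy_coinbase(extra_nonce))
    return merkle_root([coinbase_txid] + other_txids)


print(root_for(0).hex())
print(root_for(1).hex())  # a different root, so the header nonce range can be searched again
```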

Validating a New Block

Immediately after a miner finds a solution, the mining node transmits the block to all its peers. They receive, validate, and then propagate the new block. As the block ripples out across the network, each node adds it to its own copy of the blockchain.

When a node receives a new block, it will validate the block by checking it against a long list of criteria that must all be met; otherwise, the block is rejected. These criteria can be seen in the Bitcoin Core client in the functions CheckBlock and CheckBlockHeader.

Nodes maintain three sets of blocks:

  • those connected to the main blockchain
  • those that form branches off the main blockchain (secondary chains)
  • blocks that do not have a known parent in the known chains (orphans)

If a valid block is received and no parent is found in the existing chains, that block is considered an “orphan.” Orphan blocks are saved in the orphan block pool where they will stay until their parent is received.

Sometimes, as we will see in Blockchain Forks, the new block extends a chain that is not the main chain. In that case, the node will attach the new block to the secondary chain it extends and then compare the work of the secondary chain to the main chain.

If the secondary chain has more cumulative work than the main chain, the node will reconverge on the secondary chain.

By selecting the greatest-cumulative-work valid chain, all nodes eventually achieve network-wide consensus. Temporary discrepancies between chains are resolved eventually as more work is added, extending one of the possible chains. Mining nodes “vote” with their mining power by choosing which chain to extend by mining the next block. When they mine a new block and extend the chain, the new block itself represents their vote.
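
The selection rule can be sketched by giving each block a work value derived from its target and summing per chain. This is an illustration under assumptions: the function names are made up, chains are represented only as lists of block targets, and block validity is taken for granted.

```python
def block_work(target: int) -> int:
    """Expected number of hashes needed to meet this target."""
    return 2**256 // (target + 1)


def chain_work(targets) -> int:
    return sum(block_work(t) for t in targets)


def select_best_chain(chains: dict) -> str:
    """Return the name of the chain with the most cumulative work."""
    return max(chains, key=lambda name: chain_work(chains[name]))


easy, hard = 2**240, 2**236  # a lower target means more work per block
chains = {
    "main": [easy, easy, easy],  # three easy blocks
    "secondary": [easy, hard],   # two blocks, one much harder
}
print(select_best_chain(chains))
```

Here the two-block chain wins because it represents more computation, which is why the rule is stated in terms of cumulative work rather than block count.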

Blockchain Forks

Forks occur as temporary inconsistencies between versions of the blockchain, which are resolved by eventual reconvergence as more blocks are added to one of the forks.

Example: nodes X and Y each find a different block at nearly the same time and propagate it to their neighbors. Some nodes receive block X first and treat it as the tip of their main chain, keeping block Y on a secondary chain; others do the reverse. Every mining node then builds a candidate block on top of its own main chain and starts mining. Whichever miner finds the next block propagates it, and the chain it extends is now the longest, with the most cumulative work. Nodes that had chosen the other block switch to this chain, which restores consensus about the main chain.

Changing the Consensus Rules

The rules of consensus determine the validity of transactions and blocks. These rules are the basis for collaboration between all bitcoin nodes and are responsible for the convergence of all local perspectives into a single consistent blockchain across the entire network.

While the consensus rules are invariable in the short term and must be consistent across all nodes, they are not invariable in the long term. But it’s much more difficult to upgrade a consensus system than traditional software.

Hard Forks

Earlier, we saw how forks occur naturally as part of the normal operation of the network, and how the network reconverges on a common blockchain after one or more blocks are mined.

There is another scenario in which the network may diverge into following two chains: a change in the consensus rules. This type of fork is called a hard fork, because after the fork the network does not reconverge onto a single chain. Instead, the two chains evolve independently.

In open source software, a fork occurs when a group of developers chooses to follow a different software roadmap and starts a competing implementation of the project. There are two circumstances that will lead to a hard fork in bitcoin: a bug in the consensus rules and a deliberate modification of the consensus rules. For example, Bitcoin XT and Bitcoin Cash are hard forks of bitcoin that did not gain majority support.

Conceptually, we can think of a hard fork as developing in four stages: a software fork, a network fork, a mining fork, and a chain fork.

Hard forks are seen as risky because they force a minority to either upgrade or remain on a minority chain. The risk of splitting the entire system into two competing systems is seen by many as an unacceptable risk. Others see the mechanism of hard fork as an essential tool for upgrading the consensus rules in a way that avoids “technical debt” and provides a clean break with the past.

Soft Forks

Not all consensus rule changes cause a hard fork. Only consensus changes that are forward-incompatible cause a fork. If the change is implemented in such a way that a non-upgraded client still sees the transaction or block as valid under the previous rules, the change can happen without a fork.

The term soft fork was introduced to distinguish this upgrade method from a “hard fork.” In practice, a soft fork is not a fork at all.

Soft forks can be implemented with different methods which don’t require all nodes to upgrade or force non-upgraded nodes out of consensus:

Soft forks redefining NOP opcodes

A number of soft forks have been implemented in bitcoin, based on the re-interpretation of NOP opcodes. Bitcoin Script had ten opcodes reserved for future use, NOP1 through NOP10. Under the consensus rules, the presence of these opcodes in a script is interpreted as a null-potent operator, meaning they have no effect. For example, BIP-65 (CHECKLOCKTIMEVERIFY) reinterpreted the NOP2 opcode.

Other ways to soft fork upgrade

Segwit is another soft fork mechanism, one that does not rely on NOP opcodes and was introduced for a very specific type of consensus change. As we learned earlier, segwit is an architectural change to the structure of a transaction, which moves the unlocking script (the witness) from inside the transaction to a separate witness data structure.

The mechanism used for this is a modification of the locking script of UTXO created under segwit rules, such that non-upgraded clients see the locking script as redeemable with any unlocking script whatsoever. As a result, segwit can be introduced without requiring every node to upgrade or split from the chain.

It is likely that there are other, yet to be discovered, mechanisms by which upgrades can be made in a forward-compatible way as a soft fork.

However, there are some criticisms of soft forks:

  • technical debt: increasing the future cost of code maintenance because of design tradeoffs made in the past. Code complexity in turn increases the likelihood of bugs and security vulnerabilities.
  • validation relaxation: the non-upgraded clients are not validating using the full range of consensus rules, as they are blind to the new rules.
  • irreversible upgrades: if a soft fork upgrade were to be reversed after being activated, any transactions created under the new rules could result in a loss of funds under the old rules.

Mining Pools

In this highly competitive environment, individual miners working alone (solo miners) don’t stand a chance. Miners now collaborate to form mining pools, pooling their hashing power and sharing the reward among thousands of participants.

Mining pools coordinate many hundreds or thousands of miners, over specialized pool-mining protocols. The individual miners configure their mining equipment to connect to a pool server, and specify a bitcoin address, which will receive their share of the rewards.

But how does a mining pool measure the individual contributions?

The answer is to use bitcoin’s Proof-of-Work algorithm to measure each pool miner’s contribution, but set at a lower difficulty so that even the smallest pool miners win a share frequently enough to make it worthwhile to contribute to the pool. Each time a pool miner finds a block header hash that is equal to or less than the pool target, she proves she has done the hashing work to find that result. Thousands of miners trying to find low-value hashes will eventually find one low enough to satisfy the bitcoin network target.
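
The relationship between the pool target and the network target can be sketched as a simple classification. The function name and the numeric targets below are made up for illustration; the only assumption is that the pool target is higher (easier) than the network target.

```python
def classify_hash(hash_value: int, pool_target: int, network_target: int) -> str:
    """Decide what a header hash is worth to the pool."""
    if hash_value <= network_target:
        return "block"  # satisfies the bitcoin network: the pool earns the reward
    if hash_value <= pool_target:
        return "share"  # proves work was done, credited to the miner
    return "nothing"


network_target = 2**180  # toy values, far easier than real ones
pool_target = 2**220     # easier than the network target

print(classify_hash(2**230, pool_target, network_target))
print(classify_hash(2**200, pool_target, network_target))
print(classify_hash(2**170, pool_target, network_target))
```

Every block-winning hash is also a valid share, so counting shares measures each miner's contribution with the same work that occasionally produces a block.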

Managed Pools

Most mining pools are “managed,” meaning that there is a company or individual running a pool server. The owner of the pool server is called the pool operator, and he charges pool miners a percentage fee of the earnings.

The pool server runs specialized software and a pool-mining protocol that coordinate the activities of the pool miners. The pool server is also connected to one or more full bitcoin nodes and has direct access to a full copy of the blockchain database.

Peer-to-peer mining pool (P2Pool)

Managed pools create the possibility of cheating by the pool operator, who might direct the pool effort to double-spend transactions or invalidate blocks. Furthermore, centralized pool servers represent a single-point-of-failure. To resolve this issue, a new pool mining method was proposed and implemented: P2Pool, a peer-to-peer mining pool without a central operator.

P2Pool works by decentralizing the functions of the pool server, implementing a parallel blockchain-like system called a share chain. The share chain allows pool miners to collaborate in a decentralized pool by mining shares on the share chain, which runs at a lower difficulty. When one of the share blocks also achieves the bitcoin network target, it is propagated and included on the bitcoin blockchain.

Even though P2Pool reduces the concentration of power by mining pool operators, it is conceivably vulnerable to 51% attacks against the share chain itself.

Soft Fork Signaling with Block Version

Since soft forks allow non-upgraded clients to continue to operate within consensus, the mechanism for “activating” a soft fork is through miners signaling readiness: a majority of miners must agree that they are ready and willing to enforce the new consensus rules. To coordinate their actions, there is a signaling mechanism that allows them to show their support for a consensus rule change. This mechanism was introduced with the activation of BIP-34 in March 2013 and replaced by the activation of BIP-9 in July 2016.

BIP-34 and BIP-9

In BIP-34, valid blocks had to contain the block height at the beginning of the coinbase data and be identified with a version number greater than or equal to “2.” To signal readiness for BIP-34, miners set the block version to “2” instead of “1.” This did not immediately make version “1” blocks invalid. Activation happened in two steps: once 75% of the most recent 1,000 blocks were version “2,” version “2” blocks were required to contain the block height in the coinbase to be valid; once 95% of the most recent 1,000 blocks were version “2,” version “1” blocks were no longer considered valid.

But this mechanism had some problems; for example, only one soft fork could be activated at a time.

BIP-9 interprets the block version as a bit field instead of an integer. This leaves 29 bits that can be used to independently and simultaneously signal readiness on 29 different proposals. Furthermore, BIP-9 also sets a maximum time for signaling and activation. This way miners don’t need to signal forever. If a proposal is not activated within the TIMEOUT period (defined in the proposal), the proposal is considered rejected.

After TIMEOUT has passed, the signaling bit can be reused for another feature without confusion.
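
Reading the version field as a bit field can be sketched as below. The function name signals is made up for this example. The mask values follow the BIP-9 specification, which requires the top three bits of the version to be 001 and leaves bits 0 through 28 for signaling; the version numbers in the usage lines are illustrative.

```python
TOP_MASK = 0xE0000000  # the top three bits of the 32-bit version
TOP_BITS = 0x20000000  # they must equal 001 for BIP-9 signaling


def signals(version: int, bit: int) -> bool:
    """Return True if a block version signals readiness on the given bit (0-28)."""
    if not 0 <= bit <= 28:
        raise ValueError("BIP-9 signaling bits range from 0 to 28")
    uses_version_bits = (version & TOP_MASK) == TOP_BITS
    return uses_version_bits and bool((version >> bit) & 1)


print(signals(0x20000002, 1))  # version-bits block with bit 1 set
print(signals(0x20000002, 0))  # same block, bit 0 not set
print(signals(0x00000002, 1))  # old-style integer version 2: not a BIP-9 signal
```

Because each bit is checked on its own, a single block can signal for several proposals at once.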

Consensus Software Development

Power is diffused between multiple constituencies such as miners, core developers, wallet developers, exchanges, merchants and end users. Decisions cannot be made unilaterally by any of these constituencies.

It is important to recognize that there is no perfect solution for consensus development. Both hard forks and soft forks involve tradeoffs. For some types of changes, soft forks may be a better choice; for others, hard forks may be a better choice.

Changes are difficult to make and require compromise. As a result, the whole system evolves slowly, but that is also its greatest strength.