Transactions

Transactions are the most important part of the system. Everything else is designed to create, validate, or add transactions to the blockchain.

This chapter is about:

  • the various forms of transactions
  • what they contain
  • how they are created and verified
  • how they become part of the permanent record

Transactions in Detail

Earlier, we learned about a block explorer where you see details about a transaction. But much of that information is not actually stored in the transaction.

A transaction in a block explorer vs. the decoded raw transaction (below):
{
  "version": 1,
  "locktime": 0,
  "vin": [
    {
      "txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18",
      "vout": 0,
      "scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf",
      "sequence": 4294967295
    }
  ],
  "vout": [
    {
      "value": 0.01500000,
      "scriptPubKey": "OP_DUP OP_HASH160 ab68025513c3dbd2f7b92a94e0581f5d50f654e7 OP_EQUALVERIFY OP_CHECKSIG"
    },
    {
      "value": 0.08450000,
      "scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG"
    }
  ]
}

In the raw transaction, a lot of information is missing, like coins, senders, recipients, balances, accounts or addresses. All those things are constructed at a higher level for the benefit of the user, to make things easier to understand.

Transaction Outputs

The fundamental building block of a bitcoin transaction is a transaction output. Bitcoin full nodes track all available and spendable outputs, known as unspent transaction outputs (UTXO).

The UTXO set grows when new unspent outputs are created and shrinks when they are consumed, at which point they become spent transaction outputs (STXO).

Example of a transaction chain with the matching UTXO set

But at some point, outputs had to be created. The first transaction in each block is the coinbase transaction. This transaction is placed there by the “winning” miner and does not consume UTXO. It has a special type of input called “coinbase”.
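As a rough illustration of how the UTXO set grows and shrinks, the following Python sketch models it as a dictionary keyed by (txid, output index). The transaction names and amounts are made up for this example, and the script and amount checks a real node performs are left out.

```python
# Toy model of a UTXO set: maps (txid, output index) -> value in satoshis.
utxo_set = {}


def apply_transaction(txid, inputs, output_values):
    """Consume the referenced outputs and add the newly created ones.

    inputs is a list of (txid, index) pairs. A coinbase transaction
    passes an empty list because it does not consume any UTXO.
    """
    for outpoint in inputs:
        if outpoint not in utxo_set:
            raise ValueError(f"{outpoint} is missing or already spent")
    for outpoint in inputs:
        del utxo_set[outpoint]               # the UTXO becomes an STXO
    for index, value in enumerate(output_values):
        utxo_set[(txid, index)] = value      # a new UTXO is created


# Hypothetical chain: a coinbase output is split into a payment and change.
apply_transaction("coinbase-tx", [], [5_000_000_000])
apply_transaction("tx-a", [("coinbase-tx", 0)], [1_000_000_000, 3_999_990_000])
print(utxo_set)
```

After the second call, the coinbase output is gone from the set and the two outputs of the hypothetical "tx-a" have taken its place.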

Transaction outputs consist of two parts:

  • An amount of bitcoin, denominated in satoshis
  • A cryptographic puzzle that determines the conditions required to spend the output

This puzzle is also known as a locking script, a witness script, or a scriptPubKey.

"vout": [
  {
    "value": 0.01500000,
    "scriptPubKey": "OP_DUP OP_HASH160 ab68025513c3dbd2f7b92a94e0581f5d50f654e7 OP_EQUALVERIFY OP_CHECKSIG"
  },
  {
    "value": 0.08450000,
    "scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG"
  }
]

When we look at the decoded transaction from above, each of the two outputs is defined by a value and a cryptographic puzzle. Bitcoin Core labels the puzzle scriptPubKey and shows us a human-readable representation of the script.

But before we look at how this works, we need to understand the overall structure of transaction inputs and outputs.

Transaction serialization (outputs)

Most bitcoin libraries store transactions internally in data structures. When transactions are transmitted over the network or exchanged between applications, they are serialized. Serialization is the process of converting the internal representation of a data structure into a format that can be transmitted one byte at a time, also known as a byte stream. Serialization is most commonly used for encoding data structures for transmission over a network or for storage in a file.

Size | Field | Description
8 bytes (little-endian) | Amount | Bitcoin value in satoshis (10^-8 bitcoin)
1–9 bytes (VarInt) | Locking-Script Size | Locking-Script length in bytes, to follow
Variable | Locking-Script | A script defining the conditions needed to spend the output

Table: Transaction output serialization

The process of converting from the byte-stream representation of a transaction to a library’s internal representation data structure is called deserialization or transaction parsing.

Transaction serialized in hexadecimal notation:

0100000001186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd734d2804fe65fa35779000000008b483045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e381301410484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adfffffffff02**60e31600000000001976a914ab68025513c3dbd2f7b92a94e0581f5d50f654e788acd0ef8000000000001976a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac**00000000
  • The two outputs are highlighted
  • 0.015 bitcoin = 1,500,000 satoshis = 16 e3 60 in hexadecimal.
  • In the serialized transaction, the value 16 e3 60 is encoded in little-endian (least-significant-byte-first) byte order, so it looks like 60 e3 16.
  • The scriptPubKey length is 25 bytes, which is 19 in hexadecimal.
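To make the field layout concrete, here is a minimal Python sketch that parses the two highlighted outputs from the serialized transaction. The function names are my own and not taken from any bitcoin library, and the parser skips all error handling.

```python
import struct


def read_varint(data, offset):
    """Decode a variable-length integer; returns (value, new_offset)."""
    first = data[offset]
    if first < 0xFD:
        return first, offset + 1
    size = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    value = int.from_bytes(data[offset + 1:offset + 1 + size], "little")
    return value, offset + 1 + size


def parse_outputs(data, count):
    """Parse `count` serialized outputs into (satoshis, script) tuples."""
    outputs = []
    offset = 0
    for _ in range(count):
        (amount,) = struct.unpack_from("<Q", data, offset)  # 8 bytes, little-endian
        offset += 8
        script_len, offset = read_varint(data, offset)
        script = data[offset:offset + script_len]
        offset += script_len
        outputs.append((amount, script))
    return outputs


# The highlighted part of the serialized transaction (two outputs).
OUTPUTS_HEX = (
    "60e3160000000000"                                     # amount
    "19"                                                   # script length (25)
    "76a914ab68025513c3dbd2f7b92a94e0581f5d50f654e788ac"   # locking script
    "d0ef800000000000"
    "19"
    "76a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac"
)

outputs = parse_outputs(bytes.fromhex(OUTPUTS_HEX), 2)
for amount, script in outputs:
    print(amount, len(script), script.hex())
```

The first parsed amount is 1,500,000 satoshis, which matches the 0.015 bitcoin of the decoded transaction.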

Transaction Inputs

Transaction inputs identify (by reference) which UTXO will be consumed and provide proof of ownership through an unlocking script.

There is only one input in our transaction above:

"vin": [
  {
    "txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18",
    "vout": 0,
    "scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf",
    "sequence": 4294967295
  }
]
The input contains four elements:

  • A transaction ID (txid), referencing the transaction that contains the UTXO being spent
  • An output index (vout), identifying which UTXO from that transaction is referenced (first one is zero)
  • A scriptSig, which satisfies the conditions placed on the UTXO, unlocking it for spending
  • A sequence number (to be discussed later)

But we don’t know the value of the UTXO yet. To find this information, we must retrieve the parent transaction that contains the output. The referenced UTXO is also used to calculate the transaction fees.

Transaction serialization (inputs)

When transactions are serialized for transmission on the network, their inputs are encoded into a byte stream, like the outputs.

Size | Field | Description
32 bytes | Transaction Hash | Pointer to the transaction containing the UTXO to be spent
4 bytes | Output Index | The index number of the UTXO to be spent; first one is 0
1–9 bytes (VarInt) | Unlocking-Script Size | Unlocking-Script length in bytes, to follow
Variable | Unlocking-Script | A script that fulfills the conditions of the UTXO locking script
4 bytes | Sequence Number | Used for locktime or disabled (0xFFFFFFFF)

Table: Transaction input serialization

The transaction from above, serialized and presented in hexadecimal notation, will look like this (input is highlighted):

0100000001**186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd734d2804fe65fa35779000000008b483045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e381301410484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adfffffffff**0260e31600000000001976a914ab68025513c3dbd2f7b92a94e0581f5d50f654e788acd0ef8000000000001976a9147f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a888ac00000000
  • The transaction ID is serialized in reversed byte order, so it starts with (hex) 18 and ends with 79
  • The output index is a 4-byte group of zeros
  • The length of the scriptSig is 139 bytes, or 8b in hex
  • The sequence number is set to FFFFFFFF
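The same fields can be read back with a few lines of Python. This is only a sketch: the function name is my own, and it assumes the script length fits in a single VarInt byte, which holds for this input (0x8b).

```python
import struct

# The highlighted input, split into its fields for readability.
INPUT_HEX = (
    "186f9f998a5aa6f048e51dd8419a14d8a0f1a8a2836dd734d2804fe65fa35779"  # txid, reversed
    "00000000"                                                          # output index
    "8b"                                                                # scriptSig length (139)
    "48"                                                                # push 72 bytes
    "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb"
    "02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813"
    "01"                                                                # SIGHASH_ALL
    "41"                                                                # push 65 bytes
    "0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5"
    "412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf"
    "ffffffff"                                                          # sequence number
)


def parse_input(data):
    """Parse one serialized input (single-byte script length only)."""
    txid = data[0:32][::-1].hex()                  # undo the reversed byte order
    (vout,) = struct.unpack_from("<I", data, 32)   # 4 bytes, little-endian
    script_len = data[36]                          # assumes a value below 0xfd
    script_sig = data[37:37 + script_len]
    (sequence,) = struct.unpack_from("<I", data, 37 + script_len)
    return txid, vout, script_sig, sequence


txid, vout, script_sig, sequence = parse_input(bytes.fromhex(INPUT_HEX))
print(txid, vout, len(script_sig), hex(sequence))
```

Reversing the first 32 bytes gives back the txid shown in the decoded transaction.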

Transaction Fees

Most wallets calculate and include transaction fees automatically (in satoshis per byte). However, if you are constructing transactions programmatically or using a command-line interface, you must account for and include these fees manually.

Fees are calculated based on the size of the transaction in kilobytes, not the value of the transaction in bitcoin. They are influenced by market forces, based on network capacity and transaction volume.

In Bitcoin Core, fee relay policies are set by the minrelaytxfee option, which defaults to 0.00001 BTC per kilobyte. But any bitcoin service that creates transactions must implement dynamic fees. This can be achieved through a third-party fee estimation service (like https://bitcoinfees.earn.com/) or with a built-in fee estimation algorithm.

Most services offer a choice between high, medium, or low priority fees.

Transaction fees are implied as the difference between the sum of the inputs and the sum of the outputs. Therefore, you need to make sure to include a change output; otherwise, the whole remainder is paid as a transaction fee.
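A small Python sketch shows why the change output matters. It uses the two output values of the example transaction; the input value of 0.1 bitcoin is an assumption for illustration, since the value of the referenced UTXO is not given here.

```python
def fee(input_values, output_values):
    """Implied fee in satoshis: whatever the inputs hold that no output claims."""
    return sum(input_values) - sum(output_values)


# Assumption: the UTXO being spent is worth 0.1 bitcoin (10,000,000 satoshis).
inputs = [10_000_000]
payment, change = 1_500_000, 8_450_000

fee_with_change = fee(inputs, [payment, change])
fee_without_change = fee(inputs, [payment])      # change output forgotten

print(fee_with_change, fee_without_change)
```

Under that assumption, leaving out the change output turns the entire remainder of the input into the fee.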

Transaction Scripts and Script Language

The bitcoin transaction script language is called Script. Both the locking script placed on a UTXO and the unlocking script are written in this scripting language. They are executed together to validate the transaction.

Today, most transactions are based on a script called a Pay-to-Public-Key-Hash script, which is like “Payment to xxx bitcoin address”. But locking scripts can be written to express a vast variety of complex conditions.

Concepts of Script

Turing Incompleteness

Script contains many operators, but it has no loops and no complex flow control beyond conditional flow control. This limitation keeps execution times predictable and prevents infinite loops or other logic bombs from being embedded in transactions.

Stateless Verification

The bitcoin transaction script language is stateless, which means that all the information needed to execute a script is contained within the script.

Therefore, a script will execute the same way on any system. Because the outcome is predictable, everyone can verify the script.

Script construction (Lock + Unlock)

There are two types of scripts to verify transactions:

  • The locking script (also called scriptPubKey, witness script, or cryptographic puzzle) is a spending condition placed on an output. Usually it contains a public key or a bitcoin address (public key hash).
  • The unlocking script (also called scriptSig or witness) satisfies the condition above. It usually contains a digital signature produced by the user’s wallet.

Every validating bitcoin node retrieves the UTXO referenced by the input and copies the locking script from that UTXO. The unlocking and locking scripts are then executed in sequence.

The script execution stack:

Bitcoin’s scripting language is stack-based: it works on a stack, a last-in-first-out (LIFO) data structure. The script is executed by processing each item from left to right. Numbers (data constants) are pushed onto the stack. Operators push or pop one or more parameters from the stack, act on them, and might push a result onto the stack.

A simple script

First of all, you can find details on the available script operators and functions here.

Locking Script:

3 OP_ADD 5 OP_EQUAL
  • OP_ADD pops two numbers from the stack, adds them, and pushes the result onto the stack
  • OP_EQUAL pops two items from the stack and pushes TRUE if they are equal, otherwise FALSE

Unlocking Script:

2

Now, they are combined and executed:
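The execution can be traced with a minimal stack machine in Python. This sketch supports only number pushes, OP_ADD and OP_EQUAL, and it represents TRUE and FALSE as Python booleans, which is a simplification of how Script encodes values.

```python
def run_script(tokens):
    """Execute the tokens left to right and print the stack after each step."""
    stack = []
    for token in tokens:
        if token == "OP_ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif token == "OP_EQUAL":
            b = stack.pop()
            a = stack.pop()
            stack.append(a == b)
        else:
            stack.append(int(token))     # data constant: push it
        print(f"{token:<9} -> {stack}")
    return stack


unlocking_script = "2"
locking_script = "3 OP_ADD 5 OP_EQUAL"
final_stack = run_script((unlocking_script + " " + locking_script).split())
```

The trace shows 2 and 3 being pushed, OP_ADD replacing them with 5, the constant 5 being pushed, and OP_EQUAL leaving a single TRUE on the stack.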

Transactions are valid if the top result on the stack is

  • TRUE
  • any other non-zero value

Transactions are invalid if

  • the top value on the stack is FALSE (zero or a zero-length empty value, noted as {})
  • the stack is empty after script execution
  • script execution is halted explicitly by an operator, such as OP_VERIFY or OP_RETURN, or by a conditional terminator such as OP_ENDIF

Before 2010, the unlocking and locking scripts were concatenated and executed in sequence. This was changed to prevent a malformed unlocking script from corrupting the locking script.

Now, the scripts are executed in this way:

  1. The unlocking script is executed with the stack execution engine.
  2. If it executes without errors, the main stack is copied and the locking script is executed.
  3. If the locking script, run on the stack data copied from the unlocking script, results in TRUE, the input is valid.
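The separation of the two scripts can be sketched as follows. This is a simplified Python model with a three-operation interpreter of my own, not how any real node is implemented; it only illustrates that the locking script starts from a copy of the stack left behind by the unlocking script.

```python
def execute(tokens, stack):
    """Run tokens on the given stack (numbers, OP_ADD and OP_EQUAL only)."""
    for token in tokens:
        if token == "OP_ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(a == b)
        else:
            stack.append(int(token))
    return stack


def validate(unlocking_script, locking_script):
    """Two-stage evaluation: unlocking script first, then the locking script."""
    try:
        # Step 1: the unlocking script runs on its own, starting with an empty stack.
        unlock_stack = execute(unlocking_script.split(), [])
        # Step 2: the resulting stack is copied and the locking script runs on the copy.
        result = execute(locking_script.split(), list(unlock_stack))
    except (IndexError, ValueError):
        return False                      # any script error makes the input invalid
    # Step 3: valid only if a true value is left on top of the stack.
    return bool(result) and bool(result[-1])


print(validate("2", "3 OP_ADD 5 OP_EQUAL"))
print(validate("4", "3 OP_ADD 5 OP_EQUAL"))
```

Because the unlocking script is executed separately, it cannot inject operators into the middle of the locking script; only the data it leaves on the stack carries over.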

Pay-to-Public-Key-Hash (P2PKH)

The vast majority of transactions processed on the bitcoin network spend outputs locked with a Pay-to-Public-Key-Hash or P2PKH script. These outputs contain a locking script that locks the output to a public key hash (bitcoin address). It can be unlocked by presenting a public key and a digital signature created by the corresponding private key.

locking script:

OP_DUP OP_HASH160 <Public_Key_Hash> OP_EQUALVERIFY OP_CHECKSIG

unlocking script:

<Cafe Signature> <Cafe Public Key>

Here is how the combined script is executed:
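The step-by-step evaluation can be traced with a symbolic Python sketch. No real cryptography happens here: hash160 and checksig are placeholders of my own that only manipulate labels, so the trace shows the stack mechanics and nothing more.

```python
def hash160(pubkey):
    # Symbolic placeholder for RIPEMD160(SHA256(pubkey)); no real hashing.
    return f"HASH160({pubkey})"


def checksig(signature, pubkey):
    # Symbolic placeholder; a real node verifies an ECDSA signature here.
    return signature == f"SIG_BY({pubkey})"


def run_p2pkh(unlocking, locking):
    """Evaluate the combined script and print the stack after each item."""
    stack = []
    for token in unlocking + locking:
        if token == "OP_DUP":
            stack.append(stack[-1])
        elif token == "OP_HASH160":
            stack.append(hash160(stack.pop()))
        elif token == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False              # script halts, the input is invalid
        elif token == "OP_CHECKSIG":
            pubkey, signature = stack.pop(), stack.pop()
            stack.append(checksig(signature, pubkey))
        else:
            stack.append(token)           # data push
        print(f"{token:<22} {stack}")
    return bool(stack) and stack[-1] is True


cafe_pubkey = "CAFE_PUBKEY"
locking = ["OP_DUP", "OP_HASH160", hash160(cafe_pubkey), "OP_EQUALVERIFY", "OP_CHECKSIG"]
unlocking = [f"SIG_BY({cafe_pubkey})", cafe_pubkey]
valid = run_p2pkh(unlocking, locking)
```

The signature and public key are pushed first. OP_DUP and OP_HASH160 turn the duplicated public key into its hash, OP_EQUALVERIFY compares it with the hash in the locking script, and OP_CHECKSIG consumes the remaining public key and signature.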

Digital Signatures (ECDSA)

We use digital signatures to prove ownership of a private key. Every time you see the script functions OP_CHECKSIG, OP_CHECKSIGVERIFY, OP_CHECKMULTISIG or OP_CHECKMULTISIGVERIFY, the unlocking script must contain an ECDSA signature. ECDSA (Elliptic Curve Digital Signature Algorithm) is the algorithm used for digital signatures based on elliptic curve private/public key pairs.

Each transaction input is signed independently. Multiple parties can collaborate to construct transactions and sign only one input each.

A digital signature scheme consists of two parts. The first part is an algorithm for creating a signature from a message (data from the transaction), using a private key (the signing key). The second part is an algorithm that allows anyone to verify the signature, given the message and a public key.

Creating a digital signature

More precisely, what gets signed is a hash of a specific subset of the data in the transaction. Together with the private key, this hash determines the resulting signature:

Sig = Fsig(Fhash(m), dA)
  • dA is the signing private key
  • m is the transaction (or parts of it)
  • Fhash is the hashing function
  • Fsig is the signing algorithm
  • Sig is the resulting signature

The function Fsig produces a signature Sig that is composed of two values, commonly referred to as R and S:

Sig = (R, S)

These two values are then serialized into a byte stream using an international standard encoding scheme called the Distinguished Encoding Rules, or DER.

I explain the calculation in more detail at the end of this post.
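As a sketch of what this serialization looks like, the following Python functions encode R and S as a DER sequence of two integers. The helper names are my own, and the code assumes the total length fits in one byte, which is the case for 256-bit values. R and S are read off the scriptSig of the example transaction.

```python
def der_integer(value):
    """Encode a non-negative integer as a DER INTEGER (tag 0x02)."""
    raw = value.to_bytes(max(1, (value.bit_length() + 7) // 8), "big")
    if raw[0] & 0x80:
        raw = b"\x00" + raw        # leading zero keeps the number positive
    return b"\x02" + bytes([len(raw)]) + raw


def der_signature(r, s):
    """Encode (R, S) as a DER SEQUENCE (tag 0x30) of two INTEGERs."""
    body = der_integer(r) + der_integer(s)
    return b"\x30" + bytes([len(body)]) + body   # single-byte length assumed


# R and S as they appear in the scriptSig of the example transaction.
R = 0x884D142D86652A3F47BA4746EC719BBFBD040A570B1DECCBB6498C75C4AE24CB
S = 0x4B9F039FF08DF09CBE9F6ADDAC960298CAD530A863EA8F53982C09DB8F6E3813

encoded = der_signature(R, S)
print(encoded.hex())
```

The result starts with 30 45 02 21 00, followed by R, then 02 20 and S, which is the byte sequence at the start of the scriptSig. The [ALL] suffix shown by Bitcoin Core is the SIGHASH byte appended after the DER data.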

Verifying the Signature

To verify the signature, you need

  • the signature (R and S)
  • the serialized transaction
  • the public key

The signature verification algorithm takes the message, the signer’s public key and the signature and returns TRUE if the signature is valid.

Signature HASH Types (SIGHASH)

Digital signatures imply a commitment by the signer to specific inputs and outputs. A SIGHASH flag indicates which part of the transaction data is included in the hash that is signed. This single byte is appended to the signature, and its base type is either ALL, NONE, or SINGLE.

In addition, there is a modifier flag called SIGHASH_ANYONECANPAY, which can be combined with each of these types. If it is set, only one input is signed.

Summary of different sighash combinations

The SIGHASH flag is appended to the end of the truncated and serialized transaction, and the result is hashed. This hash is the “message” that is signed. Depending on which SIGHASH flag is used, different parts of the transaction are truncated.

How different SIGHASH types are used:

  • 0x01 commits to the entire transaction. It’s the most common signature form.
  • 0x81 can be used to make a “crowdfunding”-style transaction.
  • 0x02 commits only to the inputs; the output locking scripts can still be changed. This can serve as a blank check.
  • 0x82 can be used as a dust collector, where tiny UTXO can be donated for anyone to aggregate and spend whenever they want.
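The flag values above follow from combining a base type with the ANYONECANPAY bit. A minimal Python sketch of that decoding, with a function name and wording of my own:

```python
SIGHASH_ALL = 0x01
SIGHASH_NONE = 0x02
SIGHASH_SINGLE = 0x03
SIGHASH_ANYONECANPAY = 0x80      # modifier bit, combined with a base type

BASE_TYPES = {
    SIGHASH_ALL: ("ALL", "all outputs"),
    SIGHASH_NONE: ("NONE", "none of the outputs"),
    SIGHASH_SINGLE: ("SINGLE", "only the output with the same index as the signed input"),
}


def describe_sighash(flag):
    """Return a short description of what a SIGHASH byte commits to."""
    name, outputs = BASE_TYPES[flag & 0x7F]      # strip the modifier bit
    if flag & SIGHASH_ANYONECANPAY:
        return f"{name}|ANYONECANPAY: signs one input and {outputs}"
    return f"{name}: signs all inputs and {outputs}"


for flag in (0x01, 0x02, 0x03, 0x81, 0x82, 0x83):
    print(hex(flag), describe_sighash(flag))
```

For example, 0x81 is ALL (0x01) with the ANYONECANPAY bit (0x80) set, so the signature covers one input and all outputs. That is what makes the crowdfunding case work: other parties can add inputs without invalidating it.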

Sidenote: The SIGHASH type is added before the transaction is signed, so it can’t be edited afterwards.

In most wallets, you don’t have the option to modify SIGHASH flags; the standard is 0x01. The other flags come in handy in special-purpose bitcoin applications.

New proposals have been made, like Bitmask Sighash Modes, to expand the SIGHASH system.

ECDSA Math

Now, we look at the Fsig function in more detail.

To calculate the R and S values, the algorithm first generates an ephemeral key pair. This temporary key pair is based on a random number k (the ephemeral private key), which we use to get the ephemeral public key (P):

P = k * G
  • G is the elliptic curve generator point

Next, we calculate the S value of the Signature:

S = k^{-1} * (Hash(m) + dA * R) mod n
  • k is the ephemeral private key
  • R is the x coordinate of the ephemeral public key
  • dA is the signing private key
  • m is the transaction data
  • n is the prime order of the elliptic curve

For verification, we calculate backwards to get the ephemeral public key P:

P = S^{-1} * Hash(m) * G + S^{-1} * R * Qa

where:

  • R and S are the signature values
  • Qa is the signer’s public key (Qa = dA * G)
  • m is the transaction data that was signed
  • G is the elliptic curve generator point

If the x coordinate of the calculated point P is equal to R, then the verifier can conclude that the signature is valid.
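The formulas above can be tried out with a compact, pure-Python sketch on the secp256k1 curve that bitcoin uses. It is meant for illustration only: it is slow, not constant-time, uses a toy private key and a fixed k chosen by me for reproducibility, and uses double SHA-256 as the hash. Never use code like this to sign real transactions.

```python
import hashlib

# secp256k1 domain parameters
P_FIELD = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P_FIELD, -1, P_FIELD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD)
    slope %= P_FIELD
    x3 = (slope * slope - x1 - x2) % P_FIELD
    y3 = (slope * (x1 - x3) - y1) % P_FIELD
    return (x3, y3)


def scalar_mul(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def hash_message(message):
    digest = hashlib.sha256(hashlib.sha256(message).digest()).digest()
    return int.from_bytes(digest, "big")


def sign(message, d_a, k):
    """R = x coordinate of k*G, S = k^-1 * (Hash(m) + dA * R) mod n."""
    r = scalar_mul(k, G)[0] % N
    s = pow(k, -1, N) * (hash_message(message) + d_a * r) % N
    return r, s


def verify(message, q_a, r, s):
    """Recompute P = S^-1*Hash(m)*G + S^-1*R*Qa and compare its x with R."""
    s_inv = pow(s, -1, N)
    u1 = s_inv * hash_message(message) % N
    u2 = s_inv * r % N
    point = point_add(scalar_mul(u1, G), scalar_mul(u2, q_a))
    return point is not None and point[0] % N == r


private_key = 0xC0FFEE                      # toy key, for illustration only
public_key = scalar_mul(private_key, G)
message = b"serialized transaction data"
r, s = sign(message, private_key, k=0x1234567890ABCDEF)
print(verify(message, public_key, r, s))
```

Verification works because S^{-1} * (Hash(m) + R * dA) equals k, so the recomputed point is k * G again.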

Warning:

The value of k itself is not important, as long as it is random and never used twice. If the same k is used to sign two different messages with the same private key, the signing private key can be calculated.
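The algebra behind this warning fits in a few lines of Python. In this sketch, the hashes, the private key and k are arbitrary numbers chosen for the demonstration. R is also an arbitrary stand-in for the x coordinate of k * G, because the recovery only relies on the S formula and not on the curve arithmetic.

```python
# Order of the secp256k1 curve
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def s_value(message_hash, d_a, k, r):
    """S = k^-1 * (Hash(m) + dA * R) mod n"""
    return pow(k, -1, N) * (message_hash + d_a * r) % N


# Demonstration values (all arbitrary assumptions).
d_a = 0xC0FFEE                 # the secret signing key
k = 0x1234567890ABCDEF         # ephemeral key, wrongly used twice
r = 0xABCDEF0123456789         # stand-in for the x coordinate of k * G
h1, h2 = 0x1111, 0x2222        # hashes of two different messages

s1 = s_value(h1, d_a, k, r)
s2 = s_value(h2, d_a, k, r)

# An observer who sees (r, s1), (r, s2) and both hashes can solve for k and dA:
#   s1 - s2 = k^-1 * (h1 - h2)   =>   k  = (h1 - h2) / (s1 - s2)
#   s1 * k  = h1 + dA * r        =>   dA = (s1 * k - h1) / r
recovered_k = (h1 - h2) * pow((s1 - s2) % N, -1, N) % N
recovered_d = (s1 * recovered_k - h1) * pow(r, -1, N) % N

print(recovered_k == k, recovered_d == d_a)
```

Two signatures with the same R value are enough for this, which is why a repeated k is immediately visible on the blockchain.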

To avoid this vulnerability, we use the industry-standard algorithm defined in RFC 6979. Instead of relying on a random number generator, it derives k deterministically from the private key and the transaction data.

This removes the risk of using the same k for two different transactions.