You are currently viewing What is the Merkle tree?.  The Merkle tree (a hash tree) is an… |  by Sunflower Corporation |  Coinmonks |  Sep 2022

What is the Merkle tree?. The Merkle tree (a hash tree) is an… | by Sunflower Corporation | Coinmonks | Sep 2022

The Merkle tree (a hash tree) is an algorithm that generates a single hash for a large amount of data. The method is used to check the integrity of files and verify data. How does it work? Why do we need it? Let’s find out!

A hash tree can be represented as a structure, with branches branching out to intermediate nodes from the base. At the ends of the branches, leaves representing data fragments are placed. The root hash is located at the tree’s base (Merkle root). The latter is a required component of the bitcoin block header.

Each transaction can be verified using the root hash. You only need to download the block header and the operation’s authentication path. The Merkle tree reduces the number of calculations required, allowing for simplified payment verification (SPV).

Professor Ralph Merkle developed the hash tree concept. In 1979, he invented a method for representing digital signatures. Stanford University owns the technology’s patent.

The scientist proposed using a binary hash tree. Merkle also made significant contributions to the advancement of cryptography. He is known for the 1987 publication “Digital Signature Based on a Сonventional Encryption Function”.

A centralized system provides data from a single source on which all users rely. It ensures that the information received is correct.

A distributed database is what the blockchain is. Its contents are stored on a network of nodes. The node cannot accept messages from other participants unless they are verified. The node must decide whether or not the block contains valid transactions.

Merkle trees can be used to save money on computing. They enable you to reduce the amount of data downloaded and optimize verification through hashing.

Bitcoin, Ethereum, and other cryptocurrencies use this method. It’s used to get a data string that validates a set of transactions. In addition, the algorithm is used in file systems and databases. Information is checked for errors and synchronized using Merkle trees.

The Merkle tree is built from the ground up. Hashing data fragments yields values ​​in leaf vertices. The nodes of the next level contain a hash of the sum of the two children. Concatenation is a method of combining data. The operation is repeated for nodes of the following levels until a single hash is obtained. If the number of elements is odd, one of them is duplicated or transferred to the next level unchanged.

When constructing a tree, a single hash is obtained, which is known as the Merkle root. It represents all data fragments. As a result, the Merkle tree is a unidirectional hash function.

The algorithm allows you to create a binary structure with nodal values ​​composed of two strings. The latter property allows for the verification of large amounts of data without recalculating hashes for each fragment. In this case, the computational cost of determining the authenticity of a single element is much lower.

To ensure the array’s correctness and integrity, the root hash must be compared to the reference value. Fragments can be transaction data or parts of files.

The bitcoin blockchain is built up from fragments that are written to the chain’s end. It contains information on user transfers. Because the number of transactions and the amount of information are both variable, the block does not have a fixed size.

Bitcoin nodes create headers to optimize calculations. They consist of the following elements:

  • Block version number;
  • Hash of the previous block;
  • The root of the Merkle tree;
  • timestamp;
  • The goal of mining complexity;
  • A one-time code (nonce) used when generating the block.

The header is 80 bytes long and does not include any transactions. Because it is generated every ten minutes, the amount of data increases by 4.2 megabytes per year.

Transaction information is hashed, allowing you to obtain transaction IDs. The transfer data is saved in hexadecimal format. The root hash represents all transactions in the block. A Merkle tree is constructed to find the latter. The following algorithm is used to process the data:

  1. There are hashes (identifiers) of transactions included in the block: hash(L1), hash(L2), hash (L3) and so on. They form the leaves of the tree.
  2. Hashes from the sum of two adjacent identifiers are placed on the next level: hash(hash(L1) + hash(L2)). In a binary tree, the number of nodes at each level must be even. Otherwise, the hash from the last cell is duplicated and placed in an additional element.
  3. The process of hashing the amount of data is repeated until the Merkle root is obtained.

Each transactions’s validity is confirmed by the resulting hash. Nodes only use the headers of previous blocks when forming a blockchain.

The Segregated Witness protocol was updated in August of 2017. It employs a different transactional data structuring, which reduces block size. The update’s adoption has reduced the load on the bitcoin blockchain.

Hash trees make it easier to verify that a transaction belongs to a specific block and ensure the integrity of data while it is being transmitted. The method is required for easier payment verification. In the bitcoin white paper, Satoshi Nakamoto suggested using SPV.

If the node has sufficient computing power, it can load all of the blocks and calculate the hash of each transaction. Merkle trees are then formed. They allow you to verify the data’s integrity and the validity of each operation. If the node has limited resources, it can only request the block header and transaction IDs.

Light clients load the header and authentication path (Merckle proof) for the transaction being verified. They request information from the full node. The authentication path includes hashes from each pair of tree nodes located on the path from the vertex to the transaction.

They request information from the entire node. Hashes from each pair of tree nodes on the path from the vertex to the transaction are included in the authentication path.

To validate the operation, the Merkle root must be found. If the received hash matches the string in the block header, the transaction is confirmed. It is nearly impossible to find the required Merkle root in another data set, ensuring the operation’s validity.

The SPV method enables a simple client to interact with the blockchain while reducing the amount of downloaded data. A block size of 500 kilobytes with five transactions, for example, requires only 140 bytes for Merkle’s proof.

The Merkle binary tree is ideal for arrays represented as lists. Its structure remains unchanged, which makes hashing transactions easier. A prefix tree is a different way of representing data in Ethereum.

This tree enables you to store data in an associative array. Strings are keys that define the positions of the elements in a collection. The branches of the tree are denoted by various symbols to form the structure, so that the key of the element uniquely identifies it.

Unlike Bitcoin, the Ethereum blockchain uses three hash trees:

  • The tree of transactions;
  • The tree of states;
  • The tree containing transaction results data.

The block header includes three root hashes. Ethereum allows you to create light clients capable of performing a basic set of operations, such as:

  • To check if there is a transaction in the block;
  • To confirm the existence of a specific address;
  • To determine the user’s balance;
  • To find out the result of an operation or a smart contract.

These operations are carried out without fully loading the blocks. Hash trees simplify calculations, allowing light clients to run on personal computers, laptops, and smartphones.

Ethereum’s transactional data processing algorithm is similar to that of bitcoin. It is more difficult to interact with the state tree. The array element’s key represents the user’s address, and its value represents the account balance.

The hash tree requires frequent data updates and the addition and removal of addresses. The algorithm’s implementation necessitates a changeable structure. Its parameters are restricted to prevent a DDoS attack that would allow an attacker to create an excessively deep tree.

Otherwise, updating the structure and carrying out operations will take a long time.

The tree’s root should be determined by data and not by its parameters. As a result, the ability to make updates in any order is required.

You can solve these problems using the prefix tree. Each structure element in Ethereum has 16 children. This method is best for encoding nodes in hexadecimal format.

To obtain the key in the prefix tree, you must specify consecutive characters corresponding to the branches. They specify the path from the root to the chosen element. The key value is dynamic, which means you can add or remove new nodes at any time.

If you have anything to add to the Merkle tree topic, welcome to our comments!
In terms of tracking the updates, subscribe to our Medium feed.

Stay tuned!

New to trading? Try crypto trading bots or copy trading

Leave a Reply