Merkle Trees

October 17th, 2023

Above the trees by T.S. Harris

Intended audience: Solidity developers, blockchain engineers, and anyone interested in learning about Merkle trees.

Introduction

In this exploration, we're diving deep into the world of Merkle trees, a cornerstone concept in blockchain technology. Teaming up with Raza, a seasoned developer relations expert at Scroll, we aim to demystify the intricacies of Merkle trees and their profound impact on data verification and integrity. Let's get into the heart of blockchain data structures.

This article is a part of a series by Raza. He teams up with different specialists to write about a particular concept, based on a specific smart contract. For this article, we teamed up to write about Merkle trees.

To be clear, this is not a beginner tutorial level of stuff, you're jumping into the deep end. We dissect a smart contract, piece by piece, then roll up our sleeves and craft one ourselves. So if you want to level up your Solidity skills, let's dive in.

Previous articles in the series can be found here.

How to use this article

We look at a contract. Deconstruct it. Do some practise. Then we craft one ourselves. This article is not just for reading. You need to do the exercises or else it’s pointless.

Here’s what we’re going to do:

Take a contract and understand what it’s aiming to do.
Talk a bit about syntax if it hasn’t been covered by previous articles. These articles build on top of each other.
Practise, Practise, Practise
Then build the contract, based on a template with instructions

Why practising this smart contract is useful

Check out the Github Repo if you’re looking for a working example of the code.

Merkle trees allow you to cryptographically prove that an element is contained in a set without revealing the entire set.

That’s convenient, but how does that make it useful?

Surprisingly, there are many very practical reasons to use Merkle trees.

In its essence, Merkle tree can be used to verify any kind of data. It can help ensure that data blocks received from other peers in a peer-to-peer network are received undamaged and unaltered, and even to check that the other peers do not lie and send fake blocks. But there’s more to Merkle trees than just verification.

Some reasons why it's valuable to get experience and become comfortable with Merkle proofs:

Merkle proofs are used in blockchain networks like Ethereum to cryptographically verify large data sets. For example, they allow light clients to verify if a transaction is included in a block without downloading the entire blockchain.
Smart contracts often use Merkle proofs to verify inclusion in various types of Merkle trees, like token whitelist Merkle trees for airdrops or allowlist Merkle trees for minting NFTs.
Merkle trees provide an efficient way to store cryptographic hashes of data off-chain while still allowing easy integrity verification on-chain. Understanding Merkle proofs unlocks the ability to build schemas like this.
Merkle proofs are a basic building block developers need to understand to build more complex smart contract systems or integrate with blockchain protocols. Getting solid experience with hand-crafting and verifying Merkle proofs develops this blockchain development intuition.
While there are many Merkle proof code templates and examples, truly understanding how they work enough to build and debug custom implementations helps grow as a blockchain engineer.

In summary, familiarity with Merkle tree concept is essential for smart contract programming. It's one of basic building blocks of blockchain networks.

Now, let's dive deeper into details and see the contract.

The contract

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract MerkleProof {
    function verify(
        bytes32[] memory proof,
        bytes32 root,
        bytes32 leaf,
        uint index
    ) public pure returns (bool) {
        bytes32 hash = leaf;

        for (uint i = 0; i < proof.length; i++) {
            bytes32 proofElement = proof[i];

            if (index % 2 == 0) {
                hash = keccak256(abi.encodePacked(hash, proofElement));
            } else {
                hash = keccak256(abi.encodePacked(proofElement, hash));
            }
            index = index / 2;
        }
        return hash == root;
    }
}

contract TestMerkleProof is MerkleProof {
    bytes32[] public hashes;

    constructor() {
        string[4] memory transactions = [
            "alice -> bob",
            "bob -> dave",
            "carol -> alice",
            "dave -> bob"
        ];

        for (uint i = 0; i < transactions.length; i++) {
            hashes.push(keccak256(abi.encodePacked(transactions[i])));
        }

        uint n = transactions.length;
        uint offset = 0;

        while (n > 0) {
            for (uint i = 0; i < n - 1; i += 2) {
                hashes.push(
                    keccak256(
                        abi.encodePacked(hashes[offset + i], hashes[offset + i + 1])
                    )
                );
            }
            offset += n;
            n = n / 2;
        }
    }

    function getRoot() public view returns (bytes32) {
        return hashes[hashes.length - 1];
    }

What does the contract DO

The MerkleProof contract implements a method to verify that a leaf element is part of a Merkle tree given the root hash and a proof.

The verify() method takes in the Merkle proof (an array of hashes representing the sibling hashes along the path from the leaf to the root), the Merkle root hash, the leaf element hash, and the index of the leaf.

It then recalculates the Merkle root hash by hashing pairs of nodes up the tree according to the proof and index. If the recalculated root hash matches the provided root hash, it returns true, indicating the leaf is part of the Merkle tree with that root.

The TestMerkleProof contract demonstrates usage by constructing a simple Merkle tree of transaction hashes and providing a way to fetch the root hash and verify proofs.

In summary, this contract allows verifying that a leaf element is part of a larger Merkle tree given just the element, a proof, and the root, without needing the entire tree. This allows efficient cryptographic verification of partial dataset integrity.

Here’s a visual example of going up the Merkle tree to verify one of the elements from the bottom:

Solidity syntax you should know

To fully understand and write the provided smart contract, you should be familiar with the following Solidity syntax and concepts:

Data Types:
- bytes32: A fixed-size byte array.
- uint: An unsigned integer.
- string: A dynamic-sized string type.
- string[4] memory: A fixed-size array of strings stored in memory.
- bytes32[]: A dynamic-sized array of bytes32.
- bytes32[] memory: A dynamic-sized array of bytes32 stored in memory.
Storage vs Memory:
- storage: Persistent data storage on the Ethereum blockchain.
- memory: Temporary data storage that only persists during the execution of a function.
Function Visibility:
- public: The function can be called from any other contract or externally.
- pure: The function doesn't modify or read from the blockchain's storage.
- view: The function reads from the blockchain's storage but doesn't modify it.
Contract Inheritance:
- contract TestMerkleProof is MerkleProof: This indicates that TestMerkleProof inherits from MerkleProof.
Control Structures:
- for: Looping structure.
- if-else: Conditional branching.
- while: Another looping structure.
Constructor:
- constructor(): A special function that only runs once, when the contract is deployed.
Built-in Functions and Global Variables:
- keccak256(abi.encodePacked(...)): Hashes the provided arguments using the keccak256 algorithm.
- push(): Adds an element to a dynamic array.
- length: Retrieves the length/size of an array.
Modifiers:
- override: Indicates that the function overrides a function from a base contract.
Events, Libraries, and Modifiers:
- Although these aren't used in the provided contract, they're commonly used in many other contracts.
Mappings:
- A key-value store that allows you to associate one data type (the key) with another (the value). Not used in this contract but crucial for many others.
Error Handling:
- require(), assert(), and revert(): Functions to handle errors and conditions, although they aren't used in this contract.

By understanding these syntax elements and concepts, you will be well-equipped to write and understand the given smart contract.

Breaking down the contract

`MerkleProof` contract

The MerkleProof acts as a parent contract and contains a single verify() function to verify Merkle proofs. This is used for verifying that a certain leaf node element of the Merkle Tree is valid with respect to a given proof.

This contract is rather minimal and serves primarily as a library for validating Merkle proofs.

Here’s a visual of the MerkleProof contract

Let’s break down what the verify() function does.

`verify()` function

Validates a Merkle proof
Takes 4 parameters:
- proof - array of proof nodes, which are intermediary hashes all the way to the top of the tree
- root - Merkle root hash
- leaf - leaf node, an individual piece of data we’re trying to verify
- index - leaf index, a corresponding position of the leaf within the Merkle tree.
Computes hash of the leaf
Uses keccak256() to compute hashes
Iterates through proof nodes, combining with hash based on index being even/odd
Iteratively hashes proof nodes (combining leaf nodes and proof elements), moving up the Merkle tree until it reaches the top
Checks if the calculated hashed value at the root of the tree matches the given root
Returns true if final hash equals root and confirms that the leaf exists within the given Merkle tree
Otherwise, returns false, indicating that the leaf isn’t part of the tree represented by the root hash.

`TestMerkleProof` contract

Shows how to construct a Merkle tree and provide the needed proof data to verify a leaf
Together with MerkleProof they demonstrate a complete Merkle proof verification flow
Inherits verify() from MerkleProof
constructor():
- Builds a Merkle tree from some sample data
- Pushes leaf nodes
- Iteratively pushes parent nodes by hashing pairs of child nodes
getRoot()
- Returns root node of Merkle tree stored in hashes

Here’s the visual of the TestMerkleProof contract

In summary

MerkleProof implements proof verification
TestMerkleProof shows example of generating a Merkle tree
Hashes are computed using keccak256()
TestMerkleProof demo includes sample proof data to verify against the tree
Together they demonstrate a complete Merkle proof verification flow.

Exercises

Here are 10 exercises that will test understanding and manipulation of Solidity's basics and more specific functionalities…

1. Basic Solidity Syntax and Structure

Exercise: Write a simple smart contract named HelloWorld that has a function greet which returns the string "Hello, World!".

Solution:

pragma solidity ^0.8.20;

contract HelloWorld {
    function greet() public pure returns (string memory) {
        return "Hello, World!";
    }
}

2. Memory Arrays and Loops

Exercise: Create a smart contract named ArrayManipulation that stores an array of integers. Add functions to add an integer to the array and to retrieve an integer at a specific index.

Solution:

pragma solidity ^0.8.20;

contract ArrayManipulation {
    int[] public numbers;

    function addNumber(int _number) public {
        numbers.push(_number);
    }

    function getNumber(uint _index) public view returns (int) {
        return numbers[_index];
    }
}

3. Keccak256 Hash Function

Exercise: Write a function that accepts a string and returns its keccak256 hash.

Solution:

pragma solidity ^0.8.20;

contract HashFunction {
    function getHash(string memory _input) public pure returns (bytes32) {
        return keccak256(abi.encodePacked(_input));
    }
}

4. Working with Bytes32

Exercise: Create a function that accepts two bytes32 values, concatenates them, and returns the result.

Solution:

pragma solidity ^0.8.20;

contract BytesManipulation {
    function concatenate(bytes32 a, bytes32 b) public pure returns (bytes memory) {
        return abi.encodePacked(a, b);
    }
}

5. Conditional and Mathematical Operations

Exercise: Write a function that accepts two integers. If the first is even, return their sum; otherwise, return their product.

Solution:

pragma solidity ^0.8.20;

contract MathOperations {
    function operate(uint a, uint b) public pure returns (uint) {
        if (a % 2 == 0) {
            return a + b;
        } else {
            return a * b;
        }
    }
}

6. Inheritance

Exercise: Create two contracts. The first, BaseContract, should have a function that returns "Base Function". The second, DerivedContract, should inherit from BaseContract and override the function to return "Derived Function".

Solution:

pragma solidity ^0.8.20;

contract BaseContract {
    function show() public pure returns (string memory) {
        return "Base Function";
    }
}

contract DerivedContract is BaseContract {
    function show() public pure override returns (string memory) {
        return "Derived Function";
    }
}

7. Memory Keyword

Exercise: Explain the difference between storage and memory in Solidity.

Solution: In Solidity, storage refers to data that is stored on the blockchain and persists between function calls and transactions. Changes to storage are very costly in terms of gas. memory, on the other hand, refers to temporary data storage. It only persists during the execution of a function and is erased after the function completes. It's cheaper in terms of gas than storage.

8. ABI Encoding

Exercise: Write a function that takes in two integers, encodes them using ABI encoding, and returns the keccak256 hash of the encoded data.

Solution:

pragma solidity ^0.8.20;

contract ABIHashing {
    function encodeAndHash(int a, int b) public pure returns (bytes32) {
        return keccak256(abi.encodePacked(a, b));
    }
}

9. Understanding Merkle Trees

Exercise: Explain in your own words what a Merkle tree is and why it's useful in blockchain.

Solution: A Merkle tree is a binary tree where each leaf node is the hash of some data, and each non-leaf node is the hash of its children's hashes. This structure allows for efficient and secure verification of content in large sets of data. In the context of blockchain, it's used to verify if a particular transaction is part of a block without having to know all other transactions in that block.

10. Implementing Merkle Proof Verification

Exercise: Given a set of data, create a Merkle tree, and write a function that uses the Merkle proof to verify if a specific piece of data is part of the set.

Solution: The provided smart contract MerkleProof at the beginning is an implementation of this. It creates a Merkle tree and provides a verification function to check the inclusion of a specific data in the tree.

By working through these exercises, you will gain a deeper understanding of the concepts and patterns used in the provided smart contract.

Final Task

Just like the last article, now we’re going to type up the contract. Below you will find a blank template with instructions. At this point, if you’re unclear how to write up a specific task, ask yourself what you don’t know. That’s your weak point and what you should be working on

Here’s a blank template with comments. Write it out.

// Define the license for the smart contract.
// SPDX-License-Identifier: MIT

// Specify the version of the Solidity compiler to use.
pragma solidity ^0.8.20;

// Define a contract named MerkleProof.
contract MerkleProof {

    // This function verifies the inclusion of a leaf in a Merkle tree using a Merkle proof.
    // Parameters:
    // - proof: The Merkle proof elements.
    // - root: The root hash of the Merkle tree.
    // - leaf: The leaf hash we want to verify.
    // - index: The position of the leaf in the Merkle tree.
    // Return value:
    // - A boolean indicating whether the leaf is validly included in the Merkle tree.
    function verify(
        bytes32[] memory proof,
        bytes32 root,
        bytes32 leaf,
        uint index
    ) public pure returns (bool) {

        // Initialize the hash variable with the provided leaf.
        bytes32 hash = leaf;

        // Iterate over each element in the proof.
        for (uint i = 0; i < proof.length; i++) {
            // Get the current proof element.
            bytes32 proofElement = proof[i];

            // Check if the index is even.
            if (index % 2 == 0) {
                // If even, hash the current hash combined with the proof element.
                hash = keccak256(abi.encodePacked(hash, proofElement));
            } else {
                // If odd, hash the proof element combined with the current hash.
                hash = keccak256(abi.encodePacked(proofElement, hash));
            }

            // Divide the index by 2 for the next iteration.
            index = index / 2;
        }

        // Check if the final hash matches the provided root.
        // Return true if they match, false otherwise.
        return hash == root;
    }
}

// Define a contract named TestMerkleProof that inherits from the MerkleProof contract.
contract TestMerkleProof is MerkleProof {

    // Declare a public array of bytes32 named hashes to store the hashes of the Merkle tree.
    bytes32[] public hashes;

    // Define the constructor for the TestMerkleProof contract.
    constructor() {
        // Define an array of strings named transactions with 4 sample transactions.
        string[4] memory transactions = [
            "alice -> bob",
            "bob -> dave",
            "carol -> alice",
            "dave -> bob"
        ];

        // Iterate over each transaction in the transactions array.
        for (uint i = 0; i < transactions.length; i++) {
            // Hash each transaction and push the result to the hashes array.
            hashes.push(keccak256(abi.encodePacked(transactions[i])));
        }

        // Initialize a variable named n with the length of the transactions array.
        // Initialize a variable named offset with the value 0.
        uint n = transactions.length;
        uint offset = 0;

        // While n is greater than 0, generate the non-leaf nodes of the Merkle tree.
        while (n > 0) {
            // For every pair of hashes in the current level of the tree:
            for (uint i = 0; i < n - 1; i += 2) {
                // Combine and hash them, then push the result to the hashes array.
                hashes.push(
                    keccak256(
                        abi.encodePacked(hashes[offset + i], hashes[offset + i + 1])
                    )
                );
            }
            // Increment the offset by n to move to the next level of the tree.
            offset += n;
            // Divide n by 2 for the next iteration.
            n = n / 2;
        }
    }

    // Define a public view function named getRoot that returns the root of the Merkle tree.
    function getRoot() public view returns (bytes32) {
        // Return the last element in the hashes array, which is the root of the Merkle tree.
        return hashes[hashes.length - 1];
    }
}

Using Developer Tools

Most developers opt for JavaScript when deploying smart contracts. In this article we're using Hardhat to test the contract, but to deploy we'll use Truffle dashboard. Using Truffle dashboard prevents the need to pass our private keys! More on that later.

Hardhat

OK, let’s install hardhat and then use it to write some basic test and then deploy it. If you’ve never touched hardhat, no worries.

Install Hardhat

Hardhat is a popular development environment, testing framework, and asset pipeline for Ethereum. To deploy with hardhat, follow the steps below:

Install hardhat using npm:

npm install --save-dev hardhat
yarn add --dev hardhat

You can run the following command to create a hardhat project:

npx hardhat

Configure Hardhat

We’re going to use Truffle Dashboard with hardhat, what that is, I’ll show in a minute. Adjust your hardhat.config.js file like this:

require("@nomicfoundation/hardhat-toolbox");

/** @type import('hardhat/config').HardhatUserConfig */
module.exports = {
  solidity: "0.8.10",
  localhost: {
    timeout: 100_000
  },
  networks: {
    'truffle-dashboard': {
      url: "http://localhost:24012/rpc",
      timeout: 400_000
    },
    hardhat: {
      forking: {
        url: "https://sepolia-rpc.scroll.io/",
      },
    }
  },
  settings: {
    optimizer: {
      enabled: true,
      runs: 200,
    },
    evmVersion: "istanbul",
    debug: {
      // Enable Solidity stack traces
      stacktrace: true,
    },
  },
};

Now you if you know a little bit dev tools, you might be asking, where do i pass the rpc url to deploy to a live network. Well you don’t have to. Let’s take a look. Side note, I am passing an rpc url in the hardhat: forking part. That’s for the testing part. Hardhat will fork the chain and simulate the transactions locally. This is handy if you’re interacting with contracts deployed on that chain.

Compiling the contract

So before we can deploy the contract, we need to compile it. Make sure to do a couple of checks, like the solidity the version in the contract should be the same as in the config file. Also make sure that the name defined inside the smart contract is the same as the name of the smart contract file.

Once you’ve done your checks, run the following command to compile the contract:

npx hardhat compile

After compiling the contract you should have an artifacts folder inside your project.

Deploy

Normally you test first, then deploy. But if you’re starting out, you might not be interested in tests. So we’ll do deploying first and then dive into tests.

First make sure your deploy script refers to the right contract. You can find your script in the scripts and it’s usually called deploy.js. This is what the script looks like for the Merkle Proof contract:

const hre = require("hardhat");

async function main() {
  // Getting the contract factory
  const TestMerkleProof = await hre.ethers.getContractFactory("TestMerkleProof");

  // Deploying the contract and waiting for it to be mined
  const testMerkleProof = await TestMerkleProof.deploy();
  await testMerkleProof.deployed();

  console.log("TestMerkleProof deployed to:", testMerkleProof.address);
}

// Running the script and handling any errors
main()
  .then(() => process.exit(0))
  .catch((error) => {
    console.error(error);
    process.exit(1);
  });

Quick note, if you’re importing ethers in the following way, you might run in to an error.

import { ethers } from "hardhat";

That has to do with JavaScript versions. Hardhat uses CommonJS, whereas the above import statement is used in ESM (ECMAScript Modules) Projects. If you want to know more, check out this article. If you prefer using the import statement, it will require you to make some changes to your package.json and change the extension name of your deploy script file.

Next we’re going to need two windows in your terminal. In the first we run Truffle Dashboard, which spins up a local host server. In the second we will compile the contract and send it to the dashboard to deploy.

Before we start make sure you have truffle installed:

npm install -g truffle

Next, spin up the dashboard.

Terminal 1: Run dashboard

truffle dashboard

Go to your browser (if it doesn’t pop up automagically) and the dashboard should be running on localhost:24012 or localhost:24013

Connect your wallet and connect to the network you want to use.

Terminal 2: Compile and send contract

npx hardhat run --network truffle-dashboard scripts/deploy.js

Now when you run the above command, go to your dashboard and you should see the metadata of your contract. Make sure your wallet is connected and you’re on the desired network (probably a test net). Click deploy and that’s it!

It might look like your terminal is “stuck”, but it’s not. If your browser doesn’t pop up ‘auto-magically’, go to your localhost:24012 and then you should see a prompt to deploy the contract.

Test your contract

Tests basically run a function and check if the function returns the desired outcome. So you need to tell a test what function to run, the parameters (if any) and then what the desired outcome is. That’s it.

Here are some for this contract:

const { expect } = require("chai");

describe("TestMerkleProof", function () {
  let TestMerkleProof, testMerkleProof;

  beforeEach(async function () {
    // Getting the ContractFactory and Signers here.
    TestMerkleProof = await ethers.getContractFactory("TestMerkleProof");

    // Deploying our contract
    testMerkleProof = await TestMerkleProof.deploy();
    await testMerkleProof.deployed();
  });

  // Test case: Ensure the contract is deployed successfully
  it("Should be deployed successfully", async function () {
    expect(testMerkleProof.address).to.properAddress;
  });

  // Test case: Validate the root of the Merkle Tree
  it("Should return the correct Merkle root", async function () {
    const expectedRoot = "0xcc086fcc038189b4641db2cc4f1de3bb132aefbd65d510d817591550937818c7";
    expect(await testMerkleProof.getRoot()).to.equal(expectedRoot);
  });

  // Test case: Verify Merkle Proof for given data (you can add more details based on your contract's logic)
  it("Should verify the Merkle proof correctly", async function () {
    const leaf = "0xdca3326ad7e8121bf9cf9c12333e6b2271abe823ec9edfe42f813b1e768fa57b";
    const root = "0xcc086fcc038189b4641db2cc4f1de3bb132aefbd65d510d817591550937818c7";
    const index = 2;
    const proof = [
      "0x8da9e1c820f9dbd1589fd6585872bc1063588625729e7ab0797cfc63a00bd950",
      "0x995788ffc103b987ad50f5e5707fd094419eb12d9552cc423bd0cd86a3861433",
    ];
    const isValid = await testMerkleProof.verify(proof, root, leaf, index);
    expect(isValid).to.be.true;
  });

  // Add more tests as required for other functionality.
});

To run the test, use the following command in your terminal

npx hardhat test

And that’s it!

Next step in Ethereum scaling: Verkle trees

While Merkle trees have been a foundational component in the blockchain space, providing efficient and secure verification of content in large datasets, the evolution of technology never stops. Enter Verkle trees.

Verkle trees are a proposed data structure that combines Vector Commitment and Merkle Trie to produce a structure that is more efficient in terms of proof size and verification time than traditional Merkle trees. As the Ethereum network and other blockchain platforms continue to scale and seek optimizations, structures like Verkle trees are being explored to potentially replace or augment existing Merkle trees.

The significance of understanding Merkle trees, as we've delved into in this article, is that it provides a basis to appreciate the advancements brought by Verkle trees. Just as we've seen the power of Merkle proofs in ensuring data integrity and enabling light clients to interact with blockchains efficiently, Verkle trees aim to take this a step further. They promise to reduce the computational overhead on nodes and potentially lead to faster transaction times and lower fees.

In essence, our deep dive into Merkle trees not only equips us with a fundamental blockchain building block but also sets the stage for us to embrace and understand the next wave of innovations, like Verkle trees, in the blockchain space. As we continue our journey in blockchain development, it's essential to keep an eye on these emerging technologies and understand their roots (pun intended) in the foundational concepts we've already mastered.

If you like what you see, you'll find more stuff like this on my Twitter.

Introduction

How to use this article

Why practising this smart contract is useful

The contract

What does the contract DO

Solidity syntax you should know

Breaking down the contract

MerkleProof contract

verify() function

TestMerkleProof contract

In summary

Exercises

1. Basic Solidity Syntax and Structure

2. Memory Arrays and Loops

3. Keccak256 Hash Function

4. Working with Bytes32

5. Conditional and Mathematical Operations

6. Inheritance

7. Memory Keyword

8. ABI Encoding

9. Understanding Merkle Trees

10. Implementing Merkle Proof Verification

Final Task

Using Developer Tools

Hardhat

Compiling the contract

Deploy

Test your contract

Next step in Ethereum scaling: Verkle trees

If you like what you see, you'll find more stuff like this on my Twitter.

`MerkleProof` contract

`verify()` function

`TestMerkleProof` contract