Account Compression


Tags
Web3
NFT
Published
December 4, 2024

Overview

Account Compression is an on-chain program in SPL designed to optimize data storage and management for high-throughput applications, particularly non-fungible tokens (NFTs). By leveraging advanced data structures like Concurrent Merkle Trees (CMTs), Account Compression addresses the challenges associated with scaling decentralized applications (dApps) while maintaining data integrity and enabling efficient on-chain verification.

The Problem: High On-Chain Storage Costs and Scalability

a. Rise of Non-Fungible Tokens (NFTs):
  • Popularity and Demand: The Solana blockchain's high throughput has fueled the rapid creation and trading of NFTs, which are unique digital assets representing ownership of items like art, collectibles, and virtual real estate.
  • Ownership Benefits: NFTs give holders self-custodial ownership and censorship-resistant properties, making them attractive for creators and collectors.
b. Scalability Challenges:
  • On-Chain Storage Costs: While minting a single NFT is relatively inexpensive, scaling to billions of NFTs leads to prohibitive on-chain storage costs. Storing each NFT's detailed data directly on-chain becomes economically unfeasible.
  • Data Management: Managing vast amounts of data on-chain strains network resources, leading to inefficiencies and higher operational costs.

The Solution: Account Compression Using Concurrent Merkle Trees

a. Concept of Account Compression:
  • Data Offloading: Instead of storing comprehensive NFT data on-chain, Account Compression stores a compressed hash (fingerprint) of the asset data on-chain. The actual data resides off-chain in external databases.
  • Data Integrity: The on-chain fingerprint allows for the verification of off-chain data, ensuring that the asset's integrity is maintained without the need for extensive on-chain storage.
b. Merkle Tree:
  • Merkle Trees Basics: A Merkle Tree is a cryptographic data structure that enables efficient and secure verification of large data sets. It works by hashing data in a hierarchical tree structure, culminating in a single root hash representing the entire dataset.
  • Concurrency Challenge: Traditional Merkle Trees allow only one write operation at a time. Concurrent write attempts with outdated proofs lead to collisions, where only the first operation succeeds while others fail due to mismatched proofs.
c. Introducing Concurrent Merkle Trees:
  • Parallel Operations: CMTs are designed to handle multiple tree operations (like setting or appending leaves) concurrently without proof collisions.
  • Change Logs Buffer: CMTs maintain a buffer of change logs (change_logs) that record recent modifications. This buffer allows the tree to "fast-forward" outdated proofs by applying buffered changes, enabling multiple operations to proceed in parallel.
  • Rightmost Proof Tracking: CMTs track the proof to the rightmost leaf (rightmost_proof), facilitating efficient append operations without requiring extensive proofs.
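To make the proof-verification idea concrete, here is a minimal sketch of classic (single-writer) Merkle proof verification in Rust, using solana_program::keccak::hashv as the SPL libraries do. The left/right hashing convention based on the index bits is an illustrative assumption rather than the exact SPL layout.
use solana_program::keccak::hashv;

/// 32-byte node hash, matching the SPL `Node` type.
type Node = [u8; 32];

/// Recompute the root from a leaf, its index, and its sibling path,
/// then compare it against the expected root.
fn verify_proof(root: Node, leaf: Node, index: u32, proof: &[Node]) -> bool {
    let mut node = leaf;
    for (level, sibling) in proof.iter().enumerate() {
        // The bit of `index` at `level` says whether `node` is a left or right child.
        if (index >> level) & 1 == 0 {
            node = hashv(&[node.as_ref(), sibling.as_ref()]).to_bytes();
        } else {
            node = hashv(&[sibling.as_ref(), node.as_ref()]).to_bytes();
        }
    }
    node == root
}
A CMT builds on exactly this check, but additionally replays its buffered change logs to fast-forward proofs that were generated against a slightly older root.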

How Account Compression Solves the Problem

a. Data Compression and Storage Efficiency:
  • Compressed Fingerprints: By storing only the Merkle root (a 32-byte hash) on-chain, Account Compression drastically reduces the storage footprint compared to storing full NFT data.
  • Off-Chain Data Management: The actual asset data is maintained off-chain, typically in scalable databases. This separation ensures that on-chain operations remain cost-effective and efficient.
b. On-Chain Verification:
  • Merkle Proofs: To verify an NFT's authenticity or data, a Merkle proof is provided, linking the compressed on-chain root to the specific off-chain data. This proof ensures that the data has not been tampered with and corresponds to the stored root.
  • Concurrent Proof Handling: CMTs enable the verification process to handle multiple proofs simultaneously by managing a buffer of recent changes, thus avoiding the pitfalls of traditional Merkle Trees where concurrent operations could lead to proof invalidation.
c. Canopy for Proof Optimization:
  • Proof Size Reduction: Solana's transaction size limits pose challenges for handling extensive Merkle proofs. Account Compression introduces a "canopy," which caches the uppermost levels (nodes) of the Merkle tree. This caching minimizes the size of the proofs clients must submit, keeping them within Solana's constraints.
  • Canopy Management: The canopy is stored at the end of the ConcurrentMerkleTreeAccount and is updated alongside the Merkle Tree's state, maintaining an efficient balance between proof size and verification capabilities.
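As a rough sketch of the effect on transaction size (illustrative numbers, not taken from the SPL documentation): if the top canopy_depth levels are cached on-chain, a client only has to submit max_depth - canopy_depth proof nodes, at 32 bytes each.
/// Illustrative estimate of how many proof bytes a client must send
/// when the top `canopy_depth` levels are cached on-chain.
fn client_proof_bytes(max_depth: u32, canopy_depth: u32) -> u32 {
    let nodes = max_depth.saturating_sub(canopy_depth);
    nodes * 32 // each proof node is a 32-byte hash
}

fn main() {
    // A depth-20 tree (about one million leaves) with no canopy
    // needs 20 * 32 = 640 bytes of proof per instruction...
    assert_eq!(client_proof_bytes(20, 0), 640);
    // ...while a canopy of depth 10 cuts that to 320 bytes.
    assert_eq!(client_proof_bytes(20, 10), 320);
}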

Use Cases

cNFT

Compression for NFTs drastically reduces the cost of on-chain storage, enabling creators to be as expressive with the technology as they wish.
Launching a cNFT project on Solana using Merkle trees can be incredibly cost-effective, with minting costs starting very low (see Metaplex's analysis).
[Figure: cNFT minting cost comparison from Metaplex's analysis]

CMT Architecture

[Figure: Concurrent Merkle Tree account architecture]

Init Empty Merkle Tree

The init_empty_merkle_tree instruction initializes a new SPL Concurrent Merkle Tree on Solana. This involves setting up the account's data structures: initializing the tree header, calculating sizes, and emitting an event for indexers.
 
Parameters:
  • ctx: Context containing the accounts needed for initialization.
  • max_depth: The maximum depth of the Merkle tree.
  • max_buffer_size: The maximum buffer size for the tree.
 
Steps:
  1. Verify Account Ownership
    • Ensure the merkle_tree account is owned by the current program.
  2. Borrow Mutable Data from the Merkle Tree Account
    • Obtain a mutable reference to the data in the merkle_tree account.
  3. Split Data into Header and Rest
    • Separate the merkle_tree account data into the header and the rest (tree and canopy data).
  4. Deserialize Header
    • Convert the header bytes into a ConcurrentMerkleTreeHeader struct.
  5. Initialize Header
    • Set up the header with the provided parameters. The initialize method sets the account type and populates the header fields; it also checks that the header has not been initialized before.
  6. Serialize Header Back into Bytes
    • Write the initialized header back into the account data. serialize is provided by the BorshSerialize trait.
  7. Split Remaining Data into Tree and Canopy Bytes
    • The merkle_tree_get_size function calculates the size of the Merkle tree data structure based on max_depth and max_buffer_size.
    • Separate the rest of the account data into the Merkle tree and canopy sections.
  8. Initialize Merkle Tree Data Structure
    • Get the public key of the merkle_tree account for use in event logging.
    • Initialize the Merkle tree data structure and obtain a change log event. merkle_tree_apply_fn_mut! is a macro that handles loading the tree and invoking the initialize method on it.
  9. Wrap and Emit Change Log Event
    • Emit the change log event using a CPI call to the SPL Noop program.
  10. Update the Canopy
    • Update the canopy portion of the Merkle tree account (a no-op here, since no change log is passed during initialization).
/// --- programs/account-compression/src/lib.rs --- /// Context for initializing a new SPL ConcurrentMerkleTree #[derive(Accounts)] pub struct Initialize<'info> { #[account(zero)] /// CHECK: This account will be zeroed out, and the size will be validated pub merkle_tree: UncheckedAccount<'info>, /// Authority that controls write-access to the tree /// Typically a program, e.g., the Bubblegum contract validates that leaves are valid NFTs. pub authority: Signer<'info>, /// Program used to emit changelogs as cpi instruction data. pub noop: Program<'info, Noop>, } /// Creates a new merkle tree with maximum leaf capacity of `power(2, max_depth)` /// and a minimum concurrency limit of `max_buffer_size`. /// /// Concurrency limit represents the # of replace instructions that can be successfully /// executed with proofs dated for the same root. For example, a maximum buffer size of 1024 /// means that a minimum of 1024 replaces can be executed before a new proof must be /// generated for the next replace instruction. /// /// Concurrency limit should be determined by empirically testing the demand for /// state built on top of SPL Compression. /// /// For instructions on enabling the canopy, see [canopy]. pub fn init_empty_merkle_tree( ctx: Context<Initialize>, max_depth: u32, max_buffer_size: u32, ) -> Result<()> { require_eq!( *ctx.accounts.merkle_tree.owner, crate::id(), AccountCompressionError::IncorrectAccountOwner ); let mut merkle_tree_bytes = ctx.accounts.merkle_tree.try_borrow_mut_data()?; let (mut header_bytes, rest) = merkle_tree_bytes.split_at_mut(CONCURRENT_MERKLE_TREE_HEADER_SIZE_V1); let mut header = ConcurrentMerkleTreeHeader::try_from_slice(header_bytes)?; header.initialize( max_depth, max_buffer_size, &ctx.accounts.authority.key(), Clock::get()?.slot, ); header.serialize(&mut header_bytes)?; let merkle_tree_size = merkle_tree_get_size(&header)?; let (tree_bytes, canopy_bytes) = rest.split_at_mut(merkle_tree_size); let id = ctx.accounts.merkle_tree.key(); let change_log_event = merkle_tree_apply_fn_mut!(header, id, tree_bytes, initialize,)?; wrap_event( &AccountCompressionEvent::ChangeLog(*change_log_event), &ctx.accounts.noop, )?; update_canopy(canopy_bytes, header.get_max_depth(), None) }

Concurrent Merkle Tree Header

/// --- programs/account-compression/src/state/concurrent_merkle_tree_header.rs --- #[derive(Debug, Copy, Clone, PartialEq, BorshDeserialize, BorshSerialize)] #[repr(u8)] pub enum CompressionAccountType { /// Uninitialized Uninitialized, /// SPL ConcurrentMerkleTree data structure, may include a Canopy ConcurrentMerkleTree, } #[repr(C)] #[derive(AnchorDeserialize, AnchorSerialize)] pub enum ConcurrentMerkleTreeHeaderData { V1(ConcurrentMerkleTreeHeaderDataV1), } #[repr(C)] #[derive(AnchorDeserialize, AnchorSerialize)] pub struct ConcurrentMerkleTreeHeaderDataV1 { /// Buffer of changelogs stored on-chain. /// Must be a power of 2; see above table for valid combinations. max_buffer_size: u32, /// Depth of the SPL ConcurrentMerkleTree to store. /// Tree capacity can be calculated as power(2, max_depth). /// See above table for valid options. max_depth: u32, /// Authority that validates the content of the trees. /// Typically a program, e.g., the Bubblegum contract validates that leaves are valid NFTs. authority: Pubkey, /// Slot corresponding to when the Merkle tree was created. /// Provides a lower-bound on what slot to start (re-)building a tree from. creation_slot: u64, /// Needs padding for the account to be 8-byte aligned /// 8-byte alignment is necessary to zero-copy the SPL ConcurrentMerkleTree _padding: [u8; 6], } /// Initialization parameters for an SPL ConcurrentMerkleTree. /// /// Only the following permutations are valid: /// /// | max_depth | max_buffer_size | /// | --------- | --------------------- | /// | 14 | (64, 256, 1024, 2048) | /// | 20 | (64, 256, 1024, 2048) | /// | 24 | (64, 256, 512, 1024, 2048) | /// | 26 | (64, 256, 512, 1024, 2048) | /// | 30 | (512, 1024, 2048) | /// #[repr(C)] #[derive(AnchorDeserialize, AnchorSerialize)] pub struct ConcurrentMerkleTreeHeader { /// Account type pub account_type: CompressionAccountType, /// Versioned header pub header: ConcurrentMerkleTreeHeaderData, }

Concurrent Merkle Tree Initialization

/// --- programs/account-compression/src/state/concurrent_merkle_tree_header.rs --- impl ConcurrentMerkleTreeHeader { pub fn initialize( &mut self, max_depth: u32, max_buffer_size: u32, authority: &Pubkey, creation_slot: u64, ) { self.account_type = CompressionAccountType::ConcurrentMerkleTree; match self.header { ConcurrentMerkleTreeHeaderData::V1(ref mut header) => { // Double check header is empty after deserialization from zero'd bytes assert_eq!(header.max_buffer_size, 0); assert_eq!(header.max_depth, 0); header.max_buffer_size = max_buffer_size; header.max_depth = max_depth; header.authority = *authority; header.creation_slot = creation_slot; } } } /// ... }

Calculate Merkle Tree Size

The concurrent Merkle tree supports only a fixed set of pre-defined (max_depth, max_buffer_size) combinations.
/// --- programs/account-compression/src/state/concurrent_merkle_tree_header.rs -- pub fn merkle_tree_get_size(header: &ConcurrentMerkleTreeHeader) -> Result<usize> { // Note: max_buffer_size MUST be a power of 2 match (header.get_max_depth(), header.get_max_buffer_size()) { (3, 8) => Ok(size_of::<ConcurrentMerkleTree<3, 8>>()), (5, 8) => Ok(size_of::<ConcurrentMerkleTree<5, 8>>()), (6, 16) => Ok(size_of::<ConcurrentMerkleTree<6, 16>>()), (7, 16) => Ok(size_of::<ConcurrentMerkleTree<7, 16>>()), (8, 16) => Ok(size_of::<ConcurrentMerkleTree<8, 16>>()), (9, 16) => Ok(size_of::<ConcurrentMerkleTree<9, 16>>()), (10, 32)=> Ok(size_of::<ConcurrentMerkleTree<10, 32>>()), (11, 32)=> Ok(size_of::<ConcurrentMerkleTree<11, 32>>()), (12, 32)=> Ok(size_of::<ConcurrentMerkleTree<12, 32>>()), (13, 32)=> Ok(size_of::<ConcurrentMerkleTree<13, 32>>()), (14, 64) => Ok(size_of::<ConcurrentMerkleTree<14, 64>>()), (14, 256) => Ok(size_of::<ConcurrentMerkleTree<14, 256>>()), (14, 1024) => Ok(size_of::<ConcurrentMerkleTree<14, 1024>>()), (14, 2048) => Ok(size_of::<ConcurrentMerkleTree<14, 2048>>()), (15, 64) => Ok(size_of::<ConcurrentMerkleTree<15, 64>>()), (16, 64) => Ok(size_of::<ConcurrentMerkleTree<16, 64>>()), (17, 64) => Ok(size_of::<ConcurrentMerkleTree<17, 64>>()), (18, 64) => Ok(size_of::<ConcurrentMerkleTree<18, 64>>()), (19, 64) => Ok(size_of::<ConcurrentMerkleTree<19, 64>>()), (20, 64) => Ok(size_of::<ConcurrentMerkleTree<20, 64>>()), (20, 256) => Ok(size_of::<ConcurrentMerkleTree<20, 256>>()), (20, 1024) => Ok(size_of::<ConcurrentMerkleTree<20, 1024>>()), (20, 2048) => Ok(size_of::<ConcurrentMerkleTree<20, 2048>>()), (24, 64) => Ok(size_of::<ConcurrentMerkleTree<24, 64>>()), (24, 256) => Ok(size_of::<ConcurrentMerkleTree<24, 256>>()), (24, 512) => Ok(size_of::<ConcurrentMerkleTree<24, 512>>()), (24, 1024) => Ok(size_of::<ConcurrentMerkleTree<24, 1024>>()), (24, 2048) => Ok(size_of::<ConcurrentMerkleTree<24, 2048>>()), (26, 512) => Ok(size_of::<ConcurrentMerkleTree<26, 512>>()), (26, 1024) => Ok(size_of::<ConcurrentMerkleTree<26, 1024>>()), (26, 2048) => Ok(size_of::<ConcurrentMerkleTree<26, 2048>>()), (30, 512) => Ok(size_of::<ConcurrentMerkleTree<30, 512>>()), (30, 1024) => Ok(size_of::<ConcurrentMerkleTree<30, 1024>>()), (30, 2048) => Ok(size_of::<ConcurrentMerkleTree<30, 2048>>()), _ => { msg!( "Failed to get size of max depth {} and max buffer size {}", header.get_max_depth(), header.get_max_buffer_size() ); err!(AccountCompressionError::ConcurrentMerkleTreeConstantsError) } } }
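Based on the layout shown in init_empty_merkle_tree (header, then tree struct, then canopy), a client could estimate the account size to allocate roughly as follows. This is a sketch: the 56-byte header size is derived from the fields of ConcurrentMerkleTreeHeaderDataV1 plus the two enum tags, and the crate path for ConcurrentMerkleTree is assumed to be the spl-concurrent-merkle-tree library.
use std::mem::size_of;
// Assumed import from the spl-concurrent-merkle-tree library crate.
use spl_concurrent_merkle_tree::concurrent_merkle_tree::ConcurrentMerkleTree;

type Node = [u8; 32];

/// Rough account size: header + tree struct + canopy nodes.
fn estimated_account_size<const MAX_DEPTH: usize, const MAX_BUFFER_SIZE: usize>(
    canopy_depth: u32,
) -> usize {
    // 1 (account_type) + 1 (version tag) + 4 + 4 + 32 + 8 + 6 = 56 bytes of header.
    let header = 56;
    // Zero-copy struct whose size depends on the (depth, buffer) combination.
    let tree = size_of::<ConcurrentMerkleTree<MAX_DEPTH, MAX_BUFFER_SIZE>>();
    // The canopy is a full binary tree without its root: 2^(d+1) - 2 nodes of 32 bytes.
    let canopy = ((1usize << (canopy_depth + 1)) - 2) * size_of::<Node>();
    header + tree + canopy
}

fn main() {
    // Example: a depth-14 tree, buffer size 64, and a depth-10 canopy.
    println!("{} bytes", estimated_account_size::<14, 64>(10));
}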

merkle_tree_apply_fn_mut!

merkle_tree_apply_fn_mut! is a macro that handles loading and invoking methods on the tree.
 
Macro Expansion:
  • Step 1: Match the max_depth and max_buffer_size to select the correct tree type.
  • Step 2: Load the tree using ConcurrentMerkleTree::<max_depth, max_buffer_size>::load_mut_bytes(tree_bytes).
  • Step 3: Call the initialize method on the loaded tree.
  • Step 4: Capture the change log event from the tree.
/// --- programs/account-compression/src/macros.rs --- /// This applies a given function on a mutable ConcurrentMerkleTree #[macro_export] macro_rules! merkle_tree_apply_fn_mut { ($header:ident, $id:ident, $bytes:ident, $func:ident, $($arg:tt)*) => { _merkle_tree_apply_fn!($header, $id, $bytes, $func, TreeLoad::Mutable, $($arg)*) }; } /// This applies a given function on a ConcurrentMerkleTree by /// allowing the compiler to infer the size of the tree based /// upon the header information stored on-chain #[macro_export] macro_rules! _merkle_tree_apply_fn { ($header:ident, $($arg:tt)*) => { // Note: max_buffer_size MUST be a power of 2 match ($header.get_max_depth(), $header.get_max_buffer_size()) { (3, 8) => _merkle_tree_depth_size_apply_fn!(3, 8, $($arg)*), (5, 8) => _merkle_tree_depth_size_apply_fn!(5, 8, $($arg)*), (6, 16) => _merkle_tree_depth_size_apply_fn!(6, 16, $($arg)*), (7, 16) => _merkle_tree_depth_size_apply_fn!(7, 16, $($arg)*), (8, 16) => _merkle_tree_depth_size_apply_fn!(8, 16, $($arg)*), (9, 16) => _merkle_tree_depth_size_apply_fn!(9, 16, $($arg)*), (10, 32) => _merkle_tree_depth_size_apply_fn!(10, 32, $($arg)*), (11, 32) => _merkle_tree_depth_size_apply_fn!(11, 32, $($arg)*), (12, 32) => _merkle_tree_depth_size_apply_fn!(12, 32, $($arg)*), (13, 32) => _merkle_tree_depth_size_apply_fn!(13, 32, $($arg)*), (14, 64) => _merkle_tree_depth_size_apply_fn!(14, 64, $($arg)*), (14, 256) => _merkle_tree_depth_size_apply_fn!(14, 256, $($arg)*), (14, 1024) => _merkle_tree_depth_size_apply_fn!(14, 1024, $($arg)*), (14, 2048) => _merkle_tree_depth_size_apply_fn!(14, 2048, $($arg)*), (15, 64) => _merkle_tree_depth_size_apply_fn!(15, 64, $($arg)*), (16, 64) => _merkle_tree_depth_size_apply_fn!(16, 64, $($arg)*), (17, 64) => _merkle_tree_depth_size_apply_fn!(17, 64, $($arg)*), (18, 64) => _merkle_tree_depth_size_apply_fn!(18, 64, $($arg)*), (19, 64) => _merkle_tree_depth_size_apply_fn!(19, 64, $($arg)*), (20, 64) => _merkle_tree_depth_size_apply_fn!(20, 64, $($arg)*), (20, 256) => _merkle_tree_depth_size_apply_fn!(20, 256, $($arg)*), (20, 1024) => _merkle_tree_depth_size_apply_fn!(20, 1024, $($arg)*), (20, 2048) => _merkle_tree_depth_size_apply_fn!(20, 2048, $($arg)*), (24, 64) => _merkle_tree_depth_size_apply_fn!(24, 64, $($arg)*), (24, 256) => _merkle_tree_depth_size_apply_fn!(24, 256, $($arg)*), (24, 512) => _merkle_tree_depth_size_apply_fn!(24, 512, $($arg)*), (24, 1024) => _merkle_tree_depth_size_apply_fn!(24, 1024, $($arg)*), (24, 2048) => _merkle_tree_depth_size_apply_fn!(24, 2048, $($arg)*), (26, 512) => _merkle_tree_depth_size_apply_fn!(26, 512, $($arg)*), (26, 1024) => _merkle_tree_depth_size_apply_fn!(26, 1024, $($arg)*), (26, 2048) => _merkle_tree_depth_size_apply_fn!(26, 2048, $($arg)*), (30, 512) => _merkle_tree_depth_size_apply_fn!(30, 512, $($arg)*), (30, 1024) => _merkle_tree_depth_size_apply_fn!(30, 1024, $($arg)*), (30, 2048) => _merkle_tree_depth_size_apply_fn!(30, 2048, $($arg)*), _ => { msg!("Failed to apply {} on concurrent merkle tree with max depth {} and max buffer size {}", stringify!($func), $header.get_max_depth(), $header.get_max_buffer_size() ); err!(AccountCompressionError::ConcurrentMerkleTreeConstantsError) } } }; } /// This macro applies functions on a ConcurrentMerkleT:ee and emits leaf information /// needed to sync the merkle tree state with off-chain indexers. #[macro_export] macro_rules! 
_merkle_tree_depth_size_apply_fn { ($max_depth:literal, $max_size:literal, $id:ident, $bytes:ident, $func:ident, TreeLoad::Mutable, $($arg:tt)*) => { match ConcurrentMerkleTree::<$max_depth, $max_size>::load_mut_bytes($bytes) { Ok(merkle_tree) => { match merkle_tree.$func($($arg)*) { Ok(_) => { Ok(Box::<ChangeLogEvent>::from((merkle_tree.get_change_log(), $id, merkle_tree.sequence_number))) } Err(err) => { msg!("Error using concurrent merkle tree: {}", err); err!(AccountCompressionError::ConcurrentMerkleTreeError) } } } Err(err) => { msg!("Error zero copying concurrent merkle tree: {}", err); err!(AccountCompressionError::ZeroCopyError) } } }; ($max_depth:literal, $max_size:literal, $id:ident, $bytes:ident, $func:ident, TreeLoad::Immutable, $($arg:tt)*) => { match ConcurrentMerkleTree::<$max_depth, $max_size>::load_bytes($bytes) { Ok(merkle_tree) => { match merkle_tree.$func($($arg)*) { Ok(_) => { Ok(Box::<ChangeLogEvent>::from((merkle_tree.get_change_log(), $id, merkle_tree.sequence_number))) } Err(err) => { msg!("Error using concurrent merkle tree: {}", err); err!(AccountCompressionError::ConcurrentMerkleTreeError) } } } Err(err) => { msg!("Error zero copying concurrent merkle tree: {}", err); err!(AccountCompressionError::ZeroCopyError) } } }; }

Initialize Merkle Tree

The initialize method sets up a new Concurrent Merkle Tree, including its rightmost_proof.
 
Steps:
  1. Check Bounds
    • Ensures that the provided MAX_DEPTH and MAX_BUFFER_SIZE are within acceptable limits.
  2. Check if Tree is Already Initialized
    • Prevents re-initialization of an already initialized tree.
  3. Initialize rightmost_proof
    • Sets up the rightmost_proof field, which keeps track of the proof to the rightmost leaf in the tree. The Default implementation on Path sets all fields to their default values (zeros).
  4. Fill rightmost_proof.proof
    • Creates an array intended to cache empty-node hashes at different levels; since it is never populated here, every empty node is still computed from scratch.
    • Initializes the proof array in rightmost_proof. It is the proof of the rightmost leaf in a tree whose leaves are all empty.
  5. Fill path
    • Creates a proof path, which is the proof of the rightmost leaf in an empty tree, similar to rightmost_proof.proof. It is used in the change log.
  6. Initialize Change Log
    • Sets the initial root of the tree, using the empty_node function to calculate the root of a tree whose leaves are all empty.
    • Assigns the previously computed path to the first ChangeLog.
  7. Initialize Concurrent Merkle Tree Data
    • sequence_number: A monotonic counter incremented with each tree operation. Initialized to 0.
    • active_index: Points to the current position in the change_logs buffer. Initialized to 0.
    • buffer_size: The number of active change logs. Since the first one has just been initialized, it is set to 1.
    • Assigns the fully initialized rightmost_proof (the proof for the rightmost leaf of an all-empty tree) to the tree's rightmost_proof field.
  8. Return the Root Node
    • Returns the root node of the newly initialized tree.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- impl<const MAX_DEPTH: usize, const MAX_BUFFER_SIZE: usize> ConcurrentMerkleTree<MAX_DEPTH, MAX_BUFFER_SIZE> { pub fn is_initialized(&self) -> bool { !(self.buffer_size == 0 && self.sequence_number == 0 && self.active_index == 0) } /// This is the trustless initialization method that should be used in most /// cases. pub fn initialize(&mut self) -> Result<Node, ConcurrentMerkleTreeError> { /// Check Bounds check_bounds(MAX_DEPTH, MAX_BUFFER_SIZE); /// Check if Tree is Already Initialized if self.is_initialized() { return Err(ConcurrentMerkleTreeError::TreeAlreadyInitialized); } /// Initialize rightmost_proof let mut rightmost_proof = Path::default(); /// Fill rightmost_proof.proof let empty_node_cache = [Node::default(); MAX_DEPTH]; for (i, node) in rightmost_proof.proof.iter_mut().enumerate() { *node = empty_node_cached::<MAX_DEPTH>(i as u32, &empty_node_cache); } /// Fill path let mut path = [Node::default(); MAX_DEPTH]; for (i, node) in path.iter_mut().enumerate() { *node = empty_node_cached::<MAX_DEPTH>(i as u32, &empty_node_cache); } /// Initialize Change Log self.change_logs[0].root = empty_node(MAX_DEPTH as u32); self.change_logs[0].path = path; /// Initialize Concurrent Merkle Tree Data self.sequence_number = 0; self.active_index = 0; self.buffer_size = 1; self.rightmost_proof = rightmost_proof; /// Return the Root Node Ok(self.change_logs[0].root) } /// ... } /// Represents a proof to perform a Merkle tree operation on the leaf at `index` #[derive(Copy, Clone, Debug, PartialEq, Eq)] #[repr(C)] pub struct Path<const MAX_DEPTH: usize> { pub proof: [Node; MAX_DEPTH], pub leaf: Node, pub index: u32, pub _padding: u32, } impl<const MAX_DEPTH: usize> Default for Path<MAX_DEPTH> { fn default() -> Self { Self { proof: [Node::default(); MAX_DEPTH], leaf: Node::default(), index: 0, _padding: 0, } } } /// --- /solana-program-library/libraries/concurrent-merkle-tree/src/node.rs --- /// Abstract type for 32 byte leaf data pub type Node = [u8; 32];
 
The rightmost_proof is the proof for the rightmost node in the tree.
  • leaf is the rightmost leaf node.
  • index is the index of the next leaf to be appended.
After initialization, the rightmost_proof's leaf is the empty node, the proof is the proof of an empty leaf in an empty tree, and the index is 0. During append operations, an index of 0 indicates the tree has just been initialized, so the append logic handles this case specially.
[Figure: rightmost_proof of a newly initialized (empty) tree]
 
check_bounds enforces constraints on max depth and buffer size.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs ---
/// Enforce constraints on max depth and buffer size
#[inline(always)]
fn check_bounds(max_depth: usize, max_buffer_size: usize) {
    // We cannot allow a tree depth greater than 30 because of the bit math
    // required to update `ChangeLog`s
    assert!(max_depth < 31);
    // This will return true if MAX_BUFFER_SIZE is a power of 2 or if it is 0
    assert!(max_buffer_size & (max_buffer_size - 1) == 0);
}

Calculate Empty Tree Node Hash

The empty_node_cached function is designed to compute the hash of empty nodes up to a specified depth (level) in the tree, utilizing a cache to avoid redundant computations.
 
Generics:
const N: usize: Represents the size of the cache array, indicating how many precomputed empty nodes are available.
Parameters:
  • level: u32: The depth level in the Merkle tree for which the empty node needs to be computed.
  • cache: &[Node; N]: A reference to an array of precomputed empty nodes, facilitating faster access to previously computed hashes.
Returns:
  • Node: A 32-byte array representing the empty node at the specified level.
 
Steps:
  1. Initial Setup
    • Initializes the data variable with the EMPTY node, a 32-byte array of zeroes.
    • This serves as the return value for level = 0, ensuring that the base case is handled correctly.
  2. Recursively Compute the Node Hash
    • Checks that level is not zero; level = 0 is the base case already covered by the EMPTY node.
    • Determines the immediately lower level (level - 1) whose empty node needs to be fetched or computed.
    • Retrieving the lower-level empty node:
      • Condition: checks whether the target level is within the bounds of the cache and the cached node is not EMPTY.
      • If true: uses the precomputed empty node from the cache.
      • If false: recursively calls empty_node(target) to compute the hash of the node on the lower level.
    • Hashes the lower-level empty node with itself to get the hash of the node at the current level.
  3. Return the Computed Node Hash
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/node.rs --- /// Calculates the hash of empty nodes up to level i pub fn empty_node(level: u32) -> Node { empty_node_cached::<0>(level, &[]) } /// Calculates and caches the hash of empty nodes up to level i pub fn empty_node_cached<const N: usize>(level: u32, cache: &[Node; N]) -> Node { /// Initial Setup let mut data = EMPTY; /// Recursively Computes Node Hash if level != 0 { let target = (level - 1) as usize; /// Check whether there is a cached node let lower_empty = if target < cache.len() && cache[target] != EMPTY { cache[target] } else { /// If there is no cached node hash, then calcualte /// from scratch empty_node(target as u32) }; /// Calculate node hash based on child node hashes let hash = hashv(&[lower_empty.as_ref(), lower_empty.as_ref()]); data.copy_from_slice(hash.as_ref()); } /// Returning the Computed Node Hash data }

Wrap and Emit Change Log Event

/// --- programs/account-compression/src/noop/mod.rs --- pub fn wrap_event<'info>( event: &AccountCompressionEvent, noop_program: &Program<'info, Noop>, ) -> Result<()> { invoke( &spl_noop::instruction(event.try_to_vec()?), &[noop_program.to_account_info()], )?; Ok(()) } /// --- programs/account-compression/src/events/mod.rs --- #[derive(AnchorDeserialize, AnchorSerialize)] #[repr(C)] pub enum AccountCompressionEvent { ChangeLog(ChangeLogEvent), ApplicationData(ApplicationDataEvent), } /// --- programs/account-compression/src/events/changelog_event.rs --- #[derive(AnchorDeserialize, AnchorSerialize)] #[repr(C)] pub enum ChangeLogEvent { V1(ChangeLogEventV1), } #[derive(AnchorDeserialize, AnchorSerialize)] pub struct ChangeLogEventV1 { /// Public key of the ConcurrentMerkleTree pub id: Pubkey, /// Nodes of off-chain merkle tree needed by indexer pub path: Vec<PathNode>, /// Index corresponding to the number of successful operations on this tree. /// Used by the off-chain indexer to figure out when there are gaps to be backfilled. pub seq: u64, /// Bitmap of node parity (used when hashing) pub index: u32, } /// --- programs/account-compression/src/state/path_node.rs --- #[derive(AnchorDeserialize, AnchorSerialize, Clone, Copy, Debug)] pub struct PathNode { pub node: [u8; 32], pub index: u32, } /// --- programs/account-compression/src/events/application_data.rs --- #[derive(AnchorDeserialize, AnchorSerialize)] #[repr(C)] pub enum ApplicationDataEvent { V1(ApplicationDataEventV1), } #[derive(AnchorDeserialize, AnchorSerialize)] pub struct ApplicationDataEventV1 { pub application_data: Vec<u8>, }

Update the Canopy

During the merkle tree initialization process, since change_log is None, no updates are made to the canopy.
/// --- programs/account-compression/src/canopy.rs --- pub fn update_canopy( canopy_bytes: &mut [u8], max_depth: u32, change_log: Option<&ChangeLogEvent>, ) -> Result<()> { check_canopy_bytes(canopy_bytes)?; let canopy = cast_slice_mut::<u8, Node>(canopy_bytes); let path_len = get_cached_path_length(canopy, max_depth)?; if let Some(cl_event) = change_log { match &*cl_event { ChangeLogEvent::V1(cl) => { // Update the canopy from the newest change log for path_node in cl.path.iter().rev().skip(1).take(path_len as usize) { // node_idx - 2 maps to the canopy index canopy[(path_node.index - 2) as usize] = path_node.node; } } } } Ok(()) } /// --- programs/account-compression/src/canopy.rs --- #[inline(always)] pub fn check_canopy_bytes(canopy_bytes: &[u8]) -> Result<()> { if canopy_bytes.len() % size_of::<Node>() != 0 { msg!( "Canopy byte length {} is not a multiple of {}", canopy_bytes.len(), size_of::<Node>() ); err!(AccountCompressionError::CanopyLengthMismatch) } else { Ok(()) } }

Replace Leaf

replace_leaf facilitates the modification of a specific leaf within a Concurrent Merkle Tree (CMT), ensuring that updates are consistent and secure, even in the presence of concurrent operations.
 
Purpose:
  • Primary Role: Overwrites an existing leaf at a specified index in the Concurrent Merkle Tree with a new leaf value.
  • Concurrency Handling: Ensures that the update is valid even if multiple modifications are attempted concurrently by verifying proofs and updating the tree accordingly.
  • Security: Validates authority and ensures that only authorized entities can perform updates.
 
Steps:
  1. Owner Verification
    • Ensures that the merkle_tree account is owned by the SPL Account Compression program.
  2. Deserialize Merkle Tree Header Data
    • Accesses and separates the data within the merkle_tree account into the header and the rest of the tree data.
    • Converts the raw header bytes into a structured ConcurrentMerkleTreeHeader object.
  3. Authority and Leaf Index Validation
    • Ensures that the signer (authority) has the right to modify the tree.
    • Validates that the provided index is within the bounds of the tree's capacity.
  4. Separate Merkle Tree and Canopy Bytes
    • Determines the size of the Merkle tree data based on the header and splits the remaining data into the Merkle tree nodes (tree_bytes) and the canopy (canopy_bytes).
  5. Collect Proof Nodes
    • Iterates over the remaining_accounts (expected to contain proof nodes) and collects their byte representations.
    • Completes the collected proof with cached nodes from the canopy, so the proof is full-length without bloating the transaction.
  6. Apply the set_leaf Operation
    • Executes the set_leaf method on the ConcurrentMerkleTree to replace the specified leaf.
  7. Update the Canopy with the New Change Log
    • Incorporates the latest changes into the canopy so that cached nodes reflect the most recent state of the tree, keeping future proof generation accurate and efficient.
  8. Emit a Change Log Event
    • Emits an event representing the change made to the Merkle tree so that off-chain indexers and listeners can track updates.
/// --- programs/account-compression/src/lib.rs --- /// Executes an instruction that overwrites a leaf node. /// Composing programs should check that the data hashed into previous_leaf /// matches the authority information necessary to execute this instruction. pub fn replace_leaf( ctx: Context<Modify>, root: [u8; 32], previous_leaf: [u8; 32], new_leaf: [u8; 32], index: u32, ) -> Result<()> { /// Owner Verification require_eq!( *ctx.accounts.merkle_tree.owner, crate::id(), AccountCompressionError::IncorrectAccountOwner ); /// Deserialize Merkle Tree Header Data let mut merkle_tree_bytes = ctx.accounts.merkle_tree.try_borrow_mut_data()?; let (header_bytes, rest) = merkle_tree_bytes.split_at_mut(CONCURRENT_MERKLE_TREE_HEADER_SIZE_V1); let header = ConcurrentMerkleTreeHeader::try_from_slice(header_bytes)?; /// Authority and Leaf Index Validation header.assert_valid_authority(&ctx.accounts.authority.key())?; header.assert_valid_leaf_index(index)?; /// Separate Merkle Tree and Canopy Bytes let merkle_tree_size = merkle_tree_get_size(&header)?; let (tree_bytes, canopy_bytes) = rest.split_at_mut(merkle_tree_size); /// Collecting Proof Nodes let mut proof = vec![]; for node in ctx.remaining_accounts.iter() { proof.push(node.key().to_bytes()); } fill_in_proof_from_canopy(canopy_bytes, header.get_max_depth(), index, &mut proof)?; let id = ctx.accounts.merkle_tree.key(); /// Applying the set_leaf Operation // A call is made to ConcurrentMerkleTree::set_leaf(root, previous_leaf, new_leaf, proof, index) let change_log_event = merkle_tree_apply_fn_mut!( header, id, tree_bytes, set_leaf, root, previous_leaf, new_leaf, &proof, index, )?; /// Updating the Canopy with the New Change Log update_canopy( canopy_bytes, header.get_max_depth(), Some(&change_log_event), )?; /// Emitting a Change Log Event wrap_event( &AccountCompressionEvent::ChangeLog(*change_log_event), &ctx.accounts.noop, ) }

Understanding the Canopy:

  • Definition: The canopy is a cache of the uppermost levels of the Merkle Tree. It stores precomputed nodes to optimize proof generation and reduce transaction sizes.
  • Role in replace_leaf:
    • Proof Enhancement: By leveraging the canopy, the function can truncate proofs, embedding cached nodes instead of transmitting them explicitly. This reduces the amount of data that needs to be sent with the transaction.
    • Efficiency: Facilitates faster and more efficient proof verification by minimizing the depth of the proof that needs to be processed on-chain.

fill_in_proof_from_canopy

In large Merkle Trees, providing a full proof for a leaf can be inefficient due to transaction size limitations. The canopy allows the tree to cache upper-level nodes, enabling proofs to be truncated and supplemented with cached data on-chain.
[Figure: a Merkle tree with its upper levels cached in the canopy]
 
The account-compression program stores these cached nodes in a flat array. In the example above, canopy_bytes holds the canopy nodes level by level, excluding the root: canopy[0] = Node#0, canopy[1] = Node#1, canopy[2] = Node#00, canopy[3] = Node#01, canopy[4] = Node#10, canopy[5] = Node#11.
If we want to update Node#101, we only need to provide the proof node Node#100; the account-compression program finds the remaining proof nodes, Node#11 and Node#0, in the canopy.
[Figure: proof for Node#101 completed from the canopy]
 
fill_in_proof_from_canopy injects cached proof nodes from the canopy into the proof argument.
 
Parameters:
  • canopy_bytes: &[u8]: A byte slice representing the canopy data, which is a cache of the upper levels of the Merkle Tree.
  • max_depth: u32: The maximum depth of the Merkle Tree.
  • index: u32: The index of the leaf node for which the proof is being constructed.
  • proof: &mut Vec<Node>: A mutable reference to a vector of Node (32-byte arrays) representing the partial Merkle proof collected so far. This function will append additional nodes to this proof using the canopy.
 
Steps:
  1. Set Up an Empty Node Cache
    • Initializes a cache of empty nodes to optimize the retrieval of empty nodes at various tree levels.
    • The comment notes that "30 is hard coded as it is the current max depth that SPL Compression supports," so the cache is prepared to handle trees up to depth 30.
  2. Validate the Canopy Bytes
    • Ensures that the provided canopy_bytes have a valid length, i.e. a multiple of the size of a Node (32 bytes).
  3. Cast Canopy Bytes to Nodes
    • Converts the raw byte slice canopy_bytes into a slice of Node values (32-byte arrays).
    • Assumption: the canopy_bytes length has already been validated to be a multiple of size_of::<Node>().
  4. Determine the Cached Path Length
    • Calculates the number of levels from the top of the tree that are cached in the canopy. get_cached_path_length computes path_len based on the size of the canopy.
  5. Compute the Node Index where the Path Intersects the Canopy
    • Determines the index of the node in the canopy where the path from the leaf to the root intersects the cached portion.
  6. Initialize the Inferred Nodes Vector
    • Prepares a vector to collect the nodes inferred from the canopy that will be added to the proof.
  7. Loop to Collect Inferred Nodes from the Canopy
  8. Adjust the Proof to Ensure the Correct Length
    • Adds the necessary inferred nodes to the proof, skipping any excess nodes.
      • proof.len(): the number of nodes already in the proof (collected from remaining_accounts).
      • inferred_nodes.len(): the number of nodes inferred from the canopy.
      • max_depth as usize: the total expected proof length for a tree of depth max_depth.
      • overlap: the number of nodes to skip from inferred_nodes to avoid exceeding the required proof length.
/// --- programs/account-compression/src/canopy.rs --- pub fn fill_in_proof_from_canopy( canopy_bytes: &[u8], max_depth: u32, index: u32, proof: &mut Vec<Node>, ) -> Result<()> { /// Setting Up an Empty Node Cache // 30 is hard coded as it is the current max depth that SPL Compression supports let mut empty_node_cache = Box::new([EMPTY; 30]); /// Validating the Canopy Bytes check_canopy_bytes(canopy_bytes)?; /// Casting Canopy Bytes to Nodes let canopy = cast_slice::<u8, Node>(canopy_bytes); /// Determining the Cached Path Length let path_len = get_cached_path_length(canopy, max_depth)?; /// Computing the Node Index where the Path Intersects the Canopy let mut node_idx = ((1 << max_depth) + index) >> (max_depth - path_len); /// Initializing the Inferred Nodes Vector let mut inferred_nodes = vec![]; /// Looping to Collect Inferred Nodes from the Canopy while node_idx > 1 { // node_idx - 2 maps to the canopy index let shifted_index = node_idx as usize - 2; let cached_idx = if shifted_index % 2 == 0 { shifted_index + 1 } else { shifted_index - 1 }; if canopy[cached_idx] == EMPTY { let level = max_depth - (31 - node_idx.leading_zeros()); let empty_node = empty_node_cached::<30>(level, &mut empty_node_cache); inferred_nodes.push(empty_node); } else { inferred_nodes.push(canopy[cached_idx]); } node_idx >>= 1; } /// Adjusting the Proof to Ensure Correct Length // We only want to add inferred canopy nodes such that the proof length // is equal to the tree depth. If the length of proof + length of canopy nodes is // less than the tree depth, the instruction will fail. let overlap = (proof.len() + inferred_nodes.len()).saturating_sub(max_depth as usize); proof.extend(inferred_nodes.iter().skip(overlap)); Ok(()) }

get_cached_path_length

get_cached_path_length calculates the depth (number of levels) of the canopy tree.
For a canopy of depth n, the number of stored nodes is 2 + 4 + ... + 2^n = 2^(n+1) - 2, i.e. a full binary tree minus the root.
So to check whether the length of the canopy array is valid, we check whether the length plus 2 is a power of 2.
It also checks that the canopy does not store more nodes than the tree can hold: the number of nodes in the canopy array must not exceed the number of nodes in the full tree excluding the root (the canopy never stores the root).
closest_power_of_2.trailing_zeros() - 1 then equals n, the depth of the canopy tree.
/// --- programs/account-compression/src/canopy.rs --- #[inline(always)] fn get_cached_path_length(canopy: &[Node], max_depth: u32) -> Result<u32> { // The offset of 2 is applied because the canopy is a full binary tree without the root node // Size: (2^n - 2) -> Size + 2 must be a power of 2 let closest_power_of_2 = (canopy.len() + 2) as u32; // This expression will return true if `closest_power_of_2` is actually a power of 2 if closest_power_of_2 & (closest_power_of_2 - 1) == 0 { // (1 << max_depth) returns the number of leaves in the full merkle tree // (1 << (max_depth + 1)) - 1 returns the number of nodes in the full tree // The canopy size cannot exceed the size of the tree if closest_power_of_2 > (1 << (max_depth + 1)) { msg!( "Canopy size is too large. Size: {}. Max size: {}", closest_power_of_2 - 2, (1 << (max_depth + 1)) - 2 ); return err!(AccountCompressionError::CanopyLengthMismatch); } } else { msg!( "Canopy length {} is not 2 less than a power of 2", canopy.len() ); return err!(AccountCompressionError::CanopyLengthMismatch); } // 1 is subtracted from the trailing zeros because the root is not stored in the canopy Ok(closest_power_of_2.trailing_zeros() - 1) }
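A quick numeric check of this formula (illustrative only): a depth-2 canopy stores 2 + 4 = 6 nodes.
fn main() {
    let canopy_len: u32 = 6;                 // nodes stored by a depth-2 canopy
    let closest_power_of_2 = canopy_len + 2; // 8
    // 8 is a power of 2, so the length is valid...
    assert_eq!(closest_power_of_2 & (closest_power_of_2 - 1), 0);
    // ...and trailing_zeros(8) - 1 = 2 recovers the canopy depth.
    assert_eq!(closest_power_of_2.trailing_zeros() - 1, 2);
}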

Example

In the above example, where we want to update Node#101:
  • node_idx: the breadth-first index of the node among all nodes of the full tree, counting from 1 with the root as 1.
  • shifted_index: the index of the node in the canopy array (index starts from 0).
(1 << max_depth) is the index of the first leaf in a tree of depth max_depth (indices start from 1).
For example, if max_depth is 3, then the index of Node#000 is 8 (Root#1 is 1, Node#0 is 2, Node#1 is 3, Node#00 is 4, etc.).
[Figure: breadth-first node indices in the example tree]
 
(1 << max_depth) >> (max_depth - path_len), i.e. 1 << path_len, represents the index (counting from 1) of the first node at the lowest level of the canopy tree, which here is Node#00, whose index is 4.
 
index >> (max_depth - path_len) represents the position of the intersection node within the lowest level of the canopy tree (counting from 0).
In our example, the intersection node is Node#10, whose position at that level is 0b10 = 2 (Node#00 is 0, Node#01 is 1).
 
Combining the two terms, node_idx = ((1 << max_depth) + index) >> (max_depth - path_len) is the breadth-first index of the intersection node among all nodes of the canopy tree (not only the leaves), counting from 1 and including the root. In our example, node_idx is 6.
 
node_idx counts from 1 and includes the root in its numbering, but the canopy array does not store the root, so we subtract 2 to get the intersection node's index in the canopy array.
let shifted_index = node_idx as usize - 2;
 
If shifted_index is odd, the intersection node is a right child, so the proof node (its sibling) lies to its left; otherwise it is a left child and the sibling lies to its right.
let cached_idx = if shifted_index % 2 == 0 { shifted_index + 1 } else { shifted_index - 1 };
 
If canopy[cached_idx] is EMPTY, none of the leaves beneath it have been updated since tree initialization, so its value can be calculated purely from EMPTY nodes.
node_idx's type is u32, so (31 - node_idx.leading_zeros()) is the node's depth counted from the top (the root sits at depth 0). Therefore max_depth - (31 - node_idx.leading_zeros()) is its level counted from the bottom.
In our example, the intersection node of Node#101 is Node#10: (31 - node_idx.leading_zeros()) is 2, and max_depth - (31 - node_idx.leading_zeros()) is 1, i.e. Node#10 sits one level above the leaves.
if canopy[cached_idx] == EMPTY {
    let level = max_depth - (31 - node_idx.leading_zeros());
    let empty_node = empty_node_cached::<30>(level, &mut empty_node_cache);
    inferred_nodes.push(empty_node);
} else {
    inferred_nodes.push(canopy[cached_idx]);
}
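To make the index arithmetic concrete, here is a minimal standalone sketch (not the SPL implementation) that traces the loop for the example above: max_depth = 3, leaf index 5 (Node#101), and a canopy of depth 2.
fn main() {
    let max_depth: u32 = 3; // tree depth in the example
    let path_len: u32 = 2;  // canopy depth
    let index: u32 = 5;     // leaf index of Node#101 (0-based)

    // Breadth-first index of the intersection node (root = 1)
    let mut node_idx = ((1u32 << max_depth) + index) >> (max_depth - path_len);
    assert_eq!(node_idx, 6); // Node#10

    // Collect the canopy-array indices of the sibling (proof) nodes
    let mut cached_indices = vec![];
    while node_idx > 1 {
        let shifted_index = node_idx as usize - 2; // canopy array index
        let cached_idx = if shifted_index % 2 == 0 {
            shifted_index + 1
        } else {
            shifted_index - 1
        };
        cached_indices.push(cached_idx);
        node_idx >>= 1;
    }

    // canopy[5] is Node#11 and canopy[0] is Node#0,
    // exactly the two proof nodes the client did not have to send.
    assert_eq!(cached_indices, vec![5, 0]);
}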


Concurrent Tree Set Leaf

set_leaf is designed to update a leaf node at a specified index in a Merkle tree. It ensures that the update is valid by verifying the provided Merkle proof and handles concurrency by fast-forwarding proofs through any changes that have occurred since the proof was generated.
 
Parameters:
  • &mut self: Mutable reference to the ConcurrentMerkleTree instance.
  • current_root: The Merkle root at the time the proof was generated.
  • previous_leaf: The expected current value of the leaf at the given index.
  • new_leaf: The new value to set at the leaf node.
  • proof_vec: A vector containing the Merkle proof from the leaf to the root.
  • index: The index of the leaf node to update.
Return Value:
  • Returns a Result<Node, ConcurrentMerkleTreeError>, where the Node is the new root of the Merkle tree after the update.
 
Steps:
  1. Bounds Checking
    • Ensures that MAX_DEPTH and MAX_BUFFER_SIZE are within acceptable limits.
      • Checks that max_depth is less than 31 (to prevent overflow in the bit operations).
      • Asserts that max_buffer_size is a power of 2.
  2. Leaf Index Validation
    • Verifies that the provided index is within the bounds of the tree's depth, i.e. leaf_index is less than 2^max_depth.
  3. Tree Initialization Check
    • Ensures that the Merkle tree has been initialized before proceeding. is_initialized checks that buffer_size, sequence_number, and active_index are not all zero.
  4. Index vs. Rightmost Proof Index
    • Ensures that the index is not greater than the index of the next leaf to be appended (rightmost_proof.index).
  5. Prepare the Proof Array
    • Initializes a fixed-size array proof of length MAX_DEPTH and fills it with the provided proof.
    • fill_in_proof:
      • Copies elements from proof_vec into proof.
      • If proof_vec is shorter than MAX_DEPTH, fills the remainder with empty-tree node hashes.
  6. Apply the Proof to Update the Tree
    • Updates the leaf via try_apply_proof using the assembled proof.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// This method will update the leaf at `index`. /// /// However if the proof cannot be verified, this method will fail. pub fn set_leaf( &mut self, current_root: Node, previous_leaf: Node, new_leaf: Node, proof_vec: &[Node], index: u32, ) -> Result<Node, ConcurrentMerkleTreeError> { /// Bounds Checking check_bounds(MAX_DEPTH, MAX_BUFFER_SIZE); /// Leaf Index Validation check_leaf_index(index, MAX_DEPTH)?; /// Tree Initialization Check if !self.is_initialized() { return Err(ConcurrentMerkleTreeError::TreeNotInitialized); } /// Index vs. Rightmost Proof Index if index > self.rightmost_proof.index { Err(ConcurrentMerkleTreeError::LeafIndexOutOfBounds) } else { /// Preparing the Proof Array let mut proof: [Node; MAX_DEPTH] = [Node::default(); MAX_DEPTH]; fill_in_proof::<MAX_DEPTH>(proof_vec, &mut proof); log_compute!(); /// Apply Proof to Update Tree self.try_apply_proof( current_root, previous_leaf, new_leaf, &mut proof, index, true, ) } } /// Enforce constraints on max depth and buffer size #[inline(always)] fn check_bounds(max_depth: usize, max_buffer_size: usize) { // We cannot allow a tree depth greater than 30 because of the bit math // required to update `ChangeLog`s assert!(max_depth < 31); // This will return true if MAX_BUFFER_SIZE is a power of 2 or if it is 0 assert!(max_buffer_size & (max_buffer_size - 1) == 0); } fn check_leaf_index(leaf_index: u32, max_depth: usize) -> Result<(), ConcurrentMerkleTreeError> { if leaf_index >= (1 << max_depth) { return Err(ConcurrentMerkleTreeError::LeafIndexOutOfBounds); } Ok(()) } pub fn is_initialized(&self) -> bool { !(self.buffer_size == 0 && self.sequence_number == 0 && self.active_index == 0) } /// --- /solana-program-library/libraries/concurrent-merkle-tree/src/hash.rs --- /// Fills in proof to the height of the concurrent merkle tree. /// Missing nodes are inferred as empty node hashes. pub fn fill_in_proof<const MAX_DEPTH: usize>( proof_vec: &[Node], full_proof: &mut [Node; MAX_DEPTH], ) { solana_logging!("Attempting to fill in proof"); if !proof_vec.is_empty() { full_proof[..proof_vec.len()].copy_from_slice(proof_vec); } for (i, item) in full_proof .iter_mut() .enumerate() .take(MAX_DEPTH) .skip(proof_vec.len()) { *item = empty_node(i as u32); } } /// --- /solana-program-library/libraries/concurrent-merkle-tree/src/node.rs --- /// Calculates the hash of empty nodes up to level i pub fn empty_node(level: u32) -> Node { empty_node_cached::<0>(level, &[]) } /// Calculates and caches the hash of empty nodes up to level i pub fn empty_node_cached<const N: usize>(level: u32, cache: &[Node; N]) -> Node { let mut data = EMPTY; if level != 0 { let target = (level - 1) as usize; let lower_empty = if target < cache.len() && cache[target] != EMPTY { cache[target] } else { empty_node(target as u32) }; let hash = hashv(&[lower_empty.as_ref(), lower_empty.as_ref()]); data.copy_from_slice(hash.as_ref()); } data }

Update Canopy

/// --- programs/account-compression/src/lib.rs --- /// Executes an instruction that overwrites a leaf node. /// Composing programs should check that the data hashed into previous_leaf /// matches the authority information necessary to execute this instruction. pub fn replace_leaf( ctx: Context<Modify>, root: [u8; 32], previous_leaf: [u8; 32], new_leaf: [u8; 32], index: u32, ) -> Result<()> { /// ... update_canopy( canopy_bytes, header.get_max_depth(), Some(&change_log_event), )?; /// ... }
 
After set_leaf returns, the macro calls ChangeLogEvent::from to derive a ChangeLogEvent from the newest change log, the tree account's public key, and the tree's sequence number.
/// --- programs/account-compression/src/macros.rs --- /// This macro applies functions on a ConcurrentMerkleT:ee and emits leaf information /// needed to sync the merkle tree state with off-chain indexers. #[macro_export] macro_rules! _merkle_tree_depth_size_apply_fn { ($max_depth:literal, $max_size:literal, $id:ident, $bytes:ident, $func:ident, TreeLoad::Mutable, $($arg:tt)*) => { match ConcurrentMerkleTree::<$max_depth, $max_size>::load_mut_bytes($bytes) { Ok(merkle_tree) => { match merkle_tree.$func($($arg)*) { Ok(_) => { Ok(Box::<ChangeLogEvent>::from((merkle_tree.get_change_log(), $id, merkle_tree.sequence_number))) } Err(err) => { msg!("Error using concurrent merkle tree: {}", err); err!(AccountCompressionError::ConcurrentMerkleTreeError) } } } Err(err) => { msg!("Error zero copying concurrent merkle tree: {}", err); err!(AccountCompressionError::ZeroCopyError) } } }; /// ... }
 
ChangeLogEvent::from calculates the index of each node along the path within the whole Merkle tree (indices start from 1, with the root as 1).
/// --- programs/account-compression/src/events/changelog_event.rs --- impl<const MAX_DEPTH: usize> From<(Box<ChangeLog<MAX_DEPTH>>, Pubkey, u64)> for Box<ChangeLogEvent> { fn from(log_info: (Box<ChangeLog<MAX_DEPTH>>, Pubkey, u64)) -> Self { let (changelog, tree_id, seq) = log_info; let path_len = changelog.path.len() as u32; let mut path: Vec<PathNode> = changelog .path .iter() .enumerate() .map(|(lvl, n)| { PathNode::new( *n, (1 << (path_len - lvl as u32)) + (changelog.index >> lvl), ) }) .collect(); path.push(PathNode::new(changelog.root, 1)); Box::new(ChangeLogEvent::V1(ChangeLogEventV1 { id: tree_id, path, seq, index: changelog.index, })) } }
 
For example, Root#1's index is 1, Node#0's is 2, Node#1's is 3, Node#00's is 4, etc.
[Figure: node indices in the path reported by ChangeLogEvent]
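As a sanity check on the index formula in ChangeLogEvent::from, here is a small hedged sketch (not part of the SPL code) that reproduces the path indices for a depth-3 tree and leaf index 5 (Node#101).
fn main() {
    let max_depth: u32 = 3;  // path_len equals the tree depth
    let leaf_index: u32 = 5; // Node#101

    // Index of each path node in the whole tree (root = 1), per ChangeLogEvent::from
    let indices: Vec<u32> = (0..max_depth)
        .map(|lvl| (1 << (max_depth - lvl)) + (leaf_index >> lvl))
        .collect();

    // Leaf Node#101 -> 13, parent Node#10 -> 6, grandparent Node#1 -> 3
    assert_eq!(indices, vec![13, 6, 3]);
    // The root is appended separately with index 1.
}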
 
update_canopy updates the canopy.
It iterates over the path nodes in the change log, maps each node's tree index to its position in the canopy array, and writes the updated hashes into the canopy.
/// --- programs/account-compression/src/canopy.rs --- pub fn update_canopy( canopy_bytes: &mut [u8], max_depth: u32, change_log: Option<&ChangeLogEvent>, ) -> Result<()> { check_canopy_bytes(canopy_bytes)?; let canopy = cast_slice_mut::<u8, Node>(canopy_bytes); let path_len = get_cached_path_length(canopy, max_depth)?; if let Some(cl_event) = change_log { match &*cl_event { ChangeLogEvent::V1(cl) => { // Update the canopy from the newest change log for path_node in cl.path.iter().rev().skip(1).take(path_len as usize) { // node_idx - 2 maps to the canopy index canopy[(path_node.index - 2) as usize] = path_node.node; } } } } Ok(()) }
 
Because cl.path is ordered from leaf to root and the canopy does not store the root, the iteration reverses the path and skips the root node before taking the topmost path_len nodes:
cl.path.iter().rev().skip(1).take(path_len as usize)
 
path_node.index counts from 1 starting at the root, so we subtract 2 to get the node's index in the canopy array:
canopy[(path_node.index - 2) as usize] = path_node.node;

Emit Event

The instruction then calls the SPL Noop program via CPI to emit the change log produced by this tree operation.
/// --- programs/account-compression/src/lib.rs --- /// Executes an instruction that overwrites a leaf node. /// Composing programs should check that the data hashed into previous_leaf /// matches the authority information necessary to execute this instruction. pub fn replace_leaf( ctx: Context<Modify>, root: [u8; 32], previous_leaf: [u8; 32], new_leaf: [u8; 32], index: u32, ) -> Result<()> { /// ... wrap_event( &AccountCompressionEvent::ChangeLog(*change_log_event), &ctx.accounts.noop, ) /// ... } /// --- programs/account-compression/src/noop/mod.rs --- pub fn wrap_event<'info>( event: &AccountCompressionEvent, noop_program: &Program<'info, Noop>, ) -> Result<()> { invoke( &spl_noop::instruction(event.try_to_vec()?), &[noop_program.to_account_info()], )?; Ok(()) }
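On the consuming side, an off-chain indexer typically finds the noop inner instruction in the transaction and Borsh-deserializes its data back into the AccountCompressionEvent shown earlier. This is a hedged sketch: the crate/module import path is an assumption, and the plumbing that extracts the instruction data from the transaction is omitted; the data itself is Borsh-encoded because wrap_event serializes the event with try_to_vec.
use borsh::BorshDeserialize;
// Assumed re-exports; these are the AccountCompressionEvent / ChangeLogEvent
// types shown above, which wrap_event Borsh-encodes via try_to_vec().
use spl_account_compression::{AccountCompressionEvent, ChangeLogEvent};

/// Decode the bytes emitted through the SPL Noop CPI back into a change log event.
fn decode_change_log(noop_ix_data: &[u8]) -> Option<ChangeLogEvent> {
    match AccountCompressionEvent::try_from_slice(noop_ix_data).ok()? {
        AccountCompressionEvent::ChangeLog(cl) => Some(cl),
        AccountCompressionEvent::ApplicationData(_) => None,
    }
}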

Transfer Authority

transfer_authority transfers the authority of a CMT. The authority is the account permitted to modify the tree's leaves.
/// --- programs/account-compression/src/lib.rs --- /// Context for transferring `authority` #[derive(Accounts)] pub struct TransferAuthority<'info> { #[account(mut)] /// CHECK: This account is validated in the instruction pub merkle_tree: UncheckedAccount<'info>, /// Authority that controls write-access to the tree /// Typically a program, e.g., the Bubblegum contract validates that leaves are valid NFTs. pub authority: Signer<'info>, } /// Transfers `authority`. /// Requires `authority` to sign pub fn transfer_authority( ctx: Context<TransferAuthority>, new_authority: Pubkey, ) -> Result<()> { require_eq!( *ctx.accounts.merkle_tree.owner, crate::id(), AccountCompressionError::IncorrectAccountOwner ); let mut merkle_tree_bytes = ctx.accounts.merkle_tree.try_borrow_mut_data()?; let (mut header_bytes, _) = merkle_tree_bytes.split_at_mut(CONCURRENT_MERKLE_TREE_HEADER_SIZE_V1); let mut header = ConcurrentMerkleTreeHeader::try_from_slice(header_bytes)?; header.assert_valid_authority(&ctx.accounts.authority.key())?; header.set_new_authority(&new_authority); header.serialize(&mut header_bytes)?; Ok(()) } /// --- programs/account-compression/src/state/concurrent_merkle_tree_header.rs --- pub fn set_new_authority(&mut self, new_authority: &Pubkey) { match self.header { ConcurrentMerkleTreeHeaderData::V1(ref mut header) => { header.authority = new_authority.clone(); msg!("Authority transferred to: {:?}", header.authority); } } }

Verify Leaf

verify_leaf is responsible for verifying that a given Merkle proof corresponds to a specific leaf node in a Concurrent Merkle Tree (CMT). If the verification fails, it throws an error.
 
Parameters:
  • ctx: The execution context, containing accounts and other context-specific information.
  • root: The expected root hash of the Merkle tree.
  • leaf: The leaf node value that is being verified.
  • index: The index of the leaf node within the tree.
Return Value:
  • Returns Result<()>, which is Ok(()) if verification succeeds or an error if it fails.
 
Steps:
  1. Verify Account Ownership
    • Ensures that the merkle_tree account is owned by the spl_account_compression program.
  2. Deserialize Merkle Tree Header
    • Safely borrows the data from the merkle_tree account.
    • Separates the borrowed data into the header portion and the rest of the data.
    • Parses the header bytes into a ConcurrentMerkleTreeHeader struct.
  3. Validate Header and Leaf Index
    • Ensures that the header is valid and that the provided leaf index is within bounds.
      • header.assert_valid() checks that the account type is ConcurrentMerkleTree (i.e., the header has been initialized).
      • header.assert_valid_leaf_index(index) checks that the provided index is less than 2^max_depth.
  4. Split the Rest into Tree Bytes and Canopy Bytes
    • Determines the size of the Merkle tree data based on the header information.
    • Separates the tree data (tree_bytes) from the canopy data (canopy_bytes).
  5. Construct the Proof Vector
    • Builds the Merkle proof vector from the remaining accounts provided in the context.
  6. Fill in the Proof from the Canopy
    • Ensures the proof vector has the correct length, filling in any missing nodes from the canopy.
  7. Verify the Leaf Using the Merkle Tree
    • Gets the public key (ID) of the merkle_tree account.
    • Applies the prove_leaf function of the ConcurrentMerkleTree to verify the leaf.
  8. Return Success
/// Verifies a provided proof and leaf. /// If invalid, throws an error. pub fn verify_leaf( ctx: Context<VerifyLeaf>, root: [u8; 32], leaf: [u8; 32], index: u32, ) -> Result<()> { /// Verify Account Ownership require_eq!( *ctx.accounts.merkle_tree.owner, crate::id(), AccountCompressionError::IncorrectAccountOwner ); /// Deserialize Merkle Tree Header let merkle_tree_bytes = ctx.accounts.merkle_tree.try_borrow_data()?; let (header_bytes, rest) = merkle_tree_bytes.split_at(CONCURRENT_MERKLE_TREE_HEADER_SIZE_V1); let header = ConcurrentMerkleTreeHeader::try_from_slice(header_bytes)?; /// Validate Header and Leaf Index header.assert_valid()?; header.assert_valid_leaf_index(index)?; /// Split Rest into Tree Bytes and Canopy Bytes let merkle_tree_size = merkle_tree_get_size(&header)?; let (tree_bytes, canopy_bytes) = rest.split_at(merkle_tree_size); /// Construct the Proof Vector let mut proof = vec![]; for node in ctx.remaining_accounts.iter() { proof.push(node.key().to_bytes()); } /// Fill in the Proof from the Canopy fill_in_proof_from_canopy(canopy_bytes, header.get_max_depth(), index, &mut proof)?; let id = ctx.accounts.merkle_tree.key(); /// Verify the Leaf Using the Merkle Tree merkle_tree_apply_fn!(header, id, tree_bytes, prove_leaf, root, leaf, &proof, index)?; /// Return Success Ok(()) } pub fn assert_valid(&self) -> Result<()> { require_eq!( self.account_type, CompressionAccountType::ConcurrentMerkleTree, AccountCompressionError::IncorrectAccountType, ); Ok(()) } pub fn assert_valid_leaf_index(&self, leaf_index: u32) -> Result<()> { if leaf_index >= (1 << self.get_max_depth()) { return Err(AccountCompressionError::LeafIndexOutOfBounds.into()); } Ok(()) }

Prove Leaf

prove_leaf verifies that a given leaf exists within the current state of the Merkle tree. It does this by validating the provided proof against the current tree root. If the leaf or proof is invalid, the function returns an error and the transaction reverts.
 
Parameters:
  • &self: Immutable reference to the ConcurrentMerkleTree instance.
  • current_root: The Merkle root at the time the proof was generated.
  • leaf: The leaf node value that is being verified.
  • proof_vec: A slice containing the Merkle proof from the leaf to the root.
  • leaf_index: The index of the leaf node within the tree.
 
Steps:
  1. Bounds Checking
    • Ensures that MAX_DEPTH and MAX_BUFFER_SIZE are within acceptable limits.
  2. Leaf Index Validation
    • Verifies that the provided leaf_index is within the bounds of the tree's depth.
  3. Tree Initialization Check
    • Ensures that the Merkle tree has been initialized before proceeding.
  4. Rightmost Proof Index Validation
    • Ensures that the leaf_index is not greater than the current maximum index (rightmost_proof.index).
  5. Preparing the Proof Array
    • Initializes a fixed-size array proof of length MAX_DEPTH and fills it with the provided proof.
  6. Checking Validity of the Leaf and Proof
    • Validates the leaf and proof against the current state of the tree.
  7. Handling Invalid Proofs
    • If the proof is invalid after adjustments, logs the failure and returns an InvalidProof error.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// This method will fail if the leaf cannot be proven /// to exist in the current tree root. /// /// This method will attempts to prove the leaf first /// using the proof nodes provided. However if this fails, /// then a proof will be constructed by inferring a proof /// from the changelog buffer. /// /// Note: this is *not* the same as verifying that a (proof, leaf) /// combination is valid for the given root. That functionality /// is provided by `check_valid_proof`. pub fn prove_leaf( &self, current_root: Node, leaf: Node, proof_vec: &[Node], leaf_index: u32, ) -> Result<(), ConcurrentMerkleTreeError> { /// Bounds Checking check_bounds(MAX_DEPTH, MAX_BUFFER_SIZE); /// Leaf Index Validation check_leaf_index(leaf_index, MAX_DEPTH)?; /// Tree Initialization Check if !self.is_initialized() { return Err(ConcurrentMerkleTreeError::TreeNotInitialized); } /// Rightmost Proof Index Validation if leaf_index > self.rightmost_proof.index { solana_logging!( "Received an index larger than the rightmost index {} > {}", leaf_index, self.rightmost_proof.index ); Err(ConcurrentMerkleTreeError::LeafIndexOutOfBounds) } else { /// Preparing the Proof Array let mut proof: [Node; MAX_DEPTH] = [Node::default(); MAX_DEPTH]; fill_in_proof::<MAX_DEPTH>(proof_vec, &mut proof); /// Checking Validity of the Leaf and Proof let valid_root = self.check_valid_leaf(current_root, leaf, &mut proof, leaf_index, true)?; /// Handling Invalid Proofs if !valid_root { solana_logging!("Proof failed to verify"); return Err(ConcurrentMerkleTreeError::InvalidProof); } Ok(()) } }

Append

The append function in the spl_account_compression program allows the authority of a Merkle tree to add a new leaf to the tree without providing a proof. It delegates the core logic to the append method implemented in the SPL concurrent-merkle-tree library.
 
Parameters:
  • ctx: Context<Modify>: The execution context, which includes accounts and programs required by the instruction.
  • leaf: [u8; 32]: The new leaf node to be appended to the Merkle tree.
 
Steps:
  1. Verify Merkle Tree Owner
    • Ensure that the merkle_tree account is owned by the spl_account_compression program.
  2. Split the Account Data into Header and Remaining Parts
    • Obtain a mutable reference to the data of the merkle_tree account.
    • Separate the header from the rest of the data.
    • Deserialize the header bytes into a ConcurrentMerkleTreeHeader struct.
  3. Validate the Authority
    • Ensure that the merkle tree has been initialized.
    • Ensure that the signer (authority) is indeed the authority of the Merkle tree.
  4. Split the Remaining Data into Tree Bytes and Canopy Bytes
    • Uses the max_depth and max_buffer_size from the header to compute the tree size, then splits the remaining data into tree bytes and canopy bytes.
  5. Call the append Method on the Merkle Tree
    • Append the new leaf to the Merkle tree.
  6. Update the Canopy
    • Update the canopy with the latest tree changes.
  7. Emit the Change Log Event
    • Emit an event containing the change log, allowing off-chain indexers to stay updated.
/// --- programs/account-compression/src/lib.rs --- /// This instruction allows the tree's `authority` to append a new leaf to the tree /// without having to supply a proof. /// /// Learn more about SPL /// ConcurrentMerkleTree /// [here](https://github.com/solana-labs/solana-program-library/tree/master/libraries/concurrent-merkle-tree) pub fn append(ctx: Context<Modify>, leaf: [u8; 32]) -> Result<()> { /// Verify Merkle Tree Owner require_eq!( *ctx.accounts.merkle_tree.owner, crate::id(), AccountCompressionError::IncorrectAccountOwner ); /// Split the Account Data into Header and Remaining Parts let mut merkle_tree_bytes = ctx.accounts.merkle_tree.try_borrow_mut_data()?; let (header_bytes, rest) = merkle_tree_bytes.split_at_mut(CONCURRENT_MERKLE_TREE_HEADER_SIZE_V1); let header = ConcurrentMerkleTreeHeader::try_from_slice(header_bytes)?; /// Validate the Authority header.assert_valid_authority(&ctx.accounts.authority.key())?; /// Split the Remaining Data into Tree Bytes and Canopy Bytes let id = ctx.accounts.merkle_tree.key(); let merkle_tree_size = merkle_tree_get_size(&header)?; let (tree_bytes, canopy_bytes) = rest.split_at_mut(merkle_tree_size); /// Call the append Method on the Merkle Tree let change_log_event = merkle_tree_apply_fn_mut!(header, id, tree_bytes, append, leaf)?; /// Update the Canopy update_canopy( canopy_bytes, header.get_max_depth(), Some(&change_log_event), )?; /// Emit the Change Log Event wrap_event( &AccountCompressionEvent::ChangeLog(*change_log_event), &ctx.accounts.noop, ) } /// Context for inserting, appending, or replacing a leaf in the tree /// /// Modification instructions also require the proof to the leaf to be provided /// as 32-byte nodes via "remaining accounts". #[derive(Accounts)] pub struct Modify<'info> { #[account(mut)] /// CHECK: This account is validated in the instruction pub merkle_tree: UncheckedAccount<'info>, /// Authority that controls write-access to the tree /// Typically a program, e.g., the Bubblegum contract validates that leaves are valid NFTs. pub authority: Signer<'info>, /// Program used to emit changelogs as cpi instruction data. pub noop: Program<'info, Noop>, }

Validate the Authority

assert_valid_authority validates that:
  1. the concurrent merkle tree header is valid (the account has been initialized), and
  2. the passed authority account equals the authority registered in the header.
/// --- programs/account-compression/src/state/concurrent_merkle_tree_header.rs --- pub fn assert_valid_authority(&self, expected_authority: &Pubkey) -> Result<()> { self.assert_valid()?; match &self.header { ConcurrentMerkleTreeHeaderData::V1(header) => { require_eq!( header.authority, *expected_authority, AccountCompressionError::IncorrectAuthority, ); } } Ok(()) } pub fn assert_valid(&self) -> Result<()> { require_eq!( self.account_type, CompressionAccountType::ConcurrentMerkleTree, AccountCompressionError::IncorrectAccountType, ); Ok(()) }

Concurrent Tree Append

The append method adds a new leaf at the rightmost position of the tree efficiently, without requiring a proof. It is optimized for concurrency, so multiple operations can be performed on the tree simultaneously without conflicts.
 
Parameters:
  • &mut self: Mutable reference to the ConcurrentMerkleTree instance.
  • node: Node: The new leaf node to append.
 
Steps:
  1. Bounds Checking
    • Purpose: Ensures that the tree's parameters are within acceptable limits.
    • Constraints:
      • MAX_DEPTH < 31: the maximum depth must be less than 31 due to bit-manipulation limitations.
      • MAX_BUFFER_SIZE is a power of 2 (or zero): ensures efficient buffer management.
  2. Tree Initialization Check
    • Verifies that the tree has been properly initialized before performing operations.
  3. Empty Node Check
    • Prevents appending an empty node, which would be invalid in this context.
  4. Tree Full Check
    • Ensures that the tree has not reached its maximum capacity based on its depth.
  5. Initial Tree Append Handling
    • If the tree is empty, initialize it with the first node.
  6. Prepare Variables for the Append Operation
    • leaf: holds the new leaf node.
    • intersection: the level at which the new leaf's path diverges from the existing tree, computed as rightmost_proof.index.trailing_zeros() (see the sketch after the code block below).
    • change_list: an array recording the changed node at each level for the ChangeLog.
    • intersection_node: starts as the leaf of the rightmost proof; used to compute parent nodes along the rightmost path.
    • empty_node_cache: a cache of empty-subtree roots for efficiency.
  7. Iterate Over Tree Levels to Compute the New Root and Update Proofs
    • Purpose: updates node, intersection_node, and self.rightmost_proof to reflect the addition of the new leaf.
    • Case 1: i < intersection
      • The sibling at this level is the root of an empty subtree. It uses empty_node_cached to calculate the sibling's hash, then uses hash_to_parent to calculate the parent hash along the new leaf's path.
      • It also uses hash_to_parent to advance intersection_node along the rightmost leaf's path, which is needed to compute the joint parent hash once the two paths intersect.
    • Case 2: i == intersection
      • At the intersection level, it combines the hash on the new leaf's path with intersection_node to calculate the joint parent hash.
    • Case 3: i > intersection
      • It uses the rightmost proof's nodes as siblings to calculate parent hashes up to the root.
  8. Update Tree Internal Counters
    • active_index: the index of the newest change log entry, which wraps around the buffer size.
    • buffer_size: the number of buffered change log entries.
    • sequence_number: a monotonically increasing counter of tree operations.
  9. Update the Change Log
  10. Update the Rightmost Proof
    • Increase the index by 1.
    • Update the leaf to the newly appended node.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// Abstract type for 32 byte leaf data pub type Node = [u8; 32]; /// Appending a non-empty Node will always succeed . pub fn append(&mut self, mut node: Node) -> Result<Node, ConcurrentMerkleTreeError> { /// Bounds Checking: check_bounds(MAX_DEPTH, MAX_BUFFER_SIZE); /// Tree Initialization Check: if !self.is_initialized() { return Err(ConcurrentMerkleTreeError::TreeNotInitialized); } /// Empty Node Check: if node == EMPTY { return Err(ConcurrentMerkleTreeError::CannotAppendEmptyNode); } /// Tree Full Check: if self.rightmost_proof.index >= 1 << MAX_DEPTH { return Err(ConcurrentMerkleTreeError::TreeFull); } /// Initial Tree Append Handling: if self.rightmost_proof.index == 0 { return self.initialize_tree_from_append(node, self.rightmost_proof.proof); } /// Prepare Variables for Append Operation: let leaf = node; let intersection = self.rightmost_proof.index.trailing_zeros() as usize; let mut change_list = [EMPTY; MAX_DEPTH]; let mut intersection_node = self.rightmost_proof.leaf; let empty_node_cache = [Node::default(); MAX_DEPTH]; /// Iterate Over Tree Levels to Compute New Root and Update Proofs: for (i, cl_item) in change_list.iter_mut().enumerate().take(MAX_DEPTH) { *cl_item = node; match i { i if i < intersection => { // Compute proof to the appended node from empty nodes let sibling = empty_node_cached::<MAX_DEPTH>(i as u32, &empty_node_cache); hash_to_parent( &mut intersection_node, &self.rightmost_proof.proof[i], ((self.rightmost_proof.index - 1) >> i) & 1 == 0, ); hash_to_parent(&mut node, &sibling, true); self.rightmost_proof.proof[i] = sibling; } i if i == intersection => { // Compute the where the new node intersects the main tree hash_to_parent(&mut node, &intersection_node, false); self.rightmost_proof.proof[intersection] = intersection_node; } _ => { // Update the change list path up to the root hash_to_parent( &mut node, &self.rightmost_proof.proof[i], ((self.rightmost_proof.index - 1) >> i) & 1 == 0, ); } } } /// Update Tree Internal Counters self.update_internal_counters(); /// Update Change Log self.change_logs[self.active_index as usize] = ChangeLog::<MAX_DEPTH>::new(node, change_list, self.rightmost_proof.index); /// Update Rightmost Proof self.rightmost_proof.index += 1; self.rightmost_proof.leaf = leaf; Ok(node) }
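As a quick illustration of how the intersection level in step 6 is derived, here is a standalone sketch (not part of the SPL code): the level at which the new leaf's path meets the existing rightmost path is simply the number of trailing zero bits of the rightmost index.

// Standalone sketch: the intersection level used by `append` is the number of
// trailing zero bits of `rightmost_proof.index` (the index of the next leaf).
fn intersection_level(rightmost_index: u32) -> u32 {
    rightmost_index.trailing_zeros()
}

fn main() {
    // index 1 (0b001) -> intersection 0: hash with the rightmost leaf immediately
    assert_eq!(intersection_level(0b001), 0);
    // index 2 (0b010) -> intersection 1: hash with an empty sibling first
    assert_eq!(intersection_level(0b010), 1);
    // index 4 (0b100) -> intersection 2: two empty-sibling levels before joining
    assert_eq!(intersection_level(0b100), 2);
    println!("intersection levels computed");
}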

Append Node in Initial Tree

If rightmost_proof.index is 0, the tree is freshly initialized (empty), and append calls initialize_tree_from_append to insert the first node.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// Appending a non-empty Node will always succeed . pub fn append(&mut self, mut node: Node) -> Result<Node, ConcurrentMerkleTreeError> { /// ... if self.rightmost_proof.index == 0 { return self.initialize_tree_from_append(node, self.rightmost_proof.proof); } /// ... }
 
Parameters:
  • leaf: the node to be appended.
  • proof: the initial proof, i.e., the proof of an empty node in an empty tree.
 
Steps:
  1. Calculate the Root of the Empty Tree
    • Use the stored proof to recompute the root.
  2. Check Whether the Old Root Is the Empty-Tree Root
    • Use empty_node to calculate the root of an empty tree and compare it with the recomputed root.
      • If they are equal, apply the proof to update the tree.
      • Otherwise, return a TreeAlreadyInitialized error.
  3. Apply the Proof to Update the Tree
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/hash.rs --- /// Only used to initialize right most path for a completely empty tree. #[inline(always)] fn initialize_tree_from_append( &mut self, leaf: Node, mut proof: [Node; MAX_DEPTH], ) -> Result<Node, ConcurrentMerkleTreeError> { let old_root = recompute(EMPTY, &proof, 0); if old_root == empty_node(MAX_DEPTH as u32) { self.try_apply_proof(old_root, EMPTY, leaf, &mut proof, 0, false) } else { Err(ConcurrentMerkleTreeError::TreeAlreadyInitialized) } }
 
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/hash.rs --- /// Recomputes root of the Merkle tree from Node & proof pub fn recompute(leaf: Node, proof: &[Node], index: u32) -> Node { let mut current_node = leaf; for (depth, sibling) in proof.iter().enumerate() { hash_to_parent(&mut current_node, sibling, index >> depth & 1 == 0); } current_node }
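To make the empty-root comparison concrete, here is a minimal, self-contained sketch that builds empty-subtree roots and recomputes the root from an all-empty proof, mirroring the check in initialize_tree_from_append. SHA-256 (via the sha2 crate) stands in for the hash actually used by the library; this is an assumption for illustration only.

// Minimal sketch (not the SPL implementation): the root of an empty tree of depth d
// is obtained by hashing a node with itself d times, and recomputing from an EMPTY
// leaf with an all-empty proof must yield exactly that root.
use sha2::{Digest, Sha256};

type Node = [u8; 32];
const EMPTY: Node = [0u8; 32];

// Toy stand-in for the library's hash (assumption: SHA-256, not the real hash).
fn hash_pair(left: &Node, right: &Node) -> Node {
    let mut hasher = Sha256::new();
    hasher.update(left);
    hasher.update(right);
    hasher.finalize().into()
}

// Root of a fully empty subtree of the given depth.
fn empty_node(depth: usize) -> Node {
    let mut node = EMPTY;
    for _ in 0..depth {
        node = hash_pair(&node, &node);
    }
    node
}

// Same walk as `recompute`: climb from the leaf using the proof siblings,
// choosing left/right from the leaf index bits.
fn recompute(leaf: Node, proof: &[Node], index: u32) -> Node {
    let mut node = leaf;
    for (depth, sibling) in proof.iter().enumerate() {
        node = if index >> depth & 1 == 0 {
            hash_pair(&node, sibling)
        } else {
            hash_pair(sibling, &node)
        };
    }
    node
}

fn main() {
    const MAX_DEPTH: usize = 3;
    // Proof of the empty leaf at index 0: the empty-subtree root at each level.
    let proof: Vec<Node> = (0..MAX_DEPTH).map(empty_node).collect();
    // This is exactly the comparison made in `initialize_tree_from_append`.
    assert_eq!(recompute(EMPTY, &proof, 0), empty_node(MAX_DEPTH));
    println!("empty-tree root check passed");
}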
 
try_apply_proof is responsible for applying an update to a leaf node in the tree, given a proof that the leaf exists at a certain index and under a certain root.
Purpose:
  • Attempts to apply a proof to update a leaf node in the tree.
  • Ensures that the proof is valid for the current tree state.
  • Updates internal counters and buffers after successfully applying the proof.
 
Parameters:
  • &mut self: Mutable reference to the ConcurrentMerkleTree instance.
  • current_root: The root node for which the proof was originally generated.
  • leaf: The current value of the leaf node at leaf_index.
  • new_leaf: The new value to replace the leaf node with.
  • proof: The proof path from the leaf node to the root.
  • leaf_index: The index of the leaf node in the tree.
  • allow_inferred_proof: Flag indicating whether to attempt to infer the proof if the root is not found in the change log buffer.
 
Steps:
  1. Logging Internal State
    • The function starts by logging important internal state variables for debugging purposes.
  2. Validating the Proof
    • It calls check_valid_leaf to verify whether the provided proof is still valid for the current tree state, possibly after fast-forwarding through concurrent changes.
  3. Updating Internal Counters
    • Upon successful validation, it updates internal counters to reflect the new state: it increments active_index, buffer_size, and sequence_number.
  4. Updating Buffers and Returning the New Root
    • Finally, it updates the internal buffers with the new leaf and proof and returns the new root.
/// --- /solana/solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// Note: Enabling `allow_inferred_proof` will fast forward the given proof /// from the beginning of the buffer in the case that the supplied root is /// not in the buffer. #[inline(always)] fn try_apply_proof( &mut self, current_root: Node, leaf: Node, new_leaf: Node, proof: &mut [Node; MAX_DEPTH], leaf_index: u32, allow_inferred_proof: bool, ) -> Result<Node, ConcurrentMerkleTreeError> { /// Logging Internal State: solana_logging!("Active Index: {}", self.active_index); solana_logging!("Rightmost Index: {}", self.rightmost_proof.index); solana_logging!("Buffer Size: {}", self.buffer_size); solana_logging!("Leaf Index: {}", leaf_index); /// Validating the Proof: let valid_root = self.check_valid_leaf(current_root, leaf, proof, leaf_index, allow_inferred_proof)?; if !valid_root { return Err(ConcurrentMerkleTreeError::InvalidProof); } /// Updating Internal Counters: self.update_internal_counters(); /// Updating Buffers and Returning New Root: Ok(self.update_buffers_from_proof(new_leaf, proof, leaf_index)) }
 
check_valid_leaf
Purpose:
  • Validates that the provided proof and leaf are consistent with the current tree state.
  • Attempts to fast-forward the proof through change logs if necessary. This operation updates the proof according to previous tree updates.
  • Ensures that the leaf has not been modified since the proof was generated.
 
Steps:
  1. Finding the Root in the Change Log Buffer
    • Tries to find current_root in the change log buffer.
  2. Fast-Forwarding the Proof
    • Calls fast_forward_proof to update the proof and leaf by applying changes from the change logs.
      • updatable_leaf_node is the leaf node that may be updated.
      • proof_leaf_unchanged indicates whether the leaf was modified during fast-forwarding.
  3. Checking for Leaf Modification
    • If the leaf was modified (i.e., proof_leaf_unchanged is false), it returns a LeafContentsModified error. This means an earlier operation has already changed the leaf we are trying to update: the caller's proof was generated against a stale leaf value, so applying the update here would silently overwrite that concurrent change.
  4. Validating the Updated Proof
    • Uses check_valid_proof to validate the updated proof against the current tree root.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- #[inline(always)] fn check_valid_leaf( &self, current_root: Node, leaf: Node, proof: &mut [Node; MAX_DEPTH], leaf_index: u32, allow_inferred_proof: bool, ) -> Result<bool, ConcurrentMerkleTreeError> { /// Finding the Root in the Change Log Buffer: let mask: usize = MAX_BUFFER_SIZE - 1; let (changelog_index, use_full_buffer) = match self.find_root_in_changelog(current_root) { Some(matching_changelog_index) => (matching_changelog_index, false), None => { if allow_inferred_proof { solana_logging!("Failed to find root in change log -> replaying full buffer"); ( self.active_index.wrapping_sub(self.buffer_size - 1) & mask as u64, true, ) } else { return Err(ConcurrentMerkleTreeError::RootNotFound); } } }; /// Fast-Forwarding the Proof: let mut updatable_leaf_node = leaf; let proof_leaf_unchanged = self.fast_forward_proof( &mut updatable_leaf_node, proof, leaf_index, changelog_index, use_full_buffer, ); /// Checking for Leaf Modification: if !proof_leaf_unchanged { return Err(ConcurrentMerkleTreeError::LeafContentsModified); } /// Validating the Updated Proof: Ok(self.check_valid_proof(updatable_leaf_node, proof, leaf_index)) }
 
find_root_in_changelog
find_root_in_changelog iterates over the buffered change logs to find the entry whose root matches current_root.
For example:
  • If MAX_BUFFER_SIZE is 4 (binary 100), the mask is 3 (binary 11).
  • buffer_size is 4, meaning there are 4 buffered tree roots.
  • The current active_index is 0 (of type u64).
It first checks change_logs[0], then walks backwards through change_logs[3], change_logs[2], and change_logs[1], thanks to the wrapping subtraction and mask.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- #[inline(always)] fn find_root_in_changelog(&self, current_root: Node) -> Option<u64> { let mask: usize = MAX_BUFFER_SIZE - 1; for i in 0..self.buffer_size { let j = self.active_index.wrapping_sub(i) & mask as u64; if self.change_logs[j as usize].root == current_root { return Some(j); } } None }
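A standalone sketch of the index arithmetic with the example values above (illustrative only, not the SPL code): with MAX_BUFFER_SIZE = 4, buffer_size = 4, and active_index = 0, the wrapping subtraction visits the change logs in the order 0, 3, 2, 1.

// Standalone sketch of find_root_in_changelog's index walk.
// With a power-of-two buffer, `wrapping_sub` plus the mask walks the ring
// buffer backwards starting from the most recent entry.
fn main() {
    const MAX_BUFFER_SIZE: usize = 4;
    let mask = (MAX_BUFFER_SIZE - 1) as u64; // 0b11
    let active_index: u64 = 0;
    let buffer_size: u64 = 4;

    let visit_order: Vec<u64> = (0..buffer_size)
        .map(|i| active_index.wrapping_sub(i) & mask)
        .collect();

    // Newest entry first, then backwards through the ring buffer.
    assert_eq!(visit_order, vec![0, 3, 2, 1]);
    println!("visit order: {:?}", visit_order);
}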
 
fast_forward_proof
Purpose:
  • Updates the proof and leaf node by applying the changes recorded in the change logs.
  • Determines if the leaf at leaf_index has been modified.
 
Steps:
  1. Initializing the Updated Leaf
    • updated_leaf will be modified as changes are applied.
  2. Modifying the Proof by Iterating Through the Change Logs
    • Loops through the change logs, updating the proof and updated_leaf. When the supplied root was found in the buffer (use_full_buffer is false), the loop replays change logs from the matching entry up to the newest one (active_index); otherwise it replays one full pass of the buffer.
  3. Checking for Leaf Modification
    • Determines whether the leaf has been modified during fast-forwarding.
    • Returns true if the leaf is unchanged, false otherwise.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// Modifies the `proof` for leaf at `leaf_index` /// in place by fast-forwarding the given `proof` through the /// `changelog`s, starting at index `changelog_buffer_index` /// Returns false if the leaf was updated in the change log #[inline(always)] fn fast_forward_proof( &self, leaf: &mut Node, proof: &mut [Node; MAX_DEPTH], leaf_index: u32, mut changelog_buffer_index: u64, use_full_buffer: bool, ) -> bool { solana_logging!( "Fast-forwarding proof, starting index {}", changelog_buffer_index ); let mask: usize = MAX_BUFFER_SIZE - 1; /// Initializing Updated Leaf: let mut updated_leaf = *leaf; log_compute!(); /// Modifies Proof by Iterating through the Change Logs loop { // If use_full_buffer is false, this loop will terminate if the initial value of // changelog_buffer_index is the active index if !use_full_buffer && changelog_buffer_index == self.active_index { break; } changelog_buffer_index = (changelog_buffer_index + 1) & mask as u64; self.change_logs[changelog_buffer_index as usize].update_proof_or_leaf( leaf_index, proof, &mut updated_leaf, ); // If use_full_buffer is true, this loop will do 1 full pass of the change logs if use_full_buffer && changelog_buffer_index == self.active_index { break; } } log_compute!(); /// Checking for Leaf Modification: let proof_leaf_unchanged = updated_leaf == *leaf; *leaf = updated_leaf; proof_leaf_unchanged }
 
Proof and Leaf Update
update_proof_or_leaf updates the proof according to a change log entry.
It identifies the common part of the two leaves' paths from the root and replaces the proof node at the level where the paths diverge.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/changelog.rs --- /// Fast forwards the given proof and corresponding leaf by applying an /// update from the current change log pub fn update_proof_or_leaf( &self, leaf_index: u32, proof: &mut [Node; MAX_DEPTH], leaf: &mut Node, ) { let padding: usize = 32 - MAX_DEPTH; if leaf_index != self.index { // This bit math is used to identify which node in the proof // we need to swap for a corresponding node in a saved change log let common_path_len = ((leaf_index ^ self.index) << padding).leading_zeros() as usize; let critbit_index = (MAX_DEPTH - 1) - common_path_len; proof[critbit_index] = self.path[critbit_index]; } else { *leaf = self.get_leaf(); } }
 
Let’s consider a tree with depth 3.
Suppose a user wants to update Node#001 and calculates the proof based on Root#0. Before the transaction is executed, another user sends a transaction that updates Node#011, so the root changes from Root#0 to Root#1. That update also changes Node#01, which is one of the nodes in the first user's proof.
The proof the first user passes is therefore inconsistent with the newest root, and the proof needs to be fast-forwarded.
 
In this example:
  • padding = 32 - MAX_DEPTH = 29
  • common_path_len = 1
  • critbit_index = (MAX_DEPTH - 1) - common_path_len = 1
  • So we update proof[1], which is Node#01.
 
common_path_len is the number of levels, counted from the root, that the two leaves' paths share. Each leaf has an index whose binary representation encodes its path through the tree, where 0 means the left child and 1 means the right child at each level.
Because self.index and leaf_index are u32 values while the tree only uses MAX_DEPTH bits, the XOR result is shifted left by the padding (32 - MAX_DEPTH) before calling leading_zeros.
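The bit math above can be checked with a small standalone sketch, using the depth-3 example values (a sketch mirroring the logic of update_proof_or_leaf, not the SPL code itself):

// Standalone sketch of the critbit calculation in update_proof_or_leaf:
// our leaf is at index 0b001, the change log entry touched the leaf at 0b011.
fn main() {
    const MAX_DEPTH: usize = 3;
    let leaf_index: u32 = 0b001;      // leaf whose proof we are fast-forwarding
    let changelog_index: u32 = 0b011; // leaf modified by the buffered change

    let padding = 32 - MAX_DEPTH; // 29: only the low MAX_DEPTH bits encode the path
    let common_path_len =
        ((leaf_index ^ changelog_index) << padding).leading_zeros() as usize;
    let critbit_index = (MAX_DEPTH - 1) - common_path_len;

    assert_eq!(common_path_len, 1); // the two paths share one level below the root
    assert_eq!(critbit_index, 1);   // so proof[1] (Node#01) must be replaced
    println!("replace proof[{}]", critbit_index);
}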
(figure: the depth-3 tree before and after the concurrent update, showing Root#0, Root#1, and the replaced proof node Node#01)
 
check_valid_proof
check_valid_proof checks that the leaf and proof are consistent with the current tree root.
 
Steps:
  1. Check that the Tree Has Been Initialized
  2. Check that the Leaf Index Is Within Range
  3. Compute the Root and Compare It with the Current Root
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// Checks that the proof provided is valid for the current root. pub fn check_valid_proof( &self, leaf: Node, proof: &[Node; MAX_DEPTH], leaf_index: u32, ) -> bool { /// Check the Tree has been Initialized if !self.is_initialized() { solana_logging!("Tree is not initialized, returning false"); return false; } /// Check the Leaf Index doesn’t exceed Range if check_leaf_index(leaf_index, MAX_DEPTH).is_err() { solana_logging!("Leaf index out of bounds for max_depth"); return false; } /// Compute and Compare Root recompute(leaf, proof, leaf_index) == self.get_root() } fn check_leaf_index(leaf_index: u32, max_depth: usize) -> Result<(), ConcurrentMerkleTreeError> { if leaf_index >= (1 << max_depth) { return Err(ConcurrentMerkleTreeError::LeafIndexOutOfBounds); } Ok(()) } /// --- /solana-program-library/libraries/concurrent-merkle-tree/src/hash.rs --- pub fn recompute(leaf: Node, proof: &[Node], index: u32) -> Node { let mut current_node = leaf; for (depth, sibling) in proof.iter().enumerate() { hash_to_parent(&mut current_node, sibling, index >> depth & 1 == 0); } current_node }
 
update_internal_counters
update_internal_counters updates active_index, buffer_size, and sequence_number.
Note that active_index wraps around the buffer, and buffer_size is capped at MAX_BUFFER_SIZE.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// Implements circular addition for changelog buffer index fn update_internal_counters(&mut self) { let mask: usize = MAX_BUFFER_SIZE - 1; self.active_index += 1; self.active_index &= mask as u64; if self.buffer_size < MAX_BUFFER_SIZE as u64 { self.buffer_size += 1; } self.sequence_number = self.sequence_number.saturating_add(1); }
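A quick sketch of how the counters evolve (illustrative starting values only, assuming MAX_BUFFER_SIZE = 4): active_index cycles 1, 2, 3, 0, 1, … while buffer_size grows to 4 and then stays there, and sequence_number keeps increasing.

// Standalone sketch of update_internal_counters' circular arithmetic.
fn main() {
    const MAX_BUFFER_SIZE: u64 = 4;
    let mask = MAX_BUFFER_SIZE - 1;
    // Illustrative starting state: one buffered change log entry.
    let (mut active_index, mut buffer_size, mut sequence_number) = (0u64, 1u64, 0u64);

    let mut trace = vec![];
    for _ in 0..6 {
        // Same logic as update_internal_counters.
        active_index = (active_index + 1) & mask;
        if buffer_size < MAX_BUFFER_SIZE {
            buffer_size += 1;
        }
        sequence_number = sequence_number.saturating_add(1);
        trace.push((active_index, buffer_size));
    }

    // active_index wraps around the buffer; buffer_size saturates at 4.
    assert_eq!(trace, vec![(1, 2), (2, 3), (3, 4), (0, 4), (1, 4), (2, 4)]);
    println!("sequence_number = {}", sequence_number); // 6, strictly increasing
}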
 
update_buffers_from_proof updates change_log and rightmost_proof.
 
Steps:
  1. Calculate the New Root and Update the Change Log
    • Call replace_and_recompute_path to calculate the new root and update the change log.
  2. Update the Rightmost Proof
    • In the initial-tree case, index is 0, which equals rightmost_proof.index (a freshly initialized tree has rightmost_proof.index = 0), so the append branch is taken:
      • update the proof to the appended node's proof,
      • update the leaf to the inserted node,
      • update the index to 1 (the index of the next node to be appended).
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// Creates a new root from a proof that is valid for the root at /// `self.active_index` fn update_buffers_from_proof(&mut self, start: Node, proof: &[Node], index: u32) -> Node { let change_log = &mut self.change_logs[self.active_index as usize]; // Also updates change_log's current root let root = change_log.replace_and_recompute_path(index, start, proof); // Update rightmost path if possible if self.rightmost_proof.index < (1 << MAX_DEPTH) { if index < self.rightmost_proof.index { change_log.update_proof_or_leaf( self.rightmost_proof.index - 1, &mut self.rightmost_proof.proof, &mut self.rightmost_proof.leaf, ); } else { assert!(index == self.rightmost_proof.index); solana_logging!("Appending rightmost leaf"); self.rightmost_proof.proof.copy_from_slice(proof); self.rightmost_proof.index = index + 1; self.rightmost_proof.leaf = change_log.get_leaf(); } } root }

Append Node in Non-initial Tree

Node Adding Diagram Illustration
The rightmost_proof is the proof of the rightmost node in the tree:
  • leaf is the rightmost node,
  • index is the index of the next node to be appended.
A tree that has just been initialized is a special case: its proof is the proof of an empty node in an empty tree, its leaf is the empty node, and its index is 0. This situation is handled separately.
 
With the rightmost proof, appending only requires passing in the node to be added; the new root is then calculated from that node and the rightmost proof stored on-chain (a standalone sketch of this computation follows the Append First Node example below).
 
Initialize Tree
The rightmost_proof in the picture is the value after initialization execution.
(figure: the empty tree and its rightmost_proof immediately after initialization)
 
Append First Node
The rightmost_proof in the picture is the value before the node is added.
The index of the node being appended (rightmost_proof.index) is 001 in binary, so intersection = trailing_zeros(001) = 0: the new node is first hashed with intersection_node (Node#0), and then with the stored rightmost proof nodes up to the root.
(figure: the tree and rightmost_proof before appending the node at index 001)
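Here is a minimal, self-contained sketch of that computation for a depth-3 tree. The hash (SHA-256 via the sha2 crate) and the leaf values are stand-ins chosen for illustration, not the library's actual hash or data layout; the point is that the new root is derived from the new leaf plus the stored rightmost proof alone.

// Standalone sketch of the "Append First Node" case for a depth-3 tree:
// rightmost_proof.index = 0b001, rightmost leaf = Node#0, proof = empty-subtree roots.
use sha2::{Digest, Sha256};

type Node = [u8; 32];
const EMPTY: Node = [0u8; 32];
const MAX_DEPTH: usize = 3;

// Toy stand-in for the library's hash (assumption: SHA-256, not the real hash).
fn hash_pair(left: &Node, right: &Node) -> Node {
    let mut h = Sha256::new();
    h.update(left);
    h.update(right);
    h.finalize().into()
}

fn empty_node(depth: usize) -> Node {
    let mut node = EMPTY;
    for _ in 0..depth {
        node = hash_pair(&node, &node);
    }
    node
}

// Recompute a root from a full leaf proof, choosing left/right from the index bits.
fn recompute(leaf: Node, proof: &[Node], index: u32) -> Node {
    let mut node = leaf;
    for (depth, sibling) in proof.iter().enumerate() {
        node = if index >> depth & 1 == 0 {
            hash_pair(&node, sibling)
        } else {
            hash_pair(sibling, &node)
        };
    }
    node
}

fn main() {
    let leaf0: Node = [1u8; 32]; // Node#0, appended during initialization (made-up value)
    let leaf1: Node = [2u8; 32]; // the node we are appending now (made-up value)

    // Rightmost proof before this append (as in the figure above).
    let rightmost_index: u32 = 0b001;
    let rightmost_leaf: Node = leaf0;
    let rightmost_proof: [Node; MAX_DEPTH] = [empty_node(0), empty_node(1), empty_node(2)];

    // intersection = trailing_zeros(0b001) = 0, so the very first hash joins
    // the new leaf with the rightmost leaf (Node#0).
    assert_eq!(rightmost_index.trailing_zeros(), 0);
    let mut node = hash_pair(&rightmost_leaf, &leaf1); // Node#a = H(Node#0, Node#1)
    // Above the intersection, keep hashing with the stored rightmost proof nodes.
    for level in 1..MAX_DEPTH {
        node = hash_pair(&node, &rightmost_proof[level]);
    }
    let new_root = node;

    // Cross-check against a full recomputation from leaf1's complete proof.
    let expected = recompute(leaf1, &[leaf0, empty_node(1), empty_node(2)], 1);
    assert_eq!(new_root, expected);
    println!("new root computed from the rightmost proof alone");
}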
 
Append Second Node
The rightmost_proof in the picture is the value before the node is added.
The index of the node being appended (rightmost_proof.index) is 010 in binary, so intersection = trailing_zeros(010) = 1: the new node is first hashed with an empty node, then with intersection_node (Node#a, which has meanwhile been advanced one level along the rightmost path), and finally with the stored rightmost proof nodes up to the root.
(figure: the tree and rightmost_proof before appending the node at index 010)

Insert or Append

insert_or_append takes a proof, and will attempt to write the given leaf to the specified index in the tree. If the insert operation fails, the leaf will be appended to the tree.
/// --- programs/account-compression/src/lib.rs --- /// This instruction takes a proof, and will attempt to write the given leaf /// to the specified index in the tree. If the insert operation fails, the leaf will be `append`-ed /// to the tree. /// It is up to the indexer to parse the final location of the leaf from the emitted changelog. pub fn insert_or_append( ctx: Context<Modify>, root: [u8; 32], leaf: [u8; 32], index: u32, ) -> Result<()> { require_eq!( *ctx.accounts.merkle_tree.owner, crate::id(), AccountCompressionError::IncorrectAccountOwner ); let mut merkle_tree_bytes = ctx.accounts.merkle_tree.try_borrow_mut_data()?; let (header_bytes, rest) = merkle_tree_bytes.split_at_mut(CONCURRENT_MERKLE_TREE_HEADER_SIZE_V1); let header = ConcurrentMerkleTreeHeader::try_from_slice(header_bytes)?; header.assert_valid_authority(&ctx.accounts.authority.key())?; header.assert_valid_leaf_index(index)?; let merkle_tree_size = merkle_tree_get_size(&header)?; let (tree_bytes, canopy_bytes) = rest.split_at_mut(merkle_tree_size); let mut proof = vec![]; for node in ctx.remaining_accounts.iter() { proof.push(node.key().to_bytes()); } fill_in_proof_from_canopy(canopy_bytes, header.get_max_depth(), index, &mut proof)?; // A call is made to ConcurrentMerkleTree::fill_empty_or_append let id = ctx.accounts.merkle_tree.key(); let change_log_event = merkle_tree_apply_fn_mut!( header, id, tree_bytes, fill_empty_or_append, root, leaf, &proof, index, )?; update_canopy( canopy_bytes, header.get_max_depth(), Some(&change_log_event), )?; wrap_event( &AccountCompressionEvent::ChangeLog(*change_log_event), &ctx.accounts.noop, ) }
 
fill_empty_or_append first tries to set the leaf at the given index, treating it as empty; if that fails with a LeafContentsModified error (i.e., the leaf is not empty), it appends the node to the tree instead.
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// Convenience function for `set_leaf` /// /// This method will `set_leaf` if the leaf at `index` is an empty node, /// otherwise it will `append` the new leaf. pub fn fill_empty_or_append( &mut self, current_root: Node, leaf: Node, proof_vec: &[Node], index: u32, ) -> Result<Node, ConcurrentMerkleTreeError> { check_bounds(MAX_DEPTH, MAX_BUFFER_SIZE); check_leaf_index(index, MAX_DEPTH)?; if !self.is_initialized() { return Err(ConcurrentMerkleTreeError::TreeNotInitialized); } let mut proof: [Node; MAX_DEPTH] = [Node::default(); MAX_DEPTH]; fill_in_proof::<MAX_DEPTH>(proof_vec, &mut proof); log_compute!(); match self.try_apply_proof(current_root, EMPTY, leaf, &mut proof, index, false) { Ok(new_root) => Ok(new_root), Err(error) => match error { ConcurrentMerkleTreeError::LeafContentsModified => self.append(leaf), _ => Err(error), }, } }

Close Empty Tree

close_empty_tree closes an empty CMT: it proves the tree contains no non-empty leaves, moves the account's lamports to the recipient, and zeroes out the account data.
/// --- programs/account-compression/src/lib.rs --- pub fn close_empty_tree(ctx: Context<CloseTree>) -> Result<()> { require_eq!( *ctx.accounts.merkle_tree.owner, crate::id(), AccountCompressionError::IncorrectAccountOwner ); let mut merkle_tree_bytes = ctx.accounts.merkle_tree.try_borrow_mut_data()?; let (header_bytes, rest) = merkle_tree_bytes.split_at_mut(CONCURRENT_MERKLE_TREE_HEADER_SIZE_V1); let header = ConcurrentMerkleTreeHeader::try_from_slice(header_bytes)?; header.assert_valid_authority(&ctx.accounts.authority.key())?; let merkle_tree_size = merkle_tree_get_size(&header)?; let (tree_bytes, canopy_bytes) = rest.split_at_mut(merkle_tree_size); let id = ctx.accounts.merkle_tree.key(); merkle_tree_apply_fn_mut!(header, id, tree_bytes, prove_tree_is_empty,)?; // Close merkle tree account // 1. Move lamports let dest_starting_lamports = ctx.accounts.recipient.lamports(); **ctx.accounts.recipient.lamports.borrow_mut() = dest_starting_lamports .checked_add(ctx.accounts.merkle_tree.lamports()) .unwrap(); **ctx.accounts.merkle_tree.lamports.borrow_mut() = 0; // 2. Set all CMT account bytes to 0 header_bytes.fill(0); tree_bytes.fill(0); canopy_bytes.fill(0); Ok(()) }
 
/// --- /solana-program-library/libraries/concurrent-merkle-tree/src/concurrent_merkle_tree.rs --- /// Errors if one of the leaves of the current merkle tree is non-EMPTY pub fn prove_tree_is_empty(&self) -> Result<(), ConcurrentMerkleTreeError> { if !self.is_initialized() { return Err(ConcurrentMerkleTreeError::TreeNotInitialized); } let empty_node_cache = [EMPTY; MAX_DEPTH]; if self.get_root() != empty_node_cached::<MAX_DEPTH>(MAX_DEPTH as u32, &empty_node_cache) { return Err(ConcurrentMerkleTreeError::TreeNonEmpty); } Ok(()) }

Reference