├── README.md ├── kip-0001.md ├── kip-0002.md ├── kip-0003.md ├── kip-0004.md ├── kip-0005.md ├── kip-0005 ├── .gitignore ├── reference.py └── test-cases.py ├── kip-0006.md ├── kip-0009.md ├── kip-0010.md ├── kip-0013.md ├── kip-0014.md └── kip-0015.md /README.md: -------------------------------------------------------------------------------- 1 | # Kaspa Improvement Proposals 2 | -------------------------------------------------------------------------------- /kip-0001.md: -------------------------------------------------------------------------------- 1 | ``` 2 | KIP: 1 3 | Layer: Core 4 | Title: Rewriting the Kaspa Full-Node in the Rust Programming Language 5 | Author: Michael Sutton 6 | Ori Newman 7 | Status: under development 8 | ``` 9 | 10 | The following KIP was posted on Kaspa's discord server at 07/06/2022 ([message link](https://discord.com/channels/599153230659846165/844142778232864809/994251164524748820)), 11 | is under active development ([rusty-kaspa](https://github.com/kaspanet/rusty-kaspa)), and is brought here for completeness. 12 | 13 | ### Motivations 14 | * Refactored and extensible codebase. The current codebase has evolved through 15 | years of R&D and uncertainty and has a decent amount of technical debt. Some 16 | components are fragile and are hard to maintain and extend. A reform of the 17 | codebase is crucial for making it accessible to new developers and for making it 18 | possible to implement new major features (e.g., smart contract support; consensus 19 | ordering algorithm upgrade) 20 | * Efficiency and performance. In order to reach maximal efficiency and higher block 21 | and transaction rates, we suggest that the system needs to be rewritten in a 22 | performance-oriented programming language and with a performance oriented 23 | mindset. Using Rust for the rewrite will open many opportunities in this aspect, while 24 | still providing many high-level constructs which are essential for realizing a complex 25 | system like Kaspa. 26 | 27 | ### Goals 28 | * Implementing the Kaspa full-node in Rust 29 | * Reaching higher efficiency and improved performance with current net params 30 | * Benchmarking various network params through devnets and testnets, analyzing the 31 | trade-offs, and settling for some BPS, TPS configuration for the long-run. 32 | * Simplified and modularized codebase 33 | * Incorporation of pending features 34 | * Documentation (including flows and sub-protocols) 35 | * Comprehensive benchmarking suite 36 | 37 | ### Milestones 38 | 1. A node partially implemented in Rust. Namely, all core logic and core algorithms from 39 | consensus level and below should be implemented correctly in Rust. There are two 40 | possible ways to test such a partial system. The exact method is to be determined by 41 | relevant time and effort estimations. The two options are: 42 | 1. Hybrid go-rust full-node which can connect to current mainnet and testnet and 43 | function properly. External components including the P2P and RPC layers will 44 | remain in Go. Both system parts will be attached through a cross-language 45 | interop API. 46 | 2. A test level consensus API in Rust which can be validated extensively 47 | through existing and new integration tests. 48 | 49 | 2. The above partial node with specific performance targets. There are two types of 50 | possible performance gain: 51 | 1. Single-core performance improvement: we expect a natural gain from the 52 | usage of Rust alone and the lack of a GC. 
Additionally, DB optimizations such 53 | as binary serialization and Block Header compression can affect the runtime 54 | as well (initial target: 5x) 55 | 2. Multi-core scaling: Implementation of parallelism within consensus. This 56 | includes organizing consensus block and transaction processing in a way that 57 | allows parallelism of independent tasks (initial target: strong scaling; might 58 | require high BPS for being meaningful) 59 | 60 | 3. New features on consensus level: 61 | 1. Header pruning. The outcome of this should be a node running for long 62 | periods with nearly fixed DB size (this is currently achieved by resyncing). 63 | 64 | 4. A full-node implemented completely in Rust. This includes P2P, RPC, IBD, Mempool, 65 | Mining manager and all remaining components. RPC should be redesigned to allow 66 | a complete API change (if so desired), though backward compatibility might be a 67 | requirement. 68 | 69 | 5. New features on node/network level: 70 | 1. Archival nodes P2P 71 | 2. Header compression on P2P level 72 | 73 | 6. Testnet performance targets: 74 | 1. 1000 TPS (using 1-5 BPS) 75 | 2. 10 BPS 76 | 3. 32 BPS 77 | 4. 100 BPS (or max possible, since there is a trade-off of increased header size 78 | when more blocks are mined in parallel) 79 | 80 | 7. (Mainnet BPS and TPS targets are subject to many system-wide aspects and 81 | tradeoffs. The goal of this rewrite is not to end with mainnet running with 100 BPS, 82 | but rather to allow exploring this parameter space and making the right decisions) 83 | -------------------------------------------------------------------------------- /kip-0002.md: -------------------------------------------------------------------------------- 1 | ``` 2 | KIP: 2 3 | Layer: Consensus (hard fork), API/RPC 4 | Title: Upgrade consensus to follow the DAGKNIGHT protocol 5 | Author: Yonatan Sompolinsky 6 | Michael Sutton 7 | Status: proposed 8 | ``` 9 | 10 | # Motivation 11 | [DAGKNIGHT](https://eprint.iacr.org/2022/1494.pdf) (DK) is a new consensus protocol, written by the authors of this KIP, that achieves responsiveness whilst being 50%-byzantine tolerant. It is therefore faster and more secure than GHOSTDAG (GD), which governes the current Kaspa network. In DK there’s no a priori hardcoded parameter k, and consequently it can adapt to the “real” k in the network. Concretely, in DK, clients or their wallets should incorporate k into their local confirmation policy of transactions (similarly to some clients requiring 6 confirmations in Bitcoin, and some 30 confirmations). 12 | 13 | # Goals 14 | * Complete the R&D work necessary to implement DK for Kaspa. 15 | * **Implement DK on Kaspa as a consensus upgrade**. 16 | * Add support and API for wallets' transaction acceptance policy, to correspond to DK's confirmation speed. 17 | 18 | # Deliverables 19 | * Applied research: 20 | - Adapt the consensus algorithm to enforce a global maximum bound on network latency (can be taken with a huge safety margin; does not affect confirmation times), which is necessary for difficulty and minting regulation, pruning, and more. 21 | - Devise efficient algorithms to implement the DK procedures — the current pseudocode is highly inefficient. The implementation will partially utilize the existing optimized GHOSTDAG codebase, as the latter is a subprocedure of DK. 22 | - Research the optimal bps in terms of confirmation times, and provide a recommendation. (optional) 23 | * Implementation: 24 | - Implement DK on the Kaspa rust codebase as a consensus upgrade. 
25 | - Design a transaction confirmation policy API and implement the supporting functionality in the node. 26 | - Documentation of consensus changes and API additions. 27 | 28 | # Backwards compatibility 29 | * Breaks consensus rules, requires hardfork 30 | * Adds (and potentially breaks) RPC API 31 | -------------------------------------------------------------------------------- /kip-0003.md: -------------------------------------------------------------------------------- 1 | ``` 2 |   KIP: 3 3 |   Layer: Consensus (hard fork), DAA 4 |   Title: Block sampling for efficient DAA with high BPS 5 |   Author: Shai Wyborski 6 | Michael Sutton 7 |   Status: rejected (see below) 8 | ``` 9 | 10 | # Motivation 11 | The difficulty adjustment algorithm requires maintaining, for each block, a list of the N blocks in its past with the highest accumulated weight. When validating a block this list needs to be traversed to compute the various quantities required for computing the difficulty target. Since the length of the difficulty window corresponds to real-time considerations, the size of desired difficulty window is measured in seconds, not blocks. This means that if we increase the block rate by a factor of R, we increase the complexity of the difficulty adjustment by R^2 (the first factor is due to the fact that validating each block requires R times more steps, and the second factor is since blocks emerge R times faster). 12 | 13 | This is partially mitigated by the incremental nature of the difficulty window. Like the blue set, each block inherits its selected parent's difficulty window and then fixes it according to the blocks in its anticone. This converts some of the CPU overhead to RAM overhead, which would also increase by a factor of R. 14 | 15 | To reduce the overhead, the current Golang implementation uses window sizes that are not optimal (263 blocks for timestamp flexibility, 2641 blocks for DAA window). As a result, miners whose clock drifts into the future more than two minutes than the majority of the network have their block delayed decreasing the probability they will be blue. A secondary motivation for this proposal is to increase the timestamp flexibility (which requires increasing the DAA window accordingly) efficiently. 16 | 17 | # Block Sampling 18 | 19 | We propose, instead of using all of the blocks in the past, to sample a subset of the blocks to be used for difficulty adjustment. The sampling method should be: 20 | 21 | * Deterministic: it could be calculated from the DAG and would turn out the same for all nodes 22 | * Uniformly distributed: all blocks have a similar chance to be sampled 23 | * Incremental: it should not depend on the point of view of a particular block, so it could be inherited by future blocks 24 | * Secure: a miner should not be able to cheaply affect the probability of their block to be sampled, the cost should be comparable to discarding a valid block 25 | 26 | We propose the following sampling method: to sample one in every C blocks, blocks satisfying that the blake2b hash of the kHH hash of their header has log(C) trailing 0s. The reason we don't use the kHH of the header directly is for a cleaner design and to prevent the (unlikely, but possible) scenario that as the difficulty increases we run out of bits. We do not use the blake2b hash directly on the header to prevent an adversarial miner from filtering blake2b nonces before computing kHH nonces (an attack with marginal cost given that currently blake2b is extremely faster than kHH). 
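As a concrete illustration of the rule above, the following Python sketch checks the one-in-C sampling condition for a power-of-two C. The helper name ``is_sampled`` is hypothetical, the kHeavyHash digest is taken as an input rather than computed here, and reading the "trailing" bits from a little-endian interpretation of the digest is a convention choice this sketch assumes rather than one the proposal fixes.

```
from hashlib import blake2b

def is_sampled(khh_digest: bytes, c: int) -> bool:
    """Sketch of the proposed one-in-C sampling predicate (C a power of two).

    `khh_digest` is assumed to be the kHeavyHash of the block header; the
    predicate re-hashes it with blake2b and checks for log2(C) trailing zero
    bits (here: the low-order bits of a little-endian interpretation).
    """
    assert c > 0 and c & (c - 1) == 0, "C must be a power of two"
    rehashed = blake2b(khh_digest, digest_size=32).digest()
    zero_bits = c.bit_length() - 1          # log2(C)
    value = int.from_bytes(rehashed, "little")
    return value & ((1 << zero_bits) - 1) == 0
```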
27 | 28 | Since the block sampling relies on the kHH of the header, it makes more sense to compute it while verifying the difficulty of the block, and store two booleans, one for DAA window and the other for timespamp flexibility. 29 | 30 | (note that this method only allows choosing C to be a power of 2, there are methods to refine this. For example, by limiting the first three bits to be 0, and the last two bits to contain *at least one* 0, we get a probability of (1/8)\*(3/4) = 3/32 for each block to be sampled. I see no reason to treat this problem generally, whatever sampling rate we choose, we could tailor a sampling rule ad-hoc that is sufficiently close) 31 | 32 | Update: [Georges Künzli](https://github.com/tiram88) proposed the following condition for sampling one in every C blocks: consider a u32 or u16 (LE or BE) of the Blake hash as V, define a threshold 33 | ``T = (((u32::MAX as u64 + 1 + C / 2) / C) - 1) as u32`` 34 | and sample any block having V <= T. The bigger the part of the hash being used for T the higher the precision we get. The selection test is then also very cheap. 35 | 36 | That this method is deterministic and incremental is true-by-design, the uniformity follows from assuming the standard assumption that Blake sufficiently approximates a random oracle. The security follows from the fact that the nonce is included in the header, so (assuming Blake and kHH distribute independently), the expected number of hashes required to find a nonce that makes the block both valid and selected is C times larger than the hashes required for a valid block. It could be argued that a negligible factor of (C-1)/C is required for the miner to adversarialy mine a block which is not selected (whereby harming the sampling process), however, the memorylessness of the Poisson process implies that, conditioned on the fact that the miner found a nonce for a sampled block, they still have the same expected waiting time as working on a new block. 37 | 38 | # Regulating Emissions 39 | 40 | We currently regulate emissions by not rewarding blocks outside the difficulty window. This approach is unsuitable when the difficulty window is sampled. So instead, we only reward blocks whose accumulated blue score is at least as much as the lowest accumulated blue score witnessed in the difficulty window. 41 | 42 | # Proposed Constants 43 | 44 | We propose the following: 45 | 46 | * Increase the timestamp flexibility from two minutes to 10 minutes. This requires a window time of 20 minutes. I propose a sample rate of a block per 10 seconds. The overall size of the timestamp flexibility window would then be 121 blocks. 47 | * Increase the length of the difficulty window to 500 minutes, sampling a block per 30 seconds. The overall size of the difficulty window would be 1000 blocks. 48 | 49 | In practice, the length of the windows and the sample rate would probably be slightly adjusted for simpler implementation. I am not explicating the adjustments as they depend on the block rate. 50 | 51 | # Loose Ends 52 | 53 | * On a glance, it might seem worrisome that the bound on block rewards is probabilistic and has variance, since it might destabilize the emission schedule. However, since this only holds for red blocks that were not previously merged, the effect is marginal. Furthermore, the pruning protocol strongly limits the ability to merge old blocks, and the bound thereof will become more steep as we increase the length (in real world time) of the difficulty window. 
While I am positive this effect is negligible, we should measure it on the testnet before deploying to main.  54 | * The DAA is currently retargeted based on the average difficulty across the difficulty window. This causes the difficulty adjustment to lag during times of greatly changing global hashrate. It might be better to use a different averaging, e.g. giving more weights to newer blocks (or just taking the average of a small subwindow). 55 | * The median timestamp selection is much more sensitive to variance than the actual difficulty retargeting, we should make sure that the chosen constants do not incur problematic variance. 56 | 57 | # Backwards compatibility 58 | Breaks consensus rules, requires hardfork 59 | 60 | # Rejection 61 | After more consideration we noticed the following attack: a new large miner joining the network can supress sampled blocks, thus preventing the network from adjusting to the added difficulty. This could be solved in several ways, the most direct one being to use the sampled blocks for sampling timestamps, but choosing the difficulty according to the *total* number of blocks created, not just sampled blocks. This solution incurs added complexity required to track the actual number of produced blocks, and still allows a large miner to somewhat tamper with the timestamp deviation tolerance (thus exacerbating difficulty attacks, though to a limited extend). When discussing the solution we came up with a different approach detailed in [KIP4](kip-0004.md). While the KIP4 approach is very similar to the current proposal, we found it sufficiently different to warrent a new KIP rather than updating the current one. 62 | -------------------------------------------------------------------------------- /kip-0004.md: -------------------------------------------------------------------------------- 1 | ``` 2 |   KIP: 4 3 |   Layer: Consensus (hard fork), DAA 4 |   Title: Sparse Difficulty Windows 5 |   Author: Shai Wyborski 6 | Michael Sutton 7 | Georges Künzli 8 |   Status: implemented in testnet ([PR](https://github.com/kaspanet/rusty-kaspa/pull/186)) 9 | ``` 10 | 11 | # Motivation 12 | The difficulty adjustment algorithm requires maintaining, for each block, a list of the N blocks in its past with the highest accumulated weight. When validating a block this list needs to be traversed to compute the various quantities required for computing the difficulty target. Since the length of the difficulty window corresponds to real-time considerations, the size of desired difficulty window is measured in seconds, not blocks. This means that if we increase the block rate by a factor of R, we increase the complexity of the difficulty adjustment by R^2 (the first factor is due to the fact that validating each block requires R times more steps, and the second factor is since blocks emerge R times faster). 13 | 14 | This is partially mitigated by the incremental nature of the difficulty window. Like the blue set, each block inherits its selected parent's difficulty window and then fixes it according to the blocks in its merge set. This converts some of the CPU overhead to RAM overhead, which would also increase by a factor of R. 15 | 16 | To reduce the overhead, the current Golang implementation uses window sizes that are not optimal (263 blocks for timestamp flexibility, 2641 blocks for DAA window). As a result, miners whose clock drifts into the future more than two minutes than the majority of the network have their block delayed decreasing the probability they will be blue. 
Hence, a secondary motivation for this proposal is to increase the timestamp flexibility (which requires increasing the DAA window accordingly) efficiently. 17 | 18 | # Difficulty Windows 19 | 20 | For the sake of discussion, we will call a window long/short to indicate the length of time it represents, and large/small to indicate the number of blocks it contains. In the current state, the size of a window is its length times the BPS. Thus, increasing the BPS by a factor of R while retaining window *lengths* implies increasing the *sizes* of all windows by a factor of R. 21 | 22 | Currently, a window of a block B of length L is the set of L\*BPS blocks in the past of B with the highest blue work. 23 | 24 | Difficulty windows are windows used for determining how to readjust the difficulty target. The difficulty adjustment algorithm actually uses two windows: 25 | * the *timestamp flexibility* window, used to bound the allowed deviation from the expected timestamp. 26 | * the *difficulty* window, used to approximate the deviation of block creation time from the expected time. 27 | 28 | Timestamp flexibility length corresponds to natural deviations and drifts across different clocks of the network. Short timestamp flexibility window means the network is less tolerable towards clock drifts and more punishing towards poorly connected miners. However, long timestamp flexibility allows an adversarial miner more leeway for timestamp manipulations (e.g. for the purpose of difficulty attacks). This is countered by making the difficulty window much longer. 29 | 30 | In the current implementations, efficiency considerations already force us to choose a shorter-than-desired timestamp flexibility length, namely 263 seconds. The corresponding difficulty window length is 2641 seconds. These lengths are already not optimal in two ways: the tolerated drift is too short, and the maximum possible difficulty attack via timestamp manipulation is higher than desired. However, efficiency concerns prohibit meaningfully extending the windows. (It should be noted that an optimal timestamp manipulation attack would require a lot of resources and luck to apply consistently, while only affecting the network mildly.) 31 | 32 | Increasing the block rates while retaining window lengths will force us to increase window sizes accordingly, which is prohibitive in terms of complexity. Making the windows shorter is prohibitive in terms of security. Hence, finding a better way to represent a window is required. 33 | 34 | # Sparse Windows 35 | 36 | The idea of sparse window is simple: instead of using all blocks in the past, choose an appropriate subset of the past. If the subset is small enough, yet well distributed across the non-sparse window, it could be used as a replacement. The subset should be: 37 | 38 | * Deterministic: it could be calculated from the DAG and would turn out the same for all nodes 39 | * Uniformly distributed: it should be well spread across the non-sparse window 40 | * Incremental: it should not depend on the point of view of a particular block, so it could be inherited by future blocks 41 | * Secure: a miner should not be able to cheaply affect the probability of their block to be sampled, the cost should be comparable to discarding a valid block 42 | 43 | In the [previous proposal](kip-0003.md), we used the hash of the window block as a method to sample blocks. However, this approach turned out to be gameable. 
In the current proposal we consider a much simpler approach, defined by the following parameters: 44 | * ``length``: the length of the window in seconds 45 | * ``size``: the size of the window in blocks 46 | 47 | The ``length`` and ``size`` measurements are natural in the sense that they describe the real-time period represented by the window, and the number of samples we desire, regardless of BPS. E.g. we could specify that "for computing difficulty we should sample 1000 blocks from the last 500 minutes". 48 | 49 | Since security only relies on the *lengths* of the flexibility and difficulty windows, this allows us to modify the sizes without compromising security. In particular, it is reasonable to use a lower sample rate for the difficulty compared to the flexibility, as it is measured over a much larger period of time. 50 | 51 | # Our Proposal 52 | 53 | From these we can define a new quantity ``sparsity = length*BPS/size``. Intuitively, we use "one out of every ``sparsity`` blocks". Note that ``sparsity`` is the only quantity affected by the BPS, and in particular setting ``sparsity=1`` recovers the current, non-sparse windows. 54 | 55 | We also assume that for any block B there is some determined order in which to traverse the merge set of B; we will shortly propose an explicit ordering we find suitable. Once such an ordering is available, we intuitively define the window by starting with the window of B.selected_parent and then traversing B.merge_set, each time adding one block and then skipping ``sparsity-1`` blocks, while also removing the blocks with the lowest blue_work from the window to conserve its size. 56 | 57 | The resulting subset is equally dense in all contiguous stretches of the non-sparse window (relative to blue_work ordering) whose size is larger than a typical anticone, making it suitable for our purposes. 58 | 59 | We define for any block B a field called ``B.window(size,sparsity)`` which contains a *bounded* min_heap prioritized by ``blue_work`` and bounded by ``size``. This means that if there are ``size`` elements in ``B.window`` then whenever an element is pushed into ``B.window(size,sparsity)``, ``B.window(size,sparsity).extract_min()`` is performed automatically, conserving the heap size. 60 | 61 | Note that the field ``B.window(size,sparsity)`` is *transient*: it is not stored with the block, but it is cached. In case the selected parent of B is not cached, this field would be computed on the fly. 62 | 63 | **Remark**: Currently, if the selected parent of B is not cached, the entire window is computed. A possible optimization would be to check whether a shallow selected ancestor is cached and proceed with the computation from there. However, in practice the selected parent is almost always cached, so this optimization is probably not worth the effort. 
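To make the data structure concrete, here is a minimal Python sketch of such a bounded min-heap. The class name ``BoundedWindow`` and its methods are illustrative placeholders, not the node's actual types; pushing onto a full heap evicts the entry with the lowest ``blue_work``, which is equivalent to the push-then-``extract_min()`` rule described above.

```
import heapq

class BoundedWindow:
    """Illustrative bounded min-heap keyed by blue_work and bounded by `size`."""

    def __init__(self, size: int):
        self.bound = size
        self._heap = []  # entries are (blue_work, block_hash) tuples

    def push(self, blue_work: int, block_hash: bytes) -> None:
        if len(self._heap) < self.bound:
            heapq.heappush(self._heap, (blue_work, block_hash))
        elif blue_work > self._heap[0][0]:
            # Full: pushing and then extracting the minimum amounts to replacing
            # the lowest-blue_work entry, and is a no-op for an even lower one.
            heapq.heapreplace(self._heap, (blue_work, block_hash))

    def min(self) -> int:
        return self._heap[0][0]

    def size(self) -> int:
        return len(self._heap)

    def copy(self) -> "BoundedWindow":
        clone = BoundedWindow(self.bound)
        clone._heap = list(self._heap)
        return clone
```

The ``copy()`` method mirrors the remark below: the cached selected parent's window is cloned rather than modified in place.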
64 | 65 | The procedure for computing ``B.window(size,sparsity)`` is as follows: 66 | 67 | function complete_window(B, m): 68 | let i = 0 69 | for C in B.merge_set: 70 | if C.blue_score < B.blue_score - length*BPS: 71 | mark C as non-DAA 72 | continue 73 | i++ 74 | if (i + B.selected_parent.DAA_score) % sparsity == 0: 75 | m.push(C) 76 | return m 77 | if B.selected_parent is cached: 78 | B.window(size,sparsity) = complete_window(B, B.selected_parent.window(size,sparsity).copy()) 79 | else: 80 | B.window(size,sparsity) = new bounded_min_heap(bound = size) 81 | C = B 82 | while B.window(size,sparsity).size() < size OR C.blue_work >= B.window(size,sparsity).min(): 83 | B.window(size,sparsity) = complete_window(C, B.window(size,sparsity)) 84 | C = C.selected_parent 85 | 86 | **Remarks**: 87 | * The window of the cached selected parent is copied and not modified in place, as it might be needed for computing the windows of several blocks 88 | * The computation process in the non-cached case goes back in the chain, but it produces the same result as incrementally updating a cached window due to the strong monotonicity of DAA_score. 89 | 90 | It remains to specify the ordering in which B.merge_set is traversed; we propose using blue_work in *descending* order. That is, starting with the highest blue_work. This way the ordering of the merge set depends on the future, making it harder to game. 91 | 92 | # Regulating Emissions 93 | 94 | A block B would not reward any block C in its merge set which is marked as non-DAA. Consequently, the set of blocks that are rewarded is the set of all blocks that are not marked non-DAA by any chain block. This exclusion of blocks that were created "too late" to be rewarded without messing up the schedule is very similar to the current non-rewarding policy, though there are edge cases of blocks that will be rewarded according to the proposed policy albeit not rewarded according to the current policy. Such blocks can only be produced in scenarios that occur very rarely naturally, and producing such scenarios deliberately requires a large fraction of the global hashrate. Even if an adversary put in the work to produce such a scenario artificially, thus creating excess emission, they cannot guarantee that this excess emission would be paid to them. Such an attack is highly impractical, and requires resources that could be used much more productively to disrupt the network. Hence, we disregard this nuance for the sake of simplicity. 95 | 96 | # Proposed Constants 97 | 98 | We propose the following: 99 | 100 | * Timestamp flexibility: ``length=1210`` (~20 minutes), ``size=121``. 101 | * Difficulty: ``length=30000`` (500 minutes), ``size=1000``. 102 | 103 | # Backwards compatibility 104 | Breaks consensus rules, requires hardfork 105 | -------------------------------------------------------------------------------- /kip-0005.md: -------------------------------------------------------------------------------- 1 | ``` 2 |   KIP: 5 3 |   Layer: Applications 4 |   Title: Message Signing 5 |   Author: coderofstuff 6 |   Status: Active 7 | ``` 8 | 9 | # Motivation 10 | Signing messages provides a mechanism for proving access to a given address without revealing your private keys. 11 | This paves the way for proving ownership of funds and native assets (not yet implemented), among many other possible use-cases. 12 | 13 | This document aims to provide a standard with which applications can implement their message signing and verification functionalities. 
14 | 15 | # Specifications 16 | 17 | ## Signing a message 18 | 19 | This part of the process is relevant to applications that have access to the private key 20 | and will sign some message. 21 | 22 | Given: 23 | - A string of arbitrary length. We'll call this `raw_message` 24 | - A `private_key` 25 | 26 | Output: signature associated with that string with some given private key 27 | 28 | 1. Hash the message using Blake2B. To create sufficient domain separation with transaction input sighash, use a different Blake2B key 29 | `PersonalMessageSigningHash` (Transaction hashes uses `TransactionSigningHash` as key to blake2b) and a digest length of `32` 30 | 2. Schnorr sign the hashed message 31 | 32 | In summary: `schnorr_sign(blake2b(raw_message, digest_size=32, key='PersonalMessageSigningHash'), private_key)` 33 | 34 | ### Why hash a message? 35 | 1. Reduces the size of some arbitrary message to a fixed length hash 36 | 2. Prevents signing of sighashes accidentally - the raw message is hashed with a different blake2b key (`PersonalMessageSigningHash`) from what is used for transaction hashes (`TransactionSigningHash`), creating sufficient domain separation 37 | 38 | ## Verifying a message signature 39 | 40 | This part of the process is relevant to applications that is asking some public_key owner 41 | to provide evidence that they have access to the private_key of this public_key. 42 | 43 | Given: 44 | - A string of arbitrary length. We'll call this `raw_message` 45 | - A `public_key` which the application has asked to sign the message 46 | 47 | Output: `true` if the signature is valid, `false` otherwise 48 | 49 | 1. Hash the raw message in the same way as above for signing 50 | 2. Schnorr verify the signature matches the `public_key` you are testing for 51 | 52 | In summary: `schnorr_verify(blake2b(raw_message, digest_size=32, key='PersonalMessageSigningHash'), public_key)` 53 | 54 | # Sample Implementation 55 | 56 | Full code in /kip-0005/test-cases.py 57 | 58 | ``` 59 | from hashlib import blake2b 60 | from secp256k1 import PublicKey, PrivateKey 61 | 62 | # Assume we have a private-public key pair 63 | # some_secret_key (private) and some_public_key (public) 64 | 65 | def hash_message(raw_message) -> bytes: 66 | message_hash = blake2b(digest_size=32, key=bytes("PersonalMessageSigningHash", "ascii")) 67 | message_hash.update(raw_message) 68 | 69 | return message_hash.digest() 70 | 71 | def sign_message(raw_message) -> bytes: 72 | message_digest = hash_message(raw_message) 73 | 74 | return PrivateKey(some_secret_key).schnorr_sign(message_digest, None) 75 | 76 | def verify_message_signature(raw_message, message_signature) -> bool: 77 | message_digest = hash_message(raw_message) 78 | 79 | return PublicKey(some_public_key).schnorr_verify(message_digest, message_signature, None) 80 | ``` 81 | 82 | # Test Vectors 83 | 84 | Keys taken from https://github.com/bitcoin/bips/blob/master/bip-0340/test-vectors.csv 85 | 86 | index | secret key | public key | aux_rand | message_str | signature 87 | --- | --- | --- | --- | --- | --- 88 | 0 | 0000000000000000000000000000000000000000000000000000000000000003 | F9308A019258C31049344F85F89D5229B531C845836F99B08601F113BCE036F9 | 0000000000000000000000000000000000000000000000000000000000000000 | Hello Kaspa! 
| 40B9BB2BE0AE02607279EDA64015A8D86E3763279170340B8243F7CE5344D77AFF1191598BAF2FD26149CAC3B4B12C2C433261C00834DB6098CB172AA48EF522 | 89 | 1 | B7E151628AED2A6ABF7158809CF4F3C762E7160F38B4DA56A784D9045190CFEF | DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659 | 0000000000000000000000000000000000000000000000000000000000000001 | Hello Kaspa! | EB9E8A3C547EB91B6A7592644F328F0648BDD21ABA3CD44787D429D4D790AA8B962745691F3B472ED8D65F3B770ECB4F777BD17B1D309100919B53E0E206B4C6 | 90 | 2 | B7E151628AED2A6ABF7158809CF4F3C762E7160F38B4DA56A784D9045190CFEF | DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659 | 0000000000000000000000000000000000000000000000000000000000000001 | こんにちは世界 | 810653D5F80206DB519672362ADD6C98DAD378844E5BA4D89A22C9F0C7092E8CECBA734FFF7922B656B4BE3F4B1F098899C95CB5C1023DCE3519208AFAFB59BC | 91 | 3 | B7E151628AED2A6ABF7158809CF4F3C762E7160F38B4DA56A784D9045190CFEF | DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659 | 0000000000000000000000000000000000000000000000000000000000000001 | (See `Test CAse 3 Full Text` section) | 40CBBD3938867B10076BB14835557C062F5BF6A4682995FC8B0A1CD2ED986EEDAAA00CFE04F6C9E5A9546B860732E5B903CC82780228647D5375BEC3D2A4983A | 92 | 93 | # Test Case 3 Full Text 94 | ``` 95 | Lorem ipsum dolor sit amet. Aut omnis amet id voluptatem eligendi sit accusantium dolorem 33 corrupti necessitatibus hic consequatur quod et maiores alias non molestias suscipit? Est voluptatem magni qui odit eius est eveniet cupiditate id eius quae aut molestiae nihil eum excepturi voluptatem qui nisi architecto? 96 | 97 | Et aliquid ipsa ut quas enim et dolorem deleniti ut eius dicta non praesentium neque est velit numquam. Ut consectetur amet ut error veniam et officia laudantium ea velit nesciunt est explicabo laudantium sit totam aperiam. 98 | 99 | Ut omnis magnam et accusamus earum rem impedit provident eum commodi repellat qui dolores quis et voluptate labore et adipisci deleniti. Est nostrum explicabo aut quibusdam labore et molestiae voluptate. Qui omnis nostrum At libero deleniti et quod quia. 100 | ``` 101 | -------------------------------------------------------------------------------- /kip-0005/.gitignore: -------------------------------------------------------------------------------- 1 | __pycache__ -------------------------------------------------------------------------------- /kip-0005/reference.py: -------------------------------------------------------------------------------- 1 | # The following code is from: https://raw.githubusercontent.com/bitcoin/bips/master/bip-0340/reference.py 2 | from typing import Tuple, Optional, Any 3 | import hashlib 4 | 5 | p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F 6 | n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141 7 | 8 | # Points are tuples of X and Y coordinates and the point at infinity is 9 | # represented by the None keyword. 10 | G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798, 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8) 11 | 12 | Point = Tuple[int, int] 13 | 14 | # This implementation can be sped up by storing the midstate after hashing 15 | # tag_hash instead of rehashing it all the time. 
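# tagged_hash below implements the BIP-340 tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg).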
16 | def tagged_hash(tag: str, msg: bytes) -> bytes: 17 | tag_hash = hashlib.sha256(tag.encode()).digest() 18 | return hashlib.sha256(tag_hash + tag_hash + msg).digest() 19 | 20 | def is_infinite(P: Optional[Point]) -> bool: 21 | return P is None 22 | 23 | def x(P: Point) -> int: 24 | assert not is_infinite(P) 25 | return P[0] 26 | 27 | def y(P: Point) -> int: 28 | assert not is_infinite(P) 29 | return P[1] 30 | 31 | def point_add(P1: Optional[Point], P2: Optional[Point]) -> Optional[Point]: 32 | if P1 is None: 33 | return P2 34 | if P2 is None: 35 | return P1 36 | if (x(P1) == x(P2)) and (y(P1) != y(P2)): 37 | return None 38 | if P1 == P2: 39 | lam = (3 * x(P1) * x(P1) * pow(2 * y(P1), p - 2, p)) % p 40 | else: 41 | lam = ((y(P2) - y(P1)) * pow(x(P2) - x(P1), p - 2, p)) % p 42 | x3 = (lam * lam - x(P1) - x(P2)) % p 43 | return (x3, (lam * (x(P1) - x3) - y(P1)) % p) 44 | 45 | def point_mul(P: Optional[Point], n: int) -> Optional[Point]: 46 | R = None 47 | for i in range(256): 48 | if (n >> i) & 1: 49 | R = point_add(R, P) 50 | P = point_add(P, P) 51 | return R 52 | 53 | def bytes_from_int(x: int) -> bytes: 54 | return x.to_bytes(32, byteorder="big") 55 | 56 | def bytes_from_point(P: Point) -> bytes: 57 | return bytes_from_int(x(P)) 58 | 59 | def xor_bytes(b0: bytes, b1: bytes) -> bytes: 60 | return bytes(x ^ y for (x, y) in zip(b0, b1)) 61 | 62 | def lift_x(x: int) -> Optional[Point]: 63 | if x >= p: 64 | return None 65 | y_sq = (pow(x, 3, p) + 7) % p 66 | y = pow(y_sq, (p + 1) // 4, p) 67 | if pow(y, 2, p) != y_sq: 68 | return None 69 | return (x, y if y & 1 == 0 else p-y) 70 | 71 | def int_from_bytes(b: bytes) -> int: 72 | return int.from_bytes(b, byteorder="big") 73 | 74 | def hash_sha256(b: bytes) -> bytes: 75 | return hashlib.sha256(b).digest() 76 | 77 | def has_even_y(P: Point) -> bool: 78 | assert not is_infinite(P) 79 | return y(P) % 2 == 0 80 | 81 | def pubkey_gen(seckey: bytes) -> bytes: 82 | d0 = int_from_bytes(seckey) 83 | if not (1 <= d0 <= n - 1): 84 | raise ValueError('The secret key must be an integer in the range 1..n-1.') 85 | P = point_mul(G, d0) 86 | assert P is not None 87 | return bytes_from_point(P) 88 | 89 | def schnorr_sign(msg: bytes, seckey: bytes, aux_rand: bytes) -> bytes: 90 | d0 = int_from_bytes(seckey) 91 | if not (1 <= d0 <= n - 1): 92 | raise ValueError('The secret key must be an integer in the range 1..n-1.') 93 | if len(aux_rand) != 32: 94 | raise ValueError('aux_rand must be 32 bytes instead of %i.' % len(aux_rand)) 95 | P = point_mul(G, d0) 96 | assert P is not None 97 | d = d0 if has_even_y(P) else n - d0 98 | t = xor_bytes(bytes_from_int(d), tagged_hash("BIP0340/aux", aux_rand)) 99 | k0 = int_from_bytes(tagged_hash("BIP0340/nonce", t + bytes_from_point(P) + msg)) % n 100 | if k0 == 0: 101 | raise RuntimeError('Failure. 
This happens only with negligible probability.') 102 | R = point_mul(G, k0) 103 | assert R is not None 104 | k = n - k0 if not has_even_y(R) else k0 105 | e = int_from_bytes(tagged_hash("BIP0340/challenge", bytes_from_point(R) + bytes_from_point(P) + msg)) % n 106 | sig = bytes_from_point(R) + bytes_from_int((k + e * d) % n) 107 | 108 | if not schnorr_verify(msg, bytes_from_point(P), sig): 109 | raise RuntimeError('The created signature does not pass verification.') 110 | return sig 111 | 112 | def schnorr_verify(msg: bytes, pubkey: bytes, sig: bytes) -> bool: 113 | if len(pubkey) != 32: 114 | raise ValueError('The public key must be a 32-byte array.') 115 | if len(sig) != 64: 116 | raise ValueError('The signature must be a 64-byte array.') 117 | P = lift_x(int_from_bytes(pubkey)) 118 | r = int_from_bytes(sig[0:32]) 119 | s = int_from_bytes(sig[32:64]) 120 | if (P is None) or (r >= p) or (s >= n): 121 | 122 | return False 123 | e = int_from_bytes(tagged_hash("BIP0340/challenge", sig[0:32] + pubkey + msg)) % n 124 | R = point_add(point_mul(G, s), point_mul(P, n - e)) 125 | if (R is None) or (not has_even_y(R)) or (x(R) != r): 126 | 127 | return False 128 | 129 | return True 130 | -------------------------------------------------------------------------------- /kip-0005/test-cases.py: -------------------------------------------------------------------------------- 1 | from hashlib import blake2b 2 | from reference import * 3 | 4 | class TestCase: 5 | def __init__(self, 6 | raw_message: str, 7 | secret_key: bytes, 8 | public_key: bytes, 9 | aux_rand: bytes, 10 | signature: bytes, 11 | verification_result: bool 12 | ): 13 | self.raw_message: bytes = bytes(raw_message, 'utf-8') 14 | self.secret_key: bytes = secret_key 15 | self.public_key: bytes = public_key 16 | self.aux_rand: bytes = aux_rand 17 | self.expected_signature: bytes = signature 18 | self.expected_verification_result: bool = verification_result 19 | 20 | def hash_message(self) -> bytes: 21 | message_hash = blake2b(digest_size=32, key=bytes("PersonalMessageSigningHash", "ascii")) 22 | message_hash.update(self.raw_message) 23 | 24 | return message_hash.digest() 25 | 26 | def sign_message(self) -> bytes: 27 | message_digest = self.hash_message() 28 | 29 | return schnorr_sign(message_digest, self.secret_key, self.aux_rand) 30 | 31 | def verify_message_signature(self, message_signature: bytes) -> bool: 32 | message_digest = self.hash_message() 33 | 34 | return schnorr_verify(message_digest, self.public_key, message_signature) 35 | 36 | def run_test_case(self): 37 | sig = self.sign_message() 38 | print("Received:\t", sig.hex().upper()) 39 | print("Expecting:\t", self.expected_signature.hex().upper()) 40 | is_passed = (self.expected_signature == sig) == self.expected_verification_result 41 | assert(is_passed) 42 | 43 | is_passed = is_passed and (self.verify_message_signature(self.expected_signature) == self.expected_verification_result) 44 | print("Test Result:\t", is_passed) 45 | assert(is_passed) 46 | 47 | if __name__ == "__main__": 48 | # Test Case 0 49 | print('Running Test Case 0') 50 | print('-------------------') 51 | TestCase( 52 | 'Hello Kaspa!', 53 | bytes.fromhex('0000000000000000000000000000000000000000000000000000000000000003'), 54 | bytes.fromhex('F9308A019258C31049344F85F89D5229B531C845836F99B08601F113BCE036F9'), 55 | bytes_from_int(0), 56 | bytes.fromhex('40B9BB2BE0AE02607279EDA64015A8D86E3763279170340B8243F7CE5344D77AFF1191598BAF2FD26149CAC3B4B12C2C433261C00834DB6098CB172AA48EF522'), 57 | True 58 | ).run_test_case() 59 | 
print('') 60 | 61 | # Test Case 1 62 | print('Running Test Case 1') 63 | print('-------------------') 64 | TestCase( 65 | 'Hello Kaspa!', 66 | bytes.fromhex('B7E151628AED2A6ABF7158809CF4F3C762E7160F38B4DA56A784D9045190CFEF'), 67 | bytes.fromhex('DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659'), 68 | bytes_from_int(1), 69 | bytes.fromhex('EB9E8A3C547EB91B6A7592644F328F0648BDD21ABA3CD44787D429D4D790AA8B962745691F3B472ED8D65F3B770ECB4F777BD17B1D309100919B53E0E206B4C6'), 70 | True 71 | ).run_test_case() 72 | print('') 73 | 74 | # Test Case 2 75 | print('Running Test Case 2') 76 | print('-------------------') 77 | TestCase( 78 | 'こんにちは世界', 79 | bytes.fromhex('B7E151628AED2A6ABF7158809CF4F3C762E7160F38B4DA56A784D9045190CFEF'), 80 | bytes.fromhex('DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659'), 81 | bytes_from_int(1), 82 | bytes.fromhex('810653D5F80206DB519672362ADD6C98DAD378844E5BA4D89A22C9F0C7092E8CECBA734FFF7922B656B4BE3F4B1F098899C95CB5C1023DCE3519208AFAFB59BC'), 83 | True 84 | ).run_test_case() 85 | print('') 86 | 87 | # Test Case 3 88 | print('Running Test Case 3') 89 | print('-------------------') 90 | super_long_text = '''Lorem ipsum dolor sit amet. Aut omnis amet id voluptatem eligendi sit accusantium dolorem 33 corrupti necessitatibus hic consequatur quod et maiores alias non molestias suscipit? Est voluptatem magni qui odit eius est eveniet cupiditate id eius quae aut molestiae nihil eum excepturi voluptatem qui nisi architecto? 91 | 92 | Et aliquid ipsa ut quas enim et dolorem deleniti ut eius dicta non praesentium neque est velit numquam. Ut consectetur amet ut error veniam et officia laudantium ea velit nesciunt est explicabo laudantium sit totam aperiam. 93 | 94 | Ut omnis magnam et accusamus earum rem impedit provident eum commodi repellat qui dolores quis et voluptate labore et adipisci deleniti. Est nostrum explicabo aut quibusdam labore et molestiae voluptate. Qui omnis nostrum At libero deleniti et quod quia.''' 95 | TestCase( 96 | super_long_text, 97 | bytes.fromhex('B7E151628AED2A6ABF7158809CF4F3C762E7160F38B4DA56A784D9045190CFEF'), 98 | bytes.fromhex('DFF1D77F2A671C5F36183726DB2341BE58FEAE1DA2DECED843240F7B502BA659'), 99 | bytes_from_int(1), 100 | bytes.fromhex('40CBBD3938867B10076BB14835557C062F5BF6A4682995FC8B0A1CD2ED986EEDAAA00CFE04F6C9E5A9546B860732E5B903CC82780228647D5375BEC3D2A4983A'), 101 | True 102 | ).run_test_case() 103 | print('') 104 | -------------------------------------------------------------------------------- /kip-0006.md: -------------------------------------------------------------------------------- 1 | ``` 2 |   KIP: 6 3 |   Layer: Consensus (hard fork), Block Validation 4 |   Title: Proof of Chain Membership (PoChM) 5 |   Author: Shai Wyborski 6 | Comments-URI: https://research.kas.pa/t/kip-6-discussion-thread/189 7 |   Status: Draft 8 | Type: Informational 9 | ``` 10 | 11 | # Motivation 12 | The pruning mechanism makes it impossible to prove that a transaction was included in the ledger after it has been pruned. A currently available solution is publicly run archival nodes (in the form of block explorers) that store all historical data. However, this is not a sustainable solution since it relies on centralized service providers, and since the size of the databases increases rapidly with time and adoption. 13 | 14 | A better solution is to provide a *cryptographically verifiable proof* that a transaction was posted to the blockchain. 
Such a proof could only be *generated* before the transaction was pruned (or by using an archival node), but could be *verified* indefinitely. 15 | 16 | There are two types of proofs: 17 | * a *proof of publication* (PoP), proving that a transaction has appeared on the blockchain at a certain time 18 | * a *txn receipt* (txR), further proving that the published transaction was validated (that is, that there was no conflicting transaction that was eventually accepted) 19 | I will refer to these two types of proof collectively as *inclusion proofs*. 20 | 21 | In the current consensus rules, it is technically possible to create inclusion proofs, but such proofs can grow rather large (over 20 megabytes in the worst case). Additionally, generating and validating these proofs needs to be done manually, as these functionalities are not implemented in the node. 22 | 23 | I propose a small modification to the block validation rules that extremely reduces the size of inclusion proofs to the order of several kilobytes while incurring very mild costs on the network's performance (in terms of storage costs and block validation complexity). I also provide precise algorithmic descriptions of how inclusion proofs are generated and verified, with the intention that they will be implemented as API calls. 24 | 25 | Interestingly, while the notion of txR is strictly stronger than a PoP, it is actually easier to implement in Kaspa. This is because *accepted* transactions get special treatment: each selected chain block contains the root of a Merkle tree of all transactions that were accepted by the block (the ATMR, see below). Hence, a transaction was accepted *if and only if* it appears on this Merkle tree on a selected chain block. 26 | 27 | Hence, to provide a proof of receipt for ``txn``, it suffices to provide the Merkle proof that a block ``B`` accepted ``txn``, along with some proof that ``B`` was a selected chain block. To provide a proof of publication, it suffices to provide a proof that a block ``B`` was a selected chain block, a chain of headers going from ``B`` to some block ``C`` and a Merkle proof that ``C`` accepted ``txn``. 28 | 29 | The common part of these two types of proofs is showing that ``B`` is a header of a *selected chain* block, that is, providing a *proof of chain membership* (PoChM, pronounced like the latin word *pacem*). This is the focus of the current proposal. 30 | 31 | I stress that the current sizes are calculated with respect to the values of various parameters in the current 1BPS consensus. Changing these parameters would require recalculating these values. However, increasing block rates will only increase the factor by which proposed proofs are improved upon currently possible proofs (roughly because currently possible proofs are as large as the entire ledger stored between two consecutive pruning blocks, whereas the size of proposed proofs grows *logarithmically* with the number of *chain blocks* between two consecutive pruning blocks. In particular, increasing BPS will increase the size of current proofs, but not the size of proposed proofs). 32 | 33 | # Notations 34 | 35 | In this proposal, it is convenient to use the notation ``Past(B)`` (resp. ``Future(B)``) to denote the past (resp. future) of the block ``B`` *including* the block ``B`` itself. The method names are capitalized to differentiate them from the common notations ``past(B)`` and ``future(B)`` which *exclude* the block ``B`` itself. 
36 | 37 | I use the notation ``parent(B,n)`` to denote the *nth selected parent* of ``B``. For brevity, I use ``parent(B)`` instead of ``parent(B,1)``. For any ``n>1`` we can recursively define ``parent(B,n)=parent(parent(B,n-1))``. 38 | 39 | # Posterity Headers 40 | 41 | The consensus state of a Kaspa node includes a list of selected chain block headers. These headers are sampled at (approximately) regular intervals and are stored indefinitely. Hence, we call them *posterity headers*. 42 | 43 | Currently, posterity headers are taken from blocks used as pruning blocks, and a posterity header is stored once every 24 hours, whereby they are also commonly referred to as *pruning headers*. Later in this proposal I consider decoupling pruning headers from posterity headers, though I propose to delay this step to a separate update (mainly due to engineering complexity considerations). That being said, I currently disregard the pruning motivation and refer to these headers as *posterity headers* for convenience. 44 | 45 | Given a block ``B`` let ``posterity(B)`` be the *earliest* posterity header such that ``B ∈ Past(posterity(B))``, or ``null`` if such a stored header does not exist yet. Let ``posterity_depth(B)`` output the integer ``n`` satisfying ``B=parent(posterity(B),n)``. 46 | 47 | For any block ``B`` let ``next_posterity(B)`` be the block header with the following property: if ``B`` is a posterity header, then ``next_posterity(B)`` is the next posterity header. Note that this is well-defined even if ``B`` is not a posterity header. 48 | 49 | Posterity headers have the following important properties: 50 | * They are determined in consensus. That is, all nodes store the same posterity headers. 51 | * They are headers of blocks in the *selected chain*. 52 | * Each block ``B`` contains a pointer to ``posterity(posterity(posterity(B)))``. Hence, the chain of posterity headers is verifiable all the way down to genesis. In particular, obtaining and verifying the posterity chain is part of the process of syncing a new node. 53 | 54 | The reason that ``B`` points to ``posterity(posterity(posterity(B)))`` and not to ``posterity(B)`` is that the original motivation for storing these blocks comes from the pruning mechanism, where these depth-3 pointers are required (for reasons outside the scope of the current discussion). 55 | 56 | # Accepted Transactions Merkle Root (ATMR) 57 | 58 | In Bitcoin, every header contains the root of a Merkle tree of all transactions included in this block. In Kaspa, this Merkle tree is extended to contain all *accepted* transactions included in this block *and its merge set*. That is, all transactions that appear either in the block or the merge set of the block, except transactions that conflict with other transactions that precede them in the GHOSTDAG ordering. 59 | 60 | # Proofs of Chain Membership (PoChM) 61 | 62 | To provide a txR for ``txn``, it suffices to provide the following: 63 | * A header of a block ``B`` and a Merkle proof that ``txn`` appears in ``B``'s ATMR 64 | * Proof that ``B`` appeared in the selected chain 65 | 66 | This suffices to prove that ``txn`` was validated even if a conflicting transaction ``txn'`` (or any number thereof) was also included in the blockDAG: the validation rules imply that ``txn`` and ``txn'`` both appeared in the anticone of ``B``, and that ``txn`` preceded ``txn'`` in the GHOSTDAG ordering. 67 | 68 | The first item is a straightforward Merkle proof. However, the second item is trickier. 
I refer to such a proof as a *proof of chain membership* (PoChM, pronounced like the Latin word *pacem*) for ``B``. The rest of this document is concerned with providing a PoChM for an arbitrary block ``B``. 69 | 70 | # PoChM Without Hard-Fork 71 | 72 | Currently, the most straightforward way to construct a PoChM for ``B`` is to store the entire set ``Future(B) ∩ Past(posterity(B))``. The result is a "diamond shaped" DAG whose top block is ``posterity(B)`` and bottom block is ``B``. Given a DAG of this shape as proof, any node could verify that the top block is a posterity block, and that following the selected parent from the top block leads to the bottom block. 73 | 74 | The problem with this proof is its size. In the worst case, it would be about as large as 24 hours worth of headers. At the current 1BPS and header size of 248 bytes, this sums to about 24 megabytes. 75 | 76 | Remark: This could be improved slightly by, instead of storing the entire set, only storing the headers of the selected chain blocks and their parents. This data suffices to compute the selected parent of each selected chain block and validate the proof. However, this does not seem to improve the size by much. Also note that proofs for many transactions that were accepted in chain blocks with the same ``posterity`` can be aggregated. In particular, a PoChM for a block ``B`` is also a PoChM for any chain block ``C ∈ Future(B) ∩ Past(posterity(B))`` 77 | 78 | Remark: Note that a PoP does not actually require the entire PoChM. It suffices to provide a chain of headers going from any posterity block to any block merging ``txn``. Since in PoP we don't care whether ``txn`` was eventually accepted, we are not concerned about whether this chain ever strays from the selected chain. However, the improvement is only by a constant factor (the ratio of chain blocks to total blocks), which is currently estimated to be a factor of around two. 79 | 80 | # Performance trade-offs 81 | 82 | We aim to decrease the size of a PoChM as much as possible while minimizing performance costs. There are two relevant types of costs: header sizes, and block validation complexity. 83 | 84 | As an extreme example, one could provide a very small PoChM by including in each posterity header the entire list of headers of all chain blocks down to the next posterity header. This "solution" is obviously prohibitive as it will make block headers huge. 85 | 86 | On the other extreme, one could include in each header a *Merkle tree* of all chain blocks down to the next posterity header, and let the PoChM of B be a Merkle proof for that tree. While this "solution" only increases the size of a block header by 32 bytes (the size of a single hash), it makes it necessary to compute tens of thousands of hashes to validate a single block header, which is prohibitively costly. 87 | 88 | Our proposal "balances" the second approach: we add to each header the root of a Merkle tree that only contains *logarithmically many* headers. This allows generating a PoChM in the form of a logarithmically long chain of headers and Merkle proofs. 89 | 90 | In the current parametrization, implementing our proposal requires increasing the size of a header by a single hash (32 bytes), and adds a validation step with constant space complexity and a time complexity of θ(log(N)) where N is the number of *chain* blocks between two consecutive posterity blocks. The size of a PoChM, as well as the time required to verify it, is θ(log(N)loglog(N)). 
91 | 92 | We will provide non-asymptotic bounds after specifying the solution. For now we will state that in the current parametrization (without increasing posterity header density), the new block validation step requires computing 33 hashes (and can be skipped for blocks outside the selected chain), and that, in the worst case, the size of a PoChM is about 9.5 kilobytes. 93 | 94 | # Our Proposal 95 | 96 | The block header will contain a new field called the *PoChM Merkle root* (PMR) defined as follows: let k be the least integer such that ``parent(B,2^k) ∈ Past(next_posterity(B))``, then PMR is the root of the Merkle tree containing the headers ``parent(B,2^i)`` for ``i = 0,...,k-1``. 97 | 98 | Let ``PMR(B,i)`` be the function that outputs a Merkle proof that ``hash(parent(B,2^i))`` is in the tree whose root is the PMR of B. 99 | 100 | The process of header validation of chain block candidates will include verifying the PMR. I propose that the PMR will not be validated for blocks that are not chain candidates. In particular, a block whose PMR is invalid but is otherwise valid will remain in the DAG but will be disqualified from being a selected tip/parent. A similar approach is employed e.g. when validating UTXO commitments or checking that the block does not contain double spends. (A crucial subtlety is that, eventually, the selected parent of a block is *always* the parent with the highest blue accumulated work (BAW). If while validating the selected parent the block turns out to be disqualified, then the pointer to this block is *removed*. This rule allows us to read the selected parent of any block from the headers of its parents alone, without worrying about the parent with the highest BAW is disqualified for some external reason that requires additional data to notice). 101 | 102 | The procedure for generating a *PoChM* for a block ``B`` is as follows: 103 | 104 | Let C = posterity(B) 105 | If C=null: 106 | Return error 107 | Let d = posterity_depth(B) 108 | Let proof = [] 109 | While true: 110 | Let i = floor(log_2(d)) 111 | proof.append(PMR(C,i)) 112 | d -= 2^i 113 | If d == 0: 114 | Break 115 | C = parent(C,2^i) 116 | proof.append(C) 117 | Return proof 118 | 119 | To understand how to validate the proof we first consider it in two simple cases: 120 | 121 | If there is some i such that ``posterity_depth(B) = 2^i`` then ``B`` itself is a member of the PMR of ``posterity(B)`` and the entire PoChM is a single Merkle proof. 122 | 123 | If there are some i>j such that ``posterity_depth(B) = 2^i + 2^j`` then the proof would contain three items: 124 | * A Merkle proof that ``hash(parent(posterity(B),2^i))`` is in the PMR of ``posterity(B)`` 125 | * The header ``parent(posterity(B),2^i)`` (that in particular includes its PMR) 126 | * A Merkle proof that ``hash(B)`` is in the PMR of ``parent(posterity(B),2^i)`` 127 | 128 | By verifying the proofs and hashes above, one verifies that ``B`` is indeed a chain block. The general procedure extends similarly. 
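To complement the generation procedure, the following Python sketch shows how a verifier could walk such a proof. It only illustrates the logic: the blake2b-based hashing, the Merkle-path encoding, and the dict-based header stand-in are assumptions made for the sketch, not the actual node formats.

```
from hashlib import blake2b
from typing import List, Tuple

# Illustrative hash; the node's actual header hashing and Merkle rules differ.
def h(data: bytes) -> bytes:
    return blake2b(data, digest_size=32).digest()

# Each path element is (sibling_hash, sibling_is_on_the_right).
def verify_merkle_path(root: bytes, leaf: bytes, path: List[Tuple[bytes, bool]]) -> bool:
    node = leaf
    for sibling, sibling_on_right in path:
        node = h(node + sibling) if sibling_on_right else h(sibling + node)
    return node == root

def verify_pochm(posterity_pmr: bytes, depth: int, target_hash: bytes, proof: list) -> bool:
    """Walk a PoChM produced by the generation procedure above.

    `proof` alternates Merkle paths and intermediate headers:
    [path_1, header_1, path_2, header_2, ..., path_last]. Headers are modelled
    here as dicts {"bytes": ..., "pmr": ...}; this is a placeholder, not the
    real header format. `depth` is posterity_depth(B), `target_hash` is hash(B).
    """
    if depth == 0:
        return len(proof) == 0  # B is itself a posterity block; consensus data suffices
    pmr, d, idx = posterity_pmr, depth, 0
    while True:
        i = d.bit_length() - 1      # floor(log2(d)): the next hop is by 2^i chain parents
        d -= 1 << i
        if d == 0:
            # Last hop: the Merkle path must commit to the target block itself.
            return verify_merkle_path(pmr, target_hash, proof[idx])
        header = proof[idx + 1]
        if not verify_merkle_path(pmr, h(header["bytes"]), proof[idx]):
            return False
        pmr = header["pmr"]         # continue the walk from parent(C, 2^i)
        idx += 2
```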
129 | 130 | # PoP and validation receipt 131 | 132 | To generate a validation receipt for ``txn``: 133 | * Find the block ``B`` that accepted ``txn`` 134 | * Output: PoChM for ``B``, Merkle proof that ``B`` is in the ATMR of ``B`` 135 | 136 | To generate a PoP for ``txn``: 137 | * Find the block ``C`` in which ``txn`` was published 138 | * Let ``B`` be the earliest selected chain block with ``C`` in its past 139 | * Output: PoChM for ``B``, chain of headers from ``B`` to ``C``, Merkle proof that ``txn`` is in ``C``'s ATMR 140 | 141 | Note that the PoP can be optimized: instead of using ``C``, find a block ``C'`` along the path from ``B`` to ``C`` that accepts ``txn`` and minimizes the length of the chain from ``B`` to ``C'``, and use ``C'`` instead of ``C`` in the last step. This optimization has the nice property that if ``txn`` was accepted, then the resulting PoP will be identical to the validation receipt. However, it might be the case that the increased complexity of implementing the optimization is more substantial than the reduction in proof size, which should be measured in practice. 142 | 143 | # Posterity Header Density 144 | 145 | As a final optimization, I consider increasing the stored block density to once an hour. The main motivation for this optimization is to reduce the time required before a PoChM can be generated. A prerequisite for generating a PoChM is that stored_header(B) already exists, and reducing this time is beneficial. Additionally, this would meaningfully reduce (by around 40%) the complexity of the added verification step, the complexity of PoChM verification, and the size of a PoChM. 146 | 147 | However, this introduces additional costs. Currently, posterity headers double as pruning headers. Decoupling them from each other means that an additional ``posterity header`` pointer would have to be added to each header, increasing the header size to 312 bytes. In addition, this decoupling is tricky engineering-wise and is probably more complicated to implement than the entire rest of the KIP. 148 | 149 | I recommend first implementing the current KIP using the current posterity/pruning header, and deciding on separating posterity headers with increased density later, based on demand and necessity. It might also be the case that the additional pointer could be removed (i.e. the pruning mechanism will somehow piggyback on the posterity blocks in a way that doesn't have computational costs). This should be the subject of an independent discussion, to be concluded in a follow-up KIP. 150 | 151 | # Size of PoChM 152 | 153 | Computing the actual size of a PoChM is a bit tricky, as it depends on the number of *chain* blocks between ``C`` and ``next_posterity(C)`` for several blocks ``C``. By staring at the KGI for sufficiently long, one can become convinced that the selected chain grows by about one block every two seconds (that is, at 1BPS about half of the blocks are chain blocks). To provide a reliable upper bound, I will assume that the selected chain grows at a rate of about 1 block per second. Note that the growth of the selected chain is not governed by block rates, but by network conditions. Hence, I assume this holds with overwhelming probability for any block rate. This might not hold if network conditions improve greatly. However, we'll soon see that the growth asymptotics (as a function of the number of chain blocks between two consecutive posterity blocks) are a very mild log*loglog, so this is hardly a concern. 
I will demonstrate this with concrete numbers after we obtain an expression for the size of a PoChM. 154 | 155 | Let ``L_e`` and ``L_a`` denote the size of a header and a hash, respectively. Let ``N`` be the number of seconds between two consecutive posterity blocks (under the assumption of one chain block per second, this is also the number of chain blocks between them). For a block ``B`` let ``|B|`` denote the Hamming weight of the binary representation of ``posterity_depth(B)``. It follows that a PoChM for ``B`` contains ``|B|`` Merkle proofs and ``|B|-1`` headers (in particular, if ``B=posterity(B)`` (equivalently ``posterity_depth(B)=0``) then ``|B|=0``, and indeed the "proof" is empty, since the consensus data itself proves that ``B`` is a chain block). 156 | 157 | The size of each Merkle proof is ``(2*log(log(N))+1)*L_a``, so the total size of the PoChM is ``(2*log(log(N))+1)*L_a*|B| + L_e*(|B|-1)``. In the worst case, we have that ``|B| = log(N)`` so we obtain a bound of ``(2*log(log(N))+1)*log(N)*L_a + L_e*(log(N)-1)``. Assuming ``L_e=280 bytes``, ``L_a=32 bytes`` and ``N=86400`` (that is, that a posterity block is sampled once every 24 hours), this comes up to 9 kilobytes. 158 | 159 | Increasing posterity density to once per hour (that is, setting N=3600) would decrease the largest PoChM size to 6 kilobytes. 160 | 161 | If network conditions improve so much that the selected chain grows at a rate of 10 blocks per second (which is unlikely to happen in the foreseeable future), the largest PoChM size would be 11 kilobytes for 24-hour density and 8 kilobytes for one-hour density. 162 | 163 | The size of a txR is the same as the size of a PoChM up to a single Merkle proof. Note that the size of such a Merkle proof is *not* ``log(log(N))`` as the ATMR does not contain block headers but transactions. Hence, the size of this Merkle proof is logarithmic in the number of transactions *accepted* by ``B``, and is subject to adoption and block rates. To get a loose upper bound, consider a scenario where each block contains 300 transactions and merges 99 blocks (e.g. 100BPS assuming an average network delay of one second). In this scenario, the ATMR would contain around 30000 transactions, so the Merkle proof would contain the transaction itself and 29 hashes, making it about 1KB large. 164 | 165 | # Resource Cost 166 | 167 | 24-hour posterity density: 168 | * Constant storage: Currently, three days of ledger data are stored, which contain about 260,000 headers. Storing an additional 32-byte hash for each of these headers requires about 8.5 megabytes. 169 | * Accumulated storage: An additional hash in each posterity header increases the state growth by about 11 kilobytes a year 170 | * Block Validation: The new step requires computing a Merkle root of a tree of height ``log(N)`` containing chain blocks. Since there is already fast random access to all chain blocks, the heaviest part of the computation is the hashing itself, which requires ``2*log(N)-1`` hash computations. Assuming a selected chain growth rate of one block per second, this becomes 32 hashes. When there are not many reorgs, this comes up to 32 hashes/second. If the selected chain growth rate somehow increases to, say, 10 blocks per second, this would become 39 hashes per block or 390 hashes per second. Using an efficient hash such as blake2, computing this many hashes is negligible even for very weak CPUs. 
171 | 172 | one-hour posterity density: 173 | * Constant storage: same as above 174 | * Accumulated storage: an additional 23 headers per day accumulate to about 2.3 megabytes per year 175 | * Block Validation: The number of hashes/second will reduce to 23 hashes/second for 1 block/second selected chain growth, or to 300 hashes/second for 10 blocks/second selected chain growth. 176 | 177 | It is fair to say that in all scenarios, the resource costs of this upgrade are very marginal. 178 | 179 | # Backwards compatibility 180 | Breaks consensus rules, requires hardfork. Changes header structure. 181 | -------------------------------------------------------------------------------- /kip-0009.md: -------------------------------------------------------------------------------- 1 | ``` 2 |   KIP: 9 3 |   Layer: Mempool, P2P, Consensus 4 |   Title: Extended mass formula for mitigating state bloat 5 |   Authors: Michael Sutton 6 | Ori Newman 7 | Shai Wyborski 8 | Yonatan Sompolinsky 9 | Comments-URI: https://research.kas.pa/t/quadratic-storage-mass-and-kip9/159 10 |   Status: proposed, implemented in the rust codebase and activated in testnet 11 11 | ``` 12 | 13 | We propose a mechanism for regulating the growth rate of the UTXO set in both organic and adversarial settings. Our proposal is fundamentally different than existing attempts to mitigate state bloat (e.g., statelessness or state rent, both requiring active user involvement), and only entails a simple change to the logic of how the mass of a transaction is computed. In this proposal, we specify the revised formula and its consequences and describe how ecosystem software such as wallets, pools, and exchanges should adapt to this change without affecting the quality of service. We provide an intuitive overview of the properties of the new mass formula and defer the formal treatment to a soon-to-be-published preprint. 14 | 15 | # Motivation 16 | A few months ago, the Kaspa network faced a dust attack that exploited the high throughput to create many minuscule UTXOs, forever increasing the storage costs of full nodes. The attack was countered and eventually stopped by deploying a mempool patch [2]. The patch imposed a limitation on the number of transactions with more outputs than inputs that are allowed in each block; see [1] for a detailed account of the attack and its resolution. Since many standard transactions are of this form (most typically, a transaction with a single input, a destination output, and a change output), this resulted in noticeable UX and QoS inconveniences. This KIP aims to address the state bloat challenge more fundamentally, by introducing a transaction cost -- or mass -- function that inherently limits state bloat attacks. While the current state of the system is bearable (the patch increasing the delay of such transactions by a dozen seconds at worst), increased adoption will greatly exacerbate this effect, providing ample motivation to expedite the implementation of a sustainable solution. 17 | 18 | # Specification 19 | 20 | The formal specification of the suggested changes is described below, followed by a detailed overview and analysis. 21 | 22 | ### Extended mass formula 23 | 24 | We refer to the quantity currently known as "mass" by the name `compute_mass`, and introduce a new quantity called the `storage_mass`. As the name suggests, the former regulates computational costs while the latter regulates storage costs. 
We redefine the total mass of a transaction as the maximum over both: 25 | $$\text{mass}(tx) = max\lbrace\text{compute mass}(tx) , \text{storage mass}(tx)\rbrace\text{.}$$ 26 | 27 | In the context of storage mass, a transaction is modeled as two *collections* of values: the input values $I$ and output values $O$. We use the notation $x^+ = \max\lbrace x,0 \rbrace$. The storage mass is defined as follows: 28 | 29 | $$\text{storage mass}(tx) = C\cdot\left(\sum_{o \in O} \frac{1}{o} - \frac{|I|^2}{\sum_{v \in I} v}\right)^+\text{,}$$ 30 | where $C$ is a constant controlling the correlation of inverse KAS value to mass units. 31 | 32 | A relaxed version of the mass formula treats inputs and outputs symmetrically: 33 | $$\text{storage mass}^*(tx) = C\cdot\left(\sum_{o \in O} \frac{1}{o} - \sum_{v \in I} \frac{1}{v}\right)^+\text{.}$$ 34 | 35 | We call this the *relaxed formula*. As shown in [3], `storage_mass*` can be used for transactions satisfying $|O|\le|I|\le 2$ or if $|O|=1$. 36 | 37 | As mandatory in consensus systems, storage mass calculation must use only integers and cannot rely on floating-point arithmetics. This means that the constant $C$ must be computed within each fraction otherwise the loss of precision will render the calculation useless. 38 | The following compiles the overall pseudo-code for calculating the new extended mass formula: 39 | ```python 40 | def negative_mass(I, |O|): 41 | """ 42 | Calculates the negative component of the storage mass formula. Note that there is no dependency on output 43 | values but only on their count. The code is designed to maximize precision and to avoid intermediate overflows. 44 | In practice, all calculations should be saturating, i.e., clamped between u64::MIN and u64::MAX 45 | """ 46 | if |O| == 1 or |O| <= |I| <= 2: 47 | return sum(C/v for v in I) 48 | return |I|*(C/(sum(v for v in I)/|I|)) 49 | 50 | def storage_mass(I, O): 51 | N = negative_mass(I, |O|) 52 | P = sum(C/o for o in O) 53 | return max(P-N, 0) 54 | 55 | def mass(tx): 56 | return max(storage_mass(I=tx.inputs, O=tx.outputs), compute_mass(tx)) 57 | ``` 58 | 59 | Storage mass (unlike compute mass) cannot be computed from the standalone transaction structure, but requires knowing the exact input values, which are only available with full UTXO context (i.e., it requires a *populated* transaction). 60 | 61 | #### UTXO plurality adjustment 62 | 63 | To further refine the extended mass formula, the storage mass calculation is adjusted to account for UTXO entries with non-standard script public key sizes (i.e. larger than the typical 35 bytes), thereby capturing the additional persistent storage required. 64 | 65 | - **Base Unit Definition**: Define a base UTXO storage unit as `UtxoUnit = 100` bytes—derived from the constant parts of a UTXO (63 bytes) plus the maximum standard public key size (35 bytes). 66 | 67 | - **Calculating Plurality**: For a given UTXO entry, compute its plurality as: 68 | $P = \lceil \text{entry.size} / \text{UtxoUnit} \rceil$ 69 | This value $P$ represents how many standard-sized entries the UTXO effectively occupies. The entry is then treated as $P$ entries, each holding $\text{entry.amount} / P$ KAS. 
70 | 71 | - **Adjusting the Harmonic Component**: The original harmonic term in the storage mass formula is: 72 | $$\sum_{o \in O} \frac{1}{o}$$ 73 | For each output, this is now generalized to: 74 | $$\frac{P^2}{o}$$ 75 | 76 | - **Adjusting the Arithmetic Component**: Similarly, the original arithmetic component: 77 | $$\frac{|I|^2}{\sum_{v \in I} v}$$ 78 | is generalized to: 79 | $$\frac{\left(\sum_{i \in I} P_i\right)^2}{\sum_{v \in I} v}$$ 80 | where $P_i$ is the plurality of input $i$ and the sum of input values remains unchanged. 81 | 82 | This reduction effectively maps larger UTXO entries to the usage of multiple standard units, aligning the storage mass with the actual storage impact of non-standard UTXO sizes. 83 | 84 | 85 | ### Constants 86 | We suggest setting $C=10^{12}$. Note that transaction input and output values are represented in dworks (also known as sompis), which are the smallest unit of KAS value (a single KAS is equal $10^{8}$ dworks). 87 | 88 | ### P2P & Mining rules 89 | The new mass formula can be implemented as a mempool policy rather than a consensus rule. This has the benefit of not requiring a fork to roll out, but the disadvantage of trusting the miners to follow this policy. Due to the urgency of the update, we suggest to initially roll it out as a P2P/mining rule, while also including it in the nearest upcoming hard-fork. 90 | 91 | **P2P rule** A node receiving a $\text{tx}$ via RPC or P2P should validate the transaction and calculate its mass using the new formula. If the computed mass exceeds the standard mass (currently $100,000$ grams), $\text{tx}$ should be rejected from the mempool and not broadcasted to peers. 92 | 93 | **Mining** The new mass should be used by the mempool transaction selection algorithm to calculate the $\text{fee}/\text{mass}$ ratio, and the overall block template mass should respect the block mass limit ($500,000$ grams). 94 | 95 | ### Consensus (hard-fork) 96 | 97 | Applying the proposal as a consensus change requires the following updates: 98 | 99 | **Transaction mass field** A new explicit `storage_mass` field should be added to the `Transaction` structure. This field is used as a *commitment* to the storage mass consumed by the transaction, until it can be verified in full UTXO context. To make the stated mass binding, the mass field must be hashed when computing the transaction hash. However, we suggest that the logic for computing a transaction *ID* should ignore this field, as this avoids unnecessary changes in ecosystem software (clients such as wallets frequently compute the ID locally). Note that wallets composing a transaction can leave the mass field empty, and let miners fill it for them during block template building. 100 | 101 | **Block validation rule** The `storage_mass` field of a transaction represents its storage mass. A transaction's compute mass can be calculated in isolation. As such, for all transactions in a block the storage mass and compute mass will be tracked and summed independently and both total storage mass and total compute mass are checked that they are each under the block mass limit. Tracking these masses independently allows for better consumption of block space considering compute mass and storage mass consume different resources on the node. 102 | 103 | 104 | **Transaction validation rule** The full mass is only calculated during contextual transaction validation (that is, validation with full UTXO context of the containing block or a chain block merging it). 
If the computed mass does not match the committed mass, the transaction is considered invalid and is discarded. Like other transaction validation rules, blocks containing transactions that commit to a wrong storage mass are disqualified from the selected-chain. 105 | 106 | ### Wallets 107 | Current wallet developers in the Kaspa ecosystem are familiar with the term “compounding”. When `compute_mass` of a transaction is larger than $M$ (where $M=100,000$ is the standard transaction mass limit), wallets compose a chain or a tree of transactions for iteratively compounding UTXOs until a final payment can be made with the desired, typically large, value. Similarly, with `storage_mass` introduced, wallets might encounter cases where the desired payment is extremely small and the initial `storage_mass` is larger than $M$. Below we specify a “fragmentation” process, which iteratively “unbalances” a pair of UTXOs until one of them is small enough to make the micropayment. Notably, this unbalancing process relies on the sensitivity of the relaxed formula to the distribution of *input* values. 108 | 109 | Consider a user with a single UTXO holding $1$ KAS making a payment of $0.05$ KAS. For simplicity, assume the transaction fee is always $0.001$ KAS. The following chain of transactions performs the desired micropayment while not surpassing $M$ in any transaction: 110 | 111 | ``` 112 |      0.5      0.1    0.05 113 |     /   \    /   \   / 114 | 1.0 115 |     \   /    \   /   \ 116 |      0.449    0.898   0.947 117 | ``` 118 | 119 | 120 | In section [Wallet algorithm](#wallet-algorithm) we fully specify an unbalancing algorithm that minimizes the number of steps required to make the micropayment, and discuss the relation to existing wallet algorithms. In section [Micropayments](#micropayments) we provide alternative costless methods for supporting micropayments in the long-run. 121 | 122 | Currently, the vast majority of Kaspa transactions have outputs larger than one KAS, for which `storage_mass` hardly ever surpasses $M$. Hence, a simple but very short-term way for wallets to adapt to the new formula is to reject payments below $0.2$ KAS (roughly the value at which storage mass gets close to $M$). If the *change* value is very small, larger/more inputs should be used, or the fee can be increased to consume the change. Exceptions to the common usage are faucet-like services used to demonstrate or experiment with the system. We suggest adapting the fragmentation process so that the wallet backing such services will always have small enough UTXOs ready to make the next micropayment. 123 | 124 | ### RPC API 125 | Before the hard-fork is applied, all mining software *must update* their RPC client interface to a version that includes the new transaction mass field. Otherwise, blocks submitted via RPC will hash incorrectly and be deemed invalid. 126 | 127 | # Rationale 128 | The following provides an overview of the design decisions made and the rationale behind them. Formal statements presented rely on the analysis in [3]. 129 | 130 | ### Transaction mass vs. fees 131 | In Kaspa, the capacity of a block is given in a quantity called *mass*. The difference between mass and size is that different types of data are considered "heavier" or "denser", in the sense that they consume more mass per byte. For instance, signatures have a relatively high mass per byte compared with other data since verifying them is computationally intensive. 
For an exact specification of how `compute_mass` is currently calculated, see [here](https://kaspa.aspectron.org/transactions/constraints/mass.html#transaction-mass-limits). 132 | 133 | Increasing the mass cost of storage-wasting transactions is an effective countermeasure for regulating state growth, as it allows us to control how much each block increases the state (to a limited but sufficient extent, as we will soon see). This approach could be considered an adaptable version of imposing high fees on wasteful transactions, with the following benefits: 134 | 1. It does not require an estimate of what constitutes a "high" fee. Instead, the fee is naturally monetized as the value of the blockspace consumed by the transaction. 135 | 2. It works in the absence of an active fee market. Even if there are no fees, the mass limits and bounded block rate regulate the storage increase. 136 | 137 | ### Quadratic storage costs via local mass costs 138 | Our overarching goal is to make the cost of wasting some amount of storage *quadratic* in the amount of the storage wasted. That is, taking up ten times more storage takes 100 times more mass, taking up 100 times more storage takes 10000 times more mass, etc. 139 | 140 | As mentioned in the previous section, lower bounds on mass consumption can be seen as lower bounds on the *time* required to execute a series of transactions. Alternatively, one can argue that the *budget* locked within a segment of storage adds an additional cost to the currency owner, in the form of initial acquisition, unearned interest, etc. Thus, we sought a formula where the accumulated mass required to waste some amount of storage decreases with the *budget* locked within it. If we use *growth* to denote the inflation in storage caused by the attack, the desired constraint is that the accumulated mass is proportional to $\text{growth}^2/\text{budget}$. 141 | 142 | 143 | 144 | Designing a mass function providing such quadratic growth is challenging due to two contrasting desires: 145 | 1. Local mass pricing: the mass of a transaction is defined locally and may not depend on other transactions. 146 | 2. Global cost: the cost should be quadratic in the state inflation caused, regardless of how the attacker structured his attack. An attacker can structure his attack in many ways, creating many transactions with intertwined dependencies between them, and the mass formula must ensure that even the cleverest of attacks will ultimately pay a quadratic price. 147 | 148 | To illustrate the intricacy, consider attempting the above by setting the mass of a transaction with $I$ inputs and $O$ outputs to be proportional to $(O-I)^2$. While locally it achieves quadratic growth, a clever attacker can increase the size of the UTXO set by $N$ entries by creating $N$ transactions, each with one input and two outputs. The accumulated mass of the attack is linear in $N$ rather than quadratic. In fact, we prove in [3] that *any* attempt to compute mass based only on the *number* of inputs and outputs or their total value (while ignoring how the values are distributed) will fail. The reason for this ultimately boils down to the following observation: the mass of a transaction with two outputs of value $V/2$ is the same as that of a transaction (with the same inputs) creating two outputs of values $\varepsilon V$ and $(1-\varepsilon)V$ for an arbitrarily small $\varepsilon$, and this property can be abused to cheaply increase the storage. 
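To make this failure mode concrete, here is a toy, self-contained Python comparison (not consensus code) of the naive count-based pricing against the storage-mass formula for the attack shape just described: $N$ transactions, each splitting a single input into two halves. The budget and transaction count are arbitrary illustrative values; the storage-mass function follows the formula specified earlier in this KIP.

```python
C = 10 ** 12  # the constant C suggested in this KIP

def naive_mass(num_inputs, num_outputs):
    # a mass rule that depends only on the input/output counts
    return (num_outputs - num_inputs) ** 2

def storage_mass(inputs, outputs):
    # the storage mass formula with integer arithmetic, following the pseudo-code above
    harmonic = sum(C // o for o in outputs)
    arithmetic = len(inputs) ** 2 * C // sum(inputs)
    return max(harmonic - arithmetic, 0)

budget = 10 ** 13   # toy value: 100,000 KAS in dworks, spread across the attack
n_txs = 1_000_000   # each transaction nets one extra UTXO entry, so growth = n_txs
value_in = budget // n_txs

naive_total = n_txs * naive_mass(1, 2)                                 # linear in n_txs
storage_total = n_txs * storage_mass([value_in], [value_in // 2] * 2)  # ~ C * n_txs^2 / budget
lower_bound = C * n_txs ** 2 // budget
print(naive_total, storage_total, lower_bound)  # 1000000, 300000000000, 100000000000
```

Even for this simple attack shape, the accumulated storage mass respects the quadratic lower bound formalized in the following section, while the count-based rule charges only a linear total.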
149 | 150 | ### Revisiting the mass formula 151 | A more mathematical representation of the formula defined [above](#extended-mass-formula) can be given using the harmonic and arithmetic means ($H, A$ respectively): 152 | $$\text{storage mass}(tx) = C\left(\frac{|O|}{H(O)} - \frac{|I|}{A(I)}\right)^+$$ 153 | 154 | The idea behind this formula is that a transaction should be *charged* for the storage it requires and *credited* for the storage it frees. However, there is a principal difference between them: the storage charge should reflect how distributed the outputs are, while the credit should only reflect the number of consumed UTXOs and their *total* value. This way, transactions that do not increase the UTXO set much (or even decrease it) but create minuscule outputs are heavily charged, while transactions whose inputs and outputs are on the same order of magnitude will not pay a lot of mass, even if they do increase the UTXO set. This excludes the ability of a storage attack, since such an attack requires breaking large UTXOs into small UTXOs. This is why using the arithmetic mean to credit and the harmonic mean to charge makes sense: The arithmetic mean is the same for any two sets of values of the same size and sum, it completely "forgets" how the values are distributed. On the other hand, the harmonic mean is extremely sensitive to how values are distributed and becomes very small if the smallest value is very small (by definition, the harmonic sum of $N>1$ values, one of which is $v$, is smaller than $vN$). 155 | 156 | In [3], we formalize this intuitive interpretation by proving that this definition of `storage_mass` satisfies the property that the total `storage_mass` required for executing a set of transactions is lower bounded by a quadratic function of the global state growth it caused. More formally, assume $G$ is a DAG of transactions containing an aggregate value of $\text{budget}(G)$, and which increases the UTXO set by $\text{growth}(G)$ entries. Then, 157 | $$\sum_{\text{tx}\in G}\text{storage mass}\left(tx\right)\ge C\cdot\frac{\text{growth}^2(G)}{\text{budget}}\text{.}$$ 158 | 159 | Below we provide concrete illustrations of the consequences of this bound. 160 | 161 | 166 | 167 | # Security analysis and growth regulation 168 | In this section, we consider the consequences of applying the storage mass policy, using $C=10^{12}$. We consider an attacker seeking to increase the storage requirements by one gigabyte. For that, they would have to create $20$ million, or $2 \cdot 10^7$ new UTXO entries. We can now ask ourselves two questions: 1. How long would the attack last given a fixed budget? 2. How expensive the attack should be given it should last a fixed amount of time. 169 | 170 | - **Fixed budget** Say the attacker has a budget of $20,000$ kas. That is, $2\cdot 10^4\cdot 10^8$ dworks. Plugging this into the bound, we get that $C\cdot growth^2/budget = (10^{12}\cdot 4 \cdot 10^{14})/(2 \cdot 10^4 \cdot 10^8) = 2 \cdot 10^{14}$. That is, the attack would cost $2 \cdot 10^{14}$ grams, which would take the full capacity of 400 million blocks. Hence, in 10BPS such an attack would require at least a year and a half, assuming the attacker uses 100% of the network throughput and the fees are negligible. 171 | 172 | - **Fixed growth rate** Say the attacker wishes to increase the storage by a whole GB within a single day (again, assuming the attacker is given the entire network throughput for negligible fees). 
In 10BPS, the network has a throughput of a bit over $4\cdot 10^{11}$ grams per day. Substituting into the bound and rearranging we get $budget \ge C \cdot growth^2/\text{mass}$. Substituting $C = 10^{12}$, $growth = 2 \cdot 10^7$ and $\text{mass} = 4\cdot 10^{11}$ we get that the required budget is at least $10^{15}$ dworks, which is $10$ million kaspa. 173 | 174 | Overall, the attack is either very slow or very expensive. How does this compare with the dust attack? That attack created 1185 UTXOs per block. In 10BPS, this means 50GB per day. Without the quadratic bound, the only limitation to such an attack is that each output must be at least $10,000$ dworks. In other words, assuming 10BPS, an attacker could increase the storage by 50GB in a single day for 100,000 kaspa. The computation above shows that, with a budget of 100,000 kaspa, it would take *750 years* to waste 50GB. Conversely, wasting 50GB in a single day would require a budget of roughly $25$ *billion* kaspa. 175 | 176 | We can also use the bound to provide an absolute worst-case upper bound on the ***organic*** growth of the UTXO set. Assume for simplicity the total circulation is $20$ billion kaspa. In 10BPS, the daily mass capacity is about $4\cdot 10^{11}$. The quadratic bound implies that in $d$ days the storage can increase by at most $460\sqrt{d}$ gigabytes. This allows us to bound the storage increase as a function of the time that passed since the mass policy was implemented. During the first day, the storage can increase by at most one terabyte. During the first year: at most $10$ terabytes. During the first ten years: at most $25$ terabytes. Reaching $100$ terabytes will require almost 130 years, and $1$ petabyte will not be reached before 13 thousand years. This is a *very* mild growth considering it bounds even the worst scenario possible: all Kaspa holders joining forces to increase the storage as much as possible, managing to apply the best strategy to do so. 177 | 178 | # Quality of service 179 | 180 | We discuss the implications of storage mass on common everyday transactions. 181 | 182 | First consider a transaction $\text{tx}$ with a single $100$ kaspa input, and two near-equal outputs of values $\sim 50$ kaspa. We can compute that 183 | $$\text{storage mass}(\text{tx}) \approx C\cdot \left(\frac{2}{50 \cdot 10^8} - \frac{1}{100 \cdot 10^8}\right) = 300$$ 184 | 185 | In contrast, one can check that $\text{compute mass}(\text{tx}) \ge 2000$, so we see that the total mass is unchanged, despite the fact that $\text{tx}$ is a relatively low-value transaction with more outputs than inputs. 186 | 187 | More generally, we see that the `storage_mass` becomes dominant only when relatively small outputs emerge (where the exact threshold is inversely proportional to the total budget of the transaction). Still, even in everyday use, we see that small outputs can emerge. Small UTXOs commonly appear as change or as micropayments, and however we set $C$, we should account for the possibility that a standard payment from a standard wallet could exceed this threshold, affecting the QoS of everyday users. In section [Wallet algorithm](#wallet-algorithm) we will show how to compose transactions for sending arbitrarily small values, and in section [Micropayments](#micropayments) we will discuss strategies for mitigating this cost altogether. 188 | 189 | 190 | ### Compounding transactions 191 | A transaction $\text{tx}$ is *compounding* if $|O| \le |I|$ and all values in $O$ are equal, namely $=\text{budget}/|O|$ (most commonly, we have that $|O|=1$). 
Since all values in $O$ are the same we have that $H(O) = A(O) = \text{budget}/|O|$ and we get that 192 | $$\text{storage mass}(\text{tx}) = C\cdot\left(\frac{|O|}{A(O)} - \frac{|I|}{A(I)}\right)^+ = C\cdot\left(\frac{|O|^2 - |I|^2}{\text{budget}}\right)^+ = 0\text{.}$$ 193 | Hence, compounding several outputs into an *equal or smaller* number of outputs *of equal value* will never incur storage mass. This is true *regardless of the magnitude* of the output values. 194 | 195 | It is worth deliberating a bit on how fees affect this phenomenon. The presence of a fee modifies the mass equation as so: 196 | $\text{storage mass}\left(\text{tx}\right)=C\cdot\left(\frac{\left|O\right|^{2}}{\text{budget}-\text{fee}}-\frac{\left|I\right|^{2}}{\text{budget}}\right)$, and after some rearrangement, one can see that the storage mass becomes positive if and only if $\left(1-\left(\frac{\left|O\right|}{\left|I\right|}\right)^{2}\right)<\frac{\text{fee}}{\text{budget}}$. It is nice to notice that the condition above completely depends on the number of inputs and outputs and the fee-to-budget ratio, not the actual values or even $C$. However, in scenarios where the storage mass is positive, its actual value does depend on all of these quantities. The first example that jumps to the eye is the case $|I|=|O|$, where clearly *any* positive fee will beget positive storage mass. However, one can see numerically that unless the values are very small and/or the fee is very large, the storage mass will still be smaller than the computation mass. 197 | 198 | In the case $|O| = 1$ the condition becomes $\left(1-\frac{1}{\left|I\right|^{2}}\right)<\frac{\text{fee}}{\text{budget}}$. 199 | So even for $|I|=2$, the storage mass vanishes unless the fee is at least three-quarters of the entire value. 200 | 201 | The overall takeaway of the analysis is that compounding is treated "nicely" by the storage mass function, even in the presence of fees. 202 | 203 | ### Exchanges and pools 204 | Exchanges and pools should immediately benefit from this proposal. Both do not usually deal with micropayments, and the currently deployed patch highly limits their ability to submit many transactions at times of increased load. Also in the long term, an exchange can be considered a highly active wallet whose deposit and withdrawal volumes are roughly the same. That is, the values of deposits are distributed roughly the same as those of withdrawals. By matching inputs with outputs of the same magnitude, exchanges can keep the storage mass low even in a future where typical transactions have low (say, sub-kaspa) values. 205 | 206 | Similarly, pools can remove most storage mass by setting the cashout limit to be larger than a typical coinbase output. 207 | 208 | ### Micropayments 209 | A comprehensive solution must also consider scenarios where everyday users wish to spend very small values (much smaller than $0.1$ kaspa). We discuss how this can be achieved, assuming all wallets hold at least 1 kaspa (i.e. $10^8$ dworks). 210 | 211 | Consider an eccentric millionaire with a penchant for ice cream. Every morning, our hero grabs a wallet containing a single UTXO of at least $1000$ kaspa, and goes to the stand to buy a vanilla cone for a comfortable price of $0.1$ kaspa. How should the millionaire pay for her purchase? 
212 | 213 | Consider a transaction $\text{tx}$ with a single input of value $V$ and two outputs of values $v$ and $V-v$ where $v\ll V$, then it follows that 214 | $$\text{storage mass}\left(\text{tx}\right)\approx C\cdot\left(\frac{1}{V-v}+\frac{1}{v}-\frac{1}{V}\right)\approx\frac{C}{v}\text{.}$$ 215 | In particular, in order to pay the vendor $0.1$ kaspa, and the change back to himself, the millionaire would have to create a transaction whose mass is around $100,000$ grams. This is one-fifth of the capacity of a block. That is, the fees for this transaction would be 40 times more expensive than "standard" transactions (whose storage mass is lower than their computation mass). Paying as much for a single transaction might be excusable as a one-time last resort, but users would not (and should not) agree to pay it daily. 216 | 217 | The key point is in realizing that the merchant selling the ice cream does not keep such small amounts indefinitely, but rather will compound them eventually (say on a daily basis). However the future actions of the merchant are unknown to the system, and the action is locally indistinguishable from the actions of a deliberate dust attacker. In an account-based model, such a transaction would merely appear as a transfer of a small value between significantly larger accounts. Essentially, the account model embeds the future compounding of the payment into the local operation. 218 | 219 | 221 | 222 | We argue that this exact behavior can be emulated also in the UTXO model by creating a mutually signed transaction. 223 | The vendor and the millionaire create a transaction together with two $1000$ kaspa inputs, one spent by the vendor and the other spent by the millionaire, and two outputs, one paying $1000.1$ kaspa to the vendor and the other paying $999.9$ kaspa to the millionaire. This facilitates the payment while removing the storage mass costs. Note that the UTXO available to the merchant need not exactly match in value to the millionaire's wallet. Even using a UTXO of $10$ kaspa, the outputs are $999.9$ and $10.1$. Since the smallest output is much larger, the mass would turn out much smaller. 224 | 225 | 226 | 227 | The downside to this solution is that the merchant must constantly have a hot wallet available and cooperate with the customer to create a mutually signed transaction. In a following KIP, we will specify *auto-compounding* wallet addresses, where UTXOs owned by such addresses will allow anyone to add to their balance without the owner of the UTXO having to sign it. Among other applications, this mechanism will allow the millionaire to purchase ice cream as described above, using her wallet alone. 228 | 229 | ### Wallet algorithm 230 | In this section we provide an optimal algorithm for making a payment of a given value using a given set of UTXOs. The algorithm is "optimal" in the sense that it minimizes the number of transactions required to make this payment and the overall mass they consume. 231 | 232 | Disclaimer: the algorithm described does not cover all possible edge-cases (especially fee-related aspects), and is brought here as a guideline to the path which should be taken. With the help of the community we hope to soon publish a comprehensive standalone reference implementation (in the form of a Python Jupyter notebook), which will address the need for a more accurate reference. 233 | 234 | Denote $M$ to be the maximal storage mass a single transaction could have. 
The value $M$ can be either the common mempool bound (currently set to $100,000$ grams), or user specified. Additionally, assume there is a conversion ratio $r$ of mass to fee. The value of $r$ is usually obtained by the wallet software by monitoring the network usage. Hence a transaction of mass $m$ will pay a fee of $rm$. We let $F=rM$ denote the fee of a maximally heavy transaction. 235 | 236 | Say the user desires to make a payment with value $P$ and that we already collected sufficient inputs $I$ such that $\sum_{v \in I}v \ge P + F$. The algorithm presupposes that a transaction composed of $I$ and $O = \lbrace P, \sum_{v \in I}v - P - F\rbrace$ has $\text{compute mass} \le M$ but $\text{storage mass} > M$. Otherwise, if the compute mass is too large, it should first be reduced using the current methods of compounding, i.e., by building a transaction tree. Recall that such compounding transactions will never have storage mass, so there is never a need to solve both objectives in parallel. Eventually we arrive at a root transaction where the compute mass is low enough, at which point the storage mass can be dealt with if needed. 237 | 238 | **Single step optimization** We begin by solving a single step of the process. The question we ask is “Given a mass bound $M$ and a set of inputs $I$, what is the minimum payment value possible without surpassing $M$?”. Or, in more mathematical terms, “What is the maximum asymmetry we can create between the 2 outputs given the constraints?”. Denote by $N$ the negative component of the storage mass as described by the relaxed formula. Note that $I$ and the number of outputs (which is known to be 2 in this case) are sufficient for calculating this part. Let $T=\sum_{v \in I}v - F$ denote the total outputs value. We need to solve the following equation: $M = \frac{C}{T\alpha} + \frac{C}{T(1-\alpha)} - N$, where $\alpha \in (0, 1)$. Reorganizing terms we get $(M+N)T/C = 1/\alpha + 1/(1-\alpha)$. Let $D = (M+N)T/C$. Reorganizing terms further we arrive at the quadratic equation $D\alpha^2 - D\alpha + 1 = 0$, whose solutions are $\alpha = (D \pm \sqrt{D^2 - 4D})/(2D)$. Note that by the symmetry between $\alpha$ and $(1-\alpha)$, both solutions of the equation essentially give the same answer. 239 | 240 | **Iterative process** Using this single step optimization we now describe an iterative algorithm for composing the chain of transactions required for making a micropayment: 241 | 242 | ```python 243 | def build_txs(inputs, payment): 244 |     txs = [] 245 |     while storage_mass(I=inputs, O=[payment, sum(inputs) - payment - F]) > M: 246 |         T = sum(inputs) - F 247 |         N = negative_mass(inputs, 2) 248 |         D = (M + N) * T / C 249 |         alpha = (D - (D * D - 4 * D) ** 0.5) / (2 * D)  # single step optimization, taking the smaller root 250 |         outputs = [int(alpha * T) + 1, T - int(alpha * T) - 1]  # round the small output up so the realized mass does not exceed M 251 |         txs.append((inputs, outputs)) 252 |         inputs = outputs 253 |     txs.append((inputs, [payment, sum(inputs) - payment - F])) 254 |     return txs 255 | ``` 256 | 257 | **Remarks** 258 | - In all iterations of the while loop (except maybe for the first), $|I|=|O|=2$ 259 | - Without the relaxed formula, which uses the harmonic mean over the inputs (in the 2:2 case), the loop would not converge: the arithmetic averaging over the inputs would yield the same $N$ value over and over 260 | - The initial inputs should contain sufficient slack to cover all fees along the process. 
The value returned by the initial call to `storage_mass` is a good estimate for the overall mass required and hence can be used to estimate the overall fee. 261 | - In all intermediate transactions built within the loop, the recipient for both outputs should be the change address of the *sender* 262 | - In the final transaction built, it is possible that the actual fee can be decreased below $F$ depending on the final mass 263 | 264 | # Implementation details 265 | 266 | ### Consensus implementation 267 | 268 | Recall that the consensus implementation required a relatively complex [design](#consensus-hardfork). Here we elaborate on this subtlety and how it was solved by slightly changing the structure of a transaction. 269 | 270 | **Background** The block validation in Kaspa is divided into three phases: 271 | - Header validation: the block header has the required form, points to known blocks, satisfies the difficulty target, etc. 272 | - Data validation: the block body has the required form, as do all transactions therein, and in particular it satisfies the mass limit 273 | - Transaction validation: all transactions have valid signatures and spend existing UTXOs. 274 | Perhaps surprisingly, a block only has to pass the first two phases to be considered valid. The reason is that the third part of validation does not rely on the block data alone, but requires us to compute the UTXO set from the point of view of that block. For efficiency reasons, we only want to compute the UTXO set for selected-chain blocks. This tension is resolved by having this "intermediate status" where a block could be perfectly valid, but disqualified from being in the selected-chain. In particular, future blocks pointing at this block will process any valid transactions that appear therein (otherwise they will also be disqualified from the selected-chain). 275 | 276 | **The Problem** The values of UTXOs are not explicitly written inside the transaction but are rather recovered from the UTXO set. So in order to compute the storage mass, we must compute the UTXO state of the block. This contradicts our ambition to validate the block during the data validation phase while avoiding having to compute its UTXO state. 277 | 278 | **Solution** We adapt an idea we already used for fixing an issue with the `sig_op_count` opcode (see [4]). We require transactions to have an explicit `storage_mass` field. During the data validation phase, the client only verifies that the sum of masses (storage mass and compute mass tracked independently) does not exceed the block mass limit. Verifying that the stated mass values are consistent with the formula is only performed for chain blocks and deferred to the transaction validation phase. Transactions with false mass values are considered invalid, and blocks containing them are disqualified from the selected-chain. 279 | 280 | ### Tracking Masses Independently 281 | 282 | This KIP proposes that compute mass and storage mass be tracked independently, with each total checked against the block mass limit. To see why combining them into a single mass (e.g., via a `max` operation) would be inefficient, consider a transaction with `(compute mass | storage mass)` of `(490,000 | 1,000)` and another with `(1,000 | 490,000)`: under a combined mass they cannot fit together in a single block, even though they consume different kinds of resources on the node (one consumes a lot of compute and the other a lot of storage). 
Tracking each mass independently allows both transactions, which heavily consume __different__ resources, to be included in the same block. 283 | 284 | ### Current status 285 | This proposal is initially implemented in the Kaspa on Rust codebase ([PR](https://github.com/kaspanet/rusty-kaspa/pull/379)). It is currently applied at the mempool level for all networks (mainnet, testnet 10, testnet 11). It is also implemented on the consensus level, but only activated for testnet 11 (TN11). With the TN11 hardfork introduced in this [other PR](https://github.com/kaspanet/rusty-kaspa/pull/595), the current implementation uses the max of compute mass and storage mass in the transaction mass field, and the storage mass computation uses the relaxed version of the formula. The implementation should be updated shortly to reflect this final proposal. To avoid code clutter and multiple versions, we suggest hard-forking or restarting TN11 with the fixed rules. 286 | 287 | # References 288 | [1] “A Very Dusty Yom-Kippur (dust attack post-mortem)” https://medium.com/@shai.wyborski/a-very-dusty-yom-kippur-dust-attack-post-mortem-faa11804e37 289 | 290 | [2] Kaspad go-lang dust patch: https://github.com/kaspanet/kaspad/pull/2244 291 | 292 | [3] Unpublished paper on state growth regulation in permissionless systems, this reference will be updated once the paper is published online. 293 | 294 | [4] “Kaspa Security Patch and Hard Fork — September 2022” https://medium.com/@michaelsuttonil/kaspa-security-patch-and-hard-fork-september-2022-12da617b0094 295 | -------------------------------------------------------------------------------- /kip-0010.md: -------------------------------------------------------------------------------- 1 | ``` 2 |   KIP: 10 3 |   Layer: Consensus, Script Engine 4 |   Title: New Transaction Opcodes for Enhanced Script Functionality 5 |   Authors: Maxim Biryukov (@biryukovmaxim), 6 | Ori Newman 7 |   Status: proposed, implemented in the rust codebase and activated in testnet 11 8 | ``` 9 | 10 | ## Abstract 11 | 12 | This KIP introduces transaction introspection opcodes and enhanced arithmetic capabilities to the Kaspa scripting language. The primary additions include opcodes for querying transaction metadata, input/output properties, and support for 8-byte integer arithmetic operations. These enhancements enable more sophisticated script conditions and use cases, particularly in support of mutual transactions as discussed in KIP-9. 13 | ## Motivation 14 | 15 | This proposal addresses the need for more flexible and powerful scripting capabilities. The new opcodes allow scripts to access transaction data directly and perform calculations with larger numbers, enabling the implementation of various advanced transaction types. 16 | The introduction of these features will: 17 | 18 | 1. Enable more sophisticated smart contracts and conditional spending scenarios. 19 | 2. Support the implementation of mutual transactions as discussed in KIP-9. 20 | 3. Enable precise calculations with larger numeric values through 8-byte integer support. 21 | 22 | ## Specification 23 | 24 | ### 1. New Opcodes 25 | 26 | The following new opcodes are introduced to enhance script functionality: 27 | #### Transaction Level Opcodes: 28 | 29 | 1. `OpTxInputCount` (0xb3): Returns the total number of inputs in the transaction 30 | 2. `OpTxOutputCount` (0xb4): Returns the total number of outputs in the transaction 31 | 3. 
`OpTxInputIndex` (0xb9): Returns the index of the current input being validated 32 | 33 | #### Input/Output Query Opcodes: 34 | 1. `OpTxInputAmount` (0xbe): Returns the amount of the specified input 35 | 2. `OpTxInputSpk` (0xbf): Returns the script public key of the specified input 36 | 3. `OpTxOutputAmount` (0xc2): Returns the amount of the specified output 37 | 4. `OpTxOutputSpk` (0xc3): Returns the script public key of the specified output 38 | 39 | ### 2. Enhanced Integer Support 40 | The proposal extends arithmetic operations to support 8-byte integers (previously limited to 4 bytes). This enhancement applies to: 41 | 1. Basic arithmetic operations 42 | 2. Numeric comparisons 43 | 3. Stack operations involving numbers 44 | 4. All opcodes that produce or consume numeric values 45 | 46 | ### 3. Opcode Behavior 47 | 48 | #### 3.1 Input/Output Query Opcodes 49 | 50 | - These opcodes expect an index parameter on the stack 51 | - The index must be within valid bounds (0 to n-1, where n is the number of inputs/outputs) 52 | - For amount opcodes, values are returned in sompis 53 | - Script public key values include both version and script bytes 54 | 55 | #### 3.2 Transaction Metadata Opcodes 56 | 57 | - Return values directly without requiring parameters 58 | - Values are pushed as minimal-encoded numbers 59 | - Always succeed if executed (assuming KIP-10 is active) 60 | 61 | ### 4. Consensus Changes 62 | 63 | - The implementation of these opcodes requires a hard fork, as they introduce new functionality to the scripting language. 64 | - All nodes must upgrade to support these new opcodes for the network to remain in consensus. 65 | - The activation of these opcodes should be scheduled for a specific daa score, allowing sufficient time for the network to upgrade. 66 | 67 | ### 5. Activation 68 | The features introduced in this KIP are activated based on DAA score: 69 | 1. Prior to activation: 70 | - New opcodes are treated as invalid 71 | - Arithmetic operations remain limited to 4 bytes 72 | 2. After activation: 73 | - All new opcodes become available 74 | - 8-byte arithmetic support is enabled 75 | - Existing scripts continue to function as before 76 | 77 | ### 6. Reserved Opcodes 78 | The following opcodes are reserved for future expansion: 79 | - OpTxVersion (0xb2) 80 | - OpTxLockTime (0xb5) 81 | - OpTxSubnetId (0xb6) 82 | - OpTxGas (0xb7) 83 | - OpTxPayload (0xb8) 84 | - OpOutpointTxId (0xba) 85 | - OpOutpointIndex (0xbb) 86 | - OpTxInputScriptSig (0xbc) 87 | - OpTxInputSeq (0xbd) 88 | - OpTxInputBlockDaaScore (0xc0) 89 | - OpTxInputIsCoinbase (0xc1) 90 | 91 | ## Rationale 92 | 93 | The enhanced opcodes and arithmetic capabilities address several key requirements: 94 | 95 | 1. Transaction Introspection: Scripts can now examine and validate transaction properties directly, enabling complex conditional logic 96 | 2. Larger Number Support: 8-byte integers allow for precise calculations with large values, essential for financial operations 97 | 3. Future Extensibility: Reserved opcodes provide a clear path for future enhancements 98 | 4. Backward Compatibility: The activation mechanism ensures a smooth network upgrade 99 | 100 | These opcodes are designed to work within the existing P2SH framework, maintaining compatibility with current address types while significantly expanding the possibilities for script design. 101 | 102 | ## Backwards Compatibility 103 | 104 | This proposal requires a hard fork, as it introduces new opcodes to the scripting language. 
Older software will require an update to support these new features. Existing scripts and addresses remain valid, but cannot use the new functionality without being updated. 105 | 106 | ## Reference Implementation 107 | 108 | A reference implementation of the new opcodes and example usage can be found in the following pull request to the rusty-kaspa repository: 109 | 110 | [https://github.com/kaspanet/rusty-kaspa/pull/487](https://github.com/kaspanet/rusty-kaspa/pull/487) 111 | 112 | 113 | ## Enabling Micropayments within KIP-9 Constraints 114 | KIP-9 introduced the 'storage_mass' formula to mitigate UTXO bloat. This formula, storage_mass(tx) = C * (|O| / H(O) - |I| / A(I))^+, penalizes transactions that create many small-value outputs, where '|O|' is the number of outputs, '|I|' is the number of inputs, H(O) is the harmonic mean of the output values, A(I) is the arithmetic mean of the input values, and 'C' is a constant. 115 | 116 | This formula reflects the principle that a transaction should be charged for the storage its outputs require, with the harmonic mean making the formula highly sensitive to small output values. Standard micropayment approaches that involve numerous tiny UTXOs become impractical due to the increased 'storage_mass'. 117 | 118 | Relaxing 'storage_mass' is possible by adding inputs to transactions, which increases the arithmetic mean of the inputs A(I), thus reducing the overall storage mass. However, this traditionally requires coordination and signatures from multiple parties. 119 | 120 | KIP-10 introduces new opcodes that allow simulating account-based model without needing new signatures when value is sent to an address. By providing introspection capabilities, scripts can enforce rules that govern how UTXOs are spent based on their value, destination address, and other transaction properties 121 | 122 | ## Example Usage 123 | 124 | To illustrate the practical application of the new opcodes, we present three scenarios: a threshold scenario, a shared secret scenario, and a mining pool payment scenario. 125 | 126 | ### 1. Threshold Scenario 127 | 128 | ``` 129 | OP_IF 130 | OP_CHECKSIG 131 | OP_ELSE 132 | OP_TXINPUTINDEX OP_TXINPUTSPK OP_TXINPUTINDEX OP_TXOUTPUTSPK OP_EQUALVERIFY 133 | OP_TXINPUTINDEX OP_TXOUTPUTAMOUNT 134 | OP_SUB 135 | OP_TXINPUTINDEX OP_TXINPUTAMOUNT 136 | OP_GREATERTHANOREQUAL 137 | OP_ENDIF 138 | ``` 139 | 140 | This scenario demonstrates a script that allows for two types of spending conditions: 141 | 142 | 1. Owner spending: The owner can spend the UTXO by providing a valid signature. 143 | 2. Borrower spending: Anyone can spend the UTXO if they create an output with a value greater than the input by a specified threshold amount, sent to the same script. 144 | How it Works and Benefits: 145 | 146 | This creates an "additive-only spending" pattern where UTXOs can only be spent by "increasing their value" (that is, sending to an output with the same address as the UTXO with increased value), effectively preventing value extraction without owner authorization. 
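As a plain-language restatement of the borrower condition described above (the benefits are listed next), the borrower branch is intended to enforce a predicate along the lines of the following Python sketch. This is illustrative only, not script-engine semantics: the field names `script_public_key` and `amount`, the `threshold` parameter, and the `idx` argument (playing the role of `OpTxInputIndex`) are assumptions of the sketch.

```python
def borrower_branch_ok(tx, idx, threshold):
    # the output at the same index must pay back to the exact same script...
    same_script = tx.outputs[idx].script_public_key == tx.inputs[idx].script_public_key
    # ...and must exceed the input value by at least the agreed threshold
    value_increased = tx.outputs[idx].amount >= tx.inputs[idx].amount + threshold
    return same_script and value_increased
```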
147 | 148 | #### How it Works and Benefits 149 | * Uses `OpTxInputAmount` and `OpTxOutputAmount` to verify the threshold condition 150 | * Aligns with KIP-9's goal of controlling UTXO growth through value-based constraints 151 | * Enables automated services that can "borrow" funds by adding value 152 | * Provides spam protection through the threshold requirement while incentivizing UTXO value growth 153 | 154 | #### Mining Pool Application 155 | A practical application of the threshold scenario is in mining pool operations. Currently, pools must accumulate coinbase UTXOs and make less frequent payouts to mitigate KIP-9 mass penalties when outputs exceed inputs. 156 | With KIP-10 threshold scripts, pools can: 157 | 158 | 1. Have participants use threshold-based P2SH addresses 159 | 2. Make payouts even with a single coinbase input by "borrowing" participants' UTXOs 160 | 3. Create efficient M:N transactions combining multiple inputs for multiple participant payouts 161 | 4. Maintain frequent payouts without mass penalties 162 | 5. Allow participants to retain full control of their funds while enabling pool operations 163 | 164 | Concept: 165 | 1. Pool Operation: The pool manages a KIP-10 compatible P2SH address for each participant, ensuring each address always has at least one UTXO. 166 | 2. Payout Process: When a block is mined, the pool can efficiently distribute rewards using a single transaction with multiple inputs and outputs. 167 | 168 | #### Related Work: Additive Addresses 169 | While this KIP focuses on transaction introspection opcodes, it's worth noting a related but separate feature being discussed for Kaspa: additive addresses. Additive addresses would allow for auto-compounding behavior where UTXOs can be spent by adding to their value, subject to certain constraints. 170 | 171 | The key distinction is that additive addresses can be implemented without consensus changes, using only the existing P2SH mechanism: 172 | 173 | 1. A public key and threshold value can be combined into a standard script that enforces additive-only spending rules 174 | 2. This script can be hashed to create a P2SH script public key 175 | 3. The reverse operation (extracting the public key and threshold from the P2SH hash) is impossible by design, as P2SH only stores the hash of the script 176 | 177 | While additive addresses complement the functionality provided by KIP-10's introspection opcodes, they serve different purposes and will be proposed separately. The introspection opcodes in this KIP provide general-purpose transaction inspection capabilities, while additive addresses offer a specific optimization for auto-compounding use cases. 178 | 179 | ### 2. Shared Secret Scenario 180 | 181 | ``` 182 | OP_IF 183 | OP_CHECKSIG 184 | OP_ELSE 185 | OP_DUP OP_EQUALVERIFY 186 | OP_CHECKSIGVERIFY 187 | OP_TXINPUTINDEX OP_TXINPUTSPK OP_TXINPUTINDEX OP_TXOUTPUTSPK OP_EQUALVERIFY 188 | OP_TXINPUTINDEX OP_TXOUTPUTAMOUNT OP_TXINPUTINDEX OP_TXINPUTAMOUNT OP_GREATERTHANOREQUAL 189 | OP_ENDIF 190 | ``` 191 | 192 | This scenario demonstrates a more complex script that allows for two types of spending: 193 | 194 | 1. Owner spending: The owner can spend the UTXO by providing a valid signature. 195 | 2. *Authorized Borrower Spending*: Someone can spend the UTXO only if they: 196 | 1. Know a pre-defined secret (implemented as a keypair and verified with a signature) 197 | 2. Create outputs matching or exceeding input values 198 | 3. 
Send funds back to the same script address 199 | 200 | #### How it Works and Benefits 201 | * Uses `OpTxInputSpk` and `OpTxOutputSpk` to enforce script and value conditions 202 | * Adds authentication layer through shared secret verification 203 | * Limits access to specific authorized parties while maintaining automation benefits 204 | 205 | ### Reference Implementations 206 | 207 | Implementations of these scenarios and additional examples can be found in the [rusty-kaspa repository](https://github.com/biryukovmaxim/rusty-kaspa/blob/kip-10-mutual-tx/crypto/txscript/examples/kip-10.rs) 208 | 209 | ## Security Considerations 210 | 211 | 1. Increased complexity in transaction validation, requiring careful implementation and testing. 212 | 2. Potential for resource consumption attacks if scripts using these opcodes are not properly limited. 213 | 3. Implications for transaction caching and optimizations, as scripts may now depend on broader transaction context. 214 | 4. Potential privacy implications of allowing scripts to access more transaction data. 215 | 216 | Implementers should be aware of these considerations and implement appropriate safeguards, such as script size and complexity limits, to mitigate potential risks. 217 | 218 | ## References 219 | 220 | 1. KIP-9: Extended mass formula for mitigating state bloat 221 | [https://github.com/kaspanet/kips/blob/master/kip-0009.md](https://github.com/kaspanet/kips/blob/master/kip-0009.md) 222 | 223 | 2. Auto-Compounding Additive Addresses Discussion 224 | [https://research.kas.pa/t/auto-compounding-additive-addresses-kip10-draft/168](https://research.kas.pa/t/auto-compounding-additive-addresses-kip10-draft/168) 225 | 226 | 3. Micropayments Discussion (KIP-9 Follow-up) 227 | [https://research.kas.pa/t/micropayments/20](https://research.kas.pa/t/micropayments/20) 228 | 229 | 4. Bitcoin Cash CHIP-2021-02: Native Introspection Opcodes 230 | [https://gitlab.com/GeneralProtocols/research/chips/-/blob/master/CHIP-2021-02-Add-Native-Introspection-Opcodes.md](https://gitlab.com/GeneralProtocols/research/chips/-/blob/master/CHIP-2021-02-Add-Native-Introspection-Opcodes.md) 231 | 232 | This proposal draws significant inspiration from BCH's implementation of transaction introspection opcodes. The BCH implementation demonstrated the viability and benefits of native introspection over covenant workarounds. While Kaspa's implementation differs in some details due to its unique architecture and requirements, the core principles and many design decisions were informed by BCH's successful deployment of these features. We appreciate the extensive research and documentation provided by the BCH community in CHIP-2021-02. 233 | 234 | 5. 
BCH Implementation Reference 235 | - Bitcoin Cash Node implementation: [BCHN MR 1208](https://gitlab.com/bitcoin-cash-node/bitcoin-cash-node/-/merge_requests/1208) 236 | - Test cases: [BCHN Native Introspection Tests](https://gitlab.com/bitcoin-cash-node/bitcoin-cash-node/-/blob/6fff8c761fda0ad15c9752d02db31aa65d58170f/src/test/native_introspection_tests.cpp) -------------------------------------------------------------------------------- /kip-0013.md: -------------------------------------------------------------------------------- 1 | ``` 2 |   KIP: 13 3 |   Layer: Consensus (hard fork), Block Size 4 |   Title: Transient Storage Handling 5 |   Author: Michael Sutton 6 | coderofstuff 7 |   Status: draft 8 | ``` 9 | 10 | We propose tracking the transient storage consumption of transactions as a `mass` and bounding such consumption to a reasonable size for node operators. In the Kaspa jargon, the word `mass` is used for sizes that are used to limit transaction throughput due to their externality. 11 | 12 | # Motivation 13 | 14 | Since KIP9, transactions consume two types of mass: compute mass and storage mass. These masses are used to limit the consumption of a particular resource per block (compute and persistent storage, respectively). However, neither of these directly limits the block size in bytes. We need a way to limit block byte sizes directly in order to gain finer control over the worst-case storage and bandwidth used by a node. 15 | 16 | ## Context 17 | 18 | Payloads have been enabled in Testnet 11, which is running at 10 blocks per second (BPS). Payload bytes are charged at a gram per byte. At 10 BPS and with the current block mass limit of `500,000` grams, through the use of massive payloads, it's possible that storage within the pruning period can consume close to 1TB of data. To demonstrate this, we use the following calculation: 19 | 20 | ``` 21 | // Given: 22 | block_mass_limit = 500,000 23 | testnet_11_pruning_depth = 1,119,290 24 | testnet_11_finality_depth = 432,000 25 | bytes_per_gb = 1,000,000,000 26 | ``` 27 | 28 | ``` 29 | // Assuming every gram of block mass corresponds to exactly 1 byte: 30 | worst_case_usage 31 | = ((testnet_11_pruning_depth + testnet_11_finality_depth) * block_mass_limit) / bytes_per_gb; 32 | = ((1,119,290 + 432,000) * 500,000) / 1,000,000,000 33 | = 775 GB 34 | ``` 35 | 36 | Requiring node operators and common users to have close to 1TB of free disk space to run a node is a tall ask. We must consider reducing this worst-case usage. 37 | 38 | ## Typical Transactions 39 | 40 | These are the most common kinds of transactions that network users make. 41 | 42 | The consumption of a typical 1:2 transaction (1 input, 2 outputs) looks like this: 43 | 44 | |Field|Data| 45 | | --- | --- | 46 | | Size | 316 bytes | 47 | | Actual Compute Mass | 2036 grams | 48 | | TPB* | 245 | 49 | | Total Block Size** | 77,420 bytes | 50 | 51 | *The transactions per block if it were completely filled with transactions like this only 52 | 53 | **Total Block Size consumed if the block was filled with transactions like this only 54 | 55 | Similarly, the consumption of a 2:2 transaction (2 inputs, 2 outputs) looks like this: 56 | 57 | |Field|Data| 58 | | --- | --- | 59 | | Size | 434 bytes | 60 | | Actual Compute Mass | 3154 grams | 61 | | TPB* | 158 | 62 | | Total Block Size** | 68,572 bytes | 63 | 64 | Finally, a typical KRC20 transaction is usually either a 1:1 or 1:2 transaction. Its signature length is roughly 261 bytes.
65 | 66 | |Field|Data| 67 | | --- | --- | 68 | | Size | 429 bytes | 69 | | Actual Compute Mass | 1819 grams | 70 | | TPB* | 274 | 71 | | Total Block Size** | 117,546 bytes | 72 | 73 | Plugging these block sizes into the exact max pruning length (pruning + finality depth) at 10 BPS, we get worst-case disk sizes of 120GB, 106GB and 182GB respectively. 74 | 75 | If we round this up to 125KB per block, we get a worst-case usage of 193GB, which is much more reasonable and accessible than the initially calculated 775GB worst-case requirement. 76 | 77 | With these figures in mind, we can set the goals to be as follows: 78 | 1. Keep the above costs for typical transactions (so we still fit those 150-250 TPB) 79 | 2. Limit block size to 125KB 80 | 81 | # Our Proposal 82 | 83 | ## Consensus Changes 84 | 85 | The `check_block_mass` implementation introduced in KIP9 checks whether the transactions within a block fit within the block mass limit. It tracks storage mass and compute mass totals independently, and ensures each total mass falls within the block mass limit. We propose the following changes: 86 | 87 | ### Change 1 - New Mass: Transient Storage Mass 88 | 89 | Introduce a new mass called `transient_storage_mass`. We will use this to limit the block size to 125KB as follows: 90 | - `transient_storage_mass = transaction_serialized_size(tx) * 4`. 91 | - The `4` above comes directly from trying to reduce the block size to 125KB (500,000 mass / 125,000 bytes = 4 mass / byte). 92 | 93 | ### Change 2 - Independent Block Mass Tracking 94 | Recall that KIP9 introduced tracking of `compute_mass` and `storage_mass` independently for block mass limits. This change adds `transient_storage_mass` as another mass to be tracked independently. 95 | 96 | ## Mempool Changes 97 | 98 | ### Change 1 - Fee Rate Keys 99 | 100 | The `calculated_transient_storage_mass` will be pre-computed and accessible within the mempool. Fee rate keys will be updated to include `calculated_transient_storage_mass` in their mass via the `max` operator. 101 | 102 | That is, the fee rate key is based on `mass` where `mass = max(compute_mass, storage_mass, transient_storage_mass)`. 103 | 104 | Optimizing the mempool transaction selection mechanism (tracking each mass independently somehow) is out of scope for this KIP. 105 | 106 | ### Change 2 - Validations 107 | 108 | Anywhere in the mempool where transaction compute mass is validated (such as against standard size), such mass will incorporate `transient_storage_mass` via the `max` operator. 109 | 110 | # Backwards compatibility 111 | Breaks consensus rules, requires a hardfork. 112 | -------------------------------------------------------------------------------- /kip-0014.md: -------------------------------------------------------------------------------- 1 | ``` 2 | KIP: 14 3 | Layer: Consensus (hard fork) 4 | Title: The Crescendo Hardfork 5 | Type: Consensus change (block rate, script engine) 6 | Author: Michael Sutton 7 | Comments-URI: https://research.kas.pa/t/crescendo-hardfork-discussion-thread/279 8 | created: 2025-01-21 9 | updated: 2025-03-04 10 | Status: implemented; pending hardfork 11 | ``` 12 | 13 | # Abstract 14 | This KIP proposes the implementation of the Crescendo Hardfork for the Kaspa network, detailing the consensus changes and transition strategy for various components.
The primary change in this KIP is the increase in Blocks Per Second (BPS) from 1 to 10, a significant adjustment with wide-ranging implications for node performance as well as storage and bandwidth requirements. This necessitates updates to many consensus parameters and, in some cases, a rethinking of existing mechanisms to ensure they remain efficient at the increased block rate (e.g., the sparse DAA window introduced in KIP-4). 15 | 16 | Additionally, this hardfork includes the activation of other significant improvements to Kaspa. Starting with KIP-9, which introduces a critical mechanism for managing and mitigating state bloat, thereby regulating persistent storage requirements. This is complemented by KIP-13, which regulates transient storage requirements for a node. Another major component activated in this hardfork is KIP-10, which introduces new introspection opcodes to Kaspa's script engine. Through such introspection, these opcodes enable the concept of covenants, allowing for advanced transaction controls, including the design of additive addresses that support microtransactions and complement KIP-9. 17 | 18 | This hardfork also marks the closure of KIP-1, the Kaspa Rust Rewrite. The performance improvements enabled by Rusty Kaspa (RK) provide the foundation necessary to support this upgrade, allowing the network to handle the increased demands of 10 BPS and beyond. 19 | 20 | # Motivation 21 | The Crescendo Hardfork is a proactive upgrade to the Kaspa network, made possible through RK. With TN11—a testnet operating at 10 BPS—running stably for over a year, the network has demonstrated its readiness for this transition. By increasing capacity and speed, this upgrade positions Kaspa to support anticipated demand from emerging technologies, including smart contract layers enabled via the ongoing based ZK bridge design [1]. 22 | 23 | All of the changes described in this KIP (excluding KIP-13 and KIP-15) have already been implemented and tested in TN11, providing valuable insights into their performance and stability. These results ensure that the proposed modifications are ready for deployment on the Kaspa mainnet. 24 | 25 | # Specification 26 | ## Consensus changes 27 | ### 1. BPS-related changes 28 | 29 | The following details the changes solely related to the bps increase. 30 | 31 | 1. **Increasing BPS from 1 to 10**: 32 | - This change is governed by a consensus parameter named `target_time_per_block` (milliseconds), which controls the expected time between blocks. To increase bps from 1 to 10, the `target_time_per_block` will be reduced from 1000 ms to 100 ms. 33 | - This adjustment will in turn cause the difficulty adjustment algorithm to reduce the difficulty by a factor of 10, thus accelerating block creation tenfold. Further details on this switch are provided below under "Transitioning strategy". 34 | 35 | 2. **Re-adjusting the Ghostdag K parameter**: 36 | - Reducing block time leads to a higher rate of parallel blocks. Consequently, the Ghostdag K parameter, which is a function of $2 \lambda D$ (where $\lambda$ is the block rate and $D$ is the a priori delay bound), must be recalibrated to maintain network security adhering to the Ghostdag formula (see eq. 1 from section 4.2 of the PHANTOM-GHOSTDAG paper [2]). 37 | - Setting $D=5, \delta=0.01$, the new Ghostdag K is recalculated to be 124 based on the Poisson tail cutoff therein. 38 | 39 | 3. 
**Scaling time-based consensus parameters**: 40 | - Several parameters conceptually defined by time duration but applied via block count must be scaled with the new bps: 41 | - **Finality Depth ($\phi$)**: Previously defined for a 24-hour duration at 1 bps (86,400 blocks), it will now correspond to a 12-hour duration at 10 bps (432,000 blocks). 42 | - **Merge Depth Bound ($M$)**: Defined for a 1-hour duration, it will now increase from 3600 blocks at 1 bps to 36,000 blocks at 10 bps. 43 | - **Pruning Depth**: Calculated as $\phi + 2M + 4KL + 2K + 2$ [3], where: 44 | - $\phi$: Finality Depth 45 | - $M$: Merge Depth Bound 46 | - $L$: Mergeset Size Limit (see below) 47 | - $K$: Ghostdag K 48 | - The pruning depth formula provides a lower bound, yet the actual pruning period can be set longer. Plugging in the scaled parameters, the lower bound is calculated to be 627,258 blocks, representing approximately ~17.4238 hours. We suggest rounding this up to 30 hours for simplicity and practical application. A 30-hour period is closer to the current mainnet pruning period (~51 hours) and aligns closely with the value used and benchmarked throughout TN11 (~31 hours). 49 | - **Coinbase Maturity**: Originally defined as 100 seconds or ~100 blocks at 1 bps, this will now correspond to 1000 blocks at 10 bps. 50 | 51 | 4. **Conservative scaling of performance-impacting parameters**: 52 | - **Max Block Parents**: Increased from 10 to 16. Based on continuous TN11 data, 16 remains well above the average number of DAG tips, ensuring that all tips are normally merged by subsequent blocks. 53 | - **Mergeset Size Limit ($L$)**: Increased from 180 to 248 ($2K$) to accommodate the higher bps while maintaining storage efficiency. 54 | 55 | 5. **Adjustments to the Coinbase reward mechanism**: 56 | - The scheme described below keeps the reward system and the emission schedule precisely intact by conceptually transferring the current reward per block to a *reward per second* with equal value. 57 | - Specifically, the reward table will continue to map from months to rewards, but the reward is now considered a per-second reward. To calculate the per-block reward, the per-second reward is divided by the bps. 58 | - Special care must be taken to correctly calculate the current emission month. Previously, the DAA score (essentially a block count) mapped directly to seconds since blocks were produced at a rate of 1 block per second. Post-hardfork, with 10 bps, the DAA score at activation must be used to maintain accurate second counting. 59 | - Two key values are used in the subsidy month calculation: 60 | - `deflationary_phase_daa_score`: A constant from the current consensus rules that marks the start of the deflationary phase. 61 | - `crescendo_activation_daa_score`: The DAA score at the time of the Crescendo activation, which will be set as part of the hardfork's implementation. 62 | - The following code depicts the required permanent change in subsidy month calculation. The returned `subsidy_month` value can then be used as before to extract the reward from the subsidy-by-month table. 
63 | 64 | ```rust 65 | // We define a year as 365.25 days and a month as 365.25 / 12 = 30.4375 66 | // SECONDS_PER_MONTH = 30.4375 * 24 * 60 * 60 67 | const SECONDS_PER_MONTH: u64 = 2629800; 68 | 69 | fn subsidy_month(daa_score: u64) -> u64 { 70 | if daa_score < crescendo_activation_daa_score { 71 | // Pre activation, we simply assume block count represents second units (since block per second = 1) 72 | return (daa_score - deflationary_phase_daa_score) / SECONDS_PER_MONTH; 73 | } 74 | 75 | // Else, count seconds differently before and after Crescendo activation 76 | let seconds_since_deflationary_phase_started = 77 | (crescendo_activation_daa_score - deflationary_phase_daa_score) + 78 | (daa_score - crescendo_activation_daa_score) / bps; 79 | return seconds_since_deflationary_phase_started / SECONDS_PER_MONTH; 80 | } 81 | ``` 82 | 83 | ### 2. Activation of earlier KIPs 84 | 85 | - **Activation of KIP-4: Sparse DAA and Median Time Windows**: 86 | - Transitioning to sparse Difficulty Adjustment Algorithm (DAA) and sparse Median Time (MT) windows while maintaining their previous durations (2641 seconds for DAA; 263 seconds for Median Time). 87 | - The size of these sparse windows (in blocks) is determined by dividing their durations by chosen sampling intervals. For DAA, we choose a sampling interval of 4 seconds, resulting in a window size of $\lceil 2641/4 \rceil = 661$. For MT, we choose a sampling interval of 10 seconds, resulting in $\lceil 263/10 \rceil = 27$. Notably, these window sizes are now independent of bps. 88 | - Sampling intervals are scaled by bps to calculate *block* sample rates: 89 | - `past_median_time_sample_rate = 100` (from `MEDIAN_TIME_SAMPLE_INTERVAL=10`). 90 | - `difficulty_adjustment_sample_rate = 40` (from `DIFFICULTY_WINDOW_SAMPLE_INTERVAL=4`). 91 | 92 | - **KIP-9 (Storage Mass)**: Introduces a storage mass formula to mitigate and regulate UTXO set growth in both organic and adversarial conditions. 93 | - **KIP-13 (Transient Storage Mass)**: Implements transient storage mass to regulate short-term storage usage. 94 | - **KIP-10 (Script Engine Enhancements)**: Introduces direct introspection within the script engine, enabling covenants and advanced transaction controls. 95 | - **KIP-15 (Recursive Canonical Transaction Ordering Commitment)**: Renames header field `AcceptedIDMerkleRoot` to `SequencingCommitment`—computed as `hash(SelectedParent.SequencingCommitment, AcceptedIDMerkleRoot)`, where `AcceptedIDMerkleRoot` is derived using the canonical consensus order of accepted transactions. This change secures transaction ordering for L2 networks—enabling designs like the Accepted Transactions Archival Node (ATAN) (Note: current implementation kept the original field name). 96 | 97 | 98 | ### 3. Additional changes 99 | 100 | The following details additional minor changes that do not warrant a separate KIP. 101 | 102 | #### Enabling transaction payloads 103 | 104 | This hardfork introduces support for arbitrary data in the `payload` field of native (non-coinbase) transactions. Native transactions, which represent the standard transaction type, now support payloads, while coinbase transactions retain their existing restricted format. Transactions already include a reserved byte array field named `payload`, which was previously required to remain empty for native transactions. This restriction is now lifted, enabling native transactions to carry arbitrary data. 
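As a rough sketch of this rule change (not the actual rusty-kaspa code; `Transaction`, `check_payload`, and `crescendo_activated` are illustrative names), the payload check before and after activation can be pictured as follows:

```rust
// Hedged sketch of the relaxed payload rule; names are illustrative and do not
// correspond to the actual rusty-kaspa API.

struct Transaction {
    payload: Vec<u8>,
    is_coinbase: bool,
}

fn check_payload(tx: &Transaction, crescendo_activated: bool) -> Result<(), String> {
    if tx.is_coinbase {
        // Coinbase payloads keep their existing, restricted format (validated elsewhere).
        return Ok(());
    }
    if !crescendo_activated && !tx.payload.is_empty() {
        // Pre-activation rule: native transactions must carry an empty payload.
        return Err("non-empty payload is not allowed before Crescendo activation".into());
    }
    // Post-activation: arbitrary payload bytes are allowed; their cost is regulated
    // by the transient storage mass introduced in KIP-13.
    Ok(())
}

fn main() {
    let tx = Transaction { payload: b"l2 data".to_vec(), is_coinbase: false };
    assert!(check_payload(&tx, false).is_err()); // rejected pre-activation
    assert!(check_payload(&tx, true).is_ok());   // accepted post-activation
}
```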
105 | 106 | To ensure proper handling, the `sighash` mechanism must be adapted to include the `payload` field, proving authorization over the payload. This is achieved by hashing the `payload` into a `payload_hash` and incorporating it into the overall `sighash`. For backwards compatibility, if the `payload` is empty, the zero hash is returned as the `payload_hash`. 107 | 108 | The `payload` field is already included in the transaction `hash` and `id` in current Kaspa implementations. However, other clients may have assumed this field is always empty and must verify their implementations to account for it. 109 | 110 | This change enables preliminary second-layer smart contract implementations, leveraging Kaspa for sequencing and data availability without settlement functionality yet. Abuse or spam risks are mitigated by the transient storage regulation introduced in KIP-13. This functionality has already been implemented in RK and activated for TN11 ([pull request](https://github.com/kaspanet/rusty-kaspa/pull/591)). 111 | 112 | #### Runtime sigop counting 113 | 114 | To address inefficiencies in static sigop counting, the script engine now counts sigops at runtime. For instance, in scripts supporting additive addresses (cf. KIP-10 for borrower spending details), the previous static scan for `SigVerify` opcodes penalized transactions by charging for sigops regardless of execution. This update tallies only executed sigops, reducing transaction mass and lowering fees, while preserving backward compatibility by allowing transactions to commit to a sigop count that meets or exceeds the runtime value. 115 | 116 | 117 | 118 | ## Transitioning strategy 119 | 120 | ### General activation strategy 121 | In Kaspa, the DAA score of a block typically determines when forking rules are activated. However, certain consensus changes can affect the DAA score of the block itself, resulting in circular logic. One notable example is the activation of KIP-4 (sparse DAA window), which modifies how the DAA score is calculated. To avoid this cycle, we propose using the DAA score of the block's selected parent to determine activation. Another scenario where the selected parent's score must be considered instead is the increase in Ghostdag K, since Ghostdag is computed before the DAA score is known. 122 | 123 | To simplify implementation, we suggest extending this method to all header-related changes, which includes all bps-related changes (excluding coinbase rewards). 124 | 125 | For KIPs 9, 10, and 13, as well as payload activation and coinbase reward changes (all block-body-related), we recommend using the usual, more straightforward approach of relying on the DAA score of the block itself. 126 | 127 | 128 | ### Handling difficulty adjustment during the transition 129 | 130 | When transitioning to 10 bps, a significant challenge arises because the DAA window post-activation spans both the 1 bps and 10 bps eras. Careful consideration is required to prevent an overly large decrease in difficulty caused by the slower block production rate at the beginning of the window. Furthermore, the adoption of KIP-4 introduces an additional layer of complexity, as we are shifting from a full DAA window to a sparse one. 131 | 132 | #### Proposed solution 133 | 134 | To address these challenges, the following approach is proposed: 135 | 136 | - **Reset the DAA window**: The DAA window should be reset at the activation point and should include only blocks mined post-activation. 
137 | - **Difficulty calculation for initial blocks**: 138 | - **Empty window**: When the window is empty (i.e., the selected parent was mined before activation), increase the difficulty target by a factor of 10, reflecting the new bps. This adjustment reduces the amount of work required to find a block by tenfold, resulting in a tenfold increase in block rate with the same hashrate. 139 | - **Partial window**: For blocks where the window contains some post-activation blocks but has not yet reached the minimum required size, use the selected parent's difficulty as is. 140 | - **Minimum window size**: The minimum window size should be set to a value in the range of 60–661 blocks, corresponding to approximately 4–44 minutes, to balance stability and responsiveness. 141 | 142 | 143 | ### Pruning point adjustment 144 | 145 | Upon activation, the pruning depth increases to accommodate the higher bps. However, the pruning point itself should not regress to meet the new depth immediately. Instead, it must transition gradually under the updated rules, remaining fixed at its current location until a point above it accumulates sufficient depth. This ensures pruning point monotonicity. 146 | 147 | #### Rigorous pruning point rules 148 | 149 | 1. Denote $B_n$ as the block at depth $n$ on its selected chain. Note that the definition of depth remains unchanged from current rules. 150 | 2. For every block $B$ mined post-activation, let $pre(B)$ denote the pruning point of its last chain ancestor mined pre-activation. 151 | 3. Denote the new pruning depth as $P$. 152 | 4. The pruning point for a post-activation block $B$, denoted $\pi(B)$, is determined by the following rule: $\pi(B) := \max(pre(B), B_P)$. 153 | 154 | # Acknowledgment 155 | 156 | I thank @coderofstuff for helping with the management of the hardfork detailed in this KIP, and all of Kaspa's core developers and researchers for their diligent work toward this massive endeavor. 157 | 158 | 159 | # References 160 | - [1] [L1<>L2 bridge design](https://research.kas.pa/c/l1-l2/11) 161 | - [2] [PHANTOM GHOSTDAG: A Scalable Generalization of Nakamoto Consensus](https://eprint.iacr.org/2018/104.pdf) 162 | - [3] [Prunality Analysis](https://github.com/kaspanet/docs/blob/main/Reference/prunality/Prunality.pdf) 163 | -------------------------------------------------------------------------------- /kip-0015.md: -------------------------------------------------------------------------------- 1 | ``` 2 | KIP: 15 3 | Layer: Consensus (hard fork) 4 | Title: Canonical Transaction Ordering and SelectedParent Accepted Transactions Commitment 5 | Type: Consensus change (block format) 6 | Author: Mike Zak , Ro Ma @reshmem 7 | Comments-URI: https://research.kas.pa/t/kip-15-discussion-thread/303 8 | created: 2025-02-01 9 | updated: 2025-02-23 10 | Status: proposed 11 | ``` 12 | 13 | # Motivation 14 | L2 networks on top of Kaspa rely on Kaspa for both consensus and data availability. 15 | In other words, the Kaspa L1 provides the list of accepted transactions and their order, 16 | while L2 interprets the transaction payloads and executes corresponding logic. 17 | In such cases the ordering of transaction acceptance in L1 has to be the ordering of transaction execution on L2. 18 | As such, the acceptance and ordering of transactions on L1 is of utmost importance to L2 applications. 19 | 20 | In addition, a Kaspa node prunes the transaction and header data after c. 52 hours ( 30 hours post-Crescendo ) 21 | of their publication. 
This is made possible by a UTXO-commitment posted in the Kaspa block header, which allows a new 22 | client to download the state of the network (its UTXO set) at some block B from an untrusted source, then validate 23 | it against B's UTXO-commitment. 24 | 25 | However, general-execution L2 networks cannot have a cryptographic commitment easily available on-DAG and verified 26 | by the L1 consensus layer. 27 | Creating such a commitment to a general-execution account-model state is a non-trivial problem requiring ZK-proofs. 28 | Kaspa has plans for supporting such commitments, posted to the DAG and verified by L1 opcodes, but the specifics of 29 | this design are still under discussion, with currently no clear timeline or complete understanding of the solution's 30 | properties. 31 | 32 | ## Accepted Transactions Archival Node 33 | As an interim solution, as well as in other cases where ZK-proofs are not viable for the L2 for any reason, a new 34 | type of archival node might be utilized: This node does not archive the DAG structure or block data; it only archives 35 | the accepted transactions' data, as well as their order of acceptance. For the sake of this document, we shall call 36 | such a node an Accepted Transactions Archival Node (ATAN). 37 | 38 | An ATAN would listen to VirtualSelectedParentChainChanged, noting the RpcAcceptedTransactionIds and extracting the 39 | transaction data from the (still un-pruned) block data. 40 | This way it is currently possible to collect and store all data required by the L2 network, assuming the 41 | ATAN has been online since the L2 network's launch, with only intermittent downtimes up to the size of the 42 | pruning window at a time. 43 | 44 | Bootstrapping a new ATAN from an existing, untrusted ATAN would require downloading the transactions' data and their 45 | ordering, as well as downloading and validating a cryptographic proof testifying to the above against a 46 | commitment commonly available to any pruning Kaspa node. 47 | 48 | A naive design could go along the following lines: 49 | ``` 50 | 1. Download the Selected Parent Chain block headers from tip up to the inception of L2 51 | 2. For each block in the Selected Parent Chain bottom-to-top: 52 | 1. Download the list of accepted transactions 53 | 2. Validate it against the block's AcceptedIDMerkleRoot 54 | ``` 55 | 56 | The above design has two faults: 57 | 1. There is much data in the block headers which is redundant for this process 58 | 2. Currently, AcceptedIDMerkleRoot does not commit to transaction ordering. (See more on this in the following section) 59 | 60 | ## AcceptedIDMerkleRoot sorting 61 | Currently, after collecting all transactions accepted by a block, a Kaspa node sorts these transactions according 62 | to their hash before calculating AcceptedIDMerkleRoot. 63 | 64 | This was done as an optimization, to facilitate future proofs of exclusion: 65 | A proof of exclusion in a non-sorted merkle tree requires the revelation of all the tree's nodes (`O(n)`), while 66 | a proof of exclusion in a sorted merkle tree only requires the revelation of a single branch within the 67 | tree (`O(log n)`), showing two adjacent nodes, one lesser and one greater than the node whose exclusion we wish to prove. 68 | 69 | As far as we know, there is currently no application on top of Kaspa using the above feature.
Additionally, a proof of exclusion from the `mergeset` of a specific chain block seems of little use, while proving exclusion over a period such as a day would require a linearly long proof showing exclusion from each and every chain block. 70 | We would also argue that the above optimization is of lesser importance than the ability to prove transaction acceptance order. 71 | 72 | # Our Proposal 73 | We propose a new block-header field named **SequencingCommitment** to replace **AcceptedIDMerkleRoot**. 74 | 75 | **SequencingCommitment** will be calculated as follows: 76 | ``` 77 | 1. **AcceptedIDMerkleRoot** = the root of a tree constructed from the block's AcceptanceData keeping canonical order 78 | (in other words - the same way AcceptedIDMerkleRoot was calculated up until now, but skipping the sort by hash) 79 | 2. **SequencingCommitment** = hash(*SelectedParent.SequencingCommitment*, *AcceptedIDMerkleRoot*). 80 | ``` 81 | The hashing algorithm that should be used is the same hashing algorithm used throughout Kaspa for merkle trees 82 | (currently blake2b). 83 | 84 | ## Accepted Transactions Archival Node Design 85 | Given the above changes in Kaspa, the following design for an ATAN can be proposed: 86 | As mentioned above, the node listens to VirtualSelectedParentChainChanged and stores for every chain block B: 87 | 1. B.TxList - The list of transactions accepted by B, in their order of acceptance. 88 | 2. B.SequencingCommitment. 89 | 90 | This should be stored starting from some block P, ordered bottom-to-top. 91 | P must be a pruning point, so that any untrusting client that has access to a Kaspa full node will be able 92 | to recognize it. 93 | P should be chosen as the most recent pruning point for which either all L2 transactions are in P's future, 94 | or there is some sort of commitment for the state of L2 in P's future. 95 | 96 | Note that the above is enough to provide a list of all Kaspa transactions and their order of acceptance to a trusting 97 | client, even if said client has no access to a Kaspa full node. 98 | 99 | A synced ATAN X can bootstrap an untrusting ATAN Y, assuming that Y does have access to a Kaspa full node, as follows: 100 | ``` 101 | 1. X sends to Y: P.Hash and P.SequencingCommitment. 102 | 2. Y verifies that it recognizes P as a pruning point. 103 | 3. For each chain block B from P.SelectedChild up to Tip: 104 | 1. X sends to Y: B.TxList and B.SequencingCommitment (excluding P.TxList) 105 | 2. Y does: 106 | 1. B.ExpectedAcceptedIDMerkleRoot = B.TxList.MerkleRoot() 107 | 2. B.ExpectedSequencingCommitment = hash(B.SelectedParent.SequencingCommitment, B.ExpectedAcceptedIDMerkleRoot) 108 | 3. Verify that B.ExpectedSequencingCommitment == B.SequencingCommitment 109 | ``` 110 | Note that we start from P.SelectedChild, because P.SequencingCommitment cannot be verified without its SelectedParent. 111 | 112 | # Sample Code 113 | The KIP-15 implementation can be found in the following PR: https://github.com/kaspanet/rusty-kaspa/pull/636 114 | 115 | # Backwards compatibility 116 | Breaks consensus rules, requires a hardfork. 117 | 118 | # Acknowledgements 119 | Thanks to Michael Sutton @michaelsutton for the discussions leading to this KIP. 120 | --------------------------------------------------------------------------------