docs: update docs for dyn and dyncall

0xPolygonMiden · Oct 23, 2024 · 898d4ef
1 parent b67fc8c
commit 898d4ef
Showing 8 changed files with 79 additions and 52 deletions.
diff --git a/docs/src/assets/design/decoder/decoder_dyn_operation.png b/docs/src/assets/design/decoder/decoder_dyn_operation.png
diff --git a/docs/src/assets/design/decoder/decoder_dyncall_operation.png b/docs/src/assets/design/decoder/decoder_dyncall_operation.png
diff --git a/docs/src/design/decoder/constraints.md b/docs/src/design/decoder/constraints.md
@@ -138,10 +138,10 @@ $$
 In the above, $a$ represents the address value in the decoder which corresponds to the hasher chiplet address at which the hasher was initialized (or the last absorption took place).  As such, $a + 7$ corresponds to the hasher chiplet address at which the result is returned.
 
 $$
-f_{ctrli} = f_{join} + f_{split} + f_{loop} + f_{dyn} + f_{call} \text{ | degree} = 5
+f_{ctrli} = f_{join} + f_{split} + f_{loop} + f_{call} \text{ | degree} = 5
 $$
 
-In the above, $f_{ctrli}$ is set to $1$ when a control flow operation that signifies the initialization of a control block is being executed on the VM.  Otherwise, it is set to $0$. An exception is made for the `SYSCALL` operation. Although it also signifies the initialization of a control block, it must additionally send a procedure access request to the [kernel ROM chiplet](../chiplets/kernel_rom.md) via the chiplets bus. Therefore, it is excluded from this flag and its communication with the chiplets bus is handled separately.
+In the above, $f_{ctrli}$ is set to $1$ when a control flow operation that signifies the initialization of a control block is being executed on the VM, provided that the operation makes no other concurrent request to the chiplets bus.  Otherwise, it is set to $0$. The `DYN`, `DYNCALL`, and `SYSCALL` operations are excluded from this flag: although they also initialize a control block, each of them issues an additional concurrent bus request, and so they are handled separately.
 
 $$
 d = \sum_{b=0}^6(b_i \cdot 2^i)
@@ -151,7 +151,7 @@ In the above, $d$ represents the opcode value of the opcode being executed on th
 
 Using the above variables, we define operation values as described below.
 
-When a control block initializer operation (`JOIN`, `SPLIT`, `LOOP`, `DYN`, `CALL`, `SYSCALL`) is executed, a new hasher is initialized and the contents of $h_0, ..., h_7$ are absorbed into the hasher. As mentioned above, the opcode value $d$ is populated in the second capacity resister via the $\alpha_5$ term.
+When a control block initializer operation (`JOIN`, `SPLIT`, `LOOP`, `CALL`) is executed, a new hasher is initialized and the contents of $h_0, ..., h_7$ are absorbed into the hasher. As mentioned above, the opcode value $d$ is populated in the second capacity register via the $\alpha_5$ term.
 
 $$
 u_{ctrli} = f_{ctrli} \cdot (h_{init} + \alpha_5 \cdot d) \text{ | degree} = 6
@@ -171,6 +171,22 @@ $$
 
 The above value sends both the hash initialization request and the kernel procedure access request to the chiplets bus when the `SYSCALL` operation is executed.
 
+Similar to `SYSCALL`, `DYN` and `DYNCALL` are handled separately, since in addition to communicating with the hash chiplet they must also issue a memory read operation for the hash of the procedure being called. 
+
+$$
+h_{dynordyncall} = \alpha_0 + \alpha_1 \cdot m_{bp} + \alpha_2 \cdot a' 
+$$
+
+$$
+m_{dynordyncall} = \alpha_0 + \alpha_1 \cdot m_{read} + \alpha_2 \cdot ctx + \alpha_3 \cdot s_0 + \alpha_4 \cdot clk + <[\alpha_5, \dots, \alpha_8], h[0..3]>
+$$
+
+$$
+u_{dynordyncall} = (f_{dyn} + f_{dyncall}) (h_{dynordyncall} \cdot m_{dynordyncall})
+$$
+
+In the above, $h_{dynordyncall}$ can be thought of as $h_{init}$, but with all of the hasher decoder trace register values set to $0$. $m_{dynordyncall}$ represents a memory read request from memory address $s_0$ (the top stack element), whose result is placed in the first half of the decoder hasher trace; $m_{read}$ is the label that identifies a memory read request.
+
 When `SPAN` operation is executed, a new hasher is initialized and contents of $h_0, ..., h_7$ are absorbed into the hasher. The number of operation groups to be hashed is padded to a multiple of the rate width ($8$) and so the $\alpha_4$ is set to 0:
 
 $$
@@ -192,8 +208,8 @@ $$
 Using the above definitions, we can describe the constraint for computing block hashes as follows:
 
 > $$
-b_{chip}' \cdot (u_{ctrli} + u_{syscall} + u_{span} + u_{respan} + u_{end} + \\
-1 - (f_{ctrli} + f_{syscall} + f_{span} + f_{respan} + f_{end})) = b_{chip}
+b_{chip}' \cdot (u_{ctrli} + u_{syscall} + u_{dynordyncall} + u_{span} + u_{respan} + u_{end} + \\
+1 - (f_{ctrli} + f_{syscall} + f_{dyn} + f_{dyncall} + f_{span} + f_{respan} + f_{end})) = b_{chip}
 $$
 
 We need to add $1$ and subtract the sum of the relevant operation flags to ensure that when none of the flags is set to $1$, the above constraint reduces to $b_{chip}' = b_{chip}$.
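
The way the `+ 1 - (sum of flags)` term behaves can be checked with a small numeric sketch. The Python below is only an illustration: it assumes the 64-bit prime field with modulus $2^{64} - 2^{32} + 1$, uses made-up request values, and the function and dictionary names are hypothetical rather than taken from the VM code. The `dynordyncall` key stands for the combined flag $f_{dyn} + f_{dyncall}$.

```python
# Illustrative model only: names and sample values are assumptions,
# not taken from the Miden code base.
P = 2**64 - 2**32 + 1  # modulus of the 64-bit prime field assumed here


def bus_multiplicand(flags, values):
    """Return sum(u_op) + 1 - sum(f_op) mod P.

    `flags` maps an operation name to its 0/1 flag; `values` maps the same
    name to the request value already multiplied by that flag.
    """
    return (sum(values.values()) + 1 - sum(flags.values())) % P


def u_dynordyncall(f_dyn, f_dyncall, h_req, m_req):
    """(f_dyn + f_dyncall) * (h_dynordyncall * m_dynordyncall) mod P."""
    return (f_dyn + f_dyncall) * (h_req * m_req) % P


OPS = ["ctrli", "syscall", "dynordyncall", "span", "respan", "end"]

# No flag set: every u_op is 0, so the multiplicand is 1 and b_chip' = b_chip.
no_flags = {op: 0 for op in OPS}
no_values = {op: 0 for op in OPS}
idle = bus_multiplicand(no_flags, no_values)

# DYN row: only the dyn/dyncall flag is set, so the multiplicand equals the
# product of the hasher request and the memory read request.
h_req, m_req = 12345, 67890  # placeholder request values
dyn_flags = dict(no_flags, dynordyncall=1)
dyn_values = dict(no_values, dynordyncall=u_dynordyncall(1, 0, h_req, m_req))
dyn_row = bus_multiplicand(dyn_flags, dyn_values)
```

With no flag set the multiplicand is `1`; on a `DYN` row it equals `h_req * m_req`, i.e. both requests are sent to the bus in the same row.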
@@ -249,6 +265,15 @@ $$
 v_{dyn} = f_{dyn} \cdot (\alpha_0 + \alpha_1 \cdot a' + \alpha_2 \cdot a) \text{ | degree} = 6
 $$
 
+When a `DYNCALL` operation is executed, row $(a', a, 0, ctx, fmp, b_0, b_1, fn\_hash[0..3])$ is added to the block stack table:
+
+$$
+\begin{align*}
+v_{dyncall} &= f_{dyncall} \cdot (\alpha_0 + \alpha_1 \cdot a' + \alpha_2 \cdot a + \alpha_4 \cdot ctx \\
+&+ \alpha_5 \cdot fmp + \alpha_6 \cdot b_0 + \alpha_7 \cdot b_1 + <[\alpha_8, \alpha_{11}], fn\_hash[0..3]>) \text{ | degree} = 6
+\end{align*}
+$$
+
 When a `CALL` or `SYSCALL` operation is executed, row $(a', a, 0, ctx, fmp, b_0, b_1, fn\_hash[0..3])$ is added to the block stack table:
 
 $$
@@ -272,8 +297,8 @@ Using the above definitions, we can describe the constraint for updating the blo
 
 > $$
 p_1' \cdot (u_{end} + u_{respan} + 1 - (f_{end} + f_{respan})) = p_1 \cdot \\
-(v_{join} + v_{split} + v_{loop} + v_{span} + v_{respan} + v_{dyn} + v_{callorsyscall} + 1 - \\
-(f_{join} + f_{split} + f_{loop} + f_{span} + f_{respan} + f_{dyn} + f_{call} + f_{syscall}))
+(v_{join} + v_{split} + v_{loop} + v_{span} + v_{respan} + v_{dyn} + v_{dyncall} + v_{callorsyscall} + 1 - \\
+(f_{join} + f_{split} + f_{loop} + f_{span} + f_{respan} + f_{dyn} + f_{dyncall} + f_{call} + f_{syscall}))
 $$
 
 We need to add $1$ and subtract the sum of the relevant operation flags from each side to ensure that when none of the flags is set to $1$, the above constraint reduces to $p_1' = p_1$.
@@ -332,20 +357,10 @@ When `REPEAT` operation is executed, hash of loop body is added to the block has
 
 $$v_{repeat} = f_{repeat} \cdot (ch_1 + \alpha_7) \text{ | } \text{degree} = 5$$
 
-When the `DYN` operation is executed, the hash of the dynamic child is added to the block hash table. Since the child is dynamically specified by the top four elements of the stack, the value representing the *dyn* block's child must be computed based on the stack rather than from the decoder's hasher registers:
-
-$$
-ch_{dyn} = \alpha_0 + \alpha_1 \cdot a' + \sum_{i=0}^3(\alpha_{i+2} \cdot s_{3-i}) \text{ | degree} = 1
-$$
-
-$$
-v_{dyn} = f_{dyn} \cdot ch_{dyn}  \text{ | degree} = 6
-$$
-
-When the `CALL` or `SYSCALL` operation is executed, the hash of the callee is added to the block hash table.
+When a `DYN`, `DYNCALL`, `CALL` or `SYSCALL` operation is executed, the hash of the child is added to the block hash table. In all cases, this child is found in the first half of the decoder hasher state.
 
 $$
-v_{callorsyscall} = (f_{call} + f_{syscall}) \cdot ch_1  \text{ | degree} = 5
+v_{allcalls} = (f_{dyn} + f_{dyncall} + f_{call} + f_{syscall}) \cdot ch_1  \text{ | degree} = 6
 $$
 
 When `END` operation is executed, hash of the completed block is removed from the block hash table. However, we also need to differentiate between removing the first and the second child of a *join* block. We do this by looking at the next operation. Specifically, if the next operation is neither `END` nor `REPEAT` nor `HALT`, we know that another block is about to be executed, and thus, we have just finished executing the first child of a *join* block. Thus, if the next operation is neither `END` nor `REPEAT` nor `HALT` we need to set the term for $\alpha_6$ coefficient to $1$ as shown below:
@@ -358,7 +373,7 @@ Using the above definitions, we can describe the constraint for updating the blo
 
 > $$
 p_2' \cdot (u_{end} + 1 - f_{end}) = \\
-p_2 \cdot (v_{join} + v_{split} + v_{loop} + v_{repeat} + v_{dyn} + v_{callorsyscall} + 1 - (f_{join} + f_{split} + f_{loop} + f_{repeat} + f_{dyn} + f_{call} + f_{syscall}))
+p_2 \cdot (v_{join} + v_{split} + v_{loop} + v_{repeat} + v_{allcalls} + 1 - (f_{join} + f_{split} + f_{loop} + f_{repeat} + f_{dyn} + f_{dyncall} + f_{call} + f_{syscall}))
 $$
 
 We need to add $1$ and subtract the sum of the relevant operation flags from each side to ensure that when none of the flags is set to $1$, the above constraint reduces to $p_2' = p_2$.

diff --git a/docs/src/design/decoder/main.md b/docs/src/design/decoder/main.md
@@ -336,18 +336,37 @@ When the VM executes a `SPAN` operation, it does the following:
 
 #### DYN operation
 
-Before a `DYN` operation is executed by the VM, the prover populates $h_0, ..., h_7$ registers with $0$ as shown in the diagram below.
-
 ![decoder_dyn_operation](../../assets/design/decoder/decoder_dyn_operation.png)
 
-In the above diagram, `blk` is the ID of the *dyn* block which is about to be executed. `blk` is also the address of the hasher row in the auxiliary hasher table. `prnt` is the ID of the block's parent.
+In the above diagram, `blk` is the ID of the *dyn* block which is about to be executed. `blk` is also the address of the hasher row in the auxiliary hasher table. `p_addr` is the ID of the block's parent.
 
 When the VM executes a `DYN` operation, it does the following:
 
-1. Adds a tuple `(blk, prnt, 0, 0...)` to the block stack table.
-2. Gets the hash of the dynamic code block `dynamic_block_hash` from the top four elements of the stack.
-2. Adds the tuple `(blk, dynamic_block_hash, 0, 0)` to the block hash table.
-3. Initiates a 2-to-1 hash computation in the hash chiplet (as described [here](#simple-2-to-1-hash)) using `blk` as row address in the auxiliary hashing table and $h_0, ..., h_7$ as input values.
+1. Adds a tuple `(blk, p_addr, 0, 0...)` to the block stack table.
+2. Sends a memory read request to the memory chiplet, using `s0` as the memory address. The result `hash of callee` is placed in the decoder hasher trace at $h_0, h_1, h_2, h_3$.
+3. Adds the tuple `(blk, hash of callee, 0, 0)` to the block hash table.
+4. Initiates a 2-to-1 hash computation in the hash chiplet (as described [here](#simple-2-to-1-hash)) using `blk` as row address in the auxiliary hashing table and `[ZERO; 8]` as input values.
+5. Performs a stack left shift.
+    - In the diagram above, `s16` is pulled from the stack overflow table if present; otherwise it is set to `0`.
+
+Note that unlike `DYNCALL`, the `fmp`, `ctx`, `in_syscall` and `fn_hash` registers are unchanged.
+
+#### DYNCALL operation
+
+![decoder_dyncall_operation](../../assets/design/decoder/decoder_dyncall_operation.png)
+
+In the above diagram, `blk` is the ID of the *dyn* block which is about to be executed. `blk` is also the address of the hasher row in the auxiliary hasher table. `p_addr` is the ID of the block's parent.
+
+When the VM executes a `DYNCALL` operation, it does the following:
+
+1. Adds a tuple `(blk, p_addr, 0, ctx, fmp, b_0, b_1, fn_hash[0..3])` to the block stack table.
+2. Sends a memory read request to the memory chiplet, using `s0` as the memory address. The result `hash of callee` is placed in the decoder hasher trace at $h_0, h_1, h_2, h_3$.
+3. Adds the tuple `(blk, hash of callee, 0, 0)` to the block hash table.
+4. Initiates a 2-to-1 hash computation in the hash chiplet (as described [here](#simple-2-to-1-hash)) using `blk` as row address in the auxiliary hashing table and `[ZERO; 8]` as input values.
+5. Performs a stack left shift.
+    - In the diagram above, `s16` is pulled from the stack overflow table if present; otherwise it is set to `0`.
+
+Similar to `CALL`, `DYNCALL` resets the `fmp`, sets up a new `ctx`, and sets the `fn_hash` registers to the callee hash. `in_syscall` needs to be 0, since calls are not allowed during a syscall.
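
Steps 2 and 5, which `DYN` and `DYNCALL` share, can be sketched as a toy Python model. This is a sketch under assumptions and not the VM implementation: the function name, the list-based stack and overflow table, and the dictionary used as memory are all hypothetical, and treating uninitialized memory as zeros is an assumption of the sketch.

```python
# Toy model; function and parameter names are hypothetical, not Miden VM APIs.
ZERO = 0


def read_callee_and_shift(stack, overflow, memory):
    """Model the memory read and the stack left shift of DYN / DYNCALL.

    stack:    list of 16 field elements, stack[0] is the top (s0)
    overflow: list used as the overflow table, last item is the most recent
    memory:   dict mapping an address to a 4-element word
    """
    addr = stack[0]
    # Step 2: read the word at address s0; it becomes h0..h3 in the decoder.
    # Reading zeros from an unset address is an assumption of this sketch.
    callee_hash = list(memory.get(addr, [ZERO] * 4))
    # Step 4 (for reference): the hasher is initialized with [ZERO; 8].
    hasher_input = [ZERO] * 8
    # Step 5: shift left; the last slot is refilled from the overflow table,
    # or with 0 when the table is empty.
    refill = overflow.pop() if overflow else ZERO
    new_stack = stack[1:] + [refill]
    return new_stack, callee_hash, hasher_input


stack = [40] + list(range(1, 16))   # s0 = 40 is the memory address
overflow = [99]                     # one element beyond the top 16
memory = {40: [1, 2, 3, 4]}         # placeholder callee hash
new_stack, h_first_half, hasher_input = read_callee_and_shift(stack, overflow, memory)
```

After the call, the address has been dropped from the stack, the element `99` from the overflow table occupies the last visible slot, and `h_first_half` holds the word read from memory.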
 
 #### END operation
 

diff --git a/docs/src/design/stack/op_constraints.md b/docs/src/design/stack/op_constraints.md
@@ -191,7 +191,7 @@ This group contains operations which require constraints with degree up to $3$.
 | `RCOMBBASE`  | $89$         | `101_1001`      | [Crypto ops](./crypto_ops.md)          | $5$         |
 | `EMIT`       | $90$         | `101_1010`      | [System ops](./system_ops.md)          | $5$         |
 | `PUSH`       | $91$         | `101_1011`      | [I/O ops](./io_ops.md)                 | $5$         |
-| `<unused>`   | $92$         | `101_1100`      |                                        | $5$         |
+| `DYNCALL`    | $92$         | `101_1100`      | [Flow control ops](../decoder/main.md) | $5$         |
 | `<unused>`   | $93$         | `101_1101`      |                                        | $5$         |
 | `<unused>`   | $94$         | `101_1110`      |                                        | $5$         |
 | `<unused>`   | $95$         | `101_1111`      |                                        | $5$         |
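
As a quick cross-check of the new table row, the opcode value follows from the seven opcode bits as $d = \sum_{i=0}^{6} b_i \cdot 2^i$. The Python sketch below uses a hypothetical helper name and assumes the table's binary column is written most-significant bit first.

```python
def opcode_value(bits):
    """Compute d from bits, where bits[i] is b_i and b_0 is least significant."""
    return sum(b << i for i, b in enumerate(bits))


# `101_1100` as written in the table (most-significant bit first), reversed
# so that index i holds b_i.
dyncall_bits = [int(c) for c in "1011100"][::-1]
dyncall_opcode = opcode_value(dyncall_bits)
```

The result matches the decimal opcode listed for `DYNCALL`.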

diff --git a/docs/src/user_docs/assembly/code_organization.md b/docs/src/user_docs/assembly/code_organization.md
@@ -46,27 +46,30 @@ end
 Finally, a procedure cannot contain *solely* any number of [advice injectors](./io_operations.md#nondeterministic-inputs), `emit`, `debug` and `trace` instructions. In other words, it must contain at least one instruction which is not in the aforementioned list.
 
 #### Dynamic procedure invocation
-It is also possible to invoke procedures dynamically - i.e., without specifying target procedure labels at compile time. Unlike static procedure invocation, recursion is technically possible using dynamic invocation, but dynamic invocation is more expensive, and has less available operand stack capacity for procedure arguments, as 4 elements are required for the MAST root of the callee. There are two instructions, `dynexec` and `dyncall`, which can be used to execute dynamically-specified code targets. Both instructions expect the [MAST root](../../design/programs.md) of the target to be provided via the stack. The difference between `dynexec` and `dyncall` corresponds to the difference between `exec` and `call`, see the documentation on [procedure invocation semantics](./execution_contexts.md#procedure-invocation-semantics) for more detail.
+It is also possible to invoke procedures dynamically - i.e., without specifying target procedure labels at compile time. There are two instructions, `dynexec` and `dyncall`, which can be used to execute dynamically-specified code targets. Both instructions expect the [MAST root](../../design/programs.md) of the target to be stored in memory, and the memory address of the MAST root to be on the top of the stack. The difference between `dynexec` and `dyncall` corresponds to the difference between `exec` and `call`; see the documentation on [procedure invocation semantics](./execution_contexts.md#procedure-invocation-semantics) for more details.
 
-Dynamic code execution in the same context is achieved by setting the top $4$ elements of the stack to the hash of the dynamic code block and then executing the `dynexec` or `dyncall` instruction. You can obtain the hash of a procedure in the current program, by name, using the `procref` instruction. See the following example of pairing the two:
+
+Dynamic code execution in the same context is achieved by setting the top element of the stack to the memory address where the hash of the dynamic code block is stored, and then executing the `dynexec` or `dyncall` instruction. You can obtain the hash of a procedure in the current program, by name, using the `procref` instruction. See the following example of pairing the two:
 
 ```
-procref.foo
+# Retrieve the hash of `foo`, store it at `ADDR`, and push `ADDR` on top of the stack
+procref.foo mem_storew.ADDR dropw push.ADDR
+
+# Execute `foo` dynamically
 dynexec
 ```
 
 During assembly, the `procref.foo` instruction is compiled to a `push.HASH`, where `HASH` is the hash of the MAST root of the `foo` procedure.
 
 During execution of the `dynexec` instruction, the VM does the following:
 
-1. Reads, but does not consume, the top 4 elements of the stack to get the hash of the dynamic target (i.e. the operand stack is left unchanged).
-2. Load the code block referenced by the hash, or trap if no such MAST root is known.
-3. Execute the loaded code block
+1. Read the top stack element $s_0$, and read the memory word at address $s_0$ (the hash of the dynamic target),
+2. Shift the stack left by one element,
+3. Load the code block referenced by the hash, or trap if no such MAST root is known,
+4. Execute the loaded code block.
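
The four steps above can be modelled in a few lines of Python. This is a sketch under assumptions, not the VM's implementation: the stack is a plain list, memory is a dictionary of 4-element words, known procedures are looked up by their hash in a dictionary, and every name here (`dynexec`, `UnknownMastRoot`, `FOO_HASH`, `ADDR`) is hypothetical.

```python
class UnknownMastRoot(Exception):
    """Raised when no code block is known for the requested hash (a trap)."""


def dynexec(stack, memory, procedures):
    """Model `dynexec`: stack[0] is the top of the stack (s0)."""
    addr = stack[0]                                  # 1. read s0 ...
    root = tuple(memory.get(addr, (0, 0, 0, 0)))     # ... and the word stored at s0
    stack = stack[1:]                                # 2. shift the stack left by one
    if root not in procedures:                       # 3. load the block, or trap
        raise UnknownMastRoot(root)
    return procedures[root](stack)                   # 4. execute the loaded block


def foo(stack):
    """Example target: replace the top two elements with their sum."""
    return [stack[0] + stack[1]] + stack[2:]


FOO_HASH = (11, 22, 33, 44)  # placeholder, not a real MAST root
ADDR = 40                    # placeholder memory address
memory = {ADDR: FOO_HASH}

result = dynexec([ADDR, 3, 4, 5], memory, {FOO_HASH: foo})
```

Because the address is removed before the target runs, `foo` sees `3` and `4` on top of the stack and does not need to drop anything first.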
 
 The `dyncall` instruction is used the same way, with the difference that it involves a context switch to a new context when executing the referenced block, and switching back to the calling context once execution of the callee completes.
 
-> **Note**: In both cases, the stack is left unchanged. Therefore, if the dynamic code is intended to manipulate the stack, it should start by either dropping or moving the code block hash from the top of the stack.
-
 ### Modules
 A *module* consists of one or more procedures. There are two types of modules: *library modules* and *executable modules* (also called *programs*).
 

diff --git a/docs/src/user_docs/assembly/execution_contexts.md b/docs/src/user_docs/assembly/execution_contexts.md
@@ -18,6 +18,7 @@ When a procedure is invoked via a `call`, `dyncall`, or a `syscall` instruction,
 
 - Execution moves into a different context. In case of the `call` and `dyncall` instructions, a new user context is created. In case of a `syscall` instruction, the execution moves back into the root context.
 - All stack items beyond the 16th item get "hidden" from the invoked procedure. That is, from the standpoint of the invoked procedure, the initial stack depth is set to 16.
+    - Note that for `dyncall`, the stack is shifted left by one element before the stack depth is set to 16.
 
 When the callee returns, the following happens:
 
@@ -26,8 +27,9 @@ When the callee returns, the following happens:
 
 The manipulations of the stack depth described above have the following implications:
 
-- The top 16 elements of the stack can be used to pass parameters and return values between the caller and the callee. NOTE: Except for `dyncall`, as that instruction requires the first 4 elements to be the hash of the callee procedure, so only 12 elements are available in that case.
+- The top 16 elements of the stack can be used to pass parameters and return values between the caller and the callee.
 - Caller's stack beyond the top 16 elements is inaccessible to the callee, and thus, is guaranteed not to change as the result of the call.
+    - As mentioned above, in the case of `dyncall`, the elements at indices 1 through 16 at the call site will be accessible to the callee (shifted to indices 0 through 15).
 - At the end of its execution, the callee must ensure that stack depth is exactly 16. If this is difficult to ensure manually, the [`truncate_stack`](../stdlib/sys.md) procedure can be used to drop all elements from the stack except for the top 16.
 
 #### Invoking via `exec` instruction
@@ -42,7 +44,7 @@ A _kernel_ defines a set of procedures which can be invoked from user contexts t
 
 A kernel can be defined similarly to a regular [library module](./code_organization.md#library-modules) - i.e., it can have internal and exported procedures. However, there are some small differences between what procedures can do in a kernel module vs. what they can do in a regular library module. Specifically:
 
-- Procedures in a kernel module cannot use `call` or `syscall` instructions. This means that creating a new context from within a `syscall` is not possible.
+- Procedures in a kernel module cannot use `call`, `dyncall` or `syscall` instructions. This means that creating a new context from within a `syscall` is not possible.
 - Unlike procedures in regular library modules, procedures in a kernel module can use the `caller` instruction. This instruction puts the hash of the procedure which initiated the parent context onto the stack.
 
 ### Memory layout