Assembler: enable vendoring of compiled libraries (fixes #1435) #1643

Draft
wants to merge 13 commits into
base: next
from

Conversation

paracetamolo
Contributor

@paracetamolo paracetamolo commented Jan 27, 2025

Before this MR, there were only two ways to link a library during assembly:

  • Add the module sources. The library is effectively recompiled as part of the new library/program.
  • Add a compiled library. In this case, all references to this library are compiled as external nodes.

This MR adds a third option: link a compiled library during assembly and have its MAST forest merged into the resulting library/program. This has two desirable properties:

  • No external nodes are added for procedures that are present in vendored libraries, which improves performance.
  • No unused procedures are present in the resulting program/library. If the vendored library contains extra unused procedures they are removed during assembly.
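
The second property amounts to a reachability computation: only nodes reachable from the procedures that are actually invoked need to end up in the output. A minimal sketch of that idea on a toy forest follows. `ToyForest`, `reachable_from` and `example_forest` are hypothetical names for illustration and are not part of the Miden API.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for a MAST forest: `children[i]` lists the children of node `i`.
/// Hypothetical type for illustration only, not the real `MastForest`.
struct ToyForest {
    children: Vec<Vec<usize>>,
}

/// Returns the set of nodes reachable from `roots`. Everything outside this set
/// is unused and would not need to be copied into the assembled artifact.
fn reachable_from(forest: &ToyForest, roots: &[usize]) -> BTreeSet<usize> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<usize> = roots.to_vec();
    while let Some(id) = stack.pop() {
        // `insert` returns false if the node was already visited.
        if seen.insert(id) {
            stack.extend(forest.children[id].iter().copied());
        }
    }
    seen
}

/// Example library with two procedures: node 2 (used) and node 4 (unused).
fn example_forest() -> ToyForest {
    ToyForest {
        children: vec![vec![], vec![], vec![0, 1], vec![], vec![3]],
    }
}
```

Calling `reachable_from(&example_forest(), &[2])` keeps nodes 0, 1 and 2 and leaves out the unused procedure rooted at node 4.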

@paracetamolo paracetamolo self-assigned this Jan 27, 2025
@paracetamolo paracetamolo changed the title Marco vendoring Assembler: enable vendoring of compiled libraries (fixes #1435) Jan 27, 2025
@paracetamolo paracetamolo added the assembly Related to Miden assembly label Jan 27, 2025
@bobbinth bobbinth requested review from plafer and bitwalker January 28, 2025 06:13
Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you. Not a full review yet, but I left a few comments inline - one of them describing a potential alternative approach.

assembly/src/assembler/mod.rs Outdated Show resolved Hide resolved
assembly/src/assembler/mod.rs Outdated Show resolved Hide resolved
assembly/src/assembler/mod.rs Outdated Show resolved Hide resolved
@paracetamolo paracetamolo force-pushed the marco-vendoring branch 3 times, most recently from f73998a to 806a91e Compare January 30, 2025 14:03
@paracetamolo
Contributor Author

I pushed a new version that uses the approach suggested by @bobbinth in the comment above. At assembly time, we merge all the collected vendored libraries into a single MAST forest that is passed to the builder. On a call to ensure_external, the builder first checks whether the procedure is present in the vendored MAST forest and, if so, copies its subtree.
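
The lookup order described here can be sketched roughly as follows. All names (`ToyBuilder`, `Resolved`, the `u64` digest, the `String` body) are hypothetical stand-ins. The real builder works on MAST node digests and copies a whole subtree rather than a single string.

```rust
use std::collections::BTreeMap;

/// Outcome of resolving a reference to a procedure by digest (hypothetical type).
#[derive(Debug, PartialEq)]
enum Resolved {
    /// The procedure was found in a vendored library and copied in.
    Copied(String),
    /// Not vendored: keep an external reference to be resolved at run time.
    External(u64),
}

/// Toy builder holding the merged contents of all vendored libraries.
struct ToyBuilder {
    vendored: BTreeMap<u64, String>,
}

impl ToyBuilder {
    /// Checks the vendored procedures first and falls back to an external node.
    fn ensure_external(&self, digest: u64) -> Resolved {
        match self.vendored.get(&digest) {
            Some(body) => Resolved::Copied(body.clone()),
            None => Resolved::External(digest),
        }
    }
}
```

The point of the sketch is only the order of the checks: a vendored hit never produces an external node, and a miss behaves as before.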

Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! Not a full review, but I left some comments inline.

Overall, it feels like this approach should work better than the previous one.

Comment on lines 258 to 271
pub fn add_vendored_library(&mut self, library: impl AsRef<Library>) -> Result<(), Report> {
    self.add_library(&library)?;
    self.vendored_libraries
        .insert(*library.as_ref().digest(), library.as_ref().clone());
    Ok(())
}

pub fn with_vendored_library(mut self, library: impl AsRef<Library>) -> Result<Self, Report> {
    self.add_vendored_library(library)?;
    Ok(self)
}
Contributor

Let's add doc comments to these functions.

Contributor Author

I think add_library and add_vendored_library are somewhat confusing names, especially the former. Would it make sense to rename them to dynamically_link and statically_link?

Contributor

I don't mind renaming, but I'm not sure dynamically_link() and statically_link() are better options. Curious what others think though.

assembly/src/assembler/mod.rs Outdated Show resolved Hide resolved
assembly/src/assembler/mod.rs Outdated Show resolved Hide resolved
assembly/src/assembler/mast_forest_builder.rs Outdated Show resolved Hide resolved
assembly/src/assembler/mast_forest_builder.rs Outdated Show resolved Hide resolved
assembly/src/assembler/mast_forest_builder.rs Outdated Show resolved Hide resolved
core/src/mast/mod.rs Outdated Show resolved Hide resolved
@paracetamolo
Contributor Author

Are there more interesting tests that we could run? Right now there is a single test checking that a used procedure is inlined and an unused one is deleted.

Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you. I reviewed pretty much all non-test and non-CLI code and left some comments inline.

Comment on lines +86 to +88
pub fn new<'a>(
    vendored_libraries: impl IntoIterator<Item = &'a Library>,
) -> Result<Self, Report> {
Contributor

Let's add doc comments to this function.

Also, nit: I'd move it above the build() function (e.g., above line 79).

Comment on lines +280 to +307
/// Iterates over all the nodes a root depends on, in pre-order.
/// The iteration can include other roots in the same forest.
pub struct RootIterator<'a> {
    forest: &'a MastForest,
    discovered: Vec<MastNodeId>,
    unvisited: Vec<MastNodeId>,
}
impl<'a> RootIterator<'a> {
    pub fn new(root: &MastNodeId, forest: &'a MastForest) -> Self {
        let discovered = vec![];
        let unvisited = vec![*root];
        RootIterator { forest, discovered, unvisited }
    }
}
impl Iterator for RootIterator<'_> {
    type Item = MastNodeId;
    fn next(&mut self) -> Option<MastNodeId> {
        while let Some(id) = self.unvisited.pop() {
            let mut children = self.forest[id].children();
            if children.is_empty() {
                return Some(id);
            };
            self.discovered.push(id);
            self.unvisited.append(&mut children);
        }
        self.discovered.pop()
    }
}
Contributor

nit: it feels a bit weird to have this defined between MastForest impl blocks. I would probably move it down to line 657 and add a section separator there.

Also, RootIterator sounds like we are iterating over roots. Maybe RootSubtreeIterator or something similar would be clearer?
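
One property of the quoted traversal matters for the copy step. Leaves are yielded as soon as they are reached, and internal nodes come out only afterwards, in reverse discovery order. Each node is therefore yielded after its descendants, which is closer to a post-order than to the pre-order the doc comment mentions. Below is a toy re-implementation of the same loop, assuming hypothetical `ToyForest` and `SubtreeIter` types rather than the real `MastForest`:

```rust
/// Toy forest: `children[i]` lists the children of node `i` (hypothetical type).
struct ToyForest {
    children: Vec<Vec<usize>>,
}

/// Mirrors the traversal quoted above on the toy forest.
struct SubtreeIter<'a> {
    forest: &'a ToyForest,
    discovered: Vec<usize>,
    unvisited: Vec<usize>,
}

impl<'a> SubtreeIter<'a> {
    fn new(root: usize, forest: &'a ToyForest) -> Self {
        SubtreeIter { forest, discovered: Vec::new(), unvisited: vec![root] }
    }
}

impl Iterator for SubtreeIter<'_> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        while let Some(id) = self.unvisited.pop() {
            let children = &self.forest.children[id];
            if children.is_empty() {
                // Leaves are yielded as soon as they are reached.
                return Some(id);
            }
            self.discovered.push(id);
            self.unvisited.extend_from_slice(children);
        }
        // Internal nodes come out last, in reverse discovery order.
        self.discovered.pop()
    }
}

/// Collects the visiting order for the subtree rooted at `root`.
fn order(forest: &ToyForest, root: usize) -> Vec<usize> {
    SubtreeIter::new(root, forest).collect()
}
```

For a tree where node 0 has children 1 and 2, and node 1 has children 3 and 4, `order` returns `[2, 4, 3, 1, 0]`: every child precedes its parent, so a child's new ID is already known when its parent is copied.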

Comment on lines +629 to +631
pub fn remap(self, remapping: &Remapping) -> Self {
    *remapping.get(&self).unwrap_or(&self)
}
Contributor

Let's add doc comments to this function.
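
As a possible starting point for that doc comment, here is a sketch of what the quoted body does, written against a toy `NodeId` type. `NodeId` and the `Remapping` alias below are hypothetical stand-ins for `MastNodeId` and the PR's `Remapping`, whose exact definition is not shown in this thread.

```rust
use std::collections::BTreeMap;

/// Toy node identifier (hypothetical stand-in for `MastNodeId`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct NodeId(u32);

/// Maps IDs in a source forest to IDs in a destination forest (assumed shape).
type Remapping = BTreeMap<NodeId, NodeId>;

impl NodeId {
    /// Returns the ID this node maps to under `remapping`, or `self` unchanged
    /// when the remapping has no entry for it.
    fn remap(self, remapping: &Remapping) -> Self {
        *remapping.get(&self).unwrap_or(&self)
    }
}
```

The fallback to `self` is worth spelling out in the docs, since a missing entry is silently treated as "no change" rather than as an error.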

}
}

pub fn children(&self) -> Vec<MastNodeId> {
Contributor

This will result in a vector allocation on every invocation of this method. Looking at how this method is used, a more efficient alternative could be:

pub fn append_children_to(&self, target: &mut Vec<MastNodeId>) {
    ...
}

@@ -143,6 +143,34 @@ impl MastNode {
}
}

pub fn remap(&mut self, remapping: &BTreeMap<MastNodeId, MastNodeId>) {
Contributor

Let's add doc comments to this function.

Comment on lines +298 to +303
    let mut children = self.forest[id].children();
    if children.is_empty() {
        return Some(id);
    };
    self.discovered.push(id);
    self.unvisited.append(&mut children);
Contributor

Related to the previous comment, we could re-write this as:

let node = &self.forest[id];
if !node.has_children() {
    return Some(id);
} else {
    self.discovered.push(id);
    node.append_children_to(&mut self.unvisited);
}
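
Taken together, the two suggestions could look like the sketch below on a toy node type. `ToyNode` and its variants are hypothetical, and the real `MastNode` has a different set of variants. Only the shape of `has_children` and `append_children_to` is the point.

```rust
/// Toy node type (hypothetical; not the real `MastNode`).
enum ToyNode {
    Leaf,
    Join(u32, u32),
    Loop(u32),
}

impl ToyNode {
    /// Returns true if this node references at least one child.
    fn has_children(&self) -> bool {
        !matches!(self, ToyNode::Leaf)
    }

    /// Pushes this node's children onto `target`, reusing the caller's buffer
    /// instead of allocating a fresh `Vec` on every call.
    fn append_children_to(&self, target: &mut Vec<u32>) {
        match self {
            ToyNode::Leaf => {}
            ToyNode::Join(left, right) => {
                target.push(*left);
                target.push(*right);
            }
            ToyNode::Loop(body) => target.push(*body),
        }
    }
}
```

With this shape, the iterator can pass `&mut self.unvisited` directly, so the only allocations are the amortized growth of that one stack.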

Comment on lines 185 to +188
  pub fn remove_nodes(
      &mut self,
      nodes_to_remove: &BTreeSet<MastNodeId>,
- ) -> Option<BTreeMap<MastNodeId, MastNodeId>> {
+ ) -> BTreeMap<MastNodeId, MastNodeId> {
Contributor

Is this change needed for this PR?

Comment on lines +258 to +260
/// Adds a compiled library that can be used to copy procedures during assembly instead of
/// introducing external nodes.
pub fn add_vendored_library(&mut self, library: impl AsRef<Library>) -> Result<(), Report> {
Contributor

I would write this as:

/// Adds a compiled library whose procedures will be vendored into the assembled code.
///
/// Vendoring in this context means that when a procedure from this library is invoked from the
/// assembled code, the entire procedure MAST will be copied into the assembled code. Thus,
/// when the resulting code is executed on the VM, the vendored library does not need to be
/// provided to the VM to resolve external calls.

We should also update the description of the add_library() procedure above.

Comment on lines +267 to +268
/// See [`Self::add_vendored_library`]
pub fn with_vendored_library(mut self, library: impl AsRef<Library>) -> Result<Self, Report> {
Contributor

I would write this as:

/// Adds a compiled library whose procedures will be vendored into the assembled code.
///
/// See [`Self::add_vendored_library`]

Comment on lines +480 to +485
for old_id in RootIterator::new(&root_id, &self.vendored_mast.clone()) {
    let mut node = self.vendored_mast[old_id].clone();
    node.remap(&self.vendored_remapping);
    let new_id = self.ensure_node(node)?;
    self.vendored_remapping.insert(old_id, new_id);
}
Contributor

I think this will copy the node + its decorators. But where do we copy the advice map data from the vendored libraries?
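
For context, the shape of the loop under review can be sketched on toy types as below. `ToyNode` and `copy_nodes` are hypothetical, and the sketch simply appends nodes where the real `ensure_node` is a separate builder method. A loop of this shape touches only the nodes themselves, which is why data stored alongside the forest, such as the advice map, would need separate handling.

```rust
use std::collections::BTreeMap;

/// Toy node: a list of child indices into the forest that owns it (hypothetical).
#[derive(Clone, Debug, PartialEq)]
struct ToyNode {
    children: Vec<usize>,
}

/// Copies the nodes listed in `order` from `src` into `dst`, rewriting child
/// indices on the way. `order` must list every node after all of its children.
fn copy_nodes(
    src: &[ToyNode],
    order: &[usize],
    dst: &mut Vec<ToyNode>,
    remapping: &mut BTreeMap<usize, usize>,
) {
    for &old_id in order {
        if remapping.contains_key(&old_id) {
            continue; // already copied by an earlier call
        }
        let mut node = src[old_id].clone();
        for child in node.children.iter_mut() {
            // Children were copied earlier, so their new index is known.
            let new_id = remapping[&*child];
            *child = new_id;
        }
        dst.push(node);
        remapping.insert(old_id, dst.len() - 1);
    }
}
```

Copying nodes 0, 1 and 2 (where 2 has children 0 and 1) into a destination that already holds one node yields new indices 1, 2 and 3, with the copied parent pointing at 1 and 2.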

@bobbinth
Contributor

bobbinth commented Feb 1, 2025

@plafer, @bitwalker - could you also take a look at this PR?
