Restructure executor #38
base: main
Conversation
Here I try to accomplish two things:

First, to better separate the actual minimum requirements and interface from the description of our preferred tool. In this way, `executorlib.Executor` is only one implementation of what `pyiron_workflow` requires. To this end I split apart "downstream" sections (base and workflow and what they require) and an "upstream" section (executorlib and what it offers).

Second, there is a mismatch between the `executorlib` flow management and `pyiron_workflow` flow management, which was embedded in this statement:

> * For both `pyiron_base` and `pyiron_workflow` it would be convenient if users could send the entire graph to the HPC for execution, without relying on a connection to an external process monitoring the graph as it is currently implemented in `pyiron_workflow`. This can be achieved if a series of smaller nodes with similar resource requirements can be converted into one larger task which is then executed by `executorlib` as a single job in the queuing system. So a node in `pyiron_workflow` does not have a one-to-one relation to a task in `executorlib`. In `executorlib`, tasks are primarily defined by having similar resource requirements; only when a process has changing resource requirements does it make sense to separate it into individual tasks in `executorlib`. In the same way, `executorlib` can be used to efficiently parallelize the execution.

Which I streamlined down to a single "TODO" around task grouping for executorlib.

One issue here is that on the `pyiron_workflow` side, the nodes are all functional but the graph as a whole has state (inputs and outputs), and _after_ the node functionality is executed, a callback function is used to map the result of that functionality back onto the graph state (i.e. to populate the output channels with data). In the context of firing off the entire graph to something like SLURM all at once, I don't see an easy technical solution for holding and updating the graph state. I also don't see an easy technical solution for avoiding the graph IO updates -- `pyiron_workflow` is exactly a workflow manager and expects to be able to manage the workflow.

The second issue is that, from [a conversation with @jan-janssen](https://github.com/pyiron/specs/pull/31/files#r1789103867), my understanding is that executorlib handles dependency trees in the following way: DAG only, each downstream task can depend on multiple upstream tasks, but each dependency must be absorbed in its entirety -- i.e. multiple-input-single-output. That is not an incompatible world to `pyiron_workflow`, but it represents only a subset of the cyclic and multiple-input-multiple-output graphs representable by `pyiron_workflow`, so we would at most be able to use this functionality on a subset of graphs.

I think this sort of dependency tree is a great feature for `executorlib`, and is absolutely useful for the "high performance expert" use case. I just want to break it apart from the universal spec. For the `pyiron_workflow` use case, it would be much more productive to focus on the abilities to (a) run the entire graph remotely or (b) run individual nodes remotely (per @Tara-Lakshmipathy's [PR](#28)). These have the advantage that they're both already possible, so it's just a matter of building nicer user interfaces to the existing capability, e.g. in part by combining `Executor` and `FileExecutor` as already suggested here.
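To make the callback issue concrete, here is a minimal sketch of the pattern I mean (the `ToyNode` class and `node_function` are hypothetical stand-ins, not pyiron_workflow's actual classes): the core functionality is a stateless function that can be shipped to any `concurrent.futures`-style executor, but the done-callback that writes the result back into the output channels runs in the process holding the graph -- which is precisely what a fire-and-forget submission of the whole graph to SLURM would not have.

```python
import threading
from concurrent.futures import ProcessPoolExecutor


def node_function(x, y):
    """The stateless core functionality of a node."""
    return x + y


class ToyNode:
    """Hypothetical stand-in for a node: the graph side holds the state."""

    def __init__(self, x, y):
        self.inputs = {"x": x, "y": y}
        self.outputs = {"sum": None}  # populated only after execution
        self._done = threading.Event()

    def run(self, executor):
        # Ship the stateless function to whatever executor was assigned;
        # the done-callback below maps the result back onto the graph state.
        future = executor.submit(node_function, **self.inputs)
        future.add_done_callback(self._process_result)
        return future

    def _process_result(self, future):
        # This runs in the process that owns the graph, not on the worker.
        self.outputs["sum"] = future.result()
        self._done.set()


if __name__ == "__main__":
    node = ToyNode(1, 2)
    with ProcessPoolExecutor() as exe:
        node.run(exe)
        node._done.wait()  # block until the callback has updated the outputs
    print(node.outputs)  # {'sum': 3}
```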
I am not so happy with the refactoring, as it gives the impression that
It is not clear to me why you come to this conclusion. To define the spec we need to clearly state the general interface (done --
While this is correct on the technical level, our users want to use their workflows in the HPC context, which is not covered by
I agree that we should do a consistency check, to validate that both
Do we plan to have a
I spent the weekend introducing decorators in
I agree, I think the primary question is: Can we agree on a joint interface? Based on a discussion I had with Joerg back at IPAM in 2013, it is important to distinguish workflow nodes from tasks for the executor. Not every small Python function - e.g. the summation of two numbers - is worth submitting to an executor; rather, multiple nodes with similar resource requirements - e.g. serial Python function calls - should be bundled into a single task and submitted to the executor as one unit. The mapping of nodes to tasks has to be handled by the workflow manager, as it requires knowledge of the task dependencies, while the executor only provides a simple interface to execute tasks. For the context of
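As a rough illustration of that bundling idea (a hypothetical sketch under my own naming, not an existing executorlib or pyiron_workflow API): several cheap, serial node functions with the same resource requirements are wrapped into one callable by the workflow manager and submitted to the executor as a single task.

```python
from concurrent.futures import ProcessPoolExecutor


# Three cheap, serial "node" functions with identical (trivial) resource
# requirements -- not worth one executor task each.
def add(x, y):
    return x + y


def square(x):
    return x ** 2


def halve(x):
    return x / 2


def bundled_task(x, y):
    """One executor task: the serial chain of small nodes, grouped upstream."""
    return halve(square(add(x, y)))


if __name__ == "__main__":
    with ProcessPoolExecutor() as exe:
        # The workflow manager decides the grouping (it knows the dependencies);
        # the executor only ever sees the one bundled task.
        future = exe.submit(bundled_task, 1, 2)
        print(future.result())  # 4.5
```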
Sorry for the delay in replying, I missed the alert on this thread.
I don't understand the nature of this objection. We don't have separate spec files for base and workflow. I thought one of the key objectives in this venture was to see what tools both platforms could jointly exploit? In this case we need to write down somewhere what the expectations and needs of those platforms are. Doing it in the file for the tool makes the most sense to me, but otherwise the equivalent content strictly needs to be copied into platform spec files, not just ignored.
Yeah, I totally agree 100%. The only place
Good, sounds great. Again, not related to this PR. This PR simply acknowledges that base doesn't leverage
I don't want to argue semantics here; from the
I don't actually need this feature nor feel strongly about its (in)existence. I'm going to simply remove the line from the PR. Please take a look again and let me know if you have objections to any specific content in the PR.
## Downstream -- `pyiron_workflow`
* An `executor` or information for creating one can be assigned to a node, and that node will use it for the core functionality of the node
I only understand the first part (i.e. "executor can be assigned to a node"). I guess the rest has to be reformulated
I'm happy to reformulate. I'm also content that it's somewhat vague here -- I'm trying only to note who else in the project uses the executor functionality, not how it is used, which we shouldn't care about over here! It is probably worth mentioning that pwf relies on submit but not map, but anyhow a good solution is implementing both so it's nbd
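For concreteness, a sketch of what "relies on submit but not map" means, using the standard-library `concurrent.futures` signatures; the `MinimalExecutor` protocol and `run_node` helper here are illustrative, not something that exists in the spec or in executorlib.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Protocol


class MinimalExecutor(Protocol):
    """Illustrative minimal surface a node needs: submit() returning a Future."""

    def submit(self, fn: Callable, /, *args, **kwargs) -> Future: ...


def run_node(executor: MinimalExecutor, fn: Callable, *args, **kwargs) -> Future:
    # submit() is enough for a node: it gets a Future back and can attach its
    # own callback to push the result into the graph's output channels.
    return executor.submit(fn, *args, **kwargs)


if __name__ == "__main__":
    with ThreadPoolExecutor() as exe:
        future = run_node(exe, lambda x, y: x + y, 1, 2)
        print(future.result())  # 3
        # map() is the nice-to-have on top, for the same-function-many-inputs case:
        print(list(exe.map(abs, [-1, -2, -3])))  # [1, 2, 3]
```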