Executor submodule #765
Conversation
Pull Request Test Coverage Report for Build 5524794387
💛 - Coveralls
```python
def _apply_cloudpickle(fn, /, *args, **kwargs):
    fn = cloudpickle.loads(fn)
    return fn(*args, **kwargs)
```
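For context, a minimal sketch of how a helper like this gets driven from the calling side, assuming a plain `ProcessPoolExecutor` (the `submit_unpickleable` wrapper is illustrative, not code from this PR):

```python
import cloudpickle
from concurrent.futures import ProcessPoolExecutor

def submit_unpickleable(executor: ProcessPoolExecutor, fn, *args, **kwargs):
    # Serialize the callable with cloudpickle so it survives the trip to the
    # worker process; _apply_cloudpickle (above) reconstructs and calls it there.
    return executor.submit(_apply_cloudpickle, cloudpickle.dumps(fn), *args, **kwargs)
```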
I guess you also want to cloudpickle the output on the way back.
I'm not sure what you mean. Can you suggest a modification to the test that makes it fail without this?
I added a suggestion below. The other part I am surprised about is that you do not pickle the arguments, but only the function. Shouldn't we pickle everything together?
I guess this is only necessary when the args or return values are themselves things that cannot normally be pickled (e.g. instances of classes created by a decorator). Tomorrow I'll come back and make the test case much crueler. Hopefully then we'll see the tests fail without your suggestion and similar treatment for the args.
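To make that concrete, here is one shape such a crueler test could take (my own sketch; none of these names are from the PR): returning an instance of a dynamically created class defeats the stock pickle module, so a bare `ProcessPoolExecutor` cannot ship the result back to the parent process.

```python
import pickle

def make_dynamic_class():
    # Defined inside a function, so not importable by qualified name:
    # stock pickle can serialize neither the class nor its instances.
    class Dynamic:
        def __init__(self, x):
            self.x = x
    return Dynamic

def returns_unpickleable():
    return make_dynamic_class()(42)

# pickle.dumps(returns_unpickleable())  # raises pickle.PicklingError
```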
Just for future reference: https://github.com/pyiron-dev/pysqa-executor
```python
def _apply_cloudpickle(fn, /, *args, **kwargs):
    fn = cloudpickle.loads(fn)
    return fn(*args, **kwargs)
```
Suggested change:

```diff
-    return fn(*args, **kwargs)
+    return cloudpickle.dumps(fn(*args, **kwargs))
```
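One consequence worth spelling out (a gloss on the suggestion, not part of the review thread): with `dumps` applied on the way back, the future now resolves to raw bytes, so the caller has to unpickle post-facto, along these lines:

```python
import cloudpickle
from concurrent.futures import ProcessPoolExecutor

def greet():
    return "hello"

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        future = executor.submit(_apply_cloudpickle, cloudpickle.dumps(greet))
        print(cloudpickle.loads(future.result()))  # undo the dumps by hand
```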
Looks ok to me to get the ball rolling. Can we think about making this dynamic somehow? Like I give you any executor and you give me back one that cloudpickles? OTOH, we'll have on the order of less than a dozen "official" executors, so wrapping them once each is also not too much hassle.
Since we have this now, I'll go ahead and rename tinybase's executors, since they are slightly different and I didn't like that name clash to begin with.
I'm working on this currently. I agree that for non-pickleable callables that take pickleable arguments and have a pickleable return value, it's fine. Jan's suggestion fixes the case of non-pickleable return values, but then the future results need to be unpickled post-facto, i.e. I see a potential solution where we really replace the underlying `Future`.
So I'd like to merge this one as-is. I added a bit of stuff so that both the docstrings and tests are more explicit that this handles unpickleable callables, but that arguments and return values must still be pickleable. This is already totally sufficient for #760, where our callables are methods of dynamically created nodes, but the input and output is (so far) always just regular data. Jan's suggestion above does extend the class to handle unpickleable return values, and one can imagine a similar attack on arguments. The problem is that it also messes with the interface, since the result then comes back through the future as raw bytes. I have a solution in mind (sketched below), but I would much rather get the current code -- which is working fine and tested -- merged, and either replace it or add a new, more powerful executor in a future PR. This will keep the PRs smaller, give Jan's serializing and hashing executor a logical place to live on contrib, and remove barriers to working on #760 in case my sketch of an idea doesn't pan out. @jan-janssen, are you amenable to such a split/delay?

The idea: we can get things working OK, but the future result is ugly, so we'll need a new class to handle this, like

```python
class CloudLoadsFuture(Future):
    def result(self):
        result = super().result()
        if isinstance(result, bytes):
            result = cloudpickle.loads(result)
        return result
```

Then we apply a wrapper and subclass the subprocess executor to make sure everything is getting serialized before it gets seen by the underlying executor machinery:
```python
class CloudPickledCallable:
    def __init__(self, fnc: callable):
        self.fnc_serial = cloudpickle.dumps(fnc)

    def __call__(self, /, dumped_args, dumped_kwargs):
        fnc = cloudpickle.loads(self.fnc_serial)
        args = cloudpickle.loads(dumped_args)
        kwargs = cloudpickle.loads(dumped_kwargs)
        return cloudpickle.dumps(fnc(*args, **kwargs))

    @classmethod
    def dumps(cls, stuff):
        return cloudpickle.dumps(stuff)
```
```python
class CloudpickleProcessPoolExecutor(ProcessPoolExecutor):
    def submit(self, fn, /, *args, **kwargs):
        return self._submit(
            CloudPickledCallable(fn),
            CloudPickledCallable.dumps(args),
            CloudPickledCallable.dumps(kwargs),
        )

    def _submit(self, fn, /, *args, **kwargs):
        # This is just exactly ProcessPoolExecutor.submit, but we hack in
        #     f = CloudLoadsFuture()
        # instead of
        #     f = _base.Future()
        # for the future instantiation
        raise NotImplementedError
```

Of course this may still not do what I want, is probably not the most elegant solution, and I want to take another peek around to see if there is an existing tool that does this (there's at least a ...)
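For what it's worth, a minimal sketch of how the pieces above would fit together in use, assuming `_submit` were actually implemented (`make_closure` is an illustrative name, not from the PR):

```python
def make_closure():
    def fn(x):
        return x + 1
    return fn  # a nested function, which the stock pickle module refuses to serialize

# Once _submit is filled in, the round trip would look like:
#     with CloudpickleProcessPoolExecutor() as exe:
#         future = exe.submit(make_closure(), 2)  # fn, args, kwargs all cloudpickled
#         assert future.result() == 3             # CloudLoadsFuture unpickles the bytes
```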
UPDATE: Actually that plan of attack worked basically flawlessly (at least as far as my current tests can see), and is drafted over in #767.
Introduce a new executor submodule for our custom children of `concurrent.futures.Executor`. Once it's wrapped up in class form I'd like to stick @jan-janssen's hash-and-serialize executor here so I can start playing with it in the workflows. Jan, if I get to packaging it that way first (maybe tomorrow or Wednesday if life goes smoothly?) I'll open a draft PR right away so you can see what's going on. In the meantime, here is a little executor I found on Stack Overflow that exploits cloudpickle so that we can handle dynamically defined classes. I did a little MWE of how workflow node decorators are running and tested against that.
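A rough sketch of what such an MWE might look like (entirely a reconstruction; `as_node` and `plus_one` are hypothetical names, not the PR's actual test code): a decorator that builds a class at call time, so the decorated name no longer points at a plain, importable function.

```python
import pickle

def as_node(fnc):
    # Build a class dynamically, roughly the way workflow node decorators do.
    # The resulting class is not importable by its qualified name.
    return type("Node_" + fnc.__name__, (), {"run": staticmethod(fnc)})()

@as_node
def plus_one(x):
    return x + 1

plus_one.run(2)  # fine locally: returns 3
# pickle.dumps(plus_one.run) raises a PicklingError, since the module-level
# name "plus_one" now refers to the node instance, not the original function;
# cloudpickle serializes it by value instead, which is the whole point here.
```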