-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch configuration executor List type to Iterable #2943
Conversation
The list of executors in configuration is intended to be read only. Using Sequence for this makes that more explicit, as Sequences cannot be modified. This is part of a move towards using Sequence rather than List in many places (although the runtime type remains a list) because Sequence has better behaviour with subclasses of its parameter type: it is covariant, rather than invariant. This PR rearranges the validation logic a bit so that there is no longer an arbitrary property setter for the executor property - it was always a bug to replace the executor list in a config once parsl has been configured, and removing this property setter enforces that more strongly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Och; one of those seemingly tiny details that is really helpful to have available when in the thick of it. Much appreciated.
parsl/config.py
Outdated
self._executors: Sequence[ParslExecutor] | ||
if executors is None: | ||
executors = [ThreadPoolExecutor()] | ||
self.executors = executors | ||
self._executors = [ThreadPoolExecutor()] | ||
else: | ||
self._validate_executors(executors) | ||
self._executors = executors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stylistically, you might consider removing the if-condition for less mental load while reading. Perhaps:
if executors is None:
executors = [ThreadPoolExecutor()]
assert executors is not None # possibly necessary for mypy -- didn't run this ...
self._validate_executors(executors)
self._executors = executors
Now there's a single point of authority for the typing (from __init__
's signature), a single assignment, and it's clear that self._executors
can't be None
.
Alternatively, is it an error if executors
was specified as an empty sequence? Could use a naked truthy check to cover both:
if not executors:
executors = [ThreadPoolExecutor()]
# or possibly:
# executors = executors or [ThreadPoolExecutor()]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... it was always a bug to replace the executor list in a config once parsl has been configured, ...
you might consider also making a copy of what's passed in. That is, nominally we don't want a user to change (accidentally) the list they passed in, right? Perhaps:
def __init__(
self,
# note: Sequence -> Iterable
executors: Optional[Iterable[ParslExecutor]] = None,
...
):
executors = tuple(executors or [])
if not executors:
executors = (ThreadPoolExecutor(),)
self._validate_executors(executors)
self._executors = executors
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bbc3dda broadly makes the changes you suggested.
I think there are some weird corner cases though because of the interaction between being a bool and being an iterable:
in the presence of an executors structure that is boolean false, it won't be iterated, and a default will be selected.
in the presence of an executors structure that is boolean true but iterates to empty, a default will not be selected and a configuration error for "no executors!" will be raised.
I don't think anything forbids that second case from happening (but i'm not entirely familar with the specifications for bool vs iterable). But it's also something I'm not too fussed about - the "obvious" structures a user might use here of tuple, list and set have the right iteration vs bool properties at least.
I think this is probably the right thing to do. Breaking change: There's a corner case where you would only use the implicit _parsl_internal executor: eg. if you were only doing join_apps and file staging, and if you were co-ordinating non-Parsl futures and not running any python_ or bash_ apps. But I think that is sufficiently obscure, this check is the right thing to do and if anyone complains, it can be removed again.
This is a supertype of the previous-commit Sequence, itself a supertype of List that is current in master. It subtly emphasises that the collection of iterators will be read over once by Parsl and not touched again. This drives a change in the validation call to validate after copying into an internal structure.
Per a Slack conversation, the behavior to preserve is "_if you don't specify any executors, you get a default local thread one_". Accordingly, concretize the iterator if it's not `None`, and then act accordingly.
The list of executors in configuration is intended to be read only. Using Iterable for this makes that more explicit, while also allowing other structures that make sense such as tuple or set.
This is part of a move towards using more constrainted sequence types (for example
Sequence
) rather than List in many places (although the runtime type remains a list) because these can have better behaviour with subclasses of its parameter type: sequences and iterables are covariant, rather than invariant.This PR rearranges the validation logic a bit so that there is no longer an arbitrary property setter for the executor property. It was always a bug to replace the executor list in a config once Parsl has been configured, and removing this property setter enforces that more strongly.
Type of change