Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch configuration executor List type to Iterable #2943

Merged
merged 6 commits into from
Nov 10, 2023

Conversation

benclifford
Copy link
Collaborator

@benclifford benclifford commented Nov 1, 2023

The list of executors in configuration is intended to be read only. Using Iterable for this makes that more explicit, while also allowing other structures that make sense such as tuple or set.

This is part of a move towards using more constrainted sequence types (for example Sequence) rather than List in many places (although the runtime type remains a list) because these can have better behaviour with subclasses of its parameter type: sequences and iterables are covariant, rather than invariant.

This PR rearranges the validation logic a bit so that there is no longer an arbitrary property setter for the executor property. It was always a bug to replace the executor list in a config once Parsl has been configured, and removing this property setter enforces that more strongly.

Type of change

  • Code maintentance/cleanup

The list of executors in configuration is intended to be read only.
Using Sequence for this makes that more explicit, as Sequences cannot
be modified.

This is part of a move towards using Sequence rather than List in
many places (although the runtime type remains a list) because Sequence
has better behaviour with subclasses of its parameter type: it is
covariant, rather than invariant.

This PR rearranges the validation logic a bit so that there is no
longer an arbitrary property setter for the executor property - it was
always a bug to replace the executor list in a config once parsl has
been configured, and removing this property setter enforces that more
strongly.
Copy link
Collaborator

@khk-globus khk-globus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Och; one of those seemingly tiny details that is really helpful to have available when in the thick of it. Much appreciated.

parsl/config.py Outdated
Comment on lines 95 to 100
self._executors: Sequence[ParslExecutor]
if executors is None:
executors = [ThreadPoolExecutor()]
self.executors = executors
self._executors = [ThreadPoolExecutor()]
else:
self._validate_executors(executors)
self._executors = executors
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistically, you might consider removing the if-condition for less mental load while reading. Perhaps:

if executors is None:
    executors = [ThreadPoolExecutor()]
assert executors is not None  # possibly necessary for mypy -- didn't run this ...
self._validate_executors(executors)
self._executors = executors

Now there's a single point of authority for the typing (from __init__'s signature), a single assignment, and it's clear that self._executors can't be None.

Alternatively, is it an error if executors was specified as an empty sequence? Could use a naked truthy check to cover both:

if not executors:
    executors = [ThreadPoolExecutor()]
# or possibly:
# executors = executors or [ThreadPoolExecutor()]

Copy link
Collaborator

@khk-globus khk-globus Nov 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... it was always a bug to replace the executor list in a config once parsl has been configured, ...

you might consider also making a copy of what's passed in. That is, nominally we don't want a user to change (accidentally) the list they passed in, right? Perhaps:

def __init__(
    self,
        # note: Sequence -> Iterable
    executors: Optional[Iterable[ParslExecutor]] = None,
    ...
):
    executors = tuple(executors or [])
    if not executors:
        executors = (ThreadPoolExecutor(),)
    self._validate_executors(executors)
    self._executors = executors
    ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bbc3dda broadly makes the changes you suggested.

I think there are some weird corner cases though because of the interaction between being a bool and being an iterable:

in the presence of an executors structure that is boolean false, it won't be iterated, and a default will be selected.

in the presence of an executors structure that is boolean true but iterates to empty, a default will not be selected and a configuration error for "no executors!" will be raised.

I don't think anything forbids that second case from happening (but i'm not entirely familar with the specifications for bool vs iterable). But it's also something I'm not too fussed about - the "obvious" structures a user might use here of tuple, list and set have the right iteration vs bool properties at least.

parsl/config.py Outdated Show resolved Hide resolved
I think this is probably the right thing to do.

Breaking change:

There's a corner case where you would only use the implicit
_parsl_internal executor:
eg. if you were only doing join_apps and file staging, and
if you were co-ordinating non-Parsl futures and not running any
python_ or bash_ apps. But I think that is sufficiently obscure, this
check is the right thing to do and if anyone complains, it can be
removed again.
This is a supertype of the previous-commit Sequence, itself a
supertype of List that is current in master.

It subtly emphasises that the collection of iterators will be read
over once by Parsl and not touched again.

This drives a change in the validation call to validate after copying
into an internal structure.
@benclifford benclifford changed the title Switch configuration executor List type to Sequence Switch configuration executor List type to Iterable Nov 8, 2023
Per a Slack conversation, the behavior to preserve is "_if you don't specify
any executors, you get a default local thread one_".  Accordingly, concretize
the iterator if it's not `None`, and then act accordingly.
@benclifford benclifford merged commit 896514f into master Nov 10, 2023
4 checks passed
@benclifford benclifford deleted the benc-mypy-config-executors-list branch November 10, 2023 12:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants