Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace attribute filters rather than merging them #2155

Open
tcompa opened this issue Dec 19, 2024 · 6 comments · Fixed by #2168
Open

Replace attribute filters rather than merging them #2155

tcompa opened this issue Dec 19, 2024 · 6 comments · Fixed by #2168
Labels
flexibility Support more workflow-execution use cases runner

Comments

@tcompa
Copy link
Collaborator

tcompa commented Dec 19, 2024

In general, merging different filters may be complex - especially so in the upcoming version where attributes are associated to a list of possible values for a given key (e.g. well equal to B03 or B04 or B05).

Example: what does it mean to "merge" filters well=[B03, B04, B05] and well=[B03, B06]? What is the expected outcome?
We can clearly find the most reasonable answer, and implement it.

This would be even more complex if we have both include and exclude (which we actually propose to not pursue - ref #2154).

Note that this definition is strictly necessary for the interactive definition of job filters, as in fractal-analytics-platform/fractal-web#682 (comment). In this modal we have to create the "patch" filters which would produce the expected list of images when combined with existing filters.

With @mfranzon, we suggest to take a more clear direction - and state that attribute filters are never merged, but only replaced by highest-priority values.

Example:
old_attribute_filter = {well=[B03, B04, B05], anotherkey: [value1, value2]}
new_attribute_filter = {well=[B03, B06]}
resulting_filter = {well=[B03, B06], anotherkey: [value1, value2]}

Note: none of this applies to type filters, because there is no ambiguity in merging key=True with a higher-priority key=False. It's already a replacement. This is connected to our proposal to also split attributes and types into different db/API field - ref #2153.

@tcompa
Copy link
Collaborator Author

tcompa commented Dec 19, 2024

Side comment: the choice of replacing (rather than merging) this kind of filters also makes it much easier to look at a job and immediately understand which filters where applied, because it does not require the user to be aware of the underlying merging rules (which are far from intuitive).

@jluethi
Copy link
Collaborator

jluethi commented Dec 19, 2024

I could see something like this. For the UI: As long as we pre-fill the dataset attribute filters, we can just take the final selection in the run modal as the needed filters without having to combine them again

@jluethi
Copy link
Collaborator

jluethi commented Dec 19, 2024

Job attribute filters always take priority over dataset attribute filters. The dataset attribute filters are replaced by the job filters.

What if a task sets an output attribute filter? Do we even allow that? We probably should not


Types: First set by datasets type filters. Overwritten by workflow task input type filter. Workflow task output can set new types that overwrite both before.

@jluethi
Copy link
Collaborator

jluethi commented Dec 19, 2024

Side discussion:
How will we handle: I ran 5 wells until task 3. Continue task 4 only in the wells that are "ready":
Option 1: We do this status based somehow => details TBD
Option 2: We write the attribute filters back to the dataset when they are set by a job

This is about setting smart defaults, not stopping users from setting other filters.

@tcompa
Copy link
Collaborator Author

tcompa commented Jan 15, 2025

As of this morning call:

Let's make sure we only use job.attribute_filters within the runner, and let's make sure that the two clients make it easy to use dataset.attribute_filters as a default value for job.attribute_filters upon submission.

@tcompa
Copy link
Collaborator Author

tcompa commented Jan 15, 2025

As of this morning call:

Let's make sure we only use job.attribute_filters within the runner, and let's make sure that the two clients make it easy to use dataset.attribute_filters as a default value for job.attribute_filters upon submission.

Ref 87f4fc6

This was referenced Jan 15, 2025
@tcompa tcompa linked a pull request Jan 15, 2025 that will close this issue
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flexibility Support more workflow-execution use cases runner
Projects
Development

Successfully merging a pull request may close this issue.

2 participants