Failure to parse workflow config when setting slurm_extra based on attempt #208

Open
blaiseli opened this issue Jan 27, 2025 · 3 comments

Comments

@blaiseli

Software Versions

$ snakemake --version
8.27.1
$ pip list | grep "snakemake-executor-plugin-slurm"
snakemake-executor-plugin-slurm           0.15.0
snakemake-executor-plugin-slurm-jobstep   0.2.1
$ sinfo --version
slurm 23.02.6

Describe the bug

When I try to set resources using an <v1> if attempt == 1 else <v2> expression in the workflow config.yaml, this works with integers, but not with strings.

Example:

set-resources:
    my_rule:
        mem_mb: 16384 if attempt == 1 else (32768 if attempt == 2 else (49152 if attempt == 3 else 65536))
        runtime: 119 if attempt == 1 else 1439
        cpus_per_task: 32 if attempt == 1 else 12
        # This fails
        slurm_extra: "'--qos=fast'" if attempt == 1 else "'--qos=normal'"
        # This works:
        # slurm_extra: "'--qos=fast'"

Logs

When trying to set up slurm_extra as above, I get the following kind of error:

slurm_script: error: Couldn't parse config file: while parsing a block mapping
  in "/pasteur/appa/homes/bli/src/nanopore_assembly/src/workflow/profile/config.yaml", line 82, column 9
expected <block end>, but found '<scalar>'
  in "/pasteur/appa/homes/bli/src/nanopore_assembly/src/workflow/profile/config.yaml", line 86, column 37
@blaiseli
Author

blaiseli commented Jan 27, 2025

The following syntax seems to be parsable:

slurm_extra: '( "--qos=fast" if attempt == 1 else "--qos=normal" )'
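A likely reason this form parses: the whole value is one single-quoted YAML scalar, so the parser never meets extra text after a closing quote, which is what the "expected <block end>, but found '<scalar>'" message points at in the failing version. The integer examples are plain unquoted scalars and reach the workflow as strings in the same way. Assuming the string is then evaluated as a Python expression with attempt in scope (an assumption about Snakemake's behaviour, not something confirmed in this thread), the effect can be sketched as below. evaluate_resource is a hypothetical helper, not part of Snakemake or the plugin.

```python
# Sketch only: mimics evaluating a set-resources string as a Python
# expression with `attempt` in scope. This is NOT Snakemake's actual code.

def evaluate_resource(expression: str, attempt: int):
    """Evaluate a resource expression for one attempt (hypothetical helper)."""
    # Builtins are disabled so that only the names passed in are visible.
    return eval(expression, {"__builtins__": {}}, {"attempt": attempt})

# The strings the YAML parser would hand over for the workaround above
# and for the mem_mb entry of the original example.
slurm_extra = '( "--qos=fast" if attempt == 1 else "--qos=normal" )'
mem_mb = "16384 if attempt == 1 else (32768 if attempt == 2 else (49152 if attempt == 3 else 65536))"

for attempt in (1, 2, 3, 4):
    print(attempt, evaluate_resource(slurm_extra, attempt), evaluate_resource(mem_mb, attempt))
```

Under these assumptions, attempt 1 yields --qos=fast and every later attempt yields --qos=normal.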

@cmeesters
Member

Ah, thank you for this feedback. I will update the docs accordingly - and perhaps open a PR to allow direct qos settings.

May I ask: this qos selection seems like something you ought to place in different partitions rather than choosing qos features. Can you offer a documentation link? I am trying to make the plugin as generic as possible. It's hard. My fellow admins have quite some imagination when it comes to (mis)configuring SLURM.

@blaiseli
Author

May I ask: this qos selection seems like something you ought to place in different partitions rather than choosing qos features. Can you offer a documentation link?

I'm not sure I understand your question.

The documentation of our institution's cluster has restricted access, and I'm not even sure where to find what... Here are some explanations:

On our institution's cluster, different partitions have access to different sets of QOS. A given QOS allows a certain maximum runtime. So if I notice that a job tends to time out, I ask for a longer runtime on later attempts, and I need to adjust the QOS accordingly using slurm_extra.

On top of that, a given user has access to a certain set of partitions. I write snakemake workflows that may / will have to be run by other users, with a more restricted set of partitions than me.

Concretely, I run some of my tests using the "common" partition, because I know that those users will have access to it. And the "common" partition has access to "ultrafast", "fast", and "normal" QOSes, on which jobs can request up to 5 minutes, 2 hours, and 1 day respectively.

One of my rules tends to run in about 2 hours, sometimes less, sometimes more, so I make a first attempt with 119 minutes on QOS fast, then 1439 minutes on QOS normal (I reserve 1 minute so that the rule can be grouped with another short-running downstream one). At the same time, I also adjust memory, because the job sometimes gets OOM-killed (possibly depending on the size of the input data), and I try to reduce the time spent in the queue by reducing the number of requested processors. Hence the weird workflow profile...
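The retry logic described here can be sketched in Python as a mapping from attempt number to runtime and QOS. The QOS names and limits come from the description above (5 minutes, 2 hours, 1 day); the function names are hypothetical and are not part of Snakemake or the SLURM plugin.

```python
# Sketch only: QOS names and limits are taken from the description above;
# the function names are hypothetical helpers.

# Maximum runtime in minutes allowed by each QOS on the "common" partition.
QOS_LIMITS_MIN = {"ultrafast": 5, "fast": 120, "normal": 1440}

def runtime_for(attempt: int) -> int:
    """Runtime in minutes: 119 on the first attempt, 1439 afterwards."""
    return 119 if attempt == 1 else 1439

def qos_for(runtime_min: int) -> str:
    """Return the QOS with the smallest limit that still allows the runtime."""
    for name, limit in sorted(QOS_LIMITS_MIN.items(), key=lambda item: item[1]):
        if runtime_min <= limit:
            return name
    raise ValueError(f"no QOS allows {runtime_min} minutes")

def slurm_extra_for(attempt: int) -> str:
    """Build the slurm_extra string matching the runtime of this attempt."""
    return f"--qos={qos_for(runtime_for(attempt))}"

for attempt in (1, 2):
    print(attempt, runtime_for(attempt), slurm_extra_for(attempt))
```

Deriving the QOS from the requested runtime keeps the two values from drifting apart. Snakemake accepts callables taking wildcards and attempt as resource values in a Snakefile, so functions like these could presumably be wired in there instead of the profile; whether slurm_extra needs additional quoting in that case is not covered by this thread.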
