
get_cpus_per_task() assigns 1 to cpus_per_task when not specified #27

therealjpetereit opened this issue Jan 7, 2025 · 9 comments


@therealjpetereit

Hello and merry 2025 🎉🙌

I am using the excellent Snakemake workflow automation on our SLURM HPC cluster in Western Australia.

As with most HPC clusters, there are specific rules on which job submissions are allowed and which are not.

For us, there is a rule that if I request a GPU node, I cannot set cpus_per_task=x.

The problem I am encountering is that when I omit cpus_per_task=x from my config.yaml or snakefile resources, the slurm executor
plugin will automatically set cpus_per_task=1 as defined by the get_cpus_per_task() function.
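
To illustrate, the behaviour I am describing boils down to something like this (paraphrased from what I observe, not the plugin's actual code):

# paraphrase of the observed behaviour, not the plugin's real source
def get_cpus_per_task(job):
    # falls back to the rule's threads, and ultimately to 1,
    # so sbatch always receives an explicit --cpus-per-task
    cpus = job.resources.get("cpus_per_task") or job.threads or 1
    return max(1, cpus)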

Is there a way to let SLURM handle this instead of the plugin setting it to 1?

Let's say, if I set cpus_per_task='default', the plugin would simply omit setting a cpus_per_task value.

Cheers
Jakob

@therealjpetereit therealjpetereit changed the title get_cpus_per_task() assings 1 to cpus_per_task when not specified get_cpus_per_task() assigns 1 to cpus_per_task when not specified Jan 7, 2025
@cmeesters
Member

Thank you for bringing this issue to my attention. Interesting. Can you use --cpus-per-gpu instead?

I will polish the PR on GRES (see snakemake/snakemake-executor-plugin-slurm#173 ) in the coming days and wonder whether it would help to code "if a GPU is selected, set --cpus-per-gpu" instead. Of course, the idea is to provide a solution that works everywhere. It appears to be a rather futile task. My fellow admins show quite some imagination when deviating from standards, and SchedMD (the company behind SLURM) occasionally has funny ideas, too.

@therealjpetereit
Author

Hi,

"Can you use --cpus-per-gpu instead?"

That depends on how it's implemented. The way they use it in the cluster config is that cpus_per_task is red-flagged if the gpu partition is used. The system has set cpus_per_gpu=8 in our case, but that number is arbitrary and could be anything in the future.

So my recommendation would be to make cpus_per_task in the Snakemake scheduler accept a 'none' argument, so that the line where cpus_per_task is added to the sbatch command is omitted.
I saw that in one of the helper scripts.

I'll track that down and add it as an extra comment.

"the idea is to provide a solution that works everywhere"

here's hoping 😂😅

Cheers
J

@therealjpetereit
Author

Ok,

so I think if you look at lib/python3.12/site-packages/snakemake_executor_plugin_slurm/__init__.py

it has, in line 178:
call += f" --cpus-per-task={get_cpus_per_task(job)}"

That line could be guarded by an if statement that skips it when cpus_per_task='none'.
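
Just to sketch what I mean (untested, and the sentinel value is only a suggestion):

# sketch: skip the flag when the user explicitly opts out via cpus_per_task='none'
cpus_per_task = job.resources.get("cpus_per_task")
if cpus_per_task != "none":
    call += f" --cpus-per-task={get_cpus_per_task(job)}"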

Maybe you have a better idea, this is just what came to my mind.

Thanks for the prompt responses 🙌

@cmeesters
Member

My idea would be to still have a default; 1 CPU is fine for most GPU tasks. So, get_cpus_per_task(job) should be:

  1. renamed
  2. return --cpus-per-gpu=<value> if a GPU is selected
  3. else return --cpus-per-task=<value>

where <value> is equal to the threads entry of a resources section or the threads directive of a rule, and will take cpus_per_task or cpus_per_gpu as a fallback.
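
In rough pseudocode, it could look like this (the helper and resource names are only placeholders until the GRES PR settles them):

# sketch only; helper name and the gpu resource key are placeholders
def get_cpu_request_string(job):
    # threads of the rule, falling back to an explicit resource, else 1
    value = job.threads or job.resources.get("cpus_per_task", 1)
    if job.resources.get("gpu"):  # however the GPU/GRES request ends up being named
        return f" --cpus-per-gpu={value}"
    return f" --cpus-per-task={value}"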

Does this make sense to you? And would it work on your cluster?

The idea is to keep the workflow itself generic. If 8 is the upper value for now, that is a non-issue. You might be using a multi-GPU + multi-CPU application (in the future), where this proposed layout might come in handy. For now, it will not hurt, or will it?

BTW, to me there is no difference between --cpus-per-task and --cpus-per-gpu, because a GPU task is a process like any other with regard to scheduling (SLURM is rather ignorant of the offloading to GPUs). The difference might be imposed because otherwise admins would need to implement a plugin to deny CPU-only jobs in GPU partitions. But that is only my take.

@therealjpetereit
Author

therealjpetereit commented Jan 7, 2025

Hi,

I just tried whether --cpus-per-gpu works, but sadly it doesn't. I assume it still sets cpus_per_task in the background.

sbatch --export=ALL -A 'XXX-gpu' -p gpu -t 600 --mem 1000 --ntasks=1 --cpus-per-task=5 --gpus=1 test.sh
sbatch: error: lua: cli_filter: cannot explicitly request CPU resources for GPU allocation; each allocated GPU allocates 8 cores
sbatch: error: cli_filter plugin terminated with error

sbatch --export=ALL -A 'XXX-gpu' -p gpu -t 600 --mem 1000 --ntasks=1 --cpus-per-gpu=5 --gpus=1 test.sh
sbatch: error: lua: cli_filter: cannot explicitly request CPU resources for GPU allocation; each allocated GPU allocates 8 cores
sbatch: error: cli_filter plugin terminated with error

Your logic is sound; sadly, the coding of the cli_filter doesn't allow it in my specific case.
Our cluster basically takes the --gpus=x and then sets the value for cpus itself, which collides with setting any cpus explicitly.

I did tell the HPC staff that this is an issue, but alas I haven't heard back from them.

@cmeesters
Member

"cannot explicitly request CPU resources for GPU allocation; each allocated GPU allocates 8 cores"

Ah, this is what you meant. My bad. It still doesn't make sense, though. Even if you get assigned n CPUs, you should not be able to oversubscribe, but why forbid explicitly underusing a resource that otherwise does not get used anyway? (Don't tell me that a significant fraction of GPU applications will use exactly 8 CPUs! ;-)

It appears your admin team wrote a filter script. Now, I could change point 2 in my list to simply return no string if only one CPU is selected or none at all. Either way, I do not want users to have to write None explicitly.
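
Roughly, and again just a sketch rather than the final code:

# sketch: emit no CPU flag at all for GPU jobs with at most one CPU requested
def get_cpu_request_string(job):
    value = job.threads or job.resources.get("cpus_per_task", 1)
    if job.resources.get("gpu"):  # placeholder for the eventual GPU resource name
        return "" if value <= 1 else f" --cpus-per-gpu={value}"
    return f" --cpus-per-task={value}"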

@therealjpetereit
Author

Hi,

Yes, you are correct; it's a bit of a silly filter.

That solution should work for me; maybe also return no string if cpus are set to zero?
Anyway, I am happy with any solution =)

Cheers
J

@therealjpetereit
Author

Hi again,

I just came across the same situation for memory. Once you have a solution for cpus_per_task, would you mind applying the same to --mem?

Cheers
J

@cmeesters
Member

I'm nearing completion. Hopefully, a fix to all our liking will be released next week.

Yet, I do not understand your last comment. We have:

if job.resources.get("mem_mb_per_cpu"):
      call += f" --mem-per-cpu {job.resources.mem_mb_per_cpu}"
elif job.resources.get("mem_mb"):
      call += f" --mem {job.resources.mem_mb}"
else:
      self.logger.warning(
          "No job memory information ('mem_mb' or 'mem_mb_per_cpu') is given "
          "- submitting without. This might or might not work on your cluster."
      )

(Hopefully, I got the indentation right - see lines 235 et seqq.; it ought to be correct there.)

Anyway, there is no default setting to be omitted. These parameters are just set if required by a user.

Cheers
Christian
