Over in #45, @CraftComputing ran into some errors running this playbook on an Intel Granite Rapids Xeon 6980P system.
That system has 2x Intel Xeon 6980P 128-Core / 256-Thread CPUs, for a total of 256 cores and 512 threads when Hyperthreading is enabled.
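The current code that generates the `cluster-hosts` file uses `ansible_processor_vcpus` to indicate the number of slots available on the machine, which in this case would output `512`, one slot per thread rather than per core (see `top500-benchmark/templates/mpi-node-config.j2`, line 2 in 53ae3f3).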
There are two things I could do to resolve the issue:
1. Drop the `slots` part out of the `cluster-hosts` file entirely, and rely on `mpirun` choosing the correct number of cores.
2. Switch to using `ansible_processor_cores` or maybe `ansible_processor_nproc`?
The latter option could also be complicated on multi-CPU systems, because I might need to do some math to generate the number of cores correctly...
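For what it's worth, the math may turn out to be small. Here is a rough Python sketch of the arithmetic, assuming `ansible_processor_count` reports the number of sockets, `ansible_processor_cores` the cores per socket, and `ansible_processor_threads_per_core` the threads per core. That matches how these facts usually look on Linux x86 hosts, but I haven't verified it on every architecture. The function names and the sample fact values are hypothetical, filled in from the 2x 6980P description above rather than from a real fact dump.

```python
def physical_cores(facts: dict) -> int:
    """Sockets times cores per socket, ignoring Hyperthreading."""
    return int(facts["ansible_processor_count"]) * int(facts["ansible_processor_cores"])


def logical_cpus(facts: dict) -> int:
    """Physical cores times threads per core."""
    return physical_cores(facts) * int(facts["ansible_processor_threads_per_core"])


# Assumed fact values for a 2x Xeon 6980P box with Hyperthreading enabled.
xeon_6980p_x2 = {
    "ansible_processor_count": 2,             # sockets
    "ansible_processor_cores": 128,           # cores per socket
    "ansible_processor_threads_per_core": 2,  # Hyperthreading on
}

if __name__ == "__main__":
    print(physical_cores(xeon_6980p_x2))  # 256
    print(logical_cpus(xeon_6980p_x2))    # 512
```

If those assumptions hold, the physical core count is just the product of two facts, and the 512 figure is that product times the threads per core.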
In the end, the simplest thing may be to document that "if you have Hyperthreading enabled, override variable XYZ to specify the total number of cores on your system"... something like that. And then use a variable that defaults to `ansible_processor_vcpus` but can be overridden.
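To make the override idea concrete, here is a minimal Python sketch of the fallback logic. The variable name `mpi_slots_override` is a placeholder I made up for "variable XYZ", and the `host slots=N` line format is an assumption based on the `slots` entry discussed above, not a copy of the actual template.

```python
def slots_for_host(hostvars: dict) -> int:
    """Use the user-supplied override if present, else fall back to vcpus."""
    override = hostvars.get("mpi_slots_override")  # hypothetical variable name
    if override is not None:
        return int(override)
    return int(hostvars["ansible_processor_vcpus"])


def hostfile_line(host: str, hostvars: dict) -> str:
    """Build one line of the cluster hosts file (assumed 'host slots=N' format)."""
    return f"{host} slots={slots_for_host(hostvars)}"


if __name__ == "__main__":
    # Default behaviour: Hyperthreading inflates the slot count.
    print(hostfile_line("node1", {"ansible_processor_vcpus": 512}))
    # With the override set to the physical core count.
    print(hostfile_line("node1", {"ansible_processor_vcpus": 512,
                                  "mpi_slots_override": 256}))
```

In the Jinja2 template itself, the same fallback could probably be expressed with the built-in `default` filter, so systems without Hyperthreading would keep working with no extra configuration.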