This repository has been archived by the owner on Aug 9, 2024. It is now read-only.

Commit

feat!: overhaul slurmd charm API
Merge pull request #34 from jamesbeedy/slurm_config_editor_preparation

Summary of Changes:

- improve the way slurmd sends slurmctld its partition and node parameter options
- add an action, `node-config`, to get and set unit-level node configuration
- add `partition-config` charm configuration that allows an operator to set partition configuration
- consolidate the yaml files into charmcraft.yaml
- remove unused code
- remove slurm-ops-manager
- replace the nhc resource with an nhc download during the build process in charmcraft.yaml
- remove dependencies on slurmdbd

BREAKING CHANGES: No longer compatible with the old Slurm charm API. Charms will need to be tested locally as the charms in the edge branch on CharmHub are not compatible with the new API version.
NucciTheBoss authored Jun 28, 2024
2 parents 39bbea5 + ac2edba commit 70a61a9
Showing 30 changed files with 2,736 additions and 1,205 deletions.
20 changes: 16 additions & 4 deletions .github/workflows/ci.yaml
@@ -23,7 +23,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: woke
uses: get-woke/woke-action@v0
with:
@@ -35,18 +35,29 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: Checkout
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Install dependencies
run: python3 -m pip install tox
- name: Run linters
run: tox -e lint

type:
name: Type check with pyright
runs-on: ubuntu-22.04
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install dependencies
run: python3 -m pip install tox
- name: Run pyright
run: tox -e type

unit-test:
name: Unit tests
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Install dependencies
run: python3 -m pip install tox
- name: Run tests
@@ -63,10 +74,11 @@ jobs:
needs:
- inclusive-naming-check
- lint
- type
- unit-test
steps:
- name: Checkout
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Setup operator environment
uses: charmed-kubernetes/actions-operator@main
with:
4 changes: 3 additions & 1 deletion .gitignore
@@ -7,4 +7,6 @@ __pycache__/
*.py[cod]
.idea
.vscode/
version

# Disable woke checking for nhc.conf.tmpl
src/templates/nhc.conf.tmpl
34 changes: 27 additions & 7 deletions README.md
@@ -24,13 +24,33 @@ This operator should be used with Juju 3.x or greater.
```shell
$ juju deploy slurmctld --channel edge
$ juju deploy slurmd --channel edge
$ juju deploy slurmdbd --channel edge
$ juju deploy mysql --channel 8.0/edge
$ juju deploy mysql-router slurmdbd-mysql-router --channel dpe/edge
$ juju integrate slurmctld:slurmd slurmd:slurmd
$ juju integrate slurmdbd-mysql-router:backend-database mysql:database
$ juju integrate slurmdbd:database slurmdbd-mysql-router:database
$ juju integrate slurmctld:slurmdbd slurmdbd:slurmdbd
$ juju integrate slurmctld:slurmd slurmd:slurmctld
```

### Operations
This charm hardens and simplifies operations by codifying common administration operations as charm actions.

#### Partition Configuration
Specify partition parameters using the `partition-config` charm configuration option.

##### Use the `partition-config` option to set custom partition parameters.
```bash
$ juju config slurmd partition-config="State=INACTIVE"
```
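
The `partition-config` value is a single line of space-separated `key=value` pairs. As a rough sketch of how such a line could be split and sanity-checked before it reaches `slurm.conf`, consider the helper below. The name `parse_partition_config` is hypothetical and is not taken from the charm's source; the charm itself may parse the value differently.

```python
def parse_partition_config(value: str) -> dict[str, str]:
    """Split a space-separated `key=value` line into a dictionary.

    Hypothetical helper for illustration only.
    """
    params: dict[str, str] = {}
    for token in value.split():
        key, sep, val = token.partition("=")
        # Reject tokens such as "State" or "=INACTIVE" that are not key=value.
        if not sep or not key or not val:
            raise ValueError(f"expected key=value, got {token!r}")
        params[key] = val
    return params


print(parse_partition_config("DefaultTime=45:00 MaxTime=1:00:00"))
```

Rejecting malformed tokens early is a design choice of this sketch: it surfaces a typo at configuration time rather than when Slurm reads the rendered file.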

#### Node Configuration Parameters
You can get and set the node configuration using the `node-config` action.

##### Use the `node-config` action to get the node configuration for the unit.
```bash
$ juju run --quiet slurmd/0 node-config --format json | jq ".[].results.node.config"
"NodeName=juju-462521-4 NodeAddr=10.240.222.28 State=UNKNOWN RealMemory=64012 CPUs=12 ThreadsPerCore=2 CoresPerSocket=6 SocketsPerBoard=1"
```

##### Use the `node-config` action to set a custom weight value for the node.
```bash
$ juju run --quiet slurmd/0 node-config parameters="Weight=5000" --format json | jq ".[].results.node.config"
"NodeName=juju-462521-4 NodeAddr=10.240.222.28 State=UNKNOWN RealMemory=64012 CPUs=12 ThreadsPerCore=2 CoresPerSocket=6 SocketsPerBoard=1 Weight=5000"
```
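
The two `node-config` invocations above suggest that custom parameters are layered on top of the node configuration the unit already reports. A minimal sketch of that layering follows, assuming later `key=value` pairs simply replace earlier ones with the same key. The function `merge_node_config` is hypothetical; how the charm actually merges parameters is not shown in this commit excerpt.

```python
def merge_node_config(current: str, overrides: str) -> str:
    """Return `current` with `overrides` applied, as one key=value line.

    Hypothetical helper for illustration only.
    """
    merged: dict[str, str] = {}
    for line in (current, overrides):
        for token in line.split():
            key, _, value = token.partition("=")
            # A key seen again keeps its position but takes the newer value.
            merged[key] = value
    return " ".join(f"{key}={value}" for key, value in merged.items())


current = (
    "NodeName=juju-462521-4 NodeAddr=10.240.222.28 State=UNKNOWN "
    "RealMemory=64012 CPUs=12 ThreadsPerCore=2 CoresPerSocket=6 SocketsPerBoard=1"
)
print(merge_node_config(current, "Weight=5000"))
```

With the sample values from the README, the printed line is the original configuration followed by `Weight=5000`.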

## Project & Community
15 changes: 0 additions & 15 deletions actions.yaml

This file was deleted.

108 changes: 88 additions & 20 deletions charmcraft.yaml
@@ -1,7 +1,29 @@
# Copyright 2020 Omnivector, LLC
# See LICENSE file for licensing details.

name: slurmd
type: charm

summary: |
Slurmd, the compute node daemon of Slurm.
description: |
This charm provides slurmd, munged, and the bindings to other utilities
that make lifecycle operations a breeze.
slurmd is the compute node daemon of SLURM. It monitors all tasks running
on the compute node, accepts work (tasks), launches tasks, and kills
running tasks upon request.
links:
contact: https://matrix.to/#/#hpc:ubuntu.com

issues:
- https://github.com/charmed-hpc/slurmd-operator/issues

source:
- https://github.com/charmed-hpc/slurmd-operator

assumes:
- juju

bases:
- build-on:
- name: ubuntu
@@ -10,25 +32,71 @@ bases:
- name: ubuntu
channel: "22.04"
architectures: [amd64]
- name: centos
channel: "7"
architectures: [amd64]

parts:
charm:
build-packages: [git]
charm-python-packages: [setuptools]

# Create a version file and pack it into the charm. This is dynamically generated
# as part of the build process for a charm to ensure that the git revision of the
# charm is always recorded in this version file.
version-file:
plugin: nil
build-packages:
- git
- wget
override-build: |
VERSION=$(git -C $CRAFT_PART_SRC/../../charm/src describe --dirty --always)
echo "Setting version to $VERSION"
echo $VERSION > $CRAFT_PART_INSTALL/version
stage:
- version
wget https://github.com/mej/nhc/releases/download/1.4.3/lbnl-nhc-1.4.3.tar.gz
craftctl default
provides:
slurmctld:
interface: slurmd
limit: 1

config:
options:
partition-config:
type: string
default: ""
description: >
Additional partition configuration parameters, specified as space-separated `key=value` pairs
on a single line. Find a list of all possible partition configuration parameters
[here](https://slurm.schedmd.com/slurm.conf.html#SECTION_PARTITION-CONFIGURATION).
Example usage:
```bash
$ juju config slurmd partition-config="DefaultTime=45:00 MaxTime=1:00:00"
```

nhc-conf:
default: ""
type: string
description: >
Multiline string.
These lines are appended to the `nhc.conf` maintained by the charm.
Example usage:
```bash
$ juju config slurmd nhc-conf="$(cat extra-nhc.conf)"
```
actions:
node-configured:
description: Remove a node from DownNodes when the reason is `New node`.

node-config:
description: >
Set or return node configuration parameters.
To get the current node configuration for this unit:
```bash
$ juju run slurmd/0 node-config
```
To set node level configuration parameters for the unit `slurmd/0`:
```bash
$ juju run slurmd/0 node-config parameters="Weight=200 Gres=gpu:tesla:1,gpu:kepler:1,bandwidth:lustre:no_consume:4G"
```
params:
parameters:
type: string
description: >
Node configuration parameter as defined [here](https://slurm.schedmd.com/slurm.conf.html#SECTION_NODE-CONFIGURATION).
show-nhc-config:
description: Display `nhc.conf`.
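
The `nhc-conf` option above is described as a multiline string whose lines are appended to the `nhc.conf` maintained by the charm. A minimal sketch of that append step is shown below. The function `render_nhc_conf` and the sample check lines are illustrative assumptions, not the charm's actual template or code.

```python
def render_nhc_conf(base: str, extra: str) -> str:
    """Append operator-supplied lines to the base nhc.conf content.

    Hypothetical helper for illustration only.
    """
    base = base.rstrip("\n")
    extra = extra.strip("\n")
    if not extra:
        # Nothing to append: return the base content with one trailing newline.
        return base + "\n"
    return f"{base}\n{extra}\n"


# Sample content; the real base file comes from the charm's template.
base_conf = "* || check_fs_mount_rw -f /\n"
extra_conf = "* || check_hw_physmem_free 1MB"
print(render_nhc_conf(base_conf, extra_conf), end="")
```

Normalizing trailing newlines keeps the rendered file stable whether or not the operator's value ends with a newline.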
40 changes: 0 additions & 40 deletions config.yaml

This file was deleted.

41 changes: 4 additions & 37 deletions dispatch
@@ -1,44 +1,11 @@
#!/bin/bash
# This hook installs the dependencies needed to run the charm,
# creates the dispatch executable, regenerates the symlinks for start and
# upgrade-charm, and kicks off the operator framework.

set -e

# Source the os-release information into the env
. /etc/os-release

if ! [[ -f '.installed' ]]
then
if [[ $ID == 'centos' ]]
then
# Install dependencies and build custom python
yum -y install epel-release
yum -y install wget gcc make tar bzip2-devel zlib-devel xz-devel openssl-devel libffi-devel sqlite-devel ncurses-devel

export PYTHON_VERSION=3.8.16
wget https://www.python.org/ftp/python/${PYTHON_VERSION}/Python-${PYTHON_VERSION}.tar.xz -P /tmp
tar xvf /tmp/Python-${PYTHON_VERSION}.tar.xz -C /tmp
cd /tmp/Python-${PYTHON_VERSION}
./configure --enable-optimizations
make -C /tmp/Python-${PYTHON_VERSION} -j $(nproc) altinstall
cd $OLDPWD
rm -rf /tmp/Python*

elif [[ $ID == 'ubuntu' ]]
then
# Necessary to compile and install NHC
apt-get install --assume-yes make
fi
touch .installed
fi

# set the correct python bin path
if [[ $ID == "centos" ]]
then
PYTHON_BIN="/usr/bin/env python3.8"
else
PYTHON_BIN="/usr/bin/env python3"
# Necessary to compile and install NHC
apt-get install --assume-yes make
touch .installed
fi

JUJU_DISPATCH_PATH="${JUJU_DISPATCH_PATH:-$0}" PYTHONPATH=lib:venv $PYTHON_BIN ./src/charm.py
JUJU_DISPATCH_PATH="${JUJU_DISPATCH_PATH:-$0}" PYTHONPATH=lib:venv /usr/bin/env python3 ./src/charm.py