From b59be9d4ebb23209fdc0cc9ee1ca5e0f66d97d1e Mon Sep 17 00:00:00 2001
From: Dean Roehrich
Date: Fri, 5 Jul 2024 10:10:59 -0500
Subject: [PATCH 1/3] Add status.requiredDaemons to DirectiveBreakdown

Signed-off-by: Dean Roehrich
---
 docs/guides/directive-breakdown/readme.md | 30 +++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/docs/guides/directive-breakdown/readme.md b/docs/guides/directive-breakdown/readme.md
index fff73f2..3d193c5 100644
--- a/docs/guides/directive-breakdown/readme.md
+++ b/docs/guides/directive-breakdown/readme.md
@@ -149,3 +149,33 @@ A location constraint consists of an `access` list and a `reference`.
 * `status.compute.constraints.location.access` is a list that specifies what type of access the compute nodes need to have to the storage allocations in the allocation set. An allocation set may have multiple access types that are required
 * `status.compute.constraints.location.access.type` specifies the connection type for the storage. This can be `network` or `physical`
 * `status.compute.constraints.location.access.priority` specifies how necessary the connection type is. This can be `mandatory` or `bestEffort`
+
+## RequiredDaemons
+
+The `status.requiredDaemons` section of the `DirectiveBreakdown` tells the WLM about any driver-specific daemons it must enable for the job; it is assumed that the WLM knows about the driver-specific daemons and that if the users are specifying these then the WLM knows how to start them. The `status.requiredDaemons` section will exist only for `jobdw` and `persistentdw` directives. An example of the `status.requiredDaemons` section is included below.
+
+```yaml
+status:
+...
+  requiredDaemons:
+  - copy-offload
+...
+```
+
+The allowed list of required daemons that may be specified is defined in the [nnf-ruleset.yaml for DWS](https://github.com/NearNodeFlash/nnf-sos/blob/master/config/dws/nnf-ruleset.yaml), found in the `nnf-sos` repository. The `ruleDefs.key[requires]` statement is specified in two places in the ruleset, one for `jobdw` and the second for `persistentdw`. The ruleset allows a list of patterns to be specified, allowing one for each of the allowed daemons.
+
+The `DW` directive will include a comma-separated list of daemons after the `requires` keyword. The following is an example:
+
+```bash
+#DW jobdw type=xfs capacity=1GB name=stg1 requires=copy-offload
+```
+
+The DWDirectiveRule resource currently active on the system can be viewed with:
+
+```console
+kubectl get -n dws-system dwdirectiverule nnf -o yaml
+```
+
+### Valid Daemons
+
+Each site should define the list of daemons that are valid for that site and recognized by that site's WLM. The initial `nnf-ruleset.yaml` defines only one, called `copy-offload`. When a user specifies `copy-offload` in their `DW` directive, they are stating that their compute-node application will use the Copy Offload API Daemon described in the [Data Movement Configuration](../data-movement/readme.md).

From 490acb9eb2699d6b1d5e6b37cbd9a9392cd22e43 Mon Sep 17 00:00:00 2001
From: Dean Roehrich
Date: Fri, 5 Jul 2024 11:06:50 -0500
Subject: [PATCH 2/3] Refer to RequiredDaemons from the data-movement doc

Signed-off-by: Dean Roehrich
---
 docs/guides/data-movement/readme.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/guides/data-movement/readme.md b/docs/guides/data-movement/readme.md
index cf6c345..6381ff3 100644
--- a/docs/guides/data-movement/readme.md
+++ b/docs/guides/data-movement/readme.md
@@ -90,6 +90,8 @@ The `CreateRequest` API call that is used to create Data Movement with the Copy
 options to allow a user to specify some options for that particular Data Movement. These settings
 are on a per-request basis.
 
+The Copy Offload API requires the `nnf-dm` daemon to be running on the compute node. This daemon may be configured to run full-time, or it may left in a disabled state if the WLM is expected to run it only when a user requests it. See [Compute Daemons](../compute-daemons/readme.md) for the systemd service configuration of the daemon. See `RequiredDaemons` in [Directive Breakdown](../directive-breakdown/readme.md) for a description of how the user may request the daemon, in the case where the WLM will run it only on demand.
+
 See the [DataMovementCreateRequest API](copy-offload-api.html#datamovement.DataMovementCreateRequest)
 definition for what can be configured.
 

From f025fc949831a6050b913370b3844d78fbbf2dae Mon Sep 17 00:00:00 2001
From: Dean Roehrich
Date: Fri, 5 Jul 2024 14:43:29 -0500
Subject: [PATCH 3/3] review

Signed-off-by: Dean Roehrich
---
 docs/guides/data-movement/readme.md       | 8 +++++++-
 docs/guides/directive-breakdown/readme.md | 2 +-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/docs/guides/data-movement/readme.md b/docs/guides/data-movement/readme.md
index 6381ff3..db60ed9 100644
--- a/docs/guides/data-movement/readme.md
+++ b/docs/guides/data-movement/readme.md
@@ -90,7 +90,13 @@ The `CreateRequest` API call that is used to create Data Movement with the Copy
 options to allow a user to specify some options for that particular Data Movement. These settings
 are on a per-request basis.
 
-The Copy Offload API requires the `nnf-dm` daemon to be running on the compute node. This daemon may be configured to run full-time, or it may left in a disabled state if the WLM is expected to run it only when a user requests it. See [Compute Daemons](../compute-daemons/readme.md) for the systemd service configuration of the daemon. See `RequiredDaemons` in [Directive Breakdown](../directive-breakdown/readme.md) for a description of how the user may request the daemon, in the case where the WLM will run it only on demand.
+The Copy Offload API requires the `nnf-dm` daemon to be running on the compute node. This daemon may be configured to run full-time, or it may be left in a disabled state if the WLM is expected to run it only when a user requests it. See [Compute Daemons](../compute-daemons/readme.md) for the systemd service configuration of the daemon. See `RequiredDaemons` in [Directive Breakdown](../directive-breakdown/readme.md) for a description of how the user may request the daemon, in the case where the WLM will run it only on demand.
+
+If the WLM is running the `nnf-dm` daemon only on demand, then the user can request that the daemon be running for their job by specifying `requires=copy-offload` in their `DW` directive. The following is an example:
+
+```bash
+#DW jobdw type=xfs capacity=1GB name=stg1 requires=copy-offload
+```
 
 See the [DataMovementCreateRequest API](copy-offload-api.html#datamovement.DataMovementCreateRequest)
 definition for what can be configured.
diff --git a/docs/guides/directive-breakdown/readme.md b/docs/guides/directive-breakdown/readme.md
index 3d193c5..8967b28 100644
--- a/docs/guides/directive-breakdown/readme.md
+++ b/docs/guides/directive-breakdown/readme.md
@@ -170,7 +170,7 @@ The `DW` directive will include a comma-separated list of daemons after the `req
 #DW jobdw type=xfs capacity=1GB name=stg1 requires=copy-offload
 ```
 
-The DWDirectiveRule resource currently active on the system can be viewed with:
+The `DWDirectiveRule` resource currently active on the system can be viewed with:
 
 ```console
 kubectl get -n dws-system dwdirectiverule nnf -o yaml