Add depends on #17

Merged: 3 commits, Jun 12, 2024
4 changes: 4 additions & 0 deletions .devcontainer/Dockerfile
@@ -12,6 +12,10 @@ ENV USER_GID=${USER_GID}
USER root
RUN apt-get update && apt-get install -y less python3-pip

# "Install" the frobnicator plugin
RUN wget https://raw.githubusercontent.com/researchapps/flux-core/add-dependency-frob-plugin/src/bindings/python/flux/job/frobnicator/plugins/dependency.py && \
mv dependency.py /usr/lib/python3.10/site-packages/flux/job/frobnicator/plugins/

# Add the group and user that match our ids
RUN groupadd -g ${USER_GID} ${USERNAME} && \
adduser --disabled-password --uid ${USER_UID} --gid ${USER_GID} --gecos "" ${USERNAME} && \
9 changes: 8 additions & 1 deletion docs/docs/commands.md
@@ -10,7 +10,7 @@ Here is an example that assumes receiving a Jobspec on a flux cluster.

We are prototyping user-space subsystems, for which we do a satisfy request of a contender jobspec against a directory of user space subsystem files, which each should
be JGF (json graph format) graphs. While this can be paired with run (to determine if the run should proceed) we provide a separate "satisfy" command to test and prototype the tool.
-We also provide a set of example user subsystems in `examples/subsystems` for each of environment modules and spack. This means we can do satsify requests of jobspecs against the subsystem directory as follows. Here is an example that is satisfied:
+We also provide a set of example user subsystems in `examples/subsystems` for each of environment modules and spack. This means we can do satisfy requests of jobspecs against the subsystem directory as follows. Here is an example that is satisfied:

```bash
$ jobspec satisfy ./examples/subsystems/jobspec-spack-subystem-satisfied.yaml --subsystem-dir ./examples/subsystems
@@ -192,4 +192,11 @@ Just for fun (posterity) I briefly tried having emoji here:

![assets/img/emoji.png](assets/img/emoji.png)


#### 4. Depends On

Jobspec supports `depends_on`, but it requires a custom frobnicator plugin that creates a dependency based on job name. See the small tutorial
[here](https://github.com/compspec/jobspec/tree/main/examples/depends_on), which you can run in its entirety in the VSCode developer environment.


[home](/README.md#jobspec)
83 changes: 83 additions & 0 deletions examples/depends_on/README.md
@@ -0,0 +1,83 @@
# JobSpec Nextgen with Depends On

We are going to use jobspec nextgen to run a small workflow that uses `depends_on`. Note that this requires:

- The [frobnicator plugin](https://github.com/flux-framework/flux-core/pull/5982) to set a dependency name. This comes with the Developer Environment VSCode container.
- The hostname of your container, which you will add to [broker.toml](broker.toml), plus a curve certificate and an R lite file. The setup below generates both.

## 1. Setup

First, install a developer version of jobspec (from the root):

```bash
pip install -e .
```

Next, update [broker.toml](broker.toml) to have the hostname of your container (this is in `examples/depends_on` relative to the root). Then:

```bash
flux R encode --local > /tmp/R
flux keygen /tmp/curve.cert
```
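If you would rather not edit the file by hand, the hostname update can be scripted. This is a minimal sketch and not part of jobspec: `set_broker_host` is a hypothetical helper, and it assumes the `hosts` table holds a single `host="..."` entry, as in the example [broker.toml](broker.toml).

```python
import re


def set_broker_host(toml_text, hostname):
    """Return toml_text with the first host="..." value replaced by hostname.

    Hypothetical helper: assumes one entry of the form { host="name" }.
    """
    pattern = re.compile(r'(\bhost\s*=\s*")[^"]*(")')
    # A function replacement avoids treating the hostname as a regex template
    updated, count = pattern.subn(
        lambda match: match.group(1) + hostname + match.group(2),
        toml_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no host entry found")
    return updated


example = 'hosts = [\n    { host="747b0768eb45"},\n]\n'
print(set_broker_host(example, "c970303f1b79"))
```

To apply it for real, read `broker.toml`, pass its contents along with `socket.gethostname()`, and write the result back to the file.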

You should then be able to start your sub-instance:

```bash
flux start -o --config=./broker.toml
```
```console
vscode@c970303f1b79:/workspaces/iflux/jobspec$ flux resource list
STATE NNODES NCORES NGPUS NODELIST
free 1 10 0 c970303f1b79
allocated 0 0 0
down 0 0 0
```

Check that the frobnicator plugin is installed.

```bash
$ flux job-frobnicator --list-plugins
```
```console
Available plugins:
constraints Apply constraints to incoming jobspec based on broker config.
defaults Apply defaults to incoming jobspec based on broker config.
dependency Translate dependency.name into a job id for dependency.afterok
```
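If you want to make this check from a script, the listing can be parsed. This is only a sketch: `listed_plugins` is a hypothetical helper, and it assumes the output layout shown above (a header line ending in a colon, then one line per plugin that starts with the plugin name).

```python
def listed_plugins(listing):
    """Return plugin names parsed from the --list-plugins output.

    Assumes a header line ending in ":" followed by lines of the
    form "<name> <description>".
    """
    names = []
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines and the "Available plugins:" header
        if not line or line.endswith(":"):
            continue
        names.append(line.split()[0])
    return names


sample = """Available plugins:
constraints Apply constraints to incoming jobspec based on broker config.
defaults Apply defaults to incoming jobspec based on broker config.
dependency Translate dependency.name into a job id for dependency.afterok
"""
print("dependency" in listed_plugins(sample))
```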

Try submitting a job - it should work.

```bash
flux run hostname
```

## 2. Test Depends On

We can now submit a simple jobspec that has depends_on, which is included in [jobspec.yaml](jobspec.yaml).

```bash
jobspec run ./jobspec.yaml
```
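Under the hood, each `depends_on` entry becomes a `--setattr=dependency.name=<name>` flag on the submitted job, which the frobnicator plugin then resolves to a job id. The sketch below mirrors the loop added to `jobspec/transformer/flux/steps.py` in this pull request; the standalone function `dependency_flags` is hypothetical and exists only for illustration.

```python
def dependency_flags(task):
    """Build the submit flags for a task's depends_on entries.

    Mirrors the loop in jobspec/transformer/flux/steps.py; the
    function name itself is hypothetical.
    """
    flags = []
    # A missing or empty depends_on yields no flags
    for depends_on in task.get("depends_on") or []:
        flags.append(f"--setattr=dependency.name={depends_on}")
    return flags


task_2 = {"name": "task-2", "depends_on": ["task-1"]}
print(dependency_flags(task_2))
```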

The second task (task-2) depends on task-1, and the first task will sleep for a minute, so at first you'll see:

```bash
$ flux jobs -a
JOBID USER NAME ST NTASKS NNODES TIME INFO
ƒ3sL5KeV5 vscode task-2 D 4 1 - depends:after-success=6329825492992
ƒ3sGt1CFy vscode task-1 R 4 1 13.91s 747b0768eb45
```

And then after a minute:

```bash
$ flux jobs -a
JOBID USER NAME ST NTASKS NNODES TIME INFO
ƒ3sL5KeV5 vscode task-2 CD 4 1 3.035s 747b0768eb45
ƒ3sGt1CFy vscode task-1 CD 4 1 1.001m 747b0768eb45
```

Tada! And that's it. You can do more complex things, but this is a great start!
Note that we also keep the plugin [dependency.py](dependency.py) here in case the pull request branch
is lost. To use it, put the file in the Python flux install at `flux/job/frobnicator/plugins`,
and then start a broker that has it enabled (see `[ingest.frobnicator]` in [broker.toml](broker.toml)).
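
To see what the plugin does without a running broker, here is a flux-free sketch of its name lookup. It follows the logic of `add_dependency` in [dependency.py](dependency.py), but the plain dict and list arguments are stand-ins (assumptions) for `jobspec.attributes["system"]` and the flux job listing.

```python
def add_dependency(system_attributes, jobs):
    """Translate dependency.name into an afterok dependency.

    system_attributes: dict standing in for jobspec.attributes["system"]
    jobs: list of {"id": ..., "name": ...} dicts standing in for the
          flux job listing
    """
    name = system_attributes.get("dependency", {}).get("name")
    if not name:
        return system_attributes

    # The first job with a matching name becomes the dependency
    for job in jobs:
        if job["name"] == name:
            system_attributes["dependencies"] = [
                {"scheme": "afterok", "value": str(job["id"])}
            ]
            return system_attributes

    raise ValueError(f"Job with name {name} is not known")


jobs = [{"id": 6329825492992, "name": "task-1"}]
print(add_dependency({"dependency": {"name": "task-1"}}, jobs))
```

Because the first match wins, job names need to be unique for the dependency to point at the job you intend.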
30 changes: 30 additions & 0 deletions examples/depends_on/broker.toml
@@ -0,0 +1,30 @@
[access]
allow-guest-user = true
allow-root-owner = true

# Point to resource definition generated with flux-R(1).
[resource]
path = "/tmp/R"

[bootstrap]
curve_cert = "/tmp/curve.cert"
default_port = 8050
default_bind = "tcp://eth0:%p"
default_connect = "tcp://%h:%p"

# CHANGE THIS TO YOUR HOSTNAME (or container hostname)
hosts = [
{ host="747b0768eb45"},
]


[archive]
dbpath = "/tmp/job-archive.sqlite"
period = "1m"
busytimeout = "50s"

[ingest.frobnicator]
plugins = [ "defaults", "constraints", "dependency" ]

[sched-fluxion-qmanager]
queue-policy = "fcfs"
65 changes: 65 additions & 0 deletions examples/depends_on/dependency.py
@@ -0,0 +1,65 @@
##############################################################
# Copyright 2022 Lawrence Livermore National Security, LLC
# (c.f. AUTHORS, NOTICE.LLNS, COPYING)
#
# This file is part of the Flux resource manager framework.
# For details, see https://github.com/flux-framework.
#
# SPDX-License-Identifier: LGPL-3.0
##############################################################

"""Translate dependency.name into a job id for dependency.afterok

"""

from flux.job.frobnicator import FrobnicatorPlugin


class DependencyAdder:
    """Get the system attribute for dependency.name and add a
    task dependency. Raise an error if we can't find it.
    """

    def __init__(self, config={}):
        # We don't need this for anything, just saving
        # for the heck
        self.config = config

    def add_dependency(self, jobspec):
        """We need to translate a dependency name into a job id. We will:
        1. Get the desired name from system attribute dependency.name
        2. list all flux jobs (I know, I know) and get the job id
        3. Update the jobspec to have it.
        4. Raise error if the name does not exist.

        Bullet 2 is a bad design and thus this is only for experimentation.
        """
        dependency_name = jobspec.attributes["system"].get("dependency", {}).get("name")
        if not dependency_name:
            return

        # We are going to add the first matched job as a dependency
        import flux
        import flux.job

        handle = flux.Flux()
        for job in flux.job.list.job_list(handle).get()["jobs"]:
            if job["name"] == dependency_name:
                jobspec.attributes["system"]["dependencies"] = [
                    {"scheme": "afterok", "value": str(job["id"])}
                ]
                return

        raise ValueError(f"Job with name {dependency_name} is not known")


class Frobnicator(FrobnicatorPlugin):
    def __init__(self, parser):
        super().__init__(parser)

    def configure(self, args, config):
        self.config = DependencyAdder(config)

    def frob(self, jobspec, user, urgency, flags):
        """This frobber is looking for an attribute "dependency.name"."""
        self.config.add_dependency(jobspec)
28 changes: 28 additions & 0 deletions examples/depends_on/jobspec.yaml
@@ -0,0 +1,28 @@
version: 1

# This is the same example as hello-world-jobspec.yaml
# But without the slot
resources:
  sleep-resources:
    type: node
    count: 1
    with:
    - type: core
      count: 4

tasks:
- name: task-1
  command:
    - bash
    - -c
    - "echo Starting task 1; sleep 60; echo Finishing task 1"

  resources: sleep-resources
- name: task-2
  depends_on: ["task-1"]
  command:
    - bash
    - -c
    - "echo Starting task 2; sleep 3; echo Finishing task 2"

  resources: sleep-resources
1 change: 0 additions & 1 deletion examples/flux/jobspec.yaml
@@ -17,7 +17,6 @@ tasks:
resources: single-node

- name: task-2
-depends_on: ["task-1"]
command:
- bash
- -c
3 changes: 1 addition & 2 deletions examples/group-with-group.yaml
@@ -18,10 +18,9 @@ groups:
- group: group-2

- name: group-2
-depends_on: ["group-1"]
resources: common
tasks:
- command:
- bash
- -c
-- "echo Starting task 1 in group 2; sleep 3; echo Finishing task 1 in group 2"
+- "echo Starting task 1 in group 2; sleep 3; echo Finishing task 1 in group 2"
1 change: 0 additions & 1 deletion examples/hello-world-jobspec.yaml
@@ -22,7 +22,6 @@ tasks:

resources: sleep-resources
- name: task-2
-depends_on: ["task-1"]
command:
- bash
- -c
3 changes: 1 addition & 2 deletions examples/task-with-group.yaml
@@ -9,7 +9,6 @@ resources:

groups:
- name: task-2
-depends_on: ["task-1"]
resources: common
tasks:
- command:
@@ -27,4 +26,4 @@ tasks:
- "echo Starting task 1; sleep 3; echo Finishing task 1"

# flux batch...
-- group: task-2
+- group: task-2
8 changes: 4 additions & 4 deletions jobspec/transformer/flux/steps.py
@@ -100,10 +100,10 @@ def prepare(self, command=None, waitable=False):
cwd = attributes.get("cwd")
watch = attributes.get("watch")

-# We can't support this yet because it needs the jobid
-# That design to require to get it seems fragile
-# for depends_on in task.get("depends_on") or []:
-# cmd += [f"--dependency={depends_on}"]
+# Note that you need to install our frobnicator plugin
+# for this to work. See the examples/depends_on directory
+for depends_on in task.get("depends_on") or []:
+    cmd += [f"--setattr=dependency.name={depends_on}"]

if cwd is not None:
cmd += ["--cwd", cwd]
2 changes: 1 addition & 1 deletion jobspec/version.py
@@ -1,4 +1,4 @@
-__version__ = "0.1.13"
+__version__ = "0.1.14"
AUTHOR = "Vanessa Sochat"
AUTHOR_EMAIL = "[email protected]"
NAME = "jobspec"