Commit

docs: add documentation
Signed-off-by: vsoch <[email protected]>
vsoch committed Apr 29, 2024
1 parent c1b89a0 commit 505dd24
Showing 29 changed files with 1,228 additions and 230 deletions.
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -4,6 +4,7 @@ repos:
rev: v4.3.0
hooks:
- id: check-added-large-files
args: ["--maxkb=2000"]
- id: check-case-conflict
- id: check-docstring-first
- id: end-of-file-fixer
223 changes: 1 addition & 222 deletions README.md
@@ -11,228 +11,7 @@ It is a transformational layer, or a simple language that converts steps needed
for a specific cluster's scheduler. We are currently prototyping off of the Flux JobSpec, and intend
to derive some variant between that and something more. It is JobSpec... the next generation! 🚀️

⭐️ [Read the specification](spec-1.md) ⭐️

Some drafts are included in [docs/drafts](docs/drafts)

## Usage

A JobSpec consists of one or more tasks that have dependencies. This level of dependency is what can be represented in a scheduler.
The JobSpec library here reads in the JobSpec and can map that into specific cluster submit commands.
Here is an example that assumes receiving a Jobspec on a flux cluster.

#### 1. Start Flux

Start up the development environment to find yourself in a container with flux. Start a test instance:

```bash
flux start --test-size=4
```

Note that we have 4 faux nodes and 40 faux cores.

```bash
flux resource list
```
```console
STATE NNODES NCORES NGPUS NODELIST
free 4 40 0 194c2b9f4f3c,194c2b9f4f3c,194c2b9f4f3c,194c2b9f4f3c
allocated 0 0 0
down 0 0 0
```

Ensure you have jobspec installed! Yes, we are in VSCode, installing to the container, so we use sudo. YOLO.

```bash
sudo pip install -e .
```

#### 2. Command Line Examples

We are going to run the [examples/hello-world-jobspec.yaml](examples/hello-world-jobspec.yaml). This setup is way overly
complex for this because we don't actually need to do any staging or special work, but it's an example, so intended to be so.
Also note that the design of this file is subject to change. For example, we don't have to include the transform directly in the
jobspec - it can be a file that the jobspec writes, and then the command is issued. I like it better as a piece of it, so am putting
it there for the time being, mostly because it looks nicer. I'm sure someone will disagree with me about that.

```bash
# Submit a basic set of jobs with dependencies
jobspec run ./examples/hello-world-jobspec.yaml
```
```console
=> flux workload
=> flux submit ƒDjkLvNF9 OK
=> flux submit ƒDjzAyfhh OK
```

Add `--debug` to see the commands that are submitted:

```bash
jobspec --debug run ./examples/hello-world-jobspec.yaml
```
```console
=> flux workload
=> flux submit ƒ2i6n8XHSP OK
flux submit --job-name task-1 -N 1 bash -c echo Starting task 1; sleep 3; echo Finishing task 1
=> flux submit ƒ2i6qafcUw OK
flux submit --job-name task-2 -N 1 bash -c echo Starting task 2; sleep 3; echo Finishing task 2
```

Note that the default transformer is flux, so the above are equivalent to:

```bash
jobspec run -t flux ./examples/hello-world-jobspec.yaml
jobspec run --transformer flux ./examples/hello-world-jobspec.yaml
```
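As a rough illustration of what the debug output above suggests, a task can be thought of as being mapped to one `flux submit` argument list. The sketch below is not the library's implementation. The function name `build_submit_command` and its parameters are hypothetical, and only the `--job-name` and `-N` flags that appear in the debug output are used.

```python
import shlex


def build_submit_command(name, command, nodes=1):
    """Return an argv list for one task (hypothetical helper, not part of jobspec)."""
    # Mirrors the shape seen in the debug output: flux submit --job-name <name> -N <n> bash -c <command>
    return ["flux", "submit", "--job-name", name, "-N", str(nodes), "bash", "-c", command]


argv = build_submit_command("task-1", "echo Starting task 1; sleep 3; echo Finishing task 1")

# shlex.join quotes the shell command so the printed line can be pasted into a terminal
print(shlex.join(argv))
```

Keeping the command as a list (rather than one string) avoids quoting problems if it is later handed to something like `subprocess.run`.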

#### 3. Nested Examples

Try running some advanced examples. Here is a group within a task.

```bash
jobspec --debug run ./examples/task-with-group.yaml
```
```console
=> flux workload
=> flux submit ƒ2iiMFBqxT OK
flux submit --job-name task-1 -N 1 bash -c echo Starting task 1; sleep 3; echo Finishing task 1
=> flux batch ƒ2iiQpk7Qj OK
#!/bin/bash
flux submit --job-name task-2-task-0 --flags=waitable bash -c echo Starting task 2; sleep 3; echo Finishing task 2
flux job wait --all
flux job submit /tmp/jobspec-.bvu1v7vk/jobspec-5y9n9u0y
```

That's pretty intuitive, because we see that there is a flux submit first, followed by a batch that has a single task run. The last line, "flux job submit", shows how we are submitting the script that was just shown.
What about a group within a group?

```bash
jobspec --debug run ./examples/group-with-group.yaml
```
```console
=> flux workload
=> flux batch ƒ2jEE7NPXM OK
#!/bin/bash
flux submit --job-name group-1-task-0 --flags=waitable bash -c echo Starting task 1 in group 1; sleep 3; echo Finishing task 1 in group 1
flux job submit --flags=waitable /tmp/jobspec-.ljjiywaa/jobspec-kb5y5lsl
# rm -rf /tmp/jobspec-.ljjiywaa/jobspec-kb5y5lsl
flux job wait --all
flux job submit /tmp/jobspec-.45jezez5/jobspec-8dr1udhx
```

The UI here needs some work, but here is what we see above.

```console
# This is the start of the workload - the entire next gen jobspec always produces one workload
=> flux workload

# This is the top level group that has the other group within - it's the top level "flux batch" that we submit
=> flux batch ƒ2e7Ay6jvo OK

# This is showing the first script that is written
#!/bin/bash

# Here is the first job submit, now namespaced to group-1 (if the user, me, didn't give it a name)
flux submit --job-name group-1-task-0 --flags=waitable bash -c echo Starting task 1 in group 1; sleep 3; echo Finishing task 1 in group 1

# This is submitting group-2 - the jobspec is written in advance
flux job submit --flags=waitable /tmp/jobspec-.ljjiywaa/jobspec-kb5y5lsl

# And this will be how we clean it up as we go - always after it's submitted. I'm commenting it out for now because rm -rf makes me nervous!
# rm -rf /tmp/jobspec-.ljjiywaa/jobspec-kb5y5lsl

# This is the actual end of the batch script
flux job wait --all

# This is showing submitting the batch script above, kind of confusing because it looks like it's within it (it's not, just a bad UI for now)
flux job submit /tmp/jobspec-.45jezez5/jobspec-8dr1udhx
```

And because I didn't clean it up, here are the contents of the batch in the batch for group-2:

```bash
#!/bin/bash
flux submit --job-name group-2-task-0 --flags=waitable bash -c echo Starting task 1 in group 2; sleep 3; echo Finishing task 1 in group 2
flux job wait --all
```
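The batch scripts shown above follow a visible pattern: a shebang, one waitable `flux submit` per task namespaced to the group, and a final `flux job wait --all`. Here is a minimal sketch of rendering such a script, assuming a hypothetical helper `render_group_script` (the library may well build these differently, and the quoting of the command is my addition).

```python
import shlex


def render_group_script(group, commands):
    """Render a batch script for one group (hypothetical helper, not part of jobspec)."""
    lines = ["#!/bin/bash"]
    for index, command in enumerate(commands):
        # Tasks without a user-provided name are namespaced as <group>-task-<index>
        lines.append(
            f"flux submit --job-name {group}-task-{index} --flags=waitable "
            f"bash -c {shlex.quote(command)}"
        )
    # Block until every waitable job submitted above has finished
    lines.append("flux job wait --all")
    return "\n".join(lines)


script = render_group_script(
    "group-2",
    ["echo Starting task 1 in group 2; sleep 3; echo Finishing task 1 in group 2"],
)
print(script)
```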

#### 4. Python Examples

It could also be the case that you want something running inside a lead broker instance to receive Jobspecs incrementally and then
run them. This Python example can help with that by showing how to accomplish the same, but from within Python.

```bash
python3 ./examples/flux/receive-job.py
```
```console
=> flux workload
=> flux submit ƒKCJG2ESB OK
=> flux submit ƒKCa5iZsd OK
```

Just for fun (posterity) I briefly tried having emoji here:

![img/emoji.png](img/emoji.png)


### Frequently Asked Questions

#### Is this a Flux jobspec?

Despite the shared name, this is not a Flux jobspec. Type `man bash` to see that the term "jobspec" predates flux. If we lived in a universe of just Flux, sure we wouldn't need this. But the world is more than Flux, and we want to extend our Jobspec to that - providing an abstraction that works with Flux, but also works with other workload managers and compute environments and application programming interfaces.

#### What are steps?

A step is a custom setup or staging command that might be allowed for a specific environment. For example, workload managers that know how to map or stage files can use the "stage" step. General steps to write scripts can arguably be used anywhere with some form of filesystem, shared or not. The steps that are allowed for a task are shown in the [spec](spec.md). At the outset we will make an effort to only add steps that can be supported across transformer types.

#### Where are the different transformers defined?

We currently have our primary (core) transformers here in [jobspec/transformer](jobspec/transformer), however a registry that discovers jobspec-* named Python modules can allow an out of tree install and use of a transformer. This use case is anticipating clusters with some custom or private logic that cannot be shared in a public GitHub repository.


### Means of Interaction

There are several likely means of interacting with this library:

- As a service that runs at some frequency to receive jobs (written as a loop in Python in some context)
- As a cron job that does the same (an entry to crontab to run "jobspec" at some frequency)
- As a one off run (an example above)

For the example usage here, and since the project I am working on is concerned with Flux, we will start with the simplest case - a client that is running inside a flux instance (meaning it can import flux) that reads in a jobspec with a section that defines a set of transforms, and then issues the commands to stage the setup and use flux to run the work defined by the jobspec.

## Developer

### Organization

While you can write an external transformer (as a plugin) a set of core transformers are provided here:

- [jobspec/transformer](jobspec/transformer): core transformer classes that ship internally here.

### Writing a Transformer

For now, the easiest thing to do is add a single file (named by your transformer) to [jobspec/transformer](jobspec/transformer)
and follow the precedent set by the existing files. A transformer minimally is a class with a name, description, and some number of steps.
You can then use provided steps in [jobspec/steps](jobspec/steps) or use the `StepBase` to write your own. At the end of
your transformer file you simply need to register the steps you want to use:

```python
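import jobspec.steps as steps

# A transformer can register shared steps, or custom steps
# (batch, submit, and stage are steps defined earlier in this transformer file)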
Transformer.register_step(steps.WriterStep)
Transformer.register_step(batch)
Transformer.register_step(submit)
Transformer.register_step(stage)
```

If there is a step you want the user to be able to define (but skip it for your transformer, for whatever reason you might have)
just register the empty step with the name you want to skip. As an example, let's say my transformer has no concept of a stage
(sharing a file across separate nodes) given that it has a shared filesystem. I might want to do:

```python
import jobspec.steps as steps

# This will not fail validation that the step is unknown, but skip it
Transformer.register_step(steps.EmptyStep, name="stage")
```
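To make the registration idea concrete, here is a self-contained sketch of how a name-keyed step registry could behave. It is an assumption about the mechanism, not the jobspec source: `TransformerSketch` and its `validate` method are hypothetical, and the `StepBase` and `EmptyStep` classes below are stand-ins that only borrow their names from the text above.

```python
class StepBase:
    """Stand-in base class: a step has a name and does some work."""

    name = "base"

    def run(self, *args, **kwargs):
        raise NotImplementedError


class EmptyStep(StepBase):
    """Stand-in no-op step: accepted by validation, but does nothing."""

    name = "empty"

    def run(self, *args, **kwargs):
        return None


class TransformerSketch:
    """Hypothetical transformer holding a registry of known steps."""

    steps = {}

    @classmethod
    def register_step(cls, step, name=None):
        # An explicit name overrides the step's own, so EmptyStep can shadow "stage"
        cls.steps[name or step.name] = step

    @classmethod
    def validate(cls, step_names):
        unknown = [n for n in step_names if n not in cls.steps]
        if unknown:
            raise ValueError(f"Unknown steps: {unknown}")


TransformerSketch.register_step(EmptyStep, name="stage")

# "stage" is now known, so validation passes, and running it is a no-op
TransformerSketch.validate(["stage"])
result = TransformerSketch.steps["stage"]().run()
```

The design choice here is that skipping is expressed as a registered no-op rather than as a special case in validation, which keeps the validation logic uniform.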
⭐️ [Documentation](https://compspec.github.io/jobspec) ⭐️

## License

Empty file added docs/.nojekyll
Empty file.
7 changes: 7 additions & 0 deletions docs/assets/css/main.css

Large diffs are not rendered by default.

22 changes: 22 additions & 0 deletions docs/assets/css/style.css
@@ -0,0 +1,22 @@
#particles-js {
background-color: #091a28 !important;
}
.contents {
width:100% !important;
}

.project-button {
width: 100px;
margin: 10px 0;
padding: 5px;
color: #1a222c;
background-color: transparent;
border: 1px solid #1a222c;
border-radius: 2px;
text-align: center;
padding:10px;
outline: none;
text-decoration: none;
cursor: pointer;
transition: color .3s ease-out,background-color .3s ease-out,border-color .3s ease-out;
}
84 changes: 84 additions & 0 deletions docs/assets/css/syntax-vsoch.css
@@ -0,0 +1,84 @@
p .highlighter-rouge,
table .highlighter-rouge,
cell .highlighter-rouge {
background-color: #F7F7F7;
}
pre.highlight,
pre.highlighter-rouge,
.highlighter-rouge,
.highlighter-rouge pre,
.highlight,
.highlight pre {
/* background-color: #131d28; */
padding-left:20px;
margin-top: 20px;
margin-bottom:20px;
border-radius: 5px;
}
.highlighter-rouge pre {
padding-top:20px;
padding-bottom:20px;
}
.highlight, .highlighter-rouge .hll { background-color: #272822; }
.highlight, .highlighter-rouge .c { color: #75715e } /* Comment */
.highlight, .highlighter-rouge .err { color: #960050 } /* Error */
.highlight, .highlighter-rouge .k { color: #66d9ef } /* Keyword */
.highlight, .highlighter-rouge .l { color: #ae81ff } /* Literal */
.highlight, .highlighter-rouge .n { color: #f8f8f2 } /* Name */
.highlight, .highlighter-rouge .o { color: #f92672 } /* Operator */
.highlight, .highlighter-rouge .p { color: #f8f8f2 } /* Punctuation */
.highlight, .highlighter-rouge .cm { color: #75715e } /* Comment.Multiline */
.highlight, .highlighter-rouge .cp { color: #75715e } /* Comment.Preproc */
.highlight, .highlighter-rouge .c1 { color: #75715e } /* Comment.Single */
.highlight, .highlighter-rouge .cs { color: #75715e } /* Comment.Special */
.highlight, .highlighter-rouge .gs { font-weight: bold } /* Generic.Strong */
.highlight, .highlighter-rouge .kc { color: #66d9ef } /* Keyword.Constant */
.highlight, .highlighter-rouge .kd { color: #66d9ef } /* Keyword.Declaration */
.highlight, .highlighter-rouge .kn { color: #f92672 } /* Keyword.Namespace */
.highlight, .highlighter-rouge .kp { color: #66d9ef } /* Keyword.Pseudo */
.highlight, .highlighter-rouge .kr { color: #66d9ef } /* Keyword.Reserved */
.highlight, .highlighter-rouge .kt { color: #66d9ef } /* Keyword.Type */
.highlight, .highlighter-rouge .ld { color: #e6db74 } /* Literal.Date */
.highlight, .highlighter-rouge .m { color: #ae81ff } /* Literal.Number */
.highlight, .highlighter-rouge .s { color: #e6db74 } /* Literal.String */
.highlight, .highlighter-rouge .na { color: #a6e22e } /* Name.Attribute */
.highlight, .highlighter-rouge .nb { color: #f8f8f2 } /* Name.Builtin */
.highlight, .highlighter-rouge .nc { color: #a6e22e } /* Name.Class */
.highlight, .highlighter-rouge .no { color: #66d9ef } /* Name.Constant */
.highlight, .highlighter-rouge .nd { color: #a6e22e } /* Name.Decorator */
.highlight, .highlighter-rouge .ni { color: #f8f8f2 } /* Name.Entity */
.highlight, .highlighter-rouge .ne { color: #a6e22e } /* Name.Exception */
.highlight, .highlighter-rouge .nf { color: #a6e22e } /* Name.Function */
.highlight, .highlighter-rouge .nl { color: #f8f8f2 } /* Name.Label */
.highlight, .highlighter-rouge .nn { color: #f8f8f2 } /* Name.Namespace */
.highlight, .highlighter-rouge .nx { color: #a6e22e } /* Name.Other */
.highlight, .highlighter-rouge .py { color: #f8f8f2 } /* Name.Property */
.highlight, .highlighter-rouge .nt { color: #f92672 } /* Name.Tag */
.highlight, .highlighter-rouge .nv { color: #f8f8f2 } /* Name.Variable */
.highlight, .highlighter-rouge .ow { color: #f92672 } /* Operator.Word */
.highlight, .highlighter-rouge .w { color: #f8f8f2 } /* Text.Whitespace */
.highlight, .highlighter-rouge .mf { color: #ae81ff } /* Literal.Number.Float */
.highlight, .highlighter-rouge .mh { color: #ae81ff } /* Literal.Number.Hex */
.highlight, .highlighter-rouge .mi { color: #ae81ff } /* Literal.Number.Integer */
.highlight, .highlighter-rouge .mo { color: #ae81ff } /* Literal.Number.Oct */
.highlight, .highlighter-rouge .sb { color: #e6db74 } /* Literal.String.Backtick */
.highlight, .highlighter-rouge .sc { color: #e6db74 } /* Literal.String.Char */
.highlight, .highlighter-rouge .sd { color: #e6db74 } /* Literal.String.Doc */
.highlight, .highlighter-rouge .s2 { color: #e6db74 } /* Literal.String.Double */
.highlight, .highlighter-rouge .se { color: #ae81ff } /* Literal.String.Escape */
.highlight, .highlighter-rouge .sh { color: #e6db74 } /* Literal.String.Heredoc */
.highlight, .highlighter-rouge .si { color: #e6db74 } /* Literal.String.Interpol */
.highlight, .highlighter-rouge .sx { color: #e6db74 } /* Literal.String.Other */
.highlight, .highlighter-rouge .sr { color: #e6db74 } /* Literal.String.Regex */
.highlight, .highlighter-rouge .s1 { color: #e6db74 } /* Literal.String.Single */
.highlight, .highlighter-rouge .ss { color: #e6db74 } /* Literal.String.Symbol */
.highlight, .highlighter-rouge .bp { color: #f8f8f2 } /* Name.Builtin.Pseudo */
.highlight, .highlighter-rouge .vc { color: #f8f8f2 } /* Name.Variable.Class */
.highlight, .highlighter-rouge .vg { color: #f8f8f2 } /* Name.Variable.Global */
.highlight, .highlighter-rouge .vi { color: #f8f8f2 } /* Name.Variable.Instance */
.highlight, .highlighter-rouge .il { color: #ae81ff } /* Literal.Number.Integer.Long */

.highlight, .highlighter-rouge .gh { } /* Generic Heading & Diff Header */
.highlight, .highlighter-rouge .gu { color: #75715e; } /* Generic.Subheading & Diff Unified/Comment? */
.highlight, .highlighter-rouge .gd { color: #f92672; } /* Generic.Deleted & Diff Deleted */
.highlight, .highlighter-rouge .gi { color: #a6e22e; } /* Generic.Inserted & Diff Inserted */
Binary file added docs/assets/fonts/devicon.ttf
Binary file not shown.
Binary file added docs/assets/fonts/devicon.woff
Binary file not shown.
Binary file added docs/assets/fonts/fontawesome-webfont.ttf
Binary file not shown.
Binary file added docs/assets/fonts/fontawesome-webfont.woff
Binary file not shown.
Binary file added docs/assets/fonts/fontawesome-webfont.woff2
Binary file not shown.
Binary file added docs/assets/img/avocado.png
Binary file added docs/assets/img/avocado2.png
File renamed without changes
Binary file added docs/assets/img/jobspec-bot.png
1 change: 1 addition & 0 deletions docs/assets/js/main.js

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 8 additions & 0 deletions docs/assets/js/sweet-scroll.min.js

Large diffs are not rendered by default.

41 changes: 41 additions & 0 deletions docs/docs/README.md
@@ -0,0 +1,41 @@
# JobSpec

> Next Generation

This is an abstract job specification that takes a description of work, including requirements, services, and resources, and transforms it into the correct submission commands for a specific workload manager or cloud API. It is intended to be used with the <a href="https://converged-computing.github.io/rainbow" target="_blank">Rainbow Scheduler</a> but Rainbow is not required. We are starting with <a href="https://flux-framework.readthedocs.io" target="_blank">Flux Framework</a> and will be extending to Kubernetes and other workload managers and APIs soon.

## Specifications

For our current specification, see [the spec](spec.md). For a first draft (that wasn't used) see [here](drafts/).

## Frequently Asked Questions

### Is this a Flux jobspec?

Despite the shared name, this is not a Flux jobspec. Type `man bash` to see that the term "jobspec" predates flux. If we lived in a universe of just Flux, sure we wouldn't need this. But the world is more than Flux, and we want to extend our Jobspec to that - providing an abstraction that works with Flux, but also works with other workload managers and compute environments and application programming interfaces.

### What are steps?

A step is a custom setup or staging command that might be allowed for a specific environment. For example, workload managers that know how to map or stage files can use the "stage" step. General steps to write scripts can arguably be used anywhere with some form of filesystem, shared or not. The steps that are allowed for a task are shown in the [spec](spec.md). At the outset we will make an effort to only add steps that can be supported across transformer types.

### Where are the different transformers defined?

We currently have our primary (core) transformers here in [jobspec/transformer](jobspec/transformer), however a registry that discovers jobspec-* named Python modules can allow an out of tree install and use of a transformer. This use case is anticipating clusters with some custom or private logic that cannot be shared in a public GitHub repository.
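One way such name-based discovery could work is sketched below. This is an assumption about the mechanism rather than the registry's actual code: it treats "jobspec-*" as installed distribution names and lists them with the standard library's `importlib.metadata`. The function names `find_plugin_names` and `installed_plugin_names` are hypothetical.

```python
from importlib.metadata import distributions


def find_plugin_names(names, prefix="jobspec-"):
    """Filter an iterable of package names down to those with the plugin prefix."""
    # Some broken installs report no name at all, so skip empty values
    return sorted({n for n in names if n and n.lower().startswith(prefix)})


def installed_plugin_names(prefix="jobspec-"):
    """List installed distributions whose name starts with the prefix."""
    names = (dist.metadata["Name"] for dist in distributions())
    return find_plugin_names(names, prefix)


# The result depends entirely on what is installed in the current environment
print(installed_plugin_names())
```

Splitting the pure filter from the environment lookup keeps the matching rule easy to test without installing anything.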

## Means of Interaction

There are several likely means of interacting with this library:

- As a service that runs at some frequency to receive jobs (written as a loop in Python in some context)
- As a cron job that does the same (an entry to crontab to run "jobspec" at some frequency)
- As a one off run with `jobspec run ...`

For the example usage here, and since the project I am working on is concerned with Flux, we will start with the simplest case - a client that is running inside a flux instance (meaning it can import flux) that reads in a jobspec with a section that defines a set of transforms, and then issues the commands to stage the setup and use flux to run the work defined by the jobspec.
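The first mode of interaction (a service written as a loop in Python) could look roughly like the sketch below. Everything in it is hypothetical: `serve`, `receive`, and `handle` are placeholder names, and in a real deployment `handle` would be whatever runs a jobspec through a transformer.

```python
import time


def serve(receive, handle, interval=5.0, max_polls=None):
    """Poll for new jobspecs and hand each one to a handler.

    receive: callable returning an iterable of newly arrived jobspecs
    handle: callable invoked once per jobspec
    max_polls: stop after this many polls (None means run forever)
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        for spec in receive():
            handle(spec)
        polls += 1
        # Only sleep if another poll is coming
        if max_polls is None or polls < max_polls:
            time.sleep(interval)


# Usage with stand-ins: two polls, each "receiving" the same two specs
handled = []
serve(lambda: ["spec-a.yaml", "spec-b.yaml"], handled.append, interval=0, max_polls=2)
print(handled)
```

The cron variant is the same idea with the loop removed: one poll per invocation, with the scheduling left to crontab.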

## Commands

Read more about the commands and getting started [here](commands.md#commands).

## Development

Read our [developer guide](developer.md).