simplify resources #13

Merged
merged 1 commit into from
May 5, 2024
447 changes: 4 additions & 443 deletions docs/docs/drafts/README.md

Large diffs are not rendered by default.

445 changes: 445 additions & 0 deletions docs/docs/drafts/spec-draft-1.md

Large diffs are not rendered by default.

631 changes: 631 additions & 0 deletions docs/docs/drafts/spec-draft-2.md

Large diffs are not rendered by default.

14 changes: 4 additions & 10 deletions docs/docs/spec.md
@@ -87,9 +87,9 @@ tasks:
ior -b 10g -O summaryFormat=json
```

This is a more condensed, and easier to read version. We aren't reading _exactly_ from top to bottom because we have to jump back up to see the "spack-resources" reference, but it's more succinct in total, making
it appealing still. The above assumes a cluster with a shared filesystem, where a spack install is already on the user's default path.
Now let's walk through specific sections of the above, and then we will move into advanced patterns.
This is a more condensed, easier-to-read version. To make our lives easier (for writing Go), we are going to strictly require all resources to be in named sections at the top. We are also going to add a placeholder field "schedule" to indicate that a resource is at the top level and should be requested from the scheduler as a separate scheduling request. I know this is a bad design and I welcome someone else to work on it.

The above assumes a cluster with a shared filesystem, where a spack install is already on the user's default path. Now let's walk through specific sections of the above, and then we will move into advanced patterns.

## Tasks

@@ -226,15 +226,9 @@ The "local" alongside a task command indicates that it isn't a submit or batch,

## Resources

Now let's talk about resources. The most basic definition of resources has them alongside groups and tasks.
One of the following is REQUIRED:

- A top level "resources" section with named entries that are referenced within tasks and/or groups. In this case, instead of an explicit definition of resources, a task or group can define a single string with the key (lookup) to the named section.
- Within- group or task "resources" that are defined explicitly alongside the task or group.

Now let's talk about resources. All resources are required to be in named groups in the top-level "resources" section. If no group sets `schedule: true`, every group is assumed to be wanted as a separate scheduling request. If you only want to ask for a subset of the resources (or some are nested), set `schedule: true` on those.
While it is not enforced (assuming you know what you are doing, or something like grow/autoscale is possible) it is typically suggested that child resources are a subset of parent resources. Some special cases include:

- If a group does not have resources defined, each task within is expected to have resources, and the group is the sum across them.
- If a task does not have resources defined, it inherits the same resources as the parent group.
- A standalone task or group without resources is not allowed.
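
The `schedule` rule described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the project's implementation: the function name `resources_to_schedule` and the `big-node` group are hypothetical, and the top-level "resources" section is assumed to be already loaded into a plain dict.

```python
def resources_to_schedule(resources):
    """Return the names of top-level resource groups to request separately.

    Hypothetical helper (not part of jobspec): `resources` is the top-level
    "resources" section as a dict of name -> definition.
    """
    # Groups that explicitly ask to be scheduled on their own
    flagged = [name for name, spec in resources.items() if spec.get("schedule") is True]

    # If no group sets schedule: true, every group is assumed to be wanted
    # as a separate scheduling request
    return flagged if flagged else list(resources)


# No group is flagged, so all of them are requested
print(resources_to_schedule({"single-node": {"type": "node", "count": 1}}))
# ['single-node']

# Only the flagged subset is requested ("big-node" is an invented example name)
print(
    resources_to_schedule(
        {
            "single-node": {"type": "node", "count": 1},
            "big-node": {"type": "node", "count": 4, "schedule": True},
        }
    )
)
# ['big-node']
```

Treating "nothing flagged" as "everything flagged" keeps the simple examples free of the extra field, at the cost of one implicit default that a reader has to know about.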

14 changes: 7 additions & 7 deletions examples/flux/jobspec.yaml
@@ -3,16 +3,18 @@ requires:
io.archspec:
cpu.target: amd64

resources:
single-node:
count: 1
type: node

tasks:
- name: task-1
command:
- bash
- -c
- "echo Starting task 1; sleep 3; echo Finishing task 1"

resources:
count: 1
type: node
resources: single-node

- name: task-2
depends_on: ["task-1"]
@@ -25,6 +27,4 @@ tasks:
- -c
- "echo Starting task 2; sleep 3; echo Finishing task 2"

resources:
count: 1
type: node
resources: single-node
14 changes: 7 additions & 7 deletions examples/hello-world-jobspec.yaml
@@ -3,17 +3,19 @@ requires:
io.archspec:
cpu.target: amd64

resources:
single-node:
count: 1
type: node

tasks:
- name: task-1
command:
- bash
- -c
- "echo Starting task 1; sleep 3; echo Finishing task 1"

resources:
count: 1
type: node

resources: single-node
- name: task-2
depends_on: ["task-1"]

@@ -25,6 +27,4 @@ tasks:
- -c
- "echo Starting task 2; sleep 3; echo Finishing task 2"

resources:
count: 1
type: node
resources: single-node
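
With both examples now using `resources: single-node`, each task carries only a string key that has to be looked up in the top-level "resources" section. A minimal sketch of that lookup follows, assuming the jobspec has already been loaded into a plain dict. The function name `resolve_task_resources` is hypothetical and is not part of the jobspec package.

```python
def resolve_task_resources(jobspec):
    """Map each task name to the named resource definition it references.

    Hypothetical helper for illustration only. Tasks without a "resources"
    key are skipped here; inheritance from a parent group is not handled.
    """
    named = jobspec.get("resources", {})
    resolved = {}
    for task in jobspec.get("tasks", []):
        key = task.get("resources")
        if key is None:
            continue
        # After this change a task may only reference resources by name
        if not isinstance(key, str):
            raise TypeError(f"task {task.get('name')!r}: resources must be a string key")
        if key not in named:
            raise KeyError(f"task {task.get('name')!r}: unknown resources {key!r}")
        resolved[task["name"]] = named[key]
    return resolved


jobspec = {
    "resources": {"single-node": {"count": 1, "type": "node"}},
    "tasks": [
        {"name": "task-1", "resources": "single-node"},
        {"name": "task-2", "depends_on": ["task-1"], "resources": "single-node"},
    ],
}

print(resolve_task_resources(jobspec))
# {'task-1': {'count': 1, 'type': 'node'}, 'task-2': {'count': 1, 'type': 'node'}}
```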
3 changes: 3 additions & 0 deletions jobspec/runner.py
@@ -67,6 +67,9 @@ def parse(self, jobspec):
def announce(self):
pass

def flatten(self, filename):
raise NotImplementedError

def run(self, filename):
"""
Run the transformer
152 changes: 4 additions & 148 deletions jobspec/schema.py
@@ -78,12 +78,7 @@
# These are task level items that over-ride global
"requires": {"$ref": "#/definitions/requires"},
# Resources in a task can be traditional OR a string reference
"resources": {
"oneOf": [
{"$ref": "#/definitions/resources"},
{"type": "string"},
]
},
"resources": {"type": "string"},
"attributes": {"$ref": "#/definitions/attributes"},
# A task can reference another group (a flux batch)
"group": {"type": "string"},
@@ -113,12 +108,7 @@
# These are task level items that over-ride global
"requires": {"$ref": "#/definitions/requires"},
# Resources in a task can be traditional OR a string reference
"resources": {
"oneOf": [
{"$ref": "#/definitions/resources"},
{"type": "string"},
]
},
"resources": {"type": "string"},
"attributes": {"$ref": "#/definitions/attributes"},
"depends_on": {"type": "array", "items": {"type": "string"}},
# Tasks for the group
@@ -145,6 +135,7 @@
"type": {"enum": ["node"]},
"count": {"type": "integer", "minimum": 1},
"unit": {"type": "string"},
"schedule": {"type": "boolean"},
"with": {
"type": "array",
"minItems": 1,
@@ -168,142 +159,7 @@
"count": {"type": "integer", "minimum": 1},
"unit": {"type": "string"},
"label": {"type": "string"},
"exclusive": {"type": "boolean"},
"with": {
"type": "array",
"minItems": 1,
"maxItems": 2,
"items": {"oneOf": [{"$ref": "#/definitions/intranode_resource_vertex"}]},
},
},
"additionalProperties": False,
},
},
}


jobspec_nextgen_draft = {
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "http://github.com/flux-framework/rfc/tree/master/data/spec_24/schema.json",
"title": "jobspec-01",
"description": "JobSpec the Next Generation",
"type": "object",
"required": ["version", "tasks"],
"properties": {
# This is not a flux JobSpec, and we start at v1
"version": {
"description": "the jobspec version",
"type": "integer",
"enum": [1],
},
# These are optional global resources
"requires": {"$ref": "#/definitions/requires"},
"resources": {"$ref": "#/definitions/resources"},
"attributes": {"$ref": "#/definitions/attributes"},
# Tasks are one or more named tasks
"tasks": {
"description": "tasks configuration",
"type": "array",
# If no slot is defined, it's implied to be at the top level (the node)
"properties": {
# These are task level items that over-ride global
"requires": {"$ref": "#/definitions/requires"},
"resources": {"$ref": "#/definitions/resources"},
"attributes": {"$ref": "#/definitions/attributes"},
# Name only is needed to reference the task elsewhere
"name": {"type": "string"},
"depends_on": {"type": "array", "items": {"type": "string"}},
"parent": {"type": "string"},
# How many of this task are to be run?
"replicas": {"type": "number", "minimum": 1, "default": 1},
"level": {"type": "number", "minimum": 1, "default": 1},
# A command can be a string or a list of strings
"command": {
"type": ["string", "array"],
"minItems": 1,
"items": {"type": "string"},
},
# Custom logic for the transformer
"steps": {
"type": ["array"],
"items": {
"type": "object",
"properties": {
"name": {
"type": "string",
"enum": ["stage"],
},
},
"required": ["name"],
},
},
},
},
"additionalProperties": False,
},
"definitions": {
"attributes": {
"description": "system, parameter, and user attributes",
"type": "object",
"properties": {
"duration": {"type": "number", "minimum": 0},
"cwd": {"type": "string"},
"environment": {"type": "object"},
},
},
"requires": {
"description": "compatibility requirements",
"type": "object",
},
"resources": {
"description": "requested resources",
"oneOf": [
{"$ref": "#/definitions/node_vertex"},
{"$ref": "#/definitions/slot_vertex"},
],
},
"intranode_resource_vertex": {
"description": "schema for resource vertices within a node, cannot have child vertices",
"type": "object",
"required": ["type", "count"],
"properties": {
"type": {"enum": ["core", "gpu"]},
"count": {"type": "integer", "minimum": 1},
"unit": {"type": "string"},
},
"additionalProperties": False,
},
"node_vertex": {
"description": "schema for the node resource vertex",
"type": "object",
"required": ["type", "count"],
"properties": {
"type": {"enum": ["node"]},
"count": {"type": "integer", "minimum": 1},
"unit": {"type": "string"},
"with": {
"type": "array",
"minItems": 1,
"maxItems": 1,
"items": {
"oneOf": [
{"$ref": "#/definitions/slot_vertex"},
{"$ref": "#/definitions/intranode_resource_vertex"},
]
},
},
},
"additionalProperties": False,
},
"slot_vertex": {
"description": "special slot resource type - label assigns to task slot",
"type": "object",
"required": ["type", "count", "with", "label"],
"properties": {
"type": {"enum": ["slot"]},
"count": {"type": "integer", "minimum": 1},
"unit": {"type": "string"},
"label": {"type": "string"},
"schedule": {"type": "boolean"},
"exclusive": {"type": "boolean"},
"with": {
"type": "array",
2 changes: 1 addition & 1 deletion jobspec/version.py
@@ -1,4 +1,4 @@
__version__ = "0.1.0"
__version__ = "0.1.1"
AUTHOR = "Vanessa Sochat"
AUTHOR_EMAIL = "[email protected]"
NAME = "jobspec"