Merge develop into main #2

Merged 16 commits into main from develop on Jan 15, 2025
47 changes: 47 additions & 0 deletions .github/workflows/package.yaml
@@ -0,0 +1,47 @@
name: Publish to Test PyPI

on:
  push:
    branches: ["develop", "main"]
  release:
    types: [published]

permissions:
  contents: read

jobs:
  deploy:

    runs-on: ubuntu-latest

    environment: release
    permissions:
      id-token: write # IMPORTANT: this permission is mandatory for trusted publishing

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
          cache: 'pip'
      - name: Install dependencies
        run: |
          # python -m pip install --upgrade pip
          pip install hatch
      - name: Build package
        run: hatch build
      # - name: Test package
      #   run: hatch -e test run nose2 --verbose
      - name: Publish package distributions to Test PyPI
        if: github.ref != 'refs/heads/main'
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          skip-existing: true
          repository-url: https://test.pypi.org/legacy/
      - name: Publish package distributions to PyPI
        if: github.ref == 'refs/heads/main'
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          skip-existing: true
          repository-url: https://upload.pypi.org/legacy/
135 changes: 129 additions & 6 deletions README.md
@@ -4,9 +4,132 @@ Zoo runner using Argo Workflows

## Environment variables

STORAGE_CLASS
DEFAULT_VOLUME_SIZE
DEFAULT_MAX_CORES
DEFAULT_MAX_RAM
ARGO_WF_ENDPOINT
ARGO_WF_TOKEN
- `STORAGE_CLASS`: k8s cluster RWX storage class, defaults to `standard`.
- `DEFAULT_VOLUME_SIZE`: Calrissian default RWX volume size, defaults to `12Gi`.
- `DEFAULT_MAX_CORES`: Calrissian default max cores, defaults to `4`.
- `DEFAULT_MAX_RAM`: Calrissian default max RAM, defaults to `4Gi`.
- `ARGO_WF_ENDPOINT`: Argo Workflows API endpoint, defaults to `http://localhost:2746`.
- `ARGO_WF_TOKEN`: Argo Workflows API token, which can be retrieved with: `kubectl get -n ns1 secret argo.service-account-token -o=jsonpath='{.data.token}' | base64 --decode`
- `ARGO_WF_SYNCHRONIZATION_CM`: Argo Workflows synchronization ConfigMap (with key "workflow"). The tests use `semaphore-argo-cwl-runner`.
- `ARGO_CWL_RUNNER_TEMPLATE`: Argo Workflows `WorkflowTemplate` that runs the CWL, defaults to `argo-cwl-runner`.
- `ARGO_CWL_RUNNER_ENTRYPOINT`: entrypoint of the Argo Workflows `WorkflowTemplate`, defaults to `calrissian-runner`.
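
For reference, here is a minimal sketch of how these settings can be resolved in Python; the variable names and defaults are the ones documented above, while the `runner_settings` helper itself is purely illustrative and not part of the package API:

```python
import os


def runner_settings() -> dict:
    """Illustrative only: collect the environment variables documented above,
    falling back to their documented defaults where one exists."""
    return {
        "storage_class": os.environ.get("STORAGE_CLASS", "standard"),
        "volume_size": os.environ.get("DEFAULT_VOLUME_SIZE", "12Gi"),
        "max_cores": int(os.environ.get("DEFAULT_MAX_CORES", "4")),
        "max_ram": os.environ.get("DEFAULT_MAX_RAM", "4Gi"),
        "argo_wf_endpoint": os.environ.get("ARGO_WF_ENDPOINT", "http://localhost:2746"),
        # no documented default: the token must be provided explicitly
        "argo_wf_token": os.environ.get("ARGO_WF_TOKEN"),
        # no documented default; the tests use "semaphore-argo-cwl-runner"
        "argo_wf_synchronization_cm": os.environ.get("ARGO_WF_SYNCHRONIZATION_CM"),
        "argo_cwl_runner_template": os.environ.get("ARGO_CWL_RUNNER_TEMPLATE", "argo-cwl-runner"),
        "argo_cwl_runner_entrypoint": os.environ.get("ARGO_CWL_RUNNER_ENTRYPOINT", "calrissian-runner"),
    }
```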

## Requirements

The Argo Workflows deployment must provide an Argo Workflows `WorkflowTemplate` or `ClusterWorkflowTemplate` implementing the execution of a Calrissian job and exposing the following interface:

**Input parameters:**

```yaml
templates:
  - name: calrissian-runner
    inputs:
      parameters:
        - name: parameters
          description: Parameters in JSON format
        - name: cwl
          description: CWL document in JSON format
        - name: max_ram
          default: 8G
          description: Max RAM (e.g. 8G)
        - name: max_cores
          default: '4'
          description: Max cores (e.g. 4)
        - name: entry_point
          description: CWL document entry_point
```
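
To make the contract concrete, here is a hedged sketch, not taken from this package, of invoking such a template with these input parameters through the Argo Workflows server submit API, using the endpoint, token, template and entrypoint names documented in the environment variables above:

```python
import json
import os

import requests  # assumed available for this sketch


def submit_cwl(namespace: str, cwl_document: dict, parameters: dict, entry_point: str) -> dict:
    """Illustrative sketch: submit the CWL runner WorkflowTemplate with the
    input parameters described above via the Argo Workflows submit API."""
    body = {
        "resourceKind": "WorkflowTemplate",
        "resourceName": os.environ.get("ARGO_CWL_RUNNER_TEMPLATE", "argo-cwl-runner"),
        "submitOptions": {
            "entryPoint": os.environ.get("ARGO_CWL_RUNNER_ENTRYPOINT", "calrissian-runner"),
            "parameters": [
                f"cwl={json.dumps(cwl_document)}",
                f"parameters={json.dumps(parameters)}",
                f"entry_point={entry_point}",
                "max_ram=8G",
                "max_cores=4",
            ],
        },
    }
    response = requests.post(
        f"{os.environ.get('ARGO_WF_ENDPOINT', 'http://localhost:2746')}/api/v1/workflows/{namespace}/submit",
        headers={"Authorization": f"Bearer {os.environ['ARGO_WF_TOKEN']}"},
        json=body,
    )
    response.raise_for_status()
    return response.json()
```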

**Outputs:**

```yaml
outputs:
  parameters:
    - name: results
      valueFrom:
        parameter: '{{steps.get-results.outputs.parameters.calrissian-output}}'
    - name: log
      valueFrom:
        parameter: '{{steps.get-results.outputs.parameters.calrissian-stderr}}'
    - name: usage-report
      valueFrom:
        parameter: '{{steps.get-results.outputs.parameters.calrissian-report}}'
    - name: stac-catalog
      valueFrom:
        parameter: '{{steps.stage-out.outputs.parameters.stac-catalog}}'
    - name: feature-collection
      valueFrom:
        parameter: >-
          {{steps.feature-collection.outputs.parameters.feature-collection}}
  artifacts:
    - name: tool-logs
      from: '{{steps.get-results.outputs.artifacts.tool-logs}}'
    - name: calrissian-output
      from: '{{steps.get-results.outputs.artifacts.calrissian-output}}'
    - name: calrissian-stderr
      from: '{{steps.get-results.outputs.artifacts.calrissian-stderr}}'
    - name: calrissian-report
      from: '{{steps.get-results.outputs.artifacts.calrissian-report}}'
```

Where:

- `results` is the Calrissian job stdout
- `log` is the Calrissian job stderr
- `usage-report` is the Calrissian usage report
- `stac-catalog` is the S3 path to the published STAC Catalog
- `feature-collection` is the Feature Collection containing the STAC Items produced

And the artifacts:

- `tool-logs` contains the Calrissian CWL step logs, defined as:


```yaml
artifacts:
  - name: tool-logs
    path: /calrissian/logs
    s3:
      key: '{{workflow.name}}-{{workflow.uid}}-artifacts/tool-logs'
    archive:
      none: {}
```

- `calrissian-output` is the Calrissian job stdout
- `calrissian-stderr` is the Calrissian job stderr
- `calrissian-report` is the Calrissian usage report
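
As a hedged illustration of how these surface once the Workflow has completed (again plain REST calls, nothing specific to this package), the output parameters listed above can be read from the workflow status nodes:

```python
import os

import requests  # assumed available for this sketch


def get_output_parameters(namespace: str, workflow_name: str) -> dict:
    """Illustrative sketch: fetch a completed Workflow and collect its output
    parameters (results, log, usage-report, stac-catalog, feature-collection)."""
    response = requests.get(
        f"{os.environ.get('ARGO_WF_ENDPOINT', 'http://localhost:2746')}/api/v1/workflows/{namespace}/{workflow_name}",
        headers={"Authorization": f"Bearer {os.environ['ARGO_WF_TOKEN']}"},
    )
    response.raise_for_status()
    workflow = response.json()
    # Output parameters live under status.nodes[*].outputs.parameters; the node that
    # invoked the template carries the outputs block shown above. This simplified
    # scan keeps the first value seen for each parameter name.
    outputs = {}
    for node in workflow.get("status", {}).get("nodes", {}).values():
        for parameter in node.get("outputs", {}).get("parameters", []) or []:
            outputs.setdefault(parameter["name"], parameter.get("value"))
    return outputs
```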

See the example provided in the `example` folder.

## Caveats

### Additional volumes in the Argo Workflows WorkflowTemplate that runs the CWL

Let's say one wants to mount a ConfigMap volume in the Argo Workflows WorkflowTemplate that runs the CWL.

By design, this volume must also be declared in any Argo Workflows Workflow that runs the WorkflowTemplate as a step.

This means that if the Argo Workflows WorkflowTemplate that runs the CWL declares:

```yaml
volumes:
  - name: cwl-wrapper-config-vol
    configMap:
      name: cwl-wrapper-config
      items:
        - key: main.yaml
        - key: rules.yaml
        - key: stage-in.cwl
        - key: stage-out.cwl
```

The Workflow submitted by the runner must then declare the matching volume, e.g. with the `config_map_volume` helper:

```python
config_map_volume(
    name="cwl-wrapper-config-vol",
    configMapName="cwl-wrapper-config",
    items=[
        {"key": "main.yaml"},
        {"key": "rules.yaml"},
        {"key": "stage-in.cwl"},
        {"key": "stage-out.cwl"},
    ],
    defaultMode=420,
    optional=False,
)
```
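
Note that the `items` passed to the helper are meant to mirror the keys declared on the WorkflowTemplate side, so that the Workflow submitted by the runner mounts exactly the same ConfigMap entries.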