Merge pull request #84 from Ortec-Finance/prometheus-connection
Expanding Sailfish-dispatcher to properly react to Metrics
# Conflicts:
#	k8s/sailfish/overlays/feature/kustomization.yaml
ZeidH committed May 6, 2024
1 parent 7fd7ff9 commit 7e2b116
Showing 30 changed files with 346 additions and 98 deletions.
1 change: 1 addition & 0 deletions .devcontainer/requirements.txt
@@ -6,3 +6,4 @@ kubernetes==29.0.0
requests==2.31.0
requests-oauthlib==2.0.0
python-qpid-proton==0.39.0
prometheus-api-client==0.5.5
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,6 @@
# Changelog
## v0.31.0
Expanding the Dispatcher to use Prometheus queries to determine the best destination for a Job.

## v0.30.0
Adding an example of an exporter deployment that is able to export metrics into Prometheus from a Grid Intensity Provider.
3 changes: 3 additions & 0 deletions README.md
@@ -6,6 +6,9 @@ Sailfish uses two RedHat supported operators to function: an AMQ Broker to captu
This enables Sailfish to complete distributed computations at the container level, leveraging the Public Cloud providers' flexibility in provisioning Virtual Machines.

# Videos
Demo with Carbon Aware Scheduling:
>[![Watch the Demo](https://img.youtube.com/vi/z7qnYsmZjLM/default.jpg)](https://youtu.be/z7qnYsmZjLM)
Demo with Azure Red Hat Openshift:
>[![Watch the Demo](https://img.youtube.com/vi/MwGDWiQNGPg/default.jpg)](https://youtu.be/MwGDWiQNGPg)
3 changes: 1 addition & 2 deletions docs/features/broker-scale-to-zero.md
@@ -9,7 +9,6 @@ The ScaledObject enabled by the `broker-scale-to-zero` component triggers a scal

Do not use the `ephemeral-broker` component as that might result in data loss.


## Configuring your workloads
### The Gateway
The Gateway workload must be configured to wait for the broker to be up and running. This can be done by simply pinging the broker in a loop until successful.
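As a rough illustration (not prescribed Gateway code), such a ping loop can be as simple as retrying a TCP connection to the broker service. The `QUEUE_HOST`/`QUEUE_PORT` defaults below are assumptions borrowed from the dispatcher's environment variables; use whatever your Gateway is actually configured with.

```python
import os
import socket
import time

# Assumed defaults, mirroring the QUEUE_HOST/QUEUE_PORT env vars used elsewhere
# in Sailfish; adjust to your Gateway's own configuration.
BROKER_HOST = os.environ.get("QUEUE_HOST", "sailfish-broker-hdls-svc")
BROKER_PORT = int(os.environ.get("QUEUE_PORT", "5672"))


def wait_for_broker(host: str, port: int, delay: float = 5.0) -> None:
    """Block until a TCP connection to the broker succeeds."""
    while True:
        try:
            with socket.create_connection((host, port), timeout=3):
                print(f"Broker {host}:{port} is up")
                return
        except OSError:
            print(f"Broker {host}:{port} not ready yet, retrying in {delay}s")
            time.sleep(delay)


if __name__ == "__main__":
    wait_for_broker(BROKER_HOST, BROKER_PORT)
```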
@@ -22,7 +21,7 @@ The ScaledJobs outside of the worker and manager must be added to the `triggers`
### ArgoCD Configuration
Because the `sailfish-amq-broker-autoscaler` `ScaledObject` now manages the size of the Broker, this field must be ignored by ArgoCD.
Make sure to have this configured in your ignoreDifferences:
```
```yaml
## This is needed for the broker to be able to scale to zero!
ignoreDifferences:
- group: broker.amq.io
35 changes: 23 additions & 12 deletions docs/features/multi-cluster-deployments.md
@@ -7,36 +7,47 @@ This feature connects Sailfish instances across multiple Clusters.
- Not use `broker-scale-to-zero` component

## Using SailfishCluster CRD
When enabling the `multi-cluster-controller` you deploy a Controller that listens to this yaml below:
When enabling the `multi-cluster` component, you deploy a Controller and a ScaledJob that listen to the YAML below:

```yaml
apiVersion: ortec-finance.com/v1alpha1
kind: SailfishCluster
metadata:
  name: sailfish-cluster
spec:
  cluster:
    queue: sailfishJob # This will define what queue the dispatcher will choose as a destination when the local cluster is the best choice
  triggers:
    operator: MIN
    variables:
      - type: prometheus
        query: grid_intensity_carbon_average{location="NL"}
        clusterRef: eu # This will reference clusters defined under /spec/clusters
      - type: prometheus
        query: grid_intensity_carbon_average{location="US-CAL-CISO"}
        clusterRef: local # This will use what is defined under /spec/cluster/queue
  clusters:
    - name: eu
      host: sailfish-broker-bridge-0-svc.your-namespace.svc.cluster.local
    - name: na
      host: sailfish-broker-bridge-0-svc.your-namespace.svc.cluster.local
    - name: eu
      host: sailfish-broker-bridge-0-svc.rdlabs-experiment-cas-eu-west.svc.cluster.local
```
The `SailfishCluster` yaml allows you to define the cluster that you wish to connect to your `sailfish-broker`. The `multi-cluster-controller` will then create Bridge Queues as a result.
You must add the `SailfishCluster` yaml in your own deployment configuration to create the bridge queus.
## Determining the Host
Every `sailfish-broker` has a bridge connector. You must reference the bridge connector of the remote sailfish instance that you'd like to connect to.
The `SailfishCluster` manifest allows you to define the cluster that you wish to connect to your `sailfish-broker`. The `multi-cluster-controller` will then create Bridge Queues as a result.
You must add the `SailfishCluster` manifest in your own deployment configuration to create the bridge queues.

## Changes to the Gateway
The default Queue flow looks like this:
`Gateway -> sailfishJob Queue -> sailfishTask Queue`
With Multi cluster deployments, we're adding another queue inbetween the Gateway and the Job. What that means for you is that you must reference the `sailfishDispatch` Queue. There will be a shared ScaledJob spun up to handle the messages from this queue, and dispatch them based on metrics (TODO) to either the local `sailfishJob` queue or one of the remote sailfishJob Queues.
With multi-cluster deployments, we're adding another queue in between the Gateway and the Job. What that means for you is that you must reference the `sailfishDispatch` Queue. A ScaledJob will be spun up to handle the messages from this queue and dispatch them, based on metrics, to either the local `sailfishJob` queue or one of the remote `sailfishJob` Queues. The code for the Dispatcher ScaledJob is defined in `/operator/cluster-dispatcher`.

This will result in this flow:
`Gateway -> sailfishDispatcher -> sailfishJob Queue || sailfishJob Remote Queue -> sailfishTask Queue`
`Gateway -> sailfishDispatcher -> (sailfishJob Queue OR sailfishJob Remote Queue) -> sailfishTask Queue`
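To make the dispatch step concrete, here is a simplified sketch of the kind of loop such a ScaledJob might run; it is not the actual `/operator/cluster-dispatcher` implementation. It evaluates each `variables` entry from the `SailfishCluster` triggers with a Prometheus query, applies the `MIN` operator, and forwards one message from `sailfishDispatch` to the queue mapped to the winning `clusterRef`. The destination-queue names in the mapping are illustrative assumptions.

```python
from proton.utils import BlockingConnection
from prometheus_api_client import PrometheusConnect

# Illustrative clusterRef -> queue mapping; in reality the bridge queues are
# generated by the multi-cluster-controller and their names may differ.
DESTINATIONS = {
    "local": "sailfishJob",         # /spec/cluster/queue
    "eu": "bridge-eu-sailfishJob",  # hypothetical bridge queue name
}

# Mirrors /spec/triggers/variables from the SailfishCluster example above.
VARIABLES = [
    {"query": 'grid_intensity_carbon_average{location="NL"}', "clusterRef": "eu"},
    {"query": 'grid_intensity_carbon_average{location="US-CAL-CISO"}', "clusterRef": "local"},
]


def pick_destination(prom: PrometheusConnect) -> str:
    """Evaluate every variable and apply the MIN operator: the lowest value wins."""
    values = {}
    for var in VARIABLES:
        samples = prom.custom_query(query=var["query"])
        if samples:  # each sample looks like {"metric": {...}, "value": [ts, "42.0"]}
            values[var["clusterRef"]] = float(samples[0]["value"][1])
    if not values:
        return DESTINATIONS["local"]  # fall back to the local queue
    return DESTINATIONS[min(values, key=values.get)]


def dispatch_one(broker_url: str, prometheus_url: str) -> None:
    """Take one message from sailfishDispatch and forward it to the best queue."""
    prom = PrometheusConnect(url=prometheus_url, disable_ssl=True)
    conn = BlockingConnection(broker_url)
    try:
        receiver = conn.create_receiver("sailfishDispatch")
        message = receiver.receive(timeout=30)
        sender = conn.create_sender(pick_destination(prom))
        sender.send(message)
        receiver.accept()  # acknowledge only after the message was forwarded
    finally:
        conn.close()
```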

## Determining the Host
Every `sailfish-broker` has a bridge connector already defined. You must reference the bridge connector of the remote sailfish instance that you'd like to connect to.


## Limitations
- It is currently not possible to override the remote queue to a different name; please create an RFC if this is required for your use case.
- It is currently not possible to reschedule ongoing Jobs; all Jobs go through the `sailfishDispatch` queue.
- It is currently not possible to use the `multi-cluster-deployments` feature with `broker-scale-to-zero`. This is due to the broker needing to be online to sustain the bridge.
- It is currently not possible to schedule jobs outside of your cluster; this needs the implementation of a RedHat AMQ Interconnect.
2 changes: 1 addition & 1 deletion docs/features/observability.md
@@ -5,7 +5,7 @@ If you want to integrate these Grafana dashboards you have to patch `/metadata/l

You can use inline patching in your ArgoCD Application, like so:

```
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
2 changes: 1 addition & 1 deletion docs/features/spot-machinesets.md
@@ -8,7 +8,7 @@ In sailfish, that is no problem, as the message in the queue will be put back fo
## How to enable
In your MachineSet ArgoCD Application, simply add the parameter:

```
```yaml
helm:
  parameters:
    - name: enableSpotVM
8 changes: 4 additions & 4 deletions docs/sailfish-machines.md
@@ -8,13 +8,13 @@ However, if the Manager is heavy and scalable, we recommend to add a `nodeSelect
Using the MachineSet Helm chart declared in `/k8s/cluster-config/machinesets` you will get three machinesets, one in each zone.

All these Sailfish Machines are by default Tainted with this:
```
```yaml
- effect: NoSchedule
  key: application
  value: sailfish-hpc
```
To have your Workers scheduled here, they tolerate this taint by default, declared under `/spec/jobTargetRef/template/spec`:
```
```yaml
tolerations:
  - effect: NoSchedule
    key: application
@@ -26,13 +26,13 @@ In addition to the tolerations, you also need to point the workloads to land on

## NodeSelector Label
All Sailfish Machines have this label:
```
```yaml
metadata:
  labels:
    sailfish/application: {{ .Values.application }}
```
By default the Workers also implement this label in `spec/jobTargetRef/template/spec`:
```
```yaml
nodeSelector:
  sailfish/application: sailfish
```
11 changes: 6 additions & 5 deletions docs/setup.md
@@ -11,14 +11,13 @@ This design makes it easy to upgrade to new features of Sailfish! Read more in #2 D
You can find examples of the ArgoCD Applications under `sailfish-example/argocd`.

# 1. Cluster Configuration
- Step A and B should be done once per Cluster.
- Step C is something you need to setup atleast once yourself.
- Steps A, B, and C should be done once per Cluster.
- Step D is something you need to set up at least once yourself.

## 1a Prometheus
Sailfish requires Openshift user-workload monitoring to be enabled; check out how to do that here:
https://docs.openshift.com/container-platform/4.13/monitoring/enabling-monitoring-for-user-defined-projects.html


## 1b Operators
To deploy or synchronize your operators to work with the current sailfish version,

@@ -29,10 +28,12 @@ To deploy or synchronize your operators to work with the current sailfish versio

The operator configs are defined in `k8s/cluster-config/operators`. You don't need to modify anything here!

## 1c Sailfish Operator
With Sailfish we also deploy a CRD to support the Sailfish Operator. There is currently only one CRD, called `SailfishCluster`. Make sure to update your RBAC so that both ArgoCD and you can use this manifest.

## 1c Machines
## 1d Machines
We recommend deploying separate machinesets for Sailfish; we've provided an example of machinesets configured to work with ARO in this folder: `/k8s/cluster-config/machinesets`. You can deploy these with an ArgoCD application; find the example here: `sailfish-example/argocd/apps/machines.yaml`.
```
```yaml
helm:
  parameters:
    - name: clusterName
3 changes: 2 additions & 1 deletion k8s/cluster-config/operators/kustomization.yaml
@@ -1,4 +1,5 @@
resources:
- knative
- keda
- amq
- sailfish
48 changes: 40 additions & 8 deletions k8s/cluster-config/operators/sailfish/crd.yaml
@@ -17,6 +17,9 @@ spec:
properties:
clusters:
type: array
required:
- name
- status
items:
type: object
properties:
@@ -26,27 +29,56 @@ spec:
type: string
queue:
type: string
query:
type: string
reason:
type: string
spec:
type: object
required:
- cluster
properties:
triggers:
type: array
items:
type: object
properties:
promql:
type: string
carbon-aware:
type: string
type: object
properties:
operator:
type: string
variables:
description: "List of queries that will be evaluated by the operator"
type: array
items:
type: object
properties:
type:
type: string
query:
type: string
clusterRef:
type: string
description: "Reference to the cluster that will be used if the value of the query is best compared to the other variables"
required:
- type
- query
- clusterRef
cluster:
type: object
description: "Defines the Local Sailfish Cluster"
properties:
queue:
type: string
clusters:
type: array
description: "Defines the Remote Sailfish Clusters"
items:
type: object
properties:
name:
type: string
host:
type: string
description: "To reference a remote Sailfish Cluster, use this, the Queues will be automatically generated"
required:
- name
scope: Namespaced
names:
plural: sailfishclusters
4 changes: 2 additions & 2 deletions k8s/sailfish/base/foundation/manager-autoscaler.yaml
@@ -7,11 +7,11 @@ metadata:
app.kubernetes.io/component: sailfish-manager
app.kubernetes.io/instance: sailfish-manager
app.kubernetes.io/name: sailfish-manager
app.kubernetes.io/part-of: sailfish-app
app.kubernetes.io/part-of: sailfish-manager
app.openshift.io/runtime: python
spec:
jobTargetRef:
ttlSecondsAfterFinished: 100000 # completed or failed jobs will be removed after 100 seconds note that it successful an failed JobsHistoryLimit also afect kept jobs
ttlSecondsAfterFinished: 100000
template:
spec:
# Provide your own container that you'd like to run as a Job Manager
12 changes: 12 additions & 0 deletions k8s/sailfish/base/foundation/sailfish-operator/permissions/rb.yaml
@@ -8,4 +8,16 @@ subjects:
roleRef:
kind: Role
name: sailfish-operator-role
apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: sailfish-operator-view
subjects:
- kind: ServiceAccount
name: sailfish-operator
roleRef:
kind: ClusterRole
name: view
apiGroup: rbac.authorization.k8s.io
Changes to the sailfish-operator Role rules:
@@ -36,4 +36,11 @@ rules:
- "secrets"
verbs:
- "get"
- "list"
- "list"
- apiGroups:
- ""
resources:
- "namespaces"
- "services"
verbs:
- "get"
6 changes: 3 additions & 3 deletions k8s/sailfish/base/foundation/worker-autoscaler.yaml
@@ -7,11 +7,11 @@ metadata:
app.kubernetes.io/component: sailfish-worker
app.kubernetes.io/instance: sailfish-worker
app.kubernetes.io/name: sailfish-worker
app.kubernetes.io/part-of: sailfish-app
app.kubernetes.io/part-of: sailfish-worker
app.openshift.io/runtime: python
spec:
jobTargetRef:
ttlSecondsAfterFinished: 100 # completed or failed jobs will be removed after 100 seconds note that it successful an failed JobsHistoryLimit also afect kept jobs
ttlSecondsAfterFinished: 100
template:
spec:
# Make sure to select the node that belongs to your solution.
@@ -48,5 +48,5 @@ spec:
parallelism: 1
successfulJobsHistoryLimit: 1 # Optional. Default: 100. How many completed jobs should be kept.
pollingInterval: 2
maxReplicaCount: 5 # Optional. Default: 100
maxReplicaCount: 50 # Optional. Default: 100
failedJobsHistoryLimit: 5 # Optional. Default: 100. How many failed jobs should be kept.
Changes to the sailfish-multi-cluster-controller Deployment:
@@ -2,6 +2,8 @@ apiVersion: apps/v1
kind: Deployment
metadata:
name: sailfish-multi-cluster-controller
labels:
app.kubernetes.io/part-of: sailfish-operator
spec:
replicas: 1
selector:
Changes to the sailfish-dispatcher ScaledJob:
@@ -7,13 +7,14 @@ metadata:
app.kubernetes.io/component: sailfish-dispatcher
app.kubernetes.io/instance: sailfish-dispatcher
app.kubernetes.io/name: sailfish-dispatcher
app.kubernetes.io/part-of: sailfish-app
app.kubernetes.io/part-of: sailfish-operator
app.openshift.io/runtime: python
spec:
jobTargetRef:
ttlSecondsAfterFinished: 100 # completed or failed jobs will be removed after 100 seconds note that it successful an failed JobsHistoryLimit also afect kept jobs
ttlSecondsAfterFinished: 100
template:
spec:
serviceAccountName: sailfish-operator
containers:
- name: dispatcher
image: sailfish-dispatcher
@@ -39,6 +40,8 @@ spec:
value: sailfish-broker-hdls-svc.$(MY_POD_NAMESPACE).svc.cluster.local
- name: QUEUE_PORT
value: '5672'
- name: PROMETHEUS_URL
value: https://thanos-querier.openshift-monitoring.svc.cluster.local:9092
resources:
requests:
memory: 200M
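The new `PROMETHEUS_URL` points at OpenShift's in-cluster Thanos querier, which only accepts authenticated requests. Below is a minimal sketch of how a dispatcher Job could query it with `prometheus-api-client`, assuming the standard in-cluster service account token path and the tenancy `namespace` parameter used by the port-9092 endpoint; neither of these details is shown in this diff.

```python
import os

from prometheus_api_client import PrometheusConnect

# Standard in-cluster service account paths; assumptions, not shown in this diff.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"
NAMESPACE_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/namespace"


def prometheus_client() -> PrometheusConnect:
    """Build a client for the Thanos querier using the pod's service account token."""
    with open(TOKEN_PATH) as f:
        token = f.read().strip()
    return PrometheusConnect(
        url=os.environ["PROMETHEUS_URL"],
        headers={"Authorization": f"Bearer {token}"},
        disable_ssl=True,  # or point requests at the cluster CA bundle instead
    )


if __name__ == "__main__":
    with open(NAMESPACE_PATH) as f:
        namespace = f.read().strip()
    prom = prometheus_client()
    # The tenancy endpoint on port 9092 scopes queries to a single namespace.
    samples = prom.custom_query(
        query='grid_intensity_carbon_average{location="NL"}',
        params={"namespace": namespace},
    )
    print(samples)
```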