
[OPENJDK-2992] Create DeploymentConfig and Route for jlink apps #514

Open · wants to merge 11 commits into jlink-dev

Conversation

Josh-Matsuoka (Contributor)
Addresses https://issues.redhat.com/browse/OPENJDK-2992

Cleanup/Continuation of #500

This cleans up the original PR: it brings the created objects in line with the new naming convention, adds the missing target port, and converts the Deployment to a DeploymentConfig so it works with ImageStreamTags.
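
For context, a minimal sketch of the pattern this conversion enables (the names and the ${APPLICATION_NAME}/${TARGET_PORT} parameters are illustrative, not the exact contents of this PR): unlike a plain Deployment, a DeploymentConfig can carry an ImageChange trigger that resolves an ImageStreamTag into the container's image field at deploy time.

# Illustrative sketch only -- not the exact objects in this PR.
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  name: ${APPLICATION_NAME}
spec:
  replicas: 1
  selector:
    app: ${APPLICATION_NAME}
  template:
    metadata:
      labels:
        app: ${APPLICATION_NAME}
    spec:
      containers:
      - name: ${APPLICATION_NAME}
        # Placeholder; the ImageChange trigger below swaps in the image
        # resolved from the ImageStreamTag.
        image: ' '
        ports:
        - containerPort: ${{TARGET_PORT}}
  triggers:
  - type: ConfigChange
  - type: ImageChange
    imageChangeParams:
      automatic: true
      containerNames:
      - ${APPLICATION_NAME}
      from:
        kind: ImageStreamTag
        name: ${APPLICATION_NAME}:latest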

@jmtd (Member) left a comment

Thanks for working on this! Some notes:

  • Can we get a default value for TARGET_PORT? (A sketch covering both port changes follows this list.)
  • Please add TARGET_PORT to the examples in templates/jlink/README.md.
  • The Port parameter in the Service object needs templating too; the default of 80 doesn't work with the quickstart we use in README.md.
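
Concretely, the port wiring asked for above could look something like this; APPLICATION_NAME and the surrounding template skeleton are assumptions, only TARGET_PORT comes from the notes:

parameters:
- name: TARGET_PORT
  description: Port the application container listens on
  value: "8080"
objects:
- apiVersion: v1
  kind: Service
  metadata:
    name: ${APPLICATION_NAME}
  spec:
    ports:
    - port: ${{TARGET_PORT}}        # was hardcoded to 80
      targetPort: ${{TARGET_PORT}}
    selector:
      app: ${APPLICATION_NAME}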

@sefroberg left a comment

The structure looks correct.

@jmtd (Member) left a comment

Thanks for adding those; this is much closer. I've just tried it again in CRC and it's still not quite there. One more difference I notice between what this creates and what is created when I do "+Add" by hand in the console: the latter creates a Pod object as well. Perhaps we need to add that.

@jmtd (Member) commented Nov 20, 2024

Here's a dump of the Pod object that was created in my CRC instance when I did a manual "+Add". I'm guessing the vast majority of fields here aren't needed in the template, and somehow we'll have to resolve the ImageStream versus image: discrepancy for this.

apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "openshift-sdn",
          "interface": "eth0",
          "ips": [
              "10.217.1.68"
          ],
          "default": true,
          "dns": {}
      }]
    openshift.io/generated-by: OpenShiftWebConsole
    openshift.io/scc: restricted-v2
    seccomp.security.alpha.kubernetes.io/pod: runtime/default
  creationTimestamp: "2024-11-20T10:27:01Z"
  generateName: join-76fb988d6-
  labels:
    app: join
    deployment: join
    pod-template-hash: 76fb988d6
  name: join-76fb988d6-gmcnp
  namespace: jlink1
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: join-76fb988d6
    uid: ed6e2e33-3d6f-40b3-9520-33f039c5c14f
  resourceVersion: "39753"
  uid: 05cea876-ef10-41ad-96d7-872468638fc0
spec:
  containers:
  - image: image-registry.openshift-image-registry.svc:5000/jlink1/quarkus-quickstart-lightweight-image@sha256:26654e381a0d87e88b6b2903ee2b0d1431a53a20cbb2c926a1932d9215f15f9c
    imagePullPolicy: IfNotPresent
    name: join
    ports:
    - containerPort: 8080
      protocol: TCP
    resources: {}
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1000660000
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-r8pvn
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: default-dockercfg-m2v5c
  nodeName: crc-97g8f-master-0
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1000660000
    seLinuxOptions:
      level: s0:c26,c5
    seccompProfile:
      type: RuntimeDefault
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-r8pvn
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
      - configMap:
          items:
          - key: service-ca.crt
            path: service-ca.crt
          name: openshift-service-ca.crt
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-11-20T10:27:01Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2024-11-20T10:27:07Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2024-11-20T10:27:07Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2024-11-20T10:27:01Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://b171de0db71303e4de17949e27f41503aa7a1f36b6696d28ae7cbcb6cea7a4c6
    image: image-registry.openshift-image-registry.svc:5000/jlink1/quarkus-quickstart-lightweight-image@sha256:26654e381a0d87e88b6b2903ee2b0d1431a53a20cbb2c926a1932d9215f15f9c
    imageID: image-registry.openshift-image-registry.svc:5000/jlink1/quarkus-quickstart-lightweight-image@sha256:26654e381a0d87e88b6b2903ee2b0d1431a53a20cbb2c926a1932d9215f15f9c
    lastState: {}
    name: join
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2024-11-20T10:27:06Z"
  hostIP: 192.168.126.11
  phase: Running
  podIP: 10.217.1.68
  podIPs:
  - ip: 10.217.1.68
  qosClass: BestEffort
  startTime: "2024-11-20T10:27:01Z"

@jmtd (Member) commented Dec 18, 2024

Thanks for continuing to work on this. We're not quite there yet. Here's the
process I go through to review it each time, and the acceptance criteria to
merge:

  1. create a new namespace in my CRC instance (or a new CRC instance)
  2. follow the steps in templates/jlink/README.md
  3. watch the builds from the web console
  4. visit "Topology" and see what's up

With the current state (adb47cce3197581b42f364bec17b353fd6b57998), the
label on the top-level Template object means the template won't load.

Once that's resolved, the final state is a DeploymentConfig and a Pod that
are separate in the Topology view, instead of part of the same Application.
In this pic, the two objects on the left are created from the template, and
the objects on the right are created when I do "+Add" manually:

[screenshot "topology": Topology view with the template-created objects on the left and the manually-added objects on the right]

Where we want to get to is to have the objects combined like in the manual
case. I think this should be possible with DeploymentConfigs, even though
the manual approach creates a Deployment instead (we've discussed the
difficulties of creating a Deployment in the template, vis-a-vis the
ImageStream versus image: issue). I think it's a matter of labelling,
but I'm not sure.
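
If it is labelling: the Topology view groups workloads into an Application via the app.kubernetes.io/part-of label, which "+Add" sets automatically. A sketch of what the template-created objects might carry (the label value is illustrative):

metadata:
  labels:
    app: ${APPLICATION_NAME}
    # Topology combines objects sharing this label into one Application.
    app.kubernetes.io/part-of: ${APPLICATION_NAME}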

Clicking on the "external link" icon, to visit the URI for the route,
results in "Application is not available" for the Route created by
the template, but works for the manual case.

The acceptance criteria are thus:

  1. template is valid
  2. builds auto-trigger
  3. Pod, DeploymentConfig are created
  4. Pod, DeploymentConfig are unified in the Topology view
  5. the relevant Route works

@Josh-Matsuoka (Contributor, Author)

The latest push fixes the label issue.

The problem doesn't seem to be with the route; rather, it's with the container itself. Looking at the logs before it crashes, I'm seeing:

ERROR: Failed to start application (with profile [prod])
java.lang.RuntimeException: Failed to start quarkus
    at io.quarkus.runner.ApplicationImpl.doStart(Unknown Source)
    at io.quarkus.runtime.Application.start(Application.java:101)
    at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:111)
    at io.quarkus.runtime.Quarkus.run(Quarkus.java:71)
    at io.quarkus.runtime.Quarkus.run(Quarkus.java:44)
    at io.quarkus.runtime.Quarkus.run(Quarkus.java:124)
    at io.quarkus.runner.GeneratedMain.main(Unknown Source)
Caused by: java.lang.IllegalStateException: Unable to create folder at path '/tmp/vertx-cache/8814133929435640674'

It looks like Quarkus isn't able to create temporary files or directories inside the container when it's built through the template. I don't see any differences in the pod specs that should be causing this; do you have any ideas, @jmtd?
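
A sketch of one possible workaround, assuming the failure really is just the write under /tmp: mount an emptyDir there in the pod template, so the path is writable regardless of which UID the cluster assigns (the volume name is made up):

spec:
  containers:
  - name: ${APPLICATION_NAME}
    volumeMounts:
    - name: tmp            # hypothetical volume name
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}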

@Josh-Matsuoka (Contributor, Author)

Upon further investigation, it looks like the user we switch to for running the java command (USER 185) doesn't have write permission on the filesystem, so it can't create the Quarkus cache directory. That makes Quarkus crash and the pod enter CrashLoopBackOff, which is why the application is unavailable.

Removing USER 185, and presumably running as root, makes everything work as expected, but that probably isn't the right solution here. Do you have any suggestions for a different user, and/or for fixing this permissions issue?
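
For reference, the usual OpenShift-friendly pattern keeps a non-root USER but hands writable paths to the root group, since restricted SCCs run the container as an arbitrary UID with GID 0. A sketch of the relevant image-build lines (/deployments is a guess at this image's layout, not the actual path):

# Make the app directory group-0 writable so an arbitrary UID can use it.
RUN chgrp -R 0 /deployments && chmod -R g=u /deployments
# Keep a non-root user; the runtime UID is assigned by the SCC.
USER 185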

@Josh-Matsuoka (Contributor, Author)

@jmtd any thoughts on the above?
