
Is it possible to import a current cluster to temporal operator? #562

Open
mfractal opened this issue Nov 23, 2023 · 4 comments

@mfractal

We have a production cluster deployed via the helm chart. I'd like to migrate to the operator without any downtime if possible. What would be the best path to do it?

@alexandrevilain
Owner

Hi!

I've never done that, but it could be doable. Let's try!

Here's what I'd suggest:

Create a new TemporalCluster on a dev cluster (with brand-new storage) and fill in the spec fields so that the operator generates a configmap that looks as close as possible to the configmap of the currently running cluster (the config generated by the helm chart). The only diffs should be the database users, passwords and endpoints. If you find other diffs in the configmap, please raise them in this issue; maybe we're missing a feature in the operator spec.
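
For reference, a minimal sketch of such a TemporalCluster manifest (field names follow the operator's v1beta1 examples; the cluster name, version, shard count, database endpoints and secret below are placeholders you would replace with the values from your helm-generated config):

    apiVersion: temporal.io/v1beta1
    kind: TemporalCluster
    metadata:
      name: prod
    spec:
      version: 1.21.2          # match the Temporal version your helm release runs
      numHistoryShards: 512    # must match the shard count of the existing cluster
      persistence:
        defaultStore:
          sql:
            user: temporal
            pluginName: postgres
            databaseName: temporal
            connectAddr: postgres.example.svc.cluster.local:5432
            connectProtocol: tcp
          passwordSecretRef:
            name: postgres-password
            key: PASSWORD
        visibilityStore:
          sql:
            user: temporal
            pluginName: postgres
            databaseName: temporal_visibility
            connectAddr: postgres.example.svc.cluster.local:5432
            connectProtocol: tcp
          passwordSecretRef:
            name: postgres-password
            key: PASSWORD

Diffing the configmap generated on the dev cluster against the helm-generated one will show what's missing.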

Then try to make the services managed by the operator join your existing cluster. To do that, create a new TemporalCluster with the spec you filled in on the dev cluster. The operator will try to configure the database for you. To make it skip the persistence reconciliation, update the TemporalCluster's status with the following fields:

    persistence:
      defaultStore:
        created: true
        schemaVersion: 1.21.2
        setup: true
        type: postgres
      visibilityStore:
        created: true
        schemaVersion: 1.21.2
        setup: true
        type: postgres

(update these with the right values for your setup).
This will make the operator only deploy the components.
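
For what it's worth, one way to apply such a status patch is through the status subresource (a sketch, assuming kubectl >= 1.24 and a TemporalCluster named "prod"; the patch file name is arbitrary and its content wraps the fields above under "status:"):

    # status-patch.yaml:
    #   status:
    #     persistence:
    #       defaultStore:
    #         ...
    #       visibilityStore:
    #         ...
    kubectl patch temporalcluster prod \
      --subresource=status \
      --type=merge \
      --patch-file=status-patch.yaml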

If the services deployed by the operator successfully join the existing cluster, you'll be able to uninstall the helm chart.
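
One quick way to check that the operator-managed services joined the same cluster (a sketch, assuming tctl is available in one of the pods and configured against the frontend): the describe output includes cluster membership information, so both the helm-managed and operator-managed pods should show up there.

    # Run from any pod that has tctl pointing at the frontend.
    tctl admin cluster describe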

I have no clue if it could work, let's try this :)

@yujunz
Contributor

yujunz commented Nov 25, 2023

Then try to make services managed by the operator to join your existing cluster. To do that, create a new TemporalCluster with the spec you filled on the dev cluster. The operator will try to configure the database for you. To make it skip the persistence reconciliation, update the TemporalCluster's status

Interesting. We have a similar requirement. Could you elaborate a bit on the terms and steps?

One known issue during our previous attempt to take ownership of an existing database was a conflict on the cluster metadata, which appears to be a checksum and is hard to reverse-engineer back to its source. The workaround was deleting it and letting Temporal regenerate it from the configmap created by the operator.

@Aoao54

Aoao54 commented Nov 30, 2023

Hi @mfractal @alexandrevilain
We have a similar requirement and gave this a try.

generate a configmap that looks like (as much as possible) with the current running cluster configmap (the config generated by the helm chart)

But in our production cluster's clusterMetadata config, the cluster name is "active", the Helm chart default :)

    clusterMetadata:
      enableGlobalDomain: false
      failoverVersionIncrement: 10
      masterClusterName: "active"
      currentClusterName: "active"
      clusterInformation:
        active:
          enabled: true
          initialFailoverVersion: 1
          rpcName: "temporal-frontend"
          rpcAddress: "127.0.0.1:7933"

Looks like it's impossible to make those two configs match, since the clusterMetadata config is auto-generated by the operator and isn't configurable right now, unless we set our cluster name to "active".

Our solution involves some downtime, but it works:

  1. Scale down the production services to zero replicas
  2. Deploy a TemporalCluster with the operator, using the same prod DB

The new deployment will take over the running and closed workflow executions.
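
For step 1, something along these lines (a sketch; the deployment names below are typical helm-release names and may differ in your release, so check kubectl get deploy first):

    kubectl scale deployment \
      temporal-frontend temporal-history temporal-matching temporal-worker \
      --replicas=0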

If you get "panic: Cluster info initial versions have duplicates" with the new deployment, the cause is duplicated cluster metadata left over from the old deployment.

You can bypass it by deleting the old clusterMetadata entry, which is stored in the cluster_metadata_info table of the default DB.

[screenshot of the duplicated cluster metadata omitted]
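
As a rough sketch of that cleanup for PostgreSQL (assuming the default store's cluster_metadata_info table and a stale cluster named "active"; column names may differ across schema versions, so inspect the rows and back up the database before deleting anything):

    -- List the cluster entries the cluster knows about.
    SELECT metadata_partition, cluster_name FROM cluster_metadata_info;

    -- Remove only the stale entry left behind by the helm-managed deployment.
    DELETE FROM cluster_metadata_info WHERE cluster_name = 'active';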

@alexandrevilain
Owner

Hi!
Good news, I think that #494 would help you :)
