Active failover mode can not be auto upgraded #667

peijianju · 2020-11-12T20:33:53Z

Dear kube-arangodb team,

Our cluster uses active failover mode, and we ran into an issue when upgrading from 3.6.5 to 3.7.3.

Here are some details:

Our kube-arangodb operator is at 1.1.0.
We have the ArangoDeployment running in active failover mode, with ArangoDB at 3.6.5.
We edited the ArangoDeployment's image field to point to an ArangoDB 3.7.3 image.
An agent pod began to restart, but it ended up in an error state. Here is the log:

2020-11-12T20:29:44Z [1] ERROR [3bc7f] {startup} Database directory version (30605) is lower than current version (30703).
2020-11-12T20:29:44Z [1] ERROR [ebca0] {startup} ----------------------------------------------------------------------
2020-11-12T20:29:44Z [1] ERROR [24e3c] {startup} It seems like you have upgraded the ArangoDB binary.
2020-11-12T20:29:44Z [1] ERROR [8bcec] {startup} If this is what you wanted to do, please restart with the
2020-11-12T20:29:44Z [1] ERROR [b0360] {startup}   --database.auto-upgrade true
2020-11-12T20:29:44Z [1] ERROR [13414] {startup} option to upgrade the data in the database directory.
2020-11-12T20:29:44Z [1] ERROR [24bd1] {startup} ----------------------------------------------------------------------'

Thus, we are missing --database.auto-upgrade
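
To make the version numbers in the log easier to read, here is a minimal sketch that decodes them. It assumes the encoding is major*10000 + minor*100 + patch, which is inferred from the values in the log (30605 for 3.6.5, 30703 for 3.7.3) and not taken from ArangoDB documentation. The function names are illustrative only.

```python
import re

# Assumed encoding, inferred from the log values above:
# major * 10000 + minor * 100 + patch.
def decode_version(number: int) -> str:
    major, rest = divmod(number, 10000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"

# Matches the startup error line quoted in the log.
MISMATCH = re.compile(
    r"Database directory version \((\d+)\) is lower than current version \((\d+)\)"
)

def parse_mismatch(line: str):
    """Return (on_disk_version, binary_version), or None if the line does not match."""
    match = MISMATCH.search(line)
    if match is None:
        return None
    return decode_version(int(match.group(1))), decode_version(int(match.group(2)))

line = (
    "2020-11-12T20:29:44Z [1] ERROR [3bc7f] {startup} Database directory "
    "version (30605) is lower than current version (30703)."
)
print(parse_mismatch(line))  # ('3.6.5', '3.7.3')
```

Under that assumption, the data directory is still at 3.6.5 while the binary is 3.7.3, which matches the upgrade we attempted.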

From the ArangoDB documentation at https://www.arangodb.com/docs/stable/deployment-kubernetes-upgrading.html:

For minor level upgrades (e.g. 3.3.9 to 3.4.0) each server is stopped, then the new version is started with --database.auto-upgrade and once that is finished the new version is started with the normal arguments.

So I believe --database.auto-upgrade should be added by the kube-arangodb operator.

May I have some help, please? Thank you.
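
As a rough illustration of why this upgrade falls under the minor-level rule, the sketch below classifies a version change by comparing the major.minor.patch components. The helper names are hypothetical, and it only handles plain three-part version strings.

```python
def parse(version: str) -> tuple:
    """Split a 'major.minor.patch' string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))

def upgrade_kind(current: str, target: str) -> str:
    """Classify the change from current to target by the first differing component."""
    cur, new = parse(current), parse(target)
    if new == cur:
        return "none"
    if new < cur:
        return "downgrade"
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"

print(upgrade_kind("3.6.5", "3.7.3"))  # minor
print(upgrade_kind("3.3.9", "3.4.0"))  # minor (the example from the documentation)
```

Both our upgrade and the documentation's example come out as minor-level, which is the case where the documentation says --database.auto-upgrade is used.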


ajanikow · 2020-11-13T13:42:16Z

Hello!

The agent should be restarted with this flag by the operator. Did the agent fail during its first restart?

However, if the operator did its job properly, even a restart during the upgrade should be fine. We test this during the QA phase (it is one of our standard scenarios).

Can you share the ArangoDeployment with its current status, plus the events of the ArangoDeployment and the operator logs (grep with -i action)?

Best Regards,
Adam.

peijianju · 2020-11-13T13:52:27Z

Hi Adam,

Yes, the agent failed with an error. It seems the operator is not adding the flag.

arangodb-to-git-hub.yaml.txt

peijianju · 2020-11-13T14:02:30Z

Uploading operator.log.txt…

peijianju · 2020-11-13T14:03:39Z

Uploading events.log.txt…

peijianju · 2020-11-16T01:01:40Z

Hi, thanks for the response.

I tried a scenario with only a DB update (no operator update). This is what I did:

kubectl delete arango arangodb to remove our deployment
make sure the 1.1.0 operator is running
create an ArangoDeployment with ArangoDB 3.6.5
wait until all pods are running
kubectl edit arango arangodb and change the image to a 3.7.3 image
an agent pod then stopped, was recreated, and errored
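
For anyone who wants to repeat the image change without an interactive edit, here is a small sketch that builds an equivalent kubectl patch command. It assumes the image lives at spec.image in the ArangoDeployment and that the deployment is named arangodb as in the steps above; the image tag arangodb/arangodb:3.7.3 is a placeholder, not necessarily the image we used.

```python
import json
import shlex

def image_patch_command(deployment: str, image: str) -> list:
    """Build a kubectl command that sets spec.image via a JSON merge patch."""
    # Assumption: the ArangoDeployment image field is spec.image.
    patch = json.dumps({"spec": {"image": image}})
    return ["kubectl", "patch", "arango", deployment, "--type", "merge", "-p", patch]

# Placeholder image tag; substitute the image actually in use.
cmd = image_patch_command("arangodb", "arangodb/arangodb:3.7.3")
print(shlex.join(cmd))
```

The sketch only prints the command, so it can be reviewed before being run against a cluster.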

arangodb.yaml.txt
operator.log.txt

peijianju · 2020-12-02T14:59:44Z

Hi @ajanikow ,

We changed imageDiscoveryMode to kubelet, but a strange thing happened:

The master DB and all agents were upgraded.
The follower DB fell into an error state, with --database.auto-upgrade missing.

May I have some help?

peijianju · 2020-12-02T19:49:29Z

I think this is caused by upgrading the operator and the DB at the same time.

An operator upgrade triggers a rolling update. If we do a DB upgrade at the same time, a pod (either an agent or a DB pod) is left in an error state, saying --database.auto-upgrade true is missing.

I will need to make sure the DB upgrade happens after the rolling update has finished.

So what would be the recommended way to know that a rolling update is done?

ajanikow self-assigned this Nov 13, 2020
