Merge pull request #8638 from romayalon/romy-online-upgrade-improvements
NC | Online upgrade improvements
romayalon authored Jan 5, 2025
2 parents 8ec7dd5 + d8a032a commit 5c4b49b
Showing 5 changed files with 185 additions and 17 deletions.
168 changes: 163 additions & 5 deletions docs/NooBaaNonContainerized/Upgrade.md
@@ -3,9 +3,9 @@
1. [Introduction](#introduction)
2. [General Information](#general-information)
3. [Download Upstream RPM](#download-upstream-rpm)
-4. [Offline Upgrade](#offline-upgrade)
+4. [Offline Upgrade (Version < 5.18.0)](#offline-upgrade-version--5180)
1. [Offline Upgrade steps](#offline-upgrade-steps)
-5. [Online Upgrade](#online-upgrade)
+5. [Online Upgrade (Version >= 5.18.0)](#online-upgrade-version--5180)


## Introduction
@@ -29,7 +29,7 @@ This document provides step-by-step instructions to help you successfully upgrad

For NooBaa upstream (open source code) RPM download instructions, see [NooBaa Non Containerized Getting Started](./GettingStarted.md).

-## Offline Upgrade
+## Offline Upgrade (Version < 5.18.0)
The currently available upgrade process of NooBaa Non Containerized is an offline upgrade. Offline upgrade means that NooBaa service must be stopped during the upgrade and that NooBaa endpoints won't be handling S3 requests at the time of the upgrade.

### Offline Upgrade steps
@@ -69,5 +69,163 @@ The currently available upgrade process of NooBaa Non Containerized is an offlin
cat /etc/noobaa.conf.d/system.json
{"hostname":{"current_version":"5.17.0","upgrade_history":{"successful_upgrades":[{"timestamp":1719299738760,"completed_scripts":[],"from_version":"5.15.4","to_version":"5.17.0"}]}}}
```
-## Online Upgrade
-The process of Online Upgrade of Non Containerized NooBaa is not supported yet.
+## Online Upgrade (Version >= 5.18.0)

### Online Upgrade Goals
**1. Minimal downtime -** Ensure minimal downtime for each node.

**2. Incremental changes -** Split the upgrade into small chunks, for example by upgrading nodes one at a time. While one node is being upgraded, the other nodes remain available to handle S3 requests.

**3. Rollback capability -** A mechanism for reverting to the previous version if something goes wrong during the upgrade.

**4. Schema backward compatibility -** Changes to the account/bucket/config schemas must be backward compatible to allow a seamless transition to the new version.


### Online Upgrade Algorithm

1. Initiate config directory backup (#1).
2. Iterate over the nodes one by one. On each node -
    * Stop the NooBaa service (or suspend the node in CES).
    * Upgrade the RPM.
    * Restart the NooBaa service.
3. Wait for all hosts to finish the RPM upgrade (source code upgrade).
4. Initiate config directory backup (#2).
5. Initiate the upgrade of the config directory using the noobaa-cli upgrade command. This step is the point of no return.

Command examples for the Online Upgrade Algorithm -
1. Config directory backup -
    1. CES - `mms3 config backup /path/to/backup/location`
    2. Non CES - `cp -R /etc/noobaa.conf.d/ /path/to/backup/location`
2. Stop NooBaa service - `systemctl stop noobaa`
3. RPM upgrade on a specific node - `rpm -Uvh /path/to/new_noobaa_rpm_version.rpm`
4. Restart NooBaa service - `systemctl restart noobaa`
5. Config directory upgrade - `noobaa-cli upgrade start --expected_version=5.18.0 --expected_hosts=hostname1,hostname2,hostname3`
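
As a rough illustration of the ordering in the algorithm, the sketch below builds the command sequence as plain strings and prints it. Nothing is executed. The function name `build_online_upgrade_plan`, the `backup_dirs` parameter, the two backup locations, and the choice of the first host for the one-time steps are assumptions made for this example and are not part of NooBaa. Only the non-CES backup command is shown.

```javascript
'use strict';

// Hypothetical helper (not part of NooBaa): returns the ordered shell commands
// for the online upgrade flow. It only builds strings and executes nothing.
function build_online_upgrade_plan({ hosts, rpm_path, backup_dirs, expected_version }) {
    const [coordinator] = hosts; // assumption: the first host runs the one-time steps
    const plan = [];

    // Step 1 - config directory backup (#1), non-CES form.
    plan.push({ host: coordinator, cmd: `cp -R /etc/noobaa.conf.d/ ${backup_dirs[0]}` });

    // Steps 2-3 - upgrade one node at a time; the other nodes keep serving S3 requests.
    for (const host of hosts) {
        plan.push({ host, cmd: 'systemctl stop noobaa' });
        plan.push({ host, cmd: `rpm -Uvh ${rpm_path}` });
        plan.push({ host, cmd: 'systemctl restart noobaa' });
    }

    // Step 4 - config directory backup (#2), taken once every host runs the new RPM.
    plan.push({ host: coordinator, cmd: `cp -R /etc/noobaa.conf.d/ ${backup_dirs[1]}` });

    // Step 5 - config directory upgrade, the point of no return.
    plan.push({
        host: coordinator,
        cmd: `noobaa-cli upgrade start --expected_version=${expected_version} --expected_hosts=${hosts.join(',')}`
    });
    return plan;
}

const plan = build_online_upgrade_plan({
    hosts: ['hostname1', 'hostname2', 'hostname3'],
    rpm_path: '/path/to/new_noobaa_rpm_version.rpm',
    backup_dirs: ['/path/to/backup/location1', '/path/to/backup/location2'],
    expected_version: '5.18.0'
});
for (const step of plan) console.log(`[${step.host}] ${step.cmd}`);
```

Keeping the plan as data makes the ordering easy to review before anything runs: the second backup and the config directory upgrade appear only after every host has gone through stop, RPM upgrade, and restart.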

### Additional Upgrade Properties of `system.json`

1. New per host property -
    - config_dir_version

2. New config directory information -
    - config_directory
        - config_dir_version
        - phase
        - upgrade_package_version
        - in_progress_upgrade - (during the upgrade)
            - timestamp
            - completed_scripts
            - running_host
            - config_dir_from_version
            - config_dir_to_version
            - package_from_version
            - package_to_version
        - upgrade_history
            - last_failure (if last upgrade failed)
            - successful_upgrades

#### system.json new information examples -
1. During Upgrade - `cat /etc/noobaa.conf.d/system.json | jq .`
```json
{
"my_host1":{
"current_version":"5.18.0",
"config_dir_version": "1.0.0",
"upgrade_history":{
"successful_upgrades":[{
"timestamp":1730890665481,
"from_version":"5.17.1",
"to_version":"5.18.0"
}]
}
},
"config_directory":{
"phase":"CONFIG_DIR_LOCKED", // <- config dir is locked during an upgrade
"config_dir_version":"0.0.0", // <- config_dir_version is still the old config_dir_version
"upgrade_package_version":"5.17.1", // <- upgrade_package_version is still the old upgrade_package_version
"in_progress_upgrade":[{ // <- in_progress_upgrade property during the upgrade
"timestamp":1730890691016,
"completed_scripts": [],
"running_host":"my_host1",
"config_dir_from_version":"0.0.0",
"config_dir_to_version":"1.0.0",
"package_from_version":"5.17.1",
"package_to_version":"5.18.0"
}]
}
}
```

2. After a successful upgrade - `cat /etc/noobaa.conf.d/system.json | jq .`
```json
{
"my_host1":{
"current_version":"5.18.0",
"config_dir_version": "1.0.0",
"upgrade_history":{
"successful_upgrades":[{
"timestamp":1730890665481,
"from_version":"5.17.1",
"to_version":"5.18.0"
}]
}
},
"config_directory":{
"phase":"CONFIG_DIR_UNLOCKED", // <- after a successful upgrade, config dir is unlocked
"config_dir_version":"1.0.0", // <- config_dir_version is the new config_dir_version
"upgrade_package_version":"5.18.0", // <- upgrade_package_version is the new upgrade_package_version
"upgrade_history":{ // <- a new item in the successful upgrades array was added
"successful_upgrades":[{
"timestamp":1730890691016,
"completed_scripts":
["/usr/local/noobaa-core/src/upgrade/nc_upgrade_scripts/1.0.0/config_dir_restructure.js"],
"running_host":"my_host1",
"config_dir_from_version":"0.0.0",
"config_dir_to_version":"1.0.0",
"package_from_version":"5.17.1",
"package_to_version":"5.18.0"
}]
}
}
}
```

3. After a failed upgrade - `cat /etc/noobaa.conf.d/system.json | jq .`
```json
{
"my_host1":{
"current_version":"5.18.0",
"config_dir_version": "1.0.0",
"upgrade_history":{
"successful_upgrades":[{
"timestamp":1730890665481,
"from_version":"5.17.1",
"to_version":"5.18.0"
}]
}
},
"config_directory":{
"phase":"CONFIG_DIR_LOCKED", // <- after a failing upgrade, config dir is still locked
"config_dir_version":"0.0.0", // <- config_dir_version is still the old config_dir_version
"upgrade_package_version":"5.17.1", // <- upgrade_package_version is still the old upgrade_package_version
"upgrade_history":{
"successful_upgrades": [], // <- successful_upgrades array is empty/doesn't contain the failed upgrade
"last_failure":{ // <- a last_failure property is set in upgrade history
"timestamp":1730890676741,
"completed_scripts":[
"/usr/local/noobaa-core/src/upgrade/nc_upgrade_scripts/1.0.0/config_dir_restructure.js"],
"running_host":"my_host1",
"config_dir_from_version":"0.0.0",
"config_dir_to_version":"1.0.0",
"package_from_version":"5.17.1",
"package_to_version":"5.18.0",
"error": "Error: _run_nc_upgrade_scripts: nc upgrade manager failed!!!, Error: this is a mock error\n at NCUpgradeManager._run_nc_upgrade_scripts (/usr/local/noobaa-core/src/upgrade/nc_upgrade_manager.js:258:19)\n at async NCUpgradeManager.upgrade_config_dir (/usr/local/noobaa-core/src/upgrade/nc_upgrade_manager.js:119:13)\n at async start_config_dir_upgrade (/usr/local/noobaa-core/src/manage_nsfs/upgrade.js:52:29)\n at async Object.manage_upgrade_operations (/usr/local/noobaa-core/src/manage_nsfs/upgrade.js:22:13)\n at async main (/usr/local/noobaa-core/src/cmd/manage_nsfs.js:73:13)"
}
}
}
}
```
### Upgrade Helpers
1. NooBaa Health CLI - reports the config directory status, upgrade failures, and hosts that are blocked from config directory updates.
2. NooBaa CLI upgrade status - prints the upgrade status based on the information written in `system.json`.
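
To show how the `config_directory` fields documented above can be read together, here is a minimal sketch that derives a coarse status from a parsed `system.json` object. It is not the NooBaa CLI implementation: the function name `summarize_config_dir_upgrade` and the status labels (`IN_PROGRESS`, `FAILED`, `COMPLETED`, `UNKNOWN`, `NO_CONFIG_DIRECTORY_INFO`) are assumptions made for this example, and the sample objects are trimmed versions of the three examples above.

```javascript
'use strict';

// Hypothetical helper (not part of NooBaa): derives a coarse upgrade status
// from the config_directory section of a parsed system.json object.
function summarize_config_dir_upgrade(system_data) {
    const config_directory = system_data.config_directory;
    if (!config_directory) return { status: 'NO_CONFIG_DIRECTORY_INFO' };

    const { phase, config_dir_version, in_progress_upgrade, upgrade_history = {} } = config_directory;
    let status = 'UNKNOWN';
    if (in_progress_upgrade) {
        status = 'IN_PROGRESS'; // in_progress_upgrade exists only during the upgrade
    } else if (upgrade_history.last_failure) {
        status = 'FAILED'; // last_failure is set when the last upgrade failed
    } else if ((upgrade_history.successful_upgrades || []).length > 0) {
        status = 'COMPLETED';
    }
    // Keep only the first line of the stored error, without the stack trace.
    const error = upgrade_history.last_failure?.error?.split('\n')[0];
    return { status, phase, config_dir_version, error };
}

// Trimmed versions of the three system.json examples above.
const during = { config_directory: { phase: 'CONFIG_DIR_LOCKED', config_dir_version: '0.0.0',
    in_progress_upgrade: [{ running_host: 'my_host1' }] } };
const succeeded = { config_directory: { phase: 'CONFIG_DIR_UNLOCKED', config_dir_version: '1.0.0',
    upgrade_history: { successful_upgrades: [{ running_host: 'my_host1' }] } } };
const failed = { config_directory: { phase: 'CONFIG_DIR_LOCKED', config_dir_version: '0.0.0',
    upgrade_history: { successful_upgrades: [], last_failure: { error: 'Error: this is a mock error\n at ...' } } } };

for (const sample of [during, succeeded, failed]) {
    console.log(summarize_config_dir_upgrade(sample));
}
```

The in-progress check comes first because, per the examples, `in_progress_upgrade` is present only while an upgrade is running, whereas `upgrade_history` can carry entries from earlier runs.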
@@ -569,7 +569,7 @@ describe('nc upgrade manager - upgrade config directory', () => {
config_dir_version: this_upgrade.config_dir_to_version,
upgrade_package_version: this_upgrade.package_to_version,
upgrade_history: {
-last_failure: system_data.config_directory.upgrade_history.last_failure,
+// last_failure should be removed after a successful upgrade
successful_upgrades: [this_upgrade, ...system_data.config_directory.upgrade_history.successful_upgrades]
}
}
6 changes: 4 additions & 2 deletions src/upgrade/nc_upgrade_manager.js
@@ -251,7 +251,7 @@ class NCUpgradeManager {
*/
async _run_nc_upgrade_scripts(this_upgrade) {
try {
-await run_upgrade_scripts(this_upgrade, this.upgrade_scripts_dir, { dbg });
+await run_upgrade_scripts(this_upgrade, this.upgrade_scripts_dir, { dbg, from_version: this_upgrade.package_from_version });
} catch (err) {
const upgrade_failed_msg = `_run_nc_upgrade_scripts: nc upgrade manager failed!!!, ${err}`;
dbg.error(upgrade_failed_msg);
@@ -265,6 +265,7 @@ class NCUpgradeManager {
* 2. config_dir_version is the new version
* 3. upgrade_package_version is the new source code version
* 4. add the finished upgrade to the successful_upgrades array
+* 5. last_failure is removed after a successful upgrade
* @param {Object} system_data
* @param {Object} this_upgrade
* @returns {Promise<Void>}
@@ -279,7 +280,8 @@ class NCUpgradeManager {
upgrade_package_version: this_upgrade.package_to_version,
upgrade_history: {
...upgrade_history,
-successful_upgrades: [this_upgrade, ...successful_upgrades]
+successful_upgrades: [this_upgrade, ...successful_upgrades],
+last_failure: undefined
}
};
const updated_system_data = { ...system_data, config_directory: updated_config_directory };
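
The change above sets `last_failure: undefined` instead of deleting the key. A minimal sketch of why that removes the property from the stored file, assuming the updated object is eventually serialized with `JSON.stringify` (the sample `upgrade_history` and `this_upgrade` values are made up for illustration):

```javascript
'use strict';

// Sample data for illustration only; the shapes follow the system.json examples.
const upgrade_history = {
    successful_upgrades: [],
    last_failure: { error: 'Error: this is a mock error' }
};
const this_upgrade = { config_dir_from_version: '0.0.0', config_dir_to_version: '1.0.0' };

// Same spread pattern as in the diff: the later last_failure key overrides the spread one.
const updated_history = {
    ...upgrade_history,
    successful_upgrades: [this_upgrade, ...upgrade_history.successful_upgrades],
    last_failure: undefined
};

// In memory the key still exists, with the value undefined.
console.log('last_failure' in updated_history); // true

// JSON.stringify skips properties whose value is undefined,
// so the key is absent after a serialize/parse round trip.
const round_tripped = JSON.parse(JSON.stringify(updated_history));
console.log('last_failure' in round_tripped); // false
console.log(round_tripped.successful_upgrades.length); // 1
```

The key order matters here: placing `last_failure: undefined` after the `...upgrade_history` spread is what lets it override a previously recorded failure.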
18 changes: 10 additions & 8 deletions src/upgrade/nc_upgrade_scripts/1.0.0/config_dir_restructure.js
@@ -29,9 +29,9 @@ const nb_native = require('../../../util/nb_native');
* 2. creation of accounts_by_name/ directory
* 3. Upgrade config files of all accounts under accounts/ (old directory)
* 4. delete accounts/ directory
-* @param {*} dbg
+* @param {{dbg: *, from_version: String}} params
*/
-async function run({ dbg }) {
+async function run({ dbg, from_version }) {
try {
const config_fs = new ConfigFS(config.NSFS_NC_CONF_DIR, config.NSFS_NC_CONFIG_DIR_BACKEND);
const fs_context = config_fs.fs_context;
@@ -40,10 +40,10 @@ async function run({ dbg }) {
await config_fs.create_dir_if_missing(config_fs.accounts_by_name_dir_path);

const old_account_names = await config_fs.list_old_accounts();
-const failed_accounts = await upgrade_accounts_config_files(config_fs, old_account_names, dbg);
+const failed_accounts = await upgrade_accounts_config_files(config_fs, old_account_names, from_version, dbg);

if (failed_accounts.length > 0) throw new Error('NC upgrade process failed, failed_accounts array length is bigger than 0' + util.inspect(failed_accounts));
-await move_old_accounts_dir(fs_context, config_fs, old_account_names, dbg);
+await move_old_accounts_dir(fs_context, config_fs, old_account_names, from_version, dbg);
} catch (err) {
dbg.error('NC upgrade process failed due to - ', err);
throw err;
@@ -56,13 +56,14 @@ async function run({ dbg }) {
* 2. upgrade account config file with 3 retries
* @param {import('../../../sdk/config_fs').ConfigFS} config_fs
* @param {String[]} old_account_names
+* @param {String} from_version
* @param {*} dbg
* @returns {Promise<Object[]>}
*/
-async function upgrade_accounts_config_files(config_fs, old_account_names, dbg) {
+async function upgrade_accounts_config_files(config_fs, old_account_names, from_version, dbg) {
const failed_accounts = [];

-const backup_access_keys_path = path.join(config_fs.config_root, '.backup_access_keys_dir/');
+const backup_access_keys_path = path.join(config_fs.config_root, `.backup_access_keys_dir_${from_version}/`);
await config_fs.create_dir_if_missing(backup_access_keys_path);

for (const account_name of old_account_names) {
@@ -250,12 +251,13 @@ async function create_account_access_keys_index_if_missing(config_fs, account_up
* @param {nb.NativeFSContext} fs_context
* @param {import('../../../sdk/config_fs').ConfigFS} config_fs
* @param {String[]} old_account_names
+* @param {String} from_version
* @param {*} dbg
* @returns {Promise<Void>}
*/
-async function move_old_accounts_dir(fs_context, config_fs, old_account_names, dbg) {
+async function move_old_accounts_dir(fs_context, config_fs, old_account_names, from_version, dbg) {
const old_account_tmp_dir_path = path.join(config_fs.old_accounts_dir_path, native_fs_utils.get_config_files_tmpdir());
-const hidden_old_accounts_path = path.join(config_fs.config_root, '.backup_accounts_dir/');
+const hidden_old_accounts_path = path.join(config_fs.config_root, `.backup_accounts_dir_${from_version}/`);
try {
await nb_native().fs.mkdir(fs_context, hidden_old_accounts_path);
} catch (err) {
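
The two path changes in this file add the source version to the hidden backup directory names. The sketch below shows the resulting paths for two different `from_version` values. The helper name `backup_dir_paths` is hypothetical, and `path.posix.join` is used instead of the diff's `path.join` only so that the printed output is the same on every platform. The commit does not state the motivation; keeping backups from different upgrades apart is one plausible reading.

```javascript
'use strict';
const path = require('path');

// Hypothetical helper (not part of NooBaa) mirroring the path construction in the diff.
function backup_dir_paths(config_root, from_version) {
    return {
        access_keys: path.posix.join(config_root, `.backup_access_keys_dir_${from_version}/`),
        accounts: path.posix.join(config_root, `.backup_accounts_dir_${from_version}/`)
    };
}

const from_5_17_1 = backup_dir_paths('/etc/noobaa.conf.d', '5.17.1');
const from_5_18_0 = backup_dir_paths('/etc/noobaa.conf.d', '5.18.0');

console.log(from_5_17_1.access_keys); // /etc/noobaa.conf.d/.backup_access_keys_dir_5.17.1/
console.log(from_5_17_1.accounts);    // /etc/noobaa.conf.d/.backup_accounts_dir_5.17.1/
// Different source versions map to different directories.
console.log(from_5_17_1.accounts !== from_5_18_0.accounts); // true
```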
8 changes: 7 additions & 1 deletion src/upgrade/upgrade_utils.js
@@ -110,7 +110,13 @@ async function load_required_scripts(server_version, container_version, upgrade_
*
* @param {Object} this_upgrade
* @param {string} upgrade_scripts_dir
-* @param {Object} options
+* @param {{
+* dbg?: *,
+* db_client?: import('../util/db_client'),
+* system_store?: import('../server/system_services/system_store').SystemStore,
+* system_server?: import('../server/system_services/system_server'),
+* from_version?: String
+* }} options
*/
async function run_upgrade_scripts(this_upgrade, upgrade_scripts_dir, options) {
const from_version = this_upgrade.from_version || this_upgrade.config_dir_from_version;