Add overview of preparing for DR #3558

Open
wants to merge 14 commits into
base: master
2 changes: 0 additions & 2 deletions guides/common/assembly_deployment-path.adoc
@@ -26,6 +26,4 @@ include::modules/con_best-practices-for-role-based-access-control-in-project.ado

include::modules/con_configuring-provisioning.adoc[leveloffset=+1]

include::modules/con_planning-for-disaster-recovery.adoc[leveloffset=+1]

include::modules/con_additional-deployment-tasks.adoc[leveloffset=+1]
5 changes: 5 additions & 0 deletions guides/common/assembly_preparing-for-disaster-recovery.adoc
@@ -0,0 +1,5 @@
include::modules/con_preparing-for-disaster-recovery.adoc[]

include::modules/con_overview-of-recommended-disaster-recovery-plans.adoc[leveloffset=+1]

include::modules/ref_additional-resources-preparing-for-disaster-recovery.adoc[leveloffset=+1]
@@ -0,0 +1,80 @@
[id="overview-of-recommended-disaster-recovery-plans"]
= Overview of recommended disaster recovery plans

Choose a disaster recovery plan that best helps ensure the continuity of {Project} services in your deployment.

//[IMPORTANT]
//====
//During recovery, you will need to ensure that the hostname of your {ProjectServer} does not change.
//The IP address can change.
//====

Virtualizing your {ProjectServer}::

@Lennonka (Contributor), Jan 20, 2025:

Suggested change
Virtualizing your {ProjectServer}::
.Virtualizing your {ProjectServer}

I would personally prefer informal headings for the plan names but it isn't blocking your PR.

How do I back up?:::
Virtualize your {ProjectServer} and use the hypervisor tools to take virtual machine snapshots of the server.
This method is suitable if you can run {Project} on top of virtualization.
How will I recover in case of a disruptive event?:::
To recover {Project} services, restore a virtual machine snapshot.
Advantages:::
Disadvantages and expected impact:::
Expect some amount of data inconsistency after recovery, depending on the age of your last snapshot.
You lose any data changes made after the snapshot that you restore from was taken.
//While taking snapshots frequently will result in smaller amounts of data loss, creating the snapshots takes time and the snapshots themselves also take up space.
//When planning your snapshot taking schedule, compare these factors with your tolerance for data loss.

Active and passive {ProjectServer}, with external storage::
How do I back up?:::
Store the following critical data on network-attached storage: the content in `/var/lib/pulp` and the database in `/var/lib/pgsql`.
Replicate this storage into a different data center.
Attach the storage to a {ProjectServer} that is a clone of the primary {ProjectServer} but runs passively.
//You can mount the network attached storage directly by both the active and the passive {ProjectServer}s or you can replicate the network attached storage on some interval to another location.
How will I recover in case of a disruptive event?:::
To recover {Project} services, switch DNS records of the active {ProjectServer} with the passive {ProjectServer}.
This ensures that the passive server becomes the active server.
All hosts remain connected without configuration updates.
Advantages:::
Disadvantages and expected impact:::
If the network-attached storage is replicated to another location, expect some amount of data inconsistency after recovery, depending on the synchronization interval.

Active and passive {ProjectServer}, with backup and restore::
How do I back up?:::
Create periodic backups of your {ProjectServer}.
Copy each backup to a passive {ProjectServer} and restore it on the passive server.
How will I recover in case of a disruptive event?:::
To recover {Project} services, switch DNS records of the active {ProjectServer} with the passive {ProjectServer}.
This ensures that the passive server becomes the active server.
All hosts remain connected without configuration updates.
//Use a low DNS time to live (TTL) value to help ensure that hosts reach the new active {ProjectServer} quickly.
//Consider your tolerance for how long it takes before your hosts are able to reconnect and access the correct {ProjectServer} and set your TTL according to your needs.
Advantages:::
Disadvantages and expected impact:::
Expect some amount of data inconsistency after recovery, depending on how often you take and restore backups and on how long the restore process takes to complete.

Dual active {ProjectServer}::
How do I back up?:::
Operate an active, independent {ProjectServer} per data center.
Hosts from each data center are registered to the {ProjectServer} in that data center.
Then configure automation to ensure recovery in case of a disruptive event.
For example, you can periodically run a health check on each host.
If the health check discovers that the {ProjectServer} the host is registered to does not resolve, the host is re-registered to the other {ProjectServer}.
+
To minimize downtime, you can automate the recovery in various ways.
For example, you can use the {Project} Ansible collection.
For more information, see {AdministeringDocURL}Managing_Project_with_Ansible_Collections_admin[Managing {Project} with Ansible collections] in _{AdministeringDocTitle}_.
How will I recover in case of a disruptive event?:::
To recover {Project} services, reconfigure hosts to point to the {ProjectServer} in the other data center.
You will need to re-register each host to the new server.
Advantages:::
Disadvantages and expected impact:::
ifdef::katello,orcharhino,satellite[]
You must coordinate content synchronization and content view creation so that the same content views exist in each {Project}, which prevents content drift.
Content drift occurs when available content deviates from the intended state defined by a content view.
If you fail to prevent content drift, expect inconsistency in the content that is available to hosts.

//As an alternative, you can implement the following setups:
//* You can choose one of the two {ProjectServer}s to be the source of truth for content synchronization and content view creation.
//In this case, the other {ProjectServer} synchronizes its content from the first {ProjectServer}.
//* You can configure a third {ProjectServer} to act as the content definition source of truth.
//In this case, the other {ProjectServer}s act as management servers.
//+
//For more information, see {ContentManagementDocURL}Synchronizing_Content_Between_Servers_content-management[Synchronizing content between {ProjectServerTitle}s] in _{ContentManagementDocTitle}_.
endif::[]
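
The health check described for the dual active plan could be sketched roughly as follows.
This is only an illustration under stated assumptions, not the recommended automation: the function names, the `reregister` hook, and the example hostnames are hypothetical, and the sketch uses plain Python rather than the {Project} Ansible collection.
It checks whether the server a host is currently registered to still resolves and, if not, hands the other server to a re-registration hook that you supply.

```python
import socket
from typing import Callable


def resolves(hostname: str) -> bool:
    """Return True if DNS resolution for the hostname succeeds."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def health_check(
    current: str,
    other: str,
    reregister: Callable[[str], None],
    check: Callable[[str], bool] = resolves,
) -> str:
    """Run one health check and return the server the host should use.

    `reregister` is a hypothetical hook: supply whatever registration
    procedure your deployment uses. It is called only on failover.
    """
    if check(current):
        # The current server still resolves; nothing to do.
        return current
    # The current server does not resolve; fail over to the other one.
    reregister(other)
    return other
```

For example, calling `health_check("project-dc1.example.com", "project-dc2.example.com", reregister=my_hook)` from a scheduled job on each host returns the first hostname while it resolves and calls `my_hook` with the second hostname otherwise.
Passing the resolution check as a parameter keeps the decision logic testable without depending on DNS.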
@@ -0,0 +1,5 @@
[id="preparing-for-disaster-recovery"]
= Preparing for disaster recovery

{Team} recommends preparing a disaster recovery plan to ensure the continuity of {Project} services in case of a disruptive event.
These guidelines help ensure that you will be able to restore your {Project} deployment to an operational state after an incident.
@@ -1,15 +1,11 @@
[id="planning-for-disaster-recovery_{context}"]
= Planning for disaster recovery
[id="additional-resources-preparing-for-disaster-recovery"]
= Additional resources

Ensure to back up your {Project} data so that you can recover your {Project} deployment in case of a disaster.

To create backups of your {ProjectServer} and {SmartProxyServers}, use the `{foreman-maintain} backup` command.
* To create backups of your {ProjectServer} and {SmartProxyServers}, use the `{foreman-maintain} backup` command.
For more information, see {AdministeringDocURL}backing-up-{project-context}-server-and-{smart-proxy-context}_admin[Backing up {ProjectServer} and {SmartProxyServer}] in _{AdministeringDocTitle}_.

To backup your hosts, you can use remote execution to configure recurring backup tasks that {Project} will run on the hosts.
* To back up your hosts, you can use remote execution to configure recurring backup tasks that {Project} will run on the hosts.
For more information, see {ManagingHostsDocURL}Configuring_and_Setting_Up_Remote_Jobs_managing-hosts[Configuring and setting up remote jobs] in _{ManagingHostsDocTitle}_.

ifndef::satellite[]
To create snapshots of hosts, you can use the Snapshot Management plugin.
* To create snapshots of hosts, you can use the Snapshot Management plugin.
For more information, see {ManagingHostsDocURL}Creating_Snapshots_of_a_Host_managing-hosts[Creating snapshots of a host] in _{ManagingHostsDocTitle}_.
endif::[]
2 changes: 2 additions & 0 deletions guides/doc-Planning_for_Project/master.adoc
@@ -39,6 +39,8 @@ include::common/assembly_deployment-path.adoc[leveloffset=+1]

include::common/assembly_common-deployment-scenarios.adoc[leveloffset=+1]

include::common/assembly_preparing-for-disaster-recovery.adoc[leveloffset=+1]

include::topics/Provisioning_Concepts.adoc[]

include::topics/Required_Technical_Users.adoc[]