Version: 3.2

Failback an application within synchronous DR in airgapped bare metal

Failback is the process of moving the application and its data back to the source cluster once the source cluster is restored and operational again.

When your unhealthy Kubernetes cluster is back up and running, the Portworx nodes in that cluster do not immediately rejoin the Portworx cluster. They stay in the Out of Quorum state until you explicitly activate the cluster domain.

After the domain is marked as Active, you can fail back your applications.

The examples on this page make the following assumptions. Update the values to match your environment:

  • Source Cluster is the Kubernetes cluster which is down and where your applications were originally running. The cluster domain for this source cluster is us-east-1a.
  • Destination Cluster is the Kubernetes cluster where the applications will be failed over. The cluster domain for this destination cluster is us-east-1b.
  • The Zookeeper application is being failed over to the destination cluster.

Prerequisite

You must ensure that Stork version 24.2.0 or newer is installed on both the source and destination clusters.

note

If you are using a Stork version prior to 24.2.0, then you can follow this procedure to perform a failback.

Create a reverse ClusterPair

note

Skip this section if you have created a bidirectional ClusterPair, and move to the next section.

You need to create a reverse ClusterPair if you initially paired your clusters in a unidirectional manner (from source to destination) and now need a pairing from the destination cluster back to the source cluster. The reverse ClusterPair enables communication from the destination cluster to the source cluster, which is required for failback.

Run the following command from your destination cluster to create a reverse ClusterPair:

storkctl create clusterpair reverse-migration-cluster-pair \
--namespace <migrationnamespace> \
--src-kube-file <destination-kubeconfig-file> \
--dest-kube-file <source-kubeconfig-file> \
--mode sync-dr \
--unidirectional
important

Make sure that you pass the destination cluster's kubeconfig file to --src-kube-file and the source cluster's kubeconfig file to --dest-kube-file, as shown in the command above.
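
Because the two kubeconfig files are easy to swap by mistake, a small wrapper can make the direction explicit. The following is a minimal sketch, not part of storkctl: the function name reverse_clusterpair_cmd is hypothetical, it assumes file paths without spaces, and it only prints the command so that you can review it before running it yourself.

```shell
# Hypothetical helper: prints (does not run) the reverse ClusterPair command.
# Arguments: <migration-namespace> <source-kubeconfig> <destination-kubeconfig>
reverse_clusterpair_cmd() {
  if [ "$#" -ne 3 ]; then
    echo "usage: reverse_clusterpair_cmd <namespace> <source-kubeconfig> <destination-kubeconfig>" >&2
    return 1
  fi
  # For the reverse pair, the destination cluster's kubeconfig goes to
  # --src-kube-file and the source cluster's kubeconfig goes to --dest-kube-file.
  printf 'storkctl create clusterpair reverse-migration-cluster-pair --namespace %s --src-kube-file %s --dest-kube-file %s --mode sync-dr --unidirectional\n' "$1" "$3" "$2"
}

# Example (hypothetical file names):
#   reverse_clusterpair_cmd zookeeper ./source.kubeconfig ./destination.kubeconfig
```

Printing instead of executing keeps the sketch safe to try on any machine; once the printed command looks right, run it from the destination cluster.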

Reactivate your source cluster domain

Once your source cluster is operational, perform the following steps from your destination cluster to activate your source cluster domain:

  1. Run the following command to activate the source cluster domain:

    storkctl activate clusterdomain us-east-1a
    Cluster Domain activate operation started successfully for us-east-1a
  2. Verify that the source cluster domain is activated:

    storkctl get clusterdomainsstatus
    NAME            LOCAL-DOMAIN   ACTIVE                                     INACTIVE   CREATED
    px-dr-cluster   us-east-1a     us-east-1a (InSync), us-east-1b (InSync)              29 Nov 22 22:09 UTC
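
If you want to script the verification step, the status output can be checked for the domain name followed by (InSync). This is a minimal sketch that assumes the output format shown above; the function name domain_in_sync is hypothetical, and the check is a plain text match rather than a structured query.

```shell
# Hypothetical helper: reads `storkctl get clusterdomainsstatus` output on stdin
# and succeeds only if the given cluster domain is listed as "<domain> (InSync)".
domain_in_sync() {
  # The leading space avoids matching a longer domain name that merely ends
  # with the requested one.
  grep -Fq " $1 (InSync)"
}

# Example:
#   storkctl get clusterdomainsstatus | domain_in_sync us-east-1a \
#     && echo "us-east-1a is active and in sync"
```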

Reverse sync your clusters

If the destination cluster has been running applications for some time, it is possible that the state of your application on the destination cluster differs from your source cluster. This is due to the creation of new resources or changes in data within stateful applications on the destination cluster.

It is recommended to perform one migration from destination cluster to your source cluster before failing back your applications, so that you have the most up-to-date applications on your original source cluster.

Because both of your clusters are now accessible, use the following steps to configure a reverse migration schedule:

  1. Create a schedule policy on your destination cluster using the instructions in the Create a schedule policy section.

  2. Create a migration schedule on your destination cluster using the storkctl create migrationschedule command. For more information on how to use the command, see Create MigrationSchedule with storkctl.

Perform failback

You can perform a failback using the storkctl perform failback -m <reverse-migration-schedule> -n <reverse-migration-schedule-namespace> command.

You can also use one of the following flags to include or exclude a specific subset of namespaces in the migration. The two flags cannot be used together.

  • --include-namespaces - Includes a subset of namespaces for the migration.
  • --exclude-namespaces - Excludes a subset of namespaces for the migration.
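
When the failback command is assembled by a script, a quick guard can catch the case where both flags are passed. The sketch below is illustrative only: check_namespace_flags is a hypothetical name, and the function inspects its arguments without calling storkctl or interpreting the flag values.

```shell
# Hypothetical guard: fails if both --include-namespaces and
# --exclude-namespaces appear in the argument list.
check_namespace_flags() {
  cnf_include=0
  cnf_exclude=0
  for cnf_arg in "$@"; do
    case "$cnf_arg" in
      --include-namespaces*) cnf_include=1 ;;
      --exclude-namespaces*) cnf_exclude=1 ;;
    esac
  done
  if [ "$cnf_include" -eq 1 ] && [ "$cnf_exclude" -eq 1 ]; then
    echo "error: use --include-namespaces or --exclude-namespaces, not both" >&2
    return 1
  fi
}

# Example, inside a wrapper script that forwards its arguments:
#   check_namespace_flags "$@" && storkctl perform failback "$@"
```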

To start the failback operation, run the following command in the destination cluster:

storkctl perform failback -m <reverse-migration-schedule> -n <reverse-migration-schedule-namespace>

Example:

storkctl perform failback -m reverse-migration-schedule -n zookeeper
Started failback for MigrationSchedule zookeeper/reverse-migration-schedule
To check failback status use the command : `storkctl get failback failback-reverse-migration-schedule-2024-05-21-115006 -n zookeeper`
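
The failback action name appears only inside the suggested status command in this output. If you need it in a script, one option is to extract it with sed. This is a sketch under the assumption that the message keeps the wording shown above; failback_action_name is a hypothetical helper name.

```shell
# Hypothetical helper: reads `storkctl perform failback` output on stdin and
# prints the action name found in the suggested "storkctl get failback" command.
failback_action_name() {
  sed -n 's/.*storkctl get failback \([^ ]*\) .*/\1/p' | head -n 1
}

# Example:
#   action=$(storkctl perform failback -m reverse-migration-schedule -n zookeeper \
#     | failback_action_name)
```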

Check failback status

Run the following command to check the status of the failback operation. You can get the failback-action-name from the output of the storkctl perform failback command.

storkctl get failback <failback-action-name> -n <reverse-migration-schedule-namespace>

Example:

storkctl get failback failback-reverse-migration-schedule-2024-05-21-115006 -n zookeeper
NAME                                                    CREATED               STAGE       STATUS       MORE INFO
failback-reverse-migration-schedule-2024-05-21-115006   21 May 24 11:50 UTC   Completed   Successful   Scaled up Apps in : 1/1 namespaces

If the status is failed, you can use the kubectl describe actions <failback-action-name> -n <reverse-migration-schedule-namespace> command to get more information about the failure.
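
For scripted checks, the STATUS column can be read from the table output. The following sketch is an assumption-heavy illustration, not a supported interface: it assumes the column layout in the example above (NAME, a five-token CREATED timestamp, then single-word STAGE and STATUS values), and failback_status is a hypothetical name. If the layout differs in your Stork version, adjust the field number.

```shell
# Hypothetical helper: reads `storkctl get failback` output on stdin and prints
# the STATUS value for the named action. Exits non-zero if the action is absent.
# Fields: $1 = NAME, $2-$6 = CREATED, $7 = STAGE, $8 = STATUS (assumed layout).
failback_status() {
  awk -v name="$1" '$1 == name { print $8; found = 1 } END { exit !found }'
}

# Example (replace <failback-action-name> with your action name):
#   storkctl get failback <failback-action-name> -n zookeeper \
#     | failback_status <failback-action-name>
```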

Verify volumes and Kubernetes resources are migrated

To verify the volumes and Kubernetes resources that are migrated to the source cluster, run the following command:

kubectl get all -n <reverse-migration-schedule-namespace>

Example:

kubectl get all -n zookeeper
NAME                      READY   STATUS    RESTARTS   AGE
pod/zk-544ffcc474-6gx64   1/1     Running   0          18h

NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/zk-service   ClusterIP   10.233.22.60   <none>        3306/TCP   18h

NAME                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/zk   1/1     1            1           18h

NAME                            DESIRED   CURRENT   READY   AGE
replicaset.apps/zk-544ffcc474   1         1         1       18h
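
To turn this visual check into a pass/fail result, the pod rows can be inspected with awk. This is a minimal sketch that assumes the `kubectl get all` table format shown above; pods_ready is a hypothetical name, and pods in any state other than Running (including Completed) count as not ready here.

```shell
# Hypothetical helper: reads `kubectl get all` output on stdin and succeeds only
# if at least one pod is listed and every pod is Running with all containers ready.
pods_ready() {
  awk '
    $1 ~ /^pod\// {
      seen = 1
      split($2, ready, "/")   # READY column, for example "1/1"
      if (ready[1] != ready[2] || $3 != "Running") bad = 1
    }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}

# Example:
#   kubectl get all -n zookeeper | pods_ready && echo "all pods are ready"
```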