In case of a disaster, where one of your Kubernetes clusters is down and inaccessible, you can fail over the applications running on it to an operational Kubernetes cluster. To achieve this, stop your application on the source cluster and start it on an active Kubernetes cluster.
The following assumptions are used in the examples on this page. Update them to the appropriate values for your environment:
- Source Cluster is the Kubernetes cluster which is down and where your applications were originally running. The cluster domain for this source cluster is
- Destination Cluster is the Kubernetes cluster where the applications will be failed over. The cluster domain for this destination cluster is
Deactivate the failed cluster domain
To fail over an application, you need to instruct Stork and Portworx that one of your Kubernetes clusters is down by marking the source cluster as inactive, if the cluster is accessible.
Run the following command to deactivate the source cluster. You need to run this command on the destination cluster where Portworx is still running:
storkctl deactivate clusterdomain us-east-1a
Cluster Domain deactivate operation started successfully for us-east-1a
Verify that your source cluster domain has been deactivated:
storkctl get clusterdomainsstatus
NAME LOCAL-DOMAIN ACTIVE INACTIVE CREATED
px-dr-cluster us-east-1a us-east-1b (SyncStatusUnknown) us-east-1a (SyncStatusUnknown) 29 Nov 22 22:09 UT
You can see that the cluster domain of your source cluster is listed under INACTIVE, indicating that your source cluster domain is deactivated.
Stop the application on the source cluster (if accessible or applicable)
If your source Kubernetes cluster is still alive and accessible, Portworx by Pure Storage recommends that you stop the applications before failing them over to the destination cluster.
You need to stop the applications by manually changing the replica count of your deployments and statefulsets to 0. In this way, your application resources will persist in Kubernetes, but the actual application will not be running.
kubectl scale --replicas 0 statefulset/<your-app-name> -n <migrationnamespace>
The above command scales the replica count of your application to 0 in the namespace specified with the -n flag.
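Because the step above applies to both deployments and statefulsets, it can help to list every workload once and generate one scale-down command for each. The following is a minimal sketch, not part of Stork or Portworx: `scale_down_cmds` is a hypothetical helper name, and it only prints the `kubectl scale` commands so that you can review them before running anything.

```shell
# Hypothetical helper (not part of Stork or kubectl): print, but do not run,
# the commands that scale each listed workload in a namespace to 0 replicas.
# Usage: scale_down_cmds <namespace> <kind/name>...
scale_down_cmds() {
  ns="$1"
  shift
  for workload in "$@"; do
    printf 'kubectl scale --replicas 0 %s -n %s\n' "$workload" "$ns"
  done
}

# Example: review the generated commands first, then pipe them to sh to run them.
# scale_down_cmds <migrationnamespace> deployment/<your-app-name> statefulset/<your-app-name>
# scale_down_cmds <migrationnamespace> deployment/<your-app-name> statefulset/<your-app-name> | sh
```

Printing the commands instead of executing them directly keeps the helper safe to run against the wrong context by mistake.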
Suspend the migrations on the source cluster (if accessible)
Skip this section if autoSuspend is set, because your migration schedules on the source cluster are then suspended automatically. In that case, proceed to the next section.
Once the replica count for your application's statefulsets is set to 0, suspend the migration schedule on the source cluster. This prevents the statefulsets on the destination cluster from being updated to 0 replicas. Run the following command to suspend the migration schedule:
storkctl suspend migrationschedule migrationschedule -n <migrationnamespace>
Verify that the schedule has been suspended:
storkctl get migrationschedule -n <migrationnamespace>
NAME POLICYNAME CLUSTERPAIR SUSPEND LAST-SUCCESS-TIME LAST-SUCCESS-DURATION
migrationschedule <your-schedule-policy> <your-clusterpair-name> true 01 Dec 22 23:31 UTC 10s
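If you want to script this verification, the SUSPEND column can be checked mechanically. The sketch below is an assumption-laden example: `is_suspended` is a hypothetical helper, and it assumes that SUSPEND is the fourth whitespace-separated column, as in the sample output above. If your storkctl version prints different columns, adjust the field number.

```shell
# Hypothetical helper: reads `storkctl get migrationschedule` output on stdin
# and succeeds only if the named schedule shows "true" in the SUSPEND column.
# Assumes SUSPEND is the 4th whitespace-separated column, as in the sample output.
# Usage: storkctl get migrationschedule -n <migrationnamespace> | is_suspended <schedule-name>
is_suspended() {
  awk -v name="$1" '
    NR > 1 && $1 == name { found = 1; ok = ($4 == "true") }
    END { if (found && ok) exit 0; exit 1 }
  '
}

# Example:
# storkctl get migrationschedule -n <migrationnamespace> | is_suspended migrationschedule \
#   && echo "schedule is suspended"
```

The helper also fails when the schedule name is not found, so a typo in the name is not mistaken for a suspended schedule.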
Start the application on the destination cluster
You can allow Stork to activate migration either on all namespaces or one namespace at a time. For performance reasons, if you have a high number of namespaces in your migration schedule, Portworx by Pure Storage recommends you migrate one namespace at a time.
Each application spec will have the annotation stork.openstorage.org/migrationReplicas, which indicates the replica count on the source cluster. Run the following command to update the replica count of your app in a single namespace to the same number as on your source cluster:
storkctl activate migration -n <migrationnamespace>
Run the following command to activate the migration in all namespaces:
storkctl activate migration --all-namespaces
Stork looks for that annotation and automatically scales each application to the replica count recorded in it. Once the replica count is updated, the application starts running and the failover is complete.
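When you follow the recommendation to activate one namespace at a time, a small loop avoids retyping the command. This is a sketch under stated assumptions: `activate_each` and the `STORKCTL` variable are hypothetical names introduced here, and the only storkctl call used is the `activate migration -n` command shown above.

```shell
# Hypothetical helper: activate migrations one namespace at a time and stop
# at the first failure. STORKCTL defaults to the real binary; set it to
# "echo storkctl" to preview the commands without running them.
# Usage: activate_each <namespace>...
activate_each() {
  for ns in "$@"; do
    ${STORKCTL:-storkctl} activate migration -n "$ns" || return 1
  done
}

# Example (dry run, prints the commands only):
# STORKCTL="echo storkctl" activate_each <namespace-1> <namespace-2>
```

Stopping at the first failure means you can check that namespace before continuing with the rest.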
Verify that your application is up and running:
kubectl get pods -n <migrationnamespace>
NAME READY STATUS RESTARTS AGE
zk-0 1/1 Running 0 3m18s
zk-1 1/1 Running 0 2m54s
zk-2 1/1 Running 0 99s
You can see that all application pods (for example, the Zookeeper pods) in the <migrationnamespace> namespace have the status Running, indicating that your application is operational.
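The same check can be automated by reading the STATUS column of the pod listing. The following sketch uses a hypothetical helper, `all_pods_running`, and assumes the default `kubectl get pods` column layout shown above, where STATUS is the third column. It treats any other status, including Completed, as not running, and it does not inspect the READY column.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and succeeds
# only if at least one pod is listed and every pod has STATUS "Running".
# Assumes STATUS is the 3rd whitespace-separated column (default kubectl layout).
# Usage: kubectl get pods -n <migrationnamespace> | all_pods_running
all_pods_running() {
  awk '
    NR > 1 { total++; if ($3 != "Running") bad++ }
    END { if (total > 0 && bad == 0) exit 0; exit 1 }
  '
}

# Example:
# kubectl get pods -n <migrationnamespace> | all_pods_running && echo "application is up"
```

Requiring at least one listed pod keeps an empty namespace from being reported as healthy.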