Set up a witness node for synchronous DR on airgapped bare metal
Summary and Key concepts
Summary
This article describes the configuration and deployment of a witness node within a Portworx cluster for maintaining quorum in synchronous disaster recovery (DR) setups. The Portworx quorum requires a majority of active storage nodes to remain operational. In scenarios where the quorum is at risk due to network issues or node failures, a witness node acts as a quorum tie-breaker, ensuring continued operation even if one data center is unavailable. The article provides step-by-step instructions for setting up this witness node on a storageless Portworx node, specifying prerequisites such as a DR license and Docker installation.
Kubernetes Concepts
- Pods: Used here for running Portworx services across nodes, enabling distributed storage and quorum management within Kubernetes clusters.
Portworx Concepts
- Witness Node: A dedicated, storageless Portworx node deployed in a third data center. It serves as a quorum tie-breaker to prevent cluster downtime during network partitions or data center failures.
- Synchronous Disaster Recovery (DR): A Portworx DR setup where two Kubernetes clusters are located in the same Metro Area Network. Both clusters participate in quorum management.
- pxctl: Portworx command-line tool used for monitoring and managing the cluster, including checking witness node status and validating license information.
- DR License: A required license to activate DR features within Portworx, necessary for deploying and operating a witness node in the cluster.
In a Portworx cluster, quorum refers to the minimum number of active storage nodes necessary to maintain cluster operation. If at least half of the nodes are offline, the cluster loses quorum: all operations stop and Portworx does not process any I/O.
In a Synchronous DR setup, a single Portworx cluster spans two Kubernetes clusters located in the same Metro Area Network. The same quorum principles apply here, with all storage nodes from both the source and destination data centers contributing to quorum. If a disaster takes one data center offline, the remaining nodes may no longer form a majority, and Portworx loses quorum.
To solve the quorum issue, you can deploy a witness node that acts as the quorum tie-breaker when there is a network partition or when a data center goes offline. The witness node is a single virtual machine running a special storageless Portworx node that participates in quorum but does not store any data. It is typically located in a third data center so that it stays reachable when either of the other two sites fails.
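The tie-breaker effect can be illustrated with a little arithmetic. The sketch below is a simplified model, not Portworx's actual implementation: it assumes every quorum member counts as one vote and that quorum holds only while a strict majority of members is online. The function names and the 3-nodes-per-site layout are hypothetical.

```shell
#!/bin/sh
# Simplified model (an assumption, not Portworx internals):
# quorum holds only while a strict majority of members is online.
has_quorum() {
  total=$1
  online=$2
  # Strict majority: twice the online count must exceed the total.
  [ $((online * 2)) -gt "$total" ]
}

# Print a one-line verdict for a given cluster size and online count.
report() {
  if has_quorum "$1" "$2"; then
    echo "$2 of $1 online: quorum held"
  else
    echo "$2 of $1 online: quorum lost"
  fi
}

# Hypothetical layout: 3 storage nodes in each of two data centers.
report 6 3   # one data center down, no witness
report 7 4   # same failure, with a witness node in a third site
```

In this model, losing one site of an evenly split cluster leaves exactly half of the members, which is not a majority. Adding the witness as a seventh member lets the surviving site reach 4 of 7.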
Prerequisites
- Portworx Enterprise DR license is activated.
- The witness node VM has at least 4 CPU cores (8 recommended) and at least 4 GB of memory (8 GB recommended).
- The witness node needs to be a storageless node.
- Docker engine is installed.
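The hardware and Docker prerequisites can be checked with a short script before installation. The following is a minimal sketch for a Linux VM; the function names are hypothetical, it reads the "GB" figures as memory, and it does not check the DR license or whether the node is storageless.

```shell
#!/bin/sh
# Minimums taken from the prerequisites list; "GB" is assumed to mean memory.
MIN_CORES=4
MIN_MEM_GB=4

# preflight CORES MEM_GB HAS_DOCKER(yes|no)
# Prints one FAIL line per unmet requirement; returns 0 only if all pass.
preflight() {
  cores=$1
  mem_gb=$2
  has_docker=$3
  failed=0
  if [ "$cores" -lt "$MIN_CORES" ]; then
    echo "FAIL: $cores cores (minimum $MIN_CORES)"
    failed=1
  fi
  if [ "$mem_gb" -lt "$MIN_MEM_GB" ]; then
    echo "FAIL: $mem_gb GB memory (minimum $MIN_MEM_GB)"
    failed=1
  fi
  if [ "$has_docker" != "yes" ]; then
    echo "FAIL: docker command not found"
    failed=1
  fi
  if [ "$failed" -eq 0 ]; then
    echo "OK: host meets the minimum requirements"
  fi
  return "$failed"
}

# Collect values from the local Linux host and run the checks.
preflight_local() {
  local_cores=$(nproc)
  mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  # Round up to whole GiB: the kernel reports slightly less than installed RAM.
  local_mem_gb=$(( (mem_kb + 1048575) / 1048576 ))
  if command -v docker >/dev/null 2>&1; then
    docker_found=yes
  else
    docker_found=no
  fi
  preflight "$local_cores" "$local_mem_gb" "$docker_found"
}

preflight 8 8 yes   # a hypothetical VM at the recommended size
```

On the designated VM, call `preflight_local` instead of passing values by hand.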
Set up a witness node
Perform the following steps to set up a witness node:
- Check your Portworx Enterprise version by running the following command on your source and destination clusters (both should have the same version):
  kubectl get pods -A -o jsonpath="{.items[*].spec.containers[*].image}" | xargs -n1 | sort -u | grep oci-monitor
- Download the witness-install.sh script file on a designated VM.
- Install the witness node as a single storageless Portworx node on the designated VM. Specify the same Portworx Enterprise version that you retrieved in the first step, along with the external etcd endpoints, as shown in the following example:
  sh witness-install.sh --cluster-id=px-cluster \
    --etcd="etcd:http://<your-etcd-endpoint1>:2379,etcd:http://<your-etcd-endpoint2>:2379,etcd:http://<your-etcd-endpoint3>:2379" \
    --docker-image=portworx/px-enterprise:<your-px-version>
- Verify Portworx status on the witness node:
  pxctl status
  The witness-install.sh script can take a couple of minutes to complete. The following is example output:
  Status: PX is operational
  Telemetry: Disabled or Unhealthy
  Metering: Disabled or Unhealthy
  License: Trial (expires in 30 days)
  ..
  ....
  When the output shows PX is operational and the script has completed successfully, you can exit the script by pressing Ctrl+C. Note that the witness node requires a valid Portworx license. To check the status of your license, use the pxctl license list command.
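The steps above can be tied together with a few small shell helpers. This is a sketch under stated assumptions, not part of the Portworx tooling: all function names are hypothetical, the image tag of the oci-monitor image is assumed to be the Portworx Enterprise version, and the version 3.1.0 and the etcd host names are placeholders. The helper only prints the install command so that you can review it before running it. It uses only the flags, the cluster ID, the http scheme, and port 2379 from the example above.

```shell
#!/bin/sh
# Take the tag after the last colon of an image reference
# (assumes a tagged image, not a digest reference).
px_version_from_image() {
  echo "${1##*:}"
}

# Build the --etcd value from a list of endpoint hosts,
# using the http scheme and port 2379 from the example above.
etcd_arg() {
  out=""
  for host in "$@"; do
    out="${out:+$out,}etcd:http://$host:2379"
  done
  echo "$out"
}

# Print (do not run) the install command for review.
# Usage: print_install_cmd OCI_MONITOR_IMAGE ETCD_HOST...
print_install_cmd() {
  image=$1
  shift
  version=$(px_version_from_image "$image")
  printf 'sh witness-install.sh --cluster-id=px-cluster --etcd="%s" --docker-image=portworx/px-enterprise:%s\n' \
    "$(etcd_arg "$@")" "$version"
}

# Succeeds when status text on stdin reports an operational node.
is_operational() {
  grep -q "PX is operational"
}

# Placeholder image and hosts, for illustration only.
print_install_cmd "portworx/oci-monitor:3.1.0" \
  etcd1.example.com etcd2.example.com etcd3.example.com
```

After installation, `pxctl status | is_operational && echo "witness node is up"` gives a scriptable version of the manual status check.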
Related topics
- To upgrade the witness node, see Upgrade the Portworx OCI bundle.
- To uninstall the witness node, see Uninstall the Portworx OCI bundle.