Dynamic Pools for Volumes with Replication Factor 1
This feature is under Directed Availability. Please engage with your Portworx representative if you are interested and need to enable it in your environment under the current guidelines.
Portworx’s Kube Datastore (KDS) is a distributed datastore that provides highly available, resilient, and efficient storage for shared storage backends (SAN), cloud drives, and local block storage.
Each storage pool is created from a set of homogeneous storage devices available within a node, and the collection of these homogeneous storage pools across a cluster together forms a Kube Datastore. For more information, see Kube Datastore.
Static storage pools: Pools created from direct attached storage in a shared-nothing architecture. They cannot be detached and reattached to another node after creation. High availability for volumes in static pools is achieved through replication (repl=2 or repl=3). If a node fails, another replica serves the volume based on the configured replication factor.
Dynamic storage pools: Pools created from FlashArray Cloud Drives (FACD). These pools can be detached from one node and reattached to another. During relocation, all volumes in the pool move together instead of relying on intra-cluster replication. This enables efficient failover for volumes by leveraging the resiliency of the shared storage backend. Portworx Enterprise supports dynamic pools for volumes with repl=1 in environments using FACD and PX-StoreV2. Static pools can continue to host volumes with higher replication factors (repl>1).
KDS dynamic storage pools are designed for shared storage backend (SAN) environments. KDS dynamic storage pools leverage the capabilities of the underlying FlashArray volumes to deliver superior data reduction, data protection, and resiliency, including:
- Erasure coding for fault tolerance
- Data reduction for storage efficiency
- Efficient network utilization
- Elimination of network I/O amplification caused by replication
- No east-west traffic for rebuild operations
Storage Configuration for Dynamic and Static Pools
In environments that use both dynamic and static pools, clusters may contain a mix of repl=1 and repl=2 (or higher) volumes. Mixed pool configurations are supported but require explicit planning to ensure correct workload placement and failover behavior.
Dynamic pools support only repl=1 volumes and enable pool-level failover. Volumes with higher replication factors (repl=2 or repl=3) use replication-based high availability and do not participate in dynamic pool failover.
When you configure storage, follow these guidelines:
- Use dynamic pools for workloads that require fast failover using shared storage backends.
- Use static pools for workloads that require replication-based availability (`repl=2` or higher).
- Configure storage classes with the appropriate replication factor. For example, `repl=1` for dynamic pools.
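The storage class guideline above can be sketched as a Kubernetes StorageClass. This is a minimal illustration: the class name is hypothetical, and `pxd.portworx.com` is the Portworx CSI provisioner commonly used in Portworx storage classes.

```yaml
# Hypothetical StorageClass for repl=1 volumes hosted on dynamic pools.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-dynamic-pool-repl1   # illustrative name
provisioner: pxd.portworx.com   # Portworx CSI provisioner
parameters:
  repl: "1"                     # dynamic pools support only repl=1 volumes
```

A separate storage class with `repl: "2"` or `repl: "3"` would direct workloads to replication-based availability on static pools.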
If you are migrating from a replication-based setup to dynamic pools:
- Convert existing volumes to `repl=1` using HA reduction. For more information, see Decreasing the replication factor.
- Pool failover does not occur if any volume in the pool has a replication factor greater than 1.
In mixed environments:
- Pools with `repl=1` volumes can fail over across nodes.
- Pools with `repl=2` or higher volumes use replication-based recovery.
- Failover eligibility is determined at the pool level based on the volumes present.
Dynamic pools failover and node selection
Dynamic pools with repl=1 volumes can fail over to any node in the cluster. The only exception is nodes explicitly marked as storageless, which are not eligible to host storage in a disaggregated cluster.
During failover, Portworx Enterprise automatically migrates the affected pool to a healthy node without restarting the Portworx service and without interrupting application I/O. This applies to both planned maintenance events and unplanned node failures. As a result, volumes remain available and workloads continue running with minimal disruption.
A coordinator node manages the failover process. It monitors node health and orchestrates pool migration by selecting a suitable destination node based on the volume placement strategy, failure domain, attached volume replica count, and overall node health. If the coordinator node fails shortly after a storage node failure, failover timing may be further delayed. For more information on how a node is selected for migration, see Node selection during pool migration.
Pool failover operations are retried automatically up to three times if they fail. If all retries fail, manual intervention may be required. After the underlying issue is resolved, triggering failover again typically restores normal operation.
If multiple pools fail over at the same time, each pool migrates independently. Portworx Enterprise manages pool migration using the node selection criteria described in Node selection during pool migration. Based on these criteria, pools may migrate to the same node or to different nodes.
After a pool fails over to a new node, it does not automatically fail back to the original node. Portworx Enterprise rebalances and redistributes the pool based on cluster imbalance conditions. For more information, see Dynamic pool rebalance.
- During VM deployments, Portworx Enterprise attempts to co-locate cloned PVCs (for example, those created from golden images) and data drives on the same storage pool. To achieve this placement, Portworx Enterprise temporarily performs a high-availability (HA) update that increases the replica count and then reduces it after the placement completes. This operation typically occurs shortly after VM creation. Pool failover is temporarily blocked during the brief window when these HA operations are in progress.
- When both pool resize and pool failover operations are triggered, the operation that starts first takes precedence. The other operation remains blocked until the first operation completes.
Dynamic pools failover scenarios
Portworx Enterprise supports automatic pool migration in several scenarios:
- Unexpected node failure: When a node fails unexpectedly, Portworx Enterprise detects the failure and automatically migrates the affected pools to healthy nodes. Applications using `repl=1` volumes continue running without interruption.

  During unexpected outages, failover timing depends on how quickly the FlashArray storage backend detects that the connection to the failed node is lost. If volumes are remotely attached to virtual machines, a delayed failover can cause disk I/O timeout events. Detection times typically depend on the storage protocol and can take up to 1 minute for iSCSI connections and up to 2 minutes for Fibre Channel or NVMe-TCP connections.

  Recommendation: If volumes are remotely attached to virtual machines, delayed failover can cause a Blue Screen of Death (BSOD) in KubeVirt Windows virtual machines. Portworx by Everpure recommends increasing the disk I/O timeout to 300 seconds when using dynamic pools with KubeVirt Windows virtual machines.
- Planned maintenance or upgrades: During planned node maintenance or upgrades, Portworx Enterprise receives a notification from the orchestrator or administrator about the node-related activity. It pre-migrates pools from the affected node, allowing upgrades to proceed without impacting application availability. Planned operations such as OpenShift upgrades or CLI-triggered pool migrations typically complete without downtime.
- Manual pool migration: You can also manually migrate a pool to another node for maintenance or optimization. For more information, see Manual migration of dynamic pools.
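For the KubeVirt Windows recommendation above, one common way to raise the disk I/O timeout inside a Windows guest is the Disk service's `TimeoutValue` registry setting. This is a sketch, not a Portworx-documented procedure: `0x12c` is 300 seconds in hexadecimal, and the guest typically needs a reboot for the change to take effect.

```
Windows Registry Editor Version 5.00

; Raise the guest disk I/O timeout to 300 seconds (0x12c)
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk]
"TimeoutValue"=dword:0000012c
```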
Node selection during pool migration
During pool migration, Portworx Enterprise selects a target node using a filtering and scoring process. This ensures that pools move only to suitable nodes and that workload impact is minimized.
Node filtering
Portworx Enterprise first filters out nodes that do not meet the required criteria. A node is not considered for migration if it:
- Is in a different zone than the source node
- Is not in an UP state
- Is cordoned
- Is undergoing autoscaling
- Is in maintenance mode
- Does not have a metadata drive
- Has reached maximum pool capacity
- Has offline storage pools
Only nodes that pass all the above filtering checks are considered for migration.
Node scoring
After filtering, Portworx Enterprise evaluates the remaining nodes and assigns a score based on the number of attached volume replicas. Nodes with fewer attached volume replicas receive a higher score. This helps distribute load evenly across the cluster and avoids overloading individual nodes. Portworx Enterprise selects the node with the highest score as the migration target.
Node selection is based primarily on attached volume replica counts. This approach may result in uneven pool distribution across nodes, as pool count per node is not considered during placement.
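The scoring step above can be illustrated with a toy sketch. The node names and replica counts are hypothetical; fewer attached replicas means a higher score, so the selection reduces to picking the candidate with the minimum count.

```shell
# Candidates that passed filtering, as "node:attached-replica-count" pairs
# (hypothetical values for illustration).
candidates="node-a:12 node-b:7 node-c:9"

# Fewer attached replicas -> higher score, so sort numerically on the count
# and take the first entry. $candidates is intentionally unquoted so each
# pair becomes its own line.
best=$(printf '%s\n' $candidates | sort -t: -k2 -n | head -n1 | cut -d: -f1)
echo "$best"
```

With these inputs, `node-b` wins because it has the fewest attached replicas (7).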
Dynamic pool rebalance
Dynamic pool rebalance automatically maintains a balanced distribution of eligible storage pools across nodes in a Portworx cluster. Over time, pool distribution can become uneven due to node failures, maintenance-driven migrations, or the addition of new nodes. Rebalance periodically evaluates the cluster and redistributes eligible pools to reduce imbalance and minimize workload impact during future failures.
Dynamic pool rebalance runs automatically in the background and evaluates pool distribution across nodes at regular intervals. It detects imbalance based on repl=1 workload distribution and selects eligible pools for migration. Pool movement uses the same mechanism as dynamic pool failover and maintains the same availability guarantees. Portworx Enterprise adjusts how frequently rebalance runs based on the severity of imbalance.
- After a failover event, rebalance operations do not run immediately. A cooldown period of approximately 15 minutes is enforced before rebalance resumes.
- During an active rebalance operation, additional pool failover attempts from the same node are temporarily blocked until the rebalance completes.
- Rebalance does not perform automatic failback. Instead, it redistributes pools based on imbalance conditions.
- Rebalance triggers when a node exceeds 50% above the average replica count.
Cluster imbalance is determined by uneven distribution of repl=1 workloads across nodes. Based on this distribution, Portworx Enterprise classifies cluster state into levels such as balanced, low, moderate, or severe imbalance and adjusts rebalance frequency accordingly.
Rebalance is triggered when a node exceeds the configured imbalance threshold based on the average number of attached volume replicas in the cluster. By default, this occurs when a node has more than 50% above the average attached volume replica count. Rebalance runs at an adaptive interval that depends on the degree of imbalance. Higher imbalance results in more frequent rebalance operations, while lower imbalance reduces the frequency. You can configure the imbalance threshold using runtime options. For more information, see Dynamic pool failover and rebalance runtime options.
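The default threshold works out as follows, using hypothetical numbers: if the cluster-wide average is 10 attached volume replicas per node and the tolerance is the default 50%, a node triggers rebalance once it exceeds 15 attached replicas.

```shell
# Hypothetical cluster-wide average of attached volume replicas per node.
avg_replicas=10
# Default dynamic-pool-rebalance-tolerance-percent value.
tolerance_percent=50
# A node triggers rebalance when its attached replica count exceeds the
# average plus the tolerance margin.
trigger=$((avg_replicas + avg_replicas * tolerance_percent / 100))
echo "rebalance triggers above $trigger attached replicas"
```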
A pool is eligible for rebalance only if it meets all of the following conditions:
- It has active (attached) volumes
- It is not currently undergoing failover
- Moving the pool does not violate placement constraints
- The source node retains at least one pool after migration
Dynamic pools containing unsupported volume types are not eligible for rebalance.
Prerequisites
Ensure that your cluster meets the following prerequisites before you enable dynamic pools:
- Portworx Enterprise version 3.6.0 or later is installed.
- Portworx Operator version 26.1.0 or later is installed.
- OpenShift version 4.18 or later is running.
- The cluster is deployed on FlashArray Cloud Drives (FACD).
- The cluster is deployed on PX-StoreV2.
Limitations
Before enabling dynamic pools, consider the following limitations:
- Supported only on Pure FlashArray storage platform.
- Supported only on OpenShift distribution.
- Does not support Auto Journal Device.
- Does not support Smart upgrade. Portworx Enterprise limits the number of nodes and pools that can participate in a failover at a time, and performing a smart upgrade can exceed these limits, delaying failover operations.
- Does not support multi-drive pools. Dynamic pools require exactly one FACD drive per pool. Clusters that already contain multi-drive pools cannot be upgraded to a dynamic-pool-enabled configuration.
- Pool failover does not occur if the pool contains any of the following volume types:
- SharedV4 volumes
- Volumes with Journal profiles
- Volumes with replication factor greater than 1 (`repl=2` or higher)
- PX-Fast volumes
- Pool failover is temporarily blocked on both source and target nodes during pool expansion operations. Failover resumes automatically after the expansion completes.
- Pool failover fails if volume attach operations are in progress for volumes on the same pool.
- Automatic VM re-alignment does not occur after pool migration. Virtual machines are not rebalanced to align with data locality.
- Dynamic pools are validated for up to:
- 256 KDS volumes per node
- 1 dynamic pool per node
- 1000 KDS volumes per cluster

Exceeding these limits can increase failover times and cause I/O pauses in virtual machines. These pauses may appear as Event ID 129 errors or Blue Screen of Death (BSOD) events on Windows VMs, or cause applications to become unresponsive.
Enable Dynamic Pools
You can enable dynamic pools using the StorageCluster custom resource.
1. Generate the Portworx specification from Portworx Central. Ensure that you select Pure FlashArray as the platform, PX-StoreV2 as the storage provider, and OpenShift 4+ as the distribution. For more information, see Installation of Portworx with FlashArray using Portworx Central.

2. In the `spec` section of the `StorageCluster`, configure dynamic pools and the storage provider:

   ```yaml
   spec:
     kubeDatastore:
       enableDynamicPools: true
     cloudStorage:
       provider: pure
   ```

3. Verify the configuration to confirm that dynamic pools are enabled:

   ```shell
   kubectl get stc -n portworx <pxcluster> -o jsonpath='{.spec.kubeDatastore}'
   ```

   You should see output similar to:

   ```
   {"enableDynamicPools":true}
   ```

4. Check the status of the `StorageCluster`:

   ```shell
   kubectl get stc px-cluster -n portworx
   ```

   The `STATUS` field should display `Updating` or `Running`.

   Important: If the `StorageCluster` status displays `Degraded`, you must check for errors and fix any issues before retrying to enable dynamic pools. To do this, perform the following steps:

   1. Check the `StorageCluster` to identify the issue:

      ```shell
      kubectl describe stc px-cluster -n portworx
      ```

      Review the events and error messages to determine the cause.

   2. Resolve the underlying issue identified in the previous step.

   3. Reapply the configuration change if necessary:

      ```shell
      kubectl apply -f <updated-storagecluster.yaml>
      ```

   4. Verify that the `StorageCluster` status transitions to `Updating` and then to `Running`:

      ```shell
      kubectl get stc px-cluster -n portworx
      ```

Dynamic pools are successfully enabled when the `enableDynamicPools` field is set to `true` and the `StorageCluster` status is `Running`.
When you enable dynamic pools, Portworx Enterprise automatically adds a metadata drive to all non-storageless nodes. This drive supports dynamic pool operations; it remains fixed to the node and does not move during pool relocation.
If you disable dynamic pools, these metadata drives remain on the nodes. You must manually decommission these metadata-only nodes to remove them.
Dynamic pool failover and rebalance runtime options
The following table lists the runtime options that you can use to configure dynamic pool failover behavior and the imbalance threshold at runtime.
Do not modify the runtime options unless necessary. Contact Portworx Support before changing the runtime options.
| Parameter | Description | Default value |
|---|---|---|
| `pause-pool-failovers` | Pauses dynamic pool failover. When enabled, automatic movement of storage pools between nodes is disabled. | `false` |
| `pause-dynamic-pool-rebalance` | Pauses dynamic pool rebalance. When enabled, automatic redistribution of pools is disabled. Applies only when dynamic pools are enabled. | `false` |
| `failover-cleanup-retention-hours` | Specifies how long failover-related metadata is retained before cleanup. | 12h |
| `max-active-failover-nodes` | Specifies the maximum number of nodes that can participate in failover at a time. | 5 |
| `max-active-failover-pools` | Specifies the maximum number of pools that can participate in failover at a time. | 8 |
| `dynamic-pool-rebalance-min-interval-secs` | Specifies the minimum interval between rebalance cycles during severe imbalance conditions. | 900 |
| `dynamic-pool-rebalance-default-interval-secs` | Specifies the default interval used during moderate imbalance conditions. | 1800 |
| `dynamic-pool-rebalance-max-interval-secs` | Specifies the maximum interval between rebalance cycles during low or balanced conditions. | 3600 |
| `dynamic-pool-rebalance-tolerance-percent` | Specifies the threshold that defines when the cluster is considered balanced. | 50 |
| `dynamic-pool-rebalance-moderate-imbalance-percent` | Specifies the threshold that defines moderate imbalance in the cluster. | 100 |
| `dynamic-pool-rebalance-severe-imbalance-percent` | Specifies the threshold that defines severe imbalance in the cluster. | 200 |
| `dynamic-pool-rebalance-max-moves-per-cycle` | Specifies the maximum number of pool movements allowed in a single rebalance cycle. | 1 (max 5) |
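As a minimal sketch, runtime options like these are typically applied through the `StorageCluster` spec. This assumes the `spec.runtimeOptions` map field of the Portworx Operator's `StorageCluster` CRD; the specific values shown are illustrative, and per the note above you should involve Portworx Support before changing them.

```yaml
# Sketch: pausing pool failover and widening the balance tolerance.
# Assumes the StorageCluster spec.runtimeOptions map; values are strings.
spec:
  runtimeOptions:
    pause-pool-failovers: "true"                      # temporarily disable automatic pool movement
    dynamic-pool-rebalance-tolerance-percent: "75"    # consider the cluster balanced up to 75% over average
```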
Manual migration of dynamic pools
During planned node upgrades, maintenance, or optimization, you can manually migrate all pools on a node, or select specific pools from that node to migrate individually.
- To manually migrate all the pools on a node to another node:

  ```shell
  pxctl service pool migrate --node <node-id>
  ```

  Replace `<node-id>` with the ID of the node that hosts the pools to migrate.

  Portworx automatically selects the destination node and relocates the pools without disrupting active workloads.

- To migrate specific pools from a node to another node:

  ```shell
  pxctl service pool migrate --node <node-id> [--pools <uuid1,uuid2>]
  ```

  Replace:

  - `<node-id>` with the ID of the node that hosts the pools to migrate.
  - `<uuid1,uuid2>` with the UUIDs of the pools to migrate. You can specify one or more pool UUIDs separated by commas.

  Portworx automatically selects the destination node and relocates the pools without disrupting active workloads.

- To view all pool migration plans and intents in the cluster:

  ```shell
  pxctl sv pool migrate list
  ```

  - To view only the pool migration plans in the cluster:

    ```shell
    pxctl sv pool migrate list --plans
    ```

  - To view only the pool migration intents in the cluster:

    ```shell
    pxctl sv pool migrate list --intents
    ```

  - To view only active migration intents in the cluster:

    ```shell
    pxctl sv pool migrate list --active-only
    ```

- To check the status of ongoing or completed pool migration operations:

  ```shell
  pxctl sv pool migrate status [--source-node-id <node-id>] [--pool <pool-uuid>] [--plan-id <plan-id>] [--intent-id <intent-id>]
  ```

  Replace:

  - `<node-id>` with the ID of the node that hosts the migrated pools.
  - `<pool-uuid>` with the UUID of the migrated pool.
  - `<plan-id>` with the ID of the migration plan.
  - `<intent-id>` with the ID of the migration intent.