Portworx Enterprise Release Notes
3.3.1.1
July 15, 2025
To install or upgrade Portworx Enterprise to version 3.3.1.1, ensure that you are running one of the supported kernels and all system requirements are met.
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-44778 | The attachment count for Flash Array Direct Access RWX volumes (raw block) and Portworx RWX volumes (raw block) on a node might not be properly decremented when a volume is detached from a remote node during KubeVirt VM live migrations. User Impact: Nodes can incorrectly reach their attachment limit, blocking new volume attachments and leaving pods stuck in the "ContainerCreating" state. Resolution: Detachments on remote nodes are now properly accounted for, ensuring correct attachment counts and allowing new pods to schedule as expected. Components: Storage Affected Versions: 3.2.3 and later | Minor |
3.3.1
July 8, 2025
To install or upgrade Portworx Enterprise to version 3.3.1, ensure that you are running one of the supported kernels and all system requirements are met.
New Features
- TLS Encryption for Internal KVDB Communication
Portworx now supports enabling Transport Layer Security (TLS) for internal KVDB communication on the following platforms:
  - Amazon Elastic Kubernetes Service (EKS)
  - Azure Kubernetes Service (AKS)
  - Google Kubernetes Engine (GKE)
  - IBM Cloud Kubernetes Service (IKS)
  - VMware Tanzu Kubernetes Grid Integration (TKGI)
  - Rancher Kubernetes Engine 2 (RKE2)
  - Oracle Container Engine for Kubernetes (OKE)
  - Mirantis Kubernetes Engine (MKE)
  - Google Anthos
  - OpenShift Container Platform (OCP)
  - Kubernetes Operations (KOPS)
  - VMware Tanzu Kubernetes Grid Service (TKGS)
  - Red Hat OpenShift Service on AWS (ROSA)
  - Azure Red Hat OpenShift (ARO)

  This feature secures communication between internal KVDB and all Portworx nodes using TLS certificates managed by cert-manager. For more information, see Enable TLS for Internal KVDB.
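As an illustration only, the kind of cert-manager resources this feature relies on might look like the sketch below; the resource names, namespace, duration, and DNS names are placeholders and not the exact objects Portworx requires (see Enable TLS for Internal KVDB for the supported configuration).

kubectl apply -f - <<'EOF'
# Illustrative cert-manager resources; names, namespace, and DNS names are placeholders.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: px-kvdb-selfsigned      # placeholder name
  namespace: portworx           # assumed Portworx namespace
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: px-kvdb-cert            # placeholder name
  namespace: portworx
spec:
  secretName: px-kvdb-tls       # secret that will hold the issued key pair
  duration: 8760h               # example validity period
  dnsNames:
    - "*.portworx.svc.cluster.local"   # placeholder DNS name
  issuerRef:
    name: px-kvdb-selfsigned
    kind: Issuer
EOF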
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-45055 | During snapshot creation, a race condition between internal Portworx processes might result in duplicate snapshot entries for the same volume name. In such cases, an incorrect UUID from a duplicate or incomplete entry might be stored in the PersistentVolume (PV) label referencing the snapshot. This can lead the CSI driver to interact with an invalid snapshot object. User Impact: Snapshot creation or volume clone operations might intermittently fail. Mount operations for volumes derived from snapshots might fail because of incorrect UUID references, which can disrupt backup pipelines, volume provisioning, or automated workflows that rely on snapshot-based clones. Resolution: Portworx Enterprise now stores and returns only the correct UUID during snapshot creation, eliminating duplicate snapshot metadata that previously disrupted CSI operations. Components: Volume Management Affected Versions: 3.3.0 | Minor |
PWX-43085 | For ReadWriteMany (RWX) raw block devices (shared raw block devices), discard operations could not be disabled at the device level. In OCP, some hypervisor versions did not work correctly with the 4Kn block size of PXD block devices, which resulted in unstable VMs that were paused because of incorrect discard handling at the hypervisor. User Impact: When these volumes are used by hypervisors (for example, KubeVirt VMs), discard operations can lead to stability or performance issues, especially on hypervisors that handle discards incorrectly with 4Kn block devices (PXD devices are always 4Kn). Resolution: Portworx now supports disabling discard operations at the block device level by setting the nodiscard option on RWX raw block volumes (see the sketch after this table). Components: Storage Affected Versions: 3.3.0 | Minor |
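A minimal sketch of what the PWX-43085 change could look like in a StorageClass, assuming the nodiscard option described above is exposed as a StorageClass parameter of the same name; other parameters required for RWX raw block volumes are omitted, so check the RWX raw block documentation for the exact spelling and supported values.

kubectl apply -f - <<'EOF'
# Sketch only: the nodiscard parameter name and value are assumptions based on the fix description above.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-rwx-block-nodiscard
provisioner: pxd.portworx.com
parameters:
  repl: "3"
  nodiscard: "true"     # disable discard handling at the block device level (assumed spelling)
EOF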
3.3.0.1
June 25, 2025
To install or upgrade Portworx Enterprise to version 3.3.0.1, ensure that you are running one of the supported kernels and all system requirements are met.
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-44992 | When using Portworx Enterprise 3.3.0 with fuse driver version earlier than 3.3.0, Portworx might fail to clean up stale PXD block devices for encrypted volumes, if the volume is re-attached to a new node after the source node goes offline. User Impact: Stale PXD devices with encrypted volumes will remain on the source nodes and will not be cleaned up, even if the volume has moved to another node. Any subsequent attempt to attach the same volume to this node will fail. However, the volume can still be attached successfully to other nodes in the cluster. Resolution: Portworx Enterprise 3.3.0.1 now completely cleans up stale PXD devices for encrypted volumes from a node, even when an older fuse driver is in use. Components: Volume Management Affected Versions: 3.3.0 | Minor |
3.3.0
June 23, 2025
To install or upgrade Portworx Enterprise to version 3.3.0, ensure that you are running one of the supported kernels and all system requirements are met.
New Features
- Active Cluster on FlashArray Direct Access volumes
  Portworx now supports ActiveCluster on FlashArray Direct Access volumes with PX-StoreV2, allowing synchronous replication and automatic failover across multiple FlashArrays. For more information, see Install Portworx with Pure Storage FlashArray Direct Access volumes with ActiveCluster setup.
- Application I/O Control leveraging Control Group v2
  Portworx now supports Application I/O Control on hosts that use cgroup v2, in addition to cgroup v1. Portworx automatically detects the available cgroup version and applies I/O throttling accordingly to ensure seamless operation across supported Linux distributions. For more information, see Application I/O Control.
- Vault for storing vSphere credentials
  Portworx Enterprise now supports storing vSphere credentials in Vault when using Vault as a secret provider, to provide a more secure and centralized way to manage vSphere credentials. Previously, vSphere credentials were stored in Kubernetes secrets. For more information, see Secrets Management with Vault.
- Enhanced Cluster-wide Diagnostics Collection and Upload
  Portworx now supports cluster-level diagnostics collection through the PortworxDiag custom resource. When created, the Portworx Operator launches temporary diagnostic pods that collect node-level data and Portworx pod logs, store the results in the /var/cores directory, and then automatically delete the diagnostic pods. For more information, see On-demand diagnostics using the PortworxDiag custom resource.
- Volume-granular Checksum Verification tool for PX-StoreV1
  Portworx now supports block-level checksum verification across volume replicas using the pxctl volume verify-checksum command for PX-StoreV1 (see the example after this list). This feature ensures data integrity by comparing checksums across all replicas and supports pause/resume functionality with configurable I/O controls. For more information, see pxctl volume.
- TLS Encryption for Internal KVDB Communication
  Portworx now supports enabling Transport Layer Security (TLS) for internal KVDB communication on Google Anthos. Subsequent releases will include support for additional platforms. This feature secures communication between internal KVDB and all Portworx nodes using TLS certificates managed by cert-manager. For more information, see Enable TLS for Internal KVDB.
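For the checksum verification feature above, a basic run might look like the following; only the base command comes from this release note, the volume name is a placeholder, and the pause/resume and I/O control flags are not listed here.

# Verify block-level checksums across all replicas of a PX-StoreV1 volume.
pxctl volume verify-checksum <volume-name>
# The feature supports pause/resume and configurable I/O controls; check the command's help output
# for the exact flag names, which are not listed in this release note.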
Early Access Features
- Portworx Shared RWX Block volumes for KubeVirt VMs
  Portworx now supports ReadWriteMany (RWX) raw block volumes for KubeVirt virtual machines (VMs), enabling high-performance, shared storage configurations that support live migration of VMs in OpenShift environments (see the sketch after this list). For more information, see Manage Shared Block Device (RWX Block) for KubeVirt VMs.
- Enhance capacity management by provisioning custom storage pools
  Portworx now enables provisioning of storage pools during and after Portworx installation, enhancing the management of storage capacity. For more information, see Provision storage pool.
- Journal IO support for PX-StoreV2
  Portworx now supports journal device setup and journal IO profile volumes for PX-StoreV2. For more information, see Add a journal device.
- Support for multiple connections on the same NIC interface or bonded NIC using LACP
  Portworx enables the use of multiple connections on the same NIC interface or bonded NIC interfaces using LACP to enhance performance, as data traffic can be distributed across multiple links. For more information, see Configure multiple NICs with LACP NIC Bonding.
- Pool drain
  Portworx now supports moving volume replicas between storage pools using the pool drain operation. For more information, see Move volumes using pool drain.
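For the Portworx Shared RWX Block volumes item above, a minimal sketch of the kind of claim involved follows; the StorageClass name is a placeholder, and any KubeVirt-specific annotations documented in Manage Shared Block Device (RWX Block) for KubeVirt VMs are omitted.

kubectl apply -f - <<'EOF'
# Illustrative PVC only; storageClassName is a placeholder.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kubevirt-vm-disk
spec:
  accessModes:
    - ReadWriteMany        # shared access, required for live migration
  volumeMode: Block        # raw block device, no filesystem overhead
  storageClassName: <portworx-storageclass>
  resources:
    requests:
      storage: 50Gi
EOF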
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-43808 | Pure FA Cloud Drives attach backend volumes to hosts on FlashArray using the hostnames retrieved from the NODE_NAME container environment variable, which is specified by the Portworx spec.nodeName field. This might lead to hostname collisions across clusters. User Impact: Backend volumes might mount to hosts of other clusters. Resolution: The Pure FlashArray CloudDrive feature now sets the Purity hostname using a combination of the hostname and NODE_UID , limited to a maximum of 63 characters. This prevents hostname collisions and ensures that backend volumes are mounted only on the correct hosts. This also allows easier mapping back to the original host in the FlashArray user interface (UI) and logs.Components: Drive & Pool Management Affected Versions: 3.2.2 and earlier | Minor |
PWX-43472 | When a storage node fails, storageless nodes would repeatedly attempt cloud drive failover. Each attempt opened a new connection to kvdb /etcd but did not close it.User Impact: Open connections might eventually exhaust available file descriptors, making etcd non-responsive to new connections. Resolution: Connections opened for kvdb health checks during failover attempts are properly closed, preventing resource exhaustion and maintaining etcd responsiveness. Components: Control Plane, KVDB Affected Versions: 3.2.2.1 | Minor |
PWX-41940 | Portworx telemetry did not collect kubelet logs from cluster nodes. Only Portworx logs were available for troubleshooting. User Impact: Without kubelet logs, diagnosing cluster-level Kubernetes issues (such as pod crashes, evictions, or node failures) was slower and less effective, impeding root cause analysis and consistent monitoring across environments. Resolution: Telemetry-enabled clusters now periodically send filtered kubelet logs, which provides more complete telemetry for debugging and alerting. Components: Telemetry and Monitoring Affected Versions: 3.3.0 | Minor |
PWX-36280 | Portworx did not display kube-scheduler, kube-controller-manager, and pause image details in the /version endpoint output.User Impact: Without image details, it is difficult to obtain complete component version information when querying the manifest or automating image checks using the curl command. Resolution: The /version endpoint now includes kube-scheduler, kube-controller-manager, pause, and other relevant images in its version manifest, addressing the need for comprehensive version reporting via standard API calls.Components: Install & Uninstall, Operator Affected Versions: All | Minor |
PWX-32328 | Sometimes, Portworx propagated volume metrics to Prometheus from the wrong node, and in some cases, metrics for deleted volumes were reported as active. User Impact: A single volume appears as attached to two different nodes, resulting in false alerts about the actual state of storage volumes in Prometheus. Resolution: Volume-related metrics are now emitted only by the node where the volume is actually attached. Components: Telemetry and Monitoring Affected Versions: All | Minor |
PWX-27968 | Volume replicas may be placed incorrectly when a volume's VPS volume-affinity or volume anti-affinity rule contains multiple match expressions. The provisioner may treat a partial match as a full match, mistakenly selecting or deselecting certain pools during volume provisioning (see the sketch after this table). User Impact: When users create new volumes or add replicas to an existing volume using a VPS rule with multiple match expressions, replicas may be placed on unwanted nodes (volume-affinity scenario), or provisioning may fail (volume anti-affinity scenario). Resolution: The provisioning algorithm now always evaluates VPS volume rules on a per-volume basis, avoiding confusion between partial and full matches. Recommendation: There is no need to modify VPS rules or storage classes. With the new version, new volumes are placed correctly according to the VPS rules. However, incorrectly placed existing volumes still require a manual fix (move replicas using the pxctl command). Components: Volume Placement & Balancing Affected Versions: 3.3.x | Minor |
PWX-39098 | Abrupt pod shutdown or deletion in rare scenarios might leave behind (retain) device mappings. User Impact: New pods attempting to use the volume become stuck in the ContainerCreating phase due to incomplete cleanup of the device mappings.Resolution: The fix adds additional methods to remove any retained device mappings and attempts to clean them up. If the cleanup is unsuccessful, an appropriate message is returned. Components: Shared volumes Affected Versions: 3.2.0 | Minor |
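To illustrate the PWX-27968 scenario above, a volume-affinity rule with multiple match expressions might look like the sketch below; the label keys and values are placeholders, and the apiVersion shown is the commonly documented one for VolumePlacementStrategy, so treat it as an assumption.

kubectl apply -f - <<'EOF'
# Sketch: a VPS volume-affinity rule with two match expressions.
# With the fix, both expressions are evaluated together per volume; a partial match is no longer treated as a full match.
apiVersion: portworx.io/v1beta2
kind: VolumePlacementStrategy
metadata:
  name: sample-volume-affinity
spec:
  volumeAffinity:
    - matchExpressions:
        - key: app            # placeholder label key
          operator: In
          values: ["db"]
        - key: tier           # placeholder label key
          operator: In
          values: ["gold"]
EOF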
Known issues (Errata)
Issue Number | Issue Description | Severity |
---|---|---|
PWX-44992 | When using Portworx Enterprise 3.3.0 with fuse driver version earlier than 3.3.0, Portworx might fail to clean up stale PXD block devices for encrypted volumes, if the volume is re-attached to a new node after the source node goes offline. | Minor |
PWX-43720 | In Portworx Enterprise version 3.3.0 or later, when used with operator version 25.2.0 or later, the Components: Telemetry | Minor |
PWX-43275 | When using FlashArray Direct Access (FADA) volumes in ActiveCluster mode, if a volume is deleted while one of the backend arrays is unavailable, orphan volumes may remain on the FlashArray that was down. This issue does not affect application workloads directly, but manual cleanup might be required to identify and remove orphaned volumes. Components: Volume Management | Minor |
PWX-44473 | KubeVirt virtual machines using Portworx on IPv6 clusters with SharedV4 volumes may enter a Workaround: Configure the | Minor |
PWX-44223 | A storage-less node may fail to pick up the drive-set from a deleted storage node if the API token used for FlashArray authentication has expired and the node does not automatically retrieve the updated token from the Kubernetes secret. As a result, the storage-less node is unable to log in to the FlashArray and fails to initiate drive-set failover after the storage node is deleted. Components: Drive & Pool Management | Minor |
PWX-44623 | When provisioning virtual machines (VMs) on FlashArray using KubeVirt, the VM might remain in the Provisioning state if the underlying PersistentVolumeClaim (PVC) fails with the error. The VM will attempt to provision a new PVC automatically. Components: Volume Management | Minor |
PWX-43060 | If a FlashArray Direct Access (FADA) multipath device becomes unavailable after the volume has already been marked as Components: Volume Management | Minor |
PWX-43212 | Some VMs might remain stuck in the Shutting Down state after a FlashArray (FA) failover, especially when nodes are overpopulated with VMs. This is a known occurrence related to VM density and node resource allocation. Resolution: Monitor the number of VMs assigned to each node and plan resource allocation across the cluster to reduce the risk. Components: Volume Management | Minor |
PWX-44486 | If the coordinator node of an RWX volume (that is, the node where the volume is currently attached) is placed into Maintenance mode, application pods using the volume might temporarily experience I/O disruption and encounter Input/Output errors. Resolution: To prevent this issue, before putting a node into Maintenance mode, check whether any volumes (especially RWX volumes) are attached to it (see the inspection sketch after this table). If they are attached, restart Portworx on the node first by running Components: Storage | Minor |
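For PWX-44486 above, one way to check where a volume is attached before entering Maintenance mode is sketched below; the exact restart command is truncated in the note above, so only the inspection step is shown, and the "attached on" string match is an assumption about the inspect output.

# List volumes and their state; the status indicates the node a volume is attached on.
pxctl volume list
# Inspect a specific RWX volume to confirm its coordinator node before maintenance.
pxctl volume inspect <volume-name> | grep -i "attached on"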
3.2.3
May 13, 2025
To install or upgrade Portworx Enterprise to version 3.2.3, ensure that you are running one of the supported kernels and all prerequisites are met.
New Features
- FlashArray Direct Access shared raw block (RWX) volumes
Portworx now supports FADA shared raw block (RWX) volumes, enabling live migration of KubeVirt VMs with high-performance raw block storage. This eliminates filesystem overhead, improves I/O performance, and ensures seamless migration by allowing simultaneous volume access on source and destination nodes. For more information, see Run KubeVirt VMs with FlashArray Direct Access shared raw block (RWX) volumes.
Note: This release also addresses security vulnerabilities.
Improvements
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-42785 | FlashArray Fibre Channel integration now filters out WWNs from uncabled ports when creating hosts. This enhancement reduces manual intervention and prevents errors during volume attachment in environments with partially connected FC ports. | Volume Management |
PWX-43645 | Portworx now supports setting the sticky bit for FlashArray Direct Access (FADA) volumes. You can set the sticky bit using the --sticky flag with the pxctl volume update command. | Volume Management |
PWX-42482 | A backoff mechanism now limits repeated calls, reducing kube-apiserver load. Unnecessary LIST /api/v1/nodes API calls from nodes in the NotReady state or with px/enabled=false are reduced, which improves efficiency. | API |
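For PWX-43645 above, a usage sketch follows; the --sticky flag comes from the improvement description, while the on value and volume name are placeholders, so check the command's help output for the accepted values.

# Set the sticky bit on an existing FADA volume so it is protected from deletion until the bit is cleared.
pxctl volume update --sticky on <volume-name>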
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-42489 | Portworx made repeated GET requests to /api/v1/namespaces/portworx/configmaps/px-telemetry-phonehome from nodes that were either NotReady or had px/enabled=false .User Impact: These API calls were unnecessary and added load to the Kubernetes API server, particularly in clusters with many storageless or inactive nodes. Resolution: Portworx startup previously made unconditional API calls to fetch telemetry configuration. This has been fixed by updating the sequence to first check local sources before querying the Kubernetes API. Components: Telemetry and Monitoring Affected Versions: 3.2.2.2 or earlier | Minor |
PWX-43598 | Repl-add could select a replica node that doesn't comply with the volume's volume-affinity VPS rule when no valid pools are available. If a KubeVirt VPS fixer job is running, it may enter a loop of repeated repl-add and repl-remove operations on the same volume without resolving the placement issue. User Impact: This may lead to incorrect replica placement and violation of affinity rules. The VPS fixer job can create unnecessary load by repeatedly attempting to correct the placement. Resolution: Portworx now evaluates additional conditions before allowing fallback to relaxed volume-affinity placement. Relaxed-mode is applied only when no nodes are available that meet the required affinity criteria, ensuring more consistent replica alignment. Components: Volume Placement and Balancing Affected Versions: 3.2.2.2 and earlier | Minor |
Known issues (Errata)
Issue Number | Issue Description | Severity |
---|---|---|
PWX-43463 | In OpenShift virtualization, after you restart the entire Kubernetes cluster, virtual machines remain in the Workaround: | Minor |
PWX-43486 | When using FlashArray Direct Access (FADA) shared block volumes, virtual machines (VMs) might temporarily stop during live migration if the primary FlashArray (FA) controller reboots while the secondary controller is unavailable. This occurs because I/O paths are unavailable, causing I/O errors that pause the VM. Workaround: | Minor |
PWX-42358 | On RHEL 8.10 systems running Linux kernel 4.18, Workaround: To resolve this issue, restart the Portworx service or manually recreate the missing cgroup directories by running the following commands: | Minor |
PWX-43849 | Portworx does not support Debian 11 with kernel version 5.10.0-34-amd64 for PX-StoreV1 due to a known issue, and we recommend using Debian 12 with kernel version | Minor |
3.2.2.2
April 17, 2025
Portworx now supports IPv6 clusters for OpenShift with KubeVirt in dual-stack networking mode. To install or upgrade Portworx Enterprise to version 3.2.2.2, ensure that you are running one of the supported kernels and all prerequisites are met.
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-42915 | In clusters where IPv6 is preferred (PX_PREFER_IPV6_NETWORK_IP=true), Sharedv4 volume mounts may fail if Portworx selects an incorrect IPv6 address. This causes pods to remain in the ContainerCreating state with a "permission denied" error from the server. User Impact: Pods using Sharedv4 volumes may fail to start in IPv6-preferred or dual-stack clusters. This does not affect clusters using IPv4 by default. Resolution: Portworx now uses a consistent strategy to select the most appropriate IPv6 address. Components: Shared Volumes Affected Versions: 3.2.2.1 or earlier | Minor |
PWX-42843 | When Portworx was deployed in a dual-stack (IPv4 and IPv6) Kubernetes cluster, it created a sharedv4 Kubernetes Service without explicitly specifying the ipFamily field. If ipFamily wasn't set, Kubernetes created an IPv4 address by default, while Portworx was listening on an IPv6 address.User Impact: Pods using sharedv4 service volumes failed to start because sharedv4 volume mounts couldn't complete using the IPv4-based Kubernetes Service IP address. Resolution: Portworx now explicitly sets the ipFamily field on the sharedv4 Kubernetes Service based on the IP address it uses in a dual-stack Kubernetes cluster.Components: Shared Volumes Affected Versions: 3.2.2.1 or earlier | Minor |
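Portworx now sets the ipFamily field itself, but for reference, the PWX-42843 fix above corresponds to the standard Kubernetes Service fields shown in the sketch below; the service name, namespace, selector, and port are placeholders, and this is not a Portworx-managed object.

kubectl apply -f - <<'EOF'
# Illustration of the ipFamily-related Service fields referenced in PWX-42843.
apiVersion: v1
kind: Service
metadata:
  name: sharedv4-example      # placeholder name
  namespace: default          # placeholder namespace
spec:
  ipFamilyPolicy: SingleStack
  ipFamilies:
    - IPv6                    # match the address family the server listens on
  selector:
    app: example              # placeholder selector
  ports:
    - port: 2049              # NFS port used by sharedv4 service volumes
      protocol: TCP
EOF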
Known issues (Errata)
Issue Number | Issue Description | Severity |
---|---|---|
PWX-43355 | KubeVirt virtual machines using Portworx on IPv6 clusters with SharedV4 volumes may enter a Workaround: Configure the | Minor |
3.2.2.1
March 26, 2025
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-42778 | Processing unaligned write requests on volumes might cause partial data transfer issues. User Impact: When processing large write requests (e.g., 1MB), unaligned blocks at the start of a request might lead to partial data transfers. This occurs when the available space in the user iovecs runs out before the last portion of the data is copied.Note: This issue occurs only when using the virtio driver in a KubeVirt deployment.Resolution: Improved handling of unaligned requests prevents premature exhaustion of user iovecs and ensures that all data is copied for large write operations. Components: Storage Affected Versions: 3.2.2 | Minor |
3.2.2
March 10, 2025
The Portworx Essentials license was discontinued on February 10, 2025. Starting with version 3.2.2, no images will be released for Portworx Essentials.
To install or upgrade Portworx Enterprise to version 3.2.2, ensure that you are running one of the supported kernels and that the prerequisites are met.
New Features
- Encryption support for FlashArray Direct Access (FADA)
  Portworx now supports FADA volume encryption, providing seamless data protection by encrypting information both in transit and at rest on FlashArray storage. Encryption keys are used consistently across the cluster, even with multiple FlashArrays. This feature ensures that data remains secure throughout the process, with encryption handled at the storage level (see the StorageClass sketch after this list). For more information, see Create encrypted PVCs in FlashArray.
- NVMe-oF/TCP support for FlashArray Direct Access (FADA)
  Portworx now supports the NVMe-oF/TCP protocol, providing high-performance, low-latency storage access for Kubernetes applications using FlashArray LUNs. By leveraging standard TCP/IP, this feature eliminates the need for specialized networking hardware like RoCEv2, making deployment more flexible and cost-effective while maintaining optimal performance. For more information, see Set up NVMe-oF TCP protocol with FlashArray.
- PX-StoreV2 support on additional platforms
  Portworx now supports installation with PX-StoreV2 on the following platforms:
- Portworx Enterprise now supports Kubernetes version 1.31, starting from version 1.31.6. Before upgrading Kubernetes to 1.31.6 or later, update the Portworx Operator to version 24.2.3. For more details, refer to the Portworx Operator 24.2.3 release notes.
The logUploader utility is now hosted in the portworx/log-upload repository. Please update your image repository mirrors to pull from this new location.
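An illustrative StorageClass for the FADA encryption feature above; the backend and secure parameter names follow common Portworx conventions and should be treated as assumptions, so see Create encrypted PVCs in FlashArray for the exact parameters.

kubectl apply -f - <<'EOF'
# Sketch only: parameter names are assumptions; consult the FADA encryption documentation.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-fada-encrypted
provisioner: pxd.portworx.com
parameters:
  backend: "pure_block"   # FlashArray Direct Access volumes (assumed parameter)
  secure: "true"          # encrypt data at rest and in transit (assumed parameter)
EOF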
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-41668 | In environments with slow container runtimes, the Portworx Pod could report READY=0/1 (Not Ready) even when the backend was fully operational. This occurred due to internal readiness checks failing to update in rare cases.User Impact: The pod might appear as Not Ready , causing confusion in monitoring.Resolution: The readiness check logic has been fixed, ensuring the POD transitions correctly to READY=1/1 when the backend is operational.Components: Monitoring Affected Versions: 3.2.1.2 or earlier | Minor |
PWX-40755 | When Portworx is configured with separate data and management interfaces, some KubeVirt VMs may enter a paused state during platform or Portworx upgrades. User Impact: During upgrades, certain KubeVirt VMs may pause unexpectedly and require manual intervention to restart. Resolution: The issue has been fixed, ensuring KubeVirt VMs remain operational during Portworx and platform upgrades, without requiring manual restarts. Components: Upgrade Affected Versions: 3.2.0, 3.2.1, 3.2.1.1, and 3.2.1.2 | Minor |
PWX-40564 | Pool expansion could fail if the backend FA volume was expanded but the updated size was not reflected on the node. User Impact: If a drive was expanded only in the backend and a pool expansion was attempted with a size smaller than the backend size, the operation would fail. Resolution: Pool expansion now correctly retrieves and updates the drive size from the backend, preventing failures caused by size mismatches. Components: Drive and Pool Management Affected Versions: All | Minor |
PWX-39322 | Cloud drive lock contention during the startup of an affected node could cause inconsistencies in the internal KVDB, potentially triggering a panic in other PX nodes. User Impact: In large clusters, where lock contention is more likely, this issue could significantly extend the Portworx cluster restore process. Resolution: If an inconsistency is detected when the affected node starts, it now performs a cleanup to resolve the issue, preventing other nodes from panicking. Components: Drive and Pool Management Affected Versions: All | Minor |
PWX-40423 | Decommissioning a non-KVDB storage node did not automatically delete the associated drives from the FA backend. User Impact: Users had to manually remove drives from the FA backend after decommissioning a node. Resolution: The decommission process has been updated to ensure that backend devices are deleted automatically when the node wipe is completed. Components: Drive and Pool Management Affected Versions: All | Minor |
PWX-41685 | The PVC label template in VPS did not recognize incoming label keys containing multiple segments (dots). As a result, the template was not replaced with the label value, leading to unintended VPS behavior. User Impact: Users utilizing PVC label templates with multi-segment PVC labels experienced incorrect VPS functionality. Resolution: Updated the pattern matching for PVC label templates to support multi-segment label keys, ensuring correct label value replacement. Components: Volume Placement and Balancing Affected Versions: All | Minor |
PWX-40364 | When volume IDs had varying lengths (as expected), the defrag schedule occasionally failed to resume from the correct position after pausing. Instead, it restarted from the beginning, preventing the completion of a full iteration. User Impact: The built-in defrag schedule was unable to iterate through all volumes, rendering it ineffective in addressing performance issues. Users had to revert to using a defrag script. Resolution: The built-in defrag schedule now correctly resumes from the last stopped position and iterates through all volumes as expected. Components: KVDB Affected Versions: 3.2.0 and 3.2.1 | Minor |
PWX-37613 | If a pool expansion failed after cloud drives were expanded but before the pool was updated, attempting a subsequent expansion with a smaller size resulted in an error. User Impact: Users could experience a pool expansion failure if a previous expansion was interrupted and left unfinished, and they attempted another expansion of a smaller size. Resolution: The second pool expansion request now detects and completes the previously interrupted expansion instead of failing. Components: Drive and Pool Management Affected Versions: 3.1.2 to 3.2.1 | Minor |
PWX-38702 | In certain failover scenarios, mounting a shared file system could fail with an "Exists" or "file exists" error. This issue occurs due to an unclean unmount when the file system was last mounted on the same node. User Impact: This might result in user pods remaining in “Container Creating” state. Resolution: The fix addresses multiple underlying causes that lead to unclean unmounts. Additionally, since this issue can also arise due to a race condition in the Linux kernel, the fix now detects such scenarios, aborts the mount process, and provides a clear error message. Components: Shared Volumes Affected Versions: 3.2.0 to 3.2.1.2 | Minor |
PWX-42043 | The CLI command pxctl cred list [-j] returns an error and fails to list credentials.User Impact: If the cluster contains non-S3 credentials, the pxctl cred list [-j] command will not display the credentials.Resolution: The command now correctly lists all credentials, including non-S3 credentials, without errors. Components: CLI and API Affected Versions: 3.1.8, 3.2.1.2 | Minor |
Known issues (Errata)
- PWX-42379: On PX-Security enabled clusters running Kubernetes 1.31 or later, expanding an in-tree PersistentVolumeClaim (PVC) fails due to compatibility issues. This prevents users from increasing storage capacity through standard PVC expansion methods, potentially impacting workloads that require additional storage.
  Workaround: Until this issue is resolved in a future external-resizer sidecar release from the upstream Kubernetes community, users can manually expand the volume using pxctl volume update --size <new-size> <volume-name> instead of updating the PVC size.
  Components: Volume Management
  Affected Versions: 3.2.1.1 or later
  Severity: Minor
- PWX-42513: When you deploy more than 100 apps with FlashArray Direct Access (FADA) PVCs using NVMe-oF/TCP at the same time, volumes are created in the backend. However, the attempt to attach hosts to the volume in the Portworx layer sometimes fails, leaving device mappers on the hosts with no available paths. Because the mapper device is created, Portworx attempts to create a filesystem but hangs due to the missing paths. Additionally, PVC creations can get stuck in the ContainerCreating state. The large number of multipath FADA volumes increases the time required for newer FADA volumes' multipath to appear, causing Portworx to enter an error state.
  Note: We recommend creating FADA volumes in batches with a significant interval between each batch.
  Workaround: To recover from this state, perform the following steps:
  1. Identify the affected device:
     multipath -ll
     eui.00806e28521374ac24a9371800023155 dm-34 ##,##
     size=50G features='1 queue_if_no_path' hwhandler='0' wp=rw
  2. Disable queueing for the affected device:
     multipathd disablequeueing map eui.00806e28521374ac24a9371800023155
  3. Flush the multipath device:
     multipath -f eui.00806e28521374ac24a9371800023155
  4. Verify that the device has been removed:
     multipath -ll eui.00806e28521374ac24a9371800023155
  5. Reattach the volume manually from the FA controller to the host (worker node).
  6. Confirm that the device is correctly reattached and that paths are available:
     multipath -ll eui.00806e28521374ac24a9371800023155
     eui.00806e28521374ac24a9371800023155 dm-34 NVME,Pure Storage FlashArray
     size=50G features='4 queue_if_no_path retain_attached_hw_handler queue_mode bio' hwhandler='0' wp=rw
     `-+- policy='queue-length 0' prio=50 status=active
       |- 1:245:24:544 nvme1n24 259:68 active ready running
       `- 0:1008:24:544 nvme0n24 259:71 active ready running
  7. Confirm that no Portworx processes are in an uninterruptible sleep state (D state) using the following command:
     ps aux | grep " D "
  Components: Volume Management
  Affected Versions: 3.2.2
  Severity: Minor
3.2.1.2
February 04, 2025
New Features
Portworx Enterprise now supports the following:
- Installation of Portworx with PX-StoreV2 on Rancher clusters running on Ubuntu or SUSE Linux Micro. For hardware and software requirements, see Prerequisites.
- Rancher clusters on SUSE Linux Micro. For a list of supported distributions and kernel versions, see Qualified Distros and Kernel Versions.
Note: This release also addresses security vulnerabilities.
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-41663 | If Kubernetes clusters contain FlashBlade volumes migrated from Pure Storage Orchestrator (PSO) clusters, the Portworx process on these systems enters a continuous crash loop, preventing normal volume operations. User Impact: Portworx repeatedly crashes and restarts, preventing normal cluster operation. Resolution: This issue has been resolved. Portworx no longer crashes in environments with FlashBlade volumes migrated from PSO clusters. Components: Upgrade Affected Versions: 3.2.1, 3.2.1.1 | Major |
Known issues (Errata)
Issue Number | Issue Description | Severity |
---|---|---|
PD-3880 | On systems with automatic updates enabled, the system may upgrade to a kernel version that is not listed on the supported kernel page. This can prevent the Portworx kernel module from loading, resulting in a failed Portworx installation. Workaround: Disable automatic updates and verify that you are using a supported kernel version. Components: Installation Affected Versions: 3.2.x, 3.1.x, 3.0.x | Major |
3.2.1.1
December 17, 2024
Visit these pages to see if you're ready to upgrade to this version:
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-40233 | The volume snapshot count with the Portworx CSI for FlashArray and FlashBlade license has been increased from 5 to 64. | Licensing & Metering |
PWX-37757 | The Pure export rules for accessing FlashBlade are now defined by the specified accessModes in the PVC specification. | Volume Management |
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-38838 | During asynchronous disaster recovery, delete requests for objects in the object-store were being rejected when the schedule contained more than 16 volumes. User Impact: Due to this issue, users saw some of the asynchronous disaster recovery relayed objects were not being cleaned up from object-store. Resolution: The system has been updated to accept up to 64 delete requests. This change prevents objects from being retained in the object-store when the schedule includes more than 16 volumes but fewer than 64. Components: Migration Affected Versions: 3.2.x, 3.1.x, 3.0.x | Major |
PWX-40477 | Portworx cluster failed to migrate from using external KVDB to internal KVDB. User Impact: Due to this issue, users were unable to migrate their Portworx clusters from external KVDB to internal KVDB, disrupting operations that rely on the internal KVDB for managing the cluster's state and configuration. Resolution: Portworx clusters can now be successfully migrated from external KVDB to internal KVDB. For instructions, contact the Portworx support team. Components: KVDB Affected Versions: 3.2.0, 2.13.12 | Major |
3.2.1
December 2, 2024
Visit these pages to see if you're ready to upgrade to this version:
New features
Portworx by Pure Storage is proud to introduce the following new features:
- Portworx now supports the PX-StoreV2 backend on the following platforms:
3.2.0
October 31, 2024
Visit these pages to see if you're ready to upgrade to this version:
Portworx 3.2.0 requires Portworx Operator 24.1.3 or newer.
New features
Portworx by Pure Storage is proud to introduce the following new features:
- Secure multi-tenancy with Pure FlashArray
  When a single FlashArray is shared among multiple users, administrators can use realms to allocate storage resources to each tenant within isolated environments. Realms set boundaries, allowing administrators to define custom policies for each tenant. When a realm is specified, the user must provide a FlashArray pod name where Portworx will create all volumes (direct access or cloud drives) within that realm. This ensures that each tenant can only see their own storage volumes when logged into the array.
- Support for VMware Storage vMotion
  Portworx now supports the Storage vMotion feature of VMware, enabling vSphere cloud drives to be moved from one datastore to another without any downtime.
- Defragmentation schedules
  Users can now set up defragmentation schedules using pxctl commands during periods of low workload to improve the performance of Portworx.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-35876 | For IBM customers, Portworx now supports the StorageClass with the encryption flag set to true. | Marketplaces |
PWX-38395 | Previously, all storageless nodes would restart to claim a driveset when a storage node went down and its driveset was detached in the same zone. With this improvement, only one storageless node will claim ownership of the driveset and restart, while the other storageless nodes remain unaffected and do not restart. | Drive & Pool Management |
PWX-33561 | For partially attached drivesets, Portworx now detaches the driveset only when cloud drives are not mounted, avoiding unnecessary detachment when a mount is present. | Drive & Pool Management |
PWX-37403 | FlashArray now allows specifying multiple management ports for the same FlashArray. If customers are on a VLAN connection to FlashArray, the virtual IP address might encounter issues. Customers can specify the management IPs of the controllers directly in the secret as comma-separated values. | Drive & Pool Management |
PWX-38597 | For FlashArray Cloud Drives, on Portworx restart, any stale entries of the driveset are cleaned, and the locally attached driveset is prioritized for mounting volumes rather than checking all other drives. | Drive & Pool Management |
PWX-39131 | The total number of GET API calls is significantly reduced. | Drive & Pool Management |
PWX-38551 | The latency of any operation on FlashArray due to multiple API calls has been reduced. Portworx now uses the FlashArray IDs stored in the cloud drive config map to limit API calls only to the FlashArray where the drive resides. | Drive & Pool Management |
PWX-37864 | When you add a drive using the pool expand add-drive operation, the config map is now automatically updated with the pool ID of the newly added drive, preventing the need for a Portworx restart. | Drive & Pool Management |
PWX-38630 | Portworx now supports adding a cloud drive to a storageless node when the cloud drive specification for the journal device in the StorageCluster spec is explicitly set to a value other than auto . | Drive & Pool Management |
PWX-38074 | Improved the startup timing of Portworx nodes in multi-FlashArray setups by handling metrics timeouts more effectively. When volume creation on a FlashArray takes too long, Portworx now avoids sending further requests to that FlashArray for 15 minutes, allowing other nodes to continue the startup process without delays. | Drive & Pool Management |
PWX-38644 | For FlashArray Cloud Drives, pool expansion failure messages are no longer overridden by maintenance mode messages, providing more useful error information for users to debug their environment. | Drive & Pool Management |
PWX-33042 | In disaggregated environments, users cannot add drives to a storageless node labeled as portworx.io/node-type=storageless . To add drives, users need to change the node label to portworx.io/node-type=storage and restart Portworx. | Drive & Pool Management |
PWX-38169 | During pool expansion, Portworx now checks the specific driveset that the node records, rather than iterating through all drivesets in the cluster randomly. This change significantly reduces the number of API calls made to the backend, thereby decreasing the time required for pool expansion and minimizing the risk of failure, particularly in large clusters. | Drive & Pool Management |
PWX-38691 | Portworx now raises an alert called ArrayLoginFailed when it fails to log into a FlashArray using the provided credentials. The alert includes a message listing the arrays where the login is failing. | Drive & Pool Management |
PWX-37672 | The pxctl cd i --<node-ID> command now displays the IOPS set during disk creation. | Drive & Pool Management |
PWX-37439 | Azure users can now specify IOPS and throughput parameters for Ultra Disk and Premium v2 disks. These parameters can only be set during the installation process. | Drive & Pool Management |
PWX-38397 | Portworx now exposes NFS proc FS pool stats as Prometheus metrics. Metrics to track the number of Packets Arrived , Sockets Enqueued , Threads Woken , and Threads Timedout have been added. | Shared Volumes |
PWX-35278 | A cache for the NFS and Mountd ports has been added, so the system no longer needs to look up the ports every time. The GetPort function is only called the first time during the creation or update of the port, and the cache updates if accessed 15 minutes after the previous call. | Shared Volumes |
PWX-33580 | The NFS unmount process has been improved by adding a timeout for the stat command, preventing it from getting stuck when the NFS server is offline and allowing retries without hanging. | Shared volumes |
PWX-38180 | Users can now set the QPS and Burst rate to configure the rate at which API requests are made to the Kubernetes API server. This ensures that the failover of the sharedv4 service in a scaled setup is successful, even if another operation causes an error and restarts some application pods. To do this, add the following environment variables: | Shared Volumes |
PWX-39035 | Portworx will no longer print the Last Attached field in the CLI's volume inspect output if the volume has never been attached. | Volume Management |
PWX-39373 | For FlashArray Direct Access volumes, the token timeout has been increased from 15 minutes to 5 hours, which provides enough time for Portworx to process a large number of API token requests. | Volume Management |
PWX-39302 | For Portworx CSI volumes, calls to the Kubernetes API to inspect a PVC have been significantly reduced, improving performance. | Volume Management |
PWX-37798 | Users can now remove labels from a Portworx volume using the pxctl volume update -l command, allowing them to manually assign pre-provisioned Portworx volumes to a pod. | Volume Management |
PWX-38585 | FlashArray Direct Access users can now clone volumes using pxctl . | Volume Management |
PWX-35300 | Improved FlashBlade Direct Access volume creation performance by removing an internal lock, which previously caused delays during parallel creation processes. | Volume Management |
PWX-37910 | Cloudsnaps are now initialized using a snapshot of the KVDB, avoiding failure errors. | Storage |
PWX-35130 | Portworx now sends an error message and exits the retry loop when a volume is stuck in a pending state, preventing continuous creation attempts. | Storage |
PWX-35769 | Storageless nodes now remain in maintenance mode without being decommissioned, even if they exceed the auto-decommission timeout. This prevents failure for user-triggered operations when the storageless node is in maintenance mode. | Control Plane |
PWX-39540 | Portworx now ensures the correct information for a pure volume is returned, even if the FlashArray is buggy, preventing node crashes. | Control Plane |
PWX-37765 | The pxctl volume list command has been improved to allow the use of the --pool-uid flag alongside the --trashcan flag, enabling the filtering of trashcan volumes based on the specified Pool UUID. | CLI & API |
PWX-37722 | Added a new --pool-uid flag to the pxctl clouddrive inspect command, allowing users to filter the inspect output based on the specified Pool UUID. | CLI & API |
PWX-30622 | The output of the pxctl volume inspect <volume-id> command now displays the labels alphabetically, making it easier to track any changes made to labels. | CLI & API |
PWX-39146 | The pxctl status output now also includes a timestamp indicating when the information was collected. | CLI & API |
PWX-36245 | PX-StoreV2 pools now support a maximum capacity of 480TB by choosing appropriate chunk size during pool creation. | PX-StoreV2 |
PWX-39059 | Portworx now installs successfully on cGroupsV2 and Docker Container runtime environments. | Install & Uninstall |
PWX-37195 | Portworx now automatically detects SELinux-related issues during installation and attempts to resolve them, ensuring a smoother installation process on SELinux-enabled platforms. | Install & Uninstall |
PWX-38848 | Portworx now properly handles the floating license-lease updates, when cloud-drives move between the nodes. | Licensing & Metering |
PWX-38694 | Improved the time to bring up a large cluster by removing a short-lived cluster lock used in cloud drive deployments. | KVDB |
PWX-38577 | The logic for handling KVDB nodes when out of quorum has been improved in Portworx. Now, Portworx processes do not restart when KVDB nodes are down. | KVDB |
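For PWX-37403 above, a sketch of a pure.json with controller management IPs specified as comma-separated values follows; the key names and secret name follow the commonly used Pure secret format and should be treated as assumptions, and the IPs, token, and namespace are placeholders.

# Sketch only: key names and secret name are assumptions; IPs, token, and namespace are placeholders.
cat > pure.json <<'EOF'
{
  "FlashArrays": [
    {
      "MgmtEndPoint": "10.0.0.10,10.0.0.11",
      "APIToken": "<api-token>"
    }
  ]
}
EOF
kubectl create secret generic px-pure-secret --namespace <portworx-namespace> --from-file=pure.json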
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-38609 | Portworx sometimes lost the driveset lock for FlashArray cloud drives when the KVDB drive was removed in situations such as KVDB failover. User Impact: Loss of the driveset lock resulted in other nodes attempting to attach a drive already attached to the current node. Resolution: Portworx now uses a temporary driveset to safely remove the KVDB drive. Components: KVDB Affected Versions: 3.1.5 | Critical |
PWX-38721 | Portworx attempted to mount FlashBlade Direct Access volumes using the NFS IP. However, if an existing mount point used an FQDN, Portworx defaulted to the FQDN after a restart. If a Kubernetes mount request timed out, but Portworx completed it successfully, Kubernetes retried the request. Portworx then returned an error due to the FQDN, leading to repeated mount attempts. User Impact: Application pods with a timed-out initial mount request were stuck in the ContainerCreating state. Resolution: Portworx now performs IP resolution on the existing mount entry. If they match, it confirms the mount paths are already created, and Portworx returns a success. Components: Volume Management Affected Versions: 3.1.x, 3.0.x | Critical |
PWX-38618 | In a cluster where multiple applications used the same FlashBlade Direct Access volume, some applications used FQDNs while others used IP addresses. The NFS server recognized only the FQDN, causing a mismatch in the mount source paths tracked by Portworx. User Impact: Application pods using IPs to mount the FlashBlade Direct Access volume were stuck in the terminating state. Resolution: When a request is received from CSI to unmount a target path for FlashBlade Direct Access, Portworx unconditionally unmounts it, even if the source path differs from the one recognized by it. Components: Volume Management Affected Versions: 3.1.x, 3.0.x | Critical |
PWX-38376 | During node initialization in the boot-up process, FlashArray properties are required for all the dev mapper paths already present on the node. This call is made to all arrays configured in pure.json configuration file, which sometimes failed, causing the initialization to fail.User Impact: Users saw node initialization failures due to errors from arrays that had no volumes for the current node. Additionally, unintended extra API calls were made to the arrays, contributing to the overall API load. Resolution: Portworx now uses the FlashArray volume serial to determine which array the volume belongs to. The array ID is then passed as a label selector to DeviceMappings, ensuring that only the relevant array is queried. Components: Volume Management Affected Versions: 3.1.x, 3.0.x | Critical |
PWX-36693 | When a storageless node transitioned to a storage node, the node's identity changed as it took over the storage node identity. The old identity corresponding to the storageless node was removed from the Portworx cluster. All volumes attached to the removed node were marked as detached, even if pods were currently running on the node. User Impact: Volumes incorrectly appeared as detached, even while pods were running and consuming the volumes. Resolution: Portworx now decommissions cloud drives only after the AutoDecommissionTimeout expires, ensuring that volumes remain attached to the node and are not incorrectly displayed as detached. Components: Volume Management Affected Versions: 3.1.1 | Critical |
PWX-38173 | When the storage node attempted to restart, it could not attach the previous driveset, as it was already claimed by another node, and could not start as a new node because the drives were still mounted. User Impact: The storage node attempting to come back online repeatedly restarted due to unmounted drive mount points. Resolution: Portworx now automatically unmounts FlashArray drive mount points if it detects that the previous driveset is unavailable but its mount points still exist. Component: Drive and Pool Management Affected Versions: 3.0.x, 3.1.x | Critical |
PWX-38862 | During Portworx upgrades, a sync call was triggered and became stuck on nodes when the underlying mounts were unhealthy. User Impact: Portworx upgrades were unsuccessful on nodes with unhealthy shared volume mounts. Resolution: Portworx has removed the sync call, ensuring that upgrades now complete successfully. Components: Drive & Pool Management Affected Versions: 3.1.x | Critical |
PWX-38936 | When a storage node restarted, it restarted several times before it could successfully boot because its driveset was locked and would not be available for a few minutes. User Impact: Users saw the Failed to take the lock on drive set error message, and the node took a longer time to restart. Resolution: In such cases, Portworx now tells the restarting node that the driveset is not locked, allowing it to claim the driveset without waiting for the lock to expire. During this time, other nodes still see this driveset as locked and unavailable. Components: Drive & Pool Management Affected Versions: 3.1.x | Major |
PWX-39627 | In large Portworx clusters with many storage nodes using FlashArray or FlashBlade as the backend, multiple nodes might simultaneously attempt to update the lock configmap, resulting in conflict errors from Kubernetes. User Impact: Although the nodes eventually resolved the conflicts, this issue spammed the logs and slowed down boot times, especially in large clusters. Resolution: The refresh interval has been changed from 20 seconds to 1 minute. In case of a conflict error, Portworx now delays the retry by a random interval between 1 and 2 seconds, reducing the likelihood of simultaneous updates. Additionally, the conflict is logged only after 10 consecutive occurrences, indicating a real issue. Components: Drive & Pool Management Affected Versions: 3.1.x, 3.0.x | Major |
PWX-36318 | In IBM Cloud, the node name is the same as the node IP. If the selected subnet had very few available IPs and Portworx replaced a worker node, the new node would take the same IP. User Impact: When Portworx started on the replaced node with the same IP, it incorrectly assumed that it had locally attached drives due to the volume attachments. This assumption led to an attempt to access the non-attached device path on the new node, causing Portworx to fail to start. Resolution: With the new provider-id annotation added to the volume attachment, Portworx now correctly identifies the replaced node as a new one without local attachments.Component: Drive and Pool Management Affected Versions: 3.1.x | Major |
PWX-38114 | In IBM Cloud, the node name is the same as the node IP. If the selected subnet had very few available IPs and a worker node was replaced, the new node had the same IP. User Impact: When Portworx started on the replaced node with the same IP, it incorrectly assumed it had locally attached drives due to existing volume attachments, leading to a stat call on the non-attached device path and causing Portworx to fail to start. Resolution: The volume attachment now includes a new annotation, provider-id , which is the unique provider ID of the node, allowing Portworx to recognize that the replaced node is new and has no local attachments.Component: Drive and Pool Management Affected Versions: 3.0.x, 3.1.x | Major |
PWX-37283 | A storageless node did not transition into a storage node after a restart if it initially became storageless due to infrastructure errors unrelated to Portworx. User Impact: These errors caused the node to have attached drives that it was unaware of, preventing the node from recognizing that it could use these drives during the transition process. Resolution: When a storageless node attempts to become a storage node, it now checks for any attached drives that it previously did not recognize. Using this information, the storageless node can correctly decide whether to restart and transition into a storage node. Component: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x | Major |
PWX-38760 | On a node with existing FlashBlade volumes mounted via NFS using a DNS/FQDN endpoint, if Portworx received repeated requests to mount the same FlashBlade volume on the same mount path but using an IP address instead of the FQDN, Portworx returned an error for the repeated requests. User Impact: Pods were stuck in the ContainerCreating state. Resolution: Portworx has been updated to recognize and return success for such repeated requests when existing mount points are present. Components: Volume Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-37614 | When a Portworx volume with volumeMode=Block was created from a StorageClass that also had fs or fsType specified, Portworx incorrectly attempted to format the raw block volume with the specified file system.User Impact: Users were unable to use a common StorageClass for creating both block and file volumes. Resolution: Portworx now allows the creation of raw block PVCs even if fs or fsType parameters are specified in the StorageClass.Components: Volume Management Affected Versions: 3.1.2 | Major |
PWX-37282 | HA-Add and HA-level recovery failed on volumes with volume-affinity VPS, as the volume-affinity VPS restricted pool provisioning to certain nodes. User Impact: Users experienced issues such as volumes losing HA after node decommission or HA-Add operations failing. Resolution: The restriction of volume-affinity VPS has been relaxed. Portworx now prioritizes pools that match VPS labels but will select secondary candidate pools under specific conditions, such as during HA increases and when the volume carries the specified VPS labels. This change does not affect VPS validity. Components: Storage Affected Versions: 3.1.x, 3.0.x | Major |
PWX-38539 | The Autopilot config triggered multiple rebalance audit operations for Portworx processes, which overloaded Portworx and resulted in process restarts. User Impact: Users saw alerts indicating Portworx process restarts. Resolution: Portworx now combines multiple rebalance audit triggers into a single execution, minimizing the load on Portworx processes and reducing the likelihood of restarts. Components: Storage Affected Versions: 3.1.2.1 | Major |
PWX-38681 | If there were any bad mounts on the host, volume inspect calls for FlashArray Direct Access volumes would take a long time, as df -h calls would hang.User Impact: Users experienced slowness when running pxctl volume inspect <volId> .Resolution: Portworx now extracts the FlashArray Direct Access volume dev mapper path and runs df -h only on that specific path.Components: CLI and API Affected Versions: 3.1.x, 3.0.x | Major |
PWX-37799 | Portworx restarted when creating a cloud backup due to a KVDB failure. User Impact: If a cloud backup occurred during a KVDB failure, Portworx would unexpectedly restart. Resolution: The nil pointer error causing the restart has been fixed. Now, Portworx raises an alert for backup failure instead of unexpectedly restarting. Components: Cloudsnaps Affected Versions: 3.1.x, 3.0.x | Major |
PWX-39080 | When the Kubernetes API server throttled Portworx requests, in certain scenarios, a background worker thread would hold a lock for an extended period, causing Portworx to assert and restart. User Impact: Portworx asserted and restarted unexpectedly. Resolution: The Kubernetes API calls from the background worker thread have been moved outside the lock's context to prevent the assert. Components: KVDB Affected Versions: 3.2.0 | Major |
PWX-37589 | When Azure users attempted to resize their drives, Portworx performed an online expansion for Azure drives, which did not align with Azure's recommendation to detach drives of 4 TB or smaller from the VM before expanding them. User Impact: Azure drives failed to resize and returned the following error: Message: failed to resize cloud drive to: 6144 due to: compute.DisksClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="BadRequest" Message="Disk of size 4096 GB (<=4096 GB) cannot be resized to 6144 GB (>4096 GB) while it is attached to a running VM. Please stop your VM or detach the disk and retry the operation." Resolution: Portworx now detaches drives of 4 TB or smaller before performing pool expansion, instead of attempting online expansion. Components: Drive & Pool Management Affected Versions: 3.0.x, 3.1.x | Minor |
PWX-36683 | Portworx failed to resolve the correct management IP of the cluster and contacted the Telemetry system using an incorrect IP/port combination. This caused the pxctl status command output to erroneously report Telemetry as Disabled or Degraded. User Impact: Telemetry would sometimes appear to be unhealthy even when it was functioning correctly. This could lead to confusion and misinterpretation of the system's health status. Resolution: The issue was resolved by fixing the logic that chooses the management IP, ensuring that Portworx correctly resolves the management IP of the cluster. This change prevents the system from using the wrong IP/port combination to contact the Telemetry system, thereby ensuring accurate reporting of Telemetry status. Components: Telemetry & Monitoring Affected Versions: 3.0.x, 3.1.x | Minor |
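The PWX-37614 entry above allows one StorageClass to back both filesystem and raw block PVCs. The following is a minimal sketch of that pattern, assuming the pxd.portworx.com CSI provisioner and illustrative object names (px-common-sc, raw-block-pvc):

```sh
# Hypothetical StorageClass that sets a filesystem; with this fix, raw block PVCs created
# from it are no longer formatted with that filesystem.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-common-sc               # illustrative name
provisioner: pxd.portworx.com
parameters:
  repl: "2"
  fs: "ext4"                       # honored only for Filesystem-mode PVCs
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc              # illustrative name
spec:
  accessModes: ["ReadWriteOnce"]
  volumeMode: Block                # raw block volume; the fs parameter is not applied
  storageClassName: px-common-sc
  resources:
    requests:
      storage: 10Gi
EOF
```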
Known issues (Errata)
Issue Number | Issue Description | Severity |
---|---|---|
PD-3505 | EKS users may encounter issues installing Portworx on EKS version 1.30. This version requires the Amazon Linux 2023 (AL2023) kernel, which, in turn, enforces IMDSv2 by default. Workaround:
Affected Versions: 3.0.x, 3.1.x | Critical |
PD-3329 | Provisioning of KubeVirt VM fails if the bootOrder is not specified for the VM disks and the first disk is not a PVC or a DataVolume. Workaround: Specify the bootOrder in the VM spec or ensure that the first disk is a PVC or a DataVolume. Components: KVDB Affected Versions: 3.1.3 | Major |
PD-3324 | Portworx upgrades may fail with Unauthorized errors due to the service account token expiring when the Portworx pod terminates in certain Kubernetes versions. This causes API calls to fail, potentially leading to stuck Kubernetes upgrades. Workaround: Upgrade the Portworx Operator to version 24.2.0 or higher, which automatically issues a new token for Portworx. Components: Install & Uninstall Affected Versions: 3.1.1, 3.2.0 | Major |
PD-3412 | A Kubernetes pod can get stuck in the ContainerCreating state with the error message: MountVolume.SetUp failed for volume "<PV_NAME>" : rpc error: code = Unavailable desc = failed to attach volume: Volume: <VOL_ID> is attached on: <NODE_ID> , where NODE_ID is the Portworx NODE ID of the same node where the pod is trying to be created.Workaround: Restart the Portworx service on the impacted node. Components: Volume Management Affected Versions: 3.2.0 | Major |
PD-3408 | If you have configured IOPS and bandwidth for a FlashArray Direct Access volume, and that volume is snapshotted and later restored into a new volume, the original IOPS and bandwidth settings are not honored. Workaround: Manually set the IOPS and bandwidth directly on the FlashArray for the restored volume. Components: Volume Management Affected Versions: 3.1.4, 3.2.0 | Major |
PD-3434 | During node decommission, if a node is rebooted, it can enter a state where the node spec has been deleted, but the associated cloud drive has not been cleaned up. If this node is recommissioned, the Portworx reboot fails because both the previous and current drivesets are attached to the node. Workaround:
Components: Drive & Pool Management Affected Versions: 3.2.0 | Major |
PD-3409 | When a user creates a journal device as a dedicated cloud drive and creates the storage pool using the pxctl sv add-drive command, the cloud drives are not automatically deleted when the storage pool is deleted. Workaround: Manually remove the drives after deleting the pool. Components: Drive & Pool Management Affected Versions: 3.2.0 | Major |
PD-3416 | When you change the zone or any labels on an existing Portworx storage node with cloud drives, Portworx may fail to start on that node. If the labels are changed, the driveset associated with the old zone might become orphaned, and a new storage driveset may be created. Workaround: To change topology labels on existing storage nodes, contact Portworx support for assistance. Components: Drive & Pool Management Affected Versions: 3.2.0 | Major |
PD-3496 | For Portworx installations using FlashArray Direct Access without a Realm specified: if the user clones a volume that is inside a FlashArray pod to a new volume that is not in a FlashArray pod, the cloned volume appears to be bound but might not be attachable. Workaround: Include the parameter pure_fa_pod_name: "" in the StorageClass of the cloned volumes (see the sketch after this table). Components: Drive & Pool Management Affected Versions: 3.2.0 | Major |
PD-3494 | In a vSphere local mode installation environment, users may encounter incorrect alerts stating that cloud drives were moved to a datastore lacking the expected prefix (e.g., local-i ) when performing Storage vMotion of VMDKs associated with specific VMs. Workaround: This alert can be safely ignored. Components: Drive & Pool Management Affected Versions: 3.2.0 | Major |
PD-3365 | When you run the drop_cache service on Portworx nodes, it can cause Portworx to fail to start due to known issues in the kernel. Workaround: Avoid running the drop_cache service on Portworx nodes. Components: Storage Affected Versions: 3.1.4, 3.2.0 | Minor |
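The following is a hedged sketch of the PD-3496 workaround above. The backend: "pure_block" parameter and the StorageClass name are assumptions for illustration; verify the full parameter set against the FlashArray Direct Access documentation:

```sh
# Hypothetical StorageClass for cloned/restored FlashArray Direct Access volumes.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-fada-clone-sc           # illustrative name
provisioner: pxd.portworx.com
parameters:
  backend: "pure_block"            # FlashArray Direct Access backend (assumed parameter value)
  pure_fa_pod_name: ""             # PD-3496 workaround: keep the clone outside any FlashArray pod
EOF
```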
3.1.8
January 28, 2025
Visit these pages to see if you're ready to upgrade to this version:
Note
This version addresses security vulnerabilities.
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-41332 | When running Portworx in debug mode with FlashBlade, certain log entries displayed extraneous information under rare conditions. User Impact: Unwanted information appeared in the log entries. Resolution: Portworx has been enhanced to ensure that only relevant information is displayed. Affected Versions: 3.1.x, 3.0.x, 2.13.x Component: Volume Management | Major |
PWX-41329 PWX-41480 | When executing a few commands, extraneous information was displayed in their output. User Impact: Unwanted information appeared in the output of certain commands. Resolution: Portworx has been enhanced to ensure that only relevant information is displayed. Affected Versions: 3.1.x, 3.0.x, 2.13.x Component: CLI & API | Major |
3.1.7
December 3, 2024
Visit these pages to see if you're ready to upgrade to this version:
Note
This version addresses security vulnerabilities.
3.1.6.1
November 13, 2024
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-39990 | As part of node statistics collection, Portworx read the timestamp data stats while its storage component was updating them at the same time, leading to data conflicts. User Impact: The Portworx storage component restarted due to an invalid memory access issue. Resolution: A lock mechanism has been added to manage concurrent reads and writes to the timestamp data, preventing conflicts. Affected Versions: 3.1.0 Component: Storage | Critical |
3.1.6
October 02, 2024
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-38930 | For PX-StoreV2 deployments with volumes that had a replica factor greater than 1 and were either remotely attached or not accessed through PX-Fast PVCs, if a power loss, kernel panic, or ungraceful node reboot occurred, the data was incorrectly marked as stable due to buffering in the underlying layers, despite being unstable. User Impact: In these rare situations, this issue can mark PVC data as unstable. Resolution: Portworx now correctly marks the data as stable , preventing this problem. Components: PX-StoreV2 Affected Versions: 2.13.x, 3.0.x, 3.1.x | Critical |
3.1.5
September 19, 2024
Visit these pages to see if you're ready to upgrade to this version:
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-38849 | For Sharedv4 volumes, users can now apply the disable_others=true label to limit the mountpoint and export path permissions to 0770 , effectively removing access for other users and enhancing the security of the volumes. | Volume Management |
PWX-38791 | The FlashArray Cloud Drive volume driveset lock logic has been improved to ensure the driveset remains locked to its original node, which can otherwise detach due to a connection loss to the FlashArray during a reboot, preventing other nodes from claiming it:
| Drive & Pool Management |
PWX-38714 | During the DriveSet check, if device mapper devices are detected, Portworx cleans them before mounting FlashArray Cloud Drive volumes. This prevents mounting issues during failover operations on a FlashArray Cloud Drive volume. | Drive & Pool Management |
PWX-37642 | The logic for the sharedv4 mount option has been improved:
| Sharedv4 Volumes |
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-36679 | Portworx could not perform read or write operations on Sharedv4 volumes if NFSD version 3 was disabled in /etc/nfs.conf .User Impact: Read or write operations failed on Sharedv4 volumes. Resolution: Portworx no longer depends on the specific enabled NFSD version and now only checks if the service is running. Components: Shared Volumes Affected Versions: 3.1.0 | Major |
PWX-38888 | In some cases, when a FlashArray Direct Access volume failed over between nodes, Portworx version 3.1.4 did not properly clean up the mount path for these volumes. User Impact: Application pods using FlashArray Direct Access volumes were stuck in the Terminating state.Resolution: Portworx now properly handles the cleanup of FlashArray Direct Access volume mount points during failover between nodes. Components: Volume Management Affected Versions: 3.1.4 | Minor |
3.1.4
August 15, 2024
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-37590 | Users running on environments with multipath version 0.8.8 and using FlashArray devices, either as Direct Access Volumes or Cloud Drive Volumes, may have experienced issues with the multipath device not appearing in time. User Impact: Users saw Portworx installations or Volume creation operations fail. Resolution: Portworx is now capable of running on multipath version 0.8.8. Components: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
3.1.3
July 16, 2024
Visit these pages to see if you're ready to upgrade to this version:
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-37576 | Portworx has significantly reduced the number of vSphere API calls during the booting process and pool expansion. | Drive & Pool Management |
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-37870 | When PX-Security is enabled on a cluster that is also using Vault for storing secrets, the in-tree provisioner (kubernetes.io/portworx-volume) fails to provision a volume. User Impact: PVCs became stuck in a Pending state with the following error: failed to get token: No Secret Data found for Secret ID . Resolution: Use the CSI provisioner (pxd.portworx.com) to provision the volumes on clusters that have PX-Security enabled. Components: Volume Management Affected Versions: 3.0.3, 3.1.2 | Major |
PWX-37799 | A KVDB failure sometimes caused Portworx to restart when creating cloud backups. User Impact: Users saw Portworx restart unexpectedly. Resolution: Portworx now raises an alert, notifying users of a backup failure instead of unexpectedly restarting. Components: Cloudsnaps Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-37661 | If the credentials provided in px-vsphere-secret were invalid, Portworx failed to create a Kubernetes client, and the process would restart every few seconds, leading to continuous login failures. User Impact: Users saw a large number of client creation attempts, which may have led to the credentials being blocked or too many API calls. Resolution: If the credentials are invalid, Portworx now waits for the secret to be changed before trying to log in again. Components: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-37339 | Sharedv4 service failover did not work correctly when a node had a link-local IP from the subnet 169.254.0.0/16. In clusters running OpenShift 4.15 or later, Kubernetes nodes may have a link-local IP from this subnet by default. User Impact: Users saw disruptions in applications utilizing sharedv4-service volumes when the NFS server node went down. Resolution: Portworx has been improved to prevent VM outages in such situations. Components: Sharedv4 Affected Versions: 3.1.0.2 | Major |
3.1.2.1
July 8, 2024
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-37753 | Portworx reloaded and reconfigured VMs on every boot, which is a costly activity in vSphere. User Impact: Users saw a significant number of VM reload and reconfigure activities during Portworx restarts, which sometimes overwhelmed vCenter. Resolution: Portworx has been optimized to minimize unnecessary reload and reconfigure actions for VMs. Now, these actions are mostly triggered only once during the VM's lifespan. Component: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-35217 | Portworx maintained two vSphere sessions at all times. These sessions would become idle after Portworx restarts, and vSphere would eventually clean them up. vSphere counts idle sessions toward its session limits, which caused an issue if all nodes restarted simultaneously in a large cluster. User Impact: In large clusters, users encountered the 503 Service Unavailable error if all nodes restarted simultaneously.Resolution: Portworx now actively terminates sessions after completing activities like boot and pool expansion. Note that in rare situations where Portworx might not close the sessions, users may still see idle sessions. These sessions are cleaned by vSphere based on the timeout settings of the user's environment. Component: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-36727 | When a user decommissioned a node, Portworx processed the node deletion in the background. For every volume delete or update operation, it checked whether all nodes marked as decommissioned had no references to these volumes, which made node deletion very slow. User Impact: The Portworx cluster went down as the KVDB node timed out. Resolution: The logic for decommissioning nodes has been improved to prevent such situations. Component: KVDB Affected Versions: 3.1.x, 3.0.x, 2.13.x | Minor |
3.1.2
June 19, 2024
Visit these pages to see if you're ready to upgrade to this version:
New features
Portworx by Pure Storage is proud to introduce the following new features:
- Customers can now migrate legacy shared volumes to sharedv4 service volumes.
- For FlashBlade Direct Access volumes, users can provide multiple NFS endpoints using the pure_nfs_endpoint parameter. This is useful when the same FlashBlade is shared across different zones in a cluster (see the sketch after this list).
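As a rough sketch of the pure_nfs_endpoint parameter mentioned above, assuming a FlashBlade Direct Access StorageClass with the backend: "pure_file" parameter and an illustrative endpoint value; the exact syntax for listing multiple endpoints should be taken from the Portworx documentation:

```sh
# Hypothetical StorageClass for FlashBlade Direct Access volumes with an explicit NFS endpoint.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-fbda-sc                 # illustrative name
provisioner: pxd.portworx.com
parameters:
  backend: "pure_file"             # FlashBlade Direct Access backend (assumed parameter value)
  # Illustrative single endpoint; the release note says multiple endpoints are supported,
  # so check the Portworx documentation for the exact multi-endpoint format.
  pure_nfs_endpoint: "10.0.0.10"
EOF
```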
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-33044 | Portworx will perform additional live VM migrations to ensure a KubeVirt VM always uses the block device directly by running the VM on the volume coordinator node. | Sharedv4 |
PWX-23390 | Stork will now raise events on a pod or VM object if it fails to schedule them in a hyperconverged fashion. | Stork and DR |
PWX-37113 | In KubeVirt environments, Portworx no longer triggers RebalanceJobStarted and RebalanceJobFinished alarms every 15 minutes due to the KubeVirt fix-vps job. Alarms are now raised only when the background job is moving replicas. | Storage |
PWX-36600 | The output of the rebalance HA-update process has been improved to display the state of each action during the process. | Storage |
PWX-36854 | The output of the pxctl volume inspect command has been improved. The Kind field can now be left empty inside the claimRef , allowing the output to include application pods that are using the volumes. | Storage |
PWX-33812 | Portworx now supports Azure PremiumV2_LRS and UltraSSD_LRS disk types. | Drive and Pool Management |
PWX-36484 | A new query parameter ce=azure has been added for Azure users to identify the cloud environment being used. The parameter ensures that the right settings and optimizations are applied based on the cloud environment. | Install |
PWX-36714 | The timeout for switching licenses from floating to Portworx Enterprise has been increased, avoiding timeout failures. | Licensing |
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-36869 | When using a FlashArray on Purity 6.6.6 with NVMe-RoCE, a change in the REST API resulted in a deadlock in Portworx. User Impact: FlashArray Direct Access attachment operations never completed, and FlashArray Cloud Drive nodes failed to start. Resolution: Portworx now properly handles the changed API for NVMe and does not enter a deadlock. Component: FA-FB Affected Versions: 3.1.x, 3.0.x, 2.13.x | Critical |
PWX-37059 | In disaggregated mode, storageless nodes restarted every few minutes attempting to claim the storage driveset and ended up being unsuccessful. User Impact: Due to storageless node restarts, some customer applications experienced IO disruption. Resolution: When a storage node goes down, Portworx now stops storageless nodes in a disaggregated deployment from restarting, preventing them from claiming the storage driveset. Component: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-37351 | If the drive paths changed due to a node restart or a Portworx upgrade, it led to a storage down state on the node. User Impact: Portworx failed to restart because of the storage down state. Components: Drive & Pool Management Affected Versions: 3.1.0.3, 3.1.1.1 | Major |
PWX-36786 | An offline storageless node was auto-decommissioned under certain race conditions, making the cloud-drive driveset orphaned. User Impact: When Portworx started as a storageless node using this orphaned cloud-drive driveset, it failed to start since the node's state was decommissioned. Resolution: Portworx now auto-cleans such orphaned storageless cloud-drive drivesets and starts successfully. Component: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-36887 | When one of the internal KVDB nodes was down for several minutes, Portworx added another node to the KVDB cluster. Portworx initially added the new KVDB member as a learner. If, for some reason, KVDB connectivity was lost for more than a couple of minutes after adding the learner, the learner stayed in the cluster and prevented a failover to a different KVDB node. User Impact: The third node was not able to join the KVDB cluster with the error Peer URLs already exists. KVDB continued to run with only two members.Resolution: When Portworx encounters the above error, it removes the failed learner from the cluster, thereby allowing the third node to join. Component: Internal KVDB Affected Versions: 3.0.x, 3.1.1 | Major |
PWX-36873 | When Portworx was using HashiCorp's Vault configured with Kubernetes or AppRole authentication, it attempted to automatically refresh the access tokens when they expired. If the Kubernetes Service Account was removed or the AppRole expired, the token-refresh kept failing, and excessive attempts to refresh it caused a crash of the Vault service on large clusters. User Impact: The excessive attempts to refresh the tokens caused a crash of the Vault service on large clusters. Resolution: Portworx nodes now detect excessive errors from the Vault service and will avoid accessing Vault for the next 5 minutes. Component: Volume Management Affected Versions: 3.0.5, 3.0.3 | Major |
PWX-36601 | Previously, the default timeout for rebalance HA-update actions was 30 minutes. This duration was insufficient for some very slow setups, resulting in HA-update failures. User Impact: The rebalance job for HA-update failed to complete. In some cases, the volume's HA-level changed unexpectedly. Resolution: The default rebalance HA-update timeout has been increased to 5 hours. Components: Storage Affected Versions: 2.13.x, 3.0.x, 3.1.x | Major |
PWX-35312 | In version 3.1.0, a periodic job that fetched drive properties caused an increase in the number of API calls across all platforms. User Impact: The API rate limits approached their maximum capacity more quickly, stressing the backend. Resolution: Portworx improved the system to significantly reduce the number of API calls on all platforms. Component: Cloud Drives Affected Versions: 3.1.0 | Major |
PWX-30441 | For AWS users, Portworx did not update the drive properties for gp2 drives that were converted to gp3 drives. User Impact: Because the IOPS of such drives changed but were not updated, pool expansion failed on these drives. Resolution: During the maintenance cycle that is required for converting gp2 drives to gp3, Portworx now refreshes the disk properties of these drives (see the sketch after this table). Component: Cloud Drives Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-36139 | During pool expansion with the add-drive operation using the CSI provider on a KVDB node, there is a possibility of the new drive getting the StorageClass of the KVDB drive instead of the data drive, if they are different.User Impact: In such a case, a drive might have been added but the pool expansion operation failed, causing some inconsistencies. Resolution: Portworx takes the StorageClass of only the data drives present in the node. Component: Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Minor |
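As a hedged sketch of the maintenance cycle referenced in PWX-30441 above, assuming the standard pxctl service maintenance commands and an operator-driven gp2-to-gp3 conversion:

```sh
# Illustrative flow on a node whose gp2 drives are being converted to gp3.
pxctl service maintenance --enter    # put the node into maintenance mode
# ...perform or confirm the gp2 -> gp3 conversion from the AWS console or CLI...
pxctl service maintenance --exit     # Portworx refreshes the drive properties on exit
pxctl status                         # verify the node is back online
```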
Known issues (Errata)
Issue Number | Issue Description | Severity |
---|---|---|
PD-3031 | For an Azure cluster with storage and storageless nodes using Premium LRS or SSD drive types, when a user updates the Portworx StorageClass to use PremiumV2 LRS or Ultra SSD drive types, the changes might not be reflected on the existing nodes. Workaround: StorageClass changes will apply only to new nodes added to the cluster. For existing nodes, perform the following steps:
Affected versions: 3.1.2 | Major |
PD-3012 | If maxStorageNodesPerZone is set to a value greater than the current number of worker nodes in an AKS cluster, additional storage nodes in an offline state may appear post-upgrade due to surge nodes. Workaround: Manually delete any extra storage node entries created during the Kubernetes cluster upgrade by following the node decommission process. Components: Cloud Drives Affected versions: 2.13.x, 3.0.x, 3.1.x | Major |
PD-3013 | Pool expansion may fail if a node is rebooted before the expansion process is completed, displaying errors such as drives in the same pool not of the same type . Workaround: Retry the pool expansion on the impacted node. Components: Drive and Pool Management Affected versions: 3.1.2 | Major |
PD-3035 | Users may encounter issues with migrations of legacy shared volumes to sharedv4 service volumes appearing stuck if performed on a decommissioned node. Workaround: If a node is decommissioned during a migration, the pods running on that node must be forcefully terminated to allow the migration to continue. Component: Sharedv4 Volumes Affected version: 3.1.2 | Major |
PD-3030 | In environments where multipath is used to provision storage disks for Portworx, incorrect shutdown ordering may occur, causing multipath to shut down before Portworx. This can lead to situations where outstanding IOs from applications, still pending in Portworx, may fail to reach the storage disk. Workaround:
Affected Versions: 3.1.2 | Major |
3.1.1
April 03, 2024
Visit these pages to see if you're ready to upgrade to this version:
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-35939 | For DR clusters, the cluster domain of the nodes is exposed in the node inspect and node enumerate SDK responses. This information is used by the operator to create the pod disruption budget, preventing loss during Kubernetes upgrades. | DR and Migration |
PWX-35395 | When Portworx encounters errors such as a checksum mismatch or bad disk sectors while reading data from the backend disk, the IOOperationWarning alert is raised. This alert is tracked by the metric px_alerts_iooperationwarning_total . | Storage |
PWX-35738 | Portworx now queries an optimized subset of VMs to determine the driveset to attach, avoiding potential errors during an upgrade where a transient state of a VM could have resulted in an error during boot. | Cloud Drives |
PWX-35397 | The start time for Portworx on both Kubernetes and vSphere platforms has been significantly reduced by eliminating repeated calls to the Kubernetes API and vSphere servers. | Cloud Drives |
PWX-35042 | The Portworx CLI has been enhanced with the following improvements:
| Cloud Drives |
PWX-33493 | For pool expansion operations with the pxctl sv pool expand command, the add-disk and resize-disk flags have been renamed to add-drive and resize-drive , respectively. The command continues to support the old flags for compatibility (see the example after this table). | Cloud Drives |
PWX-35351 | The OpenShift Console now displays the Used Space for CSI sharedV4 volumes. | Sharedv4 |
PWX-35187 | Customers can now obtain the list of Portworx images from the spec generator. | Spec Generator |
PWX-36543 | If the current license is set to expire within the next 60 days, Portworx now automatically updates the IBM Marketplace license to a newer one upon the restart of the Portworx service. | Licensing |
PWX-36496 | The error messages for pxctl license activate have been improved to return a more appropriate error message in case of double activation. | Licensing |
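A short example of the renamed operations from PWX-33493 above, with an illustrative pool UID placeholder and a target size in GiB:

```sh
# Grow the existing drives in a pool (formerly --operation resize-disk).
pxctl sv pool expand --uid <pool-uid> --size 500 --operation resize-drive

# Add a drive to a pool (formerly --operation add-disk).
pxctl sv pool expand --uid <pool-uid> --size 500 --operation add-drive
```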
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-36416 | When a PX-StoreV2 pool reached its full capacity and could not be expanded further using the resize-drive option, it went offline due to a pool full condition. User Impact: If pool capacity reached a certain threshold, the pool went offline. Resolution: PX-StoreV2 pools cannot be expanded using the add-drive operation. Instead, you can increase the capacity on a node by adding new pools to it:
Affected Versions: 3.0.0 | Critical |
PWX-36344 | A deadlock in the Kubernetes Config lock led to failed pool expansion. User Impact: Customers needed to restart Portworx if pool expansion became stuck. Resolution: An unbuffered channel that resulted in a deadlock when written to in a very specific window is now changed to have a buffer, breaking the deadlock. Components: Pool Management Affected Versions: 2.13.x, 3.0.x | Major |
PWX-36393 | Occasionally, Portworx CLI binaries were installed incorrectly due to issues (e.g., read/write errors) that the installation process failed to detect, causing the Portworx service to not start. User Impact: Portworx upgrade process failed. Resolution: Portworx has improved the installation process by ensuring the correct installation of CLI commands and detecting these errors during the installation. Components: Install Affected Versions: 2.13.x, 3.0.x | Major |
PWX-36339 | For a sharedv4 service pod, there was a race condition where the cached mount table failed to reflect the unmounting of the path. User Impact: Pod deletion got stuck in the Terminating state, waiting for the underlying mount point to be deleted. Resolution: Force refresh of cache for an NFS mount point if it is not attached and is already unmounted. This will ensure that the underlying mount path gets removed and the pod terminates cleanly. Components: Sharedv4 Affected versions: 2.13.x, 3.0.x | Major |
PWX-36522 | When FlashArray Direct Access volumes and FlashArray Cloud Drive volumes were used together, the system couldn't mount the PVC due to an Invalid arguments for mount entry error, causing the related pods to not start. User Impact: Application pods failed to start. Resolution: The mechanism to populate the mount table on restart has been changed to ensure an exact device match rather than a prefix-based search, addressing the root cause of the incorrect mount entries and subsequent failures. Components: Volume Management Affected version: 3.1.0 | Major |
PWX-36247 | The field portworx.io/misc-args had an incorrect value of -T dmthin instead of -T px-storev2 to select the backend type. User Impact: Customers had to manually change this argument to -T px-storev2 after generating the spec from the spec generator. Resolution: The value for the field has been changed to -T px-storev2 . Components: FA-FB Affected version: 3.1.0 | Major |
PWX-35925 | When downloading the air-gapped bootstrap script for an OEM release (e.g., px-essentials ), the script used an incorrect URL for the Portworx images. User Impact: The air-gapped bootstrap script fetched the incorrect Portworx image, particularly for Portworx Essentials. Resolution: The air-gapped bootstrap script has been fixed and now correctly handles OEM release images. Components: Install Affected version: 2.13.x, 3.0.x | Major |
PWX-35782 | In a synchronous DR setup, a node repeatedly crashed during a network partition because Portworx attempted to operate on a node from the other domain that was offline and unavailable. User Impact: In the event of a network partition between the two domains, temporary node crashes could occur. Resolution: Portworx now avoids nodes from the other domain that are offline or unavailable. Components: DR and Migration Affected version: 3.1.0 | Major |
PWX-36500 | Older versions of Portworx installations with FlashArray Cloud Drive displayed an incorrect warning message in the pxctl status output on RHEL 8.8 and above OS versions, even though the issue had been fixed in the multipathd package that comes with these OS versions.User Impact: With Portworx version 2.13.0 or above, users on RHEL 8.8 or higher who were using FlashArray Cloud Drives saw the following warning in the pxctl status output: WARNING: multipath version 0.8.7 (between 0.7.7 and 0.9.3) is known to have issues with crashing and/or high CPU usage. If possible, please upgrade multipathd to version 0.9.4 or higher to avoid this issue .Resolution: The output of pxctl status has been improved to display the warning message for the correct RHEL versions.Components: FA-FB Affected version: 2.13.x, 3.0.x, 3.1.0 | Major |
PWX-33030 | For FlashArray Cloud Drives, when the skip_kpartx flag was set in the multipath config, the partition mappings for device mapper devices did not load, preventing Portworx from starting correctly. User Impact: This resulted in a random device (either a child or a parent/dm device) with the UUID label being selected and attempted to be mounted. If a child device was chosen, the mount would fail with a Device is busy error. Resolution: Portworx now avoids such a situation by modifying the specific unbuffered channel to include a buffer, thus preventing the deadlock. Components: FA-FB Affected version: 2.13.x, 3.0.x | Minor |
3.1.0.1
March 20, 2024
Visit these pages to see if you're ready to upgrade to this version:
This is a hotfix release intended for IBM Cloud customers. Please contact the Portworx support team for more information.
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-36260 | When installing Portworx version 3.1.0 from the IBM Marketplace catalog, the PX-Enterprise IBM Cloud license for a fresh installation is valid until November 30, 2026. However, for existing clusters that were running older versions of Portworx and upgraded to 3.1.0, the license did not automatically update to reflect the new expiry date of November 30, 2026.User Impact: With the old license expiring on April 2, 2024, Portworx operations could be affected after this date. Resolution: To extend the license until November 30, 2026, follow the instructions on the Upgrading Portworx on IBM Cloud via Helm page to update to version 3.1.0.1. Components: Licensing Affected versions: 2.13.x, 3.0.x, 3.1.0 | Critical |
3.1.0
January 31, 2024
Visit these pages to see if you're ready to upgrade to this version:
Starting with version 3.1.0:
- Portworx CSI for FlashArray and FlashBlade license SKU will only support Direct Access volumes and no Portworx volumes. If you are using Portworx volumes, reach out to the support team before upgrading Portworx.
- Portworx Enterprise will exclusively support kernel versions 4.18 and above.
New features
Portworx by Pure Storage is proud to introduce the following new features:
- The auto_journal profile is now available to detect the IO pattern and determine whether the journal IO profile is beneficial for an application. This detector analyzes the incoming write IO pattern to ascertain whether the journal IO profile would improve the application's performance. It continuously analyzes the write IO pattern and toggles between the none and journal IO profiles as needed.
- A dynamic labeling feature is now available, allowing Portworx users to label Volume Placement Strategies (VPS) flexibly and dynamically. Portworx now supports the use of dynamic labeling through the inclusion of ${pvc.labels.labelkey} in values (see the sketch after this list).
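A minimal sketch of the dynamic labeling feature described above, assuming a VolumePlacementStrategy that keys off a hypothetical app label on the provisioning PVC; verify the CRD group/version and rule shape against the Volume Placement Strategy documentation for your Portworx version:

```sh
# Hypothetical VolumePlacementStrategy that resolves a value from the provisioning PVC's labels.
cat <<'EOF' | kubectl apply -f -
apiVersion: portworx.io/v1beta2
kind: VolumePlacementStrategy
metadata:
  name: per-app-anti-affinity      # illustrative name
spec:
  volumeAntiAffinity:
    - matchExpressions:
        - key: app
          operator: In
          values:
            - "${pvc.labels.app}"  # substituted from the PVC's "app" label at provisioning time
EOF
```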
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-31558 | Google Anthos users can now generate the correct Portworx spec from Portworx Central, even when storage device formats are incorrect. | Spec Generation |
PWX-28654 | Added the NonQuorumMember flag to the node inspect and Enumerate SDK API responses. This flag provides an accurate value depending on whether a node contributes to cluster quorum. | SDK/gRPC |
PWX-31945 | Portworx now provides an internal API for listing all storage options on the cluster. | SDK/gRPC |
PWX-29706 | Portworx now supports a new streaming Watch API that provides updates on volume information that has been created, modified, or deleted. | SDK/gRPC |
PWX-35071 | Portworx now distinguishes between FlashArray and FlashBlade calls, routing them to appropriate backends based on the current volume type (file or block), thereby reducing the load on FlashArray or FlashBlade backends. | FA-FB |
PWX-34033 | For FlashArray and FlashBlade integrations, many optimizations have been made in caching and information sharing, resulting in a significant reduction in number of REST calls made to the backing FlashArray and FlashBlade. | FA-FB |
PWX-35167 | The default timeout for the FlashBlade Network Storage Manager (NSM) lock has been increased to prevent Portworx restarts. | FA-FB |
PWX-30083 | Portworx now manages the TTL for alerts instead of relying on etcd's key expiry mechanism. | KVDB |
PWX-33430 | The error message displayed when a KVDB lock times out has been made more verbose to provide a better explanation. | KVDB |
PWX-34248 | The sharedv4 parameter in a StorageClass enables users to choose between sharedv4 and non-shared volumes:
| Sharedv4 |
PWX-35113 | Users can now enable the forward-nfs-attach-enable storage option for applications using sharedv4 volumes. This allows Portworx to attach a volume to the most suitable available nodes. | Sharedv4 |
PWX-32278 | On the destination cluster, all snapshots are now deleted during migration when the parent volume is deleted. | Stork |
PWX-32260 | The resize-disk option for pool expansion is now also available on TKGS clusters. | Cloud Drives |
PWX-32259 | Portworx now uses cloud provider identification by reusing the provider's singleton instance, avoiding repetitive checks if the provider type is already specified in the cluster spec. | Cloud Drives |
PWX-35428 | In environments with slow vCenter API responses, Portworx now caches specific vSphere API responses, reducing the impact of these delays. | Cloud Drives |
PWX-33561 | When using the PX-StoreV2 backend, Portworx now detaches partially attached drivesets for cloud drives only when the cloud drives are not mounted. | Cloud Drives |
PWX-33042 | In a disaggregated deployment, storageless nodes can be converted to storage nodes by changing the node label to portworx.io/node-type=storage (see the example after this table). | Cloud Drives |
PWX-28191 | AWS credentials for Drive Management can now be provided through a Kubernetes secret px-aws in the same namespace where Portworx is deployed. | Cloud Drives |
PWX-34253 | Azure users will now see accurate storage type displays: Premium_LRS is identified as SSD, and NVME storage is also correctly represented. | Cloud Drives |
PWX-31808 | Pool deletion is now allowed for vSphere cloud drives. | Cloud Drives |
PWX-32920 | vSphere drives can now be resized up to a maximum of 62 TB per drive. | Pool Management |
PWX-32462 | Portworx now permits most overlapping mounts and will only reject overlapping mounts if a bidirectional (i.e., shared) parent directory mount is present. | px-runc |
PWX-32905 | Portworx now properly detects the NFS service on OpenShift platforms. | px-runc |
PWX-35292 | To reduce log volume in customer clusters, logs generated when a volume is not found during CSI mounting have been moved to the TRACE level. | CSI |
PWX-34995 | Portworx CSI for FlashArray and FlashBlade license SKU now counts Portworx and FA/FB drives separately based on the drive type. | Licensing |
PWX-35452 | The mount mapping's lock mechanism has been improved to prevent conflicts between unmount and mount processes, ensuring more reliable pod start-ups. | Volume Management |
PWX-33577 | The fstrim operation has been improved for efficiency:
| Storage |
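A brief example of the node-type conversion from PWX-33042 above, using an illustrative node name placeholder:

```sh
# Convert a storageless node to a storage node in a disaggregated deployment
# by relabeling it; <node-name> is a placeholder.
kubectl label node <node-name> portworx.io/node-type=storage --overwrite

# Confirm the label was applied.
kubectl get node <node-name> --show-labels | grep portworx.io/node-type
```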
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-31652 | Portworx was unable to identify the medium for the vSphere cloud drives. User Impact: Portworx deployment failed on vSphere with cloud drives. Resolution: Portworx now correctly identifies the drive medium type and can be deployed on a cluster with vSphere cloud drives. Components: Drive & Pool Management Affected Versions: 2.13.x | Critical |
PWX-35430 | Requests for asynchronous DR migration operations were previously load balanced to nodes that were not in the same cluster domain. User Impact: In hybrid DR setups, such as one where cluster A is synchronously paired with cluster B, and cluster B is asynchronously paired with cluster C, any attempts to migrate from Cluster B to Cluster C would result in failure, showing an error that indicates a BackupLocation not found .Resolution: Portworx now ensures that migration requests are load balanced within nodes in the same cluster domain as the initial request. Components: DR and Migration Affected Versions: 3.0.4 | Critical |
PWX-35277 | In an asynchronous DR deployment, if security/auth is enabled in a Portworx cluster, migrations involving multiple volumes would fail with authentication errors. User Impact: Migrations in asynchronous DR setups involving multiple volumes failed with authentication errors. Resolution: Authentication logic has been modified to handle migrations involving multiple volumes on the auth enabled clusters. Components: DR and Migrations Affected versions: 3.0.0 | Critical |
PWX-34369 | When using HTTPS endpoints for cluster pairing, Portworx incorrectly parsed the HTTPS URL scheme. User Impact: Cluster pairing would fail when using an HTTPS endpoint. Resolution: Portworx has now corrected the HTTPS URL parsing logic. Components: DR and Migration Affected versions: 3.0.0 | Critical |
PWX-35466 | Cloudsnaps or asynchronous DR operations failed when attempted from a metro cluster due to inaccessible credentials. This issue specifically occurred if the credential was not available from both domains of the metro cluster. User Impact: Cloudsnap operations or asynchronous DR from metro clusters could fail if the required credentials were not accessible in both domains. Resolution: Portworx now detects a coordinator node that has access to the necessary credentials for executing cloudsnaps or asynchronous DR operations. Components: DR and Migration Affected versions: 3.0.4 | Critical |
PWX-35324 | FlashArray Direct Access volumes are formatted upon attachment. All newly created volumes remain in a pending state until they are formatted. If Portworx was restarted before a volume had been formatted, it would delete the volume that was still in the pending state. User Impact: The newly created FlashArray Direct Access volumes were deleted. Resolution: Portworx now avoids deleting volumes that are in the pending state. Components: FA-FB Affected versions: 3.0.x | Critical |
PWX-35279 | Upon Portworx startup, if there were volumes attached from a FlashArray that was not registered in the px-pure-secret , Portworx would detach them as part of a cleanup routine.User Impact: Non-Portworx disks, including boot drives and other FlashArray volumes, were mistakenly detached from the node and required reconnection. Resolution: Portworx no longer cleans up healthy FlashArray volumes on startup. Components: FA-FB Affected versions: 2.13.11, 3.0.0, 3.0.4 | Critical |
PWX-34377 | Portworx was incorrectly marking FlashBlade Direct Attach volumes as being transitioned to read-only status. This incorrect identification led to a restart of all pods associated with these volumes. User Impact: The restart of running pods resulted in application restarts or failures. Resolution: Checks within Portworx that were leading to false identification of read-only transitions for FlashBlade volumes have been fixed. Components: FA-FB Affected versions: 3.0.4 | Critical |
PWX-32881 | The CSI driver failed to register after the Anthos storage validation test suite was removed and a node was re-added to the cluster. User Impact: The CSI server was unable to restart if the Unix domain socket had been deleted. Resolution: The CSI server now successfully restarts and restores the Unix domain socket, even if the socket has been deleted. Update to this version if your workload involves deleting the kubelet directory during node decommissioning.Components: CSI Affected versions: 3.0.0 | Critical |
PWX-31551 | The latest OpenShift installs have stricter SELinux policies, which prevent non-privileged pods from accessing the csi.sock CSI interface file. User Impact: Portworx install failed. Resolution: All Portworx CSI pods are now configured as privileged pods. Components: oci-monitor Affected versions: 2.13.x, 3.0.x | Critical |
PWX-31842 | On TKGI clusters, if Portworx service and pods were restarted, it led to excessive mounts (mount-leaks). User Impact: The IO operations on the node would progressively slow down, until the host would completely hang. Resolution: The mountpoints that are used by Portworx have been changed. Components: oci-monitor Affected versions: 2.1.1 | Critical |
PWX-35603 | When running Portworx on older Linux systems (specifically those using GLIBC 2.31 or older) in conjunction with newer versions of Kubernetes, Portworx previously failed to detect dynamic updates of pod credentials and tokens, which led to Unauthorized errors when utilizing Kubernetes client APIs. User Impact: Users could encounter Unauthorized errors when using Kubernetes client APIs. Resolution: Dynamic token updates are now processed correctly by Portworx. Components: oci-monitor Affected versions: 3.0.1 | Critical |
PWX-34250 | If encryption was applied on both the client side (using an encryption passphrase) and the server side (using Server-Side Encryption, SSE) for creating credential commands, this approach failed to configure S3 storage in Portworx to use both encryption methods. User Impact: Configuration of S3 storage would fail in the above mentioned condition. Resolution: Users can now simultaneously use both server-side and client-side encryption when creating credentials for S3 or S3-compatible object stores. Components: Cloudsnaps Affected versions: 3.0.2, 3.0.3, 3.0.4 | Critical |
PWX-22870 | Portworx installations would by default automatically attempt to install NFS packages on the host system. However, since NFS packages add new users/groups, they were often blocked on Red Hat Enterprise Linux / CentOS platforms with SELinux enabled. User Impact: Sharedv4 volumes failed to attach on platforms with SELinux enabled. Resolution: Portworx installation is now more persistent on Red Hat Enterprise Linux / CentOS platforms with SELinux enabled. Components: IPV6 Affected versions: 2.5.4 | Major |
PWX-35332 | Concurrent access to an internal data structure containing NFS export entries resulted in a Portworx node crashing with fatal error: concurrent map read and map write in knfs.HasExports . User Impact: This issue triggered a restart of Portworx on that node. Resolution: A lock mechanism has been implemented to prevent this issue. Components: Sharedv4 Affected versions: 2.10.0 | Major |
PWX-34865 | When upgrading Portworx from version 2.13 (or older) to version 3.0 or newer, the internal KVDB version was also updated. If there was a KVDB membership change during the upgrade, the internal KVDB lost quorum in some corner cases. User Impact: The internal KVDB lost quorum, enforcing Portworx upgrade of a KVDB node that was still on an older Portworx version. Resolution: In some cases, Portworx now chooses a different mechanism for the KVDB membership change. Components: KVDB Affected versions: 3.0.0 | Major |
PWX-35527 | When a Portworx KVDB node went down and subsequently came back online with the same node ID but a new IP address, Portworx nodes on the other servers continued to use the stale IP address for connecting to KVDB. User Impact: Portworx nodes faced connection issues while connecting to the internal KVDB, as they attempted to use the outdated IP address. Resolution: Portworx now updates the correct IP address on such nodes. Component: KVDB Affected versions: 2.13.x, 3.0.x | Major |
PWX-33592 | Portworx incorrectly applied the time set by the execution_timeout_sec option. User Impact: Some operations timed out before the time set through the execution_timeout_sec option. Resolution: The behavior of this runtime option is now fixed. Components: KVDB Affected versions: 2.13.x, 3.0.x | Major |
PWX-35353 | Portworx installations (version 3.0.0 or newer) failed on Kubernetes systems using Docker container runtime versions older than 20.10.0. User Impact: Portworx installation failed on Docker container runtimes older than 20.10.0. Resolution: Portworx can now be installed on older Docker container runtimes. Components: oci-monitor Affected versions: 3.0.0 | Major |
PWX-33800 | In Operator version 23.5.1, Portworx was configured so that a restart of the Portworx pod would also trigger a restart of the portworx.service backend.User Impact: This configuration caused disruptions in storage operations. Resolution: Now pod restarts do not trigger a restart of the portworx.service backend.Components: oci-monitor Affected versions: 2.6.0 | Major |
PWX-32378 | During the OpenShift upgrade process, the finalizer service, which ran when Portworx was not processing IOs, experienced a hang and subsequently timed out. User Impact: This caused the OpenShift upgrade to fail. Resolution: The Portworx service now runs to stop Portworx and sets the PXD_timeout during OpenShift upgrades. Components: oci-monitor Affected versions: 2.13.x, 3.0.x | Major |
PWX-35366 | When the underlying nodes of an OKE cluster were replaced multiple times (due to upgrades or other reasons), Portworx failed to start, displaying the error Volume cannot be attached, because one of the volume attachments is not configured as shareable .User Impact: Portworx became unusable on nodes that were created to replace the original OKE worker nodes. Resolution: Portworx now successfully starts on such nodes. Components: Cloud Drives Affected versions: 2.13.x, 3.0.x | Major |
PWX-33413 | After an upgrade, when a zone name case was changed, Portworx considered this to be a new zone. User Impact: The calculation of the total storage in the cluster by Portworx became inaccurate. Resolution: Portworx now considers a zone name with the same spelling, regardless of case, to be the same zone. For example, Zone1, zone1, and ZONE1 are all considered the same zone. Components: Cloud Drives Affected versions: 2.12.1 | Major |
PWX-33040 | For Portworx users using cloud drives on the IBM platform, when the IBM CSI block storage plugin was unable to successfully bind Portworx cloud-drive PVCs (for any reason), these PVCs remained in a pending state. As a retry mechanism, Portworx created new PVCs. Once the IBM CSI block storage plugin was again able to successfully provision drives, all these PVCs got into a bound state.User Impact: A large number of unwanted block devices were created in users' IBM accounts. Resolution: Portworx now cleans up unwanted PVC objects during every restart and KVDB failover. Components: Cloud Drives Affected versions: 2.13.0 | Major |
PWX-35114 | The storageless node could not come online after Portworx was deployed and showed the failed to find any available datastores or datastore clusters error.User Impact: Portworx failed to start on the storageless node which had no access to a datastore. Resolution: Storageless nodes can now be deployed without any access to a datastore. Components: Cloud Drives Affected versions: 2.13.x, 3.0.x | Major |
PWX-33444 | If a disk that was attached to a node became unavailable, Portworx continuously attempted to find the missing drive-set. User Impact: Portworx failed to restart. Resolution: Portworx now ignores errors related to missing disks and attempts to start by attaching to the available driveset, or it creates a new driveset if suitable drives are available on the node. Components: Cloud Drives Affected versions: 2.13.x, 3.0.x | Major |
PWX-33076 | When more than one container mounted the same Docker volume, all of them mounted to the same path because the mount path contained only the volume name and was therefore not unique. User Impact: When one container went offline, the volume was unmounted for the other containers mounted to the same volume. Resolution: The volume mount HTTP request ID is now appended to the path, making the path unique for every mount of the same volume. Components: Volume Management Affected versions: 2.13.x, 3.0.x | Major |
PWX-35394 | Host detach operation on the volume failed with the error HostDetach: Failed to detach volume .User Impact: A detach or unmount operation on a volume would get stuck if attach and detach operations were performed in quick succession, leading to incomplete unmount operations. Resolution: Portworx now reliably handles detach or unmount operations on a volume, even when attach and detach operations are performed in quick succession. Components: Volume Management Affected Versions: 2.13.x, 3.0.x | Major |
PWX-32369 | In a synchronous DR setup, cloudsnaps with different objectstores for each domain failed to back up and clean up the expired cloudsnaps. User Impact: The issue occurred because a single node, which did not have access to both objectstores, was performing cleanup of the expired cloudsnaps. Resolution: Portworx now designates two nodes, one in each domain, to perform the cleanup of the expired cloudsnaps. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Major |
PWX-35136 | During cloudsnap deletions, some objects were not removed because the deletion requests exceeded the S3 API's limit for the number of objects that could be deleted at once. User Impact: This would leave objects on S3 for deleted cloudsnaps, thereby consuming S3 capacity. Resolution: Portworx has been updated to ensure that deletion requests do not exceed the S3 API's limit for the number of objects that can be deleted. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Major |
PWX-34654 | Cloudsnap status returned empty results without any error for a taskID that was no longer in the KVDB. User Impact: No information was provided for users to take corrective actions. Resolution: Portworx now returns an error instead of empty status values. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Major |
PWX-31078 | When backups were restored to a namespace different from the original volume's, the restored volumes retained labels indicating the original namespace, not the new one. User Impact: The functionality of sharedv4 volumes could be impacted because the labels did not accurately reflect the new namespace in which the volumes were located. Resolution: Labels for the restored volume have been fixed to reflect the correct namespace in which the volume resides. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Major |
PWX-32278 | During migration, in certain error scenarios an orphan snapshot was left behind on the destination cluster even though the parent volume was not present. User Impact: This can lead to an increase in capacity usage. Resolution: Such orphan cloudsnaps are now deleted when the parent volume is deleted. Components: Asynchronous DR Affected versions: 2.13.x, 3.0.x | Major |
PWX-35084 | Portworx incorrectly determined the number of CPU cores when running on hosts enabled with cGroupsV2. User Impact: This created issues when limiting the CPU resources, or pinning the Portworx service to certain CPU cores. Resolution: Portworx now properly determines number of available CPU cores. Components: px-runc Affected versions: 3.0.2 | Major |
PWX-32792 | On OpenShift 4.13, Portworx did not proxy portworx-service logs. It kept journal logs from multiple machine IDs, which caused the Portworx pod to stop proxying the logs from portworx.service .User Impact: In OpenShift 4.13, the generation of journal logs from multiple machine IDs led to the Portworx pod ceasing to proxy the logs from portworx.service .Resolution: Portworx log proxy has been fixed to locate the correct journal log using the current machine ID. Components: Monitoring Affected versions: 2.13.x, 3.0.x | Major |
PWX-34652 | During the ha-update process, all existing volume labels were removed and could not be recovered.User Impact: This resulted in the loss of all volume labels, significantly impacting volume management and identification. Resolution: Volume labels now do not change during the ha-update process.Components: Storage Affected versions: 2.13.x, 3.0.x | Major |
PWX-34710 | A large amount of log data was generated during storage rebalance jobs or dry runs. User Impact: This led to log files occupying a large amount of space. Resolution: The volume of logging data has been reduced by 10%. Components: Storage Affected versions: 2.13.x | Major |
PWX-34821 | In scenarios where the system is heavily loaded and imbalanced, elevated syncfs latencies were observed. This situation led to the fs_freeze call, responsible for synchronizing all dirty data, timing out before completion.User Impact: Users experienced timeouts during the fs_freeze call, impacting the normal operation of the system.Resolution: Restart the system and retry the snapshot operation. Components: Storage Affected versions: 3.0.x | Major |
PWX-33647 | When the Portworx process is restarted, it verifies the existing mounts on the system for sanity. If one of the mounts was an NFS mount of a Portworx volume, the mount point verification would hang because Portworx was still in the process of starting up. User Impact: The Portworx process would not come up and would enter an infinite wait, waiting for the mount point verification to return. Resolution: When Portworx is starting up, it now skips the verification of Portworx-backed mount points to allow the startup process to continue. Components: Storage Affected versions: 3.0.2 | Major |
PWX-33631 | Portworx applied locking mechanisms to synchronize CSI volume provisioning requests across different worker nodes and distribute the workload evenly. User Impact: This synchronization approach led to a decrease in CSI volume creation performance in heavily loaded clusters. Resolution: If you are experiencing slow CSI volume creation, upgrade to this version. Components: CSI Affected versions: 2.13.x, 3.0.x | Major |
PWX-34355 | On certain occasions, while mounting a FlashArray cloud drive disk backing a storage pool, Portworx used the single-path device instead of the multipath device. User Impact: Portworx entered the StorageDown state. Resolution: Portworx now identifies the multipath device associated with a given device name and uses this multipath device for mounting operations. Components: FA-FB Affected versions: 2.10.0, 2.11.0, 2.12.0, 2.13.0, 2.13.11, 3.0.0 | Major |
PWX-34925 | Creating a large number of FlashBlade Direct Access volumes in quick succession could restart Portworx with the fatal error sync: unlock of unlocked mutex . User Impact: When trying to create a large number of FlashBlade volumes concurrently, the Portworx process might get restarted due to contention on the lock. Resolution: Improved the locking mechanism to avoid this error. Components: FA-FB Affected versions: 3.0.4 | Major |
PWX-35680 | The Portworx spec generator was incorrectly defaulting telemetry to be disabled when the StorageCluster spec was generated outside of the Portworx Central UI. This does not affect customers who applied a storagecluster with an empty telemetry spec or generated their spec through the UI. User Impact: Telemetry was disabled by default. Resolution: To enable telemetry, users should explicitly specify it if intended. Components: Spec-Gen Affected versions: 2.12.0, 2.13.0, 3.0.0 | Major |
PWX-34325 | When operating Kubernetes with the containerd runtime and a custom root directory set in the containerd configuration, the installation of Portworx would fail.User Impact: Portworx install would fail, resulting in unusual error messages due to a bug in containerd. Resolution: The installation will now intercept the error message and replace it with a clearer message that includes suggestions on how to fix the Portworx configuration. Components: Installation Affected versions: 3.0.0 | Minor |
PWX-33557 | The CallHome functionality sometimes attempted to send data to the local telemetry service unconditionally. User Impact: This caused errors if telemetry was disabled. Resolution: CallHome now sends data only if telemetry has been enabled. Components: Monitoring Affected versions: 3.0.0 | Minor |
PWX-32536 | Portworx installation failed on certain Linux systems using cGroupsV2 and containerd container runtimes, as it was unable to properly locate container identifiers. User Impact: Portworx installation failed. Resolution: The container scanning process has been improved to ensure successful Portworx installation on such platforms. Components: oci-monitor Affected versions: 2.13.x, 3.0.x | Minor |
PWX-30967 | During volume provisioning, snapshot volume labels were included in the count, so nodes were disqualified for provisioning when volume_anti_affinity or volume_affinity VPS was configured, resulting in volume creation failures. User Impact: When stale snapshots existed, the creation of volumes using a VPS with either the volume_anti_affinity or volume_affinity setting would fail. Resolution: Upgrade to this version and retry the previously failed volume creation request. Components: Stork Affected versions: 2.13.2 | Minor |
PWX-33999 | During the installation of NFS packages, Portworx incorrectly interpreted any issues or errors that occurred as timeout errors. User Impact: Portworx misrepresented and masked the original issues. Resolution: Portworx now accurately processes NFS installation errors during its installation. Components: px-runc Affected versions: 2.7.0 | Minor |
PWX-33008 | Creation of a proxy volume with CSI enabled and RWX access mode failed due to the default use of sharedv4 for all RWX volumes in CSI. User Impact: Users could not create proxy volumes with CSI enabled and RWX access mode. Resolution: To successfully create proxy volumes with CSI and RWX access mode, upgrade to this version. Components: Sharedv4 Affected versions: 3.0.0 | Minor |
PWX-34326 | The Portworx CSI Driver GetPluginInfo API returned an incorrect CSI version. User Impact: This resulted in confusion when the CSI version was retrieved by the Nomad CLI. Resolution: The Portworx CSI Driver GetPluginInfo API now returns the correct CSI version. Components: CSI Affected versions: 2.13.x, 3.0.x | Minor |
PWX-31577 | Occasionally, when a user requested a cloudsnap to stop, it would lead to an incorrect increase in the available resources. User Impact: More cloudsnaps were started, and they were stuck in the NotStarted state as resources were unavailable. Resolution: Stopping cloudsnaps no longer incorrectly increases the available resources, thus avoiding the issue. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Minor |
Known issues (Errata)
Issue Number | Issue Description | Severity |
---|---|---|
PD-2673 | KubeVirt VM or container workloads may remain in the Starting state due to the remounting of volumes failing with a device busy error. Workaround: Affected versions: 2.13.x, 3.0.x | Critical |
PD-2546 | In a synchronous DR deployment, telemetry registrations might fail on the destination cluster. Workaround: Affected versions: 3.0.4 | Critical |
PD-2574 | If a disk is removed from an online pool using the PX-StoreV2 backend, it may cause a kernel panic. Workaround: To avoid kernel panic, do not remove disks from an online pool or node. Components: Storage Affected versions: NA | Critical |
PD-2387 | In OpenShift Container Platform (OCP) version 4.13 or newer, application pods using Portworx sharedv4 volumes can get stuck in the Terminating state. This is because kubelet is unable to stop the application container when an application namespace is deleted. Workaround: Find the nodes on which the sharedv4 volume(s) used by the affected pods are attached, then restart the NFS server on those nodes with the systemctl restart nfs-server command. Wait for a couple of minutes. If the pod is still stuck in the Terminating state, reboot the node on which the pod is running. Note that after rebooting, it might take several minutes for the pod to transition out of the Terminating state. Components: Sharedv4 Affected versions: 3.0.0 | Major |
PD-2621 | Occasionally, deleting a TKGi cluster with Portworx fails with the Warning: Executing errand on multiple instances in parallel. error. Workaround: Before deleting your cluster, perform the following steps: Components: Kubernetes Integration Affected versions: | Major |
PD-2631 | After resizing a FlashArray Direct Access volume with a filesystem (such as ext4, xfs, or others) by a significant amount, you might not be able to detach the volume, or delete the pod using this volume. Workaround: Allow time for the filesystem resizing process to finish. After the resize is complete, retry the operations. Components: FA-FB Affected versions: 2.13.x, 3.0.x, 3.1.0 | Major |
PD-2597 | Online pool expansion with the add-disk operation might fail when using the PX-StoreV2 backend.Workaround: Enter the pool into maintenance mode, then expand your pool capacity. Components: Storage Affected versions: 3.0.0, 3.1.0 | Major |
PD-2585 | The node wipe operation might fail with the Node wipe did not cleanup all PX signatures. A manual cleanup maybe required. error on a system where user-created device names contain specific Portworx reserved keywords (such as pwx ). Workaround: Rename or delete devices that use Portworx reserved keywords in their device names before retrying the node wipe operation. Furthermore, it is recommended not to use Portworx reserved keywords such as px , pwx , pxmd , px-metadata , pxd , or pxd-enc while setting up devices or volumes, to avoid encountering such issues. Components: Storage Affected versions: 3.0.0 | Major |
PD-2665 | During a pool expansion operation, if a cloud-based storage disk drive provisioned on a node is detached before the completion of the pool resizing or rebalancing, you can see the show drives: context deadline exceeded error in the output of the pxctl sv pool show command.Workaround: Ensure that cloud-based storage disk drives involved in pool expansion operations remain attached until the resizing and rebalancing processes are fully completed. In cases where a drive becomes detached during this process, hard reboot the node to restore normal operations. Component: PX-StoreV2 Affected versions: 3.0.0, 3.1.0 | Major |
PD-2833 | With Portworx 3.1.0, migrations might fail between two clusters if one of the clusters is running a version of Portworx older than 3.1.0, resulting in a key not found error.Workaround: Ensure that both the source and destination clusters are upgraded to version 3.1.0 or newer. Components: DR & Migration Affected Versions: 3.1.0 | Minor |
PD-2644 | If an application volume contains a large number of files (e.g., 100,000) in a directory, changing the ownership of these files can take a long time, causing delays in the mount process. Workaround: If the ownership change is taking a long time, Portworx by Pure Storage recommends setting fsGroupChangePolicy to OnRootMismatch , as shown in the sketch after this table. For more information, see the Kubernetes documentation. Components: Storage Affected versions: 2.13.x, 3.0.x | Minor |
PD-2359 | When a virtual machine is transferred from one hypervisor to another and Portworx is restarted, the CSI container might fail to start properly and shows the CrashLoopBackoff error.Workaround: Remove the topology.portworx.io/hypervisor label from the affected node.Components: CSI Affected versions: 2.13.x, 3.0.x | Minor |
PD-2579 | When the Portworx pod (oci-mon ) cannot determine the management IP used by the Portworx container, the pxctl status command output on this pod shows a Disabled or Unhealthy status.Workaround: This issue is related to display only. To view the correct information, run the following command directly on the host machine: kubectl exec -it <oci-mon pod> -- nsenter --mount=/host_proc/1/ns/mnt -- pxctl status .Components: oci-monitor Affected versions: 2.13.0 | Minor |
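As a reference for the PD-2644 workaround above, the following is a minimal sketch of a pod spec that sets fsGroupChangePolicy to OnRootMismatch; the pod name, image, and PVC name are placeholders and should be replaced with your own values.

```bash
# Hypothetical example: apply a pod that skips the recursive ownership change
# when the volume root already matches the requested fsGroup.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app-with-large-volume            # placeholder pod name
spec:
  securityContext:
    fsGroup: 2000
    fsGroupChangePolicy: OnRootMismatch  # avoid chown of every file on mount
  containers:
  - name: app
    image: nginx                          # placeholder image
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: px-data-pvc              # placeholder PVC backed by a Portworx volume
EOF
```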
3.0.5
April 17, 2024
Visit these pages to see if you're ready to upgrade to this version:
For users currently on Portworx versions 2.11.x, 2.12.x, or 2.13.x, Portworx by Pure Storage recommends upgrading to Portworx 3.0.5 instead of moving to the next major version.
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-36858 | When using Hashicorp Vault integration, Portworx nodes kept attempting to connect to the Vault service. In the case of misconfigured authentication, the excessive attempts to log in to Vault crashed the Vault service. User Impact: Excessive attempts led to crashing of Vault services. Resolution: Portworx has implemented exponential back-off to reduce the frequency of login attempts to the Vault service. Components: Secret Store Affected Versions: 3.0.4 | Critical |
PWX-36873 | When Portworx is using Hashicorp's Vault configured with Kubernetes or AppRole authentication, it automatically refreshes expired access tokens. However, if the Kubernetes Service Account was removed or the AppRole expired, the token refresh failed. User Impact: Excessive attempts to refresh the access tokens caused the Vault service to crash, especially in large clusters. Resolution: The Portworx node now identifies excessive errors from the Vault service and will avoid accessing Vault for a cooling-off period of 5 minutes. Components: Secret Store Affected Versions: 3.0.3 | Major |
PWX-36847 | In case of a Kubernetes API call failure, Portworx used to incorrectly assume the zone of the node to be the default empty zone. Due to this, it tried to attach drives that belonged to that default zone. As there were no drives created in this default zone, Portworx went ahead and created a new set of drives, assuming this node to be in a different zone. User Impact: This led to duplicate entries, and the cluster went out of quorum. Resolution: Portworx no longer treats the default zone as a special zone. This allows Portworx to check for any existing drives that are already attached or available to be attached from any zone before trying to create new ones. Components: Cloud Drives Affected Versions: 3.0.3 | Major |
PWX-36786 | An offline, storageless node was incorrectly auto-decommissioned due to specific race conditions, resulting in the clouddrive DriveSet being left orphaned. User Impact: Portworx failed to start when attempting to operate as a storageless node using this orphaned clouddrive DriveSet, due to the node being in a decommissioned state. Resolution: Portworx now automatically cleans up such orphaned storageless clouddrive DriveSets, allowing it to start successfully. Components: Cloud Drive Affected Versions: 2.13.x, 3.0.x, and 3.1.x | Major |
3.0.4
November 15, 2023
Visit these pages to see if you're ready to upgrade to this version:
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-34315 | Improved how Portworx identifies pods with volumes in the Read-Only state before restarting them. | Storage |
PWX-34153 | CSI sidecar images are updated to the latest open source versions. | CSI |
PWX-34029 | Portworx now removes stale FlashArray multipath devices upon startup, which may result from pod failovers (for FlashArray Direct Access) or drive set failovers (for FlashArray Cloud Drives) while Portworx was not running. These stale devices had no direct impact but could have led to slow operations if many were present. | FA-FB |
PWX-34974 | Users can now configure the default duration (set to 15 minutes) after which the logs are refreshed to get the most up-to-date statistics for FlashBlade volumes, using the following command: pxctl cluster options update --fb-stats-expiry-duration <time-in-minutes> The minimum duration for refresh is one minute. | FA-FB |
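For example (an illustrative value, not a recommendation), to refresh FlashBlade volume statistics every 5 minutes:

```bash
# Illustrative only: set the FlashBlade stats refresh interval to 5 minutes.
# The minimum supported value is 1 minute.
pxctl cluster options update --fb-stats-expiry-duration 5
```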
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-34334 | Cloudsnaps of an aggregated volume with a replication level of 2 or more uploaded incorrect data if one of the replica nodes from which a previous cloudsnap operation had been executed was down. User Impact: The most recent snapshots were lost. Resolution: Portworx now forces a full backup in scenarios where the previous cloudsnap node is down. Components: Cloudsnaps Affected versions: 3.0.x | Critical |
PWX-33632 | If an attach request remained in the processing queue for a long time, it would lead to a panic. User Impact: Portworx would restart on the node. This was because an FA attach operation involved making REST API calls to FA, as well as running iSCSI rescans, which consumed more time. When Portworx received a high volume of requests to attach FA DirectAccess volumes, the queue for these attach requests gradually grew over time, leading to a panic in Portworx. Resolution: The timeout for queued attach requests has been increased to 15 minutes for FA DirectAccess volumes. Components: FA-FB Affected versions: 2.13.x, 3.0.x | Critical |
PWX-34885 | When NFS proxy volumes were created, it resulted in the restart of the Portworx service. User Impact: Although NFS proxy volumes were created, the service restart affected user applications. Resolution: Portworx now creates NFS proxy volumes successfully without restarting the Portworx service. Components: Storage Affected versions: 3.0.2 | Critical |
PWX-34277 | When an application pod using an FA Direct Access volume was failed over to another node, and Portworx was restarted on the original node, the pod on the original node became stuck in the Terminating state. User Impact: Portworx didn't clean up the mountpaths where the volume had previously been attached, as it couldn't locate the application on the local node. Resolution: Portworx now cleans up the mountpath even when the application is not found on the node. Components: FA-FB Affected versions: 2.13.x, 3.0.x | Major |
PWX-30297 | Portworx failed to restart when a multipath device was specified for the internal KVDB. Several devices with the kvdbvol label were found for the multipath device. Portworx selected the first device on the list, which might not have been the correct one.User Impact: Portworx failed to start because it selected the incorrect device path for KVDB. Resolution: When a multipath device is specified for the internal KVDB, Portworx now selects the correct device path. Components: KVDB Affected versions: 2.11.x | Major |
PWX-33935 | When the --sources option was used in the pxctl volume ha-update command for the aggregated volume, it caused the Portworx service processes to abort with an assertion.User Impact: The Portworx service on all nodes in the cluster continuously kept restarting. Resolution: Contact the Portworx support team to restore your cluster. Components: Storage Affected versions: 2.13.x, 3.0.x | Major |
PWX-33898 | When two pods, both using the same RWO FA Direct Access volume, were started on two different nodes, Portworx would move the FA Direct Access volume attachment to the node where the most recent pod was running, rather than rejecting the setup request for the second pod. User Impact: A stale FA Direct Access multipath device remained on the original node where the first pod was started, causing subsequent attach or mount requests on that node to fail. Resolution: A second pod request for the same RWO FA Direct Access volume on a different node will now be rejected if such a FA Direct Access volume is already attached and in use on another node. Components: FA-FB Affected versions: 2.13.11 | Major |
PWX-33828 | If you deleted a FA Direct Access PVC attached to an offline Portworx node, Portworx removed the associated volume from its KVDB. However, the FlashArray did not delete its associated volume because it remained connected to the offline node on the FlashArray. User Impact: This created orphaned volumes on the FlashArray. Resolution: Portworx now detects a volume that is attached to an offline Portworx node and will disconnect it from all the nodes in the FlashArray and avoid orphaned volumes. If there are any existing orphaned volumes, clean them manually. Components: FA-FB Affected versions: 2.13.8 | Major |
3.0.3
October 11, 2023
Notes
- This version addresses security vulnerabilities.
- Starting with version 3.0.3, aggregated volumes with PX-StoreV2 are not supported.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-32255 | Now the runtime option fast_node_down_detection is enabled by default. This option allows quick detection of when the Portworx service goes offline. | Storage |
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-33113 | Portworx reduced the pricing for GCP Marketplace from 55 cents/node/hour to 33 cents/node/hour, but this change was not being reflected for existing users who were still reporting billing to the old endpoint. User Impact: Existing GCP Marketplace users were being incorrectly billed at the previous rate of 55 cents/node/hour. Resolution: Upgrade Portworx to version 3.0.3 to reflect the new pricing rate. Components: Billing Affected versions: 2.13.8 | Critical |
PWX-34025 | In certain cases, increasing the replication level of a volume on a PX-StoreV2 cluster created new replicas with non-zero blocks that had been overwritten with zeros on the existing replicas. User Impact: The Ext4 filesystem reported a mismatch and delayed allocation failures when a user application attempted to write data to the volume. Resolution: Users can now run the fsck operation to rectify the failures or remove the added replicas from the volume.Components: PX-StoreV2 Affected versions: 3.0.2 | Major |
3.0.2
September 28, 2023
Visit these pages to see if you're ready to upgrade to this version:
Notes
This version addresses security vulnerabilities.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-32226 | AWS users can now choose to enable server-side encryption for S3 credentials, assuming the S3 object-store provider supports it. Use the --s3-sse flag with either the AES256 or aws:kms value (see the sketch after this table). | Cloudsnaps |
PWX-33229 | Previously, a Portworx license would expire if Portworx could not reach its billing server within 72 hours. Now users can continue to use Portworx for up to 30 days if the billing servers are not reachable. | Licensing |
PWX-31233 | Portworx has removed volume size enforcement for FlashArray and FlashBlade Direct Access volumes. This will allow users to create volumes greater than 40TiB for all license types. | Licensing |
PWX-33551 | Users can now configure the REST API call timeout (in seconds) for FA/FB by adding the new environment variable PURE_REST_TIMEOUT to the StorageCluster. When updating this value, the execution timeout should also be updated accordingly using the following command:pxctl cluster options update --runtime-options execution_timeout_sec=<sec> PURE_REST_TIMEOUT is set to 8 seconds and execution_timeout_sec to 60 seconds by default. Contact Portworx support to find the right values for your cluster. | FA-FB |
PWX-33364 | As part of FlashArray integration, Portworx has now reduced the number of API calls it makes to the arrays endpoint on FA. | FA-FB |
PWX-33593 | Portworx now caches certain FlashArray attachment system calls, improving the performance of mount operations for FA backed volumes on nodes with large numbers of attached devices, or many redundant paths to the array. | FA-FB |
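As a reference for the PWX-32226 improvement above, here is a minimal sketch of creating an S3 credential with server-side encryption enabled. The access keys, region, endpoint, and credential name are placeholders, and the exact flag set can vary by Portworx version; verify with pxctl credentials create --help.

```bash
# Sketch only: create an S3 credential with server-side encryption (AES256).
# Replace the placeholders with your own values.
pxctl credentials create \
  --provider s3 \
  --s3-access-key <ACCESS_KEY> \
  --s3-secret-key <SECRET_KEY> \
  --s3-region us-east-1 \
  --s3-endpoint s3.amazonaws.com \
  --s3-sse AES256 \
  my-s3-credential
```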
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-33451 | In certain cases, increasing the replication level of an aggregated volume failed to zero out specific blocks associated with stripes belonging to replication set 1 or higher, where zero data was expected. User Impact: Ext4 filesystem complained about a mismatch and delayed allocation failures when a user application tried to write data to an aggregated Portworx volume. Resolution: Users can now run the fsck operation to rectify the failures or remove the added replicas from the aggregated volume.Components: Storage Affected versions: 3.0.0, 2.12.x, 2.13.x | Critical |
PWX-33258 | Sometimes, Portworx timed out FlashBlade direct access volume creation when it took over 30 seconds. User Impact: Volume creation stayed in a pending state. Resolution: The timeout for FB volume creation has been increased to 180 seconds (3 minutes) to allow more time for FBDA volume creation. User can now use the --fb-lock-timeout cluster option to increase the timeout for FB volume creation beyond 180 seconds (3 minutes).Components: FA-FB Affected versions: 2.13.6 | Critical |
PWX-32428 | For the PKS environment, the sharedv4 mount failed on the remote client node with the error No such file or directory . User Impact: Restarts of the Portworx pods and service led to excessive mounts (mount leaks) on PKS platforms, progressively slowing down IO operations on the node. Resolution: The mountpoints that Portworx uses on the PKS platform have been changed. If you are experiencing slowdowns on a PKS node, upgrade the Operator to the latest version, and reboot the affected PKS nodes. Components: Sharedv4 Affected versions: 2.12.x, 2.13.x | Critical |
PWX-33388 | The standalone SaaS metering agent crashed the Portworx container with a nil panic error. User Impact: This caused the Portworx container on one node to crash continuously. Resolution: Upgrade to 3.0.2 if you are using a SaaS license to avoid this issue. Components: Control Plane Affected versions: 3.0.1, 3.0.0 | Critical |
PWX-32074 | The CPU core numbers were wrongly detected by the px-runc command.User Impact: Portworx did not start on the requested cores. Resolution: The behavior of the --cpuset-cpus argument of the px-runc install command has been fixed. User can now specify the CPUs on which Portworx execution should be allowed.Components: px-runc Affected versions: 2.x.x | Critical |
PWX-33112 | Timestamps were incorrectly recorded in the write-ahead log. User Impact: The write operations were stuck due to a lack of log reservation space. Resolution: Portworx now consistently flushes timestamp references into the log. Components: Storage Affected versions: 2.12.x, 2.13.x | Critical |
PWX-31605 | The pool expansion failed because the serial number from the WWID could not be extracted. User Impact: FlashArray devices (both cloud drives and direct access) encountered expansion or attachment failures when multipath devices from other vendors (such as HPE or NetApp) were attached. Resolution: This issue has been fixed. Components: Pool Management Affected versions: 2.13.2 | Critical |
PWX-33120 | Too many unnecessary vSphere API calls were made by Portworx. User Impact: An excess of API calls and vSphere events could have caused confusion and distraction for users of vSphere Cloud Drives. Resolution: If you are seeing many vSphere VM Reconfigure events at a regular interval in the clusters configured with Portworx Cloud Drives, upgrade Portworx to the latest version. Components: Metering & Billing Affected versions: 3.0.0 | Major |
PWX-33299 | When using a custom image registry, OCI-Monitor was unable to locate the Kubernetes namespace from which to pull secrets. User Impact: Portworx installation failed with the error Failed retrieving default/tcr-pull-cpaas-5000 . Resolution: Portworx now consults the container runtime and Kubernetes to determine the correct Kubernetes namespace for Portworx installation. Components: OCI Monitor Affected versions: 3.0.0, 2.13.x, 2.12.x | Major |
PWX-31840 | When resizing a volume, the --provisioning-commit-labels cluster option was not honored, resulting in unlimited thin provisioning. User Impact: Portworx volumes were resized to large sizes without rejections, exceeding pool provisioning limits. Resolution: Now the --provisioning-commit-labels cluster option is honored during resizing volumes and prevents unexpected large volumes.Components: Storage Affected versions: 2.12.x, 2.13.x | Major |
PWX-32572 | When using the older Containerd versions (v1.4.x or 1.5.x), Portworx kept opening connections to Containerd, eventually depleting all the file-descriptors available on the system. User Impact: Portworx nodes crashed with the too many open files error. Resolution: Portworx no longer leaks the file-descriptors when working with older Containerd versions. Components: OCI Monitor Affected versions: 2.13.6, 3.0.0 | Minor |
PWX-30781 | The Kubernetes version parameter (?kbver ) in the air-gapped script did not process the version extension. User Impact: The script generated the wrong image URLs for the Kubernetes-dependent images. Resolution: Parsing of the kbver parameter has been fixed. Components: Spec Generator Affected versions: 3.0.0 | Minor |
3.0.1
September 3, 2023
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-33389 | The Portworx CSI license for FA/FB validation failed when Purity was upgraded to version 6.4.2 or newer, causing the Portworx license status to appear expired. User Impact: Users could not create new volumes. Resolution: The auth token is no longer used by Portworx when making API or api_version calls to FA during license validation. Components: FA-FB Affected versions: 3.0.0 | Critical |
PWX-33223 | Portworx was hitting a panic when a value was set for an uninitialized object. User Impact: This caused the Portworx container to crash and restart. Resolution: Upgrade to Portworx version 3.0.1 if using Pure cloud drives. Components: FA-FB Affected versions: 3.0.0 | Major |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2349 | When you upgrade Portworx to a higher version, the upgrade is successful, but the Portworx CSI license renewal could take a long time. Workaround: Run the pxctl license reset command to reflect the correct license status. |
PD-2350 | Upgrades on some nodes may become stuck with the following message: This node is already initialized but could not be found in the cluster map. . This issue can be caused by an orphaned storageless node. Workaround: Verify if the node which has this error is a storageless node. If it is, delete the orphaned storageless node using the command: pxctl clouddrive delete --node <> to progress the upgrade. |
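A sketch of the PD-2350 workaround described above; the node ID is a placeholder, and you should confirm that the node is in fact an orphaned storageless node before deleting its drive set.

```bash
# List cloud drive sets and identify the orphaned storageless node
# (a node ID present in the clouddrive list but not in `pxctl status`).
pxctl clouddrive list

# Delete the orphaned storageless node's drive set so the upgrade can proceed.
# Replace the ID with the node ID identified above.
pxctl clouddrive delete --node <orphaned-node-id>
```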
3.0.0
July 11, 2023
Visit these pages to see if you're ready to upgrade to this version:
Notes
Portworx 3.0.0 requires Portworx Operator 23.5.1 or newer.
New features
Portworx by Pure Storage is proud to introduce the following new features:
-
AWS users can now deploy Portworx with the PX-StoreV2 datastore. In order to have PX-StoreV2 as your default datastore, your cluster should pass the preflight check, which verifies your cluster's compatibility with the PX-StoreV2 datastore.
-
You can now provision and use cloud drives on FlashArrays that are in the same zone using the CSI topology for FlashArray Cloud Drives feature. This improves fault tolerance for replicas, performance, and manageability for large clusters.
-
For environments such as GCP and Anthos that follow the blue-green upgrade model, Portworx allows a temporary license extension to minimize downtime during upgrades. Once you start the license expansion, the Portworx cluster's license will temporarily be extended to accommodate up to double the number of licensed nodes. While the existing nodes (called blue nodes) serve production traffic, Portworx will expand the cluster by adding new nodes (called green nodes) that have an upgraded Linux OS or new hardware.
-
Portworx now offers the capability to utilize user-managed keys for encrypting cloud drives on Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE). By leveraging powerful encryption algorithms, the Oracle disk encryption feature converts data into an encrypted format, ensuring that unauthorized individuals cannot access it. You can specify the encryption key in the StorageCluster using the following cloud-drive volume specifications:
type=pv-<number-of-vpus>,size=<size-of-disk>,kms=<ocid-of-vault-key>
-
Portworx now enables you to define custom tags for cloud drives provisioned across various platforms such as AWS, Azure, GCP, and Oracle Cloud. While installing Portworx, you can specify the custom tags in the StorageCluster spec (see the sketch after this list):
type=<device-type>,size=<volume-size>,tags=<custom-tags>
This user-defined metadata enhances flexibility, organization, and provides additional contextual information for objects stored in the cloud. It empowers users with improved data management, search capabilities, and greater control over their cloud-based data.
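As a reference for the two cloud-drive specifications above, the following is a minimal, hypothetical sketch of the relevant StorageCluster fragment. The file name, disk types, sizes, tag string, and OCID are placeholders, and the two entries illustrate the OKE (kms) and custom-tag formats separately rather than a real combined configuration.

```bash
# Sketch only: cloud drive specifications carrying a customer-managed key (OKE)
# and custom tags. Merge the relevant entry into your own StorageCluster spec,
# for example with `kubectl edit storagecluster <name> -n <namespace>`.
cat <<'EOF' > clouddrive-spec-fragment.yaml
spec:
  cloudStorage:
    deviceSpecs:
    # OKE block volume with 10 VPUs, encrypted with a Vault key (placeholder OCID):
    - type=pv-10,size=150,kms=<ocid-of-vault-key>
    # Generic example with custom tags (replace <custom-tags> with your tag string):
    - type=gp3,size=150,tags=<custom-tags>
EOF
```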
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-29486 | Portworx now supports online expansion of storage pools containing auto journal devices with disk-resize operation. | Pool Management |
PWX-29435 | When you run the pxctl sv pool expand -o add-disk command, the common disk tags from existing drives will be attached to the newly added cloud-drive disk. | Pool Management |
PWX-28904 | Storage pool expansion now supports online pool resizing on Azure, with no downtime. This is as long as Microsoft's documentation requirements are met. | Pool Management |
PWX-30876 | Pool expansion with add-disk operation is now supported for repl1 volumes. | Pool Management |
PWX-29863 | The pool expansion completion message is improved to Pool resize requested successfully. Please check resize operation status with pxctl sv pool show . | Pool Management |
PWX-28665 | The pxctl cd list command now lists cloud-drives on nodes with local drives. | Cloud Drives |
PWX-28697 | FlashArray cloud drives now show information about the array they are located on. Use pxctl cd inspect to view this information. | Cloud Drives |
PWX-29348 | Added 3 new fields to the CloudBackupSize API to reflect the correct backup size. | Cloudsnaps |
PWX-27610 | Portworx will now periodically defragment the KVDB database. KVDB will be defragmented every 2 weeks by default, if the DB size is greater than 100 MiB. You can also configure the defragment schedule using the pxctl cluster options update command. | KVDB |
PWX-31403 | For AWS clusters, Portworx now applies default configurations for the dedicated KVDB disk. | KVDB |
PWX-31055 | The alert message for VolumeSpaceLow is improved to show clear information. | Storage |
PWX-29785 | Improved the implementation to restrict the nodiscard and autofstrim flags on XFS volumes. These two flags are disabled for volumes formatted with XFS. | PX-StoreV1 |
PWX-30557 | Portworx checks pool size and drive count limits before resizing the storage pool. It will abort with a proper error message if the resolved pool expansion plan exceeds limits. | PX-StoreV2 |
PWX-30820 | Portworx now redistributes cloud migration requests received from Stork among all the nodes in the cluster using a round-robin mechanism. This helps evenly distribute the migration workload across all the nodes in the cluster and avoids hot spots. | DR & Migration |
PWX-29428 | Portworx CSI images now use the registry.k8s.io registry. | CSI |
PWX-28035 | Portworx now supports distributing FlashArray Cloud Drive volumes among topologically distributed FlashArrays. | FA-FB |
PWX-31500 | The pxctl cluster provision-status command will now show more states of a pool. | CLI |
PWX-31257 | The pxctl alerts show command with the --start-time and --end-time options can now be used independently (see the example after this table). | Monitoring |
PWX-30754 | Added support for leases permission to the PVC controller ClusterRole. | Spec Generation |
PWX-29202 | pxctl cluster provision-status will now show the host name for nodes. The host name helps you to correlate that command's output with the node list provided by pxctl status . | CLI |
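For example (illustrative only; the accepted timestamp format may differ by version, so check pxctl alerts show --help), alerts can now be filtered with just a start time:

```bash
# Illustrative only: list alerts raised after a given point in time without
# supplying --end-time, which previously had to be passed together with it.
pxctl alerts show --start-time "2023-07-01T00:00:00Z"
```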
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-30030 | Some volumes incorrectly showed Not in quorum status. User Impact: Portworx volumes were out of quorum after a network split even though all the nodes and pools for the volume's replica were online and healthy. This happened when the node could communicate over the network with KVDB but not with the rest of the nodes. Resolution: Restart the Portworx service on the node where the volume is currently attached. Components: Storage Affected versions: 2.12.2 | Critical |
PWX-30511 | When autofstrim was disabled, internal autofstrim volume information was not removed completely. User Impact: An error occurred while running manual fstrim. Resolution: This issue has been fixed. Components: Storage Affected versions: 2.12.x, 2.13.x | Critical |
PWX-30294 | The pvc-controller pods failed to start in the DaemonSet deployment. User Impact: The pvc-controller failed due to the deprecated values of the --leader-elect-resource-lock flag.Resolution: These values have been removed to use the default leases value.Components: Spec Generator Affected versions: 2.12.x, 2.13.x | Critical |
PWX-30930 | The KVDB cluster could not form a quorum after KVDB was down on one node. User Impact: On a loaded cluster or when the underlying KVDB disk had latency issues, KVDB nodes failed to elect leaders among themselves. Resolution: Increase the heartbeat interval using the runtime option kvdb-heartbeat-interval=1000 .Components: KVDB Affected versions: 2.12.x, 2.13.x | Critical |
PWX-30985 | Concurrent pool expansion operations using add-disk and auto resulted in pool expansion failure, with the error mountpoint is busy .User Impact: Pool resize requests were rejected. Resolution: Portworx now serializes pool expansion operations. Components: Pool Management Affected versions: 2.12.x, 2.13.x | Major |
PWX-30685 | In clusters running with cloud drives and auto-journal partitions, pool deletion resulted in deleting the data drive with an auto-journal partition. User Impact: Portworx had issues restarting after the pool deletion operation. Resolution: Upgrade to the current Portworx version. Components: Pool Management Affected versions: 2.12.x, 2.13.x | Major |
PWX-30628 | The pool expansion would result in a deadlock when it had a volume in a re-sync state and the pool was already full. User Impact: Pool expansion would get stuck if a volume in the pool was in a re-sync state and the pool was full. No new pool expansions can be issued on such a pool. Resolution: Pool expansion will now be aborted immediately if it detects an unclean volume in the pool. Components: Pool Management Affected versions: 2.12.x, 2.13.x | Major |
PWX-30551 | If a diagnostics package collection was triggered during a node initialization, it caused the node initialization to fail and the node to restart. User Impact: The node restarted when node initialization and diagnostics package collection occurred at the same time. Resolution: Now diagnostics package collection will not restart the node. Components: Storage Affected versions: 2.12.x, 2.13.x | Major |
PWX-29976 | Cloud drive creation failed when a vSphere 8.0 datastore cluster was used for Portworx installation. User Impact: Portworx failed to install on vSphere 8 with datastore clusters. Resolution: This issue has been fixed. Components: Cloud Drives Affected versions: 2.13.1 | Major |
PWX-29889 | Portworx installation with local install mode failed when both a journal device and a KVDB device were configured simultaneously. User Impact: Portworx would not allow creating multiple disks in a local mode install. Resolution: This issue has been fixed. Components: KVDB Affected versions: 2.12.x, 2.13.x | Major |
PWX-29512 | In certain cases, a KVDB node failover resulted in inconsistent KVDB membership, causing an orphaned entry in the cluster. User Impact: The cluster operated with one less KVDB node. Resolution: Every time Portworx performs a KVDB failover, if it detects an orphaned node, Portworx removes it before continuing the failover operation. Components: KVDB Affected versions: 2.13.x | Major |
PWX-29511 | Portworx would remove an offline internal KVDB node as part of its failover process, even when it was not part of quorum. User Impact: The KVDB cluster would lose quorum and required manual intervention to restore its functionality. Resolution: Portworx will not remove a node from the internal KVDB cluster if it is out of quorum. Components: KVDB Affected versions: 2.13.x | Major |
PWX-28287 | Pool expansion on an EKS cluster failed while optimization of the associated volume(s) was in progress. User Impact: Pool expansion was unsuccessful. Resolution: Portworx now catches these scenarios early in the pool expansion process and provides a clear and readable error message to the user. Components: Cloud Drives Affected versions: 2.12.x, 2.13.x | Major |
PWX-28590 | In vSphere local mode install, storageless nodes (disaggregated mode) would claim storage ownership of a hypervisor if it was the first to boot up. This meant that a node capable of creating storage might not be able to get ownership. User Impact: In vSphere local mode, Portworx installed in degraded mode. It occurred during a fresh install or when an existing storage node was terminated. Resolution: This issue has been fixed. Components: Cloud Drives Affected versions: 2.12.1 | Major |
PWX-30831 | On EKS, if the cloud drives were in different zones or removed, Portworx failed to boot up in certain situations. User Impact: Portworx did not start on an EKS cluster with removed drives. Resolution: Portworx now ignores zone mismatches and sends alerts for deleted drives. It will now not abort the boot up process and continue to the next step. Components: Cloud Drives Affected versions: 2.12.x, 2.13.x | Major |
PWX-31349 | Sometimes Portworx processes on the destination or DR cluster would restart frequently due to a deadlock between the node responsible for distributing the restore processing and the code attempting to attach volumes internally. User Impact: Restore operations failed. Resolution: This issue has been fixed. Components: DR and Migration Affected versions: 2.12.x, 2.13.x | Major |
PWX-31019 | During cloudsnap backup/restore, there was a crash occasionally caused by the array index out of range of the preferredNodeForCloudsnap function. User Impact: Cloudsnap restore failed. Resolution: This issue has been fixed. Components: Storage Affected versions: 2.12.x, 2.13.x | Major |
PWX-30246 | Portworx NFS package installation failed due to a lock held by the unattended-upgrade service running on the system. User Impact: Sharedv4 volume mounts failed. Resolution: The Portworx NFS package install now waits for the lock, then installs the required packages. This issue is resolved after upgrading to the current version and restarting the Portworx container. Components: Sharedv4 Affected versions: 2.11.2, 2.12.1 | Major |
PWX-30338 | VPS pod labels were not populated in the Portworx volume spec. User Impact: VPS using the podMatchExpressions field in a StatefulSet sometimes failed to function correctly because volume provisioning and pod inception occurred at the same time.Resolution: Portworx ensures that volume provisioning collects the pod name before provisioning. Components: Volume Placement Strategies Affected versions: 2.12.x, 2.13.x | Minor |
PWX-28317 | A replica set was incorrectly created for proxy volumes. User Impact: When a node was decommissioned, it got stuck if a proxy volume’s replica set was on that node. Resolution: Now replica sets are not created for proxy volumes. Components: Proxy Volumes Affected versions: 2.11.4 | Minor |
PWX-29411 | In vSphere, when creating a new cluster, KVDB disk creation failed for a selected KVDB node. User Impact: In the local mode install, the KVDB disk creation failures resulted in wrongly giving up ownership of a hypervisor. This created two storage nodes on the same hypervisor. Resolution: This issue has been fixed. Components: Cloud Drives Affected versions: 2.12.1, 2.13.x | Minor |
PWX-28302 | The pool expand command failed to expand an existing pool size when it was increased by 4 GB or less. User Impact: If the user expanded the pool by 4 GB or less, the pxctl sv pool expand command failed with an invalid parameter error.Resolution: Increase the pool size by at least 4 GB. Components: PX-StoreV2 Affected versions: 2.12.x, 2.13.x | Minor |
PWX-30632 | NFS backupLocation for cloudBackups failed with the error validating credential: Empty name string for nfs error. The NFS name used by Portworx to mount the NFS server was not passed to the required function.User Impact: Using BackupLocations for NFS targets failed. Resolution: Portworx now passes the credential name to the function that uses the name to mount the NFS server. Components: Cloudsnaps Affected versions: 2.13.x | Minor |
PWX-25792 | During the volume mount of FA/FB DA volumes, Portworx did not honor the nosuid mount option specified in the storage class.User Impact: Post migration from PSO to Portworx, volumes with the nosuid mount option failed to mount on the host.Resolution: Portworx now explicitly sets the nosuid mount option in the mount flag before invoking the mount system call.Components: FA-FB Affected versions: 2.11.0 | Minor |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2149 | Portworx 3.0.0 cannot be installed using the Rancher catalog chart. You should use PX-Central to generate the Portworx spec. |
PD-2107 | If there is a ha-update operation while the volume is in a detached state, a different node might start publishing the volume metrics, but the old node won’t stop publishing the volume metrics. This results in duplicate metrics, and only one will have the correct currhalevel.Workaround: For detached volumes, before doing a ha-update , attach the volume manually through pxctl . |
PD-2086 | Portworx does not support Oracle API signing keys with a passphrase. Workaround: Use API signing keys without a passphrase. |
PD-2122 | The add-drive operation fails when a drive is added to an existing cloud-based pool.Workaround: Use the pxctl service pool expand -operation add-disk -uid <pool-ID> -size <new-storage-pool-size-in-GiB> command to add a new drive to such pools. |
PD-2170 | The pool expansion can fail on Google Cloud when using the pxctl service pool expand -operation add-disk command with the error Cause: ProviderInternal Error: googleapi: Error 503: Internal error. Please try again or contact Google Support. Workaround: Rerun the command. |
PD-2188 | In OCP 4.13 or newer, when the application namespace or pod is deleted, application pods that use Portworx sharedv4 volumes can get stuck in the Terminating state. The output of the ps -ef --forest command for the stuck pod showed that the conmon process had one or more defunct child processes. Workaround: Find the nodes on which the sharedv4 volume(s) used by the affected pods are attached, then restart the NFS server on those nodes with the systemctl restart nfs-server command. Wait for a couple of minutes. If the pod is still stuck in the Terminating state, reboot the node on which the pod is running. The pod might take several minutes to release after a reboot. |
PD-2209 | When Portworx is upgraded to version 3.0.0 without upgrading Portworx Operator to version 23.5.1, telemetry is disabled. This is because the port is not updated for the telemetry pod. Workaround: Upgrade Portworx Operator to the latest version and bounce the Portworx pods manually. |
PD-2615 | Migrations triggered as part of Async DR will fail in the "Volume stage" when Portworx is configured with PX-Security on the source and destination clusters. Workaround: Please contact support if you encounter this issue. |
Known issues (Errata) with PX-StoreV2 datastore
Issue Number | Issue Description |
---|---|
PD-2138 | Scaling down the node groups in AWS results in node termination. After a node is terminated, the drives are moved to an available storageless node. However, in some cases, after migration the associated pools remain in an error state. Workaround: Restart the Portworx service, then run a maintenance cycle using the pxctl sv maintenance --cycle command. |
PD-2116 | In some cases, re-initialization of a node fails after it is decommissioned and wiped with the error Failed in initializing drives on the node x.x.x.x : failed to vgcreate . Workaround: Reboot the node and retry initializing it. |
PD-2141 | When cloud drives are detached and reattached manually, the associated pool can go down and remain in an error state. Workaround: Restart the Portworx service, then run a maintenance cycle using the pxctl sv maintenance --cycle command. |
PD-2153 | If the add-drive operation is interrupted by a drive detach, scale down or any other operation, the pool expansion can get stuck.Workaround: Reboot the node. |
PD-2174 | When you add unsupported drives to the StorageCluster spec of a running cluster, Portworx goes down. Workaround: Remove the unsupported drive from the StorageCluster spec. The Portworx Operator will recreate the failed pod and Portworx will be up and running again on that node. |
PD-2208 | Portworx on-premises with PX-StoreV2 fails to upgrade to version 3.0.0. Workaround: Replace -T dmthin with -T px-storev2 in your StorageCluster, as the dmthin flag is deprecated. After updating the StorageCluster spec, restart the Portworx nodes. |
2.13.12
March 05, 2024
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description |
---|---|
PWX-35603 | When running Portworx on older Linux systems (specifically those using GLIBC 2.31 or older) with newer versions of Kubernetes, Portworx failed to detect dynamic updates of pod credentials and tokens. This led to Unauthorized errors when using Kubernetes client APIs.Resolution: Portworx now correctly processes dynamic token updates. |
PWX-29750 | In certain cases, the cloudsnaps that were using S3 object-stores were not completely deleted because S3 object-stores did not support bulk deletes or were unable to handle large cloudsnaps. This resulted in undeleted cloudsnap objects, leading to unnecessary capacity consumption on S3. Resolution: Portworx now addresses and resolves such cloudsnaps deletion issues. |
PWX-35136 | During cloudsnap deletions, some objects were not removed because the deletion requests exceeded the S3 API's limit for the number of objects that could be deleted at once. This would leave objects on S3 for deleted cloudsnaps, thereby consuming S3 capacity. Resolution: Portworx now ensures that deletion requests do not exceed the S3 API's limit. |
PWX-31019 | An array index out of range error in the preferredNodeForCloudsnap function occasionally caused crashes during cloudsnap backup/restore operations.Resolution: This issue has been fixed, and Portworx now prevents such crashes during cloudsnap backup or restore operations. |
PWX-30030 | Some Portworx volumes incorrectly showed Not in quorum status after a network split, even though all the nodes and pools for the volume's replica were online and healthy. This happened when the node could communicate over the network with KVDB but not with the rest of the nodes. Resolution: Portworx volumes now accurately reflect their current state in such situations. |
PWX-33647 | When the Portworx process is restarted, it verifies the existing mounts on the system for sanity. If one of the mounts was an NFS mount of a Portworx volume, the mount point verification would hang because Portworx was still in the process of starting up. Resolution: When Portworx is starting up, it now skips the verification of Portworx-backed mount points to allow the startup process to continue. |
PWX-29511 | Portworx would remove an offline internal KVDB node as part of its failover process, even when it was not part of quorum. The KVDB cluster would lose quorum and required manual intervention to restore its functionality. Resolution: Portworx does not remove a node from the internal KVDB cluster if it is out of quorum. |
PWX-29533 | During node initialization with cloud drives, a race condition occasionally occurred between the Linux device manager (udevd) and Portworx initialization, causing node initialization failures. This was because drives were not fully available for Portworx's use, preventing users from adding new nodes to an existing cluster. Resolution: Portworx has increased the number of retries for accessing the drives during initialization to mitigate this failure. |
PWX-35650 | GKE customers encountered a nil panic exception when the provided GKE credentials were invalid. Resolution: Portworx now properly shuts down and logs the error, aiding in the diagnosis of credential-related issues. |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2768 | When cloning or capturing a snapshot of a FlashArray Direct Access PVC that is either currently resizing or has encountered a resizing failure, the clone or snapshot creation might fail. Workaround: Initiate the resize operation again on the original volume, followed by the deletion and recreation of the clone or snapshot, or allow for an automatic retry. |
2.13.11
October 25, 2023
Visit these pages to see if you're ready to upgrade to this version:
Notes
- This version addresses security vulnerabilities.
- It is recommended that you upgrade to the latest version of Portworx when upgrading from version 2.13.11.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description |
---|---|
PWX-34029 | Portworx now removes stale FlashArray multipath devices upon startup, which may result from pod failovers (for FlashArray Direct Access) or drive set failovers (for FlashArray Cloud Drives) while Portworx was not running. These stale devices had no direct impact but could have led to slow operations if many were present. |
PWX-33551 | You can now configure the REST API call timeout (in seconds) for FA/FB by adding the new environment variable PURE_REST_TIMEOUT to the StorageCluster. When updating this value, you should also update the execution timeout using the following command: pxctl cluster options update --runtime-options execution_timeout_sec=<sec> PURE_REST_TIMEOUT is set to 8 seconds and execution_timeout_sec to 60 seconds by default. Contact Portworx support to find the right values for your cluster. See the sketch after this table for an example. This improvement was included in Portworx version 3.0.2 and is now backported to 2.13.11. |
PWX-33229 | Previously, a Portworx license would expire if Portworx could not reach its billing server within 72 hours. Users can now continue to use Portworx for up to 30 days if the billing servers are not reachable. This improvement was included in Portworx version 3.0.2 and now is backported to 2.13.11. |
PWX-33364 | As part of FlashArray integration, Portworx has now reduced the number of API calls it makes to the arrays endpoint on FA. This improvement was included in Portworx version 3.0.2 and now is backported to 2.13.11. |
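As a reference for PWX-33551 above, here is a minimal sketch showing where PURE_REST_TIMEOUT goes and how the execution timeout is adjusted. The file name, cluster name, namespace, and the timeout values are placeholders; confirm suitable values with Portworx support.

```bash
# Sketch only: add the PURE_REST_TIMEOUT environment variable to the
# StorageCluster (merge into your existing spec, for example with
# `kubectl edit storagecluster <name> -n <namespace>`).
cat <<'EOF' > pure-rest-timeout-fragment.yaml
spec:
  env:
  - name: PURE_REST_TIMEOUT
    value: "30"          # placeholder value, in seconds
EOF

# Update the matching execution timeout (in seconds) as described above.
pxctl cluster options update --runtime-options execution_timeout_sec=120
```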
Fixes
Issue Number | Issue Description |
---|---|
PWX-33828 | If you deleted a FA Direct Access PVC attached to an offline Portworx node, Portworx removed the associated volume from its KVDB. However, the FlashArray did not delete its associated volume because it remained connected to the offline node on the FlashArray. This created orphaned volumes on the FlashArray. Resolution: Portworx now detects a volume that is attached to an offline Portworx node and will disconnect it from all the nodes in the FlashArray and avoid orphaned volumes. |
PWX-33632 | If an attach request remained in the processing queue for a long time, it would lead to a panic, causing Portworx to restart on a node. This was because an FA attach operation involved making REST API calls to FA, as well as running iSCSI rescans, which consumed more time. When Portworx received a high volume of requests to attach FA DirectAccess volumes, the queue for these attach requests gradually grew over time, leading to a panic in Portworx. Resolution: The timeout for queued attach requests has been increased to 15 minutes for FA DirectAccess volumes. |
PWX-33898 | When two pods, both using the same RWO FA Direct Access volume, were started on two different nodes, Portworx would move the FADA volume attachment to the node where the most recent pod was running, rather than rejecting the setup request for the second pod. This resulted in a stale FADA multipath device remaining on the original node where the first pod was started, causing subsequent attach or mount requests on that node to fail. Resolution: A second pod request for the same RWO FA Direct Access volume on a different node will now be rejected if such a FA Direct Access volume is already attached and in use on another node. |
PWX-33631 | To distribute workloads evenly across all worker nodes during the provisioning of CSI volumes, Portworx obtained locks to synchronize requests across different worker nodes, which decreased CSI volume creation performance in heavily loaded clusters. Resolution: If CSI volume creation is slow, upgrade to this version. |
PWX-34277 | When an application pod using an FA Direct Access volume was failed over to another node, and Portworx was restarted on the original node, the pod on the original node became stuck in the Terminating state. This occurred because Portworx didn't clean up the mountpaths where the volume had previously been attached, as it couldn't locate the application on the local node. Resolution: Portworx now cleans up the mountpath even when the application is not found on the node. |
PWX-34334 | Cloudsnaps of an aggregated volume with a replication level of 2 or more uploaded incorrect data if one of the replica nodes from which a previous cloudsnap operation had been executed was down. Resolution: Portworx now forces a full backup in scenarios where the previous cloudsnap node is down. |
PWX-33935 | When the --sources option was used in the pxctl volume ha-update command for the aggregated volume, it caused the Portworx service processes to abort with an assertion. As a result, the Portworx service on all nodes in the cluster continuously kept restarting.Resolution: Contact the Portworx support team to restore your cluster. |
PWX-34025 | In certain cases, increasing the replication level of a volume on a PX-StoreV2 cluster created new replicas with non-zero blocks that were overwritten with zeros on the existing replicas. This caused the ext4 filesystem to report a mismatch and delayed allocation failures when a user application attempted to write data to the volume. Resolution: Users can now run the fsck operation to rectify the failures or remove the added replicas from the volume. This issue has been fixed in Portworx version 3.0.3 and now backported to 2.13.11. |
PWX-33451 | In certain cases, increasing the replication level of an aggregated volume failed to zero out specific blocks associated with stripes belonging to replication set 1 or higher, where zero data was expected. This caused the ext4 filesystem to report a mismatch and delayed allocation failures when a user application tried to write data to an aggregated Portworx volume. Resolution: Users can now run the fsck operation to rectify the failures or remove the added replicas from the aggregated volume. This issue has been fixed in Portworx version 3.0.2 and is now backported to 2.13.11. |
PWX-32572 | When using the older Containerd versions (v1.4.x or 1.5.x), Portworx kept opening connections to Containerd, eventually depleting all the file-descriptors available on the system. This caused the Portworx nodes to crash with the too many open files error. Resolution: Portworx no longer leaks the file-descriptors when working with older Containerd versions. This issue has been fixed in Portworx version 3.0.2 and is now backported to 2.13.11. |