3.1.4
Aug 15, 2024
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-37590 | Users running on environments with multipath version 0.8.8 and using FlashArray devices, either as Direct Access Volumes or Cloud Drive Volumes, may have experienced issues with the multipath device not appearing in time. User Impact: Users saw Portworx installations or Volume creation operations fail. Resolution: Portworx is now capable of running on multipath version 0.8.8. Components: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
3.1.3
July 16, 2024
Visit these pages to see if you're ready to upgrade to this version:
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-37576 | Portworx has significantly reduced the number of vSphere API calls during the booting process and pool expansion. | Drive & Pool Management |
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-37870 | When PX-Security is enabled on a cluster that is also using Vault for storing secrets, the in-tree provisioner (kubernetes.io/portworx-volume) fails to provision a volume. User Impact: PVCs became stuck in a Pending state with the following error: failed to get token: No Secret Data found for Secret ID. Resolution: Use the CSI provisioner (pxd.portworx.com) to provision volumes on clusters that have PX-Security enabled (see the StorageClass sketch after this table). Components: Volume Management Affected Versions: 3.0.3, 3.1.2 | Major |
PWX-37799 | A KVDB failure sometimes caused Portworx to restart when creating cloud backups. User Impact: Users saw Portworx restart unexpectedly. Resolution: Portworx now raises an alert, notifying users of a backup failure instead of unexpectedly restarting. Components: Cloudsnaps Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-37661 | If the credentials provided in px-vsphere-secret were invalid, Portworx failed to create a Kubernetes client, and the process would restart every few seconds, leading to continuous login failures. User Impact: Users saw a large number of client creation attempts, which may have led to the credentials being blocked or too many API calls. Resolution: If the credentials are invalid, Portworx now waits for the secret to be changed before trying to log in again. Components: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-37339 | Sharedv4 service failover did not work correctly when a node had a link-local IP from the subnet 169.254.0.0/16. In clusters running OpenShift 4.15 or later, Kubernetes nodes may have a link-local IP from this subnet by default. User Impact: Users saw disruptions in applications utilizing sharedv4-service volumes when the NFS server node went down. Resolution: Portworx has been improved to prevent VM outages in such situations. Components: Sharedv4 Affected Versions: 3.1.0.2 | Major |
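As a companion to PWX-37870, the following is a minimal sketch of a StorageClass that uses the CSI provisioner instead of the in-tree provisioner. The class name and the repl parameter are illustrative assumptions; depending on your token setup, PX-Security clusters may also require the standard csi.storage.k8s.io secret parameters, which are not shown here.

```bash
# Minimal sketch only: the class name and repl value are assumptions.
# The provisioner name pxd.portworx.com comes from the resolution for PWX-37870.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-csi-secure
provisioner: pxd.portworx.com   # CSI provisioner instead of kubernetes.io/portworx-volume
parameters:
  repl: "2"
allowVolumeExpansion: true
EOF
```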
3.1.2.1
July 8, 2024
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-37753 | Portworx reloaded and reconfigured VMs on every boot, which is a costly activity in vSphere. User Impact: Users saw a significant number of VM reload and reconfigure activities during Portworx restarts, which sometimes overwhelmed vCenter. Resolution: Portworx has been optimized to minimize unnecessary reload and reconfigure actions for VMs. Now, these actions are mostly triggered only once during the VM's lifespan. Component: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-35217 | Portworx maintained two vSphere sessions at all times. These sessions would become idle after Portworx restarts, and vSphere would eventually clean them up. vSphere counts idle sessions toward its session limits, which caused an issue if all nodes restarted simultaneously in a large cluster. User Impact: In large clusters, users encountered the 503 Service Unavailable error if all nodes restarted simultaneously.Resolution: Portworx now actively terminates sessions after completing activities like boot and pool expansion. Note that in rare situations where Portworx might not close the sessions, users may still see idle sessions. These sessions are cleaned by vSphere based on the timeout settings of the user's environment. Component: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-36727 | When a user decommissioned a node, Portworx would process the node deletion in the background. For every volume delete or update operation, it checked whether all nodes marked as decommissioned had no remaining references to these volumes, which made node deletion take a long time. User Impact: The Portworx cluster went down as the KVDB node timed out. Resolution: The logic for decommissioning nodes has been improved to prevent such situations. Component: KVDB Affected Versions: 3.1.x, 3.0.x, 2.13.x | Minor |
3.1.2
June 19, 2024
Visit these pages to see if you're ready to upgrade to this version:
New features
Portworx by Pure Storage is proud to introduce the following new features:
- Customers can now migrate legacy shared volumes to sharedv4 service volumes.
- For FlashBlade Direct Access volumes, users can provide multiple NFS endpoints using the pure_nfs_endpoint parameter. This is useful when the same FlashBlade is shared across different zones in a cluster (see the sketch after this list).
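A minimal sketch of a FlashBlade Direct Access StorageClass that supplies multiple NFS endpoints through pure_nfs_endpoint. The backend parameter, the comma-separated endpoint format, and the addresses are assumptions for illustration; check the FlashBlade Direct Access documentation for the exact syntax supported in your release.

```bash
# Sketch only: the endpoint list format and addresses are assumptions.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-fb-direct-access
provisioner: pxd.portworx.com
parameters:
  backend: "pure_file"                      # FlashBlade Direct Access volumes
  pure_nfs_endpoint: "10.0.1.10,10.0.2.10"  # assumed: one endpoint per zone, comma-separated
allowVolumeExpansion: true
EOF
```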
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-33044 | Portworx will perform additional live VM migrations to ensure a KubeVirt VM always uses the block device directly by running the VM on the volume coordinator node. | Sharedv4 |
PWX-23390 | Stork will now raise events on a pod or VM object if it fails to schedule them in a hyperconverged fashion. | Stork and DR |
PWX-37113 | In KubeVirt environments, Portworx no longer triggers RebalanceJobStarted and RebalanceJobFinished alarms every 15 minutes due to the KubeVirt fix-vps job. Alarms are now raised only when the background job is moving replicas. | Storage |
PWX-36600 | The output of the rebalance HA-update process has been improved to display the state of each action during the process. | Storage |
PWX-36854 | The output of the pxctl volume inspect command has been improved. The Kind field can now be left empty inside the claimRef , allowing the output to include application pods that are using the volumes. | Storage |
PWX-33812 | Portworx now supports Azure PremiumV2_LRS and UltraSSD_LRS disk types. | Drive and Pool Management |
PWX-36484 | A new query parameter ce=azure has been added for Azure users to identify the cloud environment being used. The parameter ensures that the right settings and optimizations are applied based on the cloud environment (see the example after this table). | Install |
PWX-36714 | The timeout for switching licenses from floating to Portworx Enterprise has been increased, avoiding timeout failures. | Licensing |
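For PWX-36484, a sketch of passing ce=azure when pulling a spec from the Portworx install endpoint. Only the ce=azure parameter comes from this release note; the version path and the remaining query parameters are assumptions and will differ per environment.

```bash
# Illustrative only: adjust the version and other query parameters for your cluster.
curl -fsSL "https://install.portworx.com/3.1.2?operator=true&ce=azure&kbver=1.28.0" -o px-spec.yaml
kubectl apply -f px-spec.yaml
```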
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-36869 | When using a FlashArray on Purity 6.6.6 with NVMe-RoCE, a change in the REST API resulted in a deadlock in Portworx. User Impact: FlashArray Direct Access attachment operations never completed, and FlashArray Cloud Drive nodes failed to start. Resolution: Portworx now properly handles the changed API for NVMe and does not enter a deadlock. Component: FA-FB Affected Versions: 3.1.x, 3.0.x, 2.13.x | Critical |
PWX-37059 | In disaggregated mode, storageless nodes restarted every few minutes attempting to claim the storage driveset and ended up being unsuccessful. User Impact: Due to storageless node restarts, some customer applications experienced IO disruption. Resolution: When a storage node goes down, Portworx now stops storageless nodes from restarting in disaggregated mode, preventing them from claiming the storage driveset. Component: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-37351 | If the drive paths changed due to a node restart or a Portworx upgrade, it led to a storage down state on the node. User Impact: Portworx failed to restart because of the storage down state. Components: Drive & Pool Management Affected Versions: 3.1.0.3, 3.1.1.1 | Major |
PWX-36786 | An offline storageless node was auto-decommissioned under certain race conditions, making the cloud-drive driveset orphaned. User Impact: When Portworx started as a storageless node using this orphaned cloud-drive driveset, it failed to start since the node's state was decommissioned. Resolution: Portworx now auto-cleans such orphaned storageless cloud-drive drivesets and starts successfully. Component: Drive and Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-36887 | When one of the internal KVDB nodes was down for several minutes, Portworx added another node to the KVDB cluster. Portworx initially added the new KVDB member as a learner. If, for some reason, KVDB connectivity was lost for more than a couple of minutes after adding the learner, the learner stayed in the cluster and prevented a failover to a different KVDB node. User Impact: The third node was not able to join the KVDB cluster with the error Peer URLs already exists. KVDB continued to run with only two members.Resolution: When Portworx encounters the above error, it removes the failed learner from the cluster, thereby allowing the third node to join. Component: Internal KVDB Affected Versions: 3.0.x, 3.1.1 | Major |
PWX-36873 | When Portworx was using HashiCorp's Vault configured with Kubernetes or AppRole authentication, it attempted to automatically refresh the access tokens when they expired. If the Kubernetes Service Account was removed or the AppRole expired, the token-refresh kept failing, and excessive attempts to refresh it caused a crash of the Vault service on large clusters. User Impact: The excessive attempts to refresh the tokens caused a crash of the Vault service on large clusters. Resolution: Portworx nodes now detect excessive errors from the Vault service and will avoid accessing Vault for the next 5 minutes. Component: Volume Management Affected Versions: 3.0.5, 3.0.3 | Major |
PWX-36601 | Previously, the default timeout for rebalance HA-update actions was 30 minutes. This duration was insufficient for some very slow setups, resulting in HA-update failures. User Impact: The rebalance job for HA-update failed to complete. In some cases, the volume's HA-level changed unexpectedly. Resolution: The default rebalance HA-update timeout has been increased to 5 hours. Components: Storage Affected Versions: 2.13.x, 3.0.x, 3.1.x | Major |
PWX-35312 | In version 3.1.0, a periodic job that fetched drive properties caused an increase in the number of API calls across all platforms. User Impact: The API rate limits approached their maximum capacity more quickly, stressing the backend. Resolution: Portworx improved the system to significantly reduce the number of API calls on all platforms. Component: Cloud Drives Affected Versions: 3.1.0 | Major |
PWX-30441 | For AWS users, Portworx did not update the drive properties for gp2 drives that were converted to gp3 drives. User Impact: Because the IOPS of such drives changed but were not updated, pool expansion failed on these drives. Resolution: During the maintenance cycle that is required for converting gp2 drives to gp3, Portworx now refreshes the disk properties of these drives. Component: Cloud Drives Affected Versions: 3.1.x, 3.0.x, 2.13.x | Major |
PWX-36139 | During pool expansion with the add-drive operation using the CSI provider on a KVDB node, there is a possibility of the new drive getting the StorageClass of the KVDB drive instead of the data drive, if they are different. User Impact: In such a case, a drive might have been added but the pool expansion operation failed, causing some inconsistencies. Resolution: Portworx now takes the StorageClass of only the data drives present on the node. Component: Pool Management Affected Versions: 3.1.x, 3.0.x, 2.13.x | Minor |
Known issues (Errata)
Issue Number | Issue Description | Severity |
---|---|---|
PD-3031 | For an Azure cluster with storage and storageless nodes using Premium LRS or SSD drive types, when a user updates the Portworx StorageClass to use PremiumV2 LRS or Ultra SSD drive types, the changes might not reflect on the existing nodes.Workaround: StorageClass changes will apply only to the new nodes added to the cluster. For existing nodes, perform the following steps:
Affected versions: 3.1.2 | Major |
PD-3012 | If maxStorageNodesPerZone is set to a value greater than the current number of worker nodes in an AKS cluster, additional storage nodes in an offline state may appear post-upgrade due to surge nodes.Workaround: Manually delete any extra storage node entries created during the Kubernetes cluster upgrade by following thenode decommission process.Components: Cloud Drives Affected versions: 2.13.x, 3.0.x, 3.1.x | Major |
PD-3013 | Pool expansion may fail if a node is rebooted before the expansion process is completed, displaying errors such as drives in the same pool not of the same type . Workaround: Retry the pool expansion on the impacted node. Components: Drive and Pool Management Affected versions: 3.1.2 | Major |
PD-3035 | Users may encounter issues with migrations of legacy shared volumes to sharedv4 service volumes appearing stuck if performed on a decommissioned node. Workaround: If a node is decommissioned during a migration, the pods running on that node must be forcefully terminated to allow the migration to continue. Component: Sharedv4 Volumes Affected version: 3.1.2 | Major |
PD-3030 | In environments where multipath is used to provision storage disks for Portworx, incorrect shutdown ordering may occur, causing multipath to shut down before Portworx. This can lead to situations where outstanding IOs from applications, still pending in Portworx, may fail to reach the storage disk. Workaround:
Affected Versions: 3.1.2 | Major |
3.1.1
April 03, 2024
Visit these pages to see if you're ready to upgrade to this version:
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-35939 | For DR clusters, the cluster domain of the nodes is exposed in the node inspect and node enumerate SDK responses. This information is used by the operator to create the pod disruption budget, preventing loss during Kubernetes upgrades. | DR and Migration |
PWX-35395 | When Portworx encounters errors like checksum mismatches or bad disk sectors while reading data from the backend disk, the IOOperationWarning alert is raised. This alert is tracked by the metric px_alerts_iooperationwarning_total. | Storage |
PWX-35738 | Portworx now queries an optimized subset of VMs to determine the driveset to attach, avoiding potential errors during an upgrade where a transient state of a VM could have resulted in an error during boot. | Cloud Drives |
PWX-35397 | The start time for Portworx on both Kubernetes and vSphere platforms has been significantly reduced by eliminating repeated calls to the Kubernetes API and vSphere servers. | Cloud Drives |
PWX-35042 | The Portworx CLI has been enhanced with the following improvements:
| Cloud Drives |
PWX-33493 | For pool expansion operations with the pxctl sv pool expand command, the add-disk and resize-disk flags have been renamed to add-drive and resize-drive, respectively. The command continues to support the old flags for compatibility (see the example after this table). | Cloud Drives |
PWX-35351 | The OpenShift Console now displays the Used Space for CSI sharedV4 volumes. | Sharedv4 |
PWX-35187 | Customers can now obtain the list of Portworx images from the spec generator. | Spec Generator |
PWX-36543 | If the current license is set to expire within the next 60 days, Portworx now automatically updates the IBM Marketplace license to a newer one upon the restart of the Portworx service. | Licensing |
PWX-36496 | The error messages for pxctl license activate have been improved to return a more appropriate error message in case of double activation. | Licensing |
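For PWX-33493, a sketch of a pool expansion using the renamed resize-drive operation. The pool UID and size are placeholders; the older resize-disk value continues to work for compatibility.

```bash
# Look up the pool UID, then expand it with the renamed operation (PWX-33493).
pxctl service pool show
pxctl service pool expand --uid <pool-uid> --operation resize-drive --size 500   # target size is a placeholder
```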
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-36416 | When a PX-StoreV2 pool reached its full capacity and could not be expanded further using the resize-drive option, it went offline due to a pool full condition. User Impact: If pool capacity reached a certain threshold, the pool went offline. Resolution: Since PX-StoreV2 pools cannot be expanded using the add-drive operation, you can increase the capacity on a node by adding new pools to it:
Affected Versions: 3.0.0 | Critical |
PWX-36344 | A deadlock in the Kubernetes Config lock led to failed pool expansion. User Impact: Customers needed to restart Portworx if pool expansion became stuck. Resolution: An unbuffered channel that resulted in a deadlock when written to in a very specific window is now changed to have a buffer, breaking the deadlock. Components: Pool Management Affected Versions: 2.13.x, 3.0.x | Major |
PWX-36393 | Occasionally, Portworx CLI binaries were installed incorrectly due to issues (e.g., read/write errors) that the installation process failed to detect, causing the Portworx service to not start. User Impact: Portworx upgrade process failed. Resolution: Portworx has improved the installation process by ensuring the correct installation of CLI commands and detecting these errors during the installation. Components: Install Affected Versions: 2.13.x, 3.0.x | Major |
PWX-36339 | For a sharedv4 service pod, there was a race condition where the cached mount table failed to reflect the unmounting of the path. User Impact: Pod deletion got stuck in the Terminating state, waiting for the underlying mount point to be deleted. Resolution: Force refresh of cache for an NFS mount point if it is not attached and is already unmounted. This will ensure that the underlying mount path gets removed and the pod terminates cleanly. Components: Sharedv4 Affected versions: 2.13.x, 3.0.x | Major |
PWX-36522 | When FlashArray Direct Access volumes and FlashArray Cloud Drive volumes were used together, the system couldn't mount the PVC due to an Invalid arguments for mount entry error, causing the related pods to not start. User Impact: Application pods failed to start. Resolution: The mechanism to populate the mount table on restart has been changed to ensure an exact device match rather than a prefix-based search, addressing the root cause of the incorrect mount entries and subsequent failures. Components: Volume Management Affected version: 3.1.0 | Major |
PWX-36247 | The field portworx.io/misc-args had an incorrect value of -T dmthin instead of -T px-storev2 to select the backend type. User Impact: Customers had to manually change this argument to -T px-storev2 after generating the spec from the spec generator. Resolution: The value for the field has been changed to -T px-storev2. Components: FA-FB Affected version: 3.1.0 | Major |
PWX-35925 | When downloading the air-gapped bootstrap script for an OEM release (e.g., px-essentials), the script used an incorrect URL for the Portworx images. User Impact: The air-gapped bootstrap script fetched the incorrect Portworx image, particularly for Portworx Essentials. Resolution: The air-gapped bootstrap script has been fixed and now correctly handles OEM release images. Components: Install Affected version: 2.13.x, 3.0.x | Major |
PWX-35782 | In a synchronous DR setup, a node repeatedly crashed during a network partition because Portworx attempted to operate on a node from the other domain that was offline and unavailable. User Impact: In the event of a network partition between the two domains, temporary node crashes could occur. Resolution: Portworx now avoids nodes from the other domain that are offline or unavailable. Components: DR and Migration Affected version: 3.1.0 | Major |
PWX-36500 | Older versions of Portworx installations with FlashArray Cloud Drive displayed an incorrect warning message in the pxctl status output on RHEL 8.8 and above OS versions, even though the issue had been fixed in the multipathd package that comes with these OS versions.User Impact: With Portworx version 2.13.0 or above, users on RHEL 8.8 or higher who were using FlashArray Cloud Drives saw the following warning in the pxctl status output: WARNING: multipath version 0.8.7 (between 0.7.7 and 0.9.3) is known to have issues with crashing and/or high CPU usage. If possible, please upgrade multipathd to version 0.9.4 or higher to avoid this issue .Resolution: The output of pxctl status has been improved to display the warning message for the correct RHEL versions.Components: FA-FB Affected version: 2.13.x, 3.0.x, 3.1.0 | Major |
PWX-33030 | For FlashArray Cloud Drives, when the skip_kpartx flag was set in the multipath config, the partition mappings for device mapper devices did not load, preventing Portworx from starting correctly. User Impact: This resulted in a random device (either a child or a parent/dm device) with the UUID label being selected and attempted to be mounted. If a child device was chosen, the mount would fail with a Device is busy error. Resolution: Portworx now avoids such a situation by modifying the specific unbuffered channel to include a buffer, thus preventing the deadlock. Components: FA-FB Affected version: 2.13.x, 3.0.x | Minor |
3.1.0.1
March 20, 2024
Visit these pages to see if you're ready to upgrade to this version:
This is a hotfix release intended for IBM Cloud customers. Please contact the Portworx support team for more information.
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-36260 | When installing Portworx version 3.1.0 from the IBM Marketplace catalog, the PX-Enterprise IBM Cloud license for a fresh installation is valid until November 30, 2026. However, for existing clusters that were running older versions of Portworx and upgraded to 3.1.0, the license did not automatically update to reflect the new expiry date of November 30, 2026.User Impact: With the old license expiring on April 2, 2024, Portworx operations could be affected after this date. Resolution: To extend the license until November 30, 2026, follow the instructions on the Upgrading Portworx on IBM Cloud via Helm page to update to version 3.1.0.1. Components: Licensing Affected versions: 2.13.x, 3.0.x, 3.1.0 | Critical |
3.1.0
January 31, 2024
Visit these pages to see if you're ready to upgrade to this version:
Starting with version 3.1.0:
- Portworx CSI for FlashArray and FlashBlade license SKU will only support Direct Access volumes and no Portworx volumes. If you are using Portworx volumes, reach out to the support team before upgrading Portworx.
- Portworx Enterprise will exclusively support kernel versions 4.18 and above.
New features
Portworx by Pure Storage is proud to introduce the following new features:
- The auto_journal profile is now available to detect the IO pattern and determine whether the journal IO profile is beneficial for an application. This detector analyzes the incoming write IO pattern to ascertain whether the journal IO profile would improve the application's performance. It continuously analyzes the write IO pattern and toggles between the none and journal IO profiles as needed (see the first sketch after this list).
- A dynamic labeling feature is now available, allowing Portworx users to label Volume Placement Strategies (VPS) flexibly and dynamically. Portworx now supports the use of dynamic labeling through the inclusion of ${pvc.labels.labelkey} in values (see the second sketch after this list).
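A sketch of requesting the new auto_journal profile through a StorageClass io_profile parameter. The class name and repl value are assumptions; the io_profile parameter usage reflects typical Portworx StorageClass definitions, and only the auto_journal value itself comes from this release note.

```bash
# Sketch only: the class name and repl value are assumptions.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-auto-journal
provisioner: pxd.portworx.com
parameters:
  repl: "2"
  io_profile: "auto_journal"   # toggles between none and journal based on the write pattern
EOF
```

A sketch of a VolumePlacementStrategy that uses the dynamic ${pvc.labels.<labelkey>} substitution. The apiVersion, the replicaAffinity rule, and the zone label key are assumptions based on typical VPS definitions; only the substitution syntax comes from the feature description above.

```bash
# Sketch only: the rule shape and label key are assumptions; the value
# "${pvc.labels.zone}" is resolved from the label "zone" on the requesting PVC.
cat <<'EOF' | kubectl apply -f -
apiVersion: portworx.io/v1beta2
kind: VolumePlacementStrategy
metadata:
  name: px-dynamic-label-vps
spec:
  replicaAffinity:
  - matchExpressions:
    - key: zone
      operator: In
      values:
      - "${pvc.labels.zone}"
EOF
```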
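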
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-31558 | Google Anthos users can now generate the correct Portworx spec from Portworx Central, even when storage device formats are incorrect. | Spec Generation |
PWX-28654 | Added the NonQuorumMember flag to the node inspect and Enumerate SDK API responses. This flag provides an accurate value depending on whether a node contributes to cluster quorum. | SDK/gRPC |
PWX-31945 | Portworx now provides an internal API for listing all storage options on the cluster. | SDK/gRPC |
PWX-29706 | Portworx now supports a new streaming Watch API that provides updates on volume information that has been created, modified, or deleted. | SDK/gRPC |
PWX-35071 | Portworx now distinguishes between FlashArray and FlashBlade calls, routing them to appropriate backends based on the current volume type (file or block), thereby reducing the load on FlashArray or FlashBlade backends. | FA-FB |
PWX-34033 | For FlashArray and FlashBlade integrations, many optimizations have been made in caching and information sharing, resulting in a significant reduction in number of REST calls made to the backing FlashArray and FlashBlade. | FA-FB |
PWX-35167 | The default timeout for the FlashBlade Network Storage Manager (NSM) lock has been increased to prevent Portworx restarts. | FA-FB |
PWX-30083 | Portworx now manages the TTL for alerts instead of relying on etcd's key expiry mechanism. | KVDB |
PWX-33430 | The error message displayed when a KVDB lock times out has been made more verbose to provide a better explanation. | KVDB |
PWX-34248 | The sharedv4 parameter in a StorageClass enables users to choose between sharedv4 and non-shared volumes:
| Sharedv4 |
PWX-35113 | Users can now enable the forward-nfs-attach-enable storage option for applications using sharedv4 volumes. This allows Portworx to attach a volume to the most suitable available nodes. | Sharedv4 |
PWX-32278 | On the destination cluster, all snapshots are now deleted during migration when the parent volume is deleted. | Stork |
PWX-32260 | The resize-disk option for pool expansion is now also available on TKGS clusters. | Cloud Drives |
PWX-32259 | Portworx now uses cloud provider identification by reusing the provider's singleton instance, avoiding repetitive checks if the provider type is already specified in the cluster spec. | Cloud Drives |
PWX-35428 | In environments with slow vCenter API responses, Portworx now caches specific vSphere API responses, reducing the impact of these delays. | Cloud Drives |
PWX-33561 | When using the PX-StoreV2 backend, Portworx now detaches partially attached driversets for cloud-drives only when the cloud-drives are not mounted. | Cloud Drives |
PWX-33042 | In a disaggregated deployment, storageless nodes can be converted to storage nodes by changing the node label to portworx.io/node-type=storage (see the first example after this table). | Cloud Drives |
PWX-28191 | AWS credentials for Drive Management can now be provided through a Kubernetes secret named px-aws in the same namespace where Portworx is deployed (see the second example after this table). | Cloud Drives |
PWX-34253 | Azure users will now see accurate storage type displays: Premium_LRS is identified as SSD, and NVME storage is also correctly represented. | Cloud Drives |
PWX-31808 | Pool deletion is now allowed for vSphere cloud drives. | Cloud Drives |
PWX-32920 | vSphere drives can now be resized up to a maximum of 62 TB per drive. | Pool Management |
PWX-32462 | Portworx now permits most overlapping mounts and will only reject overlapping mounts if a bidirectional (i.e., shared) parent directory mount is present. | px-runc |
PWX-32905 | Portworx now properly detects the NFS service on OpenShift platforms. | px-runc |
PWX-35292 | To reduce log volume in customer clusters, logs generated when a volume is not found during CSI mounting have been moved to the TRACE level. | CSI |
PWX-34995 | Portworx CSI for FlashArray and FlashBlade license SKU now counts Portworx and FA/FB drives separately based on the drive type. | Licensing |
PWX-35452 | The mount mapping's lock mechanism has been improved to prevent conflicts between unmount and mount processes, ensuring more reliable pod start-ups. | Volume Management |
PWX-33577 | The fstrim operation has been improved for efficiency:
| Storage |
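For PWX-33042, a minimal example of converting a storageless node to a storage node in a disaggregated deployment by applying the node label named in the improvement; the node name is a placeholder.

```bash
# The label key/value comes from PWX-33042; <node-name> is a placeholder.
kubectl label node <node-name> portworx.io/node-type=storage --overwrite
```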
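For PWX-28191, a sketch of creating the px-aws secret used for Drive Management credentials. The secret name comes from the improvement; the namespace and the key names are assumptions, so match them to your Portworx deployment and documentation.

```bash
# Sketch only: the namespace and key names are assumptions.
kubectl create secret generic px-aws \
  --namespace kube-system \
  --from-literal=AWS_ACCESS_KEY_ID=<access-key-id> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret-access-key>
```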
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-31652 | Portworx was unable to identify the medium for the vSphere cloud drives. User Impact: Portworx deployment failed on vSphere with cloud drives. Resolution: Portworx now successfully identifies the drive medium type correctly and can be deployed on a cluster with vSphere cloud drives. Components: Drive & Pool Management Affected Versions: 2.13.x | Critical |
PWX-35430 | Requests for asynchronous DR migration operations were previously load balanced to nodes that were not in the same cluster domain. User Impact: In hybrid DR setups, such as one where cluster A is synchronously paired with cluster B, and cluster B is asynchronously paired with cluster C, any attempts to migrate from Cluster B to Cluster C would result in failure, showing an error that indicates a BackupLocation not found .Resolution: Portworx now ensures that migration requests are load balanced within nodes in the same cluster domain as the initial request. Components: DR and Migration Affected Versions: 3.0.4 | Critical |
PWX-35277 | In an asynchronous DR deployment, if security/auth is enabled in a Portworx cluster, migrations involving multiple volumes would fail with authentication errors. User Impact: Migrations in asynchronous DR setups involving multiple volumes failed with authentication errors. Resolution: Authentication logic has been modified to handle migrations involving multiple volumes on the auth enabled clusters. Components: DR and Migrations Affected versions: 3.0.0 | Critical |
PWX-34369 | When using HTTPS endpoints for cluster pairing, Portworx incorrectly parsed the HTTPS URL scheme. User Impact: Cluster pairing would fail when using an HTTPS endpoint. Resolution: Portworx has now corrected the HTTPS URL parsing logic. Components: DR and Migration Affected versions: 3.0.0 | Critical |
PWX-35466 | Cloudsnaps or asynchronous DR operations failed when attempted from a metro cluster due to inaccessible credentials. This issue specifically occurred if the credential was not available from both domains of the metro cluster. User Impact: Cloudsnap operations or asynchronous DR from metro clusters could fail if the required credentials were not accessible in both domains. Resolution: Portworx now detects a coordinator node that has access to the necessary credentials for executing cloudsnaps or asynchronous DR operations. Components: DR and Migration Affected versions: 3.0.4 | Critical |
PWX-35324 | FlashArray Direct Access volumes are formatted upon attachment. All newly created volumes remain in a pending state until they are formatted. If Portworx was restarted before a volume had been formatted, it would delete the volume that was still in the pending state. User Impact: The newly created FlashArray Direct Access volumes were deleted. Resolution: Portworx now avoids deleting volumes that are in the pending state. Components: FA-FB Affected versions: 3.0.x | Critical |
PWX-35279 | Upon Portworx startup, if there were volumes attached from a FlashArray that was not registered in the px-pure-secret , Portworx would detach them as part of a cleanup routine.User Impact: Non-Portworx disks, including boot drives and other FlashArray volumes, were mistakenly detached from the node and required reconnection. Resolution: Portworx no longer cleans up healthy FlashArray volumes on startup. Components: FA-FB Affected versions: 2.13.11, 3.0.0, 3.0.4 | Critical |
PWX-34377 | Portworx was incorrectly marking FlashBlade Direct Access volumes as being transitioned to read-only status. This incorrect identification led to a restart of all pods associated with these volumes. User Impact: The restart of running pods resulted in application restarts or failures. Resolution: Checks within Portworx that were leading to false identification of read-only transitions for FlashBlade volumes have been fixed. Components: FA-FB Affected versions: 3.0.4 | Critical |
PWX-32881 | The CSI driver failed to register after the Anthos storage validation test suite was removed and a node was re-added to the cluster. User Impact: The CSI server was unable to restart if the Unix domain socket had been deleted. Resolution: The CSI server now successfully restarts and restores the Unix domain socket, even if the socket has been deleted. Update to this version if your workload involves deleting the kubelet directory during node decommissioning.Components: CSI Affected versions: 3.0.0 | Critical |
PWX-31551 | The latest OpenShift installs have stricter SELinux policies, which prevent non-privileged pods from accessing the csi.sock CSI interface file. User Impact: Portworx install failed. Resolution: All Portworx CSI pods are now configured as privileged pods. Components: oci-monitor Affected versions: 2.13.x, 3.0.x | Critical |
PWX-31842 | On TKGI clusters, if Portworx service and pods were restarted, it led to excessive mounts (mount-leaks). User Impact: The IO operations on the node would progressively slow down, until the host would completely hang. Resolution: The mountpoints that are used by Portworx have been changed. Components: oci-monitor Affected versions: 2.1.1 | Critical |
PWX-35603 | When running Portworx on older Linux systems (specifically those using GLIBC 2.31 or older) in conjunction with newer versions of Kubernetes, Portworx previously failed to detect dynamic updates of pod credentials and tokens, which led to Unauthorized errors when utilizing Kubernetes client APIs. User Impact: Users could encounter Unauthorized errors when using Kubernetes client APIs. Resolution: Dynamic token updates are now processed correctly by Portworx. Components: oci-monitor Affected versions: 3.0.1 | Critical |
PWX-34250 | If encryption was applied on both the client side (using an encryption passphrase) and the server side (using Server-Side Encryption, SSE) for creating credential commands, this approach failed to configure S3 storage in Portworx to use both encryption methods. User Impact: Configuration of S3 storage would fail in the above mentioned condition. Resolution: Users can now simultaneously use both server-side and client-side encryption when creating credentials for S3 or S3-compatible object stores. Components: Cloudsnaps Affected versions: 3.0.2, 3.0.3, 3.0.4 | Critical |
PWX-22870 | Portworx installations would by default automatically attempt to install NFS packages on the host system. However, since NFS packages add new users/groups, they were often blocked on Red Hat Enterprise Linux / CentOS platforms with SELinux enabled. User Impact: Sharedv4 volumes failed to attach on platforms with SELinux enabled. Resolution: Portworx installation is now more persistent on Red Hat Enterprise Linux / CentOS platforms with SELinux enabled. Components: IPV6 Affected versions: 2.5.4 | Major |
PWX-35332 | Concurrent access to an internal data structure containing NFS export entries resulted in a Portworx node crashing with the fatal error: concurrent map read and map write in knfs.HasExports error.User Impact: This issue triggered a restart of Portworx on that node. Resolution: A lock mechanism has been implemented to prevent this issue. Components: Sharedv4 Affected versions: 2.10.0 | Major |
PWX-34865 | When upgrading Portworx from version 2.13 (or older) to version 3.0 or newer, the internal KVDB version was also updated. If there was a KVDB membership change during the upgrade, the internal KVDB lost quorum in some corner cases. User Impact: The internal KVDB lost quorum, enforcing Portworx upgrade of a KVDB node that was still on an older Portworx version. Resolution: In some cases, Portworx now chooses a different mechanism for the KVDB membership change. Components: KVDB Affected versions: 3.0.0 | Major |
PWX-35527 | When a Portworx KVDB node went down and subsequently came back online with the same node ID but a new IP address, Portworx nodes on the other servers continued to use the stale IP address for connecting to KVDB. User Impact: Portworx nodes faced connection issues while connecting to the internal KVDB, as they attempted to use the outdated IP address. Resolution: Portworx now updates the correct IP address on such nodes. Component: KVDB Affected versions: 2.13.x, 3.0.x | Major |
PWX-33592 | Portworx incorrectly applied the time set by the execution_timeout_sec option. User Impact: Some operations timed out before the time set through the execution_timeout_sec option. Resolution: The behavior of this runtime option is now fixed. Components: KVDB Affected versions: 2.13.x, 3.0.x | Major |
PWX-35353 | Portworx installations (version 3.0.0 or newer) failed on Kubernetes systems using Docker container runtime versions older than 20.10.0. User Impact: Portworx installation failed on Docker container runtimes older than 20.10.0. Resolution: Portworx can now be installed on older Docker container runtimes. Components: oci-monitor Affected versions: 3.0.0 | Major |
PWX-33800 | In Operator version 23.5.1, Portworx was configured so that a restart of the Portworx pod would also trigger a restart of the portworx.service backend.User Impact: This configuration caused disruptions in storage operations. Resolution: Now pod restarts do not trigger a restart of the portworx.service backend.Components: oci-monitor Affected versions: 2.6.0 | Major |
PWX-32378 | During the OpenShift upgrade process, the finalizer service, which ran when Portworx was not processing IOs, experienced a hang and subsequently timed out. User Impact: This caused the OpenShift upgrade to fail. Resolution: The Portworx service now runs to stop Portworx and sets the PXD_timeout during OpenShift upgrades. Components: oci-monitor Affected versions: 2.13.x, 3.0.x | Major |
PWX-35366 | When the underlying nodes of an OKE cluster were replaced multiple times (due to upgrades or other reasons), Portworx failed to start, displaying the error Volume cannot be attached, because one of the volume attachments is not configured as shareable .User Impact: Portworx became unusable on nodes that were created to replace the original OKE worker nodes. Resolution: Portworx now successfully starts on such nodes. Components: Cloud Drives Affected versions: 2.13.x, 3.0.x | Major |
PWX-33413 | After an upgrade, when the case of a zone name changed, Portworx considered it to be a new zone. User Impact: Portworx's calculation of the total storage in the cluster became inaccurate. Resolution: Portworx now considers zone names with the same spelling, regardless of case, to be the same zone. For example, Zone1, zone1, and ZONE1 are all considered the same zone. Components: Cloud Drives Affected versions: 2.12.1 | Major |
PWX-33040 | For Portworx users using cloud drives on the IBM platform, when the IBM CSI block storage plugin was unable to successfully bind Portworx cloud-drive PVCs (for any reason), these PVCs remained in a pending state. As a retry mechanism, Portworx created new PVCs. Once the IBM CSI block storage plugin was again able to successfully provision drives, all these PVCs got into a bound state.User Impact: A large number of unwanted block devices were created in users' IBM accounts. Resolution: Portworx now cleans up unwanted PVC objects during every restart and KVDB failover. Components: Cloud Drives Affected versions: 2.13.0 | Major |
PWX-35114 | The storageless node could not come online after Portworx was deployed and showed the failed to find any available datastores or datastore clusters error.User Impact: Portworx failed to start on the storageless node which had no access to a datastore. Resolution: Storageless nodes can now be deployed without any access to a datastore. Components: Cloud Drives Affected versions: 2.13.x, 3.0.x | Major |
PWX-33444 | If a disk that was attached to a node became unavailable, Portworx continuously attempted to find the missing drive-set. User Impact: Portworx failed to restart. Resolution: Portworx now ignores errors related to missing disks and attempts to start by attaching to the available driveset, or it creates a new driveset if suitable drives are available on the node. Components: Cloud Drives Affected versions: 2.13.x, 3.0.x | Major |
PWX-33076 | When more than one container mounted a Docker volume, all of them mounted to the same path because the mount path was not unique; it contained only the volume name. User Impact: When one container went offline, the volume was unmounted for the other containers mounted to the same volume. Resolution: The volume mount HTTP request ID is now appended to the path, which makes the path unique for every mount of the same volume. Components: Volume Management Affected versions: 2.13.x, 3.0.x | Major |
PWX-35394 | Host detach operation on the volume failed with the error HostDetach: Failed to detach volume .User Impact: A detach or unmount operation on a volume would get stuck if attach and detach operations were performed in quick succession, leading to incomplete unmount operations. Resolution: Portworx now reliably handles detach or unmount operations on a volume, even when attach and detach operations are performed in quick succession. Components: Volume Management Affected Versions: 2.13.x, 3.0.x | Major |
PWX-32369 | In a synchronous DR setup, cloudsnaps with different objectstores for each domain failed to back up and clean up the expired cloudsnaps. User Impact: The issue occurred because a single node, which did not have access to both objectstores, was performing cleanup of the expired cloudsnaps. Resolution: Portworx now designates two nodes, one in each domain, to perform the cleanup of the expired cloudsnaps. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Major |
PWX-35136 | During cloudsnap deletions, some objects were not removed because the deletion requests exceeded the S3 API's limit for the number of objects that could be deleted at once. User Impact: This would leave objects on S3 for deleted cloudsnaps, thereby consuming S3 capacity. Resolution: Portworx has been updated to ensure that deletion requests do not exceed the S3 API's limit for the number of objects that can be deleted. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Major |
PWX-34654 | Cloudsnap status returned empty results without any error for a taskID that was no longer in the KVDB. User Impact: No information was provided for users to take corrective actions. Resolution: Portworx now returns an error instead of empty status values. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Major |
PWX-31078 | When backups were restored to a namespace different from the original volume's, the restored volumes retained labels indicating the original namespace, not the new one. User Impact: The functionality of sharedv4 volumes was impacted because the labels did not accurately reflect the new namespace in which the volumes were located. Resolution: Labels for the restored volume have been fixed to reflect the correct namespace in which the volume resides. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Major |
PWX-32278 | During migration, in certain error scenarios an orphan snapshot was left behind on the destination cluster even though the parent volume was not present. User Impact: This could lead to an increase in capacity usage. Resolution: Such orphan cloudsnaps are now deleted when the parent volume is deleted. Components: Asynchronous DR Affected versions: 2.13.x, 3.0.x | Major |
PWX-35084 | Portworx incorrectly determined the number of CPU cores when running on hosts enabled with cGroupsV2. User Impact: This created issues when limiting the CPU resources, or pinning the Portworx service to certain CPU cores. Resolution: Portworx now properly determines number of available CPU cores. Components: px-runc Affected versions: 3.0.2 | Major |
PWX-32792 | On OpenShift 4.13, Portworx did not proxy portworx-service logs. It kept journal logs from multiple machine IDs, which caused the Portworx pod to stop proxying the logs from portworx.service .User Impact: In OpenShift 4.13, the generation of journal logs from multiple machine IDs led to the Portworx pod ceasing to proxy the logs from portworx.service .Resolution: Portworx log proxy has been fixed to locate the correct journal log using the current machine ID. Components: Monitoring Affected versions: 2.13.x, 3.0.x | Major |
PWX-34652 | During the ha-update process, all existing volume labels were removed and could not be recovered.User Impact: This resulted in the loss of all volume labels, significantly impacting volume management and identification. Resolution: Volume labels now do not change during the ha-update process.Components: Storage Affected versions: 2.13.x, 3.0.x | Major |
PWX-34710 | A large amount of log data was generated during storage rebalance jobs or dry runs. User Impact: This led to log files occupying a large amount of space. Resolution: The volume of logging data has been reduced by 10%. Components: Storage Affected versions: 2.13.x, | Major |
PWX-34821 | In scenarios where the system is heavily loaded and imbalanced, elevated syncfs latencies were observed. This situation led to the fs_freeze call, responsible for synchronizing all dirty data, timing out before completion.User Impact: Users experienced timeouts during the fs_freeze call, impacting the normal operation of the system.Resolution: Restart the system and retry the snapshot operation. Components: Storage Affected versions: 3.0.x | Major |
PWX-33647 | When the Portworx process is restarted, it verifies the existing mounts on the system for sanity. If one of the mounts was an NFS mount of a Portworx volume, the mount point verification would hang because Portworx was still in the process of starting up. User Impact: The Portworx process would not come up and would enter an infinite wait, waiting for the mount point verification to return. Resolution: When Portworx is starting up, it now skips the verification of Portworx-backed mount points to allow the startup process to continue. Components: Storage Affected versions: 3.0.2 | Major |
PWX-33631 | Portworx applied locking mechanisms to synchronize requests across different worker nodes during the provisioning of CSI volumes in order to distribute workloads evenly. User Impact: This synchronization approach led to a decrease in performance for CSI volume creation in heavily loaded clusters. Resolution: If experiencing slow CSI volume creation, upgrade to this version. Components: CSI Affected versions: 2.13.x, 3.0.x | Major |
PWX-34355 | On certain occasions, while mounting a FlashArray cloud drive disk backing a storage pool, Portworx used the single-path device instead of the multipath device. User Impact: Portworx entered the StorageDown state. Resolution: Portworx now identifies the multipath device associated with a given device name and uses this multipath device for mounting operations. Components: FA-FB Affected versions: 2.10.0, 2.11.0, 2.12.0, 2.13.0, 2.13.11, 3.0.0 | Major |
PWX-34925 | Creating a large number of FlashBlade Direct Access volumes concurrently could lead to a restart of Portworx with the fatal error: sync: unlock of unlocked mutex error. User Impact: When trying to create a large number of FlashBlade volumes concurrently, the Portworx process might get restarted due to contention on the lock. Resolution: The locking mechanism has been improved to avoid this error. Components: FA-FB Affected versions: 3.0.4 | Major |
PWX-35680 | The Portworx spec generator was incorrectly defaulting telemetry to be disabled when the StorageCluster spec was generated outside of the Portworx Central UI. This does not affect customers who applied a storagecluster with an empty telemetry spec or generated their spec through the UI. User Impact: Telemetry was disabled by default. Resolution: To enable telemetry, users should explicitly specify it if intended. Components: Spec-Gen Affected versions: 2.12.0, 2.13.0, 3.0.0 | Major |
PWX-34325 | When operating Kubernetes with the containerd runtime and a custom root directory set in the containerd configuration, the installation of Portworx would fail.User Impact: Portworx install would fail, resulting in unusual error messages due to a bug in containerd. Resolution: The installation will now intercept the error message and replace it with a clearer message that includes suggestions on how to fix the Portworx configuration. Components: Installation Affected versions: 3.0.0 | Minor |
PWX-33557 | The CallHome functionality sometimes unconditionally attempted to send data to the local telemetry service. User Impact: This caused errors if telemetry was disabled. Resolution: CallHome now sends data only if telemetry has been enabled. Components: Monitoring Affected versions: 3.0.0 | Minor |
PWX-32536 | Portworx installation failed on certain Linux systems using cGroupsV2 and containerd container runtimes, as it was unable to properly locate container identifiers. User Impact: Portworx installation failed. Resolution: The container scanning process has been improved to ensure successful Portworx installation on such platforms. Components: oci-monitor Affected versions: 2.13.x, 3.0.x | Minor |
PWX-30967 | During volume provisioning, snapshot volume labels were included in the count. Nodes were disqualified for provisioning when a volume_anti_affinity or volume_affinity VPS was configured, resulting in volume creation failures. User Impact: When stale snapshots existed, the creation of volumes using a VPS with either the volume_anti_affinity or volume_affinity setting would fail. Resolution: Upgrade to this version and retry the previously failed volume creation request. Components: Stork Affected versions: 2.13.2 | Minor |
PWX-33999 | During the installation of NFS packages, Portworx incorrectly interpreted any issues or errors that occurred as timeout errors. User Impact: Portworx misrepresented and masked the original issues. Resolution: Portworx now accurately processes NFS installation errors during its installation. Components: px-runc Affected versions: 2.7.0 | Minor |
PWX-33008 | Creation of a proxy volume with CSI enabled and RWX access mode failed due to the default use of sharedv4 for all RWX volumes in CSI. User Impact: Users could not create proxy volumes with CSI enabled and RWX access mode. Resolution: To successfully create proxy volumes with CSI and RWX access mode, upgrade to this version. Components: Sharedv4 Affected versions: 3.0.0 | Minor |
PWX-34326 | The Portworx CSI Driver GetPluginInfo API returned an incorrect CSI version. User Impact: This resulted in confusion when the CSI version was retrieved by the Nomad CLI. Resolution: The Portworx CSI Driver GetPluginInfo API now returns the correct CSI version. Components: CSI Affected versions: 2.13.x,3.0.x | Minor |
PWX-31577 | Occasionally, when a user requested a cloudsnap to stop, it would lead to an incorrect increase in the available resources. User Impact: More cloudsnaps were started and became stuck in the NotStarted state as resources were unavailable. Resolution: Stopping cloudsnaps now does not incorrectly increase the available resources, thus avoiding the issue. Components: Cloudsnaps Affected versions: 2.13.x, 3.0.x | Minor |
Known issues (Errata)
Issue Number | Issue Description | Severity |
---|---|---|
PD-2673 | KubeVirt VM or container workloads may remain in the Starting state due to the remounting of volumes failing with a device busy error.Workaround:
Affected versions: 2.13.x, 3.0.x | Critical |
PD-2546 | In a synchronous DR deployment, telemetry registrations might fail on the destination cluster. Workaround:
Affected versions: 3.0.4 | Critical |
PD-2574 | If a disk is removed from an online pool using the PX-StoreV2 backend, it may cause a kernel panic. Workaround: To avoid kernel panic, do not remove disks from an online pool or node. Components: Storage Affected versions: NA | Critical |
PD-2387 | In OpenShift Container Platform (OCP) version 4.13 or newer, application pods using Portworx sharedv4 volumes can get stuck in the Terminating state. This is because kubelet is unable to stop the application container when an application namespace is deleted. Workaround: If a pod is stuck in the Terminating state, reboot the node on which the pod is running. Note that after rebooting, it might take several minutes for the pod to transition out of the Terminating state. Components: Sharedv4 Affected versions: 3.0.0 | Major |
PD-2621 | Occasionally, deleting a TKGi cluster with Portworx fails with the Warning: Executing errand on multiple instances in parallel. error.Workaround: Before deleting your cluster, perform the following steps:
Components: Kubernetes Integration Affected versions: | Major |
PD-2631 | After resizing a FlashArray Direct Access volume with a filesystem (such as ext4, xfs, or others) by a significant amount, you might not be able to detach the volume, or delete the pod using this volume. Workaround: Allow time for the filesystem resizing process to finish. After the resize is complete, retry the operations. Components: FA-FB Affected versions: 2.13.x, 3.0.x, 3.1.0 | Major |
PD-2597 | Online pool expansion with the add-disk operation might fail when using the PX-StoreV2 backend. Workaround: Put the pool into maintenance mode, then expand your pool capacity (see the first sketch after this table). Components: Storage Affected versions: 3.0.0, 3.1.0 | Major |
PD-2585 | The node wipe operation might fail with the Node wipe did not cleanup all PX signatures. A manual cleanup maybe required. error on a system where user-defined device names contain Portworx reserved keywords (such as pwx). Workaround: Rename or delete devices that use Portworx reserved keywords in their device names before retrying the node wipe operation. Furthermore, it is recommended not to use Portworx reserved keywords such as px, pwx, pxmd, px-metadata, pxd, or pxd-enc when setting up devices or volumes, to avoid encountering such issues. Components: Storage Affected versions: 3.0.0 | Major |
PD-2665 | During a pool expansion operation, if a cloud-based storage disk drive provisioned on a node is detached before the completion of the pool resizing or rebalancing, you can see the show drives: context deadline exceeded error in the output of the pxctl sv pool show command.Workaround: Ensure that cloud-based storage disk drives involved in pool expansion operations remain attached until the resizing and rebalancing processes are fully completed. In cases where a drive becomes detached during this process, hard reboot the node to restore normal operations. Component: PX-StoreV2 Affected versions: 3.0.0, 3.1.0 | Major |
PD-2833 | With Portworx 3.1.0, migrations might fail between two clusters if one of the clusters is running a version of Portworx older than 3.1.0, resulting in a key not found error.Workaround: Ensure that both the source and destination clusters are upgraded to version 3.1.0 or newer. Components: DR & Migration Affected Versions: 3.1.0 | Minor |
PD-2644 | If an application volume contains a large number of files (e.g., 100,000) in a directory, changing the ownership of these files can take a long time, causing delays in the mount process. Workaround: If the ownership change is taking a long time, Portworx by Pure Storage recommends setting fsGroupChangePolicy to OnRootMismatch (see the second sketch after this table). For more information, see the Kubernetes documentation. Components: Storage Affected versions: 2.13.x, 3.0.x | Minor |
PD-2359 | When a virtual machine is transferred from one hypervisor to another and Portworx is restarted, the CSI container might fail to start properly and shows the CrashLoopBackoff error.Workaround: Remove the topology.portworx.io/hypervisor label from the affected node.Components: CSI Affected versions: 2.13.x, 3.0.x | Minor |
PD-2579 | When the Portworx pod (oci-mon ) cannot determine the management IP used by the Portworx container, the pxctl status command output on this pod shows a Disabled or Unhealthy status.Workaround: This issue is related to display only. To view the correct information, run the following command directly on the host machine: kubectl exec -it <oci-mon pod> -- nsenter --mount=/host_proc/1/ns/mnt -- pxctl status .Components: oci-monitor Affected versions: 2.13.0 | Minor |
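For PD-2597, a sketch of the workaround sequence: enter pool maintenance mode, expand the pool, then exit maintenance mode. The flag spelling below is an assumption and can differ between releases; check pxctl service pool maintenance --help for the exact syntax on your version.

```bash
# Flags below are an assumption; verify against your pxctl version.
pxctl service pool maintenance --enter --uid <pool-uid>
pxctl service pool expand --uid <pool-uid> --operation add-drive --size 500
pxctl service pool maintenance --exit --uid <pool-uid>
```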
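For PD-2644, a minimal Pod sketch with fsGroupChangePolicy set to OnRootMismatch, so the recursive ownership change is skipped when the volume root already matches the fsGroup. The image, claim name, and fsGroup value are placeholders.

```bash
# Standard Kubernetes fields; image, claim name, and fsGroup are placeholders.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app-with-many-files
spec:
  securityContext:
    fsGroup: 2000
    fsGroupChangePolicy: OnRootMismatch   # skip recursive chown when ownership already matches
  containers:
  - name: app
    image: registry.example.com/app:latest
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: px-data-pvc
EOF
```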
3.0.5
April 17, 2024
Visit these pages to see if you're ready to upgrade to this version:
For users currently on Portworx versions 2.11.x, 2.12.x, or 2.13.x, Portworx by Pure Storage recommends upgrading to Portworx 3.0.5 instead of moving to the next major version.
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-36858 | When using Hashicorp Vault integration, Portworx nodes kept attempting to connect to the Vault service. In the case of misconfigured authentication, the excessive attempts to log in to Vault crashed the Vault service. User Impact: Excessive attempts led to crashing of Vault services. Resolution: Portworx has implemented exponential back-off to reduce the frequency of login attempts to the Vault service. Components: Secret Store Affected Versions: 3.0.4 | Critical |
PWX-36873 | When Portworx uses HashiCorp's Vault configured with Kubernetes or AppRole authentication, it automatically refreshes expired access tokens. However, if the Kubernetes Service Account was removed or the AppRole expired, the token refresh failed. User Impact: Excessive attempts to refresh the access tokens caused the Vault service to crash, especially in large clusters. Resolution: The Portworx node now identifies excessive errors from the Vault service and will avoid accessing Vault for a cooling-off period of 5 minutes. Components: Secret Store Affected Versions: 3.0.3 | Major |
PWX-36847 | If a Kubernetes API call failed, Portworx incorrectly assumed the node's zone to be the default empty zone and tried to attach drives that belonged to that default zone. Because no drives existed in this default zone, Portworx created a new set of drives, assuming the node was in a different zone. User Impact: This led to duplicate entries, and the cluster went out of quorum. Resolution: Portworx no longer treats the default zone as a special zone. This allows Portworx to check for any existing drives that are already attached or available to be attached from any zone before creating new ones. Components: Cloud Drives Affected Versions: 3.0.3 | Major |
PWX-36786 | An offline, storageless node was incorrectly auto-decommissioned due to specific race conditions, resulting in the clouddrive DriveSet being left orphaned. User Impact: Portworx failed to start when attempting to operate as a storageless node using this orphaned clouddrive DriveSet, due to the node being in a decommissioned state. Resolution: Portworx now automatically cleans up such orphaned storageless clouddrive DriveSets, allowing it to start successfully. Components: Cloud Drive Affected Versions: 2.13.x, 3.0.x, and 3.1.x | Major |
3.0.4
November 15, 2023
Visit these pages to see if you're ready to upgrade to this version:
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-34315 | Improved how Portworx identifies pods with volumes in the Read-Only state before restarting them. | Storage |
PWX-34153 | CSI sidecar images are updated to the latest open source versions. | CSI |
PWX-34029 | Portworx now removes stale FlashArray multipath devices upon startup, which may result from pod failovers (for FlashArray Direct Access) or drive set failovers (for FlashArray Cloud Drives) while Portworx was not running. These stale devices had no direct impact but could have led to slow operations if many were present. | FA-FB |
PWX-34974 | Users can now configure the default duration (15 minutes) after which the logs are refreshed to get the most up-to-date statistics for FlashBlade volumes, using the following command: pxctl cluster options update --fb-stats-expiry-duration <time-in-minutes> . The minimum refresh duration is one minute. See the example after this table. | FA-FB |
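As an illustration of the PWX-34974 option above, the following sets the FlashBlade statistics refresh interval to five minutes (the value is an example; run from a node where pxctl is available):

```shell
# Refresh FlashBlade volume statistics every 5 minutes (value is an example):
pxctl cluster options update --fb-stats-expiry-duration 5
```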
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-34334 | Cloudsnaps of an aggregated volume with a replication level of 2 or more uploaded incorrect data if one of the replica nodes from which a previous cloudsnap operation had been executed was down. User Impact: The most recent snapshots were lost. Resolution: Portworx now forces a full backup in scenarios where the previous cloudsnap node is down. Components: Cloudsnaps Affected versions: 3.0.x | Critical |
PWX-33632 | If an attach request remained in the processing queue for a long time, it would lead to a panic. User Impact: Portworx would restart on the node. This was because an FA attach operation involved making REST API calls to FA, as well as running iSCSI rescans, which consumed more time. When Portworx received a high volume of requests to attach FA DirectAccess volumes, the queue for these attach requests gradually grew over time, leading to a panic in Portworx. Resolution: The timeout for queued attach requests has been increased to 15 minutes for FA DirectAccess volumes. Components: FA-FB Affected versions: 2.13.x, 3.0.x | Critical |
PWX-34885 | When NFS proxy volumes were created, it resulted in the restart of the Portworx service. User Impact: Although NFS proxy volumes were created, the service restart affected user applications. Resolution: Portworx now creates NFS proxy volumes successfully without restarting the Portworx service. Components: Storage Affected versions: 3.0.2 | Critical |
PWX-34277 | When an application pod using an FA Direct Access volume was failed over to another node, and Portworx was restarted on the original node, the pod on the original node became stuck in the Terminating state. User Impact: Portworx didn't clean up the mountpaths where the volume had previously been attached, as it couldn't locate the application on the local node. Resolution: Portworx now cleans up the mountpath even when the application is not found on the node. Components: FA-FB Affected versions: 2.13.x, 3.0.x | Major |
PWX-30297 | Portworx failed to restart when a multipath device was specified for the internal KVDB. Several devices with the kvdbvol label were found for the multipath device. Portworx selected the first device on the list, which might not have been the correct one.User Impact: Portworx failed to start because it selected the incorrect device path for KVDB. Resolution: When a multipath device is specified for the internal KVDB, Portworx now selects the correct device path. Components: KVDB Affected versions: 2.11.x | Major |
PWX-33935 | When the --sources option was used in the pxctl volume ha-update command for the aggregated volume, it caused the Portworx service processes to abort with an assertion.User Impact: The Portworx service on all nodes in the cluster continuously kept restarting. Resolution: Contact the Portworx support team to restore your cluster. Components: Storage Affected versions: 2.13.x, 3.0.x | Major |
PWX-33898 | When two pods, both using the same RWO FA Direct Access volume, were started on two different nodes, Portworx would move the FA Direct Access volume attachment to the node where the most recent pod was running, rather than rejecting the setup request for the second pod. User Impact: A stale FA Direct Access multipath device remained on the original node where the first pod was started, causing subsequent attach or mount requests on that node to fail. Resolution: A second pod request for the same RWO FA Direct Access volume on a different node will now be rejected if such a FA Direct Access volume is already attached and in use on another node. Components: FA-FB Affected versions: 2.13.11 | Major |
PWX-33828 | If you deleted a FA Direct Access PVC attached to an offline Portworx node, Portworx removed the associated volume from its KVDB. However, the FlashArray did not delete its associated volume because it remained connected to the offline node on the FlashArray. User Impact: This created orphaned volumes on the FlashArray. Resolution: Portworx now detects a volume that is attached to an offline Portworx node and will disconnect it from all the nodes in the FlashArray and avoid orphaned volumes. If there are any existing orphaned volumes, clean them manually. Components: FA-FB Affected versions: 2.13.8 | Major |
3.0.3
October 11, 2023
Notes
- This version addresses security vulnerabilities.
- Starting with version 3.0.3, aggregated volumes with PX-StoreV2 are not supported.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-32255 | Now the runtime option fast_node_down_detection is enabled by default. This option allows quick detection of when the Portworx service goes offline. | Storage |
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-33113 | Portworx reduced the pricing for GCP Marketplace from 55 cents/node/hour to 33 cents/node/hour, but this change was not being reflected for existing users who were still reporting billing to the old endpoint. User Impact: Existing GCP Marketplace users were being incorrectly billed at the previous rate of 55 cents/node/hour. Resolution: Upgrade Portworx to version 3.0.3 to reflect the new pricing rate. Components: Billing Affected versions: 2.13.8 | Critical |
PWX-34025 | In certain cases, increasing the replication level of a volume on a PX-StoreV2 cluster created new replicas with non-zero blocks that had been overwritten with zeros on the existing replicas. User Impact: The Ext4 filesystem reported a mismatch and delayed allocation failures when a user application attempted to write data to the volume. Resolution: Users can now run the fsck operation to rectify the failures or remove the added replicas from the volume.Components: PX-StoreV2 Affected versions: 3.0.2 | Major |
3.0.2
September 28, 2023
Visit these pages to see if you're ready to upgrade to this version:
Notes
This version addresses security vulnerabilities.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-32226 | AWS users can now choose to enable server-side encryption for S3 credentials, provided the S3 object-store provider supports it. Use the --s3-sse flag with either the AES256 or aws:kms value. | Cloudsnaps |
PWX-33229 | Previously, a Portworx license would expire if Portworx could not reach its billing server within 72 hours. Now users can continue to use Portworx for up to 30 days if the billing servers are not reachable. | Licensing |
PWX-31233 | Portworx has removed volume size enforcement for FlashArray and FlashBlade Direct Access volumes. This will allow users to create volumes greater than 40TiB for all license types. | Licensing |
PWX-33551 | Users can now configure the REST API call timeout (in seconds) for FA/FB by adding the new environment variable PURE_REST_TIMEOUT to the StorageCluster. When updating this value, the execution timeout should also be updated accordingly using the following command: pxctl cluster options update --runtime-options execution_timeout_sec=<sec> . PURE_REST_TIMEOUT is set to 8 seconds and execution_timeout_sec to 60 seconds by default. Contact Portworx support to find the right values for your cluster. See the sketch after this table. | FA-FB |
PWX-33364 | As part of FlashArray integration, Portworx has now reduced the number of API calls it makes to the arrays endpoint on FA. | FA-FB |
PWX-33593 | Portworx now caches certain FlashArray attachment system calls, improving the performance of mount operations for FA backed volumes on nodes with large numbers of attached devices, or many redundant paths to the array. | FA-FB |
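A minimal sketch of the PWX-33551 configuration described above. The spec.env field is assumed from the Portworx Operator's StorageCluster CRD, and the namespace, cluster name, and timeout values are placeholders; confirm real values with Portworx support as noted:

```shell
# Open the StorageCluster for editing (namespace and name are placeholders):
kubectl -n kube-system edit storagecluster px-cluster
# ...then add the REST timeout under spec.env (value is illustrative, in seconds):
#   env:
#   - name: PURE_REST_TIMEOUT
#     value: "15"
# Align the execution timeout using the command from the note (value is illustrative):
pxctl cluster options update --runtime-options execution_timeout_sec=120
```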
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-33451 | In certain cases, increasing the replication level of an aggregated volume failed to zero out specific blocks associated with stripes belonging to replication set 1 or higher, where zero data was expected. User Impact: Ext4 filesystem complained about a mismatch and delayed allocation failures when a user application tried to write data to an aggregated Portworx volume. Resolution: Users can now run the fsck operation to rectify the failures or remove the added replicas from the aggregated volume.Components: Storage Affected versions: 3.0.0, 2.12.x, 2.13.x | Critical |
PWX-33258 | Sometimes, Portworx timed out FlashBlade Direct Access volume creation when it took over 30 seconds. User Impact: Volume creation stayed in a pending state. Resolution: The timeout for FB volume creation has been increased to 180 seconds (3 minutes) to allow more time for FBDA volume creation. Users can now use the --fb-lock-timeout cluster option to increase the timeout for FB volume creation beyond 180 seconds (3 minutes). Components: FA-FB Affected versions: 2.13.6 | Critical |
PWX-32428 | In PKS environments, sharedv4 mounts failed on remote client nodes with the error No such file or directory . User Impact: Restarts of the Portworx pods and service led to excessive mounts (mount leaks) on PKS platforms, progressively slowing down IO operations on the node. Resolution: Portworx now uses different mountpoints on the PKS platform. If you are experiencing slowdowns on a PKS node, upgrade the Operator to the latest version and reboot the affected PKS nodes. Components: Sharedv4 Affected versions: 2.12.x, 2.13.x | Critical |
PWX-33388 | The standalone SaaS metering agent crashed the Portworx container with a nil panic error. User Impact: This caused the Portworx container on one node to crash continuously. Resolution: Upgrade to 3.0.2 if you are using a SaaS license to avoid this issue. Components: Control Plane Affected versions: 3.0.1, 3.0.0 | Critical |
PWX-32074 | The CPU core numbers were wrongly detected by the px-runc command. User Impact: Portworx did not start on the requested cores. Resolution: The behavior of the --cpuset-cpus argument of the px-runc install command has been fixed. Users can now specify the CPUs on which Portworx execution should be allowed. Components: px-runc Affected versions: 2.x.x | Critical |
PWX-33112 | Timestamps were incorrectly recorded in the write-ahead log. User Impact: The write operations were stuck due to a lack of log reservation space. Resolution: Portworx now consistently flushes timestamp references into the log. Components: Storage Affected versions: 2.12.x, 2.13.x | Critical |
PWX-31605 | The pool expansion failed because the serial number from the WWID could not be extracted. User Impact: FlashArray devices (both cloud drives and direct access) encountered expansion or attachment failures when multipath devices from other vendors (such as HPE or NetApp) were attached. Resolution: This issue has been fixed. Components: Pool Management Affected versions: 2.13.2 | Critical |
PWX-33120 | Too many unnecessary vSphere API calls were made by Portworx. User Impact: An excess of API calls and vSphere events could have caused confusion and distraction for users of vSphere Cloud Drives. Resolution: If you are seeing many vSphere VM Reconfigure events at a regular interval in the clusters configured with Portworx Cloud Drives, upgrade Portworx to the latest version. Components: Metering & Billing Affected versions: 3.0.0 | Major |
PWX-33299 | When using a custom image registry, OCI-Monitor was unable to locate the Kubernetes namespaces needed to pull secrets. User Impact: Portworx installation failed with the error Failed retrieving default/tcr-pull-cpaas-5000 . Resolution: Portworx now consults the container runtime and Kubernetes to determine the correct Kubernetes namespace for Portworx installation. Components: OCI Monitor Affected versions: 3.0.0, 2.13.x, 2.12.x | Major |
PWX-31840 | When resizing a volume, the --provisioning-commit-labels cluster option was not honored, resulting in unlimited thin provisioning. User Impact: Portworx volumes were resized to large sizes without rejections, exceeding pool provisioning limits. Resolution: Now the --provisioning-commit-labels cluster option is honored during resizing volumes and prevents unexpected large volumes.Components: Storage Affected versions: 2.12.x, 2.13.x | Major |
PWX-32572 | When using the older Containerd versions (v1.4.x or 1.5.x), Portworx kept opening connections to Containerd, eventually depleting all the file-descriptors available on the system. User Impact: Portworx nodes crashed with the too many open files error. Resolution: Portworx no longer leaks the file-descriptors when working with older Containerd versions. Components: OCI Monitor Affected versions: 2.13.6, 3.0.0 | Minor |
PWX-30781 | The kubernetes version parameter (?kbver ) in the air-gapped script did not process the version extension.User Impact: The script generated the wrong image URLs for the Kubernetes dependent images. Resolution: Parsing of the kbver parameter has been fixed. Components: Spec Generator Affected versions: 3.0.0 | Minor |
3.0.1
September 3, 2023
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-33389 | The Portworx CSI license for FA/FB validation failed when Purity was upgraded to version 6.4.2 or newer, causing the Portworx license status to appear expired. User Impact: Users could not create new volumes. Resolution: The auth token is no longer used by Portworx when making API or api_version calls to FA during license validation. Components: FA-FB Affected versions: 3.0.0 | Critical |
PWX-33223 | Portworx was hitting a panic when a value was set for an uninitialized object. User Impact: This caused the Portworx container to crash and restart. Resolution: Upgrade to Portworx version 3.0.1 if using Pure cloud drives. Components: FA-FB Affected versions: 3.0.0 | Major |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2349 | When you upgrade Portworx to a higher version, the upgrade is successful, but the Portworx CSI license renewal could take a long time. Workaround: Run the pxctl license reset command to reflect the correct license status. |
PD-2350 | Upgrades on some nodes may become stuck with the following message: This node is already initialized but could not be found in the cluster map. This issue can be caused by an orphaned storageless node. Workaround: Verify that the node which has this error is a storageless node. If it is, delete the orphaned storageless node using the pxctl clouddrive delete --node <> command to progress the upgrade, as shown in the sketch after this table. |
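A sketch of the PD-2350 cleanup flow referenced above (the node ID is a placeholder; confirm the drive set belongs to a storageless node before deleting it):

```shell
# Identify the orphaned drive set and its node ID:
pxctl clouddrive list
# Delete it so the upgrade can proceed (node ID is a placeholder):
pxctl clouddrive delete --node <node-id>
```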
3.0.0
July 11, 2023
Visit these pages to see if you're ready to upgrade to this version:
Notes
Portworx 3.0.0 requires Portworx Operator 23.5.1 or newer.
New features
Portworx by Pure Storage is proud to introduce the following new features:
-
AWS users can now deploy Portworx with the PX-StoreV2 datastore. In order to have PX-StoreV2 as your default datastore, your cluster should pass the preflight check, which verifies your cluster's compatibility with the PX-StoreV2 datastore.
-
You can now provision and use cloud drives on FlashArrays that are in the same zone using the CSI topology for FlashArray Cloud Drives feature. This improves fault tolerance for replicas, performance, and manageability for large clusters.
-
For environments such as GCP and Anthos that follow the blue-green upgrade model, Portworx allows a temporary license extension to minimize downtime during upgrades. Once you start the license expansion, the Portworx cluster's license is temporarily extended to accommodate up to double the number of licensed nodes. While the existing nodes (called blue nodes) serve production traffic, Portworx expands the cluster by adding new nodes (called green nodes) that have an upgraded Linux OS or new hardware.
-
Portworx now offers the capability to use user-managed keys for encrypting cloud drives on Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE). The Oracle disk encryption feature converts data into an encrypted format, ensuring that unauthorized individuals cannot access it. You can specify the encryption key in the StorageCluster using the following cloud-drive volume specification (see the sketch after this list):
type=pv-<number-of-vpus>,size=<size-of-disk>,kms=<ocid-of-vault-key>
-
Portworx now enables you to define custom tags for cloud drives provisioned across various platforms such as AWS, Azure, GCP, and Oracle cloud. While installing Portworx, you can specify the custom tags in the StorageCluster spec:
type=<device-type>,size=<volume-size>,tags=<custom-tags>
This user-defined metadata enhances flexibility, organization, and provides additional contextual information for objects stored in the cloud. It empowers users with improved data management, search capabilities, and greater control over their cloud-based data.
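A minimal sketch of the OKE encryption-key spec from the feature above. The spec.cloudStorage.deviceSpecs path is assumed from the Portworx Operator's StorageCluster CRD, and the namespace, cluster name, VPU count, and size are placeholders:

```shell
# Open the StorageCluster for editing (namespace and name are placeholders):
kubectl -n kube-system edit storagecluster px-cluster
# ...then add a cloud-drive spec under spec.cloudStorage.deviceSpecs, for example:
#   - type=pv-10,size=512,kms=<ocid-of-vault-key>
```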
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description | Component |
---|---|---|
PWX-29486 | Portworx now supports online expansion of storage pools containing auto journal devices with disk-resize operation. | Pool Management |
PWX-29435 | When you run the pxctl sv pool expand -o add-disk command, the common disk tags from existing drives will be attached to the newly added cloud-drive disk. | Pool Management |
PWX-28904 | Storage pool expansion now supports online pool resizing on Azure, with no downtime. This is as long as Microsoft's documentation requirements are met. | Pool Management |
PWX-30876 | Pool expansion with add-disk operation is now supported for repl1 volumes. | Pool Management |
PWX-29863 | The pool expansion completion message is improved to Pool resize requested successfully. Please check resize operation status with pxctl sv pool show . | Pool Management |
PWX-28665 | The pxctl cd list command now lists cloud-drives on nodes with local drives. | Cloud Drives |
PWX-28697 | FlashArray cloud drives now show information about the array they are located on. Use pxctl cd inspect to view this information. | Cloud Drives |
PWX-29348 | Added 3 new fields to the CloudBackupSize API to reflect the correct backup size. | Cloudsnaps |
PWX-27610 | Portworx now periodically defragments the KVDB database. By default, KVDB is defragmented every 2 weeks if the DB size is greater than 100 MiB. You can also configure the defragmentation schedule using options available with the pxctl cluster options update command. | KVDB |
PWX-31403 | For AWS clusters, Portworx now applies default configurations for the dedicated KVDB disk. | KVDB |
PWX-31055 | The alert message for VolumeSpaceLow is improved to show clear information. | Storage |
PWX-29785 | Improved the implementation to restrict the nodiscard and autofstrim flags on XFS volumes. These two flags are disabled for volumes formatted with XFS. | PX-StoreV1 |
PWX-30557 | Portworx checks pool size and drive count limits before resizing the storage pool. It will abort with a proper error message if the resolved pool expansion plan exceeds limits. | PX-StoreV2 |
PWX-30820 | Portworx now redistributes cloud migration requests received from Stork between all the nodes in the cluster using a round-robin mechanism. This evenly distributes the migration workload across all the nodes in the cluster and avoids hot spots. | DR & Migration |
PWX-29428 | Portworx CSI images now use the registry.k8s.io registry. | CSI |
PWX-28035 | Portworx now supports distributing FlashArray Cloud Drive volumes among topologically distributed FlashArrays. | FA-FB |
PWX-31500 | The pxctl cluster provision-status command will now show more states of a pool. | CLI |
PWX-31257 | The pxctl alerts show command with the --start-time and --end-time options can now be used independently. | Monitoring |
PWX-30754 | Added support for leases permission to the PVC controller ClusterRole. | Spec Generation |
PWX-29202 | pxctl cluster provision-status will now show the host name for nodes. The host name helps you to correlate that command's output with the node list provided by pxctl status . | CLI |
Fixes
Issue Number | Issue Description | Severity |
---|---|---|
PWX-30030 | Some volumes incorrectly showed Not in quorum status. User Impact: Portworx volumes were out of quorum after a network split even though all the nodes and pools for the volume's replicas were online and healthy. This happened when the node could communicate over the network with KVDB but not with the rest of the nodes. Resolution: Restart the Portworx service on the node where the volume is currently attached. Components: Storage Affected versions: 2.12.2 | Critical |
PWX-30511 | When autofstrim was disabled, internal autofstrim volume information was not removed completely. User Impact: An error occurred while running manual fstrim. Resolution: This issue has been fixed. Components: Storage Affected versions: 2.12.x, 2.13.x | Critical |
PWX-30294 | The pvc-controller pods failed to start in the DaemonSet deployment. User Impact: The pvc-controller failed due to the deprecated values of the --leader-elect-resource-lock flag.Resolution: These values have been removed to use the default leases value.Components: Spec Generator Affected versions: 2.12.x, 2.13.x | Critical |
PWX-30930 | The KVDB cluster could not form a quorum after KVDB was down on one node. User Impact: On a loaded cluster or when the underlying KVDB disk had latency issues, KVDB nodes failed to elect leaders among themselves. Resolution: Increase the heartbeat interval using the runtime option kvdb-heartbeat-interval=1000 .Components: KVDB Affected versions: 2.12.x, 2.13.x | Critical |
PWX-30985 | Concurrent pool expansion operations using add-disk and auto resulted in pool expansion failure, with the error mountpoint is busy .User Impact: Pool resize requests were rejected. Resolution: Portworx now serializes pool expansion operations. Components: Pool Management Affected versions: 2.12.x, 2.13.x | Major |
PWX-30685 | In clusters running with cloud drives and auto-journal partitions, pool deletion resulted in deleting the data drive with an auto-journal partition. User Impact: Portworx had issues restarting after the pool deletion operation. Resolution: Upgrade to the current Portworx version. Components: Pool Management Affected versions: 2.12.x, 2.13.x | Major |
PWX-30628 | The pool expansion would result in a deadlock when it had a volume in a re-sync state and the pool was already full. User Impact: Pool expansion would get stuck if a volume in the pool was in a re-sync state and the pool was full. No new pool expansions can be issued on such a pool. Resolution: Pool expansion will now be aborted immediately if it detects an unclean volume in the pool. Components: Pool Management Affected versions: 2.12.x, 2.13.x | Major |
PWX-30551 | If a diagnostics package collection was triggered during a node initialization, it caused the node initialization to fail and the node to restart. User Impact: The node restarted when node initialization and diagnostics package collection occurred at the same time. Resolution: Now diagnostics package collection will not restart the node. Components: Storage Affected versions: 2.12.x, 2.13.x | Major |
PWX-29976 | Cloud drive creation failed when a vSphere 8.0 datastore cluster was used for Portworx installation. User Impact: Portworx failed to install on vSphere 8 with datastore clusters. Resolution: This issue has been fixed. Components: Cloud Drives Affected versions: 2.13.1 | Major |
PWX-29889 | Portworx installation with local install mode failed when both a journal device and a KVDB device were configured simultaneously. User Impact: Portworx would not allow creating multiple disks in a local mode install. Resolution: This issue has been fixed. Components: KVDB Affected versions: 2.12.x, 2.13.x | Major |
PWX-29512 | In certain cases, a KVDB node failover resulted in inconsistent KVDB membership, causing an orphaned entry in the cluster. User Impact: The cluster operated with one less KVDB node. Resolution: Every time Portworx performs a KVDB failover, if it detects an orphaned node, Portworx removes it before continuing the failover operation. Components: KVDB Affected versions: 2.13.x | Major |
PWX-29511 | Portworx would remove an offline internal KVDB node as part of its failover process, even when it was not part of quorum. User Impact: The KVDB cluster would lose quorum and required manual intervention to restore its functionality. Resolution: Portworx will not remove a node from the internal KVDB cluster if it is out of quorum. Components: KVDB Affected versions: 2.13.x | Major |
PWX-28287 | Pool expansion on an EKS cluster failed while optimization of the associated volume(s) was in progress. User Impact: Pool expansion was unsuccessful. Resolution: Portworx now catches these scenarios early in the pool expansion process and provides a clear and readable error message to the user. Components: Cloud Drives Affected versions: 2.12.x, 2.13.x | Major |
PWX-28590 | In vSphere local mode install, storageless nodes (disaggregated mode) would claim storage ownership of a hypervisor if it was the first to boot up. This meant that a node capable of creating storage might not be able to get ownership. User Impact: In vSphere local mode, Portworx installed in degraded mode. It occurred during a fresh install or when an existing storage node was terminated. Resolution: This issue has been fixed. Components: Cloud Drives Affected versions: 2.12.1 | Major |
PWX-30831 | On EKS, if the cloud drives were in different zones or removed, Portworx failed to boot up in certain situations. User Impact: Portworx did not start on an EKS cluster with removed drives. Resolution: Portworx now ignores zone mismatches and sends alerts for deleted drives. It will now not abort the boot up process and continue to the next step. Components: Cloud Drives Affected versions: 2.12.x, 2.13.x | Major |
PWX-31349 | Sometimes Portworx processes on the destination or DR cluster would restart frequently due to a deadlock between the node responsible for distributing the restore processing and the code attempting to attach volumes internally. User Impact: Restore operations failed. Resolution: This issue has been fixed. Components: DR and Migration Affected versions: 2.12.x, 2.13.x | Major |
PWX-31019 | During cloudsnap backup/restore, there was a crash occasionally caused by the array index out of range of the preferredNodeForCloudsnap function. User Impact: Cloudsnap restore failed. Resolution: This issue has been fixed. Components: Storage Affected versions: 2.12.x, 2.13.x | Major |
PWX-30246 | Portworx NFS package installation failed due to a lock held by the unattended-upgrade service running on the system. User Impact: Sharedv4 volume mounts failed. Resolution: The Portworx NFS package install now waits for the lock, then installs the required packages. This issue is resolved after upgrading to the current version and restarting the Portworx container. Components: Sharedv4 Affected versions: 2.11.2, 2.12.1 | Major |
PWX-30338 | VPS pod labels were not populated in the Portworx volume spec. User Impact: VPS using the podMatchExpressions field in a StatefulSet sometimes failed to function correctly because volume provisioning and pod creation occurred at the same time. Resolution: Portworx now ensures that volume provisioning collects the pod name before provisioning. Components: Volume Placement Strategies Affected versions: 2.12.x, 2.13.x | Minor |
PWX-28317 | A replica set was incorrectly created for proxy volumes. User Impact: When a node was decommissioned, it got stuck if a proxy volume’s replica set was on that node. Resolution: Now replica sets are not created for proxy volumes. Components: Proxy Volumes Affected versions: 2.11.4 | Minor |
PWX-29411 | In vSphere, when creating a new cluster, KVDB disk creation failed for a selected KVDB node. User Impact: In the local mode install, the KVDB disk creation failures resulted in wrongly giving up ownership of a hypervisor. This created two storage nodes on the same hypervisor. Resolution: This issue has been fixed. Components: Cloud Drives Affected versions: 2.12.1, 2.13.x | Minor |
PWX-28302 | The pool expand command failed to expand an existing pool size when it was increased by 4 GB or less. User Impact: If the user expanded the pool by 4 GB or less, the pxctl sv pool expand command failed with an invalid parameter error.Resolution: Increase the pool size by at least 4 GB. Components: PX-StoreV2 Affected versions: 2.12.x, 2.13.x | Minor |
PWX-30632 | NFS backupLocation for cloudBackups failed with the error validating credential: Empty name string for nfs . The NFS name used by Portworx to mount the NFS server was not passed to the required function. User Impact: Using BackupLocations for NFS targets failed. Resolution: Portworx now passes the credential name to the function that uses the name to mount the NFS server. Components: Cloudsnaps Affected versions: 2.13.x | Minor |
PWX-25792 | During the volume mount of FA/FB DA volumes, Portworx did not honor the nosuid mount option specified in the StorageClass. User Impact: After migration from PSO to Portworx, volumes with the nosuid mount option failed to mount on the host. Resolution: Portworx now explicitly sets the nosuid mount option in the mount flags before invoking the mount system call. See the sketch after this table. Components: FA-FB Affected versions: 2.11.0 | Minor |
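Related to PWX-25792 above, mount options such as nosuid are passed through the StorageClass. A minimal sketch; the provisioner and backend parameter are assumptions based on a typical FlashArray Direct Access StorageClass, not taken from this note:

```shell
# StorageClass with the nosuid mount option; the name and parameters are assumptions.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fada-nosuid
provisioner: pxd.portworx.com
parameters:
  backend: pure_block        # assumed FlashArray Direct Access backend parameter
mountOptions:
- nosuid                     # the option this fix ensures is honored
EOF
```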
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2149 | Portworx 3.0.0 cannot be installed using the Rancher catalog chart. You should use PX-Central to generate the Portworx spec. |
PD-2107 | If there is a ha-update operation while the volume is in a detached state, a different node might start publishing the volume metrics, but the old node won't stop publishing the volume metrics. This results in duplicate metrics, and only one will have the correct currhalevel. Workaround: For detached volumes, before doing a ha-update , attach the volume manually through pxctl , as shown in the sketch after this table. |
PD-2086 | Portworx does not support Oracle API signing keys with a passphrase. Workaround: Use API signing keys without a passphrase. |
PD-2122 | The add-drive operation fails when a drive is added to an existing cloud-based pool.Workaround: Use the pxctl service pool expand -operation add-disk -uid <pool-ID> -size <new-storage-pool-size-in-GiB> command to add a new drive to such pools. |
PD-2170 | The pool expansion can fail on Google Cloud when using the pxctl service pool expand -operation add-disk command with the error Cause: ProviderInternal Error: googleapi: Error 503: Internal error. Please try again or contact Google Support. Workaround: Rerun the command. |
PD-2188 | In OCP 4.13 or newer, when the application namespace or pod is deleted, application pods that use Portworx sharedv4 volumes can get stuck in the Terminating state. The output of the ps -ef --forest command for the stuck pod showed that the conmon process had one or more defunct child processes. Workaround: Find the nodes on which the sharedv4 volume(s) used by the affected pods are attached, then restart the NFS server on those nodes with the systemctl restart nfs-server command. Wait for a couple of minutes. If the pod is still stuck in the Terminating state, reboot the node on which the pod is running. The pod might take several minutes to release after a reboot. |
PD-2209 | When Portworx is upgraded to version 3.0.0 without upgrading Portworx Operator to version 23.5.1, telemetry is disabled. This is because the port is not updated for the telemetry pod. Workaround: Upgrade Portworx Operator to the latest version and bounce the Portworx pods manually. |
PD-2615 | Migrations triggered as part of Async DR will fail in the "Volume stage" when Portworx is configured with PX-Security on the source and destination clusters. Workaround: Please contact support if you encounter this issue. |
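A sketch of the PD-2107 workaround ordering. The volume ID and replication level are placeholders, and pxctl host attach is assumed to be the manual attach command:

```shell
# Attach the detached volume manually first (command assumed; volume ID is a placeholder):
pxctl host attach <vol-id>
# Then change the replication level (target level is an example):
pxctl volume ha-update --repl 3 <vol-id>
```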
Known issues (Errata) with PX-StoreV2 datastore
Issue Number | Issue Description |
---|---|
PD-2138 | Scaling down the node groups in AWS results in node termination. After a node is terminated, the drives are moved to an available storageless node. However, in some cases, after migration the associated pools remain in an error state. Workaround: Restart the Portworx service, then run a maintenance cycle using the pxctl sv maintenance --cycle command, as shown in the sketch after this table. |
PD-2116 | In some cases, re-initialization of a node fails after it is decommissioned and wiped with the error Failed in initializing drives on the node x.x.x.x : failed to vgcreate . Workaround: Reboot the node and retry initializing it. |
PD-2141 | When cloud drives are detached and reattached manually, the associated pool can go down and remain in an error state. Workaround: Restart the Portworx service, then run a maintenance cycle using the pxctl sv maintenance --cycle command. |
PD-2153 | If the add-drive operation is interrupted by a drive detach, scale down or any other operation, the pool expansion can get stuck.Workaround: Reboot the node. |
PD-2174 | When you add unsupported drives to the StorageCluster spec of a running cluster, Portworx goes down. Workaround: Remove the unsupported drive from the StorageCluster spec. The Portworx Operator will recreate the failed pod and Portworx will be up and running again on that node. |
PD-2208 | Portworx on-premises with PX-StoreV2 fails to upgrade to version 3.0.0. Workaround: Replace -T dmthin with -T px-storev2 in your StorageCluster, as the dmthin flag is deprecated. After updating the StorageCluster spec, restart the Portworx nodes. |
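For PD-2138 and PD-2141 above, the recovery steps map to the following commands, assuming the Portworx systemd unit is named portworx on the affected node:

```shell
# Restart the Portworx service on the affected node (unit name assumed):
systemctl restart portworx
# Then run a maintenance cycle, as described in the workaround:
pxctl sv maintenance --cycle
```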
2.13.12
March 05, 2024
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description |
---|---|
PWX-35603 | When running Portworx on older Linux systems (specifically those using GLIBC 2.31 or older) with newer versions of Kubernetes, Portworx failed to detect dynamic updates of pod credentials and tokens. This led to Unauthorized errors when using Kubernetes client APIs.Resolution: Portworx now correctly processes dynamic token updates. |
PWX-29750 | In certain cases, the cloudsnaps that were using S3 object-stores were not completely deleted because S3 object-stores did not support bulk deletes or were unable to handle large cloudsnaps. This resulted in undeleted cloudsnap objects, leading to unnecessary capacity consumption on S3. Resolution: Portworx now addresses and resolves such cloudsnaps deletion issues. |
PWX-35136 | During cloudsnap deletions, some objects were not removed because the deletion requests exceeded the S3 API's limit for the number of objects that could be deleted at once. This would leave objects on S3 for deleted cloudsnaps, thereby consuming S3 capacity. Resolution: Portworx now ensures that deletion requests do not exceed the S3 API's limit. |
PWX-31019 | An array index out of range error in the preferredNodeForCloudsnap function occasionally caused crashes during cloudsnap backup/restore operations. Resolution: This issue has been fixed, and Portworx now prevents such crashes during cloudsnap backup or restore operations. |
PWX-30030 | Some Portworx volumes incorrectly showed Not in quorum status after a network split, even though all the nodes and pools for the volume's replicas were online and healthy. This happened when the node could communicate over the network with KVDB but not with the rest of the nodes. Resolution: Portworx volumes now accurately reflect their current state in such situations. |
PWX-33647 | When the Portworx process is restarted, it verifies the existing mounts on the system for sanity. If one of the mounts was an NFS mount of a Portworx volume, the mount point verification would hang because Portworx was still in the process of starting up. Resolution: When Portworx is starting up, it now skips the verification of Portworx-backed mount points to allow the startup process to continue. |
PWX-29511 | Portworx would remove an offline internal KVDB node as part of its failover process, even when it was not part of quorum. The KVDB cluster would lose quorum and required manual intervention to restore its functionality. Resolution: Portworx does not remove a node from the internal KVDB cluster if it is out of quorum. |
PWX-29533 | During node initialization with cloud drives, a race condition occasionally occurred between the Linux device manager (udevd) and Portworx initialization, causing node initialization failures. This was because drives were not fully available for Portworx's use, preventing users from adding new nodes to an existing cluster. Resolution: Portworx has increased the number of retries for accessing the drives during initialization to mitigate this failure. |
PWX-35650 | GKE customers encountered a nil panic exception when the provided GKE credentials were invalid. Resolution: Portworx now properly shuts down and logs the error, aiding in the diagnosis of credential-related issues. |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2768 | When cloning or capturing a snapshot of a FlashArray Direct Access PVC that is either currently resizing or has encountered a resizing failure, the clone or snapshot creation might fail. Workaround: Initiate the resize operation again on the original volume, followed by the deletion and recreation of the clone or snapshot, or allow for an automatic retry. |
2.13.11
October 25, 2023
Visit these pages to see if you're ready to upgrade to this version:
Notes
- This version addresses security vulnerabilities.
- It is recommended that you upgrade to the latest version of Portworx when upgrading from version 2.13.11.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description |
---|---|
PWX-34029 | Portworx now removes stale FlashArray multipath devices upon startup, which may result from pod failovers (for FlashArray Direct Access) or drive set failovers (for FlashArray Cloud Drives) while Portworx was not running. These stale devices had no direct impact but could have led to slow operations if many were present. |
PWX-33551 | You can now configure the REST API call timeout (in seconds) for FA/FB by adding the new environment variable PURE_REST_TIMEOUT to the StorageCluster. When updating this value, you should also update the execution timeout using the following command: pxctl cluster options update --runtime-options execution_timeout_sec=<sec> . PURE_REST_TIMEOUT is set to 8 seconds and execution_timeout_sec to 60 seconds by default. Contact Portworx support to find the right values for your cluster. This improvement was included in Portworx version 3.0.2 and is now backported to 2.13.11. |
PWX-33229 | Previously, a Portworx license would expire if Portworx could not reach its billing server within 72 hours. Users can now continue to use Portworx for up to 30 days if the billing servers are not reachable. This improvement was included in Portworx version 3.0.2 and now is backported to 2.13.11. |
PWX-33364 | As part of FlashArray integration, Portworx has now reduced the number of API calls it makes to the arrays endpoint on FA. This improvement was included in Portworx version 3.0.2 and now is backported to 2.13.11. |
Fixes
Issue Number | Issue Description |
---|---|
PWX-33828 | If you deleted a FA Direct Access PVC attached to an offline Portworx node, Portworx removed the associated volume from its KVDB. However, the FlashArray did not delete its associated volume because it remained connected to the offline node on the FlashArray. This created orphaned volumes on the FlashArray. Resolution: Portworx now detects a volume that is attached to an offline Portworx node and will disconnect it from all the nodes in the FlashArray and avoid orphaned volumes. |
PWX-33632 | If an attach request remained in the processing queue for a long time, it would lead to a panic, causing Portworx to restart on a node. This was because an FA attach operation involved making REST API calls to FA, as well as running iSCSI rescans, which consumed more time. When Portworx received a high volume of requests to attach FA DirectAccess volumes, the queue for these attach requests gradually grew over time, leading to a panic in Portworx. Resolution: The timeout for queued attach requests has been increased to 15 minutes for FA DirectAccess volumes. |
PWX-33898 | When two pods, both using the same RWO FA Direct Access volume, were started on two different nodes, Portworx would move the FADA volume attachment to the node where the most recent pod was running, rather than rejecting the setup request for the second pod. This resulted in a stale FADA multipath device remaining on the original node where the first pod was started, causing subsequent attach or mount requests on that node to fail. Resolution: A second pod request for the same RWO FA Direct Access volume on a different node will now be rejected if such a FA Direct Access volume is already attached and in use on another node. |
PWX-33631 | During CSI volume provisioning, Portworx obtains locks to synchronize requests across different worker nodes and distribute the workload across all of them, which could slow down volume creation. Resolution: If CSI volume creation is slow, upgrade to this version. |
PWX-34277 | When an application pod using an FA Direct Access volume was failed over to another node, and Portworx was restarted on the original node, the pod on the original node became stuck in the Terminating state. This occurred because Portworx didn't clean up the mountpaths where the volume had previously been attached, as it couldn't locate the application on the local node. Resolution: Portworx now cleans up the mountpath even when the application is not found on the node. |
PWX-34334 | Cloudsnaps of an aggregated volume with a replication level of 2 or more uploaded incorrect data if one of the replica nodes from which a previous cloudsnap operation had been executed was down. Resolution: Portworx now forces a full backup in scenarios where the previous cloudsnap node is down. |
PWX-33935 | When the --sources option was used in the pxctl volume ha-update command for the aggregated volume, it caused the Portworx service processes to abort with an assertion. As a result, the Portworx service on all nodes in the cluster continuously kept restarting.Resolution: Contact the Portworx support team to restore your cluster. |
PWX-34025 | In certain cases, increasing the replication level of a volume on a PX-StoreV2 cluster created new replicas with non-zero blocks that were overwritten with zeros on the existing replicas. This caused the ext4 filesystem to report a mismatch and delayed allocation failures when a user application attempted to write data to the volume. Resolution: Users can now run the fsck operation to rectify the failures or remove the added replicas from the volume. This issue has been fixed in Portworx version 3.0.3 and now backported to 2.13.11. |
PWX-33451 | In certain cases, increasing the replication level of an aggregated volume failed to zero out specific blocks associated with stripes belonging to replication set 1 or higher, where zero data was expected. This caused the ext4 filesystem to report a mismatch and delayed allocation failures when a user application tried to write data to an aggregated Portworx volume. Resolution: Users can now run the fsck operation to rectify the failures or remove the added replicas from the aggregated volume. This issue has been fixed in Portworx version 3.0.2 and is now backported to 2.13.11. |
PWX-32572 | When using the older Containerd versions (v1.4.x or 1.5.x), Portworx kept opening connections to Containerd, eventually depleting all the file-descriptors available on the system. This caused the Portworx nodes to crash with the too many open files error. Resolution: Portworx no longer leaks the file-descriptors when working with older Containerd versions. This issue has been fixed in Portworx version 3.0.2 and is now backported to 2.13.11. |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2474 | In certain scenarios, you might encounter the alert Failed to delete FlashArray Direct Access volume on FlashArray when deleting an FA Direct Access PVC. This occurs when the Portworx volume and the Kubernetes PVC are deleted, but the deletion fails on the FlashArray side for one of several reasons. |
PD-2474 | When a Portworx volume is created, it remains in the down - pending state. This occurs due to a race condition when Portworx is restarted while it is performing an FA API call to create a volume, and the volume creation is not completed on the FA side.Workaround: Delete the volume in the down - pending state using the pxctl volume delete command. |
PD-2477 | During FA Direct Access volume resizing, if the network between FlashArray and Portworx is disconnected, the PVC and the Portworx volume reflect the updated size, but the actual size on the FA backend remains unchanged. Workaround: Once the network is connected again, trigger another PVC resize operation to update the size on the FlashArray backend. |
2.13.10
September 3, 2023
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description |
---|---|
PWX-33389 | The Portworx CSI license for FA/FB validation failed when Purity was upgraded to version 6.4.2 or newer. This caused the Portworx license status to appear expired, and users were not able to create new volumes. Resolution: This issue has been fixed in Portworx version 3.0.1 and is now backported to 2.13.10. |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2349 | When you upgrade Portworx to a higher version, the upgrade is successful, but the Portworx CSI license renewal could take a long time. Workaround: Run the pxctl license reset command to reflect the correct license status. |
PD-2350 | Upgrades on some nodes may become stuck with the following message: This node is already initialized but could not be found in the cluster map. This issue can be caused by an orphaned storageless node. Workaround: Verify that the node which has this error is a storageless node. If it is, delete the orphaned storageless node using the pxctl clouddrive delete --node <> command to progress the upgrade. |
2.13.9
August 28, 2023
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description |
---|---|
PWX-33258 | This issue impacted only users of FlashBlade Direct Access volumes. When FlashBlade Direct Access volume creation took more than 30 seconds, Portworx sometimes timed out on volume creation, leaving volumes in a pending state. Resolution: With this fix, the default timeout for FB volume creation has been increased from 30 seconds to 180 seconds (3 minutes). You can also set this timeout to a higher value using the new cluster option --fb-lock-timeout . You can tune this as required based on the volume creation times on FlashBlade, as it depends on your performance and network bandwidth. You must set this time in minutes; for example, to set the timeout to 6 minutes: pxctl cluster options update --fb-lock-timeout 6 |
2.13.8
August 24, 2023
Visit these pages to see if you're ready to upgrade to this version:
Notes
- This version addresses security vulnerabilities.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description |
---|---|
PWX-33014 | With Portworx Operator 23.7.0, Portworx can dynamically load telemetry port values specified by the operator. |
PWX-30798 | Users can now schedule fstrim operations. |
Fixes
Issue Number | Issue Description |
---|---|
PWX-33006 | The FlashArray Direct Access PVCs were deleted upon a Portworx restart if they were newly created, not yet attached, and in a "Pending" state. There is no data loss since these were unpopulated volumes. Resolution: Portworx has enhanced the code to no longer delete "Pending" FADA volumes on PX startup. |
PWX-30511 | When auto fstrim was disabled, internal state data did not clear and caused manual fstrim to enter an error state. Resolution: This issue has been fixed. |
2.13.7
July 11, 2023
Visit these pages to see if you're ready to upgrade to this version:
Fixes
Issue Number | Issue Description |
---|---|
PWX-31855 | When mounting a large number of PVCs that use FADA volumes, PVC creation took a long time and crashed Portworx. Resolution: The heavyweight list of all devices API has been removed from the attach call, reducing the time taken to attach volumes. |
PWX-30551 | The node restarted when node initialization and diagnostics package collection happened at the same time. Resolution: The diagnostics package collection will not restart the node. |
PWX-21105 | Volume operations such as Attach/Detach/Mount/Unmount would get stuck if a large number of requests were sent for the same volume. Portworx would accept all requests and add them to its API queue. All requests for a specific volume are processed serially. This would cause newer requests to be queued for a longer duration. Resolution: When a request does not get processed within 30s because it is sitting behind other requests in the API queue for the same volume, Portworx will return an error to the client requesting it to try again. |
PWX-29067 | Application pods using FADA volumes were not automatically remounted in read-write mode when one of multiple configured network interfaces went down. Resolution: Portworx now enables multiple iSCSI interfaces for FlashArray connections. These interfaces must be registered with the iscsiadm -m iface command. Use the --flasharray-iscsi-allowed-ifaces cluster option to restrict the interfaces used by FADA connections. This ensures that if one of the interfaces goes down, the FADA volume stays mounted as read-write. See the sketch after this table. For more details about the flasharray-iscsi-allowed-ifaces flag, see FlashArray and FlashBlade environment variables. |
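A sketch of the PWX-29067 setup above. Only iscsiadm -m iface and the cluster option name come from the note; the interface names and the comma-separated value format are assumptions:

```shell
# List the iSCSI interfaces registered on the node:
iscsiadm -m iface
# Restrict FADA connections to specific interfaces (names and value format are assumptions):
pxctl cluster options update --flasharray-iscsi-allowed-ifaces iface0,iface1
```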
2.13.6
June 16, 2023
Visit these pages to see if you're ready to upgrade to this version:
Notes
- This version addresses security vulnerabilities.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description |
---|---|
PWX-30569 | Portworx now supports OpenShift version 4.13.0 with kernel version 5.14.0-284.13.1.el9_2.x86_64. |
Fixes
Issue Number | Issue Description |
---|---|
PWX-31647 | If any read-write volume changed to a read-only state, pods using these volumes had to be manually restarted to restore the mounts to read-write. Resolution: A background task now runs periodically (by default every 30 seconds) to check for read-only volumes and terminate managed pods using them. You can customize this time interval with the --ro-vol-pod-bounce-interval cluster option. This background task is enabled for FA DirectAccess volumes by default. To enable it for all Portworx volumes, use the --ro-vol-pod-bounce all cluster option. See the sketch after this table. |
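A sketch of the PWX-31647 cluster options above; the interval value is illustrative and assumed to be in seconds, matching the 30-second default:

```shell
# Extend the read-only pod bounce behavior to all Portworx volumes:
pxctl cluster options update --ro-vol-pod-bounce all
# Check for read-only volumes every 60 seconds (value is illustrative):
pxctl cluster options update --ro-vol-pod-bounce-interval 60
```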
2.13.5
May 16, 2023
Visit these pages to see if you're ready to upgrade to this version:
New features
Portworx by Pure Storage is proud to introduce the following new features:
- Portworx can now be deployed from Azure Marketplace with a pay-as-you-go subscription.
2.13.4
May 09, 2023
Visit these pages to see if you're ready to upgrade to the latest version:
Notes
Portworx by Pure Storage recommends upgrading to Portworx 2.13.4 if you are using Portworx 2.12.0 with Azure managed identity to avoid the PWX-30675 issue, which is explained below.
Fixes
Issue Number | Issue Description |
---|---|
PWX-30675 | During installation of Portworx 2.12.0 on AKS, Portworx checked for the AZURE_CLIENT_SECRET , AZURE_TENANT_ID and AZURE_CLIENT_ID environment variables. However, users of Azure managed identity had only set the AZURE_CLIENT_ID , resulting in a failed installation.Resolution: This issue has been fixed and now Portworx checks only for the AZURE_CLIENT_ID environment variable. |
2.13.3
April 24, 2023
Visit these pages to see if you're ready to upgrade to the latest version:
Notes
If you are currently using any Portworx 2.12 version, Portworx by Pure Storage recommends upgrading to version 2.13.3 due to the PWX-29074 issue, which is explained below.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description |
---|---|
PWX-30420 | In Portworx version 2.13.0, a prerequisite check was implemented to detect the versions of the multipath tool with known issues (0.7.x to 0.9.3) during installation or upgrade of Portworx. If a faulty version was detected, it was not possible to install or upgrade Portworx. However, this prerequisite check has now been removed, and Portworx installs or upgrades are not blocked on these faulty versions. Instead, a warning message is displayed, advising customers to upgrade their multipath package. |
PWX-29992 | In Async DR migration, a snapshot was previously created at the start of restores as a fallback in case of errors, but it added extra load with creation and deletion operations. This has been improved: Portworx no longer creates a fallback snapshot, and users can create clones from the last successfully migrated snapshot if necessary for error cases. |
Fixes
Issue Number | Issue Description |
---|---|
PWX-29640 | Incorrect allocation of floating licenses and insertion of excess data into the Portworx key-value database caused new nodes to repeatedly fail to join the Portworx cluster. Resolution: Cluster-join failures now perform a thorough cleanup to remove all temporary resources created during the failed cluster-join attempts. |
PWX-30056 | During migration, if a PVC has the sticky bit set (which prevents volumes from being deleted), it accumulated the internal snapshot that was created for the asynchronous DR deployment, thus consuming extra storage space. Resolution: The internal snapshots are now created without the sticky bit. |
PWX-30484 | The SaaS license key was not activated when installing Portworx version 2.13.0 or later. Resolution: This issue has been fixed. |
PWX-26437 | Due to a rare corner-case condition, node decommissioning could leave orphaned keys in the KVDB. Resolution: The forced node-decommission command has been modified to perform the node-decommission more thoroughly, and to clean up the orphaned data from the KVDB. |
PWX-29074 | Portworx incorrectly pinged the customer DNS server. At regular intervals, when the /etc/hosts file from the node periodically rsynced with the Portworx runc container, it temporarily removed the mappings for KVDB domain names. As a result, internal KVDB name resolution queries were incorrectly forwarded to the customer's DNS servers. Resolution: This issue has been fixed. |
PWX-29325 | The local snapshot schedule could not be changed using the pxctl CLI. An update to a previously created snapshot failed with the error Update Snapshot Interval: Failed to update volume: This IO profile has been deprecated. Resolution: You can now disable periodic snapshot schedules with the --periodic parameter, as shown in the following command: pxctl volume snap-interval-update --periodic 0 <vol-id> |
PWX-30255 | Log messages have been improved to include extra metadata in node mark-down updates. |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2063 | In an Async DR deployment, issues can occur if the --sticky flag is set to on for Portworx volumes. Workaround: Turn off the sticky bit flag on the Portworx volumes on the source cluster: PX_POD=$(kubectl get pods -l name=portworx -n <px-namespace> -o jsonpath='{.items[0].metadata.name}') kubectl exec $PX_POD -n <px-namespace> -- /opt/pwx/bin/pxctl volume update <vol-id> --sticky off |
2.13.2
April 7, 2023
Visit these pages to see if you're ready to upgrade to the latest version:
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description |
---|---|
PWX-27957 | The volume replica level in an Asynchronous DR deployment now matches the source volume's replica level at the end of each migration cycle. |
PWX-29017 | Storage stats are periodically collected and stored to improve Portworx cluster debugging. |
PWX-29976 | Portworx now supports vSphere version 8.0. |
Fixes
Issue Number | Issue Description |
---|---|
PWX-23651 | Certain workloads involving file truncates or deletes from large to very small sizes caused the volume to enter an internal error state. The issue is specific to the Ext4 filesystem because of the way it handles file truncates and deletes. As a result, PVC resize/expand operations failed. Resolution: Portworx now recognizes these specific categories of errors as fixable and automatically fixes them during the next mount. |
PWX-29353 | If multiple NFS credentials were created with the same NFS server and export paths, cloudsnaps did not work correctly. Resolution: If the export paths are different with the same NFS server, they now get mounted at different mount points, avoiding this issue. |
PWX-28898 | Heavy snapshot loads caused delays in snapshot completion. This caused replicas to lag and the backend storage pool to keep consuming space. Resolution: You can increase the time Portworx waits for storage to complete the snapshot. This will cause the replicas to remain in the pool until the next Portworx service restart, which performs garbage collection of such replicas. |
PWX-28882 | Upgrades or installations of Portworx on Nomad with cloud drives failed at bootup. Impacted versions: 2.10.0 and later. Resolution: Portworx version 2.13.2 can successfully boot up on Nomad with cloud drives. |
PWX-29600 | The VPS Exists operator did not work when the value of the key parameter was empty. Resolution: The VPS Exists operator now allows empty values for the key parameter without failing. |
PWX-29719 | On FlashArray cloud drive setup, if some iSCSI interfaces could log in successfully while others failed, the FlashArray connection sometimes failed with the failed to log in to all paths error. This prevented Portworx from restarting successfully in clusters with known network issues. |
PWX-29756 | If FlashArray iSCSI attempted to log in several times, it timed out, creating extra orphaned volumes on the FlashArray. Resolution: The number of retries has been limited to 3. |
PWX-28713 | Kubernetes nodes with Fully Qualified Domain Names (FQDNs) detected FlashArray cloud drives as partially attached. This prevented Portworx from restarting successfully if the FlashArray host name did not match the name of the node, such as with FQDNs. |
PWX-30003 | A race condition when updating volume usage in auto fstrim resulted in a Portworx restart. |
2.13.1
April 4, 2023
Visit these pages to see if you're ready to upgrade to the latest version:
New features
Portworx by Pure Storage is proud to introduce the following new features:
- Portworx can now be deployed from the GCP Marketplace with the following new offerings. You can also change between these offerings after deploying Portworx by changing the value of the PRODUCT_PLAN_ID environment variable within your StorageCluster spec (see the example after this list):
  - PX-ENTERPRISE
  - PX-ENTERPRISE-DR
  - PX-ENTERPRISE-BAREMETAL
  - PX-ENTERPRISE-DR-BAREMETAL
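For example, here is a hypothetical way to switch plans after deployment; the namespace, cluster name, and plan value are placeholders, and this assumes the variable is set through spec.env on the StorageCluster.

```shell
# Sketch only: a merge patch replaces the whole spec.env list, so include any other
# environment variables your StorageCluster already defines.
kubectl -n <px-namespace> patch storagecluster <cluster-name> --type merge \
  -p '{"spec":{"env":[{"name":"PRODUCT_PLAN_ID","value":"PX-ENTERPRISE-DR"}]}}'
```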
Fixes
Issue Number | Issue Description |
---|---|
PWX-29572 | In Portworx 2.13.0, the PSO2PX migration tool failed with the error pre-create filter failed: CSI PVC Name/Namespace not provided to this request due to a change made in the Portworx CSI Driver. Resolution: Use Portworx 2.13.1 when migrating from PSO to Portworx; the migration tool fails with Portworx 2.13.0. |
2.13.0
February 23, 2023
Visit these pages to see if you're ready to upgrade to the latest version:
Notes
A known issue with multipath tool versions 0.7.x to 0.9.3 causes high CPU usage and/or multipath crashes that disrupt IO operations. To prevent this, starting with version 2.13.0, Portworx performs a prerequisite check to detect these faulty multipath versions. If this check fails, it is not possible to install or upgrade Portworx. Portworx by Pure Storage recommends upgrading the multipath tool to version 0.9.4 before upgrading to any Portworx 2.13 version.
New features
Portworx by Pure Storage is proud to introduce the following new features:
- You can now install Portworx on Oracle Container Engine for Kubernetes.
- You can now use Portworx on FlashArray NVMe/RoCE.
Improvements
Portworx has upgraded or enhanced functionality in the following areas:
Improvement Number | Improvement Description |
---|---|
PWX-27200 | Added new pxctl commands for auto fstrim. |
PWX-28351 | You can now enable pay-as-you-go billing for Docker Swarm. |
PWX-27523 | CSI sidecar images are updated to the latest open source versions. |
PWX-27920 | Batching is now enabled in the metrics collector to reduce memory usage on large scale clusters. |
PWX-28137 | The Portworx maintained fork for CSI external-provisioner has been removed in favor of the open source version. |
PWX-28149 | The Portworx CSI Driver now distributes volume deletion across the entire cluster for faster burst-deleting of many volumes. |
PWX-28131 | Pool expansion for repl1 volumes is now supported on all cloud environments, except in certain scenarios. |
PWX-28277 | Updated stork-scheduler deployment and stork-config map in the spec generator to use Kube Scheduler Configuration for Kubernetes version 1.23 or newer. |
PWX-28363 | Reduced the number of vSphere API calls made during Portworx bootup on vSphere. This significantly improves Portworx upgrade times in environments where vSphere servers are overloaded. |
PWX-10054 | Portworx can now monitor the health of an internal KVDB, and when it is detected as unhealthy, Portworx can initiate KVDB failover. |
PWX-27521 | The Portworx CSI driver now supports version 1.7 of the CSI spec. |
Fixes
Issue Number | Issue Description |
---|---|
PWX-23203 | In some cases, migration or Asynchronous DR failed when the source volume was being resized. Resolution: On the destination cluster, Portworx now resizes the volume before migration operations. |
PWX-26061 | Deleting cloudsnaps using the curl command on a gRPC port failed. Resolution: A separate field has been added for providing the bucket ID. |
PWX-26928 | Portworx installation would fail when unattended-upgr was running on the system or when Portworx was unable to lock the necessary packages for installation. Resolution: Re-attempt the installation after waiting for the lock to be released. |
PWX-27506 | When a node was down for a long time, cloudsnap restores took longer to start. Resolution: Portworx now allows other nodes in the cluster to process such restore requests. |
PWX-28305 | Portworx hit a lock hold timeout assert while detaching sharedv4 service volumes when Kubernetes API calls were being rate limited. Resolution: To avoid this assert, the Kubernetes API calls are now made outside the context of a lock. |
PWX-28422 | Snapshot and cloudsnapshot requests failed if a volume was in the detached state and one of its coordinators had changed its IP address. Resolution: Portworx now reattaches the volume with the correct IP address on snapshot and cloudsnapshot requests for detached volumes. |
PWX-28224 | The pxctl cd list command failed to fetch the cloud drives when run from hot nodes (nodes with local storage). Resolution: This issue has been fixed. |
PWX-28225 | The summary of the pxctl cd list command showed all nodes as cloud drive nodes. Resolution: The output of the command has been reworked. |
PWX-28321 | The output of pxctl cd list showed storageless nodes even though no storageless nodes were present in the cluster. Resolution: Wait for the Portworx cleanup job, which runs every 30 minutes, to complete. |
PWX-28341 | In the NodeStart phase, if a gRPC request for getting node stats was invoked before completion of the pxdriver bootstrap, Portworx would abruptly stop. Resolution: Now Portworx returns an error instead of stopping abruptly. |
PWX-28285 | The high frequency of sharedv4 volume operations (such as create, attach, mount, unmount, detach, or delete) requires frequent changes to NFS exports. This was causing the NFS server to stop responding and a potential node restart. Resolution: When applying changes to NFS exports, Portworx now combines multiple changes together and sends a single batch update to the NFS server. Portworx also limits the frequency of NFS server restarts to prevent such issues. |
PWX-28529 | Fixed an issue where volumes with replicas on a node in pool maintenance were temporarily marked as out of quorum when the replica node exited pool maintenance. |
PWX-28551 | In Portworx version 2.12.1, one of the sanitizing operations changed upper case letters to lower case letters. This caused CSI pod registration issues during the upgrade. Resolution: This issue is fixed as Portworx now adheres to the regular expression for topology label values. |
PWX-28539 | During the attachment of FlashArray (FA) NVMe volumes, Portworx performs stale device cleanup. However, this cleanup process sometimes failed when the device was busy, causing the volume attachment to fail. Resolution: The FA NVMe volumes can now be attached, even if the stale cleanup fails. |
PWX-28614 | Fixed a bug where pool expansion of pools with repl1 volumes did not abort. |
PWX-28910 | In a Synchronous DR deployment, if the domains were imbalanced and one domain was over-provisioned, all the replicas of a new volume could land in the same domain. Resolution: Replicas are now forced to spread across the failure domains during volume creation in a Synchronous DR deployment. If such provisioning is not possible, the volume creation operation will fail. You can use the pxctl cluster options update --metro-dr-domain-protection off command to disable this protection. |
PWX-28909 | When an error occurred during CSI snapshots, the Portworx CSI driver incorrectly marked the snapshot as ready for consumption, resulting in failures when restoring PVCs from such a snapshot. Resolution: This has been fixed; you can now create a snapshot and immediately hydrate a new PVC with the snapshot contents. |
PWX-29186 | Fields required for Volume Placement Strategy were missing from the CSI volume spec.VolumeLabels. As a result, Volume Placement Strategies that rely on the namespace failed to place volumes correctly. Resolution: While some simple volume placement strategies may work without this fix, CSI users who use Volume Placement Strategies should upgrade to Portworx version 2.13.0. |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-1859 | When storage is full, a repl 1 volume will be in the NOT IN QUORUM state and a deadlock occurs, so you cannot expand the pool. Workaround: To expand the pool, pass the --dont-wait-for-clean-volumes option as part of the expand command (see the example after this table). |
PD-1866 | When using FlashArray Cloud Drives and FlashArray Direct Access volumes, Portworx version 2.13.0 does not support Ubuntu versions 20.04 and 22.x with the default multipath package (version 0.8.x). Workaround: Portworx requires version 0.9.4 of the multipath-tools package. Reach out to the support team if you need help building the package. |
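For PD-1859, here is a hypothetical example of passing the workaround flag, assuming the pool is expanded with pxctl service pool expand; the namespace, pool UID, and size are placeholders.

```shell
# Sketch only: expand a pool holding a repl 1 volume without waiting for clean volumes.
PX_POD=$(kubectl get pods -l name=portworx -n <px-namespace> -o jsonpath='{.items[0].metadata.name}')
kubectl exec $PX_POD -n <px-namespace> -- /opt/pwx/bin/pxctl service pool expand \
  --uid <pool-uid> --size <new-size-in-GiB> --operation add-disk \
  --dont-wait-for-clean-volumes
```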
2.12.6
September 3, 2023
Visit these pages to see if you're ready to upgrade to the latest version:
Fixes
Issue Number | Issue Description |
---|---|
PWX-33389 | Portworx CSI license validation for FA/FB failed when Purity was upgraded to version 6.4.2 or newer. This caused the Portworx license status to appear expired, and users were not able to create new volumes. Resolution: This issue was fixed in Portworx version 3.0.1 and has now been backported to 2.12.6. |
Known issues (Errata)
Issue Number | Issue Description |
---|---|
PD-2349 | When you upgrade Portworx to a higher version, the upgrade succeeds, but the Portworx CSI license renewal can take a long time. Workaround: Run the pxctl license reset command to reflect the correct license status. |
PD-2350 | Upgrades on some nodes may become stuck with the following message: This node is already initialized but could not be found in the cluster map. This issue can be caused by an orphaned storageless node. Workaround: Verify that the node reporting this error is a storageless node. If it is, delete the orphaned storageless node using the command pxctl clouddrive delete --node <> so the upgrade can progress (see the example below). |
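For PD-2350, here is a sketch of how the orphaned storageless node might be identified and removed from inside a Portworx pod; the namespace and node ID are placeholders.

```shell
# Sketch only: confirm the stuck node is storageless, then remove it so the upgrade can proceed.
PX_POD=$(kubectl get pods -l name=portworx -n <px-namespace> -o jsonpath='{.items[0].metadata.name}')

# List cloud drive sets to check whether the stuck node owns any drives (exact output varies).
kubectl exec $PX_POD -n <px-namespace> -- /opt/pwx/bin/pxctl clouddrive list

# Delete the orphaned storageless node by its node ID.
kubectl exec $PX_POD -n <px-namespace> -- /opt/pwx/bin/pxctl clouddrive delete --node <node-id>
```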
2.12.5
May 09, 2023
Visit these pages to see if you're ready to upgrade to the latest version: