Relaxed Reclaim in OCP on bare metal
Summary and Key concepts
Summary:
The article explains how to use the RelaxedReclaim feature in Portworx to manage and spread out the deletion of volume and snapshot replicas, thereby preventing high filesystem latencies and preserving I/O performance. When deleting large numbers of replicas at once, this feature stages the delete operations in a queue and allows users to specify a delay (in seconds) between each delete operation. By pacing the deletion of snapshots and volumes, the underlying filesystem can handle other operations more efficiently. The article provides commands to enable, reconfigure, and disable RelaxedReclaim, as well as how to set a limit for the maximum number of pending delete operations.
Kubernetes Concepts:
- PersistentVolume: The article deals with deletion of Portworx-managed persistent volumes and snapshots, which are Kubernetes storage objects.
- Filesystem Performance: RelaxedReclaim helps maintain performance by controlling the rate of volume and snapshot deletions to reduce strain on the filesystem.
Portworx Concepts:
- pxctl Command: The command-line tool for managing Portworx, used in the article to configure RelaxedReclaim options such as the deletion delay (relaxedreclaim-delete-seconds) and the maximum number of pending operations (relaxedreclaim-max-pending).
When Portworx deletes snapshots and volumes, it also deletes their replicas. In some scenarios, you may delete a large number of replicas at once. These delete requests can overwhelm the underlying filesystem, causing high filesystem latencies and reducing I/O performance. Using RelaxedReclaim, you can stage snapshot and volume replica delete operations in a queue and spread them out over time, giving the filesystem enough bandwidth to handle front-end I/O and lowering filesystem latencies.
RelaxedReclaim works by allowing you to specify a delay, in seconds, between each operation in the queue. Whenever you delete a volume or snapshot, Portworx places one delete operation in the queue for each replica targeted by that deletion. The total duration of the RelaxedReclaim operation is then the number of queued replicas in a pool multiplied by the delay, in seconds, between deletes.
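The duration estimate described above can be sketched as simple arithmetic. This is a minimal illustration, assuming exactly one replica delete is issued per delay interval; estimate_drain_seconds is a hypothetical helper name and is not part of pxctl.

```shell
# Estimate how long a RelaxedReclaim queue takes to drain.
# Assumption: one replica delete is issued every delay_seconds.
# estimate_drain_seconds is a hypothetical helper, not a Portworx command.
estimate_drain_seconds() {
    replicas=$1        # number of queued replica delete operations
    delay_seconds=$2   # value given to relaxedreclaim-delete-seconds
    echo $(( replicas * delay_seconds ))
}

# Example: 120 queued replica deletes with a 5-second delay.
estimate_drain_seconds 120 5   # prints 600 (10 minutes)
```

An estimate like this can help when choosing a delay: a larger value is gentler on the filesystem, but the queue takes proportionally longer to empty.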
For example, a snapshot policy defines the snapshot frequency and how many snapshots are retained. Once the number of retained snapshots reaches the limit defined in the policy, Portworx deletes an old snapshot every time a new one is taken. On clusters that have a large number of devices that all share the same snapshot frequency, Portworx sends a large number of concurrent delete requests for all the replicas to the storage pool. RelaxedReclaim paces these snapshot replica deletes, preserving filesystem performance.
Enable or reconfigure RelaxedReclaim
The relaxedreclaim-delete-seconds flag sets the time window of a single replica delete. For example, a relaxedreclaim-delete-seconds value of 60 means that 1 volume/snapshot replica will be deleted every 60 seconds. By default, the RelaxedReclaim value is set to zero, which leaves the feature disabled. To enable or reconfigure it, you must change the value as a cluster option.
Enter the following pxctl command:
pxctl cluster options update --relaxedreclaim-delete-seconds 5
Once you've applied the cluster option, Portworx will stage any new snapshot and volume deletions and delete them from the pool at the rate you specified.
A Portworx service restart is not required to enable this feature.
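If you script this change, it can help to validate the delay and preview the command before running it. The following is a minimal sketch, not a Portworx feature: set_relaxed_reclaim_delay and the PX_DRY_RUN variable are hypothetical names introduced for illustration, and the only real command used is the pxctl invocation shown above.

```shell
# Hypothetical wrapper around the pxctl command shown above.
# set_relaxed_reclaim_delay and PX_DRY_RUN are illustrative names only.
set_relaxed_reclaim_delay() {
    delay=$1
    # Accept only non-negative whole numbers of seconds; 0 disables the feature.
    case "$delay" in
        ''|*[!0-9]*)
            echo "delay must be a non-negative integer, got: '$delay'" >&2
            return 1
            ;;
    esac
    if [ "${PX_DRY_RUN:-0}" = "1" ]; then
        # Dry run: print the command instead of running it.
        echo "pxctl cluster options update --relaxedreclaim-delete-seconds $delay"
    else
        pxctl cluster options update --relaxedreclaim-delete-seconds "$delay"
    fi
}

# Preview the command without touching the cluster.
PX_DRY_RUN=1 set_relaxed_reclaim_delay 5
```

Without PX_DRY_RUN=1, the function runs the real command, so it needs pxctl on the PATH of a Portworx node.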
Disable RelaxedReclaim
To disable the RelaxedReclaim feature, set the value back to 0:
pxctl cluster options update --relaxedreclaim-delete-seconds 0
When you disable this feature, Portworx immediately deletes any volumes or snapshots that are still staged in the RelaxedReclaim queue.
You do not need to restart the Portworx service to disable this feature.