Delete NFS Backups with Job Limit
When you delete backups that reside on an NFS backup location, Portworx Backup initiates a cleanup process to delete the backup data from the storage backend. For each PVC, Portworx Backup spawns a separate delete job that deletes the files and directories associated with the volume. Previously, there was no upper limit on the number of delete jobs that could run concurrently. This behavior allowed rapid cleanup through parallel deletion, particularly when a large number of backups were deleted at once.
However, in application clusters with strict job quotas (such as clusters that enforce a limit on the number of concurrent jobs), this parallel deletion approach can overwhelm system resources, potentially leading to performance issues or job failures due to exceeding the allowed job limit.
To address this, Portworx Backup now introduces a configurable concurrency limit for delete jobs. This enhancement enables users to:
- Control how many delete jobs run in parallel during backup deletion
- Prevent cluster overload and adhere to quota restrictions
This topic provides instructions on applying the new concurrency limit to manage deletion operations in constrained environments.
Prerequisites
- Access to Portworx Backup web console
- Privileges to delete backups
- NFS backups configured in the system
- Portworx Backup 2.10.0 or later
NFS delete job limits
NFS delete job limits are part of the px-backup-config ConfigMap. This ConfigMap resides in the namespace where Portworx Backup is deployed (often referred to as <pxb-namespace>) on the Portworx Backup cluster.
| Parameter | Description | Default Value | Lower Limit | Upper Limit |
|---|---|---|---|---|
| NFS_DELETE_JOB_LIMIT | Limits the number of job pods spawned to delete Portworx volumes during backup deletion. | 25 | 1 | Based on cluster pod quota limit |
| NFS_CSI_DELETE_JOB_LIMIT | Limits the number of job pods created to delete CSI volumes during backup deletion. | 25 | 1 | Based on cluster pod quota limit |
| KDMP_DELETE_JOB_LIMIT | Limits the number of job pods spawned to delete CSI + Offload volumes during backup deletion. This is used for [CSI+Offload](link to backup topic). | 5 | 1 | Based on cluster pod quota limit |
For more information, see px-backup-config ConfigMap parameters.
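The defaults and lower limits in the table can be checked before you edit the ConfigMap. The following is a minimal sketch, not part of Portworx Backup: `resolve_limits` is a hypothetical helper that takes a dictionary shaped like the ConfigMap's `data` section (where every value is a string), falls back to the documented defaults for missing keys, and rejects values below the lower limit of 1. How Portworx Backup itself handles invalid values is not described in this topic.

```python
# Defaults and lower limit taken from the parameter table above.
DEFAULTS = {
    "NFS_DELETE_JOB_LIMIT": 25,
    "NFS_CSI_DELETE_JOB_LIMIT": 25,
    "KDMP_DELETE_JOB_LIMIT": 5,
}
LOWER_LIMIT = 1


def resolve_limits(configmap_data):
    """Return the effective limit for each parameter (hypothetical helper).

    configmap_data mirrors the `data` section of a ConfigMap, where every
    value is a string. Missing keys fall back to the documented defaults.
    """
    resolved = {}
    for name, default in DEFAULTS.items():
        raw = configmap_data.get(name)
        if raw is None:
            resolved[name] = default
            continue
        try:
            value = int(raw)
        except ValueError:
            raise ValueError(f"{name} must be an integer string, got {raw!r}")
        if value < LOWER_LIMIT:
            raise ValueError(f"{name} must be at least {LOWER_LIMIT}, got {value}")
        resolved[name] = value
    return resolved


# Only one key is set, so the other two keep their defaults.
print(resolve_limits({"NFS_DELETE_JOB_LIMIT": "50"}))
```

The upper limit is not checked here because, per the table, it depends on the pod quota of your cluster.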
How it works
Consider a scenario where you select two Portworx volume backups that use NFS backend storage for deletion through Portworx Backup. Suppose these backups collectively have 100 PVCs.
If you configure NFS_DELETE_JOB_LIMIT: "50", Portworx Backup will initiate the deletion process by launching 50 delete jobs, each responsible for cleaning up one PVC.
The deletion workflow proceeds in these stages:
- 50 PVCs are deleted in parallel. The remaining 50 PVCs are queued.
- As each delete job completes, the next queued PVC is picked up for deletion. Throughout the process, the number of delete job pods stays at 50 until fewer than 50 PVCs remain.
This continues until all 100 PVCs are deleted. This throttled approach ensures that the number of concurrent delete pods does not exceed the configured limit of 50, preventing resource exhaustion on clusters with job quotas.
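The throttling pattern described above can be illustrated with a small simulation. This sketch is an assumption-laden stand-in, not Portworx Backup's implementation: a thread pool plays the role of the job limit, each thread represents one delete job, and a short sleep stands in for the file removal. The names `simulate_deletion` and `delete_job` are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def simulate_deletion(pvc_names, job_limit, job_seconds=0.01):
    """Run one simulated delete job per PVC, at most job_limit at a time.

    Returns the list of cleaned-up PVCs and the peak number of jobs that
    were in flight at the same moment.
    """
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}
    deleted = []

    def delete_job(pvc_name):
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(job_seconds)  # stand-in for deleting files and directories
        with lock:
            state["running"] -= 1
            deleted.append(pvc_name)

    # The pool size acts as the limit: a worker picks up the next queued
    # PVC as soon as it finishes its current one.
    with ThreadPoolExecutor(max_workers=job_limit) as pool:
        list(pool.map(delete_job, pvc_names))
    return deleted, state["peak"]


pvcs = [f"pvc-{i}" for i in range(100)]  # two backups, 100 PVCs in total
deleted, peak = simulate_deletion(pvcs, job_limit=50)
print(f"deleted={len(deleted)} peak_concurrency={peak}")
```

All 100 simulated PVCs are cleaned up, and the peak concurrency never exceeds the configured limit of 50.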
The deletion time is primarily determined by the number and size of the PVCs contained within the selected backups and by the configured limit, not by the total number of backups chosen for deletion.
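To get a rough feel for how the limit affects total deletion time, you can model the queue directly. The sketch below is a simplified model under stated assumptions: the per-PVC durations are hypothetical (in practice they depend on PVC size and NFS throughput), jobs start in list order, and a queued job starts the moment a slot frees up. `estimate_deletion_time` is an illustrative helper, not a Portworx Backup API.

```python
import heapq


def estimate_deletion_time(job_durations, job_limit):
    """Estimate the total time when at most job_limit jobs run at once.

    job_durations: seconds each PVC's delete job is assumed to take.
    """
    if job_limit < 1:
        raise ValueError("job_limit must be at least 1")
    slots = []  # min-heap of finish times for the jobs currently running
    finish_time = 0.0
    for duration in job_durations:
        # If every slot is busy, wait for the earliest running job to finish.
        start = heapq.heappop(slots) if len(slots) >= job_limit else 0.0
        end = start + duration
        heapq.heappush(slots, end)
        finish_time = max(finish_time, end)
    return finish_time


durations = [30.0] * 100  # 100 PVCs, each assumed to take 30 seconds

print(estimate_deletion_time(durations, job_limit=50))  # 60.0
print(estimate_deletion_time(durations, job_limit=25))  # 120.0
```

The function takes only the per-PVC durations and the limit as input. How the PVCs are grouped into backups does not appear in the model, which matches the statement above.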
For more information on states of deletion and how to delete a backup, refer to Delete backup.
Resource deletion is performed as a single operation. All resources associated with a backup are deleted by a single pod.
How to configure
To configure delete job limits, update the px-backup-config ConfigMap on the Portworx Backup cluster. For more information, see px-backup-config ConfigMap.
Example configuration:
NFS_DELETE_JOB_LIMIT: "50"
NFS_CSI_DELETE_JOB_LIMIT: "50"
KDMP_DELETE_JOB_LIMIT: "10"
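If you prefer to script the update, the example values can be wrapped in a JSON merge patch for the ConfigMap. This is a minimal sketch that assumes you apply ConfigMap changes with standard Kubernetes tooling. `build_configmap_patch` is a hypothetical helper. It only builds the payload and does not contact a cluster.

```python
import json

# Values from the example configuration above. ConfigMap data values
# must be strings, so the numbers stay quoted.
limits = {
    "NFS_DELETE_JOB_LIMIT": "50",
    "NFS_CSI_DELETE_JOB_LIMIT": "50",
    "KDMP_DELETE_JOB_LIMIT": "10",
}


def build_configmap_patch(data):
    """Return a JSON merge patch that sets keys under the ConfigMap's data."""
    for key, value in data.items():
        if not isinstance(value, str):
            raise TypeError(f"{key}: ConfigMap values must be strings")
    return json.dumps({"data": data}, indent=2)


patch = build_configmap_patch(limits)
print(patch)
```

One way to apply such a payload is `kubectl patch configmap px-backup-config -n <pxb-namespace> --type merge -p '<payload>'`. This topic does not state whether further steps are required after the update, so refer to the px-backup-config ConfigMap topic for that.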
Best Practices
- Start with conservative limits and increase gradually only if your cluster has sufficient resources.
- Monitor job behavior and system load during large-scale deletions.
- Coordinate with your platform or DevOps team if your cluster has a predefined Job quota (for example, 250 delete jobs).
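When you discuss quotas with your platform team, a quick worst-case calculation can help. The sketch below makes a conservative assumption that is not confirmed by this topic: deletions of all three volume types run at the same time, so the three limits add up. The helper names and the `reserved_for_other_jobs` parameter are hypothetical.

```python
def worst_case_delete_jobs(limits):
    """Upper bound on concurrent delete jobs if all three types run at once."""
    return sum(limits.values())


def fits_job_quota(limits, job_quota, reserved_for_other_jobs=0):
    """Check whether the limits leave room within the cluster's job quota."""
    return worst_case_delete_jobs(limits) + reserved_for_other_jobs <= job_quota


# Example values from this topic, as integers.
limits = {
    "NFS_DELETE_JOB_LIMIT": 50,
    "NFS_CSI_DELETE_JOB_LIMIT": 50,
    "KDMP_DELETE_JOB_LIMIT": 10,
}

print(worst_case_delete_jobs(limits))                                      # 110
print(fits_job_quota(limits, job_quota=250))                               # True
print(fits_job_quota(limits, job_quota=250, reserved_for_other_jobs=150))  # False
```

If the check fails, lower one or more limits or ask for a larger quota before running a large-scale deletion.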