Configure tolerations and resource sizing in PDS
Taints and tolerations are mechanisms in Kubernetes used to control which pods can be scheduled on specific nodes.
Taints
Taints are restrictions applied to nodes that prevent certain pods from being scheduled on them unless those pods explicitly indicate they can tolerate the taints. They ensure that specific nodes are reserved for particular workloads or purposes, such as high-memory tasks, GPU-enabled applications, or critical system services.
Tolerations
Tolerations are configurations applied to pods that allow them to tolerate or match specific taints on nodes. A toleration does not force a pod onto a node; it merely permits the pod to be scheduled there if needed. They allow pods to bypass taints and be scheduled on restricted nodes.
How taints and tolerations work together:
A taint is added to a node, such as:
kubectl taint nodes node1 key=value:NoSchedule
This prevents any pods without a matching toleration from being scheduled on node1.
A toleration is added to a pod, such as:
tolerations:
- key: key
  operator: Equal
  value: value
  effect: NoSchedule

This allows the pod to be scheduled on node1 despite the taint.
Node configuration example
A node with high memory has the following taint:
kubectl taint nodes high-memory-node memory=high:NoSchedule
Pod configuration example
A pod needing high memory adds a matching toleration:
tolerations:
- key: memory
  operator: Equal
  value: high
  effect: NoSchedule

This allows the pod to be scheduled on high-memory-node despite the taint. To ensure the pod runs only on that node, combine the toleration with a node selector or node affinity.
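For context, here is a minimal sketch of a complete Pod manifest showing where the toleration block sits. The pod name, the container image, and the memory=high node label used by nodeSelector are illustrative assumptions, not values defined by PDS:

apiVersion: v1
kind: Pod
metadata:
  name: high-memory-app        # hypothetical pod name
spec:
  # Permits scheduling onto nodes tainted with memory=high:NoSchedule
  tolerations:
  - key: memory
    operator: Equal
    value: high
    effect: NoSchedule
  # Optional: pin the pod to the high-memory node; assumes the node carries a
  # matching label (for example, kubectl label nodes high-memory-node memory=high)
  nodeSelector:
    memory: high
  containers:
  - name: app
    image: nginx:1.25          # placeholder image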
Resource sizing
Resource sizing in Kubernetes helps define how much CPU and memory a pod can request or use. This ensures fair resource allocation and prevents resource overcommitment.
Predefined sizes:
- Small: Ideal for lightweight applications requiring minimal resources, such as microservices, development environments, and basic monitoring agents. It provides just enough CPU and memory to support non-critical workloads. Typical allocation includes 200m CPU and 256Mi memory requests, with limits at 500m CPU and 512Mi memory.
- Medium: Offers a balanced configuration for moderate workloads like web applications, APIs, and small to medium-sized databases. It ensures stable performance with resource requests of 500m CPU and 1Gi memory, and limits set at 1 CPU and 2Gi memory. This size is suitable for most production applications.
- Large: Designed for high-performance workloads such as big data processing, machine learning, enterprise databases, and resource-intensive backup jobs. It provides ample resources with 1 CPU and 4Gi memory requests, and limits at 2 CPUs and 8Gi memory, ensuring reliable performance for critical tasks.
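As a rough sketch of what these profiles translate to, the Small size described above corresponds to a container resource block like the following. The values are taken from the list above; the exact spec that PDS generates may differ:

resources:
  requests:
    cpu: 200m        # 0.2 CPU cores
    memory: 256Mi
  limits:
    cpu: 500m        # 0.5 CPU cores
    memory: 512Mi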
Custom sizing
Allows administrators to explicitly define the exact CPU and memory requirements for workloads:
- Requests: The minimum resources a pod needs to run. If a node doesn’t have the requested resources, the pod won’t be scheduled.
- Limits: The maximum resources a pod can use. If a pod exceeds its CPU limit, it is throttled; if it exceeds its memory limit, it may be terminated (OOMKilled).
Example:
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: 2
    memory: 4Gi
Requests:
- CPU: 500 millicores (half a CPU core)
- Memory: 1 GiB
- The scheduler ensures the node has at least these resources available before scheduling the pod.
Limits:
- CPU: 2 cores
- Memory: 4 GiB
- The pod cannot exceed these limits, ensuring resource control.
Importance of tolerations and resource sizing
- Cluster efficiency: Ensure workloads are scheduled appropriately based on node capabilities.
- Fair resource allocation: Prevent resource monopolization by limiting resource usage for pods.
- Flexibility: Accommodate diverse workloads with varying resource needs.
- Isolation: Reserve nodes for specific workloads using taints, and allow only matching pods to access them using tolerations.
Configure tolerations using APIs
Using PDS APIs, you can configure tolerations during deployment creation, updates, and restore operations. Here’s a detailed explanation of how to manage tolerations through the APIs:
The PDS APIs listed below are accessible exclusively via the HTTP/2 (gRPC) protocol. Ensure that your client or integration supports gRPC to interact with these APIs effectively.
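As an illustrative sketch only, a gRPC-capable client such as grpcurl can submit these requests over HTTP/2. The endpoint, service, and method names below are placeholders rather than the actual PDS API identifiers; refer to the PDS API reference for the real values:

# Hypothetical invocation; the host, service, and method names are placeholders.
# -d @ reads the JSON request body from stdin.
grpcurl \
  -H "Authorization: Bearer $PDS_API_TOKEN" \
  -d @ \
  pds-api.example.com:443 \
  pds.v1.DeploymentService/CreateDeployment < create-deployment.json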
Create Deployment API
When creating a deployment, you can specify tolerations in the topologies field of the API request. This configuration ensures that the pods are scheduled on nodes with matching taints.
- Tolerations are included in the deployment specification to match the taints on the target nodes.
- The scheduler ensures that pods are scheduled only on nodes where tolerations match the taints.
Example deployment request:
{
  "topologies": [{
    "podSchedulingConfig": {
      "tolerations": [
        {
          "key": "node-type",
          "operator": "EQUAL",
          "value": "high-memory",
          "effect": "NO_SCHEDULE"
        }
      ]
    }
  }]
}
where:

- `key`: Specifies the taint key on the target node (for example, node-type).
- `operator`: Defines the matching condition (EQUAL or EXISTS).
- `value`: The value of the taint key (for example, high-memory).
- `effect`: Determines the taint effect (NO_SCHEDULE, PREFER_NO_SCHEDULE, or NO_EXECUTE).
Explicitly empty tolerations
- Specify an empty list of tolerations in the request.
- Ensures no tolerations are applied, allowing pods to be scheduled on any available node without matching taints.
Use case: Allowing flexibility in node selection for non-critical workloads.
Example:
{
  "topologies": [{
    "podSchedulingConfig": {
      "tolerations": []
    }
  }]
}
No tolerations specified
- If the `podSchedulingConfig` field is omitted, no tolerations are applied.
- Use case: General-purpose deployments without specific node constraints.
Update Deployment API
After creating a deployment, you can modify tolerations using the update deployment API. This allows you to:
- Add new tolerations.
- Modify existing tolerations.
- Remove tolerations.
Use cases
- Scaling: When new nodes with different taints are added to the cluster, update the deployment tolerations to match those nodes.
- Optimization: Adjust tolerations to improve workload placement and resource utilization.
Example update request:
{
  "topologies": [{
    "podSchedulingConfig": {
      "tolerations": [
        {
          "key": "zone",
          "operator": "EQUAL",
          "value": "us-west",
          "effect": "NO_SCHEDULE"
        }
      ]
    }
  }]
}
Explicitly empty tolerations
- Specify an empty list of tolerations to remove all tolerations from the deployment.
- Ensures pods are scheduled without any node-specific constraints.
Use case: Generalizing deployments for broader node availability.
Example:
{
  "topologies": [{
    "podSchedulingConfig": {
      "tolerations": []
    }
  }]
}
No tolerations specified
- If the `podSchedulingConfig` field is omitted in the update request, existing tolerations in the deployment remain unchanged.
- Use case: Maintaining current tolerations when modifying other deployment attributes.
Create Restore API
You can also configure tolerations during restore operations. The behavior depends on how tolerations are specified in the restore request.
With new tolerations
- Explicitly define tolerations for the restore operation.
- These tolerations override any tolerations from the source deployment.
Use case: Restoring data to a node with different taints than the original deployment.
Example:
{
  "config": {
    "podSchedulingConfig": {
      "tolerations": [
        {
          "key": "backup-zone",
          "operator": "EQUAL",
          "value": "zone1",
          "effect": "NO_SCHEDULE"
        }
      ]
    }
  }
}
Explicitly empty tolerations
- Specify an empty list of tolerations in the restore request.
- This configuration ensures no tolerations are applied to the restored pods.
Use case: Removing all tolerations for the restored pods to allow them to be scheduled anywhere in the cluster.
Example:
{
  "config": {
    "podSchedulingConfig": {
      "tolerations": []
    }
  }
}
Default tolerations
If no tolerations are specified in the restore request, the tolerations from the source deployment are inherited.
Use case: Maintaining consistency with the original deployment tolerations.
Summary of toleration behaviors
API | Toleration field | Behavior |
---|---|---|
Create Deployment | New tolerations | Applies the specified tolerations for pod scheduling. |
Create Deployment | Empty tolerations | Ensures no tolerations are applied, allowing scheduling on any node. |
Create Deployment | No tolerations specified | Default behavior with no node-specific constraints. |
Update Deployment | New tolerations | Adds or updates tolerations for pod scheduling. |
Update Deployment | Empty tolerations | Removes all tolerations from the deployment. |
Update Deployment | No tolerations specified | Leaves existing tolerations unchanged. |
Create Restore | New tolerations | Overrides the source deployment tolerations with the specified tolerations. |
Create Restore | Empty tolerations | Removes all tolerations from the restored pods. |
Create Restore | No tolerations specified | Inherits tolerations from the source deployment. |
API behavior for tolerations with invalid or undefined values
When deploying a data service using the PDS API, it is important to understand the behavior of the `effect` and `operator` fields in `podSchedulingConfig`. The API does not validate these fields against the predefined enum values. As a result, if you provide invalid or random values, they are treated as unspecified, which may lead to unintended scheduling behavior.
Example of an incorrect `effect` value in the request:

"podSchedulingConfig": {
  "tolerations": [
    {
      "key": "test",
      "operator": "EXISTS",
      "effect": "dummy_value"
    }
  ]
}
The API defaults to `EFFECT_UNSPECIFIED` when the provided value for `effect` (dummy_value) is not valid:

Tolerations:
  node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
  node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
  test op=Exists
API response for the incorrect `effect`:

The API replaces the invalid value with `EFFECT_UNSPECIFIED`:

"tolerations": [
  {
    "key": "key1",
    "operator": "EQUAL",
    "value": "val1",
    "effect": "EFFECT_UNSPECIFIED"
  }
]
- The API does not enforce validation for the `effect` and `operator` fields.
- Invalid, undefined, or random values are treated as unspecified.
- Unspecified values:
  - `effect`: Defaults to `EFFECT_UNSPECIFIED`, which means it matches all taint effects.
  - `operator`: Defaults to `OPERATOR_UNSPECIFIED`, which may lead to ambiguous scheduling behavior.
Defined values for effect and operator
Effect enum values:
Value | Description |
---|---|
EFFECT_UNSPECIFIED | Default value; matches all taint effects. |
NO_EXECUTE | Pods with this toleration are evicted from the node if the taint is present. |
NO_SCHEDULE | Pods with this toleration are not scheduled on nodes with matching taints. |
PREFER_NO_SCHEDULE | The scheduler tries to avoid scheduling pods on nodes with matching taints. |
Operator enum values:
Value | Description |
---|---|
OPERATOR_UNSPECIFIED | Default value; may result in undefined behavior. |
EQUAL | Ensures the key and value match the taint. |
EXISTS | Matches any value for the specified key. |
Therefore, always use valid enum values for `effect` and `operator` to ensure the expected behavior.
Example of valid tolerations:

"podSchedulingConfig": {
  "tolerations": [
    {
      "key": "node-type",
      "operator": "EQUAL",
      "value": "high-memory",
      "effect": "NO_SCHEDULE"
    },
    {
      "key": "zone",
      "operator": "EXISTS",
      "effect": "NO_EXECUTE"
    }
  ]
}
Configure custom taints, tolerations, and resource settings for Portworx applications
This section explains how you can configure custom taints, tolerations, and resource settings for Kubernetes nodes and schedule pods (PDS and Portworx Agent) accordingly.
Configure Portworx Agent
- Access the PDS platform UI to generate a deployment manifest for Portworx Agent. The generated manifest serves as the base for adding custom configurations.
- Modify the manifest to include tolerations in the bootstrapper configuration. Tolerations should match the taints on the target nodes.

Example toleration configuration:

tolerations:
- effect: NoSchedule
  key: custom-key
  operator: Equal
  value: custom-value

An overlay configuration provides a structured way to customize Helm values for Portworx Agent and PDS applications.

- Create a ConfigMap for Portworx Agent.
- Define tolerations and resource settings within the values.yaml section.

Example overlay ConfigMap for Portworx Agent:
apiVersion: v1
kind: ConfigMap
metadata:
  name: px-agent-overlay
  namespace: px-system
data:
  values.yaml: |-
    global:
      tolerations:
      - effect: NoSchedule
        key: px-agent-key
        operator: Equal
        value: px-agent-value
    px-app-operator:
      size: medium
      tolerations:
      - effect: NoSchedule
        key: app-key
        operator: Equal
        value: app-value
Tolerations specified under the global section in the values.yaml configuration provide default settings for all operators. However, if both global and operator-specific tolerations are provided, the operator-specific tolerations override the global configuration, allowing unique tolerations to be applied to specific operators.
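As a minimal sketch of that precedence (the keys and values here are arbitrary examples), the following values.yaml fragment results in px-app-operator pods receiving only the app-key toleration, while all other operators fall back to the global one:

global:
  tolerations:              # default applied to every operator
  - effect: NoSchedule
    key: global-key
    operator: Equal
    value: global-value
px-app-operator:
  tolerations:              # overrides the global list for this operator only
  - effect: NoSchedule
    key: app-key
    operator: Equal
    value: app-value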
Configure PDS applications
- Define resource sizing. PDS applications support predefined and custom resource sizes:
  - Predefined sizes: Small, Medium, Large.
  - Custom size: Define explicit CPU and memory requirements. Refer to the Resource sizing section above for more information.

Custom resource example:

resources:
  limits:
    cpu: 530m
    memory: 600Mi
  requests:
    cpu: 230m
    memory: 200Mi

Resource configurations are supported for the following operators:
- Portworx Agent operators:
  - px-agent/px-app-operator
- PDS operators:
  - pds/pds-backups-operator
  - pds/pds-deployments-operator
  - pds/pds-target-operator
  - pds/pds-mutator
  - pds/pds-external-dns
- Create an overlay ConfigMap for PDS applications, including tolerations and resource settings.

Example overlay ConfigMap for PDS:
apiVersion: v1
kind: ConfigMap
metadata:
  name: pds-overlay
  namespace: px-system
data:
  values.yaml: |-
    global:
      tolerations:
      - effect: NoSchedule
        key: pds-key
        operator: Equal
        value: pds-value
    pds-backups-operator:
      size: large
      tolerations:
      - effect: NoSchedule
        key: backups-key
        operator: Equal
        value: backups-value
Apply configurations using overlays
- Use kubectl to apply the overlay ConfigMaps to the cluster:

kubectl apply -f px-agent-overlay.yaml
kubectl apply -f pds-overlay.yaml

- Integrate overlays with target cluster applications. Overlays provide additional Helm values before installing PDS and Portworx Agent, ensuring the correct tolerations and resource settings are applied.
- If changes are made to the overlay configurations, force reconcile Portworx Agent and PDS applications to ensure the updates are applied. You can verify the applied ConfigMaps as shown below.
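To confirm the overlays exist on the target cluster and contain the expected values, you can inspect the ConfigMaps with standard kubectl commands (the names and namespace match the examples above):

# List the overlay ConfigMaps
kubectl get configmap px-agent-overlay pds-overlay -n px-system

# Print the values.yaml stored in the PDS overlay
kubectl get configmap pds-overlay -n px-system -o jsonpath='{.data.values\.yaml}'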
Install PDS with tolerations and custom resource settings
This section provides a detailed guide to configuring tolerations and resource settings for PDS and Portworx Agent applications.
- Access the PDS platform and navigate to the section for creating PDS deployments.
- Generate a Kubernetes manifest for Portworx Agent.
- Modify the bootstrapper job by adding tolerations directly to the bootstrapper configuration in the manifest.

Example manifest:
apiVersion: batch/v1
kind: Job
metadata:
  name: px-agent-bootstrapper
  namespace: px-system
spec:
  template:
    metadata:
      labels:
        app: px-agent-bootstrapper
    spec:
      tolerations:
      - effect: NoSchedule
        key: node-type
        operator: Equal
        value: high-performance
      - effect: NoExecute
        key: infra
        operator: Exists
      containers:
      - name: px-agent
        image: portworx/px-agent:latest
        resources:
          requests:
            memory: 256Mi
            cpu: 200m
          limits:
            memory: 512Mi
            cpu: 500m
      restartPolicy: OnFailure
- Add custom resource settings and tolerations to overlays. Use overlays to configure tolerations and resource settings for PDS and Portworx Agent applications. Overlays act as an additional layer for Helm configurations.

Overlay for PDS configurations:
- Create a ConfigMap for PDS.
- Define tolerations and resource settings for each PDS operator.

Example overlay ConfigMap for PDS:
apiVersion: v1
kind: ConfigMap
metadata:
  name: pds-overlay
  namespace: px-system
data:
  values.yaml: |-
    global:
      tolerations:
      - effect: NoSchedule
        key: pds-global-key
        operator: Equal
        value: pds-global-value
    external-dns:
      tolerations:
      - effect: NoSchedule
        key: dns-key
        operator: Equal
        value: dns-value
      size: small
    pds-mutator:
      tolerations:
      - effect: NoSchedule
        key: mutator-key
        operator: Equal
        value: mutator-value
      size: small
    pds-target-operator:
      tolerations:
      - effect: NoSchedule
        key: target-key
        operator: Equal
        value: target-value
      size: medium
      externalDNS:
        chartUrl: oci://docker.io/portworx/pds-external-dns
        chartVersion: 0.1.0-7ca89fe
        imageTag: 0.14.2-debian-12-r7
        chartConfig:
          tolerations:
          - effect: NoSchedule
            key: external-dns-key
            operator: Equal
            value: external-dns-value
          - effect: NoSchedule
            key: key
            operator: Equal
    pds-backups-operator:
      tolerations:
      - effect: NoSchedule
        key: backups-key
        operator: Equal
        value: backups-value
      size: large
    px-deployments-operator:
      tolerations:
      - effect: NoSchedule
        key: deployments-key
        operator: Equal
        value: deployments-value
      size: custom
      manager:
        resources:
          limits:
            cpu: 530m
            memory: 600Mi
          requests:
            cpu: 230m
            memory: 200Mi
Apply configurations
- Run the following commands to apply the configurations:

kubectl apply -f px-agent-overlay.yaml
kubectl apply -f pds-overlay.yaml

- Proceed with the installation of the PDS and Portworx Agent applications. The tolerations and resource settings defined in the overlays are applied automatically; a verification sketch follows these steps.
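As an optional sanity check, you can verify that the tolerations and resource settings reached the operator pods. The pod name below is a placeholder to replace with one from your cluster:

# Find the operator pods
kubectl get pods -n px-system

# Show the tolerations and container resources of a specific pod
kubectl get pod <operator-pod-name> -n px-system \
  -o jsonpath='{.spec.tolerations}{"\n"}{.spec.containers[*].resources}'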
Force reconciliation for updates
If you update the tolerations or resource settings, you can force reconcile Portworx Agent and PDS applications to apply changes to running applications.
Edit tolerations, resource settings, and other properties using overlay config
This section provides a detailed guide on how to modify tolerations, resource settings, and other configuration properties of applications deployed via PDS and Portworx Agent using overlay configurations. It also explains how to apply the updated settings using force reconciliation.
Overlay configurations allow users to define custom Helm values for applications such as PDS and Portworx Agent. By editing these overlay configurations, users can:
- Update tolerations to match new node taints.
- Modify resource settings (for example: CPU, memory).
- Adjust other deployment-specific properties.
Force reconciliation ensures that any changes made in the overlay configuration are applied to running applications without requiring manual redeployment.
To edit overlay configurations:
- Identify the ConfigMap associated with the application you want to update. For example:
  - px-agent-overlay for Portworx Agent.
  - pds-overlay for PDS operators.
- Edit the overlay configuration to include the updated values for tolerations, resource settings, or other properties.

Example update for Portworx Agent overlay:
apiVersion: v1
kind: ConfigMap
metadata:
  name: px-agent-overlay
  namespace: px-system
data:
  values.yaml: |-
    global:
      tolerations:
      - effect: NoSchedule
        key: infra-key
        operator: Equal
        value: infra-value
    px-app-operator:
      tolerations:
      - effect: NoSchedule
        key: app-key
        operator: Equal
        value: app-value
      size: medium
    px-tc-operator:
      tolerations:
      - effect: NoSchedule
        key: tc-key
        operator: Equal
        value: tc-value
      size: custom
      resources:
        limits:
          cpu: 530m
          memory: 630Mi
        requests:
          cpu: 230m
          memory: 200Mi

Example update for PDS overlay:
apiVersion: v1
kind: ConfigMap
metadata:
  name: pds-overlay
  namespace: px-system
data:
  values.yaml: |-
    pds-mutator:
      tolerations:
      - effect: NoSchedule
        key: new-mutator-key
        operator: Equal
        value: mutator-value
      size: medium
    pds-backups-operator:
      tolerations:
      - effect: NoExecute
        key: backups-key
        operator: Exists
      size: custom
      manager:
        resources:
          limits:
            cpu: 1.5
            memory: 3Gi
          requests:
            cpu: 800m
            memory: 2Gi
- Save the updated ConfigMaps and apply them to the cluster:

kubectl apply -f px-agent-overlay.yaml
kubectl apply -f pds-overlay.yaml

- Apply the updated configuration using force reconcile. Force reconciliation ensures that the updated overlay configuration is applied to the running Portworx Agent and PDS applications.
- Use the kubectl get tcapp command to check the status of the TargetClusterApplication resources:

kubectl get tcapp px-agent -n px-system
kubectl get tcapp pds -n px-system

Look for updated pods or events indicating that the changes have been applied, as shown in the sketch below.
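For example, the following standard kubectl commands surface recent events and pod restarts in the px-system namespace:

# Recent events, newest last
kubectl get events -n px-system --sort-by=.lastTimestamp

# Watch the operator pods roll over to the new configuration
kubectl get pods -n px-system -w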
Node requirements for scheduling data services with taints and tolerations
When using taints and tolerations for scheduling data services, it is essential to ensure that the cluster has enough nodes available to accommodate the desired number of data service instances. Improper node planning or insufficient nodes can result in unscheduled pods and degraded functionality.
Node allocation based on data service instances
Each data service instance (or pod) requires a dedicated node for scheduling. If the number of available nodes with matching taints is less than the required instances, some pods will remain unscheduled.
Use case example:
- If you taint 2 nodes for data service deployments:

kubectl taint nodes node1 pds=true:NoSchedule
kubectl taint nodes node2 pds=true:NoSchedule

- And you then deploy PostgreSQL with 3 instances (replicas), with the following tolerations applied in the deployment:

tolerations:
- key: pds
  operator: Equal
  value: "true"
  effect: NoSchedule
Outcome: Only 2 pods will be scheduled on the tainted nodes, while the third pod will remain in a pending state due to the lack of a suitable node.
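To diagnose the pending pod, describe it and check its Events section for a FailedScheduling message reporting that no node satisfies the pod's constraints. The names below are placeholders:

kubectl get pods -n <namespace>
kubectl describe pod <pending-pod-name> -n <namespace>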
To avoid such scenarios, plan and taint nodes based on the required number of data service instances:
- Calculate node requirements: Total nodes needed = number of instances for the largest expected deployment.
- Example scenarios:
  - Single-node deployment: Suitable for lightweight workloads or development environments.
  - Multi-node deployment with replicas: For high availability or production-grade setups, ensure at least one node per instance.
- Taint nodes appropriately: Apply taints to reserve nodes exclusively for PDS workloads (you can verify the taints as shown below):

kubectl taint nodes node1 pds=true:NoSchedule
kubectl taint nodes node2 pds=true:NoSchedule
kubectl taint nodes node3 pds=true:NoSchedule
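You can verify the taints on each node before deploying, for example:

# Show taints for all nodes
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

# Inspect a single node; check the Taints field in the output
kubectl describe node node1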
Error scenarios and resolutions
Scenario | Cause | Resolution |
---|---|---|
Pod remains unscheduled | Insufficient tainted nodes | Add more tainted nodes or reduce replicas. |
Node overcommitment | Too many pods scheduled on a single node | Apply proper resource limits and requests in deployment. |
Misconfigured tolerations | Toleration key or value mismatch with taint | Verify and update toleration configurations. |
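For the node overcommitment scenario above, a quick check is to compare a node's allocated resources against its capacity. The node name is a placeholder, and kubectl top requires the metrics-server add-on:

# Shows Capacity, Allocatable, and the Allocated resources summary
kubectl describe node <node-name>

# Current CPU and memory usage per node (requires metrics-server)
kubectl top node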