Version: 2.10

Portworx Backup Metrics

PX-Backup exposes Prometheus metrics through the /metrics endpoint, which provides monitoring data for backups, restores, clusters, backup locations, and other resources. This is a point-in-time REST API endpoint that returns current metric values when queried; historical data and time-range queries require a Prometheus server to scrape and store these metrics over time. This guide describes the available metrics, their labels, value ranges, and usage patterns.

Before accessing metrics, ensure you have set up Prometheus to scrape PX-Backup metrics. Refer to the following guides based on your environment:

Scraping Endpoint

PX-Backup metrics can be scraped from the following endpoint:

http://px-backup-svc-endpoint:<rest_port>/metrics

Where <rest_port> is the REST API port (default: 10001).

or

http://<external-ip>:10001/metrics

where <external-ip> is the external IP address of the PX-Backup service. 10001 is the default REST API port.

Metrics Format

All metrics use the pxbackup_ prefix and follow Prometheus naming conventions. The endpoint returns metrics in Prometheus exposition (OpenMetrics) format.

Note: The /metrics endpoint returns all available data in every response. It does not provide incremental (delta) or filtered output.

Metrics returned by the /metrics endpoint without the pxbackup_ prefix can be ignored.
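As an illustration of the two points above, the following Python sketch fetches the full payload and keeps only samples whose names start with pxbackup_. It is a minimal parser written for this example, not part of PX-Backup: the function names and regular expressions are assumptions, it ignores optional sample timestamps, and it leaves escape sequences in label values as they appear in the payload.

```python
import re
from urllib.request import urlopen

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value" (escaped quotes allowed).
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_pxbackup_samples(text):
    """Return (name, labels, value) tuples for pxbackup_ samples only."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None or not match.group(1).startswith("pxbackup_"):
            continue  # metrics without the pxbackup_ prefix are ignored
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_pxbackup_samples(url):
    """Fetch the whole /metrics payload and filter it on the client side."""
    with urlopen(url, timeout=10) as response:
        return parse_pxbackup_samples(response.read().decode("utf-8"))
```

For example, fetch_pxbackup_samples("http://<external-ip>:10001/metrics") would return the current values only. Because the endpoint has no delta or filter options, any filtering has to happen after the full response is received, as it does here.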

Backfill Behavior

When the PX-Backup pod restarts, it loads existing objects from the datastore and creates metrics with the backfill="true" label. This ensures that metrics remain available for existing backups, restores, and clusters after a pod restart.

Metrics Categories

Backup Status and Performance Metrics

note

Metric retention is set to 90 days by default. Retention periods vary by metric type, however, so refer to the individual metric descriptions for details.

pxbackup_backup_status

Type: Gauge
Lifecycle: Created at backup start, updated during execution, removed on backup object deletion
Description: Current status of backups in PX-Backup
Usage: Monitor backup health, detect failures, track backup lifecycle

Label | Description | Type | Value Range | Example
name | Backup name | string | User-defined backup name | "mysql-backup-001"
namespaces | Kubernetes namespaces backed up | string | Comma-separated namespace list | "default,kube-system"
cluster | Source cluster name | string | Cluster identifier | "prod-cluster-1"
user_id | User who created the backup | string | User identifier/email | "admin@company.com"
schedule_name | Associated backup schedule | string | Schedule name or empty string | "daily-backup-schedule"
org_id | Organization ID | string | Organization identifier | "default"
cluster_uid | Unique cluster identifier | string | UUID format | "a1b2cXXX-XXXX-XXX-abcd-ef123456XXXX"
error_reason | Error details for failed backups | string | Error message or empty | "Volume snapshot failed: timeout"
timestamp_in_secs | Timestamp of last update | string | Unix timestamp as string | "1699123456"
backup_namespace | Actual namespaces in backup | string | Comma-separated namespace list | "app1,app2,monitoring"
backfill | Indicates backfilled metric | string | "true" or empty string | "true"

Status Values:

  • 0: Invalid - Backup object is in invalid state
  • 1: Pending - Backup is queued for execution
  • 2: InProgress - Backup is currently running
  • 3: Aborted - Backup was manually aborted
  • 4: Failed - Backup failed with errors
  • 5: Deleting - Backup is being deleted
  • 6: Success - Backup completed successfully
  • 7: Captured - Backup data captured (intermediate state)
  • 8: PartialSuccess - Backup completed with some failures
  • 9: DeletePending - Backup marked for deletion
  • 10: CloudBackupMissing - Cloud backup data is missing
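When consuming pxbackup_backup_status outside Prometheus, the numeric values above can be decoded with a small lookup. The sketch below mirrors the list as a Python IntEnum. The class and function names are illustrative, and the set of states treated as needing attention is an assumption made for this example, not a PX-Backup definition.

```python
from enum import IntEnum


class BackupStatus(IntEnum):
    """Values of pxbackup_backup_status as listed in this guide."""
    INVALID = 0
    PENDING = 1
    IN_PROGRESS = 2
    ABORTED = 3
    FAILED = 4
    DELETING = 5
    SUCCESS = 6
    CAPTURED = 7
    PARTIAL_SUCCESS = 8
    DELETE_PENDING = 9
    CLOUD_BACKUP_MISSING = 10


# Assumption: which states warrant an alert is a local policy choice.
NEEDS_ATTENTION = frozenset({
    BackupStatus.INVALID,
    BackupStatus.ABORTED,
    BackupStatus.FAILED,
    BackupStatus.PARTIAL_SUCCESS,
    BackupStatus.CLOUD_BACKUP_MISSING,
})


def decode_backup_status(value):
    """Map a sample value (for example 6.0) to its enum member.

    Raises ValueError for values outside the documented range.
    """
    return BackupStatus(int(value))


def needs_attention(value):
    """Return True if the decoded status is in the alerting set."""
    return decode_backup_status(value) in NEEDS_ATTENTION
```

Raising on unknown values, rather than mapping them to a default, makes it obvious when a newer PX-Backup version introduces a status that this table does not cover.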

pxbackup_backup_count

Type: Counter
Lifecycle: Created at first backup completion, incremented on each subsequent backup, persists within Prometheus retention window (typically 90 days)
Description: Total number of backup operations (cumulative)
Usage: Track backup frequency, generate rates

Label | Description | Type | Value Range
cluster_name | Source cluster name | string | Cluster identifier
user_id | Backup owner | string | User identifier
org_id | Organization ID | string | Organization identifier
cluster_uid | Cluster UUID | string | UUID format
status | Final backup status | string | Status enum as string

Status Values: "Success", "Failed", "PartialSuccess"
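Because pxbackup_backup_count carries the final status as a label, a success ratio can be derived by comparing the "Success" series with the total across all statuses. The sketch below assumes the samples have already been collected as (labels, value) pairs from a single scrape; the function name and input shape are illustrative only.

```python
from collections import defaultdict


def success_ratio_by_cluster(samples):
    """Compute Success / total backup counts per cluster_name.

    samples: iterable of (labels, value) pairs for pxbackup_backup_count,
    where labels is a dict of label names to string values.
    """
    totals = defaultdict(float)
    successes = defaultdict(float)
    for labels, value in samples:
        cluster = labels.get("cluster_name", "")
        totals[cluster] += value
        if labels.get("status") == "Success":
            successes[cluster] += value
    # Clusters with a zero total are left out to avoid dividing by zero.
    return {c: successes[c] / totals[c] for c in totals if totals[c] > 0}
```

Since the counter is cumulative, this yields a ratio over the lifetime of the counter rather than over a recent window. For windowed figures, the usual approach is to let a Prometheus server scrape the endpoint and apply rate() or increase() to the stored series.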

Backup Schedule Metrics

pxbackup_backup_schedule_status

Note: This metric is excluded in the OCP Prometheus
Type: Gauge
Lifecycle: Created when backup schedule is configured, updated when schedule is suspended/resumed, removed on schedule deletion
Description: Status of backup schedules (active/suspended)
Usage: Monitor schedule health, detect suspended schedules

Label | Description | Type | Value Range
name | Schedule name | string | User-defined schedule name
namespaces | Scheduled namespaces | string | Comma-separated namespace list
cluster | Target cluster | string | Cluster identifier
user_id | Schedule owner | string | User identifier

Values:

  • 0: Active - Schedule is running normally
  • 1: Suspended - Schedule is suspended/paused

Restore Metrics

pxbackup_restore_status

Type: Gauge
Lifecycle: Created at restore start, updated during execution, removed on restore object deletion
Description: Current status of restore operations
Usage: Monitor restore health, track restore progress

Label | Description | Type | Value Range | Example
name | Restore name | string | User-defined restore name | "mysql-restore-001"
namespaces | Target namespaces for restore | string | Comma-separated list | "prod-ns,app-ns"
cluster | Target cluster name | string | Cluster identifier | "staging-cluster"
user_id | Restore owner | string | User identifier | "admin@company.com"
cluster_uid | Target cluster UUID | string | UUID format | "b2c3d4e5-f6g7-8901-bcde-f23456789012"
error_reason | Error details for failed restores | string | Error message or empty | "PVC creation failed"
backup | Source backup name | string | Original backup name | "mysql-backup-001"
timestamp_in_secs | Last update timestamp | string | Unix timestamp | "1699123456"
org_id | Organization ID | string | Organization identifier | "default"
backfill | Backfilled metric indicator | string | "true" or empty | ""

Status Values:

  • 0: Invalid - Restore object is invalid
  • 1: Pending - Restore is queued
  • 2: InProgress - Restore is running
  • 3: Aborted - Restore was aborted
  • 4: Failed - Restore failed
  • 5: Deleting - Restore is being deleted
  • 6: Success - Restore completed successfully
  • 7: Retained - Restore data retained
  • 8: PartialSuccess - Restore completed with some failures

pxbackup_restore_count

Type: Counter
Lifecycle: Created at first restore completion, incremented on each subsequent restore, persists indefinitely
Description: Total number of restore operations (cumulative)
Usage: Track restore frequency

Label | Description | Type | Value Range | Example
cluster | Target cluster name | string | Cluster identifier | "cluster-name"
cluster_uid | Target cluster UUID | string | UUID format | "670XXXXX-9b11-40a3-XXXX-eda95aXXXXXX"
org_id | Organization ID | string | Organization identifier | "default"
status | Final restore status | string | Status enum as string | "Failed"
user_id | Restore owner | string | User identifier/UUID | "70aXXXXX-419c-429f-XXXX-e302c2XXXXXX"

Status Values: "Success", "Failed", "PartialSuccess"

Cluster Metrics

pxbackup_cluster_status

Type: Gauge
Lifecycle: Created at cluster registration, updated on connectivity checks, removed on cluster deletion
Description: Health status of registered clusters
Usage: Monitor cluster connectivity, detect offline clusters

Label | Description | Type | Value Range | Example
name | Cluster name | string | User-defined cluster name | "production-k8s"
user_id | Cluster owner | string | User identifier | "admin@company.com"
org_id | Organization ID | string | Organization identifier | "default"
cluster_uid | Unique cluster identifier | string | UUID format | "c3d4e5f6-g7h8-9012-cdef-345678901234"
error_reason | Error details for failed clusters | string | Error message or empty | "Connection timeout"
timestamp_in_secs | Last status update time | string | Unix timestamp | "1699123456"
backfill | Backfilled metric indicator | string | "true" or empty | ""

Status Values:

  • 0: Invalid - Cluster configuration is invalid
  • 1: Online - Cluster is healthy and accessible
  • 2: Offline - Cluster is not reachable
  • 3: DeletePending - Cluster is marked for deletion
  • 4: Pending - Cluster registration is pending
  • 5: Failed - Cluster registration/connection failed
  • 6: Success - Cluster successfully registered but not online yet
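To show how the status value and the timestamp_in_secs label might be combined, the sketch below lists clusters that are not Online and converts the label to a UTC datetime. The input shape, the function name, and the decision to report every non-Online state (including Pending and Success) are assumptions made for this example.

```python
from datetime import datetime, timezone

# Values of pxbackup_cluster_status as listed in this guide.
CLUSTER_STATUS = {
    0: "Invalid",
    1: "Online",
    2: "Offline",
    3: "DeletePending",
    4: "Pending",
    5: "Failed",
    6: "Success",
}


def clusters_not_online(samples):
    """Return one report entry per cluster whose status is not Online.

    samples: iterable of (labels, value) pairs for pxbackup_cluster_status.
    """
    report = []
    for labels, value in samples:
        status = CLUSTER_STATUS.get(int(value), "Unknown")
        if status == "Online":
            continue
        # timestamp_in_secs is a label, so it arrives as a string.
        raw_ts = labels.get("timestamp_in_secs", "")
        updated = (
            datetime.fromtimestamp(int(raw_ts), tz=timezone.utc)
            if raw_ts.isdigit()
            else None
        )
        report.append({
            "name": labels.get("name", ""),
            "status": status,
            "error_reason": labels.get("error_reason", ""),
            "last_update": updated,
        })
    return report
```

Keeping last_update as a timezone-aware datetime makes it straightforward to compute how long a cluster has been in its current state, for instance by subtracting it from datetime.now(timezone.utc).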

Backup Location Metrics

pxbackup_backup_location_status

Type: Gauge
Lifecycle: Created when backup location is configured, updated during periodic validation checks, removed on location deletion
Description: Status of backup locations in Portworx Backup
Usage: Monitor backup destination health

Label | Description | Type | Value Range
name | Backup location name | string | User-defined location name
user_id | Location owner | string | User identifier
org_id | Organization ID | string | Organization identifier
error_reason | Error details | string | Error message or empty
timestamp_in_secs | Last validation time | string | Unix timestamp
backfill | Backfilled metric | string | "true" or empty

Status Values:

  • 0: Invalid - Location configuration is invalid
  • 1: Valid - Location is accessible and working
  • 2: DeletePending - Location is being deleted
  • 3: ValidationInProgress - Location is being validated
  • 4: ValidationFailed - Location validation failed
  • 5: LimitedAvailability - Location has limited functionality

pxbackup_backuplocation_metrics

Note: This metric is excluded in the OCP Prometheus
Type: Gauge
Lifecycle: Created when backup location is configured/added, value remains constant at 1, removed on location deletion
Description: Count of configured backup locations
Usage: Track backup destination inventory

Labels: name, user_id, org_id
Value: Always 1 (indicates location exists)

Cloud Credential Metrics

pxbackup_cloudcred_metrics

Note: This metric is excluded in the OCP Prometheus
Type: Gauge
Lifecycle: Created when cloud credential is configured/added, value remains constant at 1, removed on credential deletion
Description: Count and type of cloud credentials configured in Portworx Backup
Usage: Track credential inventory

Parameter | Description | Type | Value Range
name | Credential name | string | User-defined name
user_id | Credential owner | string | User identifier

Cloud Credential Type Values:

  • 0: Invalid - Invalid credential type
  • 1: AWS - Amazon Web Services credentials
  • 2: Azure - Microsoft Azure credentials
  • 3: Google - Google Cloud Platform credentials
  • 4: IBM - IBM Cloud credentials
  • 5: Rancher - Rancher credentials

Policy Metrics

pxbackup_schedpolicy_metrics

Note: This metric is excluded in the OCP Prometheus
Type: Gauge
Lifecycle: Created when backup schedule policy is configured, removed on policy deletion
Description: Count of schedule policies in Portworx Backup
Usage: Track policy inventory

Labels: name, type, user_id
Value: Always 1 (indicates policy exists)

pxbackup_volumeresourceonlypolicy_metrics

Type: Gauge
Lifecycle: Created when volume resource only policy is configured, removed on policy deletion
Description: Count of volume resource only policies
Usage: Track specialized policy inventory

Labels: name, type, user_id
Value: Always 1 (indicates policy exists)

pxbackup_rule_metrics

Type: Gauge
Lifecycle: Created when rule is configured, removed on rule deletion
Description: Count of backup rules in Portworx Backup
Usage: Track rule inventory

Labels: name, user_id
Value: Always 1 (indicates rule exists)

note

Backup information metrics, virtual machine metrics, backup volume metrics, and virtual machine resource metrics are supported starting from PX-Backup version 2.10.1.

Backup Information Metrics

pxbackup_backup_object_info

Type: Gauge
Lifecycle: Created at backup start, updated during execution, removed on backup object deletion.
Description: Comprehensive backup information aggregating data from multiple backup-related metrics.
Usage: Monitor complete backup details including scheduling, retention, resources, and virtual machines.
Retention period: The default is 24 hours. To change it, set the Helm parameter pxbackup.backupInfoMetricsBackfillHours, up to a maximum of 720 hours (30 days). Setting the value to 0 removes the metrics from Portworx Backup.
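The retention rule for pxbackup.backupInfoMetricsBackfillHours can be checked before a value is passed to Helm. The helper below is a hypothetical pre-flight check, not part of PX-Backup. Rejecting out-of-range values is a choice made for this sketch; how PX-Backup itself treats a value above 720 is not stated in this guide.

```python
DEFAULT_BACKFILL_HOURS = 24   # default retention stated in this guide
MAX_BACKFILL_HOURS = 720      # 720 hours = 30 days


def retention_days(hours=DEFAULT_BACKFILL_HOURS):
    """Return the retention in days, or None when the metrics are removed.

    Raises ValueError for negative values or values above the maximum.
    """
    if hours < 0 or hours > MAX_BACKFILL_HOURS:
        raise ValueError(
            f"hours must be between 0 and {MAX_BACKFILL_HOURS}, got {hours}"
        )
    if hours == 0:
        return None  # 0 removes the metrics from Portworx Backup
    return hours / 24
```

Returning None for 0 keeps the "metrics removed" case distinct from a very short retention period.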

note

To enable these metrics for OpenShift Container Platform (OCP) Prometheus or external Prometheus servers, you must set the pxbackup.enableExternalMetricsScraping Helm parameter during installation or upgrade.

Label | Type | Description
name | string | Name of the backup object
uid | string | Unique identifier for the backup object
org_id | string | Organization UID that owns this backup
create_time_in_sec | int64 | Creation time in seconds (Unix timestamp)
cluster | string | Name of the cluster, if this is a synced backup
namespaces | string | Namespaces where the backup is taken
label_selectors | string | Label selectors to choose resources for backup
status | string | Current status of the backup operation [Failed(4), Success(6), PartialSuccess(8)]
status_reason | string | Status reason of the backup operation
backup_path | string | Path where backup is stored
backup_schedule_name | string | Name of the backup schedule, if the backup was taken by a schedule
backup_schedule_uid | string | Unique identifier of the backup schedule, if the backup was taken by a schedule
total_size | integer | Total size of the backup
resource_count | integer | Total count of resources in backup
backup_location_name | string | Name of the backup location
backup_location_uid | string | Unique identifier of the backup location
cloud_credential_name | string | Name of the cloud credential object attached
cloud_credential_uid | string | Unique identifier for the cloud credential rule object
backup_type | string | Type of backup (generic or normal)
retention_period | integer | Backup retention period
cluster_name | string | Reference to cluster object
cluster_uid | string | Unique identifier for the cluster object
ns_label_selectors | string | Label selectors for choosing namespaces
large_resource_enabled | bool | Whether the backup involves a large number of resources
backup_object_type | string | Whether the backup covers all applications or only virtual machines (values: All, VirtualMachine)
skip_vm_auto_exec_rules | bool | Skip auto execution rules for VirtualMachine backup
direct_kdmp | bool | Option to take backup as direct KDMP
retention_time | string | Expiration timestamp for locked backup retention
volumes_completion_time | string | Timestamp for the completion of volumes
resources_completion_time | string | Timestamp for the completion of resources
total_completion_time | string | Timestamp for the completion of the entire backup
advanced_resource_label_selector | string | Advanced label selector supporting operators
schedule_policy_name | string | Name of the schedule policy object attached
schedule_policy_uid | string | Unique identifier of the schedule policy object attached
virtual_machines_total_count | int64 | Total count of virtual machines
virtual_machines_failed_count | int64 | Count of failed virtual machines
volume_resource_only_policy_name | string | Name of the volume resource only policy attached
volume_resource_only_policy_uid | string | Unique identifier of the volume resource only policy attached

Virtual Machine Metrics

pxbackup_virtual_machine_info

Type: Gauge
Lifecycle: Created when virtual machine backup starts, updated during backup execution, removed on backup object deletion.
Description: Information about virtual machines included in backups.
Usage: Track virtual machine backup status and details.
Retention period: The default is 24 hours. To change it, set the Helm parameter pxbackup.backupInfoMetricsBackfillHours, up to a maximum of 720 hours (30 days). Setting the value to 0 removes the metrics from Portworx Backup.

note

To enable these metrics for OpenShift Container Platform (OCP) Prometheus or external Prometheus servers, you must set the pxbackup.enableExternalMetricsScraping Helm parameter during installation or upgrade.

Label | Type | Description
backup_name | string | Name of the backup that this virtual machine is part of
backup_id | string | Unique reference to the backup object
schedule_policy_name | string | Name of the schedule policy object attached
schedule_policy_uid | string | Unique identifier of the schedule policy object attached
cluster_name | string | Reference to cluster object
cluster_uid | string | Unique identifier for the cluster object
name | string | Name of the virtual machine
namespace | string | Namespace of the virtual machine
os_name | string | Operating system name
status | string | Status of the virtual machine backup
status_reason | string | Status reason of the virtual machine backup
create_time_in_sec | int64 | Creation time in seconds (Unix timestamp)

Backup Volume Metrics

pxbackup_backup_volume_info

Type: Gauge
Lifecycle: Created when volume backup starts, updated during backup execution, removed on backup object deletion.
Description: Detailed information about volumes included in backups.
Usage: Track volume backup status, sizes, and storage details.
Retention period: The default is 24 hours. To change it, set the Helm parameter pxbackup.backupInfoMetricsBackfillHours, up to a maximum of 720 hours (30 days). Setting the value to 0 removes the metrics from Portworx Backup.

note

To enable these metrics for OpenShift Container Platform (OCP) Prometheus or external Prometheus servers, you must set the pxbackup.enableExternalMetricsScraping Helm parameter during installation or upgrade.

Label | Type | Description
backup_name | string | Name of the backup that this volume is part of
backup_id | string | Unique reference to the backup object
name | string | Name of the volume
namespace | string | Namespace of the volume
pvc | string | Persistent Volume Claim name
status | string | Status value of the metric [Failed(4), Success(6), PartialSuccess(8)]
driver_name | string | Storage driver name
total_size | integer | Total size of the volume
actual_size | integer | Actual backup size (incremental size for incremental backups)
storage_class | string | Storage class of the volume
pvc_id | string | Unique identifier for the PVC
provisioner | string | Storage provisioner
volume_snapshot | string | Volume snapshot reference
virtual_machine_name | string | Associated virtual machine name
backup_mode | string | Backup mode [Not Supported(1), Full(2), Incremental(3)]
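Since the volume details are carried as labels, per-backup totals have to be computed from label values rather than from the sample value. The sketch below sums actual_size per backup_name. It assumes each series is available as a dict of labels and that actual_size is a decimal integer string; the unit is not stated in this guide, so the result is in whatever unit PX-Backup reports.

```python
from collections import defaultdict


def actual_size_by_backup(label_sets):
    """Sum the actual_size label per backup_name.

    label_sets: iterable of label dicts, one per
    pxbackup_backup_volume_info series.
    """
    totals = defaultdict(int)
    for labels in label_sets:
        try:
            size = int(labels.get("actual_size", ""))
        except ValueError:
            continue  # skip series with a missing or non-numeric size
        totals[labels.get("backup_name", "")] += size
    return dict(totals)
```

Skipping unparsable sizes keeps one malformed series from breaking the whole aggregation, at the cost of silently under-counting; logging the skipped series would be a reasonable addition.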

Virtual Machine Resource Metrics

pxbackup_virtual_machine_resource_info

Type: Gauge
Lifecycle: Created when virtual machine resource backup starts, updated during backup execution, removed on backup object deletion
Description: Information about Kubernetes resources associated with virtual machines
Usage: Track resource backup details for virtual machine workloads
Retention period: The default is 24 hours. To change it, set the Helm parameter pxbackup.backupInfoMetricsBackfillHours, up to a maximum of 720 hours (30 days). Setting the value to 0 removes the metrics from Portworx Backup.

note

To enable these metrics for OpenShift Container Platform (OCP) Prometheus or external Prometheus servers, you must set the pxbackup.enableExternalMetricsScraping Helm parameter during installation or upgrade.

Label | Type | Description
virtual_machine_name | string | Virtual machine name associated with this resource
name | string | Name of the resource
namespace | string | Namespace of the resource
group | string | Group of the resource
kind | string | Kind of the resource
version | string | Version of the resource
backup_name | string | Name of the backup that this resource is part of
backup_id | string | Unique reference to the backup object

Unsupported Metrics

The following metrics are not yet fully supported. Exclude them in production environments.

  1. pxbackup_backup_size_bytes
  2. pxbackup_backup_duration_seconds
  3. pxbackup_backup_volume_count
  4. pxbackup_backup_resource_count
  5. pxbackup_restore_size_bytes
  6. pxbackup_restore_duration_seconds
  7. pxbackup_restore_volume_count
  8. pxbackup_restore_resource_count
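One way to act on this list is to drop the metrics by name. The sketch below keeps the names in a set for client-side filtering and also builds an alternation pattern. Such a pattern could be used, for example, as the regex of a Prometheus metric_relabel_configs rule with action: drop on the __name__ label; wiring it into a scrape configuration is left as an assumption. The function names are illustrative.

```python
import re

# Metric names listed above as not fully supported.
UNSUPPORTED_METRICS = frozenset({
    "pxbackup_backup_size_bytes",
    "pxbackup_backup_duration_seconds",
    "pxbackup_backup_volume_count",
    "pxbackup_backup_resource_count",
    "pxbackup_restore_size_bytes",
    "pxbackup_restore_duration_seconds",
    "pxbackup_restore_volume_count",
    "pxbackup_restore_resource_count",
})


def drop_unsupported(samples):
    """Filter out samples whose metric name is in the unsupported set.

    samples: iterable of tuples whose first element is the metric name.
    """
    return [s for s in samples if s[0] not in UNSUPPORTED_METRICS]


def unsupported_name_pattern():
    """Build a regular expression that matches exactly the unsupported names."""
    return "|".join(re.escape(name) for name in sorted(UNSUPPORTED_METRICS))
```

Matching on exact names, rather than on a broader pattern such as every name ending in _count, avoids dropping supported metrics like pxbackup_backup_count by accident.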