Version: 26.2

Monitor PX-CSI with Prometheus

PX-CSI integrates with Prometheus to provide built-in metrics monitoring. This integration enables you to observe the health and performance of PX-CSI components across your cluster.

Monitoring is enabled by default. PX-CSI deploys Prometheus components into the portworx namespace, where Prometheus collects metrics from the PX-CSI controller and node plugin pods. You can view these metrics in the Prometheus UI or integrate them with tools such as Alertmanager and Grafana.

The monitoring stack is deployed through a chain of operators: the Portworx Operator deploys the Prometheus Operator, which manages a custom resource that defines the Prometheus configuration and deploys the Prometheus stack.

When monitoring is active, PX-CSI deploys the following resources in the portworx namespace:

  • The Prometheus custom resource named px-prometheus, which defines retention periods, scrape intervals, and other Prometheus settings.
  • A Prometheus instance, deployed as a StatefulSet, which collects and serves metrics.

PX-CSI creates and configures these components automatically. You do not need to deploy or manage them manually.

Monitoring on OpenShift Container Platform

On OpenShift Container Platform, PX-CSI uses the Prometheus instance that OpenShift deploys for monitoring instead of deploying px-prometheus. To configure this, set spec.monitoring.prometheus.enabled to false and spec.monitoring.prometheus.exportMetrics to true in your StorageCluster.

Enable monitoring

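Monitoring is enabled by default. If it was disabled during installation, you can enable it by editing the StorageCluster specification. The following example shows the OpenShift configuration, in which PX-CSI exports metrics to the Prometheus instance that OpenShift deploys. On other platforms, set enabled to true so that PX-CSI deploys its own Prometheus.

StorageCluster
spec:
  monitoring:
    prometheus:
      enabled: false # Do not deploy PX Prometheus. Use OpenShift’s Prometheus instead.
      exportMetrics: true # Export metrics to OpenShift Prometheus

note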

The enabled field determines whether PX-CSI deploys its own Prometheus Operator and Prometheus instance. The exportMetrics field controls whether PX-CSI creates the ServiceMonitor resources that allow Prometheus to scrape metrics from PX-CSI components.
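The two fields can also be set from the command line instead of editing the specification by hand. The following is a minimal sketch that builds a JSON merge patch for spec.monitoring.prometheus. The function name monitoring_patch and the placeholder <storagecluster-name> are illustrative and not part of PX-CSI, and applying the patch with kubectl patch against the storagecluster resource is an assumption about your environment rather than a documented procedure.

```shell
# Build a JSON merge patch for spec.monitoring.prometheus.
# $1 = value for enabled, $2 = value for exportMetrics (true or false).
# monitoring_patch is a hypothetical helper, not a PX-CSI command.
monitoring_patch() {
  printf '{"spec":{"monitoring":{"prometheus":{"enabled":%s,"exportMetrics":%s}}}}\n' "$1" "$2"
}

# Print the patch that deploys the bundled Prometheus and exports metrics.
monitoring_patch true true

# Assumed way to apply it (replace <storagecluster-name> with your own):
#   kubectl -n portworx patch storagecluster <storagecluster-name> \
#     --type merge -p "$(monitoring_patch true true)"
```

Passing false true instead produces the OpenShift combination described above, and false false produces the patch that disables monitoring.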

Disable monitoring

To disable monitoring, update the StorageCluster specification:

StorageCluster
spec:
  monitoring:
    prometheus:
      enabled: false
      exportMetrics: false

Verify monitoring

note

These steps are not applicable on OpenShift Container Platform, because PX-CSI uses the Prometheus instance deployed by OpenShift.

To confirm that monitoring is active and running as expected:

  1. Run the following command to check for Prometheus pods in the portworx namespace:

    kubectl -n portworx get pods | grep prometheus

    The output should include pod names similar to:

    prometheus-px-prometheus-0                2/2     Running   0          23h
    px-prometheus-operator-764bb9c6cb-9qgvd   1/1     Running   0          23h
  2. Run the following command to forward local port 9090 to the Prometheus pod:

    kubectl -n portworx port-forward prometheus-px-prometheus-0 9090:9090
  3. Open http://localhost:9090/targets to view the Prometheus targets.

    Verify that targets such as px-pure-csi-controller and px-pure-csi-node are listed and have a status of UP.
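The same check can be scripted instead of read from the UI. The sketch below counts targets that the Prometheus targets API (/api/v1/targets) reports as down. It assumes the port-forward from step 2 is running and that the response uses the compact "health":"down" form. The function name count_down_targets is illustrative, and the match is a plain-text search, not a JSON parser.

```shell
# Count scrape targets reported as down in a Prometheus targets API response.
# Reads the JSON response body from stdin and prints the number of matches.
# count_down_targets is a hypothetical helper; it relies on a text match for
# "health":"down" and does not parse the JSON.
count_down_targets() {
  grep -o '"health":"down"' | wc -l | tr -d ' '
}

# Example with a trimmed, made-up response body (prints 1):
printf '%s\n' '{"status":"success","data":{"activeTargets":[{"health":"up"},{"health":"down"}]}}' | count_down_targets

# Against a live instance (requires the port-forward from step 2):
#   curl -s http://localhost:9090/api/v1/targets | count_down_targets
```

A result of 0 corresponds to every listed target showing UP in the Prometheus UI.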

Prometheus metrics

PX-CSI metrics contain information about PersistentVolumeClaim (PVC) usage, volume lifecycle operations, API requests to external endpoints, CSI call latencies, and host connection health.

API latency metrics

Metric                                 Type       Description
px_csi_create_volume_latency_ms        Histogram  Latency histogram for the CreateVolume API
px_csi_delete_volume_latency_ms        Histogram  Latency histogram for the DeleteVolume API
px_csi_ctrlpublishvolume_latency_ms    Histogram  Latency histogram for the ControllerPublishVolume API
px_csi_ctrlunpublishvolume_latency_ms  Histogram  Latency histogram for the ControllerUnpublishVolume API
px_csi_nodestagevolume_latency_ms      Histogram  Latency histogram for the NodeStageVolume API
px_csi_nodeunstagevolume_latency_ms    Histogram  Latency histogram for the NodeUnstageVolume API
px_csi_nodepublishvolume_latency_ms    Histogram  Latency histogram for the NodePublishVolume API
px_csi_nodeunpublishvolume_latency_ms  Histogram  Latency histogram for the NodeUnpublishVolume API
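Histogram metrics are usually read as quantiles rather than as raw buckets. The sketch below prints a PromQL expression for a latency quantile of one of the histograms above, using the standard histogram_quantile and rate functions. It assumes the histograms are exposed in the usual Prometheus form, with a _bucket series and an le label. The function name latency_quantile_query and the 5m default window are illustrative choices.

```shell
# Print a PromQL expression for a latency quantile of one PX-CSI histogram.
# $1 = quantile between 0 and 1
# $2 = histogram metric name from the table above
# $3 = rate window (optional, defaults to 5m)
# Assumes the standard <metric>_bucket series with an "le" label.
latency_quantile_query() {
  printf 'histogram_quantile(%s, sum by (le) (rate(%s_bucket[%s])))\n' "$1" "$2" "${3:-5m}"
}

# 99th percentile CreateVolume latency over the last 5 minutes.
latency_quantile_query 0.99 px_csi_create_volume_latency_ms

# To evaluate the expression against a port-forwarded Prometheus:
#   curl -s http://localhost:9090/api/v1/query \
#     --data-urlencode "query=$(latency_quantile_query 0.99 px_csi_create_volume_latency_ms)"
```

Given the _ms suffix in the metric names, the result is presumably in milliseconds.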

Volume attachment metric

Metric                       Type   Description
px_csi_attachments_per_node  Gauge  Number of volume attachments per node

Volume usage metrics

Starting with PX-CSI 26.1.0, PX-CSI exposes volume usage metrics for filesystem volumes.

Metric                    Type   Description                                           Labels
px_volume_fs_usage_bytes  Gauge  Bytes currently used by the filesystem on the volume  node, volumeid, volumename, pvc
px_volume_capacity_bytes  Gauge  Total capacity of the volume in bytes                 node, volumeid, volumename, pvc
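Because both gauges carry the same labels, dividing one by the other yields a per-volume usage percentage. The sketch below prints a PromQL expression that selects volumes above a usage threshold. The function name volume_usage_query and the default threshold of 80 percent are illustrative, and the expression assumes that the two series for a volume have identical label sets so that they match one-to-one.

```shell
# Print a PromQL expression that selects volumes whose filesystem usage
# exceeds a percentage threshold.
# $1 = threshold in percent (optional, defaults to 80)
# Assumes both gauges expose identical label sets for the same volume.
volume_usage_query() {
  printf '100 * px_volume_fs_usage_bytes / px_volume_capacity_bytes > %s\n' "${1:-80}"
}

# Volumes that are more than 90 percent full.
volume_usage_query 90
```

The expression can be pasted into the Prometheus UI or reused as the condition of an alerting rule.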

FlashArray and FlashBlade API request metrics

Metric                                           Type     Description
px_csi_fafb_all_apis_requests_total              Counter  Total API requests to all configured FA/FB endpoints
px_csi_fafb_apis_volumes_requests_total          Counter  API requests to the /volumes endpoint
px_csi_fafb_apis_array_requests_total            Counter  API requests to the /arrays endpoint
px_csi_fafb_apis_volumesnapshots_requests_total  Counter  API requests to the /volume-snapshots endpoint
px_csi_fafb_apis_hosts_requests_total            Counter  API requests to the /hosts endpoint
px_csi_fafb_apis_controllers_requests_total      Counter  API requests to the /controllers endpoint
px_csi_fafb_apis_ports_requests_total            Counter  API requests to the /ports endpoint
px_csi_fafb_apis_alerts_requests_total           Counter  API requests to the /alerts endpoint
px_csi_fafb_apis_connections_requests_total      Counter  API requests to the /connections endpoint
px_csi_fafb_apis_login_requests_total            Counter  API requests to the /login endpoint
px_csi_fafb_apis_version_requests_total          Counter  API requests to the /api_version endpoint
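Counters only ever increase, so they are normally viewed as a per-second rate. The sketch below prints a PromQL rate expression for one of the per-endpoint counters above, given the endpoint part of the metric name. The function name fafb_rate_query and the 5m default window are illustrative. The aggregate px_csi_fafb_all_apis_requests_total counter does not follow the per-endpoint naming pattern and is not covered by this helper.

```shell
# Print a PromQL expression for the per-second request rate of one
# FlashArray/FlashBlade endpoint counter.
# $1 = endpoint part of the metric name, for example volumes, hosts, or login
# $2 = rate window (optional, defaults to 5m)
fafb_rate_query() {
  printf 'rate(px_csi_fafb_apis_%s_requests_total[%s])\n' "$1" "${2:-5m}"
}

# Request rate to the /volumes endpoint over the last 5 minutes.
fafb_rate_query volumes

# Request rate to the /login endpoint over the last 15 minutes.
fafb_rate_query login 15m
```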

Kubelet volume metrics

Starting with PX-CSI 26.1.0, kubelet exposes standard Kubernetes volume metrics for PX-CSI volumes.

Metric                                Type   Description
kubelet_volume_stats_capacity_bytes   Gauge  Total capacity of the volume in bytes
kubelet_volume_stats_used_bytes       Gauge  Number of used bytes in the volume
kubelet_volume_stats_available_bytes  Gauge  Number of available bytes remaining in the volume
kubelet_volume_stats_inodes           Gauge  Total number of inodes in the volume (filesystem volumes only)
kubelet_volume_stats_inodes_used      Gauge  Number of used inodes in the volume (filesystem volumes only)
kubelet_volume_stats_inodes_free      Gauge  Number of free inodes in the volume (filesystem volumes only)

Host connection health metrics

PX-CSI exposes host connection health metrics. The CSI node driver periodically checks the health of host storage connections (iSCSI, NVMe, and FC) and multipath devices, and exposes the results as Prometheus gauge metrics. Use these metrics to detect degraded or lost storage connections before they affect your workloads.

All metrics in this section include the node_name label, which identifies the Kubernetes node that reports the metric.

note

Host connection health metrics are available in PX-CSI version 26.2.0 or later.

iSCSI connection metrics

Metric Name                         Type   Description                                   Labels
px_csi_node_iscsi_sessions          Gauge  Total number of iSCSI sessions on the node    node_name
px_csi_node_iscsi_sessions_healthy  Gauge  Number of healthy iSCSI sessions on the node  node_name

A session is counted as healthy when its iSCSI session state is LOGGED_IN.

NVMe connection metrics

Metric Name                           Type   Description                                                         Labels
px_csi_node_nvme_subsystems           Gauge  Number of available NVMe subsystems that match Pure FA target NQNs  node_name, subsysnqn
px_csi_node_nvme_connections          Gauge  Total number of NVMe connections                                    node_name, transport_type
px_csi_node_nvme_connections_healthy  Gauge  Number of healthy NVMe connections on the node                      node_name, transport_type

A connection is counted as healthy when its state is live. The subsysnqn label contains the NVMe Qualified Name of the subsystem. The transport_type label identifies the NVMe transport type (rdma, tcp, or fc).

FC connection metrics

Metric Name                   Type   Description                                                 Labels
px_csi_node_fc_hosts          Gauge  Number of FC hosts on the node                              node_name
px_csi_node_fc_hosts_online   Gauge  Number of FC hosts that are online                          node_name
px_csi_node_fc_rports         Gauge  Number of available FC remote-port connections on the node  node_name
px_csi_node_fc_rports_online  Gauge  Number of FC remote-port connections that are online        node_name

FC hosts and remote ports are counted as online when their port_state is Online.

Multipath device metrics

Metric Name                              Type   Description                                                   Labels
px_csi_multipath_device_total_paths      Gauge  Total number of paths for a multipath device on the node      node_name, volume_id
px_csi_multipath_device_healthy_paths    Gauge  Number of healthy paths for a multipath device on the node    node_name, volume_id
px_csi_multipath_device_unhealthy_paths  Gauge  Number of unhealthy paths for a multipath device on the node  node_name, volume_id

A path is counted as healthy when its state is active. The volume_id label identifies the PX-CSI volume associated with the multipath device.
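Most host connection metrics come as a pair of gauges, a total and a healthy (or online) count, so a degraded connection shows up as a positive difference between the two. The sketch below prints a PromQL expression for that difference. The function name unhealthy_query is illustrative, and the expression assumes that both gauges of a pair carry identical label sets so that they match one-to-one. Multipath devices already expose px_csi_multipath_device_unhealthy_paths, which can be compared with 0 directly.

```shell
# Print a PromQL expression that returns the number of unhealthy items
# wherever a total gauge exceeds its healthy (or online) counterpart.
# $1 = total gauge, $2 = healthy or online gauge
# Assumes both gauges expose identical label sets.
unhealthy_query() {
  printf '%s - %s > 0\n' "$1" "$2"
}

# iSCSI sessions that are not in the LOGGED_IN state.
unhealthy_query px_csi_node_iscsi_sessions px_csi_node_iscsi_sessions_healthy

# NVMe connections that are not live, per node and transport type.
unhealthy_query px_csi_node_nvme_connections px_csi_node_nvme_connections_healthy

# FC remote ports that are not Online.
unhealthy_query px_csi_node_fc_rports px_csi_node_fc_rports_online
```

An empty result for each expression means that every reported connection is currently counted as healthy.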