Portworx CSI Alerts
Portworx CSI provides a predefined set of alerts, listed below. These alerts are critical for analysing the health and performance of your storage infrastructure. Each alert is categorized by resource type and severity, which simplifies troubleshooting and management.
The table below uses the following columns:
- Name: The identifier for the alert.
- ResourceType (AlertType): The category of infrastructure component that the alert concerns. The valid values are: volume, node, cluster, drive, and pool.
- Severity: The level of importance. "ALARM" indicates critical issues requiring immediate attention; the table also uses "WARNING" and "NOTIFY".
- Description: The condition that triggers the alert.
- Metric: The name of the metric associated with the alert.
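As an illustration only, the category and severity values used on this page can be captured as enumerations so that values read from an alert are validated before use. This is a minimal sketch, not part of Portworx: the names `ResourceType`, `Severity`, `parse_resource_type`, and `needs_immediate_attention` are hypothetical, and the severity values `WARNING` and `NOTIFY` are taken from the table rather than from the field descriptions.

```python
from enum import Enum


class ResourceType(Enum):
    # Categories listed in the field descriptions on this page.
    VOLUME = "volume"
    NODE = "node"
    CLUSTER = "cluster"
    DRIVE = "drive"
    POOL = "pool"


class Severity(Enum):
    # Severity values that appear in the table on this page.
    ALARM = "ALARM"
    WARNING = "WARNING"
    NOTIFY = "NOTIFY"


def parse_resource_type(text: str) -> ResourceType:
    """Parse a category such as "NODE" or "node"; raises ValueError if unknown."""
    # The table prints categories in upper case, so normalise before lookup.
    return ResourceType(text.strip().lower())


def needs_immediate_attention(severity: Severity) -> bool:
    """ALARM is the severity described as requiring immediate attention."""
    return severity is Severity.ALARM
```

For example, `parse_resource_type("NODE")` yields `ResourceType.NODE`, while an unrecognised value such as `"disk"` raises `ValueError`.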
List of Portworx CSI Alerts
Name | ResourceType | Severity | Description | Metric |
---|---|---|---|---|
MeteringAgentCritical | CLUSTER | ALARM | Triggered when the metering agent encounters a critical problem. | px_alerts_meteringagentcritical_total |
LicenseExpired | CLUSTER | ALARM | Triggered when the cluster license expires. | px_alerts_licenseexpired_total |
LicenseLeaseExpired | CLUSTER | ALARM | Triggered when the license lease has expired because the last lease refresh failed. | px_alerts_licenseleaseexpired_total |
BaseAgentRegistrationFailed | CLUSTER | ALARM | Triggered when the base agent fails to register. | px_alerts_baseagentregistrationfailed_total |
LicenseExpiring | CLUSTER | WARNING | Triggered 7 days before the license expires, and keeps triggering after the license has expired (e.g. “Trial license expired 4 days, 06:22 ago”). | px_alerts_licenseexpiring_total |
MeteringAgentWarning | CLUSTER | WARNING | Triggered when the metering agent encounters a non-critical problem. | px_alerts_meteringagentwarning_total |
LicenseLeaseExpiring | CLUSTER | WARNING | Triggered when the license lease is about to expire because the last lease refresh failed. | px_alerts_licenseleaseexpiring_total |
ClusterLicenseUpdated | CLUSTER | NOTIFY | Triggered when a license is updated for a cluster. | px_alerts_clusterlicenseupdated_total |
NodeStartFailure | NODE | ALARM | Triggered when a node in the Portworx cluster fails to start. | px_alerts_nodestartfailure_total |
NodeStateChange | NODE | ALARM | Triggered when the state of a node changes (e.g. it goes down or comes online). | px_alerts_nodestatechange_total |
PXInitFailure | NODE | ALARM | Triggered when Portworx fails to initialize on a node. | px_alerts_pxinitfailure_total |
ClusterManagerFailure | NODE | ALARM | Triggered when the cluster manager on a Portworx node fails to start. The alert message gives more information about the specific error case. | px_alerts_clustermanagerfailure_total |
NodeDecommissionFailure | NODE | ALARM | Triggered when a node could not be decommissioned from the Portworx cluster. | px_alerts_nodedecommissionfailure_total |
NodeInitFailure | NODE | ALARM | Triggered when Portworx fails to initialize on a node. | px_alerts_nodeinitfailure_total |
LicenseCheckFailed | NODE | ALARM | Triggered if a node fails a license check. | px_alerts_licensecheckfailed_total |
KvdbConnectionFailed | NODE | ALARM | Triggered if Portworx fails to connect to the KVDB. | px_alerts_kvdbconnectionfailed_total |
InternalKvdbSetupFailed | NODE | ALARM | Triggered if Portworx fails to set up the internal KVDB on a node. | px_alerts_internalkvdbsetupfailed_total |
PortworxMonitorImagePullFailed | NODE | ALARM | Triggered if Portworx fails to pull Portworx images during installation. | px_alerts_portworxmonitorimagepullfailed_total |
PortworxMonitorPrePostExecutionFailed | NODE | ALARM | Triggered if Portworx fails to execute pre or post installation tasks. | px_alerts_portworxmonitorprepostexecutionfailed_total |
PortworxMonitorMountValidationFailed | NODE | ALARM | Triggered if Portworx fails to validate the mounts provided to the Portworx container during installation. | px_alerts_portworxmonitormountvalidationfailed_total |
PortworxMonitorSchedulerInitializationFailed | NODE | ALARM | Triggered if Portworx fails to initialize the connection with the scheduler during installation. | px_alerts_portworxmonitorschedulerinitializationfailed_total |
PortworxMonitorServiceControlsInitializationFailed | NODE | ALARM | Triggered if Portworx fails to initialize the service controls during installation. | px_alerts_portworxmonitorservicecontrolsinitializationfailed_total |
PortworxMonitorInstallFailed | NODE | ALARM | Triggered if Portworx installation fails. | px_alerts_portworxmonitorinstallfailed_total |
MissingInputArgument | NODE | ALARM | Triggered if there’s a missing input install argument. | px_alerts_missinginputargument_total |
InvalidArgument | NODE | ALARM | Invalid input argument | px_alerts_invalidargument_total |
PXHostDependencyFailure | NODE | ALARM | Host does not meet dependencies for applied px configuration | px_alerts_pxhostdependencyfailure_total |
CallHomeFailure | NODE | ALARM | Call home failure | px_alerts_callhomefailure_total |
DiagCollectJobCancelled | NODE | ALARM | DiagCollect job cancelled | px_alerts_diagcollectjobcancelled_total |
DiagCollectJobFailed | NODE | ALARM | DiagCollect job failed | px_alerts_diagcollectjobfailed_total |
PXNodePrerequisiteMissing | NODE | ALARM | Triggered when Portworx is missing a prerequisite to start | px_alerts_pxnodeprerequisitemissing_total |
ArrayLoginFailed | NODE | ALARM | Triggered when login to FlashArray fails. | px_alerts_arrayloginfailed_total |
MountpointCleanupFailed | NODE | ALARM | Triggered when mountpoint cleaner fails | px_alerts_mountpointcleanupfailed_total |
PXStateChange | NODE | WARNING | Triggered when the Portworx daemon shuts down in error. | px_alerts_pxstatechange_total |
NodeDecommissionPending | NODE | WARNING | Triggered when a node decommission is kept in the pending state because the node has data that is not replicated on other nodes. | px_alerts_nodedecommissionpending_total |
NodeMarkedDown | NODE | WARNING | Triggered when a Portworx node marks another node down as it is unable to connect to it. | px_alerts_nodemarkeddown_total |
SecretsAuthFailed | NODE | WARNING | Secrets setup has failed | px_alerts_secretsauthfailed_total |
PortworxStoppedOnNode | NODE | WARNING | Triggered if Portworx is stopped on a node. | px_alerts_portworxstoppedonnode_total |
KvdbConnectionWarning | NODE | WARNING | Triggered when the KVDB endpoint is not accessible. | px_alerts_kvdbconnectionwarning_total |
NodeStartCannotProceed | NODE | WARNING | Triggered when Portworx startup on a node cannot proceed because a dependency has not been met | px_alerts_nodestartcannotproceed_total |
NodeStartSuccess | NODE | NOTIFY | Triggered when a node in the Portworx cluster successfully initializes. | px_alerts_nodestartsuccess_total |
PXInitSuccess | NODE | NOTIFY | Triggered when Portworx successfully initializes on a node. | px_alerts_pxinitsuccess_total |
NodeDecommissionSuccess | NODE | NOTIFY | Triggered when a node is successfully decommissioned from the Portworx cluster. | px_alerts_nodedecommissionsuccess_total |
PXReady | NODE | NOTIFY | Triggered when Portworx is ready on a node. | px_alerts_pxready_total |
PortworxMonitorImagePullInProgress | NODE | NOTIFY | Triggered when Portworx is pulling and extracting images during installation or upgrade. | px_alerts_portworxmonitorimagepullinprogress_total |
DiagCollectJobStarted | NODE | NOTIFY | DiagCollect job started execution | px_alerts_diagcollectjobstarted_total |
DiagCollectJobInProgress | NODE | NOTIFY | DiagCollect job in progress | px_alerts_diagcollectjobinprogress_total |
DiagCollectJobFinished | NODE | NOTIFY | DiagCollect job finished execution | px_alerts_diagcollectjobfinished_total |
CCMstatusFailed | NODE | NOTIFY | CCM status check failed | px_alerts_ccmstatusfailed_total |
CCMuploadFailed | NODE | NOTIFY | Upload to CCM failed | px_alerts_ccmuploadfailed_total |
ROVolPodBounce | NODE | NOTIFY | Triggered when read-write (rw) volume mounts turn read-only (ro) due to errors. Application pods using them will be bounced. | px_alerts_rovolpodbounce_total |
KvdbEndpointsChanged | NODE | NOTIFY | Triggered when this node starts using a different set of kvdb endpoints. | px_alerts_kvdbendpointschanged_total |
KvdbBootstrapEntryAdded | NODE | NOTIFY | Triggered when this node adds an entry (usually for this node) to the internal KVDB bootstrap database. | px_alerts_kvdbbootstrapentryadded_total |
KvdbBootstrapEntryRemoved | NODE | NOTIFY | Triggered when this node removes an entry (for this or another node) from the internal KVDB bootstrap database. | px_alerts_kvdbbootstrapentryremoved_total |
KvdbMemberAdded | NODE | NOTIFY | Triggered when this node adds itself as a member to the internal KVDB cluster. | px_alerts_kvdbmemberadded_total |
KvdbMemberRemoved | NODE | NOTIFY | Triggered when this node removes itself or another node from the internal KVDB cluster. | px_alerts_kvdbmemberremoved_total |
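
Every metric name in the table above appears to follow one pattern: the prefix `px_alerts_`, the alert name in lower case, and the suffix `_total`. The sketch below derives a metric name from an alert name and checks a table row against that pattern. The pattern is inferred from the rows on this page, not from a documented guarantee, and the helper names `metric_name` and `row_matches_pattern` are hypothetical.

```python
def metric_name(alert_name: str) -> str:
    """Build a metric name using the pattern observed in the table."""
    return f"px_alerts_{alert_name.lower()}_total"


def row_matches_pattern(row: str) -> bool:
    """Return True if the Metric cell of a table row equals the derived name."""
    # Drop the trailing pipe, then split the row into its five cells.
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 cells, got {len(cells)}")
    name, metric = cells[0], cells[4]
    return metric == metric_name(name)


if __name__ == "__main__":
    row = ("CCMstatusFailed | NODE | NOTIFY | CCM status check failed | "
           "px_alerts_ccmstatusfailed_total |")
    print(metric_name("NodeStartFailure"))
    print(row_matches_pattern(row))
```

A check like this can be useful when building dashboards or alerting rules from the table, since it flags any row whose metric name deviates from the expected form.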