Version: 25.03.01

Metrics

Kafka is a distributed event streaming platform that handles large volumes of events and enables real-time data processing and integration across applications and systems. Monitoring its metrics is crucial for ensuring performance, stability, and reliability. This page explains how to access Kafka metrics in PDS and lists the key metrics. Understanding these metrics helps administrators optimize performance, troubleshoot issues, and keep the Kafka cluster running smoothly.

note

For Kafka deployments, the data service metrics are accessible on port 5555.

Access metrics

To access Kafka metrics for a PDS deployment, follow these steps:

  1. Identify the Kafka pod running in your namespace:

    kubectl get pods -n <your-namespace>

    Look for the pod name that corresponds to your Kafka instance or its sidecar exporter.

  2. Port-forward from your local machine’s port 5555 to the pod’s port 5555:

    kubectl port-forward -n <your-namespace> <kafka-pod-name> 5555:5555

  3. Open http://localhost:5555/metrics in a browser, or request it with curl.

    You should see a text-based Prometheus metrics output specific to Kafka.

  4. Check for the Service that exposes the Kafka exporter, for example, <release-name>-kafka-exporter:

    kubectl get svc -n <your-namespace>

  5. Access the metrics:

    • If the Service type is NodePort, note the assigned <nodeport>:

      http://<node-ip>:<nodeport>/metrics

    • If the Service type is LoadBalancer, note the <loadbalancer-ip>:

      http://<loadbalancer-ip>:5555/metrics

  6. Verify metrics:

    • Using curl:

      curl http://<host>:5555/metrics

      Replace <host> with localhost (if using port-forward), <node-ip> (NodePort), or <loadbalancer-ip> (LoadBalancer). For NodePort, also replace 5555 with <nodeport>.

    • Prometheus UI:

      In Prometheus, navigate to the Expression browser and search for metrics beginning with kafka_ or similar Kafka-related prefixes to confirm they are being scraped.

    • Grafana or other dashboards:

      If you have Grafana connected to Prometheus, open your dashboard. Check that Kafka metrics (those starting with kafka_) are being ingested and displayed.
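The verification in step 6 can also be scripted. The following is a minimal sketch, not part of PDS: it assumes the endpoint returns the standard Prometheus text exposition format, and the function names fetch_metrics_text and parse_metric_types are hypothetical. It reads the "# TYPE" comment lines to list each metric and its type, filtered by prefix.

```python
import urllib.request


def fetch_metrics_text(url, timeout=5.0):
    """Download the raw metrics page from a reachable URL."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metric_types(text, prefix="kafka_"):
    """Return {metric_name: type} from '# TYPE <name> <type>' lines."""
    types = {}
    for line in text.splitlines():
        parts = line.split()
        # A TYPE line has exactly four tokens: '#', 'TYPE', name, type.
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            name, kind = parts[2], parts[3]
            if name.startswith(prefix):
                types[name] = kind
    return types


# Example use (assumed URL, matching the port-forward in step 2):
#   body = fetch_metrics_text("http://localhost:5555/metrics")
#   for name, kind in sorted(parse_metric_types(body).items()):
#       print(name, kind)
```

An empty result for the kafka_ prefix would suggest that the exporter uses different names or that the wrong pod or port was targeted.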

note
  • Ensure that any NetworkPolicies or firewall rules allow inbound traffic on port 5555 if you plan to expose it externally.
  • Metrics naming conventions can vary depending on the Kafka exporter version. Generally, look for prefixes like kafka_.
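Because naming can differ between exporter versions, it may help to see which prefixes an endpoint actually emits. The following is a minimal sketch, assuming the Prometheus text exposition format and treating the text before the first underscore as the prefix (a simplification); count_by_prefix is a hypothetical helper name.

```python
from collections import Counter


def count_by_prefix(text):
    """Count sample lines per name prefix (the part before the first '_')."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comments.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        counts[name.split("_", 1)[0]] += 1
    return counts
```

Passing the text returned by the metrics endpoint to count_by_prefix gives a quick view of how many samples fall under each prefix.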

Kafka metrics

| Metric name | Type |
| --- | --- |
| jmx_config_reload_failure_total | counter |
| jmx_config_reload_success_total | counter |
| jmx_exporter_build_info | gauge |
| jmx_scrape_cached_beans | gauge |
| jmx_scrape_duration_seconds | gauge |
| jmx_scrape_error | gauge |
| jvm_buffer_pool_capacity_bytes | gauge |
| jvm_buffer_pool_used_buffers | gauge |
| jvm_buffer_pool_used_bytes | gauge |
| jvm_classes_currently_loaded | gauge |
| jvm_classes_loaded_total | counter |
| jvm_classes_unloaded_total | counter |
| jvm_compilation_time_seconds_total | counter |
| jvm_gc_collection_seconds | summary |
| jvm_memory_committed_bytes | gauge |
| jvm_memory_init_bytes | gauge |
| jvm_memory_max_bytes | gauge |
| jvm_memory_objects_pending_finalization | gauge |
| jvm_memory_pool_allocated_bytes_total | counter |
| jvm_memory_pool_collection_committed_bytes | gauge |
| jvm_memory_pool_collection_init_bytes | gauge |
| jvm_memory_pool_collection_max_bytes | gauge |
| jvm_memory_pool_collection_used_bytes | gauge |
| jvm_memory_pool_committed_bytes | gauge |
| jvm_memory_pool_init_bytes | gauge |
| jvm_memory_pool_max_bytes | gauge |
| jvm_memory_pool_used_bytes | gauge |
| jvm_memory_used_bytes | gauge |
| jvm_runtime_info | gauge |
| jvm_threads_current | gauge |
| jvm_threads_daemon | gauge |
| jvm_threads_deadlocked | gauge |
| jvm_threads_deadlocked_monitor | gauge |
| jvm_threads_peak | gauge |
| jvm_threads_started_total | counter |
| jvm_threads_state | gauge |
| kafka_controller_kafkacontroller_activebrokercount | gauge |
| kafka_controller_kafkacontroller_activecontrollercount | gauge |
| kafka_controller_kafkacontroller_eventqueueoperationsstartedcount | gauge |
| kafka_controller_kafkacontroller_eventqueueoperationstimedoutcount | gauge |
| kafka_controller_kafkacontroller_fencedbrokercount | gauge |
| kafka_controller_kafkacontroller_globalpartitioncount | gauge |
| kafka_controller_kafkacontroller_globaltopiccount | gauge |
| kafka_controller_kafkacontroller_lastappliedrecordlagms | gauge |
| kafka_controller_kafkacontroller_lastappliedrecordoffset | gauge |
| kafka_controller_kafkacontroller_lastappliedrecordtimestamp | gauge |
| kafka_controller_kafkacontroller_lastcommittedrecordoffset | gauge |
| kafka_controller_kafkacontroller_metadataerrorcount | gauge |
| kafka_controller_kafkacontroller_migratingzkbrokercount | gauge |
| kafka_controller_kafkacontroller_newactivecontrollerscount | gauge |
| kafka_controller_kafkacontroller_offlinepartitionscount | gauge |
| kafka_controller_kafkacontroller_preferredreplicaimbalancecount | gauge |
| kafka_controller_kafkacontroller_timedoutbrokerheartbeatcount | gauge |
| kafka_controller_kafkacontroller_zkmigrationstate | gauge |
| kafka_log_log_logendoffset | gauge |
| kafka_log_log_logstartoffset | gauge |
| kafka_log_log_numlogsegments | gauge |
| kafka_log_log_size | gauge |
| kafka_log_logcleaner_cleaner_recopy_percent | gauge |
| kafka_log_logcleaner_deadthreadcount | gauge |
| kafka_log_logcleaner_max_buffer_utilization_percent | gauge |
| kafka_log_logcleaner_max_clean_time_secs | gauge |
| kafka_log_logcleaner_max_compaction_delay_secs | gauge |
| kafka_log_logcleanermanager_max_dirty_percent | gauge |
| kafka_log_logcleanermanager_time_since_last_run_ms | gauge |
| kafka_log_logcleanermanager_uncleanable_bytes | gauge |
| kafka_log_logcleanermanager_uncleanable_partitions_count | gauge |
| kafka_log_logmanager_logdirectoryoffline | gauge |
| kafka_log_logmanager_offlinelogdirectorycount | gauge |
| kafka_network_processor_idlepercent | gauge |
| kafka_network_requestchannel_requestqueuesize | gauge |
| kafka_network_requestchannel_responsequeuesize | gauge |
| kafka_network_requestmetrics_errors_total | counter |
| kafka_network_requestmetrics_requests_total | counter |
| kafka_network_socketserver_expiredconnectionskilledcount | gauge |
| kafka_network_socketserver_memorypoolavailable | gauge |
| kafka_network_socketserver_memorypoolused | gauge |
| kafka_network_socketserver_networkprocessoravgidlepercent | gauge |
| kafka_server_assignmentsmanager_queuedreplicatodirassignments | gauge |
| kafka_server_brokertopicmetrics_bytesin_total | counter |
| kafka_server_brokertopicmetrics_bytesout_total | counter |
| kafka_server_brokertopicmetrics_bytesrejected_total | counter |
| kafka_server_brokertopicmetrics_failedfetchrequests_total | counter |
| kafka_server_brokertopicmetrics_failedproducerequests_total | counter |
| kafka_server_brokertopicmetrics_fetchmessageconversions_total | counter |
| kafka_server_brokertopicmetrics_invalidmagicnumberrecords_total | counter |
| kafka_server_brokertopicmetrics_invalidmessagecrcrecords_total | counter |
| kafka_server_brokertopicmetrics_invalidoffsetorsequencerecords_total | counter |
| kafka_server_brokertopicmetrics_messagesin_total | counter |
| kafka_server_brokertopicmetrics_nokeycompactedtopicrecords_total | counter |
| kafka_server_brokertopicmetrics_producemessageconversions_total | counter |
| kafka_server_brokertopicmetrics_reassignmentbytesin_total | counter |
| kafka_server_brokertopicmetrics_reassignmentbytesout_total | counter |
| kafka_server_brokertopicmetrics_replicationbytesin_total | counter |
| kafka_server_brokertopicmetrics_replicationbytesout_total | counter |
| kafka_server_brokertopicmetrics_totalfetchrequests_total | counter |
| kafka_server_brokertopicmetrics_totalproducerequests_total | counter |
| kafka_server_controllerserver_linux_disk_read_bytes | gauge |
| kafka_server_controllerserver_linux_disk_write_bytes | gauge |
| kafka_server_controllerserver_yammer_metrics_count | gauge |
| kafka_server_delayedoperationpurgatory_numdelayedoperations | gauge |
| kafka_server_delayedoperationpurgatory_purgatorysize | gauge |
| kafka_server_fetchsessioncache_incrementalfetchsessionevictions_total | counter |
| kafka_server_fetchsessioncache_numincrementalfetchpartitionscached | gauge |
| kafka_server_fetchsessioncache_numincrementalfetchsessions | gauge |
| kafka_server_kafkaserver_brokerstate | gauge |
| kafka_server_kafkaserver_linux_disk_read_bytes | gauge |
| kafka_server_kafkaserver_linux_disk_write_bytes | gauge |
| kafka_server_kafkaserver_yammer_metrics_count | gauge |
| kafka_server_metadataloader_currentcontrollerid | gauge |
| kafka_server_metadataloader_currentmetadataversion | gauge |
| kafka_server_metadataloader_handleloadsnapshotcount | gauge |
| kafka_server_replicaalterlogdirsmanager_deadthreadcount | gauge |
| kafka_server_replicaalterlogdirsmanager_failedpartitionscount | gauge |
| kafka_server_replicaalterlogdirsmanager_maxlag | gauge |
| kafka_server_replicaalterlogdirsmanager_minfetchrate | gauge |
| kafka_server_replicafetchermanager_deadthreadcount | gauge |
| kafka_server_replicafetchermanager_failedpartitionscount | gauge |
| kafka_server_replicafetchermanager_maxlag | gauge |
| kafka_server_replicafetchermanager_minfetchrate | gauge |
| kafka_server_replicamanager_atminisrpartitioncount | gauge |
| kafka_server_replicamanager_failedisrupdates_total | counter |
| kafka_server_replicamanager_isrexpands_total | counter |
| kafka_server_replicamanager_isrshrinks_total | counter |
| kafka_server_replicamanager_leadercount | gauge |
| kafka_server_replicamanager_offlinereplicacount | gauge |
| kafka_server_replicamanager_partitioncount | gauge |
| kafka_server_replicamanager_partitionswithlatetransactionscount | gauge |
| kafka_server_replicamanager_produceridcount | gauge |
| kafka_server_replicamanager_reassigningpartitions | gauge |
| kafka_server_replicamanager_underminisrpartitioncount | gauge |
| kafka_server_replicamanager_underreplicatedpartitions | gauge |
| kafka_server_snapshotemitter_latestsnapshotgeneratedagems | gauge |
| kafka_server_snapshotemitter_latestsnapshotgeneratedbytes | gauge |
| process_cpu_seconds_total | counter |
| process_max_fds | gauge |
| process_open_fds | gauge |
| process_resident_memory_bytes | gauge |
| process_start_time_seconds | gauge |
| process_virtual_memory_bytes | gauge |
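
The metric type affects how a value is read: a gauge can be read directly, while a counter only increases, so it is usually converted into a per-second rate from two readings. The following is a minimal sketch of that arithmetic, assuming two scrapes taken a known number of seconds apart. The helper name counter_rate is hypothetical, and treating a decrease as a counter reset (for example, after a broker restart) is an assumption of this sketch.

```python
def counter_rate(prev_value, curr_value, interval_seconds):
    """Per-second rate between two readings of a counter.

    If the counter went down, assume the process restarted and the counter
    began again from zero, so only the current value counts as the increase.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    increase = curr_value - prev_value
    if increase < 0:
        increase = curr_value  # assumed counter reset
    return increase / interval_seconds
```

For example, two readings of kafka_server_brokertopicmetrics_bytesin_total of 1,000,000 and 1,600,000 taken 60 seconds apart give counter_rate(1_000_000, 1_600_000, 60) = 10000.0 bytes per second. In Prometheus itself, the rate() function performs this kind of calculation over a time window.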