Monitor a Google Cloud Managed Service for Apache Kafka cluster

Managed Service for Apache Kafka collects metrics that you can use to monitor your Kafka clusters. This page describes how to view these metrics in the Google Cloud console.

Overview

Managed Service for Apache Kafka exports several metrics available in the open-source Kafka distribution, as well as service-specific metrics like consumer group offset lag.

The metrics are organized into four resource categories:

  • Cluster: These metrics are useful for maintaining the overall health of a cluster.

  • Topic: These metrics include publisher and consumer rates and errors. Use them to monitor the overall health of Kafka applications and to identify issues specific to a broker.

  • Topic Partition: These metrics are intended for monitoring and debugging performance problems specific to individual partitions, such as uneven key distribution.

  • Topic Partition Consumer Group: These metrics monitor the health of consumer applications, primarily consumer lag. Open-source Kafka error metrics for consumer groups are available only at the topic level, not by partition.

Some metrics can be grouped by broker index. Based on the broker index, you can look up the zone where that broker is provisioned. For more information, see View brokers.

View metrics for a Kafka cluster

You can view the metrics in the following ways:

  • The Cluster details page includes monitoring dashboards for clusters, topics, and consumer groups. These dashboards contain predefined charts that let you see the overall health and performance of your cluster.

  • You can use Metrics Explorer to view all available metrics, create custom charts, or aggregate metrics across multiple clusters.

Required roles and permissions

To get the permissions that you need to view monitoring charts, ask your administrator to grant you the Managed Kafka Viewer (roles/managedkafka.Viewer) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

For more information about this role, see Managed Service for Apache Kafka predefined roles.

Use the monitoring dashboards

To view the monitoring dashboards for a Managed Service for Apache Kafka cluster, perform the following steps:

  1. In the Google Cloud console, go to the Clusters page.

    Go to Clusters

  2. Click the name of the cluster.

  3. To view metrics for the cluster, select the Monitoring tab.

  4. To view metrics for a topic in the cluster:

    1. Select the Resources tab.

    2. In the Topics list, click the name of the topic.

    3. In the Topic details page, select the Monitoring tab.

  5. To view metrics for a consumer group in the cluster:

    1. Select the Resources tab.

    2. In the Consumer groups list, click the name of the consumer group.

    3. In the Consumer group details page, select the Monitoring tab.

For more information, see View a Kafka cluster.

Use Metrics Explorer

To view Managed Service for Apache Kafka metrics by using Metrics Explorer, perform the following steps:

  1. In the Google Cloud console, go to the Metrics explorer page.

    Go to Metrics explorer

  2. In the Configuration section, click Select a metric.

  3. In the filter, enter Apache Kafka.

  4. In Active resources, select one of the following:

    • Apache Kafka Cluster

    • Apache Kafka Topic

    • Apache Kafka Topic Partition

    • Apache Kafka Topic Partition Consumer Group

  5. Select a metric and click Apply.

For more information about Metrics Explorer, see Create charts with Metrics Explorer.

Managed Service for Apache Kafka metrics

The following tables list commonly used metrics for monitoring your Kafka cluster and cluster resources. For the complete list of available metrics, see Google Cloud metrics.

The Managed Service for Apache Kafka service is identified by the service URL managedkafka.googleapis.com.
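
When you query these metrics programmatically, for example through the Cloud Monitoring API, a metric type is typically written as the service URL followed by the metric path. The following Python sketch builds such a filter string. The helper name build_metric_filter, the example cluster name, and the use of the resource.labels prefix for labels such as cluster_id are assumptions made for illustration; check the complete metrics list for the exact metric types and label kinds before relying on it.

```python
SERVICE = "managedkafka.googleapis.com"  # service URL given on this page


def build_metric_filter(metric_path, labels=None):
    """Build a Cloud Monitoring style filter string (hypothetical helper).

    metric_path: metric name relative to the service, e.g. "memory/usage".
    labels: optional mapping of label name to value, e.g. {"cluster_id": "c1"}.
    Assumes the labels are resource labels; change the prefix if they are not.
    """
    parts = [f'metric.type = "{SERVICE}/{metric_path}"']
    # Sort the labels so the generated filter is deterministic.
    for name, value in sorted((labels or {}).items()):
        parts.append(f'resource.labels.{name} = "{value}"')
    return " AND ".join(parts)


if __name__ == "__main__":
    # "my-cluster" is a placeholder cluster name.
    print(build_metric_filter("memory/usage", {"cluster_id": "my-cluster"}))
```

The resulting string can be pasted into a tool that accepts Monitoring filters, or passed to a client library call that lists time series.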

Cluster metrics

The following metrics apply to clusters. To view the metrics for a specific cluster, filter by the cluster_id label.

  • Cumulative CPU usage of the cluster in vCPU. This can be useful for understanding the overall cost of operation for the cluster.
    Equivalent MBean name: N/A

  • Current CPU count configured for the cluster. Can be used to monitor CPU utilization as a ratio with the cpu/usage metric.
    Equivalent MBean name: N/A

  • Current RAM usage on the cluster. Can be used to monitor RAM utilization as a ratio with the memory/limit metric.
    Equivalent MBean name: N/A

  • Current configured RAM size of the cluster. Can be used to monitor RAM utilization as a ratio with the memory/usage metric.
    Equivalent MBean name: N/A

  • The total number of bytes from clients sent to all topics.
    Equivalent MBean name: kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec

  • The total number of bytes sent to clients from all topics.
    Equivalent MBean name: kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec

  • The total number of messages that have been published to all topics.
    Equivalent MBean name: kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec

  • The total number of requests made to the broker.
    Equivalent MBean name: kafka.network:type=RequestMetrics,name=RequestsPerSec,request={Produce|FetchConsumer|FetchFollower},version=([0-9]+)

  • The total size, in bytes, of requests made to the cluster.
    Equivalent MBean name: kafka.network:type=RequestMetrics,name=RequestBytes,request=([-.\w]+)

  • The current number of partitions handled by this cluster, broken down by broker.
    Equivalent MBean name: kafka.server:type=ReplicaManager,name=PartitionCount

  • The number of milliseconds taken for each request, at various percentiles.
    Equivalent MBean name: kafka.network:type=RequestMetrics,name=TotalTimeMs,request={Produce|FetchConsumer|FetchFollower}

  • The current number of consumer groups consuming from the broker.
    Equivalent MBean name: kafka.server:type=GroupMetadataManager,name=NumGroups

  • The number of offline topic partitions as observed by the controller.
    Equivalent MBean name: kafka.controller:type=KafkaController,name=OfflinePartitionCount
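
The RAM usage and RAM size metrics above are described as forming a utilization ratio. A minimal Python sketch of that calculation follows; the function name and the sample readings are made up for illustration and do not come from a real cluster.

```python
def utilization(usage, limit):
    """Return usage as a fraction of limit, or None if limit is not positive."""
    if limit <= 0:
        return None
    return usage / limit


# Made-up sample readings in bytes, standing in for memory/usage and memory/limit.
memory_usage_bytes = 6 * 1024**3
memory_limit_bytes = 8 * 1024**3

ram = utilization(memory_usage_bytes, memory_limit_bytes)
print(f"RAM utilization: {ram:.0%}")  # prints "RAM utilization: 75%"
```

The same ratio applies to CPU, with one caveat: because CPU usage is described as cumulative, it would presumably need to be converted to a rate over a time window before dividing by the configured CPU count.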

Topic metrics

The following metrics apply to topics. To view the metrics for a specific topic, filter by the cluster_id and topic_id labels.

  • The total number of messages published to the topic.
    Equivalent MBean name: kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=([-.\w]+)

  • The total number of bytes from clients sent to the topic.
    Equivalent MBean name: kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=([-.\w]+)

  • The total number of produce and fetch requests made to the topic.
    Equivalent MBean names:
    kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec,topic=([-.\w]+)
    kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec,topic=([-.\w]+)

  • The total number of failed produce and failed fetch requests made to the topic.
    Equivalent MBean names:
    kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec,topic=([-.\w]+)
    kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec,topic=([-.\w]+)

  • The total number of bytes sent to clients.
    Equivalent MBean name: kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=([-.\w]+)

Partition metrics

The following metrics apply to partitions. To view the metrics for a specific partition in a topic, filter by the cluster_id, topic_id, and partition_index labels.

  • Replication lag in messages between leader and each follower replica.
    Equivalent MBean name: kafka.server:type=FetcherLagMetrics,name=ConsumerLag,clientId=([-.\w]+),topic=([-.\w]+),partition=([0-9]+)

  • The current number of log segments. This is useful to make sure storage tiering remains healthy.
    Equivalent MBean name: kafka.log:type=Log,name=NumLogSegments,topic=([-.\w]+),partition=([0-9]+)

  • The first offset for each partition in the topic. In combination with the last_offset, it can be used to monitor an upper bound on the total number of messages stored as well as to find the actual offset of the oldest message.
    Equivalent MBean name: kafka.log:type=Log,name=LogStartOffset,topic=([-.\w]+),partition=([0-9]+)

  • The last offset in the partition. This can be used to find the latest offset for each partition over time. This can be useful in identifying the specific offset needed to reprocess data starting from a particular time in the past.
    Equivalent MBean name: kafka.log:type=Log,name=LogEndOffset,topic=([-.\w]+),partition=([0-9]+)

  • The size of the partition on disk in bytes.
    Equivalent MBean name: N/A
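
The first and last offsets above can be combined into an upper bound on the number of stored messages. The following Python sketch shows the arithmetic; the function name and the sample offsets are hypothetical. The result is only an upper bound because offsets are not necessarily contiguous, for example after log compaction.

```python
def max_stored_messages(first_offset, last_offset):
    """Upper bound on the number of messages currently stored in one partition."""
    # Clamp at zero in case the two readings were sampled at different times.
    return max(last_offset - first_offset, 0)


# Made-up (first_offset, last_offset) readings keyed by partition_index.
offsets = {0: (100, 250), 1: (0, 90), 2: (40, 40)}

per_partition = {
    partition: max_stored_messages(first, last)
    for partition, (first, last) in offsets.items()
}
print(per_partition)                # {0: 150, 1: 90, 2: 0}
print(sum(per_partition.values()))  # 240
```

Comparing the per-partition values is one way to spot the uneven key distribution mentioned in the overview.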

Consumer group metrics

The following metrics apply to consumer groups. To view the metrics for a specific consumer group, filter by the consumer_group_id label.

  • The difference between the latest offset and the last committed offset for the consumer group for each partition. This metric estimates how many produced messages the consumer has not yet successfully processed.
    Equivalent MBean name: N/A
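
The lag definition above amounts to a per-partition subtraction. The following Python sketch illustrates it with made-up offsets; the function name is hypothetical, and skipping partitions without a committed offset is an assumption of this sketch, not documented service behavior.

```python
def offset_lag(latest_offsets, committed_offsets):
    """Per-partition lag: latest offset minus the group's last committed offset.

    Partitions with no committed offset are skipped (an assumption of this sketch).
    """
    return {
        partition: max(latest - committed_offsets[partition], 0)
        for partition, latest in latest_offsets.items()
        if partition in committed_offsets
    }


# Made-up offsets keyed by partition index.
latest = {0: 500, 1: 320, 2: 75}
committed = {0: 480, 1: 320}

lag = offset_lag(latest, committed)
print(lag)                # {0: 20, 1: 0}
print(sum(lag.values()))  # 20
```

A lag that keeps growing on one partition while the others stay near zero usually points at a slow or stuck consumer for that partition rather than at the group as a whole.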
