Labels
The AlloyDB Omni Kubernetes operator exposes the following types of labels.
Resource labels
The AlloyDB Omni Kubernetes operator exposes the following resource labels, which uniquely identify the database container that a metric belongs to. These resource labels match the names of the Kubernetes resources that own the database container:
| Label key | Label value |
|---|---|
| dbnamespace | Namespace of the dbcluster CR |
| dbcluster | Name of the dbcluster CR |
| dbinstance | Name of the dbinstance CR. Only the dbinstance of ReadPool type is supported. If the database container does not belong to a ReadPool dbinstance, this value is n/a. |
| dbnode | Name of the instance CR. Every instance CR has a one-to-one mapping to a database container. |
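As a rough illustration of how the resource labels can be used, the following Python sketch builds a PromQL-style selector that narrows a metric to one database container. The helper name `build_selector` and the label values (`db`, `my-cluster`, `my-node`) are hypothetical; the metric name `alloydb_omni_monitor_collect_ms` is taken from the metrics collection table on this page.

```python
def build_selector(metric: str, labels: dict[str, str]) -> str:
    """Build a PromQL instant-vector selector such as metric{key="value"}."""
    matchers = []
    for key in sorted(labels):
        # Escape backslashes, double quotes, and newlines inside label values.
        value = (
            labels[key]
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        matchers.append(f'{key}="{value}"')
    if not matchers:
        return metric
    return metric + "{" + ",".join(matchers) + "}"


# Hypothetical label values; substitute the names of your own resources.
selector = build_selector(
    "alloydb_omni_monitor_collect_ms",
    {"dbnamespace": "db", "dbcluster": "my-cluster", "dbnode": "my-node"},
)
print(selector)
```

Sorting the keys keeps the generated selector stable, which makes it easier to compare or cache query strings.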
System metadata labels
System metadata labels change dynamically when the role of the database container changes. For example, when your dbcluster is promoted from secondary to primary, dbcluster_type changes from Secondary to Primary.
| Label key | Label value |
|---|---|
| dbcluster_type | Disaster recovery (DR) role of the dbcluster CR. Can be Primary or Secondary. |
| dbinstance_type | Type of the dbinstance CR. If the container belongs to a ReadPool dbinstance, this value is ReadPool; otherwise, this value is n/a. |
| dbnode_type | High availability (HA) role of the dbnode. Can be Primary or Standby. |
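Because these labels change with the role of the container, it is probably safer to read them from fresh samples than to cache them. The following is a minimal sketch, assuming the labels of each sample are already available as a Python dictionary (for example, from a query result). The function name `primary_nodes` and the sample values are hypothetical.

```python
def primary_nodes(
    label_sets: list[dict[str, str]],
) -> dict[tuple[str, str], list[str]]:
    """Map (dbnamespace, dbcluster) to the dbnodes whose dbnode_type is Primary."""
    result: dict[tuple[str, str], list[str]] = {}
    for labels in label_sets:
        if labels.get("dbnode_type") != "Primary":
            continue  # skip Standby nodes and samples without the label
        key = (labels.get("dbnamespace", ""), labels.get("dbcluster", ""))
        node = labels.get("dbnode", "")
        # Several metrics can carry the same labels; record each node once.
        if node not in result.setdefault(key, []):
            result[key].append(node)
    return result


# Hypothetical label sets for two database containers in one dbcluster.
samples = [
    {"dbnamespace": "db", "dbcluster": "c1", "dbnode": "n1", "dbnode_type": "Primary"},
    {"dbnamespace": "db", "dbcluster": "c1", "dbnode": "n2", "dbnode_type": "Standby"},
]
print(primary_nodes(samples))
```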
Metric labels
The specific labels of each metric are listed in the following tables. For example, database means the name of a Postgres database hosted inside the AlloyDB Omni database container.
Metrics
The AlloyDB Omni Kubernetes operator exposes the following metrics.
The metrics list mentions only metric labels. All metric names start with alloydb_omni.
To learn more about metric types, see Metric types.
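The shared prefix makes it straightforward to separate these metrics from others exposed on the same endpoint. The following sketch assumes the metrics are available as Prometheus text exposition output; the function names and the sample label values are hypothetical.

```python
from typing import Optional

PREFIX = "alloydb_omni"


def metric_name(line: str) -> Optional[str]:
    """Return the metric name of one exposition-format sample line, or None."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line, or a HELP/TYPE/comment line
    # The name ends at the first "{" (labels) or whitespace (value).
    for i, ch in enumerate(line):
        if ch == "{" or ch.isspace():
            return line[:i]
    return line


def alloydb_metric_names(text: str) -> set[str]:
    """Collect the names of all samples whose metric name starts with PREFIX."""
    names = set()
    for line in text.splitlines():
        name = metric_name(line)
        if name is not None and name.startswith(PREFIX):
            names.add(name)
    return names


# Hypothetical scrape output with one AlloyDB Omni metric and one other metric.
sample = """\
# TYPE alloydb_omni_monitor_collect_ms gauge
alloydb_omni_monitor_collect_ms{dbcluster="c1"} 12
workqueue_depth{name="dbcluster"} 0
"""
print(alloydb_metric_names(sample))
```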
Database container-level metrics
Database container-level metrics are collected per AlloyDB Omni database container. Each database container-level metric has resource and system metadata labels.
MAJOR.MINOR format, for example, 16.3
- wait_event_name: name of the wait event
- wait_event_type: type of the wait event
- wait_event_name: name of the wait event
- wait_event_type: type of the wait event
- application_name: application_name in the replica's connection string to the primary that matches the name of the replica instance CR.
- client_addr: IP address of the replica pod.
- application_name: application_name in the replica's connection string to the primary. It matches the name of the replica instance CR.
- client_addr: IP address of the replica pod.
1.
- application_name: application_name in the replica's connection string to the primary. It matches the name of the replica instance CR.
- client_addr: IP address of the replica pod.
- state: one of [startup, catchup, streaming, backup, stopping]
- application_name: application_name in the replica's connection string to the primary. It matches the name of the replica instance CR.
- client_addr: IP address of the replica pod.
primarySpec section of spec portion of the database cluster manifest file.
Database-level metrics
These metrics are collected for each Postgres database in each AlloyDB Omni database container. You can create multiple Postgres databases in one database container. All these metrics have resource, system metadata, and database labels. The database label is the name of the Postgres database that the metric belongs to.
- user: Postgres user that ran the queries.
- client_addr: IP address of the client if available, otherwise empty.
- user: Postgres user that ran the queries
- io_type: read or write
Metrics collection metrics
These metrics indicate the status of each metric collection cycle. They have the resource labels mentioned in Labels.
| Name | Description | Unit | Type |
|---|---|---|---|
| alloydb_omni_monitor_collect_ms | Number of milliseconds spent to collect metrics. | ms | gauge |
| alloydb_omni_monitor_error_count | Number of errors encountered while trying to collect metrics this cycle. | | gauge |
| alloydb_omni_monitor_metric_count | Number of metrics collected successfully this cycle. | | gauge |
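One way to use these three gauges together is a simple health check for a collection cycle. The sketch below is an assumption about how such a check could look, not a documented feature: the class and function names are hypothetical, and the 5000 ms threshold is an arbitrary placeholder that you would tune for your environment.

```python
from dataclasses import dataclass


@dataclass
class CollectionCycle:
    collect_ms: float    # value of alloydb_omni_monitor_collect_ms
    error_count: float   # value of alloydb_omni_monitor_error_count
    metric_count: float  # value of alloydb_omni_monitor_metric_count


def cycle_problems(cycle: CollectionCycle, max_collect_ms: float = 5000.0) -> list[str]:
    """Return human-readable problems for one collection cycle (empty if none)."""
    problems = []
    if cycle.error_count > 0:
        problems.append(f"{cycle.error_count:g} collection error(s)")
    if cycle.metric_count == 0:
        problems.append("no metrics collected")
    # max_collect_ms is an assumed threshold, not a value from the product docs.
    if cycle.collect_ms > max_collect_ms:
        problems.append(f"collection took {cycle.collect_ms:g} ms")
    return problems


print(cycle_problems(CollectionCycle(collect_ms=120, error_count=0, metric_count=300)))
print(cycle_problems(CollectionCycle(collect_ms=6000, error_count=2, metric_count=0)))
```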
Prometheus metric handler metrics
These metrics are automatically generated by Prometheus for each collection cycle.
| Name | Description | Cause | Type |
|---|---|---|---|
| promhttp_metric_handler_errors_total | Total number of internal errors encountered by the promhttp metric handler. | Cause of the error | counter |
Controller-runtime metrics
These metrics are exposed by the AlloyDB Omni Kubernetes operator for monitoring its health and performance. For more information, see the controller-runtime metrics reference.
| Name | Description | Label | Unit | Type |
|---|---|---|---|---|
| workqueue_depth | Current depth of workqueue. | name | | Gauge |
| workqueue_adds_total | Total number of adds handled by workqueue. | name | | Counter |
| workqueue_queue_duration_seconds | How long in seconds an item stays in workqueue before being requested. | name | seconds | Histogram |
| workqueue_work_duration_seconds | How long in seconds processing an item from workqueue takes. | name | seconds | Histogram |
| workqueue_unfinished_work_seconds | How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases. | name | seconds | Gauge |
| workqueue_longest_running_processor_seconds | How many seconds the longest running processor for workqueue has been running. | name | seconds | Gauge |
| workqueue_retries_total | Total number of retries handled by workqueue. | name | | Counter |
| rest_client_requests_total | Number of HTTP requests, partitioned by status code, method, and host. | code, method, host | | Counter |
| controller_runtime_reconcile_total | Total number of reconciliations per controller. | controller, result | | Counter |
| controller_runtime_reconcile_errors_total | Total number of reconciliation errors per controller. | controller | | Counter |
| controller_runtime_terminal_reconcile_errors_total | Total number of terminal errors from the reconciler. | controller | | Counter |
| controller_runtime_reconcile_time_seconds | Length of time per reconciliation per controller. | controller | seconds | Histogram |
| controller_runtime_max_concurrent_reconciles | Maximum number of concurrent reconciles per controller. | controller | | Gauge |
| controller_runtime_active_workers | Number of used workers per controller. | controller | | Gauge |
| controller_runtime_webhook_latency_seconds | Histogram of the latency of processing admission requests. | webhook | seconds | Histogram |
| controller_runtime_webhook_requests_total | Total number of admission requests by HTTP status code. | webhook, code | | Counter |
| controller_runtime_webhook_requests_in_flight | Current number of admission requests being served. | webhook | | Gauge |
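Counters such as controller_runtime_reconcile_total and controller_runtime_reconcile_errors_total are most useful as differences between two scrapes. The following sketch estimates the fraction of failed reconciliations for one controller over an interval. It assumes you have already summed controller_runtime_reconcile_total over its result label for that controller; the function name and the numbers are hypothetical.

```python
from typing import Optional


def reconcile_error_ratio(
    total_before: float,
    total_after: float,
    errors_before: float,
    errors_after: float,
) -> Optional[float]:
    """Fraction of reconciliations that failed between two scrapes.

    Returns None when the ratio is undefined: no reconciliations happened,
    or a counter decreased (which suggests the process restarted).
    """
    total_delta = total_after - total_before
    error_delta = errors_after - errors_before
    if total_delta <= 0 or error_delta < 0:
        return None
    return error_delta / total_delta


# Hypothetical counter values from two consecutive scrapes.
print(reconcile_error_ratio(100, 150, 4, 9))
```

Returning None instead of 0 keeps "no data" distinguishable from "no errors", which matters if the result feeds an alert.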
What's next
- To learn how to use metrics for monitoring, see Monitor AlloyDB Omni.

