The following table lists the infrastructure resources that are integrated with Application Monitoring. When these resources are registered as an App Hub service or workload, the telemetry that they generate includes application-specific labels. This telemetry includes platform and audit log entries, metric data, and trace data. For a list of services and workloads that are integrated with App Hub, see App Hub supported resources.
The out-of-the-box (OOTB) dashboards generated by Application Monitoring display log and metric data, including the following golden signals, when that data includes application-specific labels:
- Traffic : Incoming request rates on the service or workload over the selected time period.
- Server error rate : Average percentage of incoming requests that generate or map to 5xx HTTP response codes over the selected time period.
- P95 latency : 95th percentile of latency for a request served over the selected time period, in milliseconds.
- Saturation : Measures how full your service or workload is. For example, for managed instance groups (MIGs), Cloud Run, and Google Kubernetes Engine deployments, this field shows the CPU utilization.
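To make the definitions above concrete, the following minimal sketch computes traffic, server error rate, and P95 latency from a list of request records. It is an illustration only, not how Application Monitoring computes its dashboards: the `Request` type, the `golden_signals` function, and the nearest-rank percentile method are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one served request."""
    status: int        # HTTP response code
    latency_ms: float  # time taken to serve the request, in milliseconds


def golden_signals(requests, window_seconds):
    """Summarize requests observed during a window of window_seconds."""
    total = len(requests)
    if total == 0:
        return {"traffic_rps": 0.0, "server_error_pct": 0.0, "p95_latency_ms": None}

    # Server error rate: share of requests that returned a 5xx code.
    errors = sum(1 for r in requests if 500 <= r.status <= 599)

    # P95 latency by the nearest-rank method (an assumption; monitoring
    # backends typically estimate percentiles from distribution buckets).
    latencies = sorted(r.latency_ms for r in requests)
    rank = (95 * total + 99) // 100  # integer ceiling of 0.95 * total

    return {
        "traffic_rps": total / window_seconds,
        "server_error_pct": 100.0 * errors / total,
        "p95_latency_ms": latencies[rank - 1],
    }
```

For example, 20 requests observed over a 10-second window correspond to a traffic value of 2 requests per second.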
Supported infrastructure resources
The Notes column lists which golden signals are supported for each resource, along with any limitations.
- Traffic : instance/postgres/transaction_count
- Server error rate : Ratio of instance/postgres/abort_count to the total instance/postgres/transaction_count.
- Saturation : instance/cpu/average_utilization
- Saturation : cluster/cpu_load
- Saturation : container/cpu/utilizations
- Traffic : request_count
- Server error rate : The ratio of the number of requests with a response status of 5xx to the total number of requests.
- P95 latency : request_latencies
- Saturation : container/cpu/utilizations
Only trace spans generated by instrumented workloads contain application labels.
Only customer-instrumented workloads running within Cloud Run generate golden signals.
- Saturation : database/cpu/utilization
- Traffic : api/request_count
- Server error rate : Ratio of requests with a response status that indicates an error to the total number of requests. The following status values indicate an error:
  - DATA_LOSS
  - DEADLINE_EXCEEDED
  - INTERNAL
  - UNAVAILABLE
  - UNIMPLEMENTED
  - UNKNOWN
- Saturation : instance/cpu/utilization
- Traffic : The metric depends on whether the service is single- or multi-region:
  - Single: service/request_count
  - Multi: service/multi_region/request_count
- Server error rate : The ratio of the number of service/grpc/finished_requests_count with a response status that indicates an error to the total number of finished requests. The following status values indicate an error:
  - DEADLINE_EXCEEDED
  - INTERNAL
  - UNAVAILABLE
  - UNIMPLEMENTED
- P95 latency : The metric depends on whether the service is single- or multi-region:
  - Single: service/request_latencies
  - Multi: service/multi_region/request_latencies
Application labels aren't attached to spans.
For general information, see Dataproc Metastore overview.
- Traffic : api/request_count
- Server error rate : Ratio of requests with a response status that indicates an error to the total number of requests. The following status values indicate an error:
  - DEADLINE_EXCEEDED
  - INTERNAL
  - UNAVAILABLE
  - UNIMPLEMENTED
- P95 latency : api/request_latencies
- Saturation : The ratio of container/cpu/core_usage_time to container/cpu/request_cores.
- Traffic, Server error rate, and P95 latency : For workloads that run on GKE, these signals are captured from the Prometheus metric http_server_request_duration_seconds, which is only available when you instrument your application by using OpenTelemetry. To learn more, see Instrument your application.
- Saturation : The ratio of container/cpu/core_usage_time to container/cpu/request_cores.
- Traffic, Server error rate, and P95 latency : For workloads that run on GKE, these signals are captured from the Prometheus metric http_server_request_duration_seconds, which is only available when you instrument your application by using OpenTelemetry. To learn more, see Instrument your application.
- Saturation : The ratio of container/cpu/core_usage_time to container/cpu/request_cores.
- Traffic, Server error rate, and P95 latency : For workloads that run on GKE, these signals are captured from the Prometheus metric http_server_request_duration_seconds, which is only available when you instrument your application by using OpenTelemetry. To learn more, see Instrument your application.
(Global and regional)
- Traffic : Based on a Cloud Load Balancing metric type that records the request count, such as https/request_count. The load balancer's configuration determines the actual metric.
- Server error rate : The ratio of the number of requests with a response status of 5xx to the total number of requests.
- P95 latency : Based on a Cloud Load Balancing metric type that records total latencies, such as https/total_latencies. The load balancer's configuration determines the actual metric.
- Saturation : cluster/cpu/average_utilization
- Traffic : topic/send_request_count
- Server error rate : The ratio of the number of requests with a response code of internal to the total number of requests.
- P95 latency : topic/send_request_latencies
- Traffic : subscription/pull_request_count
- Server error rate : The ratio of the number of requests with a response code of internal to the total number of requests.
- P95 latency : subscription/push_request_latencies
- Traffic : api/api_request_count
- Server error rate : Ratio of requests with a response status that indicates an error to the total number of requests. The following status values indicate an error:
  - data_loss
  - deadline_exceeded
  - internal
  - unavailable
  - unimplemented
  - unknown
- P95 latency : api/request_latencies
- Saturation : instance/cpu/utilization
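
Several entries above define the server error rate as the ratio of requests whose response status indicates an error to the total number of requests. The following sketch shows that calculation under stated assumptions: the `server_error_rate` function and the `counts_by_status` input (a mapping from status name to request count) are hypothetical and aren't part of any Cloud Monitoring API. Because the set of error statuses differs between resources, it is passed as a parameter, and status names are compared case-insensitively because the table lists them in both uppercase and lowercase.

```python
# Longest status list that appears in the table; some resources use a subset.
ERROR_STATUSES = frozenset({
    "DATA_LOSS",
    "DEADLINE_EXCEEDED",
    "INTERNAL",
    "UNAVAILABLE",
    "UNIMPLEMENTED",
    "UNKNOWN",
})


def server_error_rate(counts_by_status, error_statuses=ERROR_STATUSES):
    """Return error requests divided by total requests, as a value in [0, 1].

    counts_by_status is a hypothetical mapping such as {"ok": 90, "internal": 5}.
    """
    total = sum(counts_by_status.values())
    if total == 0:
        return 0.0  # no requests in the window, so no errors to report
    errors = sum(
        count
        for status, count in counts_by_status.items()
        if status.upper() in error_statuses
    )
    return errors / total
```

For a resource whose entry lists only four error statuses, pass that shorter set as `error_statuses` so that, for example, `UNKNOWN` responses aren't counted as errors.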