This page describes how to use built-in metrics, custom metrics, and alerts to monitor your agents in Vertex AI Agent Engine.
Overview
You can use Cloud Monitoring with Vertex AI Agent Engine without any additional setup or configuration. Built-in agent metrics are automatically collected and visualized in Cloud Monitoring pages in the Google Cloud console.
Supported built-in metrics
The following agent metrics are supported and associated with the Vertex AI Agent Engine monitored resource `aiplatform.googleapis.com/ReasoningEngine`:
- Request count
- Request latencies
- Container CPU allocation time
- Container memory allocation time

Refer to the full list of AI Platform metrics for more details about metric types, units, and labels, as well as latency and sampling period.
View metrics for an agent
You can view your agent's built-in metrics in the Google Cloud console using the Metrics Explorer:
- To get permission to view metrics in Metrics Explorer, ask your administrator to grant you the Monitoring Viewer role (`roles/monitoring.viewer`) on your project.
- Go to Metrics Explorer in the Google Cloud console.
- Select your Google Cloud project.
- Click Select a metric to open a search bar.
- Enter Vertex AI Agent Builder Reasoning Engine in the search bar and click Vertex AI Agent Builder Reasoning Engine.
- Click the Reasoning_engine metric category, then click a metric, such as Request count.
- Optionally, set additional label filters and aggregation elements, and adjust the time range.

By default, the Metrics Explorer chart for the Request count metric aligns data points with a default time interval and plots data points as requests per second (a rate metric).
Query metrics for an agent
You can also query metrics through Prometheus Query Language (PromQL) or the Cloud Monitoring v3 API. PromQL offers more options for metrics filtering, aggregation, and transformation, while the Cloud Monitoring API lets you programmatically list and query all raw data points.
Query metrics with PromQL
You can use PromQL to align and aggregate data points with a custom time
interval and plot transformed data points as the absolute request count
(instead of requests per second). The following example filters data by Agent
Engine instance ID (RESOURCE_ID) and response code (RESPONSE_CODE):
```promql
sum_over_time(
  increase(
    aiplatform_googleapis_com:reasoning_engine_request_count{
      monitored_resource='aiplatform.googleapis.com/ReasoningEngine',
      reasoning_engine_id='RESOURCE_ID',
      response_code='RESPONSE_CODE'
    }[10m]
  )[10m:10m]
)
```
You can query the error rate by calculating the ratio of requests that are labeled with certain error response codes (such as 500) to the total number of requests, that is, the percentage of failed requests:
```promql
sum_over_time(
  sum(
    rate(
      aiplatform_googleapis_com:reasoning_engine_request_count{
        monitored_resource='aiplatform.googleapis.com/ReasoningEngine',
        reasoning_engine_id='RESOURCE_ID',
        response_code='500'
      }[10m]
    )
  )[10m:10m]
)
/
sum_over_time(
  sum(
    rate(
      aiplatform_googleapis_com:reasoning_engine_request_count{
        monitored_resource='aiplatform.googleapis.com/ReasoningEngine',
        reasoning_engine_id='RESOURCE_ID'
      }[10m]
    )
  )[10m:10m]
)
```
For best practices and restrictions for ratio metrics, see About ratios of metrics . For an example of how to set an alert for the error rate metric, see Sample policies in JSON .
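The same ratio can also be computed client-side from aggregated counts. A minimal Python sketch (the counts here are hypothetical; a real workflow would pull them from the Cloud Monitoring API):

```python
def error_rate(counts_by_code: dict[str, int], error_codes: set[str] = {"500"}) -> float:
    """Fraction of requests whose response code is in `error_codes`."""
    total = sum(counts_by_code.values())
    if total == 0:
        return 0.0
    errors = sum(n for code, n in counts_by_code.items() if code in error_codes)
    return errors / total

# Example: counts aggregated over a 10-minute window.
print(error_rate({"200": 95, "500": 5}))  # → 0.05
```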
Query metrics with the Cloud Monitoring API
You can use the Cloud Monitoring API to do the following:
- Get the Vertex AI Agent Engine monitored resource definition
- List available agent metric definitions
- Query time-series data for `request_count`

All agent metrics are associated with the Agent Engine monitored resource `aiplatform.googleapis.com/ReasoningEngine`.

You can invoke these APIs through the APIs Explorer, language-specific client libraries, or the command line. Refer to the documentation for reading metrics through the APIs Explorer and client libraries. The following examples demonstrate usage on the command line, specifically with the `curl` tool.
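The examples use `gcurl`, which is not a standalone tool but a commonly used shell alias for `curl` with an OAuth 2.0 access token attached. One way to define it (assuming you have the gcloud CLI installed and authenticated):

```shell
# gcurl: curl with an access token and a JSON content type header.
# Assumes the gcloud CLI is installed and you have run `gcloud auth login`.
alias gcurl='curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json"'
```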
Get the Agent Engine monitored resource definition
The following command retrieves the definition of the monitored resource using `projects.monitoredResourceDescriptors`, as well as all available labels which can be used for filtering:

```shell
gcurl https://monitoring.googleapis.com/v3/projects/PROJECT_ID/monitoredResourceDescriptors/aiplatform.googleapis.com/ReasoningEngine
```

The labels should include `resource_container`, `location`, and `reasoning_engine_id`.
List available agent metric definitions
The following command uses `projects.metricDescriptors` to retrieve all metrics and label filters for Agent Engine:

```shell
gcurl https://monitoring.googleapis.com/v3/projects/PROJECT_ID/metricDescriptors?filter='metric.type=starts_with("aiplatform.googleapis.com/reasoning_engine")'
```

The result should include the definitions for the following metrics, as well as their specific labels:
- `aiplatform.googleapis.com/reasoning_engine/request_count`
- `aiplatform.googleapis.com/reasoning_engine/request_latencies`
- `aiplatform.googleapis.com/reasoning_engine/cpu/allocation_time`
- `aiplatform.googleapis.com/reasoning_engine/memory/allocation_time`
Query time-series data for request_count
You can use `projects.timeSeries.list` along with parameters like `interval`, `filter`, and `aggregation` to query time-series data.
The following example shows how to query the raw data points of the `request_count` metric for a specific agent instance during a specific time window:

```shell
gcurl https://monitoring.googleapis.com/v3/projects/PROJECT_ID/timeSeries?filter='metric.type="aiplatform.googleapis.com/reasoning_engine/request_count"%20AND%20resource.labels.reasoning_engine_id="RESOURCE_ID"&interval.endTime=2025-03-26T11:00:0.0-08:00&interval.startTime=2025-03-26T10:00:0.0-08:00'
```

Replace the following:
- PROJECT_ID: Your Google Cloud project ID.
- RESOURCE_ID: The Agent Engine instance ID. This is not always required; you can query across multiple Agent Engine instances within the same project.
- interval.startTime and interval.endTime: The start (inclusive) and end (exclusive) of the time interval, in RFC 3339 format. For example, "2025-03-26T11:22:33Z" for Coordinated Universal Time (UTC) and "2025-03-26T11:22:33-08:00" for Pacific Standard Time (PST). See the complete definition and more examples in RFC 3339.
You should receive a response similar to the following:
```json
{
  "timeSeries": [
    {
      "metric": {
        "labels": {
          "response_code": "200",
          "response_code_class": "2xx"
        },
        "type": "aiplatform.googleapis.com/reasoning_engine/request_count"
      },
      "resource": {
        "type": "aiplatform.googleapis.com/ReasoningEngine",
        "labels": {
          "reasoning_engine_id": "RESOURCE_ID",
          "location": "LOCATION",
          "project_id": "PROJECT_ID"
        }
      },
      "metricKind": "DELTA",
      "valueType": "INT64",
      "points": [
        {
          "interval": {
            "startTime": "2025-03-26T18:55:27.001Z",
            "endTime": "2025-03-26T18:56:27Z"
          },
          "value": {
            "int64Value": "25"
          }
        },
        {
          "interval": {
            "startTime": "2025-03-26T18:54:27.001Z",
            "endTime": "2025-03-26T18:55:27Z"
          },
          "value": {
            "int64Value": "36"
          }
        }
        // ... more data points ...
      ]
    }
    // ... potentially more time series with other response codes ...
  ],
  "unit": "1"
}
```

See `projects.timeSeries.list` for more details on the response format.
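As a sketch, a `timeSeries.list` response shaped like the sample above can be post-processed client-side, for example to total the request counts per response code (the helper function and the data values here are illustrative, not part of the API):

```python
import json

def total_by_response_code(response: dict) -> dict[str, int]:
    """Sum all data-point values in each time series, keyed by response_code."""
    totals: dict[str, int] = {}
    for series in response.get("timeSeries", []):
        code = series["metric"]["labels"].get("response_code", "unknown")
        # int64Value is serialized as a string in the JSON response.
        count = sum(int(p["value"]["int64Value"]) for p in series["points"])
        totals[code] = totals.get(code, 0) + count
    return totals

sample = json.loads("""
{
  "timeSeries": [
    {
      "metric": {"labels": {"response_code": "200"},
                 "type": "aiplatform.googleapis.com/reasoning_engine/request_count"},
      "points": [
        {"value": {"int64Value": "25"}},
        {"value": {"int64Value": "36"}}
      ]
    }
  ],
  "unit": "1"
}
""")
print(total_by_response_code(sample))  # → {'200': 61}
```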
Create custom metrics for an agent
If the built-in agent metrics don't cover your specific use case, you can define custom metrics. You can create custom metrics using the following methods:
- Log-based metrics: Observe trends and patterns in a large volume of log entries.
- User-defined metrics: Metrics that aren't defined by Google Cloud, such as metrics capturing application-specific data or client-side system data.
Log-based metrics
The following steps demonstrate how to create and use a log-based metric (`tool_calling_count`) for an example workflow where multiple agents call multiple tools, and you want to count tool invocations:

- Specify your tool to write a log entry every time it's called. For example, "tool-\<tool-id\> invoked by agent-\<agent-id\>".
- Create a new counter-type log-based metric through the Google Cloud console:
  - Go to the Log-based Metrics page in the Google Cloud console.
  - In the User-defined metrics section, click Create metric. The Create log-based metric pane appears.
  - For Metric type, select Counter.
  - In the Details section, enter the Log-based metric name. For example, `tool_calling_count`. Optionally, enter the Description and Units.
  - In the Filter selection section, do the following:
    - In the Select project or log bucket drop-down list, select Project logs.
    - In the Build filter field, enter the log filter using the Logging query language. For example:

      ```
      resource.type="aiplatform.googleapis.com/ReasoningEngine"
      resource.labels.reasoning_engine_id="RESOURCE_ID"
      textPayload=~"tool-\d+ invoked by agent-\d+" -- assuming both tool and agent IDs are numeric
      ```

  - In the Labels section, add two new labels by clicking the Add label button.
    - For the first label, do the following:
      - In the Label name field, enter `tool`.
      - In the Field name field, enter `textPayload`.
      - In the Regular expression field, enter `(tool-\d+) invoked by agent-\d+`.
    - For the second label, do the following:
      - In the Label name field, enter `agent`.
      - In the Field name field, enter `textPayload`.
      - In the Regular expression field, enter `tool-\d+ invoked by (agent-\d+)`.
    - Click Done.
  - Click Create metric.
- To view the `tool_calling_count` metric and its associated logs, do the following in the Google Cloud console:
  - Go to the Metrics Explorer page in the Google Cloud console.
  - Click Select a metric to open a search bar.
  - Enter Vertex AI Agent Builder Reasoning Engine in the search bar and click Vertex AI Agent Builder Reasoning Engine.
  - Click the Logs-based metrics metric category, then click Logging/user/tool_calling_count. Adjust the time range if necessary.
  - (Optional) Filter by the labels `tool` and `agent`:
    - To get the total invocation count for a specific tool across all agents, set the filter label `tool` to that tool's ID.
    - To get the total invocation count for a specific agent across all tools, set the filter label `agent` to that agent's ID.
    - Optionally, set Sum By to `tool` or `agent` to get the total count broken down by tool or agent.

See Logging an agent for instructions on how to write agent logs, and Log-based metrics overview for more details on log-based metrics.
User-defined metrics
The following steps demonstrate how to create and use a user-defined metric (`token_count`) for an example workflow where multiple agents call multiple models, and you want to calculate the total count of consumed tokens (assuming that you track the number of tokens since application startup for each invoking agent and target model):

- Define the custom metric type by calling `projects.metricDescriptors.create` with the following parameters:
  - `name`: a URL string, such as `projects/PROJECT_ID`
  - Request body: a `MetricDescriptor` object:

    ```json
    {
      "name": "token_count",
      "description": "Token Consumed by models.",
      "displayName": "Token Count",
      "type": "custom.googleapis.com/token_count",
      "metricKind": "CUMULATIVE",
      "valueType": "INT64",
      "unit": "1",
      "labels": [
        {
          "key": "model",
          "valueType": "STRING",
          "description": "Model."
        },
        {
          "key": "agent",
          "valueType": "STRING",
          "description": "Agent."
        }
      ],
      "monitoredResourceTypes": ["generic_node"]
    }
    ```

  The new metric `token_count` is created with the kind `CUMULATIVE`, representing the total number of tokens since application startup. See Metric kinds and types for more details about `CUMULATIVE` metrics. The labels `model` and `agent` represent the name of the target large language model (LLM) and the invoking agent.
- You can find the `token_count` metric in the Metrics Explorer:
  - Go to the Metrics Explorer page in the Google Cloud console.
  - Click Select a metric to open a search bar.
  - Enter Generic node in the search bar and click Custom metrics.
  - Click Token Count.
- Write data points to the new metric by calling `projects.timeSeries.create` with the following parameters:
  - `name`: a URL string, such as `projects/PROJECT_ID`
  - Request body: a list of `TimeSeries` objects:

    ```json
    {
      "timeSeries": [
        {
          "metric": {
            "type": "custom.googleapis.com/token_count",
            "labels": {
              "model": "model-1",
              "agent": "agent-1"
            }
          },
          "resource": {
            "type": "generic_node",
            "labels": {
              "project_id": "PROJECT_ID",
              "node_id": "RESOURCE_ID",
              "namespace": "",
              "location": "us-central1"
            }
          },
          "points": [
            {
              "interval": {
                "startTime": "2025-03-26T10:00:00-08:00",
                "endTime": "2025-03-26T10:01:00-08:00"
              },
              "value": {
                "int64Value": 15
              }
            }
          ]
        },
        {
          "metric": {
            "type": "custom.googleapis.com/token_count",
            "labels": {
              "model": "model-1",
              "agent": "agent-2"
            }
          },
          "resource": {
            "type": "generic_node",
            "labels": {
              "project_id": "PROJECT_ID",
              "node_id": "RESOURCE_ID",
              "namespace": "",
              "location": "us-central1"
            }
          },
          "points": [
            {
              "interval": {
                "startTime": "2025-03-26T10:00:00-08:00",
                "endTime": "2025-03-26T10:01:00-08:00"
              },
              "value": {
                "int64Value": 20
              }
            }
          ]
        }
        // ... more time series ...
      ]
    }
    ```

- Once the data points are uploaded through the Cloud Monitoring API, you can view the new `token_count` metric through the Google Cloud console:
  - Go to the Metrics Explorer page in the Google Cloud console.
  - Click Select a metric to open a search bar.
  - Enter Generic node in the search bar and click Custom metrics.
  - Click Token Count. Adjust the time range and configure label values for `model` or `agent` if necessary.
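Because `token_count` is a `CUMULATIVE` metric, each write must report the running total since application startup, not a per-interval delta. A minimal sketch of that in-process bookkeeping (the model and agent names are hypothetical; building the `TimeSeries` request body itself follows the shape shown above):

```python
from collections import defaultdict

class TokenCounter:
    """Tracks cumulative token usage per (model, agent) pair since startup."""

    def __init__(self) -> None:
        self._totals: dict[tuple[str, str], int] = defaultdict(int)

    def record(self, model: str, agent: str, tokens: int) -> None:
        """Add tokens consumed by one model invocation."""
        self._totals[(model, agent)] += tokens

    def cumulative(self, model: str, agent: str) -> int:
        """The value to report as the CUMULATIVE metric point."""
        return self._totals[(model, agent)]

counter = TokenCounter()
counter.record("model-1", "agent-1", 15)
counter.record("model-1", "agent-2", 20)
counter.record("model-1", "agent-1", 10)
print(counter.cumulative("model-1", "agent-1"))  # → 25
```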
Create alerts for an agent
You can use metrics in combination with alerts. See Alerting overview for more details.
The following example demonstrates how to create a threshold alert for the `request_latencies` metric so that you receive notifications when the latency crosses a predefined value for a specified duration:

- Go to the Alerting page in the Google Cloud console.
- Click Create Policy. The Create alerting policy page opens.
  - For Policy configuration mode, select Builder.
  - In the Select a metric drop-down menu, select Vertex AI Reasoning Engine -> reasoning_engine -> Request Latency.
  - In the Add filters section, optionally configure filters (such as `reasoning_engine_id` or `response_code`).
  - In the Transform data section, set Rolling window and Rolling window function to values such as `5min` and `99th percentile` (to monitor the 99th percentile of the request latency over the 5-minute alignment period).
  - Click Next.
- In the Configure alert trigger section, do the following:
  - Select Threshold for Condition Types.
  - Select an Alert trigger, such as Any time series violates.
  - Select a Threshold position, such as Above threshold.
  - Enter a threshold value, such as `5000ms`.
  - Click Next.
- In the Configure notifications and finalize alert section, do the following:
  - Select one or more notification channels. See Manage notification channels for more details.
  - (Optional) Configure the notification subject, incident auto-close duration, application labels, policy labels, severity level, and additional documentation.
  - Set the policy name in the Name the alert policy section, such as `latency-99p-alert`.
  - Click Create policy.

In the event of an incident, see Incidents for metric-based alerting policies for more information on acknowledging and investigating the incident and muting the alert.
You can find more alert examples in Sample policies in JSON.
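The trigger configured above (99th percentile of request latency over a 5-minute window, compared against a 5000 ms threshold) can be illustrated locally. Cloud Monitoring evaluates this condition server-side from latency distributions; this sketch only mimics the logic, with hypothetical latency samples:

```python
import statistics

def p99_breaches_threshold(latencies_ms: list[float], threshold_ms: float = 5000.0) -> bool:
    """True if the 99th percentile of the window's latencies exceeds the threshold."""
    # quantiles(n=100) returns the 1st..99th percentile cut points.
    p99 = statistics.quantiles(latencies_ms, n=100)[98]
    return p99 > threshold_ms

window = [120.0] * 99 + [9000.0]  # one slow outlier in a 5-minute window
print(p99_breaches_threshold(window))        # → True
print(p99_breaches_threshold([120.0] * 100)) # → False
```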
Monitor metrics for an agent
You can use the Vertex AI Agent Engine Overview dashboard to monitor the operational health and performance of your agents.
View the default dashboard
- Go to the Dashboards page in the Google Cloud console.
- Select your Google Cloud project.
- In the My Dashboards pane, add the filter Name: Vertex AI Agent Engine Overview.
- Click Vertex AI Agent Engine Overview to display the default agent dashboard.
Customize the default dashboard
The default dashboard contains only the built-in agent metrics. To add your own custom metrics to the dashboard, use the following steps to copy and customize the default dashboard:

- Click Copy Dashboard. In the Copy Dashboard dialog, click Copy. The dashboard copy opens. You can also find the dashboard copy in the My Dashboards pane under the Custom category.
- In the dashboard copy, follow these steps to add a metric:
  - Click Add widget. The Add widget side panel appears.
  - For Data, select Metric. The Configure widget side panel appears.
  - Click Select a metric to open a search bar.
  - If your custom metric was created using log-based metrics:
    - Enter Vertex AI Agent Builder Reasoning Engine in the search bar and click Vertex AI Agent Builder Reasoning Engine.
    - Click the Log-based metrics metric category, then click a metric, such as Logging/user/tool_calling_count.
    - Click Apply.
  - If your custom metric was created using user-defined metrics:
    - Enter Generic Node in the search bar and click Generic Node.
    - Click the Custom metrics metric category, then click a metric, such as Token Count.
    - Click Apply.
  - A new chart displaying your custom metric appears in the dashboard.
- You can further adjust the layout of the dashboard. For example:
  - Move a widget by holding the widget title and dragging it to another location on the same dashboard.
  - Resize a widget by holding the widget's bottom right corner and adjusting its size.

See Add charts and tables to a custom dashboard for more details on adding metric charts using Prometheus Query Language (PromQL), as well as tabulating your metrics.
If you have configured custom alerts, see Display alerting policies and alerts on a dashboard to add such alerts to your dashboard.

