This document describes how to view telemetry for the AI resources used by your App Hub-registered applications, services, and workloads.
To generate metrics such as error rate, latency, and token usage, Application Monitoring queries your trace data for application-specific labels and events that follow the OpenTelemetry GenAI semantic conventions. These metrics quantify the health, performance, and cost of your AI resources, and they are available as aggregated data for applications or as granular data for individual services and workloads.
The following dashboard shows AI resource information for a registered application:

Before you begin
The procedures in this document require a Google Cloud project with active AI resources to analyze. They also require that your AI resources are associated with applications, services, and workloads that are registered with App Hub. Application Monitoring needs telemetry and trace data to produce meaningful results.
Configure roles, APIs, and set up Application Monitoring
- Complete the steps defined in Investigate applications, services, and workloads: Before you begin.
- Enable the Observability, Cloud Trace, and Telemetry APIs.

  To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
- To get the permissions that you need to view AI resource usage information, ask your administrator to grant you the following IAM roles:
  - Observability Viewer (roles/observability.viewer) on your project
  - Observability View Accessor (roles/observability.viewaccessor) on the observability views that you want to query. You can restrict this grant to a specific view.

  For more information about granting roles, see Manage access to projects, folders, and organizations. You might also be able to get the required permissions through custom roles or other predefined roles.
Develop and register applications, services, and workloads
To display data for AI resources that your applications, services, and workloads use, your trace data must contain application-specific labels and events that follow the OpenTelemetry GenAI semantic conventions. You can get these labels by completing the following steps:
- Register your application and its services and workloads with App Hub.
- Use the Agent Development Kit (ADK) framework or instrument your application with OpenTelemetry and send your trace data to the Telemetry API. For instrumentation samples, see Instrument ADK applications with OpenTelemetry and Overview of collector-based instrumentation samples.
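The exact instrumentation depends on your framework, but the following minimal sketch shows the kind of span attributes that the GenAI semantic conventions define and that Application Monitoring looks for. The attribute values, model name, and token counts here are hypothetical:

```python
# Hypothetical span attributes following the OpenTelemetry GenAI
# semantic conventions. Application Monitoring derives metrics such
# as token usage from attributes like these on your trace spans.
genai_span_attributes = {
    "gen_ai.operation.name": "chat",          # the GenAI operation performed
    "gen_ai.request.model": "example-model",  # illustrative model name
    "gen_ai.usage.input_tokens": 421,         # prompt tokens consumed
    "gen_ai.usage.output_tokens": 88,         # completion tokens produced
}

# Total token count for the call, as a dashboard might aggregate it.
total_tokens = (
    genai_span_attributes["gen_ai.usage.input_tokens"]
    + genai_span_attributes["gen_ai.usage.output_tokens"]
)
print(total_tokens)  # 509
```

If you instrument manually rather than through ADK, set attributes like these on the spans you export to the Telemetry API.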
View AI resources for an application
To view AI resources associated with an application, do the following:
- In the Google Cloud console, go to the Application monitoring page:

  If you use the search bar to find this page, then select the result whose subheading is Monitoring.
- In the toolbar of the Google Cloud console, select your App Hub host project or management project.
- Select the Applications tab, and then select the application from the list.
- Select the Dashboards tab.
- In the dashboard's Table of contents, select AI resources.

  The system creates the AI resources entry only when your application has at least one active AI agent associated with it. If you don't have any agents, or if all of your agents are inactive, then the entry isn't listed.

  The dashboard goes to the AI resources section, which displays information such as the following:
- Total queries per second and token count.
- Average error rate, latency, and tool-call error rate.
- Token usage.
- Error rates and latency for agents.
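To make the relationship between traces and these aggregates concrete, here is a minimal sketch of how values like error rate, average latency, and token usage could be computed from span-level data. The span records and field names are illustrative, not the dashboard's actual schema:

```python
# Illustrative span records; field names are hypothetical.
spans = [
    {"error": False, "latency_ms": 120, "tokens": 350},
    {"error": True,  "latency_ms": 900, "tokens": 410},
    {"error": False, "latency_ms": 150, "tokens": 280},
    {"error": False, "latency_ms": 130, "tokens": 300},
]

# Fraction of calls that failed, mean latency, and total token usage.
error_rate = sum(s["error"] for s in spans) / len(spans)
avg_latency_ms = sum(s["latency_ms"] for s in spans) / len(spans)
total_tokens = sum(s["tokens"] for s in spans)

print(error_rate, avg_latency_ms, total_tokens)  # 0.25 325.0 1340
```

The dashboard computes its aggregates from your trace data with SQL queries rather than in application code, but the arithmetic is the same.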
View AI resources for a service or workload
To view AI resources associated with a service or workload, do the following:
- Open the dashboard for the service or workload:
  - In the Google Cloud console, go to the Application monitoring page:

    If you use the search bar to find this page, then select the result whose subheading is Monitoring.
  - Select the Services and Workloads tab, and then select the service or workload.

    The dashboard for the service or workload opens. The Table of contents lists the sections in the dashboard, which might depend on the type of AI resource.
- To go to the section of the dashboard with information about your AI resources, use the dashboard's Table of contents:
  - Agent: Available for agents. This section displays information about sessions, agent invocations, and token usage.
  - Tools: Available for agents. This section displays information about tool calls, including error rate, call count, and P95 latency.
  - Models: Available for some agents. This section displays information about the number of model calls made by the agent, error rate, and token usage.
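The P95 latency reported for tool calls is the latency at or below which 95% of calls complete. The nearest-rank method below is one common way to define it; the latency values are hypothetical:

```python
import math

# Hypothetical tool-call latencies in milliseconds; one slow outlier.
latencies_ms = [40, 55, 40, 300, 40, 60, 45, 50, 42, 48,
                44, 41, 43, 47, 46, 49, 51, 39, 38, 52]

def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least
    pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

p95_ms = percentile(latencies_ms, 95)
print(p95_ms)  # 60
```

Note that the P95 (60 ms here) is far less sensitive to the single 300 ms outlier than the maximum would be, which is why tail percentiles are the conventional way to summarize call latency.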
Explore telemetry
SQL queries against your trace data determine the data displayed on AI-resource charts. These queries filter trace data by application-specific labels and generative AI events that follow the OpenTelemetry GenAI semantic conventions.
To view the query for a chart, in the toolbar of the chart, select more_vert More chart options, and then select Explore in Observability Analytics.
The Observability Analytics page opens and displays the SQL query that generates the data for the chart. You have the following options:
- Inspect the query and then return to Application Monitoring.
- Run the query.
- Modify the query and then run the modified query.
- Create a chart that displays the query result.
- Save a chart that displays the query result to a custom dashboard.
To learn more, see the following documents:

