Log requests and responses

Vertex AI can log samples of requests and responses for Gemini and supported partner models. The logs are saved to a BigQuery table for viewing and analysis. This page describes how to configure request-response logs for base foundation models and fine-tuned models.

Supported API methods for logging

Request-response logs are supported for all Gemini models that use generateContent or streamGenerateContent.

The following partner models that use rawPredict or streamRawPredict are also supported:

  • Anthropic Claude
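For example, once logging is enabled (as described in the sections that follow), an ordinary generateContent call like the following is eligible to be sampled into the log. This is a minimal sketch using the Vertex AI Python SDK; the project, location, model, and prompt are illustrative:

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="PROJECT_ID", location="us-central1")

model = GenerativeModel("gemini-2.0-flash-001")
# If request-response logging is enabled for this model, this call's
# request and response may be sampled into the configured BigQuery table.
response = model.generate_content("Explain request-response logging in one sentence.")
print(response.text)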

Request-response logs for base foundation models

You can configure request-response logs for base foundation models by using the REST API or Python SDK. Logging configurations can take a few minutes to take effect.

Enable request-response logging

Select one of the following tabs for instructions on enabling request-response logs for a base foundation model.

For Anthropic models, only the REST API is supported for logging configuration. To enable logging through the REST API, set publisher to anthropic and set the model name to one of the supported Claude models.

Python SDK

Use this method to create or update a PublisherModelConfig.

import vertexai
# Depending on your SDK version, GenerativeModel may be under
# vertexai.generative_models or vertexai.preview.generative_models.
from vertexai.generative_models import GenerativeModel

vertexai.init(project="PROJECT_ID", location="LOCATION")

publisher_model = GenerativeModel('gemini-2.0-pro-001')

# Set logging configuration
publisher_model.set_request_response_logging_config(
    enabled=True,
    sampling_rate=1.0,
    bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
    enable_otel_logging=True
)

REST API

Create or update a PublisherModelConfig using setPublisherModelConfig:

Before using any of the request data, make the following replacements:

  • ENDPOINT_PREFIX: The region of the model resource followed by -. For example, us-central1-. If you're using the global endpoint, leave blank. Request-response logging is supported in all regions supported by the model.
  • PROJECT_ID: Your project ID.
  • LOCATION: The region of the model resource. If you're using the global endpoint, enter global.
  • PUBLISHER: The publisher name. For example, google.
  • MODEL: The foundation model name. For example, gemini-2.0-flash-001.
  • SAMPLING_RATE: To reduce storage costs, you can set a number between 0 and 1 to define the fraction of requests to log. For example, a value of 1 logs all requests, and a value of 0.1 logs 10% of requests.
  • BQ_URI: The BigQuery table to use for logging. If you specify only a project name, a new dataset is created with the name logging_ENDPOINT_DISPLAY_NAME_ENDPOINT_ID, where ENDPOINT_DISPLAY_NAME follows the BigQuery naming rules. If you don't specify a table name, a new table is created with the name request_response_logging.

HTTP method and URL:

POST https://ENDPOINT_PREFIXaiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig

Request JSON body:

{
  "publisherModelConfig": {
    "loggingConfig": {
      "enabled": true,
      "samplingRate": SAMPLING_RATE,
      "bigqueryDestination": {
        "outputUri": "BQ_URI"
      },
      "enableOtelLogging": true
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https:// ENDPOINT_PREFIX aiplatform.googleapis.com/v1beta1/projects/ PROJECT_ID /locations/ LOCATION /publishers/ PUBLISHER /models/ MODEL :setPublisherModelConfig"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https:// ENDPOINT_PREFIX aiplatform.googleapis.com/v1beta1/projects/ PROJECT_ID /locations/ LOCATION /publishers/ PUBLISHER /models/ MODEL :setPublisherModelConfig" | Select-Object -Expand Content

You should receive a successful status code and a JSON response that echoes the logging configuration you set.
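If you're scripting this configuration, the same REST call can also be made from Python. The following is a minimal sketch, assuming the google-auth and requests libraries and Application Default Credentials; the placeholder values are illustrative:

import google.auth
import google.auth.transport.requests
import requests

# Obtain an access token from Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

location = "us-central1"          # region of the model resource
endpoint_prefix = f"{location}-"  # "" for the global endpoint
url = (
    f"https://{endpoint_prefix}aiplatform.googleapis.com/v1beta1/"
    f"projects/PROJECT_ID/locations/{location}/"
    f"publishers/google/models/gemini-2.0-flash-001:setPublisherModelConfig"
)
body = {
    "publisherModelConfig": {
        "loggingConfig": {
            "enabled": True,
            "samplingRate": 1.0,
            "bigqueryDestination": {
                "outputUri": "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME"
            },
            "enableOtelLogging": True,
        }
    }
}
response = requests.post(
    url, json=body,
    headers={"Authorization": f"Bearer {credentials.token}"},
)
response.raise_for_status()
print(response.json())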

Get logging configuration

Get the request-response logging configuration for the foundation model by using the REST API.

REST API

Get the request-response logging configuration using fetchPublisherModelConfig:

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The location of the model resource.
  • PUBLISHER: The publisher name. For example, google.
  • MODEL: The foundation model name. For example, gemini-2.0-flash-001.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https:// LOCATION -aiplatform.googleapis.com/v1beta1/projects/ PROJECT_ID /locations/ LOCATION /publishers/ PUBLISHER /models/ MODEL :fetchPublisherModelConfig"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https:// LOCATION -aiplatform.googleapis.com/v1beta1/projects/ PROJECT_ID /locations/ LOCATION /publishers/ PUBLISHER /models/ MODEL :fetchPublisherModelConfig" | Select-Object -Expand Content

You should receive a JSON response containing the model's current logging configuration.
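The fetch can also be scripted from Python, under the same google-auth and requests assumptions as the sketch above; the placeholder values are illustrative:

import google.auth
import google.auth.transport.requests
import requests

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

url = (
    "https://us-central1-aiplatform.googleapis.com/v1beta1/"
    "projects/PROJECT_ID/locations/us-central1/"
    "publishers/google/models/gemini-2.0-flash-001:fetchPublisherModelConfig"
)
response = requests.get(
    url, headers={"Authorization": f"Bearer {credentials.token}"}
)
response.raise_for_status()
print(response.json())  # the model's current loggingConfig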

Disable logging

Disable request-response logging for the foundation model by using the REST API or Python SDK.

Python SDK

# publisher_model is the GenerativeModel configured above
publisher_model.set_request_response_logging_config(
    enabled=False,
    sampling_rate=0,
    bigquery_destination=''
)

REST API

Use setPublisherModelConfig to disable logging:

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The location of the model resource.
  • PUBLISHER: The publisher name. For example, google.
  • MODEL: The foundation model name. For example, gemini-2.0-flash-001.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig

Request JSON body:

{
  "publisherModelConfig": {
    "loggingConfig": {
      "enabled": false
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https:// LOCATION -aiplatform.googleapis.com/v1beta1/projects/ PROJECT_ID /locations/ LOCATION /publishers/ PUBLISHER /models/ MODEL :setPublisherModelConfig"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https:// LOCATION -aiplatform.googleapis.com/v1beta1/projects/ PROJECT_ID /locations/ LOCATION /publishers/ PUBLISHER /models/ MODEL :setPublisherModelConfig" | Select-Object -Expand Content

You should receive a successful status code and a JSON response confirming that logging is disabled.

Request-response logs for fine-tuned models

You can configure request-response logs for fine-tuned models by using the REST API or Python SDK.

Enable request-response logs

Select one of the following tabs for instructions on enabling request-response logs for a fine-tuned model.

Python SDK

Use this method to update the request-response logging configuration for an endpoint.

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="PROJECT_ID", location="REGION")

tuned_model = GenerativeModel(
    "projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID"
)

# Set logging configuration
tuned_model.set_request_response_logging_config(
    enabled=True,
    sampling_rate=1.0,
    bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
    enable_otel_logging=True
)

REST API

You can only enable request-response logging when you create an endpoint using projects.locations.endpoints.create or patch an existing endpoint using projects.locations.endpoints.patch.

Requests and responses are logged at the endpoint level, so requests sent to any deployed models under the same endpoint are logged.

When you create or patch an endpoint, populate the predictRequestResponseLoggingConfig field of the Endpoint resource with the following entries:

  • enabled: Set to true to enable request-response logging.

  • samplingRate: To reduce storage costs, you can set a number between 0 and 1 to define the fraction of requests to log. For example, a value of 1 logs all requests, and a value of 0.1 logs 10% of requests.

  • bigqueryDestination: The BigQuery table to use for logging. If you specify only a project name, a new dataset is created with the name logging_ENDPOINT_DISPLAY_NAME_ENDPOINT_ID, where ENDPOINT_DISPLAY_NAME follows the BigQuery naming rules. If you don't specify a table name, a new table is created with the name request_response_logging.

  • enableOtelLogging: Set to true to enable OpenTelemetry (OTEL) logging in addition to the default request-response logging.

To view the BigQuery table schema, see Logging table schema.

The following is an example configuration:

{
  "predictRequestResponseLoggingConfig": {
    "enabled": true,
    "samplingRate": 0.5,
    "bigqueryDestination": {
      "outputUri": "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME"
    },
    "enableOtelLogging": true
  }
}
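If you're patching the endpoint programmatically rather than through raw REST, one option is the low-level aiplatform_v1beta1 client. The following is a sketch under the assumption that you're updating an existing endpoint; the endpoint name and values are illustrative:

from google.cloud import aiplatform_v1beta1 as aip
from google.protobuf import field_mask_pb2

client = aip.EndpointServiceClient(
    client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
)

# Build an Endpoint message carrying only the logging config, then patch
# just that field with an update mask.
endpoint = aip.Endpoint(
    name="projects/PROJECT_ID/locations/us-central1/endpoints/ENDPOINT_ID",
    predict_request_response_logging_config=aip.PredictRequestResponseLoggingConfig(
        enabled=True,
        sampling_rate=0.5,
        bigquery_destination=aip.BigQueryDestination(
            output_uri="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME"
        ),
    ),
)
updated = client.update_endpoint(
    endpoint=endpoint,
    update_mask=field_mask_pb2.FieldMask(
        paths=["predict_request_response_logging_config"]
    ),
)
print(updated.predict_request_response_logging_config)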

Get logging configuration

Get the request-response logging configuration for the fine-tuned model by using the REST API.

REST API

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The location of the endpoint resource.
  • ENDPOINT_ID: The ID of the endpoint.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https:// LOCATION -aiplatform.googleapis.com/v1beta1/projects/ PROJECT_ID /locations/ LOCATION /endpoints/ ENDPOINT_ID "

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https:// LOCATION -aiplatform.googleapis.com/v1beta1/projects/ PROJECT_ID /locations/ LOCATION /endpoints/ ENDPOINT_ID " | Select-Object -Expand Content

You should receive a JSON response containing the Endpoint resource, including its predictRequestResponseLoggingConfig.

Disable logging configuration

Disable the request-response logging configuration for the endpoint.

Python SDK

tuned_model = GenerativeModel(
    "projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID"
)

# Disable logging
tuned_model.set_request_response_logging_config(
    enabled=False,
    sampling_rate=1.0,
    bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
    enable_otel_logging=False
)
 

REST API

Patch the endpoint with enabled set to false:

{
  "predictRequestResponseLoggingConfig": {
    "enabled": false
  }
}

Logging table schema

In BigQuery, the logs are recorded using the following schema:

Field name | Type | Notes
endpoint | STRING | Resource name of the endpoint to which the tuned model is deployed.
deployed_model_id | STRING | Deployed model ID for a tuned model deployed to an endpoint.
logging_time | TIMESTAMP | The time that logging is performed. This is approximately the time that the response is returned.
request_id | NUMERIC | The auto-generated integer request ID based on the API request.
request_payload | STRING | Included for partner model logging and for backward compatibility with the Vertex AI endpoint request-response log.
response_payload | STRING | Included for partner model logging and for backward compatibility with the Vertex AI endpoint request-response log.
model | STRING | Model resource name.
model_version | STRING | The model version. This is often "default" for Gemini models.
api_method | STRING | One of generateContent, streamGenerateContent, rawPredict, or streamRawPredict.
full_request | JSON | The full GenerateContentRequest.
full_response | JSON | The full GenerateContentResponse.
metadata | JSON | Any metadata of the call, including the request latency.
otel_log | JSON | Logs in OpenTelemetry schema format. Only populated if enableOtelLogging is set in the logging configuration.

Note that request-response pairs larger than the BigQuery Write API's 10 MB row limit are not recorded.
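Once logs are flowing, you can inspect them like any other BigQuery table. The following is a minimal sketch using the BigQuery client library, assuming the default table name from above; the project and dataset names are illustrative:

from google.cloud import bigquery

client = bigquery.Client(project="PROJECT_ID")
query = """
    SELECT logging_time, api_method, model, metadata
    FROM `PROJECT_ID.DATASET_NAME.request_response_logging`
    WHERE logging_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
    ORDER BY logging_time DESC
    LIMIT 100
"""
for row in client.query(query).result():
    print(row.logging_time, row.api_method, row.model)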
