Before you begin
This tutorial assumes that you have read and followed the instructions in:
- Develop a LangChain agent: to develop `agent` as an instance of `LangchainAgent`.
- User authentication: to authenticate as a user for querying the agent.
- Import and initialize the SDK: to initialize the client for getting a deployed instance (if needed).
Get an instance of an agent
To query a `LangchainAgent`, you need to first create a new instance or get an existing instance.
To get the `LangchainAgent` corresponding to a specific resource ID:
Vertex AI SDK for Python
Run the following code:
```python
import vertexai

client = vertexai.Client(  # For service interactions via client.agent_engines
    project="PROJECT_ID",
    location="LOCATION",
)

agent = client.agent_engines.get(
    name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID"
)

print(agent)
```
where
- `PROJECT_ID` is the Google Cloud project ID under which you develop and deploy agents,
- `LOCATION` is one of the supported regions, and
- `RESOURCE_ID` is the ID of the deployed agent as a `reasoningEngine` resource.
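For illustration, a fully substituted resource name might look like the following (the project, region, and resource ID here are hypothetical placeholders, not real resources):

```python
# Hypothetical example values; substitute your own project, region, and ID.
agent = client.agent_engines.get(
    name="projects/my-project/locations/us-central1/reasoningEngines/1234567890123456789"
)
```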
Python requests library
Run the following code:
```python
from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests

def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token

response = requests.get(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
)
```
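The snippet above stores the raw HTTP response without inspecting it. A minimal follow-up to check for errors and view the returned resource (standard `requests` usage, not specific to Vertex AI):

```python
response.raise_for_status()  # Raise an exception if the request failed.
print(response.json())       # The deployed reasoningEngine resource as a dict.
```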
 
 
REST API
```sh
curl \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID
```
When using the Vertex AI SDK for Python, the `agent` object corresponds to an `AgentEngine` class that contains the following:
- an `agent.api_resource` with information about the deployed agent. You can also call `agent.operation_schemas()` to return the list of operations that the agent supports. See Supported operations for details.
- an `agent.api_client` that allows for synchronous service interactions.
- an `agent.async_api_client` that allows for asynchronous service interactions.
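For example, you can list the supported operations directly (a minimal sketch, assuming each entry returned by `agent.operation_schemas()` is a dict with a `name` key):

```python
# Print the name of each operation the deployed agent exposes.
for schema in agent.operation_schemas():
    print(schema.get("name"))
```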
The rest of this section assumes that you have an `AgentEngine` instance, named `agent`.
Supported operations
The following operations are supported:
- `query`: for getting a response to a query synchronously.
- `stream_query`: for streaming a response to a query.

Both the `query` and `stream_query` methods support the same type of arguments:
- `input`: the messages to be sent to the agent.
- `config`: the configuration (if applicable) for the context of the query.
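To illustrate the difference between the two (a sketch, assuming the `agent` instance from above):

```python
# Synchronous: blocks until the complete response is available.
response = agent.query(input="What is the exchange rate from US dollars to SEK today?")

# Streaming: yields chunks of the response as they are produced.
for chunk in agent.stream_query(input="What is the exchange rate from US dollars to SEK today?"):
    print(chunk)
```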
Query the agent
The command:
```python
agent.query(input="What is the exchange rate from US dollars to SEK today?")
```
is equivalent to the following (in full form):
```python
agent.query(input={
    "input": [  # The input is represented as a list of messages (each message as a dict)
        {
            # The role (e.g. "system", "user", "assistant", "tool")
            "role": "user",
            # The type (e.g. "text", "tool_use", "image_url", "media")
            "type": "text",
            # The rest of the message (this varies based on the type)
            "text": "What is the exchange rate from US dollars to Swedish currency?",
        },
    ]
})
```
Roles are used to help the model distinguish between different types of messages when responding. When the `role` is omitted in the input, it defaults to `"user"`.
| Role | Description |
|---|---|
| `system` | Used to tell the chat model how to behave and provide additional context. Not supported by all chat model providers. |
| `user` | Represents input from a user interacting with the model, usually in the form of text or other interactive input. |
| `assistant` | Represents a response from the model, which can include text or a request to invoke tools. |
| `tool` | A message used to pass the results of a tool invocation back to the model after external data or processing has been retrieved. |
The `type` of the message will also determine how the rest of the message is interpreted (see Handle multi-modal content).
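For example, roles can be combined to pass system instructions alongside the user message (a sketch; as noted in the table, `system` messages are not supported by all model providers):

```python
agent.query(input={"input": [
    {"role": "system", "type": "text", "text": "Answer in one short sentence."},
    {"role": "user", "type": "text", "text": "What is the exchange rate from US dollars to Swedish currency?"},
]})
```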
Query the agent with multi-modal content
To illustrate how to pass multi-modal input to an agent, we will use the following agent, which forwards the input to the model and does not use any tools:
```python
from vertexai import agent_engines  # Import assumed from the earlier tutorial steps.

agent = agent_engines.LangchainAgent(
    model="gemini-2.0-flash",
    runnable_builder=lambda model, **kwargs: model,
)
```
Multimodal messages are represented through content blocks that specify a `type` and corresponding data. In general, for multimodal content, you would specify the `type` to be `"media"`, the `file_uri` to point to a Cloud Storage URI, and the `mime_type` for interpreting the file.
Image
```python
agent.query(input={"input": [
    {"type": "text", "text": "Describe the attached media in 5 words!"},
    {"type": "media", "mime_type": "image/jpeg", "file_uri": "gs://cloud-samples-data/generative-ai/image/cricket.jpeg"},
]})
```
Video
```python
agent.query(input={"input": [
    {"type": "text", "text": "Describe the attached media in 5 words!"},
    {"type": "media", "mime_type": "video/mp4", "file_uri": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4"},
]})
```
Audio
```python
agent.query(input={"input": [
    {"type": "text", "text": "Describe the attached media in 5 words!"},
    {"type": "media", "mime_type": "audio/mp3", "file_uri": "gs://cloud-samples-data/generative-ai/audio/pixel.mp3"},
]})
```
For the list of MIME types supported by Gemini, visit the Gemini documentation.
Query the agent with a runnable configuration
When querying the agent, you can also specify a `config` for the agent (which follows the schema of a `RunnableConfig`).
Two common scenarios are:
- Default configuration parameters:
  - `run_id` / `run_name`: identifier for the run.
  - `tags` / `metadata`: classifier for the run when tracing with OpenTelemetry.
- Custom configuration parameters (via `configurable`):
  - `session_id`: the session under which the run is happening (see Store chat history).
  - `thread_id`: the thread under which the run is happening (see Store Checkpoints).
As an example:
```python
import uuid

run_id = uuid.uuid4()  # Generate an ID for tracking the run later.

response = agent.query(
    input="What is the exchange rate from US dollars to Swedish currency?",
    config={  # Specify the RunnableConfig here.
        "run_id": run_id,                              # Optional.
        "tags": ["config-tag"],                        # Optional.
        "metadata": {"config-key": "config-value"},    # Optional.
        "configurable": {"session_id": "SESSION_ID"},  # Optional.
    },
)

print(response)
```
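Because `query` and `stream_query` accept the same arguments, the same `config` can also be passed when streaming (a sketch, generating a fresh `run_id` for the new run):

```python
for chunk in agent.stream_query(
    input="What is the exchange rate from US dollars to Swedish currency?",
    config={"run_id": uuid.uuid4(), "tags": ["config-tag"]},  # Optional.
):
    print(chunk)
```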
 
 

