Before you begin
This tutorial assumes that you have read and followed the instructions in:
- Develop a LlamaIndexQueryPipeline agent: to develop agent as an instance of LlamaIndexQueryPipelineAgent.
- User authentication: to authenticate as a user for querying the agent.
- Import and initialize the SDK: to initialize the client for getting a deployed instance (if needed).
Get an instance of an agent
To query a LlamaIndexQueryPipelineAgent, you first need to create a new instance or get an existing instance.

To get the LlamaIndexQueryPipelineAgent corresponding to a specific resource ID:
Vertex AI SDK for Python
Run the following code:
```python
import vertexai

client = vertexai.Client(  # For service interactions via client.agent_engines
    project="PROJECT_ID",
    location="LOCATION",
)

agent = client.agent_engines.get(
    name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID"
)

print(agent)
```
where:
- PROJECT_ID is the Google Cloud project ID under which you develop and deploy agents,
- LOCATION is one of the supported regions, and
- RESOURCE_ID is the ID of the deployed agent as a reasoningEngine resource.
Python requests library
Run the following code:
```python
from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests


def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token


response = requests.get(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
)
```
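The sample stores the HTTP result in response without inspecting it. As a minimal follow-up sketch (not part of the original sample, using only the standard requests API), you can print the returned reasoningEngine resource:

```python
response.raise_for_status()  # Raise an exception if the request failed.
print(response.json())       # The reasoningEngine resource as a Python dict.
```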
REST API
```sh
curl \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID
```
When using the Vertex AI SDK for Python, the agent object corresponds to an AgentEngine class that contains the following:
- an agent.api_resource with information about the deployed agent. You can also call agent.operation_schemas() to return the list of operations that the agent supports (see Supported operations for details, and the sketch after this list).
- an agent.api_client that allows for synchronous service interactions.
- an agent.async_api_client that allows for asynchronous service interactions.
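For example, a minimal sketch (assuming agent is the instance retrieved above) that prints the schema of each supported operation:

```python
# Each schema describes one operation the deployed agent supports,
# e.g. "query" for a LlamaIndexQueryPipelineAgent.
for schema in agent.operation_schemas():
    print(schema)
```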
The rest of this section assumes that you have an AgentEngine instance named agent.
Supported operations
The following operations are supported for LlamaIndexQueryPipelineAgent:
- query: for getting a response to a query synchronously.

The query method supports the following type of argument:
- input: the messages to be sent to the agent.
Query the agent
The command:

```python
agent.query(input="What is Paul Graham's life in college?")
```

is equivalent to the following (in full form):

```python
agent.query(input={"input": "What is Paul Graham's life in college?"})
```
To customize the input dictionary, see Customize the prompt template .
You can also customize the agent's behavior beyond input by passing additional keyword arguments to query():
```python
response = agent.query(
    input={
        "input": [
            "What is Paul Graham's life in college?",
            "How did Paul Graham's college experience shape his career?",
            "How did Paul Graham's college experience shape his entrepreneurial mindset?",
        ],
    },
    batch=True,  # Run the pipeline in batch mode and pass a list of inputs.
)
print(response)
```
See the QueryPipeline.run code for a complete list of available parameters.

