Before you begin
This tutorial assumes that you have read and followed the instructions in:
- Develop a LlamaIndexQueryPipeline agent: to develop agent as an instance of LlamaIndexQueryPipelineAgent.
- User authentication: to authenticate as a user for querying the agent.
- Import and initialize the SDK: to initialize the client for getting a deployed instance (if needed).
Get an instance of an agent
To query a LlamaIndexQueryPipelineAgent, you first need to create a new instance or get an existing instance.

To get the LlamaIndexQueryPipelineAgent corresponding to a specific resource ID:
Vertex AI SDK for Python
Run the following code:
```python
import vertexai

client = vertexai.Client(  # For service interactions via client.agent_engines
    project="PROJECT_ID",
    location="LOCATION",
)

agent = client.agent_engines.get(
    name="projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID"
)

print(agent)
```
where:
- PROJECT_ID is the Google Cloud project ID under which you develop and deploy agents,
- LOCATION is one of the supported regions, and
- RESOURCE_ID is the ID of the deployed agent as a reasoningEngine resource.
Python requests library
Run the following code:
```python
from google import auth as google_auth
from google.auth.transport import requests as google_requests
import requests


def get_identity_token():
    credentials, _ = google_auth.default()
    auth_request = google_requests.Request()
    credentials.refresh(auth_request)
    return credentials.token


response = requests.get(
    f"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID",
    headers={
        "Content-Type": "application/json; charset=utf-8",
        "Authorization": f"Bearer {get_identity_token()}",
    },
)
```
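The sample stores the HTTP result in response without inspecting it. As a minimal follow-up sketch (not part of the original sample, using only the standard requests API), you can print the returned reasoningEngine resource:

```python
response.raise_for_status()  # Raise an exception if the request failed.
print(response.json())       # The reasoningEngine resource as a Python dict.
```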
REST API
```sh
curl \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/reasoningEngines/RESOURCE_ID
```
When using the Vertex AI SDK for Python, the agent object corresponds to an AgentEngine class that contains the following:
- an agent.api_resource with information about the deployed agent. You can also call agent.operation_schemas() to return the list of operations that the agent supports (see Supported operations for details, and the sketch after this list).
- an agent.api_client that allows for synchronous service interactions.
- an agent.async_api_client that allows for asynchronous service interactions.
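For example, a minimal sketch (assuming agent is the instance retrieved above) that prints the schema of each supported operation:

```python
# Each schema describes one operation the deployed agent supports,
# e.g. "query" for a LlamaIndexQueryPipelineAgent.
for schema in agent.operation_schemas():
    print(schema)
```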
The rest of this section assumes that you have an AgentEngine instance named agent.
Supported operations
The following operations are supported for LlamaIndexQueryPipelineAgent:
- query: for getting a response to a query synchronously.

The query method supports the following type of argument:
- input: the messages to be sent to the agent.
Query the agent
The command:

```python
agent.query(input="What is Paul Graham's life in college?")
```

is equivalent to the following (in full form):

```python
agent.query(input={"input": "What is Paul Graham's life in college?"})
```
To customize the input dictionary, see Customize the prompt template .
You can also customize the agent's behavior beyond input by passing additional keyword arguments to query():
```python
response = agent.query(
    input={
        "input": [
            "What is Paul Graham's life in college?",
            "How did Paul Graham's college experience shape his career?",
            "How did Paul Graham's college experience shape his entrepreneurial mindset?",
        ],
    },
    batch=True,  # Run the pipeline in batch mode and pass a list of inputs.
)
print(response)
```
See the QueryPipeline.run code for a complete list of available parameters.

