To deploy an agent on Vertex AI Agent Engine, use the following steps:
- Configure your agent for deployment.
- Create an AgentEngine instance.
- Grant the deployed agent permissions.
- Get the agent resource ID.
You can also use Agent Starter Pack templates for deployment.
Before you begin
Before you deploy an agent, make sure you have completed the prerequisite tasks, such as setting up your environment and developing a local agent.
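In particular, the deployment calls in this guide assume the Vertex AI SDK for Python is installed and initialized with a staging bucket. A minimal sketch (the project ID, location, and bucket name are placeholders you replace with your own values):

import vertexai
from vertexai import agent_engines

vertexai.init(
    project="PROJECT_ID",          # your Google Cloud project ID
    location="LOCATION",           # a supported region, for example "us-central1"
    staging_bucket="gs://BUCKET",  # Cloud Storage bucket used to stage deployment artifacts
)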
Configure your agent for deployment
You can make the following optional configurations:
- Package requirements
- Additional packages
- Environment variables
- Customized resource controls
- Build options
- Cloud Storage folder
- Resource metadata
- Custom service account
- Private Service Connect interface
- Customer-managed encryption keys
Define the package requirements
Provide the set of packages required by the agent for deployment. The set of packages can either be a list of items to be installed by pip, or the path to a file that follows the Requirements File Format. Use the following best practices:
- Pin your package versions for reproducible builds. Common packages to keep track of include the following: google-cloud-aiplatform, cloudpickle, langchain, langchain-core, langchain-google-vertexai, and pydantic.
- Minimize the number of dependencies in your agent. This reduces the number of breaking changes when updating your dependencies and agent.
If the agent doesn't have any dependencies, you can set requirements to None:

requirements = None
If the agent uses a framework-specific template, you should specify the SDK version that is imported (such as 1.77.0) when developing the agent.
ADK
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
requirements = [
    "google-cloud-aiplatform[agent_engines,adk]",
    # any other dependencies
]
LangChain
requirements = [
    "google-cloud-aiplatform[agent_engines,langchain]",
    # any other dependencies
]
LangGraph
requirements = [
    "google-cloud-aiplatform[agent_engines,langgraph]",
    # any other dependencies
]
AG2
requirements = [
    "google-cloud-aiplatform[agent_engines,ag2]",
    # any other dependencies
]
LlamaIndex
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
The following instructions are for LlamaIndex Query Pipeline:
requirements = [
    "google-cloud-aiplatform[agent_engines,llama_index]",
    # any other dependencies
]
You can also do the following with package requirements:
- Upper-bound or pin the version of a given package (such as google-cloud-aiplatform):

  requirements = [
      # See https://pypi.org/project/google-cloud-aiplatform for the latest version.
      "google-cloud-aiplatform[agent_engines,adk]==1.88.0",
  ]

- Add additional packages and constraints:

  requirements = [
      "google-cloud-aiplatform[agent_engines,adk]==1.88.0",
      "cloudpickle==3.0",  # new
  ]

- Point to the version of a package on a GitHub branch or pull request:

  requirements = [
      "google-cloud-aiplatform[agent_engines,adk] @ git+https://github.com/googleapis/python-aiplatform.git@BRANCH_NAME",  # new
      "cloudpickle==3.0",
  ]

- Maintain the list of requirements in a file (such as path/to/requirements.txt):

  requirements = "path/to/requirements.txt"

  where path/to/requirements.txt is a text file that follows the Requirements File Format. For example:

  google-cloud-aiplatform[agent_engines,adk]
  cloudpickle==3.0
Define additional packages
You can include local files or directories that contain local required Python source files. Compared to package requirements, this lets you use private utilities you have developed that aren't otherwise available on PyPI or GitHub.
If the agent does not require any extra packages, you can set extra_packages to None:

extra_packages = None
You can also do the following with extra_packages:

- Include a single file (such as agents/agent.py):

  extra_packages = ["agents/agent.py"]

- Include the set of files in an entire directory (for example, agents/):

  extra_packages = ["agents"]  # directory that includes agents/agent.py

- Specify Python wheel binaries (for example, path/to/python_package.whl):

  requirements = [
      "google-cloud-aiplatform[agent_engines,adk]",
      "cloudpickle==3.0",
      "python_package.whl",  # install from the whl file that was uploaded
  ]

  extra_packages = ["path/to/python_package.whl"]  # bundle the whl file for uploading
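As an illustration of bundling a private utility, the following sketch uses hypothetical names (agents/utils.py and format_rate are not part of the SDK) and assumes the bundled agents/ directory is importable by the deployed agent:

# Local layout (hypothetical):
#   agents/
#   ├── __init__.py
#   ├── agent.py      # defines local_agent
#   └── utils.py      # private helper, not published to PyPI
#
# agents/agent.py can then use the private helper:
from agents.utils import format_rate  # hypothetical helper

# and the whole directory is bundled at deployment time:
extra_packages = ["agents"]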
Define environment variables
If there are environment variables that your agent depends on, you can specify them in the env_vars= argument. If the agent does not depend on any environment variables, you can set it to None:

env_vars = None
You can specify the environment variables in a few different ways:
Dictionary

env_vars = {
    "VARIABLE_1": "VALUE_1",
    "VARIABLE_2": "VALUE_2",
}

# These environment variables will become available in Vertex AI Agent Engine
# through `os.environ`, e.g.
#
# import os
# os.environ["VARIABLE_1"]  # will have the value "VALUE_1"
#
# and
#
# os.environ["VARIABLE_2"]  # will have the value "VALUE_2"
#
To reference a secret in Secret Manager and have it be available as an environment variable (for example, CLOUD_SQL_CREDENTIALS_SECRET), first follow the instructions to Create a secret for CLOUD_SQL_CREDENTIALS_SECRET in your project, before specifying the environment variables as:

env_vars = {
    # ... (other environment variables and their values)
    "CLOUD_SQL_CREDENTIALS_SECRET": {"secret": "SECRET_ID", "version": "SECRET_VERSION_ID"},
}

where

- SECRET_ID is the ID of the secret.
- SECRET_VERSION_ID is the ID of the secret version.
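If you haven't created the secret yet, the following is a minimal sketch using the google-cloud-secret-manager client; the project ID and payload are placeholders, and automatic replication is assumed:

from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()
parent = "projects/PROJECT_ID"  # placeholder project ID

# Create the secret container.
secret = client.create_secret(
    request={
        "parent": parent,
        "secret_id": "CLOUD_SQL_CREDENTIALS_SECRET",
        "secret": {"replication": {"automatic": {}}},
    }
)

# Add a version holding the actual payload (stored as bytes).
client.add_secret_version(
    request={
        "parent": secret.name,
        "payload": {"data": b'{"user": "db-user", "password": "db-password"}'},
    }
)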
In your agent code, you can then reference the secret like so:

secret = os.environ.get("CLOUD_SQL_CREDENTIALS_SECRET")
if secret:
    # Secrets are stored as strings, so use json.loads to parse JSON payloads.
    return json.loads(secret)
List
env_vars = ["VARIABLE_1", "VARIABLE_2"]

# This corresponds to the following code snippet:
#
# import os
#
# env_vars = {
#     "VARIABLE_1": os.environ["VARIABLE_1"],
#     "VARIABLE_2": os.environ["VARIABLE_2"],
# }
Define customized resource controls
You can specify runtime resource controls for the agent, such as the minimum and maximum number of application instances, resource limits for each container, and concurrency for each container.
- min_instances: The minimum number of application instances to keep running at all times, with a range of [0, 10]. The default value is 1.
- max_instances: The maximum number of application instances that can be launched to handle increased traffic, with a range of [1, 1000]. The default value is 100. If VPC-SC or PSC-I is enabled, the acceptable range is [1, 100].
- resource_limits: Resource limits for each container. Only cpu and memory keys are supported. The default value is {"cpu": "4", "memory": "4Gi"}.
  - The only supported values for cpu are '1', '2', '4', '6', and '8'. For more information, see Configure CPU allocation.
  - The only supported values for memory are '1Gi', '2Gi', ..., '32Gi'.
  - For required CPU on different memory values, see Configure memory limits.
- container_concurrency: Concurrency for each container and agent server. The recommended value is 2 * cpu + 1. The default value is 9.
remote_agent = agent_engines.create(
    local_agent,
    # ... other configs
    min_instances=1,
    max_instances=10,
    resource_limits={"cpu": "4", "memory": "8Gi"},
    container_concurrency=9,
)
Define build options
You can specify build options for the agent, such as installation scripts to run when building the agent's container image. This is useful for installing system dependencies (for example, the gcloud CLI or npx) or other custom setups. The scripts are run with root permissions.
To use installation scripts, create a directory named installation_scripts and place your shell scripts inside the directory:

.
├── ...
└── installation_scripts/
    └── install.sh
Next, specify the installation_scripts directory in extra_packages and the script paths in build_options:

extra_packages = [..., "installation_scripts/install.sh"]
build_options = {"installation_scripts": ["installation_scripts/install.sh"]}
You can use one of the following common installation scripts:
install_npx.sh

#!/bin/bash
# Exit immediately if a command exits with a non-zero status.
set -e

echo "--- Installing System-Wide Node.js v20.x ---"

# 1. Install prerequisites
apt-get update
apt-get install -y ca-certificates curl gnupg

# 2. Add the NodeSource repository GPG key
mkdir -p /etc/apt/keyrings
curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg

# 3. Add the NodeSource repository for Node.js v20
NODE_MAJOR=20
echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_$NODE_MAJOR.x nodistro main" | tee /etc/apt/sources.list.d/nodesource.list

# 4. Update package lists again and install Node.js
apt-get update
apt-get install nodejs -y

echo "--- System-wide Node.js installation complete ---"
echo "Verifying versions:"
# These commands will now work for ANY user because node and npx
# are installed in /usr/bin/ which is in everyone's default PATH.
node -v
npm -v
npx -v
install_uvx.sh

#!/bin/bash
# Exit immediately if a command exits with a non-zero status.
set -e

echo "Starting setup..."

# Install uv
apt-get update
apt-get install -y curl
curl -LsSf https://astral.sh/uv/install.sh | env UV_INSTALL_DIR="/usr/local/bin" sh

# These commands will now work for ANY user because uv and uvx
# are installed in /usr/local/bin/ which is in everyone's default PATH.
uv --version
uvx --version
install_gcloud_cli.sh

#!/bin/bash
# Exit immediately if a command exits with a non-zero status.
set -e

apt-get install -y curl gpg
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | gpg --dearmor -o /usr/share/keyrings/cloud.google.gpg
echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
apt-get update -y && apt-get install google-cloud-cli -y
gcloud --version
Define a Cloud Storage folder
Staging artifacts are overwritten if they correspond to an existing folder in a Cloud Storage bucket. If necessary, you can specify the Cloud Storage folder for the staging artifacts. You can set gcs_dir_name to None if you don't mind potentially overwriting the files in the default folder:

gcs_dir_name = None
To avoid overwriting the files (for example, across different environments such as development, staging, and production), you can set up a corresponding folder and specify the folder to stage the artifact under:

gcs_dir_name = "dev"  # or "staging" or "prod"
If you want or need to avoid collisions, you can generate a random uuid:

import uuid
gcs_dir_name = str(uuid.uuid4())
Configure resource metadata
You can set metadata on the ReasoningEngine resource:

display_name = "Currency Exchange Rate Agent (Staging)"

description = """
An agent that has access to tools for looking up the exchange rate.
If you run into any issues, please contact the dev team.
"""

For a full set of the parameters, see the API reference.
Configure a custom service account
You can configure a custom service account as the identity of your deployed agent, instead of the default identity.

To do so, specify the email of your custom service account as the service_account when creating or updating the Agent Engine instance, for example:

# Create a new instance
agent_engines.create(
    local_agent=<my-agent>,
    service_account="my-custom-service-account@my-project.iam.gserviceaccount.com",
    ...
)

# Update an existing instance
resource_name = "projects/{project_id}/locations/{location}/reasoningEngines/{reasoning_engine_id}"
agent_engines.update(
    resource_name,
    service_account="my-new-custom-service-account@my-project.iam.gserviceaccount.com",
    ...
)
Configure Private Service Connect interface
If you have a Private Service Connect interface and DNS peering set up, you can specify your network attachment and private DNS peering while deploying your agent:
remote_agent = agent_engines.create(
    agent_engine=local_agent,
    psc_interface_config={
        "network_attachment": "NETWORK_ATTACHMENT",
        "dns_peering_configs": [
            {
                "domain": "DOMAIN_SUFFIX",
                "target_project": "TARGET_PROJECT",
                "target_network": "TARGET_NETWORK",
            },
        ],
    },
)
where

- NETWORK_ATTACHMENT is the name or full path of your network attachment.
- DOMAIN_SUFFIX is the DNS name of the private Cloud DNS zone that you created when setting up the private DNS peering.
- TARGET_PROJECT is the project that hosts the VPC network.
- TARGET_NETWORK is the VPC network name.
You can configure multiple agents to use either a single, shared network attachment or unique, dedicated network attachments. To use a shared network attachment, provide the same network attachment in the psc_interface_config for each agent you create.
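For example, a minimal sketch of two agents sharing one network attachment; the attachment and DNS values are placeholders, and local_agent_a and local_agent_b stand for two agents you have developed:

shared_psc_config = {
    "network_attachment": "NETWORK_ATTACHMENT",  # same attachment reused by both agents
    "dns_peering_configs": [
        {
            "domain": "DOMAIN_SUFFIX",
            "target_project": "TARGET_PROJECT",
            "target_network": "TARGET_NETWORK",
        },
    ],
}

agent_a = agent_engines.create(agent_engine=local_agent_a, psc_interface_config=shared_psc_config)
agent_b = agent_engines.create(agent_engine=local_agent_b, psc_interface_config=shared_psc_config)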
Configure customer-managed encryption keys
You can use a custom key to encrypt your agent's data at rest. See Agent Engine Customer-managed encryption keys (CMEK) for more details.
To configure the custom key (CMEK) for your agent, you need to provide the key resource name to the encryption_spec parameter when creating the Agent Engine instance.
# The fully qualified key name
kms_key_name = "projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/KEY_NAME"

remote_agent = agent_engines.create(
    local_agent,
    # ... other parameters
    encryption_spec={"kms_key_name": kms_key_name},
)
Create an AgentEngine instance
To deploy the agent on Vertex AI, use agent_engines.create to pass in the local_agent object along with any optional configurations:
remote_agent = agent_engines.create(
    local_agent,                                   # Optional.
    requirements=requirements,                     # Optional.
    extra_packages=extra_packages,                 # Optional.
    gcs_dir_name=gcs_dir_name,                     # Optional.
    display_name=display_name,                     # Optional.
    description=description,                       # Optional.
    env_vars=env_vars,                             # Optional.
    build_options=build_options,                   # Optional.
    service_account=service_account,               # Optional.
    min_instances=min_instances,                   # Optional.
    max_instances=max_instances,                   # Optional.
    resource_limits=resource_limits,               # Optional.
    container_concurrency=container_concurrency,   # Optional.
    encryption_spec=encryption_spec,               # Optional.
)
Deployment takes a few minutes, during which the following steps happen in the background:

- A bundle of the following artifacts is generated locally:
  - *.pkl: a pickle file corresponding to local_agent.
  - requirements.txt: a text file containing the package requirements.
  - dependencies.tar.gz: a tar file containing any extra packages.
- The bundle is uploaded to Cloud Storage (under the corresponding folder) for staging the artifacts.
- The Cloud Storage URIs for the respective artifacts are specified in the PackageSpec.
- The Vertex AI Agent Engine service receives the request, builds containers, and starts HTTP servers on the backend.
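If you want to confirm what was staged, the following is a minimal sketch using the google-cloud-storage client; the project and bucket names are placeholders, and the prefix is an assumption based on the gcs_dir_name folder you configured:

from google.cloud import storage

client = storage.Client(project="PROJECT_ID")  # placeholder project ID
# List the staged artifacts (*.pkl, requirements.txt, dependencies.tar.gz).
for blob in client.list_blobs("STAGING_BUCKET", prefix="dev/"):  # "dev/" assumes gcs_dir_name="dev"
    print(blob.name)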
Deployment latency depends on the total time it takes to install required packages. Once deployed, remote_agent corresponds to an instance of local_agent that is running on Vertex AI and can be queried or deleted. It is separate from local instances of the agent.
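For example, a minimal sketch of interacting with the deployed instance; the query call assumes a template that registers a query operation (such as the LangChain template), and delete(force=True) also removes child resources:

# Query the deployed agent (operation names depend on the template you deployed).
response = remote_agent.query(input="What is the exchange rate from US dollars to Swedish krona?")
print(response)

# Clean up the deployed agent when you no longer need it.
remote_agent.delete(force=True)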
Grant the deployed agent permissions
If the deployed agent needs to be granted any additional permissions, follow the instructions in Set up the identity and permissions for your agent.
If you defined secrets as environment variables, you need to grant the following permission:

- Secret Manager Secret Accessor (roles/secretmanager.secretAccessor)
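The following is a minimal sketch of granting that role on the secret to the agent's identity with the google-cloud-secret-manager client; the project ID, secret ID, and service account email are placeholders, and you can equally grant the role through the console or the gcloud CLI:

from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()
name = client.secret_path("PROJECT_ID", "CLOUD_SQL_CREDENTIALS_SECRET")

# Read the current IAM policy, add the binding, and write it back.
policy = client.get_iam_policy(request={"resource": name})
policy.bindings.add(
    role="roles/secretmanager.secretAccessor",
    members=["serviceAccount:SERVICE_ACCOUNT_EMAIL"],
)
client.set_iam_policy(request={"resource": name, "policy": policy})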
Get the agent resource ID
Each deployed agent has a unique identifier. You can run the following command to get the resource_name identifier for your deployed agent:

remote_agent.resource_name
The response should look like the following string:

"projects/PROJECT_NUMBER/locations/LOCATION/reasoningEngines/RESOURCE_ID"
where
- PROJECT_NUMBER is the Google Cloud project number where the deployed agent runs.
- LOCATION is the region where the deployed agent runs.
- RESOURCE_ID is the ID of the deployed agent as a reasoningEngine resource.
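You can use this resource name to reconnect to the deployed agent later, for example from another process or session. A minimal sketch, assuming the SDK has been initialized as before:

from vertexai import agent_engines

# Retrieve the deployed agent by its fully qualified resource name.
remote_agent = agent_engines.get(
    "projects/PROJECT_NUMBER/locations/LOCATION/reasoningEngines/RESOURCE_ID"
)
print(remote_agent.resource_name)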