When you perform custom training, you must specify what machine learning (ML) code you want Vertex AI to run. To do this, configure training container settings for either a custom container or a Python training application that runs on a prebuilt container.
To determine whether you want to use a custom container or a prebuilt container, read Training code requirements.
This document describes the fields of the Vertex AI API that you must specify in either of the preceding cases.
Where to specify container settings
Specify configuration details within a WorkerPoolSpec. Depending on how you perform custom training, put this WorkerPoolSpec in one of the following API fields:
- If you are creating a CustomJob resource, specify the WorkerPoolSpec in CustomJob.jobSpec.workerPoolSpecs.

  If you are using the Google Cloud CLI, then you can use the --worker-pool-spec flag or the --config flag on the gcloud ai custom-jobs create command to specify worker pool options.

  Learn more about creating a CustomJob.

- If you are creating a HyperparameterTuningJob resource, specify the WorkerPoolSpec in HyperparameterTuningJob.trialJobSpec.workerPoolSpecs.

  If you are using the gcloud CLI, then you can use the --config flag on the gcloud ai hp-tuning-jobs create command to specify worker pool options.

  Learn more about creating a HyperparameterTuningJob.

- If you are creating a TrainingPipeline resource without hyperparameter tuning, specify the WorkerPoolSpec in TrainingPipeline.trainingTaskInputs.workerPoolSpecs.

  Learn more about creating a custom TrainingPipeline.

- If you are creating a TrainingPipeline with hyperparameter tuning, specify the WorkerPoolSpec in TrainingPipeline.trainingTaskInputs.trialJobSpec.workerPoolSpecs.
If you are performing distributed training, you can use different settings for each worker pool.
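For orientation, the following is a minimal sketch of this nesting for the CustomJob case, written with the generated Python client types in google.cloud.aiplatform_v1. The field names are the snake_case equivalents of the REST names above, and the display name, machine type, and image URI are placeholder values, not requirements of this guide.

```python
# Sketch only: shows where a WorkerPoolSpec sits inside a CustomJob.
# All values (display name, machine type, image URI) are placeholders.
from google.cloud.aiplatform_v1 import (
    ContainerSpec,
    CustomJob,
    CustomJobSpec,
    MachineSpec,
    WorkerPoolSpec,
)

worker_pool_spec = WorkerPoolSpec(
    machine_spec=MachineSpec(machine_type="n1-standard-4"),
    replica_count=1,
    # Container settings go here; see "Configure container settings" below.
    container_spec=ContainerSpec(image_uri="us-docker.pkg.dev/my-project/my-repo/my-image"),
)

# For a CustomJob, the specs go in CustomJob.jobSpec.workerPoolSpecs
# (job_spec.worker_pool_specs in the Python client).
custom_job = CustomJob(
    display_name="my-custom-job",
    job_spec=CustomJobSpec(worker_pool_specs=[worker_pool_spec]),
)
# To submit this resource, pass it to JobServiceClient.create_custom_job;
# see the guide to creating a CustomJob.
```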
Configure container settings
Depending on whether you are using a prebuilt container or a custom container, you must specify different fields within the WorkerPoolSpec. Select the tab for your scenario:
Prebuilt container
- Select a prebuilt container that supports the ML framework you plan to use for training. Specify one of the container image's URIs in the pythonPackageSpec.executorImageUri field.

- Specify the Cloud Storage URIs of your Python training application in the pythonPackageSpec.packageUris field.

- Specify your training application's entry point module in the pythonPackageSpec.pythonModule field.

- Optionally, specify a list of command-line arguments to pass to your training application's entry point module in the pythonPackageSpec.args field.
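As a rough illustration, here is how these four fields might look in a worker pool spec expressed as a Python dictionary, the form accepted by the Vertex AI SDK for Python when you construct an aiplatform.CustomJob. The image URI, bucket path, module name, and arguments below are placeholders, not values from this guide.

```python
# Sketch of a worker pool spec that uses a prebuilt container (placeholder values).
worker_pool_spec = {
    "machine_spec": {"machine_type": "n1-standard-4"},
    "replica_count": 1,
    "python_package_spec": {
        # pythonPackageSpec.executorImageUri: URI of a prebuilt training container
        "executor_image_uri": "us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-12.py310:latest",
        # pythonPackageSpec.packageUris: Cloud Storage URIs of your training application
        "package_uris": ["gs://my-bucket/trainer-0.1.tar.gz"],
        # pythonPackageSpec.pythonModule: entry point module of your application
        "python_module": "trainer.task",
        # pythonPackageSpec.args: optional command-line arguments for the module
        "args": ["--epochs=10"],
    },
}
```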
The following examples highlight where you specify these container settings when you create a CustomJob:
Console
In the Google Cloud console, you can't create a CustomJob directly. However, you can create a TrainingPipeline that creates a CustomJob. When you create a TrainingPipeline in the Google Cloud console, you can specify prebuilt container settings in certain fields on the Training container step:

- pythonPackageSpec.executorImageUri: Use the Model framework and Model framework version drop-down lists.
- pythonPackageSpec.packageUris: Use the Package location field.
- pythonPackageSpec.pythonModule: Use the Python module field.
- pythonPackageSpec.args: Use the Arguments field.
gcloud
gcloud ai custom-jobs create \
  --region=LOCATION \
  --display-name=JOB_NAME \
  --python-package-uris=PYTHON_PACKAGE_URIS \
  --worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,executor-image-uri=PYTHON_PACKAGE_EXECUTOR_IMAGE_URI,python-module=PYTHON_MODULE
For more context, read the guide to creating a CustomJob.
Custom container
- Specify the Artifact Registry or Docker Hub URI of your custom container in the containerSpec.imageUri field.

- Optionally, if you want to override the ENTRYPOINT or CMD instructions in your container, specify the containerSpec.command or containerSpec.args fields. These fields affect how your container runs according to the following rules:

  - If you specify neither field: Your container runs according to its ENTRYPOINT instruction and CMD instruction (if it exists). Refer to the Docker documentation about how CMD and ENTRYPOINT interact.

  - If you specify only containerSpec.command: Your container runs with the value of containerSpec.command replacing its ENTRYPOINT instruction. If the container has a CMD instruction, it is ignored.

  - If you specify only containerSpec.args: Your container runs according to its ENTRYPOINT instruction, with the value of containerSpec.args replacing its CMD instruction.

  - If you specify both fields: Your container runs with containerSpec.command replacing its ENTRYPOINT instruction and containerSpec.args replacing its CMD instruction.
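These fields can be sketched as a worker pool spec in dictionary form, as accepted by the Vertex AI SDK for Python. The image URI, command, and arguments below are placeholders; in practice you would set command, args, both, or neither, depending on which of the rules above you want.

```python
# Sketch of a worker pool spec that uses a custom container (placeholder values).
worker_pool_spec = {
    "machine_spec": {"machine_type": "n1-standard-4"},
    "replica_count": 1,
    "container_spec": {
        # containerSpec.imageUri: Artifact Registry or Docker Hub URI of your image
        "image_uri": "us-docker.pkg.dev/my-project/my-repo/my-training-image:latest",
        # containerSpec.command (optional): replaces the image's ENTRYPOINT instruction
        "command": ["python", "task.py"],
        # containerSpec.args (optional): replaces the image's CMD instruction;
        # omit both command and args to run the image's ENTRYPOINT and CMD unchanged
        "args": ["--epochs=10"],
    },
}
```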
The following example highlights where you can specify some of these container settings when you create a CustomJob:
Console
In the Google Cloud console, you can't create a CustomJob directly. However, you can create a TrainingPipeline that creates a CustomJob. When you create a TrainingPipeline in the Google Cloud console, you can specify custom container settings in certain fields on the Training container step:

- containerSpec.imageUri: Use the Container image field.
- containerSpec.command: This API field is not configurable in the Google Cloud console.
- containerSpec.args: Use the Arguments field.
gcloud
gcloud ai custom-jobs create \
  --region=LOCATION \
  --display-name=JOB_NAME \
  --worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,container-image-uri=CUSTOM_CONTAINER_IMAGE_URI
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
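The original Python sample is not reproduced here. As a stand-in, the following minimal sketch shows one way to create and run a CustomJob with a custom container using the Vertex AI SDK for Python; the project, region, bucket, and image URI are placeholders, not values from this guide.

```python
# Minimal sketch (not the official sample): create and run a CustomJob
# that uses a custom container. All values are placeholders.
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-bucket",
)

job = aiplatform.CustomJob(
    display_name="my-custom-container-job",
    worker_pool_specs=[
        {
            "machine_spec": {"machine_type": "n1-standard-4"},
            "replica_count": 1,
            "container_spec": {
                "image_uri": "us-docker.pkg.dev/my-project/my-repo/my-training-image:latest",
            },
        }
    ],
)

job.run()  # blocks until the job finishes
```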
For more context, read the guide to creating a CustomJob.
What's next
- Learn how to perform custom training by creating a CustomJob.

