When you perform custom training, you must specify what machine learning (ML) code you want Vertex AI to run. To do this, configure training container settings for either a custom container or a Python training application that runs on a prebuilt container.
To determine whether you want to use a custom container or a prebuilt container, read Training code requirements.
This document describes the fields of the Vertex AI API that you must specify in either of the preceding cases.
Where to specify container settings
Specify configuration details within a WorkerPoolSpec. Depending on how you perform custom training, put this WorkerPoolSpec in one of the following API fields:
- If you are creating a CustomJob resource, specify the WorkerPoolSpec in
  CustomJob.jobSpec.workerPoolSpecs. If you are using the Google Cloud CLI,
  then you can use the --worker-pool-spec flag or the --config flag on the
  gcloud ai custom-jobs create command to specify worker pool options. Learn
  more about creating a CustomJob.
- If you are creating a HyperparameterTuningJob resource, specify the
  WorkerPoolSpec in HyperparameterTuningJob.trialJobSpec.workerPoolSpecs. If
  you are using the gcloud CLI, then you can use the --config flag on the
  gcloud ai hp-tuning-jobs create command to specify worker pool options.
  Learn more about creating a HyperparameterTuningJob.
- If you are creating a TrainingPipeline resource without hyperparameter
  tuning, specify the WorkerPoolSpec in
  TrainingPipeline.trainingTaskInputs.workerPoolSpecs. Learn more about
  creating a custom TrainingPipeline.
- If you are creating a TrainingPipeline with hyperparameter tuning, specify
  the WorkerPoolSpec in
  TrainingPipeline.trainingTaskInputs.trialJobSpec.workerPoolSpecs.
If you are performing distributed training, you can use different settings for each worker pool.
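To make the nesting concrete, here is a minimal sketch of a CustomJob body using the REST API's field names, with two worker pools configured differently as you might do for distributed training. All values shown (job name, machine types, replica counts, image URI) are placeholders chosen for illustration, not values prescribed by this guide.

```python
# Hypothetical CustomJob body with REST field names and placeholder values.
# Each entry in workerPoolSpecs is one WorkerPoolSpec; for distributed
# training, the pools can use different machine types and replica counts.
custom_job = {
    "displayName": "JOB_NAME",
    "jobSpec": {
        "workerPoolSpecs": [
            {
                # First worker pool: the primary replica.
                "machineSpec": {"machineType": "n1-standard-4"},
                "replicaCount": 1,
                "containerSpec": {"imageUri": "CUSTOM_CONTAINER_IMAGE_URI"},
            },
            {
                # Additional workers; settings can differ from the first pool.
                "machineSpec": {"machineType": "n1-highmem-8"},
                "replicaCount": 4,
                "containerSpec": {"imageUri": "CUSTOM_CONTAINER_IMAGE_URI"},
            },
        ]
    },
}
```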
Configure container settings
Depending on whether you are using a prebuilt container or a custom container, you must specify different fields within the WorkerPoolSpec. Select the tab for your scenario:
Prebuilt container
- Select a prebuilt container that supports the ML framework you plan to use
  for training. Specify one of the container image's URIs in the
  pythonPackageSpec.executorImageUri field.
- Specify the Cloud Storage URIs of your Python training application in the
  pythonPackageSpec.packageUris field.
- Specify your training application's entry point module in the
  pythonPackageSpec.pythonModule field.
- Optionally, specify a list of command-line arguments to pass to your
  training application's entry point module in the pythonPackageSpec.args
  field.
The following examples highlight where you specify these container settings when you create a CustomJob:
Console
In the Google Cloud console, you can't create a CustomJob directly. However, you can create a TrainingPipeline that creates a CustomJob. When you create a TrainingPipeline in the Google Cloud console, you can specify prebuilt container settings in certain fields on the Training container step:
- pythonPackageSpec.executorImageUri: Use the Model framework and Model framework version drop-down lists.
- pythonPackageSpec.packageUris: Use the Package location field.
- pythonPackageSpec.pythonModule: Use the Python module field.
- pythonPackageSpec.args: Use the Arguments field.
gcloud
gcloud ai custom-jobs create \
  --region=LOCATION \
  --display-name=JOB_NAME \
  --python-package-uris=PYTHON_PACKAGE_URIS \
  --worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,executor-image-uri=PYTHON_PACKAGE_EXECUTOR_IMAGE_URI,python-module=PYTHON_MODULE
For more context, read the guide to creating a CustomJob.
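If you use the Vertex AI SDK for Python instead of the gcloud CLI, the same prebuilt-container settings go into the python_package_spec entry of a worker pool. The following is a rough sketch, not the canonical sample: the placeholders mirror the gcloud example above, and the args value is only an illustrative assumption.

```python
from google.cloud import aiplatform

# Placeholder project, region, and staging bucket.
aiplatform.init(project="PROJECT_ID", location="LOCATION",
                staging_bucket="gs://BUCKET_NAME")

job = aiplatform.CustomJob(
    display_name="JOB_NAME",
    worker_pool_specs=[
        {
            "machine_spec": {"machine_type": "MACHINE_TYPE"},
            "replica_count": 1,
            "python_package_spec": {
                # pythonPackageSpec.executorImageUri: URI of a prebuilt training container.
                "executor_image_uri": "PYTHON_PACKAGE_EXECUTOR_IMAGE_URI",
                # pythonPackageSpec.packageUris: Cloud Storage URIs of your training application.
                "package_uris": ["PYTHON_PACKAGE_URIS"],
                # pythonPackageSpec.pythonModule: the entry point module.
                "python_module": "PYTHON_MODULE",
                # pythonPackageSpec.args: optional arguments passed to the module (illustrative).
                "args": ["--epochs=10"],
            },
        }
    ],
)
job.run()
```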
Custom container
- Specify the Artifact Registry or Docker Hub URI of your custom container in
  the containerSpec.imageUri field.
- Optionally, if you want to override the ENTRYPOINT or CMD instructions in
  your container, specify the containerSpec.command or containerSpec.args
  fields. These fields affect how your container runs according to the
  following rules (illustrated in the sketch after this list):
  - If you specify neither field: Your container runs according to its
    ENTRYPOINT instruction and CMD instruction (if it exists). Refer to the
    Docker documentation about how CMD and ENTRYPOINT interact.
  - If you specify only containerSpec.command: Your container runs with the
    value of containerSpec.command replacing its ENTRYPOINT instruction. If
    the container has a CMD instruction, it is ignored.
  - If you specify only containerSpec.args: Your container runs according to
    its ENTRYPOINT instruction, with the value of containerSpec.args replacing
    its CMD instruction.
  - If you specify both fields: Your container runs with containerSpec.command
    replacing its ENTRYPOINT instruction and containerSpec.args replacing its
    CMD instruction.
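To illustrate these rules, here is a hedged sketch of just the containerSpec portion of a worker pool, written as Python dicts in the snake_case form accepted by the Vertex AI SDK for Python; the image URI, command, and argument values are placeholders.

```python
# Neither command nor args: the image's own ENTRYPOINT and CMD run unchanged.
container_spec = {"image_uri": "CUSTOM_CONTAINER_IMAGE_URI"}

# Only command: replaces ENTRYPOINT; any CMD in the image is ignored.
container_spec = {
    "image_uri": "CUSTOM_CONTAINER_IMAGE_URI",
    "command": ["python", "trainer/task.py"],
}

# Only args: the image's ENTRYPOINT runs as-is, with these values replacing CMD.
container_spec = {
    "image_uri": "CUSTOM_CONTAINER_IMAGE_URI",
    "args": ["--epochs=10", "--batch-size=32"],
}

# Both: command replaces ENTRYPOINT and args replaces CMD.
container_spec = {
    "image_uri": "CUSTOM_CONTAINER_IMAGE_URI",
    "command": ["python", "trainer/task.py"],
    "args": ["--epochs=10"],
}
```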
The following example highlights where you can specify some of these container settings when you create a CustomJob:
Console
In the Google Cloud console, you can't create a CustomJob directly. However, you can create a TrainingPipeline that creates a CustomJob. When you create a TrainingPipeline in the Google Cloud console, you can specify custom container settings in certain fields on the Training container step:
- containerSpec.imageUri: Use the Container image field.
- containerSpec.command: This API field is not configurable in the Google Cloud console.
- containerSpec.args: Use the Arguments field.
gcloud
gcloud ai custom-jobs create \
  --region=LOCATION \
  --display-name=JOB_NAME \
  --worker-pool-spec=machine-type=MACHINE_TYPE,replica-count=REPLICA_COUNT,container-image-uri=CUSTOM_CONTAINER_IMAGE_URI
Java
Before trying this sample, follow the Java setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Java API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Node.js API reference documentation.
To authenticate to Vertex AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
To learn how to install or update the Vertex AI SDK for Python, see Install the Vertex AI SDK for Python. For more information, see the Python API reference documentation.
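The official Python sample is not reproduced here. As a rough, non-authoritative sketch, creating a CustomJob with a custom container through the Vertex AI SDK for Python might look like the following; every value is a placeholder, and the commented-out command and args lines show where the optional ENTRYPOINT and CMD overrides would go.

```python
from google.cloud import aiplatform

# Placeholder project, region, and staging bucket.
aiplatform.init(project="PROJECT_ID", location="LOCATION",
                staging_bucket="gs://BUCKET_NAME")

job = aiplatform.CustomJob(
    display_name="JOB_NAME",
    worker_pool_specs=[
        {
            "machine_spec": {"machine_type": "MACHINE_TYPE"},
            "replica_count": 1,  # REPLICA_COUNT
            "container_spec": {
                # containerSpec.imageUri: Artifact Registry or Docker Hub URI.
                "image_uri": "CUSTOM_CONTAINER_IMAGE_URI",
                # Optional overrides for ENTRYPOINT and CMD (see the rules above).
                # "command": ["python", "trainer/task.py"],
                # "args": ["--epochs=10"],
            },
        }
    ],
)
job.run()
```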
For more context, read the guide to creating a CustomJob.
What's next
- Learn how to perform custom training by creating a CustomJob.