Once a cluster is deployed, you can manage its full lifecycle using the following REST API endpoints.
- List: Views all active clusters in your project.
- Get: Retrieves detailed information for a specific cluster.
- Update: Modifies an existing cluster configuration.
- Delete: Permanently removes a cluster and its resources.
Authentication
alias gcurl='curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json"'
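Bash does not expand aliases in non-interactive scripts by default, so the alias above is mostly convenient at an interactive prompt. For scripts, a shell function is one possible equivalent. The following is a minimal sketch, not part of the official tooling: the name authed_curl is hypothetical, and it assumes curl and an authenticated gcloud CLI are on your PATH.

```shell
# Hypothetical helper: sends the same headers as the gcurl alias,
# but works inside scripts because it is a function rather than an alias.
# Assumes `curl` and an authenticated `gcloud` CLI are available on PATH.
authed_curl() {
  curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
       -H "Content-Type: application/json" \
       "$@"
}
```

Any arguments are passed through to curl unchanged, so authed_curl -X GET URL behaves like gcurl -X GET URL.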
List clusters:
gcurl -X GET https://REGION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/modelDevelopmentClusters
The list method supports the following optional query parameters to control pagination.
- pageSize (integer, optional): The maximum number of clusters to return in the response. The service may return fewer than this value, even if more items exist. If unspecified, a default page size is used.
- pageToken (string, optional): A token received from a previous list call. Provide this token to retrieve the subsequent page of results.
gcurl "https://REGION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/modelDevelopmentClusters?pageSize=5"
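To read every page, a script can repeat the list call and pass the nextPageToken value from each response as the pageToken of the next request, stopping when no token is returned. The sketch below illustrates that loop under stated assumptions: fetch_page and list_all_pages are hypothetical names, the token is extracted with a simple text match rather than a JSON parser, and curl plus an authenticated gcloud CLI are assumed to be available.

```shell
# Hypothetical hook: fetch one page as JSON.
# $1: collection URL, $2: page size, $3: page token (empty for the first page).
# curl -G appends the --data-urlencode values to the URL as a query string.
fetch_page() {
  local base_url="$1" page_size="$2" page_token="${3:-}"
  if [ -n "$page_token" ]; then
    curl -s -G -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      --data-urlencode "pageSize=${page_size}" \
      --data-urlencode "pageToken=${page_token}" \
      "$base_url"
  else
    curl -s -G -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      --data-urlencode "pageSize=${page_size}" \
      "$base_url"
  fi
}

# Print the raw JSON of every page, following nextPageToken until it is absent.
# $1: collection URL, $2: optional page size (this sketch defaults to 5).
list_all_pages() {
  local base_url="$1" page_size="${2:-5}"
  local token="" page
  while :; do
    page="$(fetch_page "$base_url" "$page_size" "$token")" || return 1
    printf '%s\n' "$page"
    # Naive text extraction of the token; a JSON-aware tool is more robust.
    token="$(printf '%s' "$page" | tr -d '\n' |
      sed -n 's/.*"nextPageToken"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"
    [ -n "$token" ] || break
  done
}
```

For example, list_all_pages "https://REGION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/modelDevelopmentClusters" 5 prints one JSON document per page.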
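When more results are available, the response includes a nextPageToken string. Pass it as pageToken in the next list call to retrieve the following page.
Get a cluster: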
gcurl -X GET https://REGION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/modelDevelopmentClusters/CLUSTER_ID
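The list and get calls differ only in whether a cluster ID is appended to the collection URL. A small helper can make the placeholder substitution explicit. This is an illustrative sketch: cluster_url is a hypothetical name, and the URL pattern is copied from the commands in this section.

```shell
# Hypothetical helper: build the v1beta1 URL for the cluster collection,
# or for a single cluster when a cluster ID is given as the third argument.
cluster_url() {
  local project_id="$1" region="$2" cluster_id="${3:-}"
  local url="https://${region}-aiplatform.googleapis.com/v1beta1/projects/${project_id}/locations/${region}/modelDevelopmentClusters"
  if [ -n "$cluster_id" ]; then
    url="${url}/${cluster_id}"
  fi
  printf '%s\n' "$url"
}
```

With PROJECT_ID, REGION, and CLUSTER_ID set as shell variables, gcurl -X GET "$(cluster_url "$PROJECT_ID" "$REGION" "$CLUSTER_ID")" retrieves a single cluster, and omitting the third argument targets the list endpoint.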
Update a cluster:
UPDATE_PAYLOAD specifies the local path to a JSON file that defines the full ModelDevelopmentCluster configuration you want to update to.
For example, to update the node count of a node pool in a CPU-only cluster, use the following JSON payload:
{
  "display_name": "DISPLAY_NAME",
  "network": {
    "network": "projects/PROJECT_ID/global/networks/NETWORK",
    "subnetwork": "projects/PROJECT_ID/regions/REGION/subnetworks/SUBNETWORK"
  },
  "node_pools": [
    {
      "id": "cpu",
      "machine_spec": {
        "machine_type": "n2-standard-8"
      },
      "scaling_spec": {
        "min_node_count": UPDATED_MIN_NODE_COUNT,
        "max_node_count": UPDATED_MAX_NODE_COUNT
      },
      "zone": "ZONE",
      "enable_public_ips": true,
      "boot_disk": {
        "boot_disk_type": "pd-standard",
        "boot_disk_size_gb": 120
      }
    },
    {
      "id": "login",
      "machine_spec": {
        "machine_type": "n2-standard-8"
      },
      "scaling_spec": {
        "min_node_count": 1,
        "max_node_count": 1
      },
      "zone": "ZONE",
      "enable_public_ips": true,
      "boot_disk": {
        "boot_disk_type": "pd-standard",
        "boot_disk_size_gb": 120
      }
    }
  ],
  "orchestrator_spec": {
    "slurm_spec": {
      "home_directory_storage": "projects/PROJECT_ID/locations/ZONE/instances/FILESTORE",
      "partitions": [
        {
          "id": "cpu",
          "node_pool_ids": ["cpu"]
        }
      ],
      "login_node_pool_id": "login"
    }
  }
}
gcurl -X PATCH -d @UPDATE_PAYLOAD https://REGION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/modelDevelopmentClusters/CLUSTER_ID
- updateMask (string, optional): A FieldMask that specifies which fields of the Model Development cluster resource to update. Only the fields listed in the updateMask are changed. The following fields within the ModelDevelopmentCluster resource can be specified in the updateMask:
  - node_pools
  - orchestrator_spec.slurm_spec.partitions
  - orchestrator_spec.slurm_spec.login_node_pool_id
  - orchestrator_spec.slurm_spec.prolog_bash_scripts
  - orchestrator_spec.slurm_spec.epilog_bash_scripts
The command below updates both the node pool configuration and the Slurm partitions.
gcurl -X PATCH -d @update-payload.json "https://REGION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/modelDevelopmentClusters/CLUSTER_ID?updateMask=orchestrator_spec.slurm_spec.partitions,node_pools"
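The updateMask value is a comma-separated list of field paths. When the mask is assembled in a script, it can help to check each path against the supported fields before sending the request. The helper below is a sketch under that assumption: build_update_mask is a hypothetical name, and its allow-list is copied from the fields listed in this section, so it may lag behind the service.

```shell
# Hypothetical helper: join field paths into a comma-separated updateMask value,
# rejecting any path that is not in the list of fields documented above.
build_update_mask() {
  local field mask=""
  for field in "$@"; do
    case "$field" in
      node_pools|\
      orchestrator_spec.slurm_spec.partitions|\
      orchestrator_spec.slurm_spec.login_node_pool_id|\
      orchestrator_spec.slurm_spec.prolog_bash_scripts|\
      orchestrator_spec.slurm_spec.epilog_bash_scripts)
        # Append with a comma separator only when the mask is non-empty.
        mask="${mask:+${mask},}${field}"
        ;;
      *)
        printf 'unsupported updateMask field: %s\n' "$field" >&2
        return 1
        ;;
    esac
  done
  printf '%s\n' "$mask"
}
```

The result can be appended to the request URL as ?updateMask=VALUE. Keep the full URL in quotes so that the shell does not treat the ? character as a glob pattern.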
For repeated fields, such as node_pools, prolog_bash_scripts, and epilog_bash_scripts, the API only supports a full replacement operation. You must provide the entire expected list of items in the request payload, which completely replaces the existing list.
A successful request returns a long-running operation (LRO). You can monitor the status of this operation using the following command:
gcurl https://REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/operations/OPERATION_ID
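Because the operation runs asynchronously, a script usually repeats this request until the operation finishes. The following sketch polls at a fixed interval. It assumes the response follows the standard Google long-running operation shape, in which a boolean done field becomes true on completion. The names fetch_operation and wait_for_operation are hypothetical, and the done check is a simple text match rather than a JSON parse.

```shell
# Hypothetical hook: fetch the operation resource as JSON.
# Assumes `curl` and an authenticated `gcloud` CLI are available on PATH.
fetch_operation() {
  curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" "$1"
}

# Poll until the operation JSON reports "done": true, then print the final JSON.
# $1: operation URL, $2: seconds between polls (default 10),
# $3: maximum number of attempts (default 60).
wait_for_operation() {
  local op_url="$1" interval="${2:-10}" max_attempts="${3:-60}"
  local attempt=1 body
  while [ "$attempt" -le "$max_attempts" ]; do
    body="$(fetch_operation "$op_url")" || return 1
    # Naive text check for the done flag; a JSON-aware tool is more robust.
    if printf '%s' "$body" | tr -d '\n' |
        grep -Eq '"done"[[:space:]]*:[[:space:]]*true'; then
      printf '%s\n' "$body"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  printf 'operation still running after %s attempts\n' "$max_attempts" >&2
  return 2
}
```

A finished operation is not necessarily a successful one: in the standard operation shape, a completed operation carries either a response or an error field, so inspect the printed JSON before continuing.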
Delete a cluster:
gcurl -X DELETE https://REGION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/REGION/modelDevelopmentClusters/CLUSTER_ID
This command returns a long-running operation (LRO) on success, which you can then monitor using the following command:
gcurl https://REGION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/operations/OPERATION_ID

