This document explains how to create workload policies for managed instance groups (MIGs) that have A4X Max, A4X, A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D Compute Engine instances. To learn more about the requirements and limitations that apply when you create workload policies, see About workload policies.
A workload policy lets you specify the placement or topology for the Compute Engine instances in your MIG. For example, you can use workload policies to place compute instances closer to each other, minimizing network latency for artificial intelligence (AI), machine learning (ML), or high performance computing (HPC) workloads.
Before you begin
- If you haven't already, set up authentication. Authentication verifies your identity for access to Google Cloud services and APIs. To run code or samples from a local development environment, you can authenticate to Compute Engine by selecting one of the following options:
gcloud
- Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
  gcloud init
  If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
- Set a default region and zone.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
Install the Google Cloud CLI.
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Required roles
To get the permissions that you need to create and apply workload policies to MIGs, ask your administrator to grant you the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to create and apply workload policies to MIGs. The exact permissions that are required are listed in the following Required permissions section:
Required permissions
The following permissions are required to create and apply workload policies to MIGs:
- To create a workload policy: compute.resourcePolicies.create on the project
You might also be able to get these permissions with custom roles or other predefined roles.
Create a workload policy
To create a workload policy, use one of the following methods based on the machine series that the compute instances in your MIG use:
Create a workload policy for A4X Max or A4X instances
If you apply a workload policy to A4X Max or A4X instances in a MIG, then you must specify an accelerator topology value when you create your policy. This action helps your workloads achieve large-scale, non-blocking network performance.
To create a workload policy for A4X Max or A4X instances, select one of the following options:
gcloud
To create a workload policy for A4X Max or A4X instances, use the gcloud compute resource-policies create workload-policy command with the --accelerator-topology=1x72 flag:
gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
    --accelerator-topology=1x72 \
    --type=high-throughput \
    --region=REGION
Replace the following:
- WORKLOAD_POLICY_NAME: the name for your workload policy.
- REGION: the region in which to create your workload policy. Specify a region in which you want to create the MIG, and where the machine type that you want to use is available. To review the regions in which A4X Max or A4X machine types are available, see Available regions and zones.
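As a rough sketch, the command above can be wrapped in a small shell function so that you can review it before running it. The function name create_a4x_policy and the DRY_RUN variable are assumptions made for this illustration and are not part of the gcloud CLI. By default the function only prints the command.

```shell
#!/bin/sh
# Hypothetical helper (not part of the gcloud CLI): assembles the command
# shown above. By default it only prints the command; set DRY_RUN=0 to run it.
create_a4x_policy() {
  name="$1"
  region="$2"
  if [ -z "$name" ] || [ -z "$region" ]; then
    echo "usage: create_a4x_policy WORKLOAD_POLICY_NAME REGION" >&2
    return 1
  fi
  # Store the full command in the function's positional parameters.
  set -- gcloud compute resource-policies create workload-policy "$name" \
    --accelerator-topology=1x72 \
    --type=high-throughput \
    --region="$region"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"   # print the command for review
  else
    "$@"        # run the command
  fi
}

# Placeholder values; replace them with your own policy name and region.
create_a4x_policy example-policy example-region
```

Running the function with DRY_RUN=0 requires an installed and initialized gcloud CLI.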
REST
To create a workload policy for A4X Max or A4X instances, make a POST request to the resourcePolicies.insert method. In the request body, include the acceleratorTopology field set to 1x72:
POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/resourcePolicies
{
  "name": "WORKLOAD_POLICY_NAME",
  "workloadPolicy": {
    "acceleratorTopology": "1x72",
    "type": "HIGH_THROUGHPUT"
  }
}
Replace the following:
- PROJECT_ID: the ID of the project in which to create your workload policy.
- REGION: the region in which to create your workload policy. Specify a region in which you want to create the MIG, and where the machine type that you want to use is available. To review the regions in which A4X Max or A4X machine types are available, see Available regions and zones.
- WORKLOAD_POLICY_NAME: the name for your workload policy.
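One way to send the request above from a shell is with curl, using an access token from the gcloud CLI. The following is a minimal sketch under that assumption. The function names policy_url, a4x_policy_body, and send_a4x_policy_request are hypothetical, and the sketch assumes the policy name needs no JSON escaping.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; none of these names belong to a Google tool.

# Prints the resourcePolicies.insert URL for a project and region.
policy_url() {
  printf 'https://compute.googleapis.com/compute/v1/projects/%s/regions/%s/resourcePolicies\n' "$1" "$2"
}

# Prints the request body for an A4X Max or A4X workload policy.
# Assumes the name contains no characters that need JSON escaping.
a4x_policy_body() {
  printf '{"name": "%s", "workloadPolicy": {"acceleratorTopology": "1x72", "type": "HIGH_THROUGHPUT"}}\n' "$1"
}

# Sends the request. Defined here but not called, because it needs curl,
# the gcloud CLI, and valid credentials.
send_a4x_policy_request() {
  project="$1"; region="$2"; name="$3"
  curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "$(a4x_policy_body "$name")" \
    "$(policy_url "$project" "$region")"
}

# Preview the URL and body with placeholder values.
policy_url example-project example-region
a4x_policy_body example-policy
```

Keeping the URL and body builders separate from the function that sends the request lets you inspect both before any call is made.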
Create a workload policy for A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances
If you want to apply a workload policy to A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances in a MIG, then we recommend that you specify a maximum topology distance value when you create your policy. This action helps ensure closer placement among your compute instances. However, the more compact the placement you specify, the fewer resources might be available in the zone where you want to create your compute instances.
To create a workload policy for A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances, select one of the following options:
gcloud
To create a workload policy, use the gcloud compute resource-policies create workload-policy command. Based on how closely you want to place compute instances in a MIG, include the following flags in the command:
- To place your compute instances close to each other on a best-effort basis, include the --type=high-throughput flag:
  gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
      --type=high-throughput \
      --region=REGION
- To further control the placement of your compute instances, include the --max-topology-distance and --type=high-throughput flags:
  gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
      --max-topology-distance=TOPOLOGY_DISTANCE \
      --type=high-throughput \
      --region=REGION
Replace the following:
- WORKLOAD_POLICY_NAME: the name for your workload policy.
- REGION: the region in which to create your workload policy. Specify a region in which you want to create the MIG, and where the machine type that you want to use is available. To review the regions in which GPU machine types are available, see Available regions and zones.
- TOPOLOGY_DISTANCE: the maximum topology distance. A shorter maximum distance can reduce the likelihood that VMs are available. Specify one of the following values:
  - To place A4 or H4D instances in the same cluster: CLUSTER
  - To place A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances in the same block: BLOCK
  - To place A4, A3 Ultra, or H4D instances in the same sub-block: SUBBLOCK
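Because each topology distance value applies only to some machine series, a pre-flight check can catch an unsupported combination before you run the command. The following sketch encodes the list above as a lookup. The function name distance_allowed and the series labels (a4, a3-ultra, a3-mega, a3-high, h4d) are made up for this example and are not gcloud identifiers.

```shell
#!/bin/sh
# Hypothetical pre-flight check. The series labels are invented for this
# sketch; the allowed combinations mirror the list of values above.
distance_allowed() {
  series="$1"
  distance="$2"
  case "$distance" in
    CLUSTER)  allowed="a4 h4d" ;;
    BLOCK)    allowed="a4 a3-ultra a3-mega a3-high h4d" ;;
    SUBBLOCK) allowed="a4 a3-ultra h4d" ;;
    *)        return 1 ;;   # unknown distance value
  esac
  for s in $allowed; do
    [ "$s" = "$series" ] && return 0
  done
  return 1
}

# Example: check a combination before building the gcloud command.
if distance_allowed a3-mega SUBBLOCK; then
  echo "a3-mega + SUBBLOCK: listed"
else
  echo "a3-mega + SUBBLOCK: not listed"
fi
```

The check only reflects the values documented here; the API remains the authority on what it accepts.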
REST
To create a workload policy, make a POST request to the resourcePolicies.insert method. Based on how closely you want to place compute instances in a MIG, include the following fields in the request body:
- To place your compute instances close to each other on a best-effort basis, include the type field in the request body:
  POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/resourcePolicies
  {
    "name": "WORKLOAD_POLICY_NAME",
    "workloadPolicy": {
      "type": "HIGH_THROUGHPUT"
    }
  }
- To further control the placement of your compute instances, include the maxTopologyDistance and type fields in the request body:
  POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/resourcePolicies
  {
    "name": "WORKLOAD_POLICY_NAME",
    "workloadPolicy": {
      "maxTopologyDistance": "TOPOLOGY_DISTANCE",
      "type": "HIGH_THROUGHPUT"
    }
  }
Replace the following:
- PROJECT_ID: the ID of the project in which to create your workload policy.
- REGION: the region in which to create your workload policy. Specify a region in which you want to create the MIG, and where the machine type that you want to use is available. To review the regions in which GPU machine types are available, see Available regions and zones.
- WORKLOAD_POLICY_NAME: the name for your workload policy.
- TOPOLOGY_DISTANCE: the maximum topology distance. A shorter maximum distance can reduce the likelihood that VMs are available. Specify one of the following values:
  - To place A4 or H4D instances in the same cluster: CLUSTER
  - To place A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances in the same block: BLOCK
  - To place A4, A3 Ultra, or H4D instances in the same sub-block: SUBBLOCK
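The two request bodies above differ only in the optional maxTopologyDistance field, so a single helper can produce either one. The following is a minimal sketch of that idea. The function name workload_policy_body is hypothetical, and the sketch assumes the policy name needs no JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper: prints the request body, adding maxTopologyDistance
# only when a second argument is given.
workload_policy_body() {
  name="$1"
  distance="${2:-}"
  case "$distance" in
    "")
      # Best-effort placement: only the type field.
      printf '{"name": "%s", "workloadPolicy": {"type": "HIGH_THROUGHPUT"}}\n' "$name" ;;
    CLUSTER|BLOCK|SUBBLOCK)
      printf '{"name": "%s", "workloadPolicy": {"maxTopologyDistance": "%s", "type": "HIGH_THROUGHPUT"}}\n' "$name" "$distance" ;;
    *)
      echo "unsupported maxTopologyDistance: $distance" >&2
      return 1 ;;
  esac
}

# Placeholder policy name; the second call adds the optional field.
workload_policy_body example-policy
workload_policy_body example-policy BLOCK
```

Rejecting unknown distance values locally gives a faster failure than waiting for the API to return an error.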
What's next
- After you create a workload policy, learn how to apply it to a MIG.
- Learn how to view compute instance topology.
- Learn how to view workload policies.
- Learn how to replace, remove, or delete workload policies.

