Create workload policies for MIGs

This document explains how to create workload policies for managed instance groups (MIGs) that have A4X Max, A4X, A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D Compute Engine instances. To learn more about the requirements and limitations that apply when you create workload policies, see About workload policies.

A workload policy lets you specify the placement or topology for the Compute Engine instances in your MIG. For example, you can use workload policies to place compute instances closer to each other, minimizing network latency for artificial intelligence (AI), machine learning (ML), or high performance computing (HPC) workloads.

Before you begin

If you haven't already, set up authentication. Authentication verifies your identity for access to Google Cloud services and APIs. To run code or samples from a local development environment, authenticate to Compute Engine by using the option that matches how you plan to use the samples on this page:
gcloud
1. Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
  gcloud init
  If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
  
  Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
2. Set a default region and zone.
REST

To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Required roles

To get the permissions that you need to create and apply workload policies to MIGs, ask your administrator to grant you the Compute Instance Admin (v1) (roles/compute.instanceAdmin.v1) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to create and apply workload policies to MIGs. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to create and apply workload policies to MIGs:

To create a workload policy: compute.resourcePolicies.create on the project

You might also be able to get these permissions with custom roles or other predefined roles.
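As an illustration of the role grant described above, an administrator might run a command like the one this sketch prints. It is only one possible approach, not a required procedure. The project ID and the user email are hypothetical placeholders, and the script prints the command instead of running it, so nothing changes until you run the printed command yourself.

```shell
#!/bin/sh
# Sketch: print the gcloud command an administrator could run to grant the
# Compute Instance Admin (v1) role on a project.
# The project ID and member used below are hypothetical placeholders.

grant_role_command() {
    # $1 = project ID, $2 = member, for example user:alice@example.com
    printf 'gcloud projects add-iam-policy-binding %s --member=%s --role=roles/compute.instanceAdmin.v1\n' "$1" "$2"
}

# Dry run: show the command without executing it.
grant_role_command my-project user:alice@example.com
```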

Create a workload policy

To create a workload policy, use one of the following methods based on the machine series that the compute instances in your MIG use:

Create a workload policy for A4X Max or A4X instances
Create a workload policy for A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances

Create a workload policy for A4X Max or A4X instances

If you apply a workload policy to A4X Max or A4X instances in a MIG, then you must specify an accelerator topology value when you create your policy. This action helps your workloads achieve large-scale, non-blocking network performance.

To create a workload policy for A4X Max or A4X instances, select one of the following options:

gcloud

To create a workload policy for A4X Max or A4X instances, use the gcloud compute resource-policies create workload-policy command with the --accelerator-topology=1x72 flag:

 gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
    --accelerator-topology=1x72 \
    --type=high-throughput \
    --region=REGION

Replace the following:

WORKLOAD_POLICY_NAME: the name for your workload policy.
REGION: the region in which to create your workload policy. Specify the region in which you want to create the MIG and in which the machine type that you want to use is available. To review the regions in which A4X Max or A4X machine types are available, see Available regions and zones.
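The following sketch shows how the placeholders in the preceding command might be filled in from a script. The function name, the policy name my-a4x-policy, and the region us-central1 are hypothetical values chosen for illustration; confirm that your machine type is available in the region you choose. The script prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: print the gcloud command that creates a workload policy for
# A4X Max or A4X instances. The policy name and region are placeholders.

a4x_policy_command() {
    # $1 = workload policy name, $2 = region
    if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
        echo "usage: a4x_policy_command WORKLOAD_POLICY_NAME REGION" >&2
        return 1
    fi
    printf 'gcloud compute resource-policies create workload-policy %s --accelerator-topology=1x72 --type=high-throughput --region=%s\n' "$1" "$2"
}

# Dry run: show the command without executing it.
a4x_policy_command my-a4x-policy us-central1
```

To create the policy, run the printed command in a shell where the gcloud CLI is initialized.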

REST

To create a workload policy for A4X Max or A4X instances, make a POST request to the resourcePolicies.insert method. In the request body, include the acceleratorTopology field set to 1x72:

 POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/resourcePolicies

{
  "name": "WORKLOAD_POLICY_NAME",
  "workloadPolicy": {
    "acceleratorTopology": "1x72",
    "type": "HIGH_THROUGHPUT"
  }
}

Replace the following:

PROJECT_ID: the ID of the project in which to create your workload policy.
REGION: the region in which to create your workload policy. Specify the region in which you want to create the MIG and in which the machine type that you want to use is available. To review the regions in which A4X Max or A4X machine types are available, see Available regions and zones.
WORKLOAD_POLICY_NAME: the name for your workload policy.
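One way to send the preceding request from a terminal is with curl, using an access token from the gcloud CLI. The sketch below assumes that curl and an initialized gcloud CLI are available. The function names, project ID, region, and policy name are hypothetical, and the body builder assumes the policy name contains no characters that need JSON escaping. The script only prints the URL and body; the function that sends the request is defined but not called.

```shell
#!/bin/sh
# Sketch: build the URL and JSON body for resourcePolicies.insert and
# optionally send them with curl. All names below are placeholders.

policy_url() {
    # $1 = project ID, $2 = region
    printf 'https://compute.googleapis.com/compute/v1/projects/%s/regions/%s/resourcePolicies\n' "$1" "$2"
}

a4x_policy_body() {
    # $1 = workload policy name (assumed to need no JSON escaping)
    printf '{"name": "%s", "workloadPolicy": {"acceleratorTopology": "1x72", "type": "HIGH_THROUGHPUT"}}\n' "$1"
}

send_policy_request() {
    # $1 = URL, $2 = JSON body. Requires curl and gcloud credentials.
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -d "$2" \
        "$1"
}

# Dry run: print what would be sent.
policy_url my-project us-central1
a4x_policy_body my-a4x-policy
```

To send the request, call send_policy_request with the output of the two builder functions as its arguments.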

Create a workload policy for A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances

If you want to apply a workload policy to A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances in a MIG, then we recommend that you specify a maximum topology distance value when you create your policy. This action helps ensure closer placement among your compute instances. However, the more compact the placement you specify, the fewer resources might be available in the zone where you want to create your compute instances.

To create a workload policy for A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances, select one of the following options:

gcloud

To create a workload policy, use the gcloud compute resource-policies create workload-policy command. Based on how close you want to place compute instances in a MIG, include the following flags in the command:

To place your compute instances close to each other on a best-effort basis, include the --type=high-throughput flag:

 gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
    --type=high-throughput \
    --region=REGION

To further control the placement of your compute instances, include the --max-topology-distance and --type=high-throughput flags:

 gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
    --max-topology-distance=TOPOLOGY_DISTANCE \
    --type=high-throughput \
    --region=REGION

Replace the following:

WORKLOAD_POLICY_NAME: the name for your workload policy.
REGION: the region in which to create your workload policy. Specify the region in which you want to create the MIG and in which the machine type that you want to use is available. To review the regions in which GPU machine types are available, see Available regions and zones.
TOPOLOGY_DISTANCE: the maximum topology distance. A shorter maximum distance can make it less likely that enough capacity is available for your instances. Specify one of the following values:
- To place A4 or H4D instances in the same cluster: CLUSTER
- To place A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances in the same block: BLOCK
- To place A4, A3 Ultra, or H4D instances in the same sub-block: SUBBLOCK
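The list of TOPOLOGY_DISTANCE values above can be encoded as a small check that runs before the command is built. The following is a minimal sketch. The lowercase series labels (a4, a3-ultra, a3-mega, a3-high, h4d) are names invented for this example and are not gcloud values, and the policy name and region are placeholders. The script prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: check a TOPOLOGY_DISTANCE value against the machine series listed
# above, then print the matching gcloud command.
# The series labels are hypothetical names used only by this sketch.

distance_allowed() {
    # $1 = series label, $2 = CLUSTER, BLOCK, or SUBBLOCK
    case "$2" in
        CLUSTER)  allowed="a4 h4d" ;;
        BLOCK)    allowed="a4 a3-ultra a3-mega a3-high h4d" ;;
        SUBBLOCK) allowed="a4 a3-ultra h4d" ;;
        *)        return 1 ;;
    esac
    for candidate in $allowed; do
        if [ "$candidate" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

policy_command() {
    # $1 = policy name, $2 = region, $3 = series label, $4 = topology distance
    if ! distance_allowed "$3" "$4"; then
        echo "distance $4 is not listed for series $3" >&2
        return 1
    fi
    printf 'gcloud compute resource-policies create workload-policy %s --max-topology-distance=%s --type=high-throughput --region=%s\n' "$1" "$4" "$2"
}

# Dry run: show the command without executing it.
policy_command my-policy us-central1 a3-mega BLOCK
```

Checking the value locally gives an early, readable error for combinations that the list does not include.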

REST

To create a workload policy, make a POST request to the resourcePolicies.insert method. Based on how close you want to place compute instances in a MIG, include the following fields in the request body:

To place your compute instances close to each other on a best-effort basis, include the type field in the request body:

 POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/resourcePolicies

{
  "name": "WORKLOAD_POLICY_NAME",
  "workloadPolicy": {
    "type": "HIGH_THROUGHPUT"
  }
}

To further control the placement of your compute instances, include the maxTopologyDistance and type fields in the request body:

 POST https://compute.googleapis.com/compute/v1/projects/PROJECT_ID/regions/REGION/resourcePolicies

{
  "name": "WORKLOAD_POLICY_NAME",
  "workloadPolicy": {
    "maxTopologyDistance": "TOPOLOGY_DISTANCE",
    "type": "HIGH_THROUGHPUT"
  }
}

Replace the following:

PROJECT_ID: the ID of the project in which to create your workload policy.
REGION: the region in which to create your workload policy. Specify the region in which you want to create the MIG and in which the machine type that you want to use is available. To review the regions in which GPU machine types are available, see Available regions and zones.
WORKLOAD_POLICY_NAME: the name for your workload policy.
TOPOLOGY_DISTANCE: the maximum topology distance. A shorter maximum distance can make it less likely that enough capacity is available for your instances. Specify one of the following values:
- To place A4 or H4D instances in the same cluster: CLUSTER
- To place A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances in the same block: BLOCK
- To place A4, A3 Ultra, or H4D instances in the same sub-block: SUBBLOCK
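Because the two request bodies above differ only in the maxTopologyDistance field, a script can build either one from the same helper. The following is a sketch: the function name and the policy name are hypothetical, and the helper assumes its arguments contain no characters that need JSON escaping. It prints the body only and sends nothing.

```shell
#!/bin/sh
# Sketch: build the resourcePolicies.insert request body, adding
# maxTopologyDistance only when a distance is supplied.
# The policy name below is a placeholder.

policy_body() {
    # $1 = workload policy name, $2 = optional CLUSTER, BLOCK, or SUBBLOCK
    if [ -n "${2:-}" ]; then
        printf '{"name": "%s", "workloadPolicy": {"maxTopologyDistance": "%s", "type": "HIGH_THROUGHPUT"}}\n' "$1" "$2"
    else
        printf '{"name": "%s", "workloadPolicy": {"type": "HIGH_THROUGHPUT"}}\n' "$1"
    fi
}

# Best-effort placement, then placement within the same block.
policy_body my-policy
policy_body my-policy BLOCK
```

The printed JSON can be used as the body of the POST request shown above.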

What's next

After you create a workload policy, learn how to apply it to a MIG.
Learn how to view compute instance topology.
Learn how to view workload policies.
Learn how to replace, remove, or delete workload policies.
