About workload policies in MIGs

This document explains the requirements and limitations that apply when you use workload policies with managed instance groups (MIGs). By default, you control the location of your Compute Engine instances only by specifying their zones. Workload policies let you define the physical placement and topology of your compute instances within a zone. This approach helps you, for example, minimize network latency by placing your compute instances closer to each other.

You can only apply workload policies to MIGs that use A4X Max, A4X, A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances. If you are using A4X Max or A4X instances, a workload policy is required unless you are creating a single compute instance for testing purposes. For the other supported machine series, workload policies are optional.

To control the placement of compute instances that use other machine series, use placement policies.

Understand workload policies

The following sections describe workload policy use cases and the properties that you specify when you create workload policies.

Manage compute instance placement and topology

After you create a workload policy and apply it to a MIG, the workload policy helps you achieve the following:

  • Create compute instances close to each other: if capacity is available, then Compute Engine creates compute instances close to each other. Otherwise, Compute Engine creates only some or none of your requested compute instances.

  • Gain visibility into your compute instance topology: after you apply a workload policy with a high throughput (HIGH_THROUGHPUT) type to a MIG, and the MIG creates compute instances, you can view the position of the compute instances in relation to each other. This information helps you minimize network latency and troubleshoot errors. For more information, see View Compute Engine instance topology.

You can apply the same workload policy to multiple MIGs. When you do so, Compute Engine applies the placement rules to each MIG independently.
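As a minimal sketch, applying an existing workload policy when creating a MIG might look like the following gcloud command. The resource names, zone, and size are placeholders, and the --workload-policy flag is an assumption that you should verify against your gcloud CLI version:

```shell
# Sketch: create a zonal MIG that uses an existing workload policy.
# my-mig, my-instance-template, my-workload-policy, the zone, and the
# size are placeholders; confirm the --workload-policy flag in your
# gcloud CLI version before relying on this command.
gcloud compute instance-groups managed create my-mig \
    --zone=us-central1-a \
    --size=18 \
    --template=my-instance-template \
    --workload-policy=my-workload-policy
```

Because Compute Engine applies the placement rules to each MIG independently, you can reuse the same policy name across multiple such commands.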

The following section describes the properties that you must specify when you create workload policies.

Configure workload policy properties

When you create a workload policy, you specify the following properties:

  • Workload type (type): this field defines the high-level goal of your cluster. You can only specify HIGH_THROUGHPUT, which instructs Compute Engine to place compute instances as close together as possible to speed up communication.

  • Depending on the machine series that the compute instances in your MIG use, you can optionally specify one of the following properties:

    • Accelerator topology (acceleratorTopology): this property helps you achieve high performance for distributed workloads that run across multiple A4X Max or A4X instances by using a specialized inter-accelerator network configuration. For more information, see Accelerator topology property.

    • Maximum topology distance (maxTopologyDistance): this property defines the strictest physical boundary for creating your A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances, such as the same block or sub-block. If Compute Engine can't meet this limit due to resource availability errors, then it creates only some, or none, of your requested compute instances. For more information, see Maximum topology distance property.

Accelerator topology property

To achieve large-scale, non-blocking network performance, Compute Engine organizes A4X Max and A4X instances into a physical hierarchy of blocks and sub-blocks.

To create a MIG with A4X Max or A4X instances, you must apply a workload policy to the MIG that specifies the accelerator topology ( acceleratorTopology ) property. This property defines the physical network configuration for a slice of compute instances. A slice acts as a single, massive accelerator that provides maximum throughput for your distributed AI or ML workloads.

The accelerator topology property supports the following value:

  • 1x72: Compute Engine organizes compute instances into densely allocated sub-blocks of 18 compute instances, totaling 72 GPUs. Because each sub-block requires its own MIG, you can create a maximum of 18 compute instances per MIG. A full block consists of 25 MIGs, totaling 450 compute instances. Supported machine series: A4X Max and A4X.

For more information about A4X Max and A4X instances, see The A4X Max and A4X machine series.
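You specify the accelerator topology when you create the workload policy itself. The following is a minimal sketch; the policy name and region are placeholders, and the flag names are assumptions to confirm against your gcloud CLI version:

```shell
# Sketch: create a workload policy for A4X Max or A4X instances.
# my-a4x-policy and the region are placeholders; verify the flag
# names in your gcloud CLI version.
gcloud compute resource-policies create workload-policy my-a4x-policy \
    --region=us-central1 \
    --type=HIGH_THROUGHPUT \
    --accelerator-topology=1x72
```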

Maximum topology distance property

When you create a workload policy and apply it to a MIG, Compute Engine makes best-effort attempts to create your compute instances close together. If you require maximum compactness in a zone, then we recommend that you specify the maximum topology distance (maxTopologyDistance) property. This property instructs Compute Engine to create your A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), or H4D instances in the same cluster, block, or sub-block.

The maximum topology distance property supports the following values:

  • Unspecified (not recommended): Compute Engine makes best-effort attempts to place the compute instances as close to each other as possible, but with no maximum distance guarantee among compute instances in a zone. Supported machine series: A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), and H4D. Maximum number of compute instances: 1,500.

  • CLUSTER: Compute Engine creates compute instances in adjacent blocks within a cluster. Supported machine series: A4 and H4D. Maximum number of compute instances: 1,500.

  • BLOCK: Compute Engine creates compute instances in the same block. Supported machine series: A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), and H4D. Maximum number of compute instances: 150 for A4 or H4D; 256 for A3 Ultra, A3 Mega, or A3 High (8 GPUs).

  • SUBBLOCK: Compute Engine creates compute instances in the same sub-block, minimizing network latency as much as possible. Supported machine series: A4, A3 Ultra, and H4D. Maximum number of compute instances: 22.
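You set a maximum topology distance value the same way as the other workload policy properties, at policy creation time. The following sketch restricts placement to a single block; the policy name and region are placeholders, and the flag names are assumptions to confirm against your gcloud CLI version:

```shell
# Sketch: create a workload policy that confines instances to one block.
# my-block-policy and the region are placeholders; verify the flag
# names in your gcloud CLI version.
gcloud compute resource-policies create workload-policy my-block-policy \
    --region=us-central1 \
    --type=HIGH_THROUGHPUT \
    --max-topology-distance=BLOCK
```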

Pricing

There are no additional costs associated with creating, deleting, or applying workload policies to a MIG.

Limitations

For workload policies in MIGs, the following limitations apply:

  • You can apply a workload policy to an existing MIG, or change its workload policy, only if the MIG contains no compute instances.

  • You can only apply workload policies to MIGs with compute instances that use the following combinations of machine series and provisioning models:

    • Flex-start: A4, A3 Ultra, and H4D
    • Spot: A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), and H4D
    • Reservation-bound: A4X Max, A4X, A4, A3 Ultra, A3 Mega, A3 High (8 GPUs), and H4D
  • You can apply workload policies to regional MIGs only if the MIGs use the following distribution target shapes:

    • For MIGs with a target size: ANY or ANY_SINGLE_ZONE

    • For MIG resize requests: ANY_SINGLE_ZONE

  • You can't update a workload policy after you create it.

  • You can't configure a second instance template if your MIG uses a workload policy.

  • You can't use workload policies together with placement policies.

What's next
