This document provides an overview and comparison of compact placement policies and workload policies. Both policies let you configure the placement of Compute Engine instances to minimize network latency.
By default, you manage the location of your compute instances by specifying only their zones. When you use future reservations or managed instance group (MIG) resize requests to obtain A4X, A4, A3 Ultra, A3 Mega, and A3 High (8 GPUs) machines, the compute instances that you receive are densely colocated. However, you might want to place specific compute instances closer together to optimize inter-instance performance. To place compute instances closer together, you can apply compact placement policies to compute instances or workload policies to MIGs.
Compact placement policies for compute instances
When you apply compact placement policies to standalone compute instances, or compute instances created in bulk, Compute Engine makes best-effort attempts to create the compute instances as close to each other as possible. If your application requires minimal network latency, then specify the maxDistance field (Preview) when you create a compact placement policy.
For more information, see About compact placement policies in the Compute Engine documentation.
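As a sketch of the workflow above, the following commands create a compact placement policy with a strict maximum distance and then apply it to a new instance. The resource names, region, zone, and machine type are placeholders, and because the maxDistance field is in Preview, the corresponding flag may require the beta command group in your gcloud version.

```shell
# Create a compact placement policy that requests best-effort colocation
# and enforces a strict maximum distance between instances.
gcloud beta compute resource-policies create group-placement my-compact-policy \
    --collocation=collocated \
    --max-distance=2 \
    --region=us-central1

# Apply the policy when creating an instance. The policy and the instance
# must be in the same region.
gcloud beta compute instances create my-instance \
    --zone=us-central1-a \
    --machine-type=a3-highgpu-8g \
    --resource-policies=my-compact-policy
```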
Workload policies for MIGs
When you apply workload policies to MIGs, you can specify the placement and underlying infrastructure for the compute instances in your MIGs. By using workload policies with A4X Max, A4X, A4, A3 Ultra, A3 Mega, and A3 High (8 GPUs), you can minimize network latency among your compute instances and optimize performance.
Based on the machine types that the compute instances in your MIG use, workload policies are required or optional in the following situations:
- To deploy sub-blocks of A4X Max or A4X instances, workload policies are required.
- To use A4, A3 Ultra, A3 Mega, or A3 High (8 GPUs) instances, workload policies are optional.
For more information, see About workload policies in MIGs in the Compute Engine documentation.
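The steps above can be sketched as the following commands, which create a workload policy and reference it when creating a MIG. The resource names, template, region, and zone are placeholders, and the flag spellings are assumptions based on the field names used elsewhere in this document; check your gcloud version's reference for the exact syntax.

```shell
# Create a workload policy for high-throughput, low-latency placement.
gcloud beta compute resource-policies create workload-policy my-workload-policy \
    --type=HIGH_THROUGHPUT \
    --region=us-central1

# Create a MIG that uses the workload policy. The instance template
# my-instance-template is a placeholder that you must create beforehand.
gcloud beta compute instance-groups managed create my-mig \
    --template=my-instance-template \
    --size=2 \
    --zone=us-central1-a \
    --workload-policy=my-workload-policy
```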
Comparison of compact placement policy and workload policy
The following table summarizes the differences between compact placement policies and workload policies:
| | Compact placement policy | Workload policy |
|---|---|---|
| Supported resources | Standalone instances, and instances deployed by using the Bulk API. | MIGs. For Flex-start: A4 and A3 Ultra. For Spot or reservations: A4X, A4, A3 Ultra, A3 Mega, and A3 High (8 GPUs). |
| Placement behavior | Compute Engine places the instances that use the same compact placement policy closer together. | Compute Engine places the instances in a MIG that uses a workload policy closer together. |
| Policy reuse | Use a different placement policy for each workload. Reusing a placement policy across instances that run different workloads places all of those instances together, which can make it difficult to create instances that are close together when you scale out a specific workload. | Reusing a workload policy across multiple MIGs that run different workloads places the instances in each MIG together. Reuse is ideal for large training workloads in which each group of instances must be isolated from the others. |
| Best-effort colocation | Set the `groupPlacementPolicy.collocation` field to `COLLOCATED`. | Set the `workloadPolicy.type` field to `HIGH_THROUGHPUT`. |
| Strict placement | For strict compute instance placement, specify the `maxDistance` field. For GPU families that support partitioning, such as A4X, specify the `gpuTopology` field. | For strict compute instance placement, specify the `maxTopologyDistance` field. For GPU families that support partitioning, such as A4X, specify the `acceleratorTopology` field. |
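The field names above belong to the Compute Engine resourcePolicies REST resource, so both policy types can also be created with direct API calls. The following sketch shows one possible request body for each policy type; PROJECT_ID, the region, the policy names, and the chosen field values are placeholders, and the beta API surface is assumed because some of these fields are in Preview.

```shell
# Sketch: create a compact placement policy through the resourcePolicies API,
# using best-effort colocation plus a strict maximum distance.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/us-central1/resourcePolicies" \
  -d '{
    "name": "my-compact-policy",
    "groupPlacementPolicy": {
      "collocation": "COLLOCATED",
      "maxDistance": 2
    }
  }'

# Sketch: create a workload policy with a strict topology bound for a MIG.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/regions/us-central1/resourcePolicies" \
  -d '{
    "name": "my-workload-policy",
    "workloadPolicy": {
      "type": "HIGH_THROUGHPUT",
      "maxTopologyDistance": "BLOCK"
    }
  }'
```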
Comparison of maximum distance values
A lower maximum distance value ensures closer compute instance placement, but it also increases the chance that some compute instances won't be created.
The following table shows the machine series and number of compute instances that each maximum distance value supports:
| `maxDistance` in a compact placement policy | `maxTopologyDistance` in a workload policy | Machine series and number of compute instances |
|---|---|---|
| 3 | `CLUSTER` | |
| 2 | `BLOCK` | For A4 instances: 150. For A3 Ultra, A3 Mega, and A3 High (8 GPUs) instances: 256. |
| 1 | `SUBBLOCK` | |