Create TPU Flex-start VMs with Compute Engine
TPU Flex-start VMs, powered by Dynamic Workload Scheduler, offer a flexible, cost-effective way to access TPU resources for AI workloads for up to 7 days without long-term reservations. When you request TPU Flex-start VMs, your request remains in a queue until capacity is available. Once provisioned, the TPU VMs run for your specified duration.
TPU Flex-start VMs are a good fit for quick experimentation, small-scale testing, dynamic provisioning of TPUs for inference workloads, model fine-tuning, and workload runs that take less than 7 days. For more information about other TPU consumption options, see Cloud TPU consumption options.
You can delete your TPU resources at any time to stop billing. For more information about TPU pricing, see Cloud TPU pricing.
Limitations
TPU Flex-start VMs have the following limitations:
- You can request TPU Flex-start VMs for a duration of up to 7 days.
- You can request the following Cloud TPU versions and zones:
MIGs with TPUs have the following limitations:
- Lifecycle operations: You can't stop, start, resume, or suspend TPU instances. To change configurations that require a restart or to stop incurring charges, you must delete the instances.
- Regional MIG zone distribution: You must set the target distribution shape to ANY_SINGLE_ZONE.
- Configuration updates in a MIG:
  - You can't update a MIG that forms a multi-host TPU slice due to the defined accelerator topology.
  - You can update a MIG that forms single-host TPU slices by using the automatic or selective methods. However, updates for single-host TPU slices don't support the restart (RESTART) action. If a restart is necessary and the most disruptive action allowed is replace (REPLACE), then the updater replaces the instance; otherwise, the update attempt fails with an error.
- For a MIG that forms a multi-host TPU slice, the following limitations also apply:
  - Target size policy: You must set the target size policy mode to BULK. After you set this mode, you can't change it.
  - Target size: In bulk mode, you can set the target size to either 0 or the number of instances that are required to form the accelerator topology.
  - Workload policy: You must specify a workload policy in which the accelerator topology is defined. After you set the workload policy, you can't change or remove the policy from the MIG.
- Unsupported features: MIGs with TPUs don't support the following features:
  - Instance flexibility
  - Resize requests to obtain resources all at once
  - Stateful configuration
  - For a MIG that forms a multi-host TPU slice, the following are also not supported:
Before you begin
Before requesting TPU Flex-start VMs, you must:
- Install the Google Cloud CLI
- Create a Google Cloud project
- Enable the Compute Engine API (compute.googleapis.com)
- Ensure you have the required permissions:
  - roles/compute.instanceAdmin.v1
  - roles/iam.serviceAccountUser
For more information, see Set up a Google Cloud project for TPUs.
Ensure you have sufficient preemptible quota to use TPU Flex-start VMs. If your workload requires more cores than your current allocation, you can request a quota increase. For details, see Cloud TPU quotas.
Create TPU Flex-start VMs with MIGs
To use TPU Flex-start VMs, you create a managed instance group (MIG) with a specific instance template configuration.
For general instructions on creating Flex-start VMs, see Create Flex-start VMs.
Create TPU Flex-start VMs with a multi-host slice
Create an instance template
Create an instance template specifying the FLEX_START provisioning model and your chosen run duration.
gcloud compute instance-templates create TEMPLATE_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --provisioning-model=FLEX_START \
    --instance-termination-action=DELETE \
    --max-run-duration=DURATION \
    --region=REGION \
    --maintenance-policy=TERMINATE
Replace the following placeholders:
- TEMPLATE_NAME: The name of your instance template.
- MACHINE_TYPE: The machine type for the TPU VM (for example, ct6e-standard-8t).
- IMAGE_FAMILY: The OS image family for the TPU VM (for example, ubuntu-accelerator-2204-amd64-with-tpu-v6e).
- IMAGE_PROJECT: The OS image project for the TPU VM (for example, ubuntu-os-accelerator-images).
- DURATION: The maximum run duration (for example, 7d for 7 days).
- REGION: The region in which to create the instance template.
Create a workload policy
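The following command creates a workload policy that defines the accelerator topology. A workload policy is required for a MIG that forms a multi-host slice; it is optional only for single-host slices.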
gcloud compute resource-policies create workload WORKLOAD_POLICY_NAME \
    --type=high-throughput \
    --accelerator-topology=TOPOLOGY
Replace the following placeholders:
- WORKLOAD_POLICY_NAME: The name of your workload policy.
- TOPOLOGY: The topology of the TPU VMs (for example, 4x4x8).
Create the MIG
Create the MIG using the template.
gcloud compute instance-groups managed create MIG_NAME \
    --zone=ZONE \
    --template=TEMPLATE_NAME \
    --size=SIZE \
    --workload-policy=projects/PROJECT_ID/regions/WORKLOAD_POLICY_REGION/resourcePolicies/WORKLOAD_POLICY_NAME \
    --target-size-policy-mode=bulk
Replace the following placeholders:
- MIG_NAME: The name of your MIG.
- ZONE: The zone of your MIG.
- TEMPLATE_NAME: The name of your instance template.
- SIZE: The number of instances to create.
- PROJECT_ID: The ID of your Google Cloud project.
- WORKLOAD_POLICY_REGION: The region where the workload policy is defined.
- WORKLOAD_POLICY_NAME: The name of your workload policy.
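In bulk mode, SIZE must be either 0 or the number of instances required to form the accelerator topology. The following is a minimal sketch of how that number might be worked out. It assumes that the topology dimensions multiply to the total chip count and that every VM hosts the same number of chips. The helper `slice_vm_count` and the chips-per-VM value are hypothetical; check the chip count for your machine type before relying on the result.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate how many VMs are needed to form a topology.
# chips_per_vm is an assumption that depends on the machine type you chose.
slice_vm_count() {
  local topology="$1" chips_per_vm="$2"
  local chips=1 dim
  local -a dims

  [[ "$chips_per_vm" =~ ^[1-9][0-9]*$ ]] || { echo "bad chips per VM: $chips_per_vm" >&2; return 1; }

  # Split "4x4x8" into its dimensions and multiply them to get the chip count.
  IFS='x' read -r -a dims <<< "$topology"
  (( ${#dims[@]} > 0 )) || { echo "empty topology" >&2; return 1; }
  for dim in "${dims[@]}"; do
    [[ "$dim" =~ ^[1-9][0-9]*$ ]] || { echo "bad topology: $topology" >&2; return 1; }
    chips=$((chips * dim))
  done

  # The slice must be made of a whole number of VMs.
  if (( chips % chips_per_vm != 0 )); then
    echo "topology $topology is not a whole number of ${chips_per_vm}-chip VMs" >&2
    return 1
  fi
  echo $((chips / chips_per_vm))
}

slice_vm_count 4x4x8 4   # prints 32
```

Under these assumptions, a 4x4x8 topology has 128 chips, so VMs with 4 chips each would give a SIZE of 32.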
Create TPU Flex-start VMs with single-host slices
Create an instance template
Create an instance template specifying the FLEX_START provisioning model and your chosen run duration.
gcloud compute instance-templates create TEMPLATE_NAME \
    --machine-type=MACHINE_TYPE \
    --image-family=IMAGE_FAMILY \
    --image-project=IMAGE_PROJECT \
    --provisioning-model=FLEX_START \
    --instance-termination-action=DELETE \
    --max-run-duration=DURATION \
    --region=REGION \
    --maintenance-policy=TERMINATE
Replace the following placeholders:
- TEMPLATE_NAME: The name of your instance template.
- MACHINE_TYPE: The machine type for the TPU VM (for example, ct6e-standard-8t).
- IMAGE_FAMILY: The OS image family for the TPU VM (for example, ubuntu-accelerator-2204-amd64-with-tpu-v6e).
- IMAGE_PROJECT: The OS image project for the TPU VM (for example, ubuntu-os-accelerator-images).
- DURATION: The maximum run duration (for example, 7d for 7 days).
- REGION: The region in which to create the instance template.
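Because Flex-start requests are limited to 7 days, it can help to validate the run duration before creating the template. The following sketch converts a whole number of hours into a DURATION value and rejects anything above 7 days (168 hours). The helper name `max_run_duration` is hypothetical, and the sketch checks only the upper limit stated in this document.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a whole number of hours into a --max-run-duration
# value, rejecting anything above the 7-day Flex-start limit (7 * 24 = 168 hours).
max_run_duration() {
  local hours="$1"
  local limit=$((7 * 24))

  # Accept only positive whole numbers; 10# forces base 10 for inputs like "08".
  if ! [[ "$hours" =~ ^[0-9]+$ ]] || (( 10#$hours < 1 )); then
    echo "duration must be a positive whole number of hours" >&2
    return 1
  fi
  if (( 10#$hours > limit )); then
    echo "duration ${hours}h exceeds the 7-day limit" >&2
    return 1
  fi
  echo "$((10#$hours))h"
}

max_run_duration 36   # prints 36h
```

The printed value can be passed as DURATION, for example `--max-run-duration="$(max_run_duration 36)"`.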
Create a workload policy
The following command creates a workload policy. This is optional for single-host slices.
gcloud compute resource-policies create workload WORKLOAD_POLICY_NAME \
    --type=high-throughput
Replace the following placeholders:
- WORKLOAD_POLICY_NAME: A name for your workload policy.
Create the MIG
Create the MIG using the template.
gcloud compute instance-groups managed create MIG_NAME \
    --zone=ZONE \
    --template=TEMPLATE_NAME \
    --size=SIZE \
    --workload-policy=projects/PROJECT_ID/regions/WORKLOAD_POLICY_REGION/resourcePolicies/WORKLOAD_POLICY_NAME
Replace the following placeholders:
- MIG_NAME: The name of your MIG.
- ZONE: The zone of your MIG.
- TEMPLATE_NAME: The name of your instance template.
- SIZE: The number of instances to create.
- PROJECT_ID: The ID of your Google Cloud project.
- WORKLOAD_POLICY_REGION: The region where the workload policy is defined.
- WORKLOAD_POLICY_NAME: The name of your workload policy.
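The --workload-policy value is a full resource path built from three of the placeholders above. The following sketch shows one way to assemble it. It assumes that the policy lives in the region containing the MIG's zone and that the region name is the zone name without its final suffix (for example, us-east5-a becomes us-east5). The helper `workload_policy_path` and the sample project, zone, and policy names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the --workload-policy value for a MIG.
# Assumes the policy lives in the region that contains the MIG's zone, and that
# the region name is the zone name without its final "-<letter>" suffix.
workload_policy_path() {
  local project="$1" zone="$2" policy="$3"
  local region="${zone%-*}"   # strip the shortest trailing "-..." segment
  echo "projects/${project}/regions/${region}/resourcePolicies/${policy}"
}

workload_policy_path my-project us-east5-a my-policy
# prints projects/my-project/regions/us-east5/resourcePolicies/my-policy
```

If your workload policy was created in a different region, pass WORKLOAD_POLICY_REGION explicitly instead of deriving it from the zone.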

