This document explains how to provision TPU node pools and schedule dynamic slices in Google Kubernetes Engine (GKE) using Kueue and Topology Aware Scheduling (TAS).
Before following these instructions, ensure that you understand the concepts of dynamic slicing.
Requirements
To use dynamic slicing in GKE, you must meet the following requirements:
- Use a Standard cluster in version 1.35.2-gke.1842000 or later, in the Rapid channel.
- Use the Ironwood (TPU7x) TPU version.
- Use the Container-Optimized OS image for your nodes.
- To use incremental provisioning, use All Capacity mode reservations. All Capacity mode is a feature enabled by TPU Cluster Director.
Before you begin
Before you start, make sure that you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the `gcloud components update` command. Earlier gcloud CLI versions might not support running the commands in this document.
- Ensure that you have an existing Standard cluster in version 1.35.2-gke.1842000 or later, in the Rapid channel. To create a new cluster, see Creating a regional cluster.
- Ensure you have sufficient quota for Ironwood (TPU7x) in your region.
- If you plan to run multislice workloads, install JobSet v0.10.1 or later.
- Request TPU capacity in All Capacity mode.
Use dynamic slicing in GKE with Kueue
This section describes the workflow for using dynamic slicing in GKE.
- View the topology and health status of All Capacity mode reservations.
- Enable the slice controller in your cluster.
- Create TPU node pools.
- Configure Kueue to create a Slice custom resource.
- Run workloads on dynamic slicing with Kueue.
- Clean up.
Enable the slice controller
To use dynamic slicing, enable the slice controller in your cluster.
1. Update your cluster:

```
gcloud container clusters update CLUSTER_NAME \
    --location=LOCATION \
    --enable-slice-controller
```

Replace the following:

- `CLUSTER_NAME`: the name of your cluster.
- `LOCATION`: the region with your available TPU capacity.

2. Get credentials so that you can communicate with your cluster with `kubectl` commands:

```
gcloud config set container/cluster CLUSTER_NAME
gcloud container clusters get-credentials CLUSTER_NAME \
    --location=LOCATION
```

3. In the output of the following command, verify that the `slices.accelerator.gke.io` value is present:

```
kubectl get crd slices.accelerator.gke.io
```

The output is similar to the following:

```
slices.accelerator.gke.io 2026-01-09T23:58:02Z
```
Create node pools with incremental provisioning
This section describes how to create TPU node pools with incremental provisioning. GKE converts all your TPU capacity into node pools of 16-node groups of TPU VMs, called sub-blocks. GKE provisions these node pools even when it can't find 16 healthy VMs: it places nodes on healthy parts of the host machine and incrementally provisions the unhealthy machines as they are repaired.
You can target your node pool to belong to any of the following:
- A specific block of TPUs, which is exposed in All Capacity mode reservations. Block targeting allows GKE to create the node pool in any available sub-block within the specified block.
- A specific sub-block, which is a 16-node group of TPU VMs, for more granular control.
Create a workload policy
To create a TPU slice node pool with Ironwood (TPU7x), you must first create a workload policy with the `accelerator-topology-mode` field set to `provision_only`. This setting triggers the incremental provisioning process.
Create a workload policy:

```
gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
    --project=PROJECT_ID \
    --region=REGION \
    --type=HIGH_THROUGHPUT \
    --accelerator-topology=4x4x4 \
    --accelerator-topology-mode=provision_only
```
Replace the following:

- `WORKLOAD_POLICY_NAME`: a name for your workload policy.
- `PROJECT_ID`: your Google Cloud project ID.
- `REGION`: the region for the workload policy.

In this command, do the following:

- Always set the `accelerator-topology` field to `4x4x4` to match the total number of chips within a single sub-block.
- Always set the `accelerator-topology-mode` field to `provision_only` to ensure that the incremental provisioning process is triggered. When the `provision_only` field is set, the node pool provisions TPU nodes without forming ICI or OCS links.
Target your node pool to belong to a block or a sub-block
You can target specific sub-blocks or blocks within your All Capacity mode reservation.
- Target a block: each node pool uses capacity from a specified block. GKE places the node pool within an available sub-block in that block. You must create as many node pools as there are sub-blocks in the block you want to use.
- Target a sub-block: each node pool maps to a specific and available sub-block. When using sub-block targeting, GKE creates the node pool if at least one VM is healthy. Incremental provisioning helps ensure that all nodes are placed within the specified sub-block.
Block
1. To retrieve the name of the block in a reservation and the count of available sub-blocks in the block, complete the following steps in the View the topology and health status of All Capacity Mode reservations document:

   - Identify the name of the block by listing all reservation blocks and copying the value in the `name:` field. This value is the name of the block, or `BLOCK_NAME` in this document.
   - Determine how many node pools to create by describing a reservation block and identifying the value in the `reservationSubBlockCount` field. This value is the number of sub-blocks available. For example, the `reservationSubBlockCount: 4` value indicates that the block has four sub-blocks available, and you need to create four separate node pools.

2. Set the reservation path:

```
export RESERVATION_PATH="projects/PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/BLOCK_NAME"
```

Replace the following:

- `RESERVATION_NAME`: the name of your TPU reservation.
- `BLOCK_NAME`: the name of the block.

3. Create a node pool for each sub-block identified in the preceding step. For example, if the count is `4`, run this command four times. Use a unique name for each node pool.

```
gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-locations=ZONE \
    --machine-type=tpu7x-standard-4t \
    --num-nodes=16 \
    --placement-policy=WORKLOAD_POLICY_NAME \
    --reservation-affinity=specific \
    --reservation=${RESERVATION_PATH}
```

Replace the following:

- `NODE_POOL_NAME`: the name of your new node pool.
- `CLUSTER_NAME`: the name of your GKE cluster.
- `ZONE`: the zone for the node pool, for example, `us-central1-a`.
- `WORKLOAD_POLICY_NAME`: the name of the workload policy you created.
Sub-block
1. To retrieve the name of the block and the IDs of the available sub-blocks, complete the following steps in the View the topology and health status of All Capacity Mode reservations document:

   - To identify the name of the block, list all reservation blocks and copy the value in the `name:` field. This value is the name of the block, or `BLOCK_NAME` in this document.
   - To identify the name of the sub-blocks, list all sub-blocks of a block and copy the value in the `name:` field for each entry under `reservationSubBlocks`. This value is the name of the sub-block, or `SUBBLOCK_NAME` in this document.

2. Set the reservation path:

```
export RESERVATION_PATH="projects/PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/BLOCK_NAME/reservationSubBlocks/SUBBLOCK_NAME"
```

Replace the following:

- `RESERVATION_NAME`: the name of your TPU reservation.
- `BLOCK_NAME`: the name of the block.
- `SUBBLOCK_NAME`: the name of the sub-block.

3. Create the node pool:

```
gcloud container node-pools create NODE_POOL_NAME \
    --project=PROJECT_ID \
    --cluster=CLUSTER_NAME \
    --node-locations=ZONE \
    --machine-type=tpu7x-standard-4t \
    --num-nodes=16 \
    --placement-policy=WORKLOAD_POLICY_NAME \
    --reservation-affinity=specific \
    --reservation=${RESERVATION_PATH}
```

Replace the following:

- `NODE_POOL_NAME`: a unique name for your new node pool, for example, `sub-block-pool-1`.
- `PROJECT_ID`: your Google Cloud project ID.
- `CLUSTER_NAME`: the name of your GKE cluster.
- `ZONE`: the zone for the node pool, for example, `us-central2-b`.
- `WORKLOAD_POLICY_NAME`: the name of the workload policy you created.
At this stage, the nodes are created, but their Inter-Chip Interconnect (ICI) links are not yet active. Therefore, you can't run workloads on these node pools directly.
To enable all the necessary ICI links to form the slice and allow workloads to be scheduled, create a dynamic slice by using one of the following methods:
- Create a Slice custom resource. Instead of Pods, you use a Slice custom resource to define the specified topology, which the slice controller activates.
- Schedule GKE workloads with Kueue and TAS. Kueue automatically handles the creation and deletion of Slice custom resources. Avoid manually modifying Slice custom resources created by Kueue.
Create a dynamic slice with Kueue and TAS
In this section, you schedule GKE workloads with Kueue and TAS.
Install JobSet and Kueue resources for dynamic slicing
1. Install JobSet:

```
helm install jobset oci://registry.k8s.io/jobset/charts/jobset \
    --version 0.11.1 \
    --namespace jobset-system \
    --create-namespace \
    --set controller.resources.requests.cpu=4 \
    --set controller.resources.requests.memory=16Gi
```

2. Install Kueue:

```
helm install kueue oci://registry.k8s.io/kueue/charts/kueue \
    --version 0.16.6 \
    --namespace kueue-system \
    --create-namespace \
    --wait \
    --set controllerManager.replicas=3 \
    --set controllerManager.manager.resources.requests.cpu=16 \
    --set controllerManager.manager.resources.requests.memory=64Gi
```

3. To install the Kueue slice controller, save the following manifest as `slice-controller.yaml`:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-controller-manager
  namespace: slice-controller-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-leader-election-role
  namespace: slice-controller-system
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-manager-role
rules:
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
  - update
  - watch
- apiGroups:
  - ""
  resources:
  - nodes
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
  - list
  - update
  - watch
- apiGroups:
  - accelerator.gke.io
  resources:
  - slices
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - accelerator.gke.io
  resources:
  - slices/finalizers
  verbs:
  - update
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - mutatingwebhookconfigurations
  verbs:
  - get
  - list
  - update
  - watch
- apiGroups:
  - batch
  resources:
  - jobs
  verbs:
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - jobset.x-k8s.io
  resources:
  - jobsets
  verbs:
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - leaderworkerset.x-k8s.io
  resources:
  - leaderworkersets
  verbs:
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - kueue.x-k8s.io
  resources:
  - admissionchecks
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - kueue.x-k8s.io
  resources:
  - admissionchecks/status
  - workloads/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - kueue.x-k8s.io
  resources:
  - workloads
  verbs:
  - create
  - get
  - list
  - patch
  - update
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-metrics-auth-role
rules:
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-metrics-reader
rules:
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-leader-election-rolebinding
  namespace: slice-controller-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: slice-controller-leader-election-role
subjects:
- kind: ServiceAccount
  name: slice-controller-controller-manager
  namespace: slice-controller-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-manager-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: slice-controller-manager-role
subjects:
- kind: ServiceAccount
  name: slice-controller-controller-manager
  namespace: slice-controller-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-metrics-auth-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: slice-controller-metrics-auth-role
subjects:
- kind: ServiceAccount
  name: slice-controller-controller-manager
  namespace: slice-controller-system
---
apiVersion: v1
kind: Secret
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-webhook-server-cert
  namespace: slice-controller-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-controller-manager-metrics-service
  namespace: slice-controller-system
spec:
  ports:
  - name: https
    port: 8443
    protocol: TCP
    targetPort: 8443
  selector:
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
---
apiVersion: v1
kind: Service
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-webhook-service
  namespace: slice-controller-system
spec:
  ports:
  - port: 443
    protocol: TCP
    targetPort: 9443
  selector:
    control-plane: controller-manager
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-controller-manager
  namespace: slice-controller-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: slice-controller
      control-plane: controller-manager
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: manager
      labels:
        app.kubernetes.io/name: slice-controller
        control-plane: controller-manager
    spec:
      containers:
      - args:
        - --metrics-bind-address=:8443
        - --leader-elect
        - --health-probe-bind-address=:8081
        - --zap-log-level=3
        - --feature-gates=UseRetryMechanismForSliceCreation=true
        - --activation-timeout=6m
        command:
        - /manager
        image: tpuongke/kueue-slice-controller:latest
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8081
          initialDelaySeconds: 15
          periodSeconds: 20
        name: manager
        ports:
        - containerPort: 9443
          name: webhook-server
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /readyz
            port: 8081
          initialDelaySeconds: 5
          periodSeconds: 10
        resources:
          limits:
            cpu: 12000m
            memory: 32Gi
          requests:
            cpu: 8000m
            memory: 16Gi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - mountPath: /tmp/k8s-webhook-server/serving-certs
          name: cert
          readOnly: true
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      serviceAccountName: slice-controller-controller-manager
      terminationGracePeriodSeconds: 10
      volumes:
      - name: cert
        secret:
          defaultMode: 420
          secretName: slice-controller-webhook-server-cert
---
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-mutating-webhook-configuration
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: slice-controller-webhook-service
      namespace: slice-controller-system
      path: /mutate-batch-v1-job
  failurePolicy: Fail
  name: mjob.kb.io
  rules:
  - apiGroups:
    - batch
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - jobs
  sideEffects: None
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: slice-controller-webhook-service
      namespace: slice-controller-system
      path: /mutate-jobset-x-k8s-io-v1alpha2-jobset
  failurePolicy: Fail
  name: mjobset.kb.io
  rules:
  - apiGroups:
    - jobset.x-k8s.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    resources:
    - jobsets
  sideEffects: None
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: slice-controller-webhook-service
      namespace: slice-controller-system
      path: /mutate-leaderworkerset-x-k8s-io-v1-leaderworkerset
  failurePolicy: Fail
  name: mleaderworkerset.kb.io
  rules:
  - apiGroups:
    - leaderworkerset.x-k8s.io
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - leaderworkersets
  sideEffects: None
```

4. Apply the `slice-controller.yaml` manifest:

```
kubectl apply -f slice-controller.yaml
```

5. To configure Kueue for dynamic slicing, save the following manifest as `dynamic-slice-topology.yaml`:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: Topology
metadata:
  name: superslice-topology
spec:
  levels:
  # Label to identify the physical block a sub-block belongs to.
  # Only sub-blocks from the same block can form a slice.
  - nodeLabel: cloud.google.com/gce-topology-block
  # Label to identify individual TPU sub-blocks (4x4x4 topology).
  - nodeLabel: cloud.google.com/gke-tpu-partition-4x4x4-id
  # Standard Kubernetes label for individual nodes.
  # Required to assign Pods to specific VMs.
  - nodeLabel: kubernetes.io/hostname
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: superslice-rf
spec:
  nodeLabels:
    cloud.google.com/gke-tpu-accelerator: tpu7x
  topologyName: superslice-topology
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: AdmissionCheck
metadata:
  name: superslice-ac
spec:
  controllerName: accelerator.gke.io/slice
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cq
spec:
  namespaceSelector: {}
  admissionChecks:
  - superslice-ac
  resourceGroups:
  - coveredResources:
    - google.com/tpu
    flavors:
    - name: superslice-rf
      resources:
      - name: google.com/tpu
        nominalQuota: "999999" # modeling unlimited quota
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: lq
  namespace: default
spec:
  clusterQueue: cq
```

6. Apply the `dynamic-slice-topology.yaml` manifest:

```
kubectl apply -f dynamic-slice-topology.yaml
```

In this manifest, you configure Kueue for dynamic slicing by defining the following resources:

- Ironwood (TPU7x) dynamic slice topology (`superslice-topology`): the topology defines the levels that Kueue considers when it schedules dynamic slicing workloads. These levels are the following:
  - `cloud.google.com/gce-topology-block` label: this level is required to understand which sub-blocks belong to which blocks, because only sub-blocks from the same block can form a slice.
  - `cloud.google.com/gke-tpu-partition-4x4x4-id` label: this level represents individual Ironwood (TPU7x) sub-blocks (`4x4x4` topology).
  - `kubernetes.io/hostname` label: this level is required to assign Pods to specific VMs and to observe their labels and taints.
- Ironwood (TPU7x) SuperSlice ResourceFlavor (`superslice-rf`): the resource flavor for Ironwood (TPU7x) sub-blocks includes the `cloud.google.com/gke-tpu-accelerator: tpu7x` label to match nodes with Ironwood (TPU7x) machines.
- SuperSlice AdmissionCheck (`superslice-ac`): this admission check tells Kueue not to schedule a workload until the GKE slice controller confirms that the slice has become active. The admission check is first defined and then added to the ClusterQueue that handles dynamic slicing workloads.
- ClusterQueue (`cq`) and LocalQueue (`lq`): these resources manage `google.com/tpu` resources. The `cq` ClusterQueue includes the `superslice-ac` admission check. The `nominalQuota` for `google.com/tpu` can be configured in two ways:
  - Specific quota: set `nominalQuota` to match existing capacity for fair-sharing and quota management.
  - Unlimited quota: set `nominalQuota` to a very high value, such as `"999999"`, to model unlimited quota. To focus on TAS and dynamic slicing, this configuration bypasses Kueue's quota management functionality.
Define the sub-block health selection
Beyond standard node health and readiness, GKE exposes the specific state of each sub-block by using the `cloud.google.com/gke-tpu-partition-4x4x4-state` label. This label lets GKE account for factors that influence slice formation, such as the state of TPU links.

The `cloud.google.com/gke-tpu-partition-4x4x4-state` label can have the following values:
- `HEALTHY`: the infrastructure is healthy.
- `DEGRADED`: the sub-block's infrastructure is in a degraded state, for example, because of OCS link degradation. The sub-block can still form a slice, but overall performance might be lower compared to healthy sub-blocks.
- `UNHEALTHY`: the sub-block is unhealthy and can't form a slice.
The Kueue slice controller webhook checks whether a workload includes a specific sub-block health requirement. If no preference is indicated, the webhook injects a default node affinity. The behavior is as follows:
- If a `nodeSelector` or `nodeAffinity` that targets the `cloud.google.com/gke-tpu-partition-4x4x4-state` label is present, it remains unchanged.
- If no such label configuration exists, the webhook injects the following default node affinity to ensure that only available sub-blocks are used:

```yaml
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: cloud.google.com/gke-tpu-partition-4x4x4-state
        operator: In
        values:
        - "HEALTHY"
        - "DEGRADED"
```
The following section includes examples where the `cloud.google.com/gke-tpu-partition-4x4x4-state` label is configured to specify the different sub-block health configurations.
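The webhook's defaulting behavior described above can be sketched in a few lines of Python. This is an illustrative model only, operating on plain dictionaries shaped like a Pod spec; the function names (`targets_state_label`, `inject_default_affinity`) are hypothetical and this is not the controller's actual code.

```python
# Illustrative sketch of the webhook defaulting logic (hypothetical helpers,
# not the real controller implementation).

STATE_LABEL = "cloud.google.com/gke-tpu-partition-4x4x4-state"

# Default term injected by the webhook: only sub-blocks that can still form
# a slice (HEALTHY or DEGRADED) are eligible.
DEFAULT_AFFINITY_TERM = {
    "matchExpressions": [{
        "key": STATE_LABEL,
        "operator": "In",
        "values": ["HEALTHY", "DEGRADED"],
    }]
}

def targets_state_label(pod_spec: dict) -> bool:
    """Return True if the Pod already constrains the sub-block state label."""
    if STATE_LABEL in pod_spec.get("nodeSelector", {}):
        return True
    terms = (pod_spec.get("affinity", {})
             .get("nodeAffinity", {})
             .get("requiredDuringSchedulingIgnoredDuringExecution", {})
             .get("nodeSelectorTerms", []))
    return any(expr.get("key") == STATE_LABEL
               for term in terms
               for expr in term.get("matchExpressions", []))

def inject_default_affinity(pod_spec: dict) -> dict:
    """Add the default HEALTHY/DEGRADED affinity unless the Pod sets its own."""
    if targets_state_label(pod_spec):
        return pod_spec  # the user's preference remains unchanged
    required = (pod_spec.setdefault("affinity", {})
                .setdefault("nodeAffinity", {})
                .setdefault("requiredDuringSchedulingIgnoredDuringExecution", {}))
    required.setdefault("nodeSelectorTerms", []).append(DEFAULT_AFFINITY_TERM)
    return pod_spec
```

For example, a Pod that pins `cloud.google.com/gke-tpu-partition-4x4x4-state: "HEALTHY"` in its `nodeSelector` passes through unchanged, while a Pod with no state constraint receives the default `HEALTHY`/`DEGRADED` affinity.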
Run test workloads on dynamic slicing with Kueue
This section describes how to deploy workloads on dynamic slicing with Kueue and TAS. It includes examples that show how to create a dynamic slice workload and a workload consisting of multiple slices. The workloads are submitted as JobSets.
Example 1: Single workload uses a single dynamic slice
The following example describes how to create a workload that uses a slice with a `4x12x16` topology, which is composed of 12 sub-blocks. The number of Pods is calculated as (4 * 12 * 16) / 4 chips per node = 192 Pods.
1. Save the following manifest as `big-super-slice.yaml`:

```yaml
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: big-super-slice
  labels:
    kueue.x-k8s.io/queue-name: lq
spec:
  replicatedJobs:
  - name: job-jax
    replicas: 1
    template:
      spec:
        parallelism: 192 # pods per slice calculation: 4*12*16 / 4 = 192
        completions: 192
        backoffLimit: 10
        template:
          metadata:
            annotations:
              cloud.google.com/gke-tpu-slice-topology: 4x12x16
          spec:
            tolerations:
            - key: "google.com/tpu"
              operator: "Equal"
              value: "present"
              effect: "NoSchedule"
            nodeSelector:
              cloud.google.com/gke-tpu-accelerator: tpu7x
            containers:
            - name: jax
              image: python:latest
              command:
              - bash
              - -c
              - |
                printenv
                pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                python -c 'import jax; print("Global device count:", jax.device_count(), "Local device count:", jax.local_device_count())'
              resources:
                limits:
                  google.com/tpu: 4
            restartPolicy: Never
```

In this manifest, the following annotations and labels tell Kueue the slice characteristics and topology:

- `cloud.google.com/gke-tpu-slice-topology`: specifies `4x12x16` as the dynamic slice topology. Requirements for the `tpu7x` accelerator topology include the following rules:
  - The minimum topology is `4x4x4`.
  - The topology must be a three-dimensional string in the format `AxBxC`, for example, `4x8x8`.
  - Each dimension (A, B, and C) must be a multiple of four.
  - The dimensions must be sorted in non-decreasing order: A <= B <= C. For example, `4x8x4` is invalid; it should be `4x4x8`.
  - The product of the dimensions (A * B * C) must not exceed 9,216.
  - The largest supported slice topologies can include up to 32 sub-blocks. For example, `8x16x16` with 32 sub-blocks, `8x12x20` with 30 sub-blocks, or `12x12x12` with 27 sub-blocks are within the accepted limits.
- `cloud.google.com/gke-tpu-accelerator: tpu7x`: schedules Pods on VMs that run Ironwood (TPU7x).
- `kueue.x-k8s.io/queue-name`: assigns the JobSet to a Kueue LocalQueue.
- The webhook injects the default node affinity to ensure that `HEALTHY` and `DEGRADED` nodes are used.

2. Apply the `big-super-slice.yaml` manifest:

```
kubectl apply -f big-super-slice.yaml
```

After you apply the manifest, Kueue creates a JobSet named `big-super-slice`. Kueue then attempts to form a single dynamic slice with a `4x12x16` topology. After the slice is active, Kueue admits the workload, and the 192 Pods are scheduled on the nodes to form the dynamic slice that runs your workloads.
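The topology rules and the Pod-count arithmetic above can be expressed as a short validation helper. This is a sketch for illustration; `validate_topology` is a hypothetical function that encodes the documented rules, not a GKE API.

```python
# Hypothetical helper that encodes the documented tpu7x topology rules
# (illustrative only; not part of any GKE API).

def validate_topology(topology: str) -> int:
    """Validate a tpu7x slice topology string and return the Pod count."""
    dims = [int(d) for d in topology.split("x")]
    if len(dims) != 3:
        raise ValueError("topology must be a three-dimensional string, e.g. 4x8x8")
    a, b, c = dims
    if any(d < 4 or d % 4 for d in dims):
        raise ValueError("each dimension must be a multiple of four (minimum 4x4x4)")
    if not (a <= b <= c):
        raise ValueError("dimensions must be non-decreasing, e.g. 4x4x8, not 4x8x4")
    chips = a * b * c
    if chips > 9216:
        raise ValueError("the product of the dimensions must not exceed 9,216")
    if chips // 64 > 32:  # each 4x4x4 sub-block contains 64 chips
        raise ValueError("a slice can include at most 32 sub-blocks")
    # Each tpu7x-standard-4t node has 4 chips, so Pods = chips / 4.
    return chips // 4
```

For the Example 1 topology, `validate_topology("4x12x16")` returns 192, matching the `parallelism` and `completions` values in the manifest.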
Example 2: Workload with more than one replica
The following example demonstrates how to create a workload that uses two dynamic slices, each composed of four sub-blocks and targeting only `HEALTHY` nodes.
1. Save the following manifest as `two-super-slices.yaml`:

```yaml
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: two-super-slices
  labels:
    kueue.x-k8s.io/queue-name: lq
spec:
  replicatedJobs:
  - name: job-jax
    replicas: 2
    template:
      spec:
        parallelism: 64 # Pods per slice calculation: (4*8*8) / 4 = 64
        completions: 64
        backoffLimit: 10
        template:
          metadata:
            annotations:
              cloud.google.com/gke-tpu-slice-topology: 4x8x8
          spec:
            tolerations:
            - key: "google.com/tpu"
              operator: "Equal"
              value: "present"
              effect: "NoSchedule"
            nodeSelector:
              cloud.google.com/gke-tpu-accelerator: tpu7x
              cloud.google.com/gke-tpu-partition-4x4x4-state: "HEALTHY"
            containers:
            - name: jax
              image: python:latest
              command:
              - bash
              - -c
              - |
                printenv
                pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                python -c 'import jax; print("Global device count:", jax.device_count(), "Local device count:", jax.local_device_count())'
              resources:
                limits:
                  google.com/tpu: 4
            restartPolicy: Never
```

2. Apply the `two-super-slices.yaml` manifest:

```
kubectl apply -f two-super-slices.yaml
```

In this manifest, you set `replicas: 2` in the `replicatedJobs` field. After you apply the manifest, Kueue attempts to form two separate slices with a `4x8x8` topology. Kueue creates a dynamic slice for each replica defined in `jobset.spec.replicatedJobs[].replicas`. If `n` replicas are specified, Kueue creates `n` dynamic slices for the workload and waits for all slices to become active before admitting the workload.
Monitor the slice
You can see the status of the slice and monitor the slice metrics with GKE system metrics.
Monitor the status of the slice
To check the status of your dynamic slices, run the following command:
```
kubectl describe slice SLICE_NAME
```

Replace `SLICE_NAME` with the name of your slice. The slice name is typically derived from the JobSet name and replica index. For Example 1, a slice created by Kueue would have a name similar to `default-jobset-big-super-slice-yyyyy-job-jax-0`.
The output is similar to the following:

```
Name:         test-slice
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  accelerator.gke.io/v1beta1
Kind:         Slice
Metadata:
  Creation Timestamp:  2026-02-12T23:44:28Z
  Finalizers:
    accelerator.gke.io/slice-finalizer
  Generation:        1
  Resource Version:  1770939905695871008
  UID:               6dbbfe14-4486-4462-864d-e078d0ca8b5b
Spec:
  Partition Ids:
    5eae6a4f59d59cf30a9bf49de618eb2b
  Topology:  4x4x4
  Type:      tpu7x
Status:
  Conditions:
    Last Transition Time:  2026-02-12T23:45:05Z
    Message:
    Reason:                ACTIVE
    Status:                True
    Type:                  Ready
    Last Transition Time:  2026-02-12T23:45:05Z
    Message:               NodeLabelingCompleted
    Reason:                NodeLabelIsAdded
    Status:                True
    Type:                  NodeLabeled
Events:                    <none>
```
The slice name adheres to the following rules to ensure compatibility with underlying Compute Engine resource naming conventions:

- Template: `{namespace}-jobset-{jobset.metadata.name}-kueueHash[5-character]-{jobset.spec.replicatedJobs[].name}-sliceIndex`.
- Length: the name has 49 characters or fewer. The controller appends a hyphen and an 8-character cluster hash to create Compute Engine resource names, which have a 63-character limit.
- Format: the name matches the regular expression `^[a-z]([-a-z0-9]*[a-z0-9])?$`. The name has the following characteristics:
  - Starts with a lowercase letter.
  - Only contains lowercase letters, numbers, and hyphens (-).
  - Ends with a lowercase letter or a number (it can't end with a hyphen).
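These naming rules can be checked with a small helper. `is_valid_slice_name` below is a hypothetical function for illustration only; it is not part of any GKE API.

```python
# Hypothetical validator for the documented slice-name rules (illustration only).
import re

# Format rule: starts with a lowercase letter, contains only lowercase
# letters, digits, and hyphens, and does not end with a hyphen.
SLICE_NAME_RE = re.compile(r"^[a-z]([-a-z0-9]*[a-z0-9])?$")

# Length rule: 49 characters or fewer, so that a hyphen plus an 8-character
# cluster hash still fits within Compute Engine's 63-character limit.
MAX_SLICE_NAME_LEN = 49

def is_valid_slice_name(name: str) -> bool:
    """Check a dynamic slice name against the length and format rules."""
    return (len(name) <= MAX_SLICE_NAME_LEN
            and SLICE_NAME_RE.fullmatch(name) is not None)
```

For example, the Example 1 slice name `default-jobset-big-super-slice-yyyyy-job-jax-0` satisfies both rules.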
Monitor the metrics of the slice
You can monitor the following GKE system metrics that expose the condition of a slice:
- `kubernetes.io/accelerator/slice/state`
- `kubernetes.io/accelerator/partition/state`
- `kubernetes.io/accelerator/slice/deformation_durations`
- `kubernetes.io/accelerator/slice/formation_durations`
For more information about the metrics, see GKE system metrics.
Clean up
To avoid unexpected charges, delete your slices before deleting node pools.
1. Delete the JobSet. This action triggers Kueue to delete the associated Slice custom resources.

```
kubectl delete jobset JOBSET_NAME
```

Replace `JOBSET_NAME` with the name of your JobSet, for example, `big-super-slice`.

2. Delete the TPU node pool:

```
gcloud container node-pools delete NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=LOCATION
```
(Optional) Use dynamic slicing with your own scheduler
This document focuses on using Kueue and TAS. However, you can also manage dynamic slicing with your own custom scheduler. If you choose to use a different scheduler, follow the Slice custom resource reference information.
What's next
- Learn more about TPU Cluster Director.
- Learn how to manage maintenance events with TPUs in All Capacity mode.

