This document explains how to provision TPU node pools and schedule dynamic slices in Google Kubernetes Engine (GKE) using Kueue and Topology Aware Scheduling (TAS).
Before following these instructions, ensure that you understand the concepts of dynamic slicing.
Requirements
To use dynamic slicing in GKE, you must meet the following requirements:
- Use a Standard cluster in version 1.35.2-gke.1842000 or later, in the Rapid channel.
- Use the Ironwood (TPU7x) TPU version.
- Use the Container-Optimized OS image for your nodes.
- To use incremental provisioning, use All Capacity mode reservations. All Capacity mode is a feature enabled by TPU Cluster Director.
Before you begin
Before you start, make sure that you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the `gcloud components update` command. Earlier gcloud CLI versions might not support running the commands in this document.
- Ensure that you have an existing Standard cluster in version 1.35.2-gke.1842000 or later, in the Rapid channel. To create a new cluster, see Creating a regional cluster.
- Ensure you have sufficient quota for Ironwood (TPU7x) in your region.
- If you plan to run multislice workloads, install JobSet v0.10.1 or later.
- Request TPU capacity in All Capacity mode.
Use dynamic slicing in GKE with Kueue
This section describes the workflow for using dynamic slicing in GKE.
- View the topology and health status of All Capacity mode reservations.
- Enable the slice controller in your cluster.
- Create TPU node pools.
- Configure Kueue to create a Slice custom resource.
- Run workloads on dynamic slicing with Kueue.
- Clean up.
Enable the slice controller
To use dynamic slicing, enable the slice controller in your cluster.
1. Update your cluster:

```
gcloud container clusters update CLUSTER_NAME \
    --location=LOCATION \
    --enable-slice-controller
```

Replace the following:

- `CLUSTER_NAME`: the name of your cluster.
- `LOCATION`: the region with your available TPU capacity.

2. Get credentials so that you can communicate with your cluster with `kubectl` commands:

```
gcloud config set container/cluster CLUSTER_NAME
gcloud container clusters get-credentials CLUSTER_NAME \
    --location=LOCATION
```

3. In the output of the following command, verify that the `slices.accelerator.gke.io` value is present:

```
kubectl get crd slices.accelerator.gke.io
```

The output is similar to the following:

```
slices.accelerator.gke.io 2026-01-09T23:58:02Z
```
Create node pools with incremental provisioning
This section describes how to create TPU node pools with incremental provisioning. GKE converts all your TPU capacity into node pools of 16-node groups of TPU VMs, called sub-blocks. GKE provisions these node pools even when it can't find 16 healthy VMs: it places nodes on healthy parts of the host machine and incrementally provisions the unhealthy machines as they are repaired.
You can target your node pool to belong to any of the following:
- A specific block of TPUs, which is exposed in All Capacity mode reservations. Block targeting allows GKE to create the node pool in any available sub-block within the specified block.
- A specific sub-block, which is a 16-node group of TPU VMs, for more granular control.
Create a workload policy
To create a TPU slice node pool with Ironwood (TPU7x), you must first create a workload policy with the `accelerator-topology-mode` field set to `provision_only`. This setting triggers the incremental provisioning process.
Create a workload policy:

```
gcloud compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
    --project=PROJECT_ID \
    --region=REGION \
    --type=HIGH_THROUGHPUT \
    --accelerator-topology=4x4x4 \
    --accelerator-topology-mode=provision_only
```
Replace the following:

- `WORKLOAD_POLICY_NAME`: a name for your workload policy.
- `PROJECT_ID`: your Google Cloud project ID.
- `REGION`: the region for the workload policy.

In this command, do the following:

- Always set the `accelerator-topology` field to `4x4x4` to match the total number of chips within a single sub-block.
- Always set the `accelerator-topology-mode` field to `provision_only` to ensure that the incremental provisioning process is triggered. When the `provision_only` field is set, the node pool provisions TPU nodes without forming ICI or OCS links.
Target your node pool to belong to a block or a sub-block
You can target specific sub-blocks or blocks within your All Capacity mode reservation.
- Target a block: each node pool uses capacity from a specified block. GKE places the node pool within an available sub-block in that block. You must create as many node pools as there are sub-blocks in the block you want to use.
- Target a sub-block: each node pool maps to a specific and available sub-block. When using sub-block targeting, GKE creates the node pool if at least one VM is healthy. Incremental provisioning helps ensure that all nodes are placed within the specified sub-block.
Block
1. To retrieve the name of the block in a reservation and the count of available sub-blocks in the block, complete the following steps in the View the topology and health status of All Capacity Mode reservations document:

   - Identify the name of the block by listing all reservation blocks and copying the value in the `name:` field. This value is the name of the block, or `BLOCK_NAME` in this document.
   - Determine how many node pools to create by describing a reservation block and identifying the value in the `reservationSubBlockCount` field. This value is the number of sub-blocks available. For example, the `reservationSubBlockCount: 4` value indicates that the block has four sub-blocks available, and you need to create four separate node pools.

2. Set the reservation path:

```
export RESERVATION_PATH="projects/PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/BLOCK_NAME"
```

Replace the following:

- `RESERVATION_NAME`: the name of your TPU reservation.
- `BLOCK_NAME`: the name of the block.

3. Create a node pool for each sub-block identified in the preceding step. For example, if the count is `4`, run this command four times. Use a unique name for each node pool.

```
gcloud container node-pools create NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --node-locations=ZONE \
    --machine-type=tpu7x-standard-4t \
    --num-nodes=16 \
    --placement-policy=WORKLOAD_POLICY_NAME \
    --reservation-affinity=specific \
    --reservation=${RESERVATION_PATH}
```

Replace the following:

- `NODE_POOL_NAME`: the name of your new node pool.
- `CLUSTER_NAME`: the name of your GKE cluster.
- `ZONE`: the zone for the node pool, for example, `us-central1-a`.
- `WORKLOAD_POLICY_NAME`: the name of the workload policy you created.
Sub-block
1. To retrieve the name of the block and the IDs of the available sub-blocks, complete the following steps in the View the topology and health status of All Capacity Mode reservations document:

   - To identify the name of the block, list all reservation blocks and copy the value in the `name:` field. This value is the name of the block, or `BLOCK_NAME` in this document.
   - To identify the name of the sub-blocks, list all sub-blocks of a block and copy the value in the `name:` field for each entry under `reservationSubBlocks`. This value is the name of the sub-block, or `SUBBLOCK_NAME` in this document.

2. Set the reservation path:

```
export RESERVATION_PATH="projects/PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/BLOCK_NAME/reservationSubBlocks/SUBBLOCK_NAME"
```

Replace the following:

- `RESERVATION_NAME`: the name of your TPU reservation.
- `BLOCK_NAME`: the name of the block.
- `SUBBLOCK_NAME`: the name of the sub-block.

3. Create the node pool:

```
gcloud container node-pools create NODE_POOL_NAME \
    --project=PROJECT_ID \
    --cluster=CLUSTER_NAME \
    --node-locations=ZONE \
    --machine-type=tpu7x-standard-4t \
    --num-nodes=16 \
    --placement-policy=WORKLOAD_POLICY_NAME \
    --reservation-affinity=specific \
    --reservation=${RESERVATION_PATH}
```

Replace the following:

- `NODE_POOL_NAME`: a unique name for your new node pool, for example, `sub-block-pool-1`.
- `PROJECT_ID`: your Google Cloud project ID.
- `CLUSTER_NAME`: the name of your GKE cluster.
- `ZONE`: the zone for the node pool, for example, `us-central2-b`.
- `WORKLOAD_POLICY_NAME`: the name of the workload policy you created.
At this stage, the nodes are created, but their Inter-Chip Interconnect (ICI) links are not yet active. Therefore, you can't run workloads on these node pools directly.
To enable all the necessary ICI links to form the slice and allow workloads to be scheduled, create a dynamic slice by using one of the following methods:
- Create a Slice custom resource. Instead of Pods, you use a Slice custom resource to define the specified topology, which the slice controller activates.
- Schedule GKE workloads with Kueue and TAS. Kueue automatically handles the creation and deletion of Slice custom resources. Avoid manually modifying Slice custom resources created by Kueue.
Create a dynamic slice with Kueue and TAS
In this section, you schedule GKE workloads with Kueue and TAS.
Install JobSet and Kueue resources for dynamic slicing
1. Install JobSet:

```
helm install jobset oci://registry.k8s.io/jobset/charts/jobset \
    --version 0.11.1 \
    --namespace jobset-system \
    --create-namespace \
    --set controller.resources.requests.cpu=4 \
    --set controller.resources.requests.memory=16Gi
```

2. Install Kueue:

```
helm install kueue oci://registry.k8s.io/kueue/charts/kueue \
    --version 0.16.6 \
    --namespace kueue-system \
    --create-namespace \
    --wait \
    --set controllerManager.replicas=3 \
    --set controllerManager.manager.resources.requests.cpu=16 \
    --set controllerManager.manager.resources.requests.memory=64Gi
```

3. To install the Kueue slice controller, save the following manifest as `slice-controller.yaml`:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-controller-manager
  namespace: slice-controller-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-leader-election-role
  namespace: slice-controller-system
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-manager-role
rules:
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
  - update
  - watch
- apiGroups:
  - ""
  resources:
  - nodes
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
  - list
  - update
  - watch
- apiGroups:
  - accelerator.gke.io
  resources:
  - slices
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - accelerator.gke.io
  resources:
  - slices/finalizers
  verbs:
  - update
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - mutatingwebhookconfigurations
  verbs:
  - get
  - list
  - update
  - watch
- apiGroups:
  - batch
  resources:
  - jobs
  verbs:
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - jobset.x-k8s.io
  resources:
  - jobsets
  verbs:
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - leaderworkerset.x-k8s.io
  resources:
  - leaderworkersets
  verbs:
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - kueue.x-k8s.io
  resources:
  - admissionchecks
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - kueue.x-k8s.io
  resources:
  - admissionchecks/status
  - workloads/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - kueue.x-k8s.io
  resources:
  - workloads
  verbs:
  - create
  - get
  - list
  - patch
  - update
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-metrics-auth-role
rules:
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-metrics-reader
rules:
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-leader-election-rolebinding
  namespace: slice-controller-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: slice-controller-leader-election-role
subjects:
- kind: ServiceAccount
  name: slice-controller-controller-manager
  namespace: slice-controller-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-manager-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: slice-controller-manager-role
subjects:
- kind: ServiceAccount
  name: slice-controller-controller-manager
  namespace: slice-controller-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-metrics-auth-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: slice-controller-metrics-auth-role
subjects:
- kind: ServiceAccount
  name: slice-controller-controller-manager
  namespace: slice-controller-system
---
apiVersion: v1
kind: Secret
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-webhook-server-cert
  namespace: slice-controller-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-controller-manager-metrics-service
  namespace: slice-controller-system
spec:
  ports:
  - name: https
    port: 8443
    protocol: TCP
    targetPort: 8443
  selector:
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
---
apiVersion: v1
kind: Service
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-webhook-service
  namespace: slice-controller-system
spec:
  ports:
  - port: 443
    protocol: TCP
    targetPort: 9443
  selector:
    control-plane: controller-manager
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: slice-controller
    control-plane: controller-manager
  name: slice-controller-controller-manager
  namespace: slice-controller-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: slice-controller
      control-plane: controller-manager
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: manager
      labels:
        app.kubernetes.io/name: slice-controller
        control-plane: controller-manager
    spec:
      containers:
      - args:
        - --metrics-bind-address=:8443
        - --leader-elect
        - --health-probe-bind-address=:8081
        - --zap-log-level=3
        - --feature-gates=UseRetryMechanismForSliceCreation=true
        - --activation-timeout=6m
        command:
        - /manager
        image: tpuongke/kueue-slice-controller:latest
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8081
          initialDelaySeconds: 15
          periodSeconds: 20
        name: manager
        ports:
        - containerPort: 9443
          name: webhook-server
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /readyz
            port: 8081
          initialDelaySeconds: 5
          periodSeconds: 10
        resources:
          limits:
            cpu: 12000m
            memory: 32Gi
          requests:
            cpu: 8000m
            memory: 16Gi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - mountPath: /tmp/k8s-webhook-server/serving-certs
          name: cert
          readOnly: true
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      serviceAccountName: slice-controller-controller-manager
      terminationGracePeriodSeconds: 10
      volumes:
      - name: cert
        secret:
          defaultMode: 420
          secretName: slice-controller-webhook-server-cert
---
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  labels:
    control-plane: controller-manager
  name: slice-controller-mutating-webhook-configuration
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: slice-controller-webhook-service
      namespace: slice-controller-system
      path: /mutate-batch-v1-job
  failurePolicy: Fail
  name: mjob.kb.io
  rules:
  - apiGroups:
    - batch
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - jobs
  sideEffects: None
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: slice-controller-webhook-service
      namespace: slice-controller-system
      path: /mutate-jobset-x-k8s-io-v1alpha2-jobset
  failurePolicy: Fail
  name: mjobset.kb.io
  rules:
  - apiGroups:
    - jobset.x-k8s.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    resources:
    - jobsets
  sideEffects: None
- admissionReviewVersions:
  - v1
  clientConfig:
    service:
      name: slice-controller-webhook-service
      namespace: slice-controller-system
      path: /mutate-leaderworkerset-x-k8s-io-v1-leaderworkerset
  failurePolicy: Fail
  name: mleaderworkerset.kb.io
  rules:
  - apiGroups:
    - leaderworkerset.x-k8s.io
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - leaderworkersets
  sideEffects: None
```

4. Apply the `slice-controller.yaml` manifest:

```
kubectl apply -f slice-controller.yaml
```

5. To configure Kueue for dynamic slicing, save the following manifest as `dynamic-slice-topology.yaml`:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: Topology
metadata:
  name: superslice-topology
spec:
  levels:
  # Label to identify the physical block a sub-block belongs to.
  # Only sub-blocks from the same block can form a slice.
  - nodeLabel: cloud.google.com/gce-topology-block
  # Label to identify individual TPU sub-blocks (4x4x4 topology).
  - nodeLabel: cloud.google.com/gke-tpu-partition-4x4x4-id
  # Standard Kubernetes label for individual nodes.
  # Required to assign Pods to specific VMs.
  - nodeLabel: kubernetes.io/hostname
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: superslice-rf
spec:
  nodeLabels:
    cloud.google.com/gke-tpu-accelerator: tpu7x
  topologyName: superslice-topology
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: AdmissionCheck
metadata:
  name: superslice-ac
spec:
  controllerName: accelerator.gke.io/slice
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cq
spec:
  namespaceSelector: {}
  admissionChecks:
  - superslice-ac
  resourceGroups:
  - coveredResources:
    - google.com/tpu
    flavors:
    - name: superslice-rf
      resources:
      - name: google.com/tpu
        nominalQuota: "999999" # modeling unlimited quota
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: lq
  namespace: default
spec:
  clusterQueue: cq
```

6. Apply the `dynamic-slice-topology.yaml` manifest:

```
kubectl apply -f dynamic-slice-topology.yaml
```

In this manifest, you configure Kueue for dynamic slicing by defining the following resources:

- Ironwood (TPU7x) dynamic slice topology (`superslice-topology`): the topology defines the levels that Kueue considers when it schedules dynamic slicing workloads. These levels are the following:
  - `cloud.google.com/gce-topology-block` label: this level is required to understand which sub-blocks belong to which blocks, because only sub-blocks from the same block can form a slice.
  - `cloud.google.com/gke-tpu-partition-4x4x4-id` label: this level represents individual Ironwood (TPU7x) sub-blocks (`4x4x4` topology).
  - `kubernetes.io/hostname` label: this level is required to assign Pods to specific VMs and to observe their labels and taints.
- Ironwood (TPU7x) SuperSlice ResourceFlavor (`superslice-rf`): the resource flavor for Ironwood (TPU7x) sub-blocks includes the `cloud.google.com/gke-tpu-accelerator: tpu7x` label to match nodes with Ironwood (TPU7x) machines.
- SuperSlice AdmissionCheck (`superslice-ac`): this admission check tells Kueue not to schedule a workload until the GKE slice controller confirms that the slice has become active. The admission check is first defined and then added to the ClusterQueue that handles dynamic slicing workloads.
- ClusterQueue (`cq`) and LocalQueue (`lq`): these resources manage `google.com/tpu` resources. The `cq` ClusterQueue includes the `superslice-ac` admission check. The `nominalQuota` for `google.com/tpu` can be configured in two ways:
  - Specific quota: set `nominalQuota` to match existing capacity for fair-sharing and quota management.
  - Unlimited quota: set `nominalQuota` to a very high value, such as `"999999"`, to model unlimited quota. To focus on TAS and dynamic slicing, this configuration bypasses Kueue's quota management functionality.
Define the sub-block health selection
Beyond standard node health and readiness, GKE exposes the specific state of each sub-block by using the `cloud.google.com/gke-tpu-partition-4x4x4-state` label. This label lets GKE account for factors that influence slice formation, such as the state of TPU links.

The `cloud.google.com/gke-tpu-partition-4x4x4-state` label can have the following values:
- `HEALTHY`: the infrastructure is healthy.
- `DEGRADED`: the sub-block's infrastructure is in a degraded state, for example, because of OCS link degradation. The sub-block can still form a slice, but overall performance might be lower compared to healthy sub-blocks.
- `UNHEALTHY`: the sub-block is unhealthy and can't form a slice.
The Kueue slice controller webhook checks whether a workload includes a specific sub-block health requirement. If no preference is indicated, the webhook injects a default node affinity. The behavior is as follows:
- If a `nodeSelector` or `nodeAffinity` that targets the `cloud.google.com/gke-tpu-partition-4x4x4-state` label is present, it remains unchanged.
- If no such label configuration exists, the webhook injects the following default node affinity to ensure that only available sub-blocks are used:

```yaml
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: cloud.google.com/gke-tpu-partition-4x4x4-state
        operator: In
        values:
        - "HEALTHY"
        - "DEGRADED"
```
The following section includes examples where the `cloud.google.com/gke-tpu-partition-4x4x4-state` label is configured to specify the different sub-block health configurations.
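The webhook's defaulting behavior described above can be sketched in a few lines of Python. This is an illustrative model only, operating on plain dictionaries shaped like a Pod spec; the function names (`targets_state_label`, `inject_default_affinity`) are hypothetical and this is not the controller's actual code.

```python
# Illustrative sketch of the webhook defaulting logic (hypothetical helpers,
# not the real controller implementation).

STATE_LABEL = "cloud.google.com/gke-tpu-partition-4x4x4-state"

# Default term injected by the webhook: only sub-blocks that can still form
# a slice (HEALTHY or DEGRADED) are eligible.
DEFAULT_AFFINITY_TERM = {
    "matchExpressions": [{
        "key": STATE_LABEL,
        "operator": "In",
        "values": ["HEALTHY", "DEGRADED"],
    }]
}

def targets_state_label(pod_spec: dict) -> bool:
    """Return True if the Pod already constrains the sub-block state label."""
    if STATE_LABEL in pod_spec.get("nodeSelector", {}):
        return True
    terms = (pod_spec.get("affinity", {})
             .get("nodeAffinity", {})
             .get("requiredDuringSchedulingIgnoredDuringExecution", {})
             .get("nodeSelectorTerms", []))
    return any(expr.get("key") == STATE_LABEL
               for term in terms
               for expr in term.get("matchExpressions", []))

def inject_default_affinity(pod_spec: dict) -> dict:
    """Add the default HEALTHY/DEGRADED affinity unless the Pod sets its own."""
    if targets_state_label(pod_spec):
        return pod_spec  # the user's preference remains unchanged
    required = (pod_spec.setdefault("affinity", {})
                .setdefault("nodeAffinity", {})
                .setdefault("requiredDuringSchedulingIgnoredDuringExecution", {}))
    required.setdefault("nodeSelectorTerms", []).append(DEFAULT_AFFINITY_TERM)
    return pod_spec
```

For example, a Pod that pins `cloud.google.com/gke-tpu-partition-4x4x4-state: "HEALTHY"` in its `nodeSelector` passes through unchanged, while a Pod with no state constraint receives the default `HEALTHY`/`DEGRADED` affinity.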
Run test workloads on dynamic slicing with Kueue
This section describes how to deploy workloads on dynamic slicing with Kueue and TAS. It includes examples that show how to create a dynamic slice workload and a workload consisting of multiple slices. The workloads are submitted as JobSets.
Example 1: Single workload uses a single dynamic slice
The following example describes how to create a workload that uses a slice with a `4x12x16` topology, which is composed of 12 sub-blocks. The number of Pods is calculated as (4 * 12 * 16) / 4 chips per node = 192 Pods.
1. Save the following manifest as `big-super-slice.yaml`:

```yaml
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: big-super-slice
  labels:
    kueue.x-k8s.io/queue-name: lq
spec:
  replicatedJobs:
  - name: job-jax
    replicas: 1
    template:
      spec:
        parallelism: 192 # pods per slice calculation: 4*12*16 / 4 = 192
        completions: 192
        backoffLimit: 10
        template:
          metadata:
            annotations:
              cloud.google.com/gke-tpu-slice-topology: 4x12x16
          spec:
            tolerations:
            - key: "google.com/tpu"
              operator: "Equal"
              value: "present"
              effect: "NoSchedule"
            nodeSelector:
              cloud.google.com/gke-tpu-accelerator: tpu7x
            containers:
            - name: jax
              image: python:latest
              command:
              - bash
              - -c
              - |
                printenv
                pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                python -c 'import jax; print("Global device count:", jax.device_count(), "Local device count:", jax.local_device_count())'
              resources:
                limits:
                  google.com/tpu: 4
            restartPolicy: Never
```

In this manifest, the following annotations and labels tell Kueue the slice characteristics and topology:

- `cloud.google.com/gke-tpu-slice-topology`: specifies `4x12x16` as the dynamic slice topology. Requirements for the `tpu7x` accelerator topology include the following rules:
  - The minimum topology is `4x4x4`.
  - The topology must be a three-dimensional string in the format `AxBxC`, for example, `4x8x8`.
  - Each dimension (A, B, and C) must be a multiple of four.
  - The dimensions must be sorted in non-decreasing order: A <= B <= C. For example, `4x8x4` is invalid; it should be `4x4x8`.
  - The product of the dimensions (A * B * C) must not exceed 9,216.
  - The largest supported slice topologies can include up to 32 sub-blocks. For example, `8x16x16` with 32 sub-blocks, `8x12x20` with 30 sub-blocks, or `12x12x12` with 27 sub-blocks are within the accepted limits.
- `cloud.google.com/gke-tpu-accelerator: tpu7x`: schedules Pods on VMs that run Ironwood (TPU7x).
- `kueue.x-k8s.io/queue-name`: assigns the JobSet to a Kueue LocalQueue.
- The webhook injects the default node affinity to ensure that `HEALTHY` and `DEGRADED` nodes are used.

2. Apply the `big-super-slice.yaml` manifest:

```
kubectl apply -f big-super-slice.yaml
```

After you apply the manifest, Kueue creates a JobSet named `big-super-slice`. Kueue then attempts to form a single dynamic slice with a `4x12x16` topology. After the slice is active, Kueue admits the workload, and the 192 Pods are scheduled on the nodes to form the dynamic slice that runs your workloads.
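The topology rules and the Pod-count arithmetic above can be expressed as a short validation helper. This is a sketch for illustration; `validate_topology` is a hypothetical function that encodes the documented rules, not a GKE API.

```python
# Hypothetical helper that encodes the documented tpu7x topology rules
# (illustrative only; not part of any GKE API).

def validate_topology(topology: str) -> int:
    """Validate a tpu7x slice topology string and return the Pod count."""
    dims = [int(d) for d in topology.split("x")]
    if len(dims) != 3:
        raise ValueError("topology must be a three-dimensional string, e.g. 4x8x8")
    a, b, c = dims
    if any(d < 4 or d % 4 for d in dims):
        raise ValueError("each dimension must be a multiple of four (minimum 4x4x4)")
    if not (a <= b <= c):
        raise ValueError("dimensions must be non-decreasing, e.g. 4x4x8, not 4x8x4")
    chips = a * b * c
    if chips > 9216:
        raise ValueError("the product of the dimensions must not exceed 9,216")
    if chips // 64 > 32:  # each 4x4x4 sub-block contains 64 chips
        raise ValueError("a slice can include at most 32 sub-blocks")
    # Each tpu7x-standard-4t node has 4 chips, so Pods = chips / 4.
    return chips // 4
```

For the Example 1 topology, `validate_topology("4x12x16")` returns 192, matching the `parallelism` and `completions` values in the manifest.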
Example 2: Workload with more than one replica
The following example demonstrates how to create a workload that uses two dynamic slices, each composed of four sub-blocks and targeting only `HEALTHY` nodes.
1. Save the following manifest as `two-super-slices.yaml`:

```yaml
apiVersion: jobset.x-k8s.io/v1alpha2
kind: JobSet
metadata:
  name: two-super-slices
  labels:
    kueue.x-k8s.io/queue-name: lq
spec:
  replicatedJobs:
  - name: job-jax
    replicas: 2
    template:
      spec:
        parallelism: 64 # Pods per slice calculation: (4*8*8) / 4 = 64
        completions: 64
        backoffLimit: 10
        template:
          metadata:
            annotations:
              cloud.google.com/gke-tpu-slice-topology: 4x8x8
          spec:
            tolerations:
            - key: "google.com/tpu"
              operator: "Equal"
              value: "present"
              effect: "NoSchedule"
            nodeSelector:
              cloud.google.com/gke-tpu-accelerator: tpu7x
              cloud.google.com/gke-tpu-partition-4x4x4-state: "HEALTHY"
            containers:
            - name: jax
              image: python:latest
              command:
              - bash
              - -c
              - |
                printenv
                pip install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
                python -c 'import jax; print("Global device count:", jax.device_count(), "Local device count:", jax.local_device_count())'
              resources:
                limits:
                  google.com/tpu: 4
            restartPolicy: Never
```

2. Apply the `two-super-slices.yaml` manifest:

```
kubectl apply -f two-super-slices.yaml
```

In this manifest, you set `replicas: 2` in the `replicatedJobs` field. After you apply the manifest, Kueue attempts to form two separate slices with a `4x8x8` topology. Kueue creates a dynamic slice for each replica defined in `jobset.spec.replicatedJobs[].replicas`. If `n` replicas are specified, Kueue creates `n` dynamic slices for the workload and waits for all slices to become active before admitting the workload.
Monitor the slice
You can see the status of the slice and monitor the slice metrics with GKE system metrics.
Monitor the status of the slice
To check the status of your dynamic slices, run the following command:
```
kubectl describe slice SLICE_NAME
```

Replace `SLICE_NAME` with the name of your slice. The slice name is typically derived from the JobSet name and replica index. For Example 1, a slice created by Kueue would have a name similar to `default-jobset-big-super-slice-yyyyy-job-jax-0`.
The output is similar to the following:

```
Name:         test-slice
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  accelerator.gke.io/v1beta1
Kind:         Slice
Metadata:
  Creation Timestamp:  2026-02-12T23:44:28Z
  Finalizers:
    accelerator.gke.io/slice-finalizer
  Generation:        1
  Resource Version:  1770939905695871008
  UID:               6dbbfe14-4486-4462-864d-e078d0ca8b5b
Spec:
  Partition Ids:
    5eae6a4f59d59cf30a9bf49de618eb2b
  Topology:  4x4x4
  Type:      tpu7x
Status:
  Conditions:
    Last Transition Time:  2026-02-12T23:45:05Z
    Message:
    Reason:                ACTIVE
    Status:                True
    Type:                  Ready
    Last Transition Time:  2026-02-12T23:45:05Z
    Message:               NodeLabelingCompleted
    Reason:                NodeLabelIsAdded
    Status:                True
    Type:                  NodeLabeled
Events:                    <none>
```
The slice name adheres to the following rules to ensure compatibility with underlying Compute Engine resource naming conventions:

- Template: `{namespace}-jobset-{jobset.metadata.name}-kueueHash[5-character]-{jobset.spec.replicatedJobs[].name}-sliceIndex`.
- Length: the name has 49 characters or fewer. The controller appends a hyphen and an 8-character cluster hash to create Compute Engine resource names, which have a 63-character limit.
- Format: the name matches the regular expression `^[a-z]([-a-z0-9]*[a-z0-9])?$`. The name has the following characteristics:
  - Starts with a lowercase letter.
  - Only contains lowercase letters, numbers, and hyphens (-).
  - Ends with a lowercase letter or a number (it can't end with a hyphen).
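These naming rules can be checked with a small helper. `is_valid_slice_name` below is a hypothetical function for illustration only; it is not part of any GKE API.

```python
# Hypothetical validator for the documented slice-name rules (illustration only).
import re

# Format rule: starts with a lowercase letter, contains only lowercase
# letters, digits, and hyphens, and does not end with a hyphen.
SLICE_NAME_RE = re.compile(r"^[a-z]([-a-z0-9]*[a-z0-9])?$")

# Length rule: 49 characters or fewer, so that a hyphen plus an 8-character
# cluster hash still fits within Compute Engine's 63-character limit.
MAX_SLICE_NAME_LEN = 49

def is_valid_slice_name(name: str) -> bool:
    """Check a dynamic slice name against the length and format rules."""
    return (len(name) <= MAX_SLICE_NAME_LEN
            and SLICE_NAME_RE.fullmatch(name) is not None)
```

For example, the Example 1 slice name `default-jobset-big-super-slice-yyyyy-job-jax-0` satisfies both rules.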
Monitor the metrics of the slice
You can monitor the following GKE system metrics that expose the condition of a slice:
- `kubernetes.io/accelerator/slice/state`
- `kubernetes.io/accelerator/partition/state`
- `kubernetes.io/accelerator/slice/deformation_durations`
- `kubernetes.io/accelerator/slice/formation_durations`
For more information about the metrics, see GKE system metrics.
Clean up
To avoid unexpected charges, delete your slices before deleting node pools.
1. Delete the JobSet. This action triggers Kueue to delete the associated Slice custom resources.

```
kubectl delete jobset JOBSET_NAME
```

Replace `JOBSET_NAME` with the name of your JobSet, for example, `big-super-slice`.

2. Delete the TPU node pool:

```
gcloud container node-pools delete NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=LOCATION
```
(Optional) Use dynamic slicing with your own scheduler
This document focuses on using Kueue and TAS. However, you can also manage dynamic slicing with your own custom scheduler. If you choose to use a different scheduler, follow the Slice custom resource reference information.
What's next
- Learn more about TPU Cluster Director.
- Learn how to manage maintenance events with TPUs in All Capacity mode.

