Schedule GKE workloads with Topology Aware Scheduling

This page shows you how to schedule workloads on GKE clusters using Topology Aware Scheduling (TAS) and Kueue.

AI and ML workloads require significant Pod-to-Pod communication. Because of this requirement, network bandwidth between Pods directly impacts workload execution time and cost. This bandwidth depends on the placement of virtual machines (VMs) within the data center.

What is Topology Aware Scheduling (TAS)?

TAS can significantly improve the efficiency of large language model (LLM) training. TAS strategically places workers on the network topology to minimize communication overhead during gradient aggregation, which requires workers to communicate in a specific rank order. By minimizing network hops between sequentially communicating workers, TAS reduces network contention and optimizes bandwidth utilization, leading to faster convergence and shorter training times. As LLMs grow larger, TAS becomes essential for maximizing the performance and scalability of distributed training.

TAS works best with compactly placed capacity, which can be obtained through reservations. With Spot VMs, your capacity is less likely to be compactly placed, so TAS might not work well in this scenario.

Before you begin

Before you start, make sure that you have performed the following tasks:

  • Enable the Google Kubernetes Engine API. If you prefer to enable the API from the command line, see the gcloud sketch after this list.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.
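
The following command is a minimal sketch for enabling the API with the gcloud CLI; replace PROJECT_ID with your project ID:

gcloud services enable container.googleapis.com \
    --project=PROJECT_ID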

Connect to your cluster to run kubectl commands

  1. Run the following command to connect to your cluster, replacing CLUSTER_NAME with the name of your cluster:

     gcloud container clusters get-credentials CLUSTER_NAME

Topology of GKE nodes on A4 and A3 Ultra VMs

You can understand the physical topology of GKE nodes on A4 and A3 Ultra VMs by referring to the following node labels:

  • cloud.google.com/gce-topology-block: the organization-specific ID of the reserved block in which the VM is located. A block is a collection of sub-blocks that are connected by a layer of distributed network fabric.
  • cloud.google.com/gce-topology-subblock: the organization-specific ID of the sub-block in which the VM is located. A sub-block is a group of hosts and associated connectivity hardware. For A4 and A3 Ultra VMs, these hosts are connected by a large-scale distributed Jupiter network fabric that offers low, predictable latency and flat bandwidth across all of the hosts.
  • cloud.google.com/gce-topology-host: the organization-specific ID of the host on which the VM is located. A host is a single physical server machine in the data center. Each GKE node runs on a VM instance that is provisioned on top of a physical host.
  • kubernetes.io/hostname: the hostname of the Kubernetes node. This is typically also the GKE node name.

To learn more about the terms used with AI Hypercomputer, see Terminology.

View the physical topology of nodes in your GKE cluster

Run the following command to get the node labels for your GKE cluster nodes in a specific node pool:

kubectl get nodes -l cloud.google.com/gke-nodepool=NODE_POOL_NAME \
    -o custom-columns='NAME:.metadata.name,BLOCK:.metadata.labels.cloud\.google\.com/gce-topology-block,SUBBLOCK:.metadata.labels.cloud\.google\.com/gce-topology-subblock,HOST:.metadata.labels.cloud\.google\.com/gce-topology-host' \
    | sort -k2,4

Replace NODE_POOL_NAME with the name of the node pool.

The output displays the block, sub-block, and host of each of the GKE nodes in the node pool.
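
To get a quick count of how many nodes share each sub-block, you can build on the same labels. The following command is a sketch, not part of the standard workflow:

kubectl get nodes -l cloud.google.com/gke-nodepool=NODE_POOL_NAME \
    --no-headers -o custom-columns='SUBBLOCK:.metadata.labels.cloud\.google\.com/gce-topology-subblock' \
    | sort | uniq -c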

You can use this topology information to optimize Pod placement for your AI workloads using TAS.

Prepare to schedule workloads with TAS using Kueue

We recommend using TAS with Kueue, a Kubernetes-native system that manages quotas and how jobs consume them.

Install Kueue with TAS enabled

TAS requires Kueue 0.10.0 or later, and must be explicitly enabled.

You can install Kueue and enable TAS by using either the Kueue manifest or the Kueue Helm chart:

Kueue manifest

  1. Install Kueue:

     kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.10.0/manifests.yaml

  2. Enable TAS in Kueue:

     kubectl -n kueue-system patch deployment kueue-controller-manager \
         --type json \
         -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--feature-gates=TopologyAwareScheduling=true"}]'

Helm chart

  1. Install Kueue with TAS enabled using a Helm chart:

     helm install kueue oci://us-central1-docker.pkg.dev/k8s-staging-images/charts/kueue \
         --version="v0.10.0" \
         --create-namespace \
         --namespace=kueue-system \
         --set "controllerManager.featureGates[0].name=TopologyAwareScheduling,controllerManager.featureGates[0].enabled=true"
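
Regardless of which installation method you used, you can confirm that the feature gate is set by inspecting the controller manager's container arguments. This check is a minimal sketch:

kubectl -n kueue-system get deployment kueue-controller-manager \
    -o jsonpath='{.spec.template.spec.containers[0].args}'

The output should include a --feature-gates argument that contains TopologyAwareScheduling=true.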

Configure Kueue

After installation, you must configure Kueue so that it understands the infrastructure that it manages. Typically, Kueue requires a ClusterQueue resource quota definition that describes either static infrastructure or dynamic infrastructure with cluster autoscaling enabled. The ClusterQueue admits a Workload if and only if the resources that the Workload requests are less than or equal to the pool of resources defined in the ClusterQueue. After quota admission, Kueue handles workloads in the following way:

  • TAS workloads: Kueue checks both the topology of the physical infrastructure and its current usage.
  • Non-TAS workloads: Kueue doesn't check the topology of the physical infrastructure. Kueue manages the entire quota defined in the config and leaves node assignment to kube-scheduler.

Review the following examples to understand two ways that you can provide a ClusterQueue resource quota definition to Kueue:

  • Very high quota: Kueue practically never blocks admission of a workload based on the requested resources. Instead, admission depends on whether the workload's TAS requirements can be satisfied by the infrastructure topology.
  • Realistic quota: Kueue admits a Workload only if the resources that the Workload requests fit within the resource quota limits. Based on the TAS definitions, Kueue then checks the infrastructure topology before admitting the Workload.

All references to resource quota in the following sections refer to ClusterQueue resource quota.

Very high resource quota

The following example uses a very high resource quota, so that Kueue practically never blocks a workload based on the available resource quota and instead uses the topology information of available nodes to try to match the topology with the requirements of the workload:

   
apiVersion: kueue.x-k8s.io/v1alpha1
kind: Topology
metadata:
  name: "gke-default"
spec:
  levels:
  - nodeLabel: "cloud.google.com/gce-topology-block"
  - nodeLabel: "cloud.google.com/gce-topology-subblock"
  - nodeLabel: "cloud.google.com/gce-topology-host"
  - nodeLabel: "kubernetes.io/hostname"
---
kind: ResourceFlavor
apiVersion: kueue.x-k8s.io/v1beta1
metadata:
  name: "tas-flavor"
spec:
  nodeLabels:
    cloud.google.com/gke-nodepool: "NODE_POOL_NAME"
  topologyName: "gke-default"
  tolerations:
  - key: "nvidia.com/gpu"
    operator: "Exists"
    effect: NoSchedule
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "tas-cluster-queue"
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["nvidia.com/gpu"]
    flavors:
    - name: "tas-flavor"
      resources:
      - name: "nvidia.com/gpu"
        nominalQuota: 10000000
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: "default"
  name: "tas-user-queue"
spec:
  clusterQueue: "tas-cluster-queue"

To use this resource quota definition, replace NODE_POOL_NAME with the name of your node pool, and save the YAML file as kueue-tas-config-very-high-quota.yaml.

Then, run the following command:

kubectl create -f kueue-tas-config-very-high-quota.yaml

Realistic resource quota

The previous example only configured GPU resources. However, Kueue can manage all Kubernetes-compatible resources.

The following example defines a more realistic resource quota that covers CPU, memory, and GPU for 100 a3-ultragpu-8g machines. A single machine has 224 vCPUs, 2944 GB of memory, and 8 GPUs, so the quota totals 22400 vCPUs, approximately 294400Gi of memory, and 800 GPUs:

   
apiVersion: kueue.x-k8s.io/v1alpha1
kind: Topology
metadata:
  name: "gke-default"
spec:
  levels:
  - nodeLabel: "cloud.google.com/gce-topology-block"
  - nodeLabel: "cloud.google.com/gce-topology-subblock"
  - nodeLabel: "cloud.google.com/gce-topology-host"
  - nodeLabel: "kubernetes.io/hostname"
---
kind: ResourceFlavor
apiVersion: kueue.x-k8s.io/v1beta1
metadata:
  name: "tas-flavor"
spec:
  nodeLabels:
    cloud.google.com/gke-nodepool: "NODE_POOL_NAME"
  topologyName: "gke-default"
  tolerations:
  - key: "nvidia.com/gpu"
    operator: "Exists"
    effect: NoSchedule
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "tas-cluster-queue"
spec:
  namespaceSelector: {} # match all
  resourceGroups:
  - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
    flavors:
    - name: "tas-flavor"
      resources:
      # numbers below represent quota of 100 a3-ultragpu-8g machines
      - name: "cpu"
        nominalQuota: 22400
      - name: "memory"
        nominalQuota: 294400Gi
      - name: "nvidia.com/gpu"
        nominalQuota: 800
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: "default"
  name: "tas-user-queue"
spec:
  clusterQueue: "tas-cluster-queue"

To use this resource quota definition, replace NODE_POOL_NAME with the name of your node pool, and save the YAML file as kueue-tas-config-real-quota.yaml.

Then, run the following command:

kubectl create -f kueue-tas-config-real-quota.yaml

Verify successful application

If the configuration was applied successfully, the output looks similar to the following:

topology.kueue.x-k8s.io/gke-default created
resourceflavor.kueue.x-k8s.io/tas-flavor created
clusterqueue.kueue.x-k8s.io/tas-cluster-queue created
localqueue.kueue.x-k8s.io/tas-user-queue created
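
You can also list the Kueue objects directly to confirm that they exist. These commands are a sketch that assumes the resource names exposed by the Kueue CRDs:

kubectl get topologies.kueue.x-k8s.io
kubectl get resourceflavors
kubectl get clusterqueues
kubectl -n default get localqueues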

Schedule workloads with TAS using Kueue

The following scenarios demonstrate how you can instruct Kueue and TAS to manage common workload and infrastructure combinations using topology request types and topology request levels:

  • The following are the available topology request types (preferred or required):

    • kueue.x-k8s.io/podset-preferred-topology: Kueue prioritizes scheduling the entire workload within the given topology level, but still admits a workload that doesn't fit within that level. Even if a workload could fit within a single instance of the topology level, Kueue might schedule it across multiple instances.
    • kueue.x-k8s.io/podset-required-topology: Kueue admits the workload only when the entire workload can fit within the chosen topology level, and continues trying until it does.
  • The following are the available topology request levels, which let you be more or less specific about the physical infrastructure where you prefer or require your Job to run:

    • cloud.google.com/gce-topology-block
    • cloud.google.com/gce-topology-subblock
    • cloud.google.com/gce-topology-host
    • kubernetes.io/hostname

To schedule workloads using these values, use the following Job YAML file:

apiVersion: batch/v1
kind: Job
metadata:
  generateName: JOB_NAME
  labels:
    kueue.x-k8s.io/queue-name: tas-user-queue
spec:
  parallelism: NUMBER_OF_REPLICAS
  completions: NUMBER_OF_REPLICAS
  completionMode: Indexed
  template:
    metadata:
      annotations:
        ANNOTATIONS_STRING
    spec:
      containers:
      - name: dummy-job
        image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0
        args: ["60s"]
        resources:
          requests:
            nvidia.com/gpu: "1"
          limits:
            nvidia.com/gpu: "1"
      restartPolicy: Never

Replace the following variables:

  • JOB_NAME: A name for the Job.
  • NUMBER_OF_REPLICAS: The number of Pods that run in parallel.
  • ANNOTATIONS_STRING: The topology annotation for the Pod template. Choose one of the following options:

    • Preferred to run within a hostname (recommended)
      Description: Kueue admits your workload as long as there are enough resources available to satisfy your workload's resource requirements, even if the capacity is fragmented. Kueue schedules your Pods as compactly as possible.
      ANNOTATIONS_STRING: kueue.x-k8s.io/podset-preferred-topology: "kubernetes.io/hostname"

    • Required to run within a host
      Description: Kueue admits your workload if and only if there is a host available with enough resources to satisfy your workload's resource requirements. This is useful when there are multiple VMs per host (for example, smaller machine types) or when multiple Pods can run on a single node. In such cases, if the workload is admitted, it runs on a single host.
      ANNOTATIONS_STRING: kueue.x-k8s.io/podset-required-topology: "cloud.google.com/gce-topology-host"

    • Preferred to run within a host
      Description: Kueue admits your workload as long as there are enough resources available to satisfy your workload's resource requirements, even if the capacity is fragmented. Kueue tries to schedule your Pods within a single host and uses additional hosts if needed.
      ANNOTATIONS_STRING: kueue.x-k8s.io/podset-preferred-topology: "cloud.google.com/gce-topology-host"

    • Required to run within a sub-block
      Description: Kueue admits your workload if and only if there is a sub-block available with enough resources to satisfy your workload's resource requirements.
      ANNOTATIONS_STRING: kueue.x-k8s.io/podset-required-topology: "cloud.google.com/gce-topology-subblock"

    • Preferred to run within a sub-block
      Description: Kueue admits your workload as long as there are enough resources available to satisfy your workload's resource requirements, even if the capacity is fragmented. Kueue tries to schedule your Pods within a single sub-block and uses additional sub-blocks if needed. In this case, Kueue ranks a sub-block with more available capacity higher, even if that capacity is fragmented, than a sub-block with just enough capacity to satisfy the requirements.
      ANNOTATIONS_STRING: kueue.x-k8s.io/podset-preferred-topology: "cloud.google.com/gce-topology-subblock"

    • Required to run within a block
      Description: Kueue admits your workload if and only if the resources available within a block satisfy your workload's resource requirements. If the workload is admitted, Kueue minimizes the number of sub-blocks and hosts used to schedule it, which might fragment your available capacity.
      ANNOTATIONS_STRING: kueue.x-k8s.io/podset-required-topology: "cloud.google.com/gce-topology-block"

    • Preferred to run within a block
      Description: Kueue admits your workload as long as there are enough resources available to satisfy your workload's resource requirements, even if the capacity is fragmented. Kueue tries to schedule your Pods within a single block and uses additional blocks if needed.
      ANNOTATIONS_STRING: kueue.x-k8s.io/podset-preferred-topology: "cloud.google.com/gce-topology-block"
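
After you replace the variables and save the manifest, you can submit the Job and check whether Kueue admitted the resulting Workload. The following commands are a sketch; the file name tas-job.yaml is an assumption:

# The Job uses generateName, so create it rather than apply it.
kubectl create -f tas-job.yaml

# List the Kueue Workload objects in the Job's namespace and check whether they were admitted.
kubectl -n default get workloads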

Schedule workloads using PodGroups with TAS and Kueue

When you use PodGroups, you must specify three additional fields for every Pod in a PodGroup:

  • The kueue.x-k8s.io/pod-group-name label, which identifies the PodGroup that the Pod belongs to.
  • The kueue.x-k8s.io/pod-group-pod-index label, which sets the Pod's index within the PodGroup.
  • The kueue.x-k8s.io/pod-group-total-count annotation, which sets the total number of Pods in the PodGroup.

Depending on the ML framework that you use, the leader of a PodGroup might or might not require a GPU. Because of a Kueue limitation, these two cases must be handled differently. The following examples demonstrate how to create a PodGroup of three Pods with one leader and two workers.

Case 1: Leader is also a worker and requires a GPU

If the leader is also one of the workers and requires a GPU, the leader can have any index within the PodGroup. For simplicity, the following example assigns index 0 to the leader:

apiVersion: v1
kind: Pod
metadata:
  generateName: tas-podgroup-leader-
  labels:
    kueue.x-k8s.io/queue-name: tas-user-queue
    kueue.x-k8s.io/pod-group-name: "tas-podgroup-example-group"
    kueue.x-k8s.io/pod-group-pod-index: "0"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "3"
    kueue.x-k8s.io/podset-required-topology: "cloud.google.com/gce-topology-block"
spec:
  containers:
  - name: leader
    image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0
    args: ["600s"]
    resources:
      requests:
        nvidia.com/gpu: "1"
      limits:
        nvidia.com/gpu: "1"
  restartPolicy: Never
---
apiVersion: v1
kind: Pod
metadata:
  generateName: tas-podgroup-worker-1-
  labels:
    kueue.x-k8s.io/queue-name: tas-user-queue
    kueue.x-k8s.io/pod-group-name: "tas-podgroup-example-group"
    kueue.x-k8s.io/pod-group-pod-index: "1"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "3"
    kueue.x-k8s.io/podset-required-topology: "cloud.google.com/gce-topology-block"
spec:
  restartPolicy: Never
  containers:
  - name: worker
    image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0
    args: ["600s"]
    resources:
      requests:
        nvidia.com/gpu: "1"
      limits:
        nvidia.com/gpu: "1"
---
apiVersion: v1
kind: Pod
metadata:
  generateName: tas-podgroup-worker-2-
  labels:
    kueue.x-k8s.io/queue-name: tas-user-queue
    kueue.x-k8s.io/pod-group-name: "tas-podgroup-example-group"
    kueue.x-k8s.io/pod-group-pod-index: "2"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "3"
    kueue.x-k8s.io/podset-required-topology: "cloud.google.com/gce-topology-block"
spec:
  restartPolicy: Never
  containers:
  - name: worker
    image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0
    args: ["600s"]
    resources:
      requests:
        nvidia.com/gpu: "1"
      limits:
        nvidia.com/gpu: "1"
 
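
To submit this PodGroup and see where its Pods were scheduled, you can use commands like the following sketch; the file name tas-podgroup.yaml is an assumption:

# The Pods use generateName, so create them rather than apply them.
kubectl create -f tas-podgroup.yaml

# Show the Pods in the group together with the nodes they landed on.
kubectl get pods -l kueue.x-k8s.io/pod-group-name=tas-podgroup-example-group -o wide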

Case 2: Leader is not a worker and doesn't require a GPU

If the leader isn't one of the workers, then because of how Kueue creates PodSets, the leader must have the last index in the PodGroup. If the leader doesn't have the last index and the first worker doesn't use the first index, Kueue won't apply rank assignments.

See the following example:

apiVersion: v1
kind: Pod
metadata:
  generateName: tas-podgroup-leader-
  labels:
    kueue.x-k8s.io/queue-name: tas-user-queue
    kueue.x-k8s.io/pod-group-name: "tas-podgroup-example-group2"
    kueue.x-k8s.io/pod-group-pod-index: "2"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "3"
    kueue.x-k8s.io/podset-required-topology: "cloud.google.com/gce-topology-block"
spec:
  containers:
  - name: leader
    image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0
    args: ["600s"]
    resources:
      requests:
        cpu: "1"
      limits:
        cpu: "1"
  restartPolicy: Never
---
apiVersion: v1
kind: Pod
metadata:
  generateName: tas-podgroup-worker-0-
  labels:
    kueue.x-k8s.io/queue-name: tas-user-queue
    kueue.x-k8s.io/pod-group-name: "tas-podgroup-example-group2"
    kueue.x-k8s.io/pod-group-pod-index: "0"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "3"
    kueue.x-k8s.io/podset-required-topology: "cloud.google.com/gce-topology-block"
spec:
  restartPolicy: Never
  containers:
  - name: worker
    image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0
    args: ["600s"]
    resources:
      requests:
        nvidia.com/gpu: "1"
      limits:
        nvidia.com/gpu: "1"
---
apiVersion: v1
kind: Pod
metadata:
  generateName: tas-podgroup-worker-1-
  labels:
    kueue.x-k8s.io/queue-name: tas-user-queue
    kueue.x-k8s.io/pod-group-name: "tas-podgroup-example-group2"
    kueue.x-k8s.io/pod-group-pod-index: "1"
  annotations:
    kueue.x-k8s.io/pod-group-total-count: "3"
    kueue.x-k8s.io/podset-required-topology: "cloud.google.com/gce-topology-block"
spec:
  restartPolicy: Never
  containers:
  - name: worker
    image: gcr.io/k8s-staging-perf-tests/sleep:v0.1.0
    args: ["600s"]
    resources:
      requests:
        nvidia.com/gpu: "1"
      limits:
        nvidia.com/gpu: "1"
 

What's next
