Create a custom AI-optimized GKE cluster that uses A4X Max

This document shows you how to create an AI-optimized Google Kubernetes Engine (GKE) cluster that uses A4X Max Compute Engine instances to support your AI and ML workloads.

The A4X Max and A4X series let you run large-scale AI/ML clusters by using the NVIDIA Multi-Node NVLink (MNNVL) system, a rack-scale solution that enables higher GPU power and performance. These machines offer features such as targeted workload placement, topology-aware scheduling, and advanced cluster maintenance controls. For more information, see Cluster management capabilities. With A4X Max, GKE also provides an automated networking setup that simplifies cluster configuration.

AI and ML workloads, such as distributed training, require powerful acceleration to optimize performance by reducing job completion times. GKE provides a single platform surface to run a diverse set of workloads for your organization, reducing the operational burden of managing multiple platforms. You can run workloads such as high-performance distributed pre-training, model fine-tuning, model inference, application serving, and supporting services. For workloads that require high performance, high throughput, and low latency, GPUDirect RDMA reduces the network hops that are required to transfer payloads to and from GPUs. This approach uses the available network bandwidth more efficiently. For more information, see GPU networking stacks.

In this document, you learn how to create a GKE cluster with the Google Cloud CLI for maximum flexibility in configuring your cluster based on the needs of your workload. To use the gcloud CLI to create clusters with other machine types, see the following:

Alternatively, you can choose to use Cluster Toolkit to quickly deploy your cluster with default settings that reflect best practices for many use cases. For more information, see Create an AI-optimized GKE cluster with default configuration.

Before you begin

Before you start, make sure that you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the gcloud components update command. Earlier gcloud CLI versions might not support running the commands in this document.

Obtain capacity

You can obtain capacity for A4X Max compute instances by creating a future reservation. For more information about future reservations, see the Future reservations in AI Hypercomputer column in the table for Choose a consumption option.

To obtain capacity with a future reservation, see the Future reservations in AI Hypercomputer row in the table for How to obtain capacity.

Requirements

The following requirements apply to an AI-optimized GKE cluster with A4X Max compute instances:

  • For A4X Max, you must use one of the following versions:

    • For 1.35 or later, use GKE version 1.35.0-gke.2745000 or later.
    • For 1.34, use GKE version 1.34.3-gke.1318000 or later.

    These versions help to ensure that A4X Max uses the following:

    • R580.95.05, the minimum GPU driver version for A4X Max, which is installed by default.
    • Coherent Driver-based Memory Management (CDMM), which is enabled by default. NVIDIA recommends that Kubernetes clusters enable this mode to resolve memory over-reporting. CDMM allows GPU memory to be managed through the driver instead of the operating system (OS). This approach helps you to avoid OS onlining of GPU memory, and exposes the GPU memory as a Non-Uniform Memory Access (NUMA) node to the OS. Multi-instance GPUs aren't supported when CDMM is enabled. For more information about CDMM, see Hardware and Software Support.
    • GPUDirect RDMA and MNNVL, which we recommend enabling so that A4X Max node pools can use the networking capabilities of A4X Max.
  • The GKE nodes must use a Container-Optimized OS node image. Ubuntu and Windows node images are not supported.

  • Your GKE workload must use all available GPUs and your Pod must use all available secondary NICs on a single GKE node. Multiple Pods cannot share RDMA on a single GKE node.

  • You must use the reservation-bound provisioning model to create clusters with A4X Max. Other provisioning models are not supported.

  • These instructions use DRANET to configure an AI-optimized GKE cluster with A4X Max. Multi-networking isn't supported for the a4x-maxgpu-4g-metal machine type.

Considerations for creating a cluster

When you create a cluster, consider the following information:

  • Choose a cluster location:
    • Verify that you use a location that has availability for the machine type that you choose. For more information, see Accelerator availability.
    • When you create node pools in a regional cluster, which is recommended for production workloads, you can use the --node-locations flag to specify the zones for your GKE nodes.
  • Choose a driver version:
    • The driver version can be one of the following values:
      • default: install the default driver version for your GKE node version. For more information about the requirements for default driver versions, see the Requirements section.
      • latest: install the latest available driver version for your GKE version. This option is available only for nodes that use Container-Optimized OS.
      • disabled: skip automatic driver installation. You must manually install a driver after you create the node pool.
    • For more information about the default and latest GPU driver versions for GKE node versions, see the table in the section Manually install NVIDIA GPU drivers.
  • Choose a reservation affinity:

    • You can find information about your reservation, such as the name of your reservation or the name of a specific block in your reservation. To find these values, see View future reservation requests.
    • The --reservation-affinity flag can take the values of specific or any. However, for high-performance distributed AI workloads, we recommend that you use a specific reservation.
    • When you use a specific reservation, including shared reservations, specify the value of the --reservation flag in the following format:

      projects/PROJECT_ID/reservations/RESERVATION_NAME/reservationBlocks/BLOCK_NAME

      Replace the following values:

      • PROJECT_ID : your Google Cloud project ID.
      • RESERVATION_NAME : the name of your reservation.
      • BLOCK_NAME : the name of a specific block within the reservation.

      We also recommend that you use a sub-block targeted reservation so that compute instances are placed on a single sub-block within the BLOCK_NAME. Add the following to the end of the path:

      /reservationSubBlocks/SUB_BLOCK_NAME

      Replace SUB_BLOCK_NAME with the name of the sub-block.
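As a sketch, you can assemble the full reservation path from its parts in a shell variable before you pass it to the --reservation flag. The values below are hypothetical placeholders, not real resource names:

```shell
# Hypothetical placeholder values; substitute your own reservation details.
PROJECT_ID=my-project
RESERVATION_NAME=my-reservation
BLOCK_NAME=block-1
SUB_BLOCK_NAME=sub-block-1

# Build the sub-block targeted reservation path in the documented format.
RESERVATION_PATH="projects/${PROJECT_ID}/reservations/${RESERVATION_NAME}/reservationBlocks/${BLOCK_NAME}/reservationSubBlocks/${SUB_BLOCK_NAME}"
echo "${RESERVATION_PATH}"
```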

Create an AI-optimized GKE cluster that uses A4X Max and GPUDirect RDMA

For distributed AI workloads, multiple GPU nodes are often linked together to work as a single computer. A4X Max is an exascale platform based on the NVIDIA GB300 NVL72 rack-scale architecture. A4X Max compute instances use a multi-layered, hierarchical networking architecture with a rail-aligned design to optimize performance for various communication types. This machine type enables scaling and collaboration across multiple GPUs by delivering a high-performance cloud experience for AI workloads. For more information about the network architecture for A4X Max, including the network bandwidth and NIC arrangement, see A4X Max machine type (bare metal).

To create a GKE Standard cluster with A4X Max that uses GPUDirect RDMA and MNNVL, complete the steps that are described in the following sections:

  1. Create the GKE cluster
  2. Create a workload policy
  3. Create a node pool with A4X Max
  4. Configure the MRDMA NICs with asapd-lite
  5. Install the NVIDIA Compute Domain CRD and DRA driver
  6. Configure your workload manifest for RDMA and IMEX domain

These instructions use accelerator network profiles to automatically configure VPC networks and subnets for your A4X Max nodes. Alternatively, you can explicitly specify your VPC network and subnets.

Create the GKE cluster

  1. Create a GKE Standard cluster:

    gcloud container clusters create CLUSTER_NAME \
        --enable-dataplane-v2 \
        --enable-ip-alias \
        --location=COMPUTE_REGION \
        --cluster-version=CLUSTER_VERSION \
        --no-enable-shielded-nodes \
        [--services-ipv4-cidr=SERVICE_CIDR \
        --cluster-ipv4-cidr=POD_CIDR \
        --addons=GcpFilestoreCsiDriver=ENABLED]

    Replace the following:

    • CLUSTER_NAME : the name of your cluster.
    • CLUSTER_VERSION : the version of your new cluster. For more information about which version of GKE supports your configuration, see the Requirements in this document.
    • COMPUTE_REGION : the name of the compute region.
    • Optionally, you can explicitly provide the secondary CIDR ranges for services and Pods. If you use these optional flags, then replace the following variables:

      • SERVICE_CIDR : the secondary CIDR range for services.
      • POD_CIDR : the secondary CIDR range for Pods.

      When you use these flags, you must verify that the CIDR ranges don't overlap with subnet ranges for additional node networks. For example, consider SERVICE_CIDR=10.65.0.0/19 and POD_CIDR=10.64.0.0/19. For more information, see Adding Pod IPv4 address ranges.
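      As a quick local sanity check, you can verify that two candidate ranges don't overlap before you create the cluster. This sketch uses the example values from above and assumes python3 is available in your shell:

```shell
# Check whether the example service and Pod CIDR ranges overlap.
OVERLAP=$(python3 -c '
import ipaddress
svc = ipaddress.ip_network("10.65.0.0/19")
pod = ipaddress.ip_network("10.64.0.0/19")
print(svc.overlaps(pod))
')
echo "CIDRs overlap: ${OVERLAP}"   # CIDRs overlap: False
```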

  2. To run the kubectl commands in the next sections, connect to your cluster:

    gcloud container clusters get-credentials CLUSTER_NAME \
        --location=COMPUTE_REGION

    Replace the following:

    • CLUSTER_NAME : the name of your cluster.
    • COMPUTE_REGION : the name of the compute region.

    For more information, see Install kubectl and configure cluster access.

Create a workload policy

A workload policy is required to create a partition. For more information, see Workload policy for MIGs.

Create a HIGH_THROUGHPUT workload policy with the accelerator_topology field set to 1x72.

gcloud beta compute resource-policies create workload-policy WORKLOAD_POLICY_NAME \
    --type HIGH_THROUGHPUT \
    --accelerator-topology 1x72 \
    --project PROJECT \
    --region COMPUTE_REGION

Replace the following:

  • WORKLOAD_POLICY_NAME : the name of your workload policy.
  • PROJECT : the name of your project.
  • COMPUTE_REGION : the name of the compute region.

Create a node pool with A4X Max

  1. Create the following configuration file to pre-allocate hugepages with the node pool:

    cat > node_custom.yaml <<EOF
    linuxConfig:
      hugepageConfig:
        hugepage_size2m: 4096
    EOF
    export NODE_CUSTOM=node_custom.yaml
  2. Create an A4X Max node pool:

    gcloud container node-pools create NODE_POOL_NAME \
        --cluster=CLUSTER_NAME \
        --location=COMPUTE_REGION \
        --node-locations=COMPUTE_ZONE \
        --num-nodes=NODE_COUNT \
        --placement-policy=WORKLOAD_POLICY_NAME \
        --machine-type=a4x-maxgpu-4g-metal \
        --accelerator=type=nvidia-gb300,count=4,gpu-driver-version=latest \
        --system-config-from-file=${NODE_CUSTOM} \
        --accelerator-network-profile=auto \
        --node-labels=cloud.google.com/gke-networking-dra-driver=true,cloud.google.com/gke-dpv2-unified-cni=cni-migration \
        --reservation-affinity=specific \
        --reservation=RESERVATION_NAME/reservationBlocks/BLOCK_NAME/reservationSubBlocks/SUB_BLOCK_NAME

    Replace the following:

    • NODE_POOL_NAME : the name of the node pool.
    • CLUSTER_NAME : the name of your cluster.
    • COMPUTE_REGION : the compute region of the cluster.
    • COMPUTE_ZONE : the zone of your node pool.
    • NODE_COUNT : the number of nodes for the node pool, which must be 18 nodes or fewer. We recommend using 18 nodes to obtain the 1x72 GPU topology in one sub-block by using an NVLink domain.
    • WORKLOAD_POLICY_NAME : the name of the workload policy you created previously.
    • RESERVATION_NAME : the name of your reservation. To find this value, see View future reservation requests.
    • BLOCK_NAME : the name of a specific block within the reservation. To find this value, see View future reservation requests.

    This command automatically creates a network that connects all the A4X Max nodes within a single zone by using the auto accelerator network profile. When you create a node pool with the --accelerator-network-profile=auto flag, GKE automatically adds the gke.networks.io/accelerator-network-profile: auto label to the nodes. To schedule workloads on these nodes, you must include this label in your workload's nodeSelector field.
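Based on the label described above, your workload manifest would include a nodeSelector entry like the following fragment. This is a sketch of a Pod spec fragment; it assumes the node pool was created with --accelerator-network-profile=auto as shown above:

```yaml
# Pod spec fragment: schedule only onto nodes created with the auto
# accelerator network profile.
spec:
  nodeSelector:
    gke.networks.io/accelerator-network-profile: auto
```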

Configure the MRDMA NICs with asapd-lite

The asapd-lite DaemonSet configures the MRDMA NICs. An unhealthy asapd-lite DaemonSet might indicate no RDMA connectivity.

  1. Install the DaemonSet:

    kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/refs/heads/master/asapd-lite-installer/asapd-lite-installer-a4x-max-bm-cos.yaml
    
  2. Validate the replicas in the asapd-lite DaemonSet:

    kubectl get daemonset -n kube-system asapd-lite
    

    The output is similar to the following:

     NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    asapd-lite   18        18        18      18           18          <none>          5m 
    

    The number of READY replicas should match the number of nodes that were created and are healthy in the node pool.
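As a rough sketch, you can also compare the DESIRED and READY columns programmatically. This example parses the sample output line shown above with awk; in practice you would feed it the live kubectl output instead:

```shell
# Parse the sample DaemonSet status line.
# Columns: NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE-SELECTOR AGE
LINE='asapd-lite   18        18        18      18           18          <none>          5m'
DESIRED=$(echo "$LINE" | awk '{print $2}')
READY=$(echo "$LINE" | awk '{print $4}')
if [ "$DESIRED" -eq "$READY" ]; then STATUS=healthy; else STATUS=unhealthy; fi
echo "asapd-lite is ${STATUS}"   # asapd-lite is healthy
```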

Install the NVIDIA Compute Domain CRD and DRA driver

The following steps install the NVIDIA Compute Domain CRD and DRA driver to enable the use of MNNVL. For more information, see NVIDIA DRA Driver for GPUs.

  1. Verify that you have Helm installed in your development environment. Helm comes pre-installed on Cloud Shell.

    Although there is no specific Helm version requirement, you can run the following command to verify that Helm is installed:

    helm version
    

    If the output is similar to Command helm not found, then you can install the Helm CLI:

    curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \
      && chmod 700 get_helm.sh \
      && ./get_helm.sh
    
  2. Add the NVIDIA Helm repository:

    helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
      && helm repo update
    
  3. Create a ResourceQuota object for the DRA Driver:

    export POD_QUOTA=POD_QUOTA
    kubectl create ns nvidia-dra-driver-gpu

    kubectl apply -n nvidia-dra-driver-gpu -f - <<EOF
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: nvidia-dra-driver-gpu-quota
    spec:
      hard:
        pods: ${POD_QUOTA}
      scopeSelector:
        matchExpressions:
        - operator: In
          scopeName: PriorityClass
          values:
          - system-node-critical
          - system-cluster-critical
    EOF
    

    Replace POD_QUOTA with a number that is at least two times the number of A4X Max nodes in the cluster, plus one. For example, you must set the variable to at least 37 if you have 18 A4X Max nodes in your cluster.
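    The quota rule above can be computed directly in the shell; for example, with a hypothetical 18-node cluster:

```shell
# POD_QUOTA must be at least (2 * number of A4X Max nodes) + 1.
NODE_COUNT=18                       # hypothetical number of A4X Max nodes
POD_QUOTA=$(( 2 * NODE_COUNT + 1 ))
echo "POD_QUOTA=${POD_QUOTA}"       # POD_QUOTA=37
```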

  4. Install the ComputeDomain CRD and DRA driver:

    helm install nvidia-dra-driver-gpu nvidia/nvidia-dra-driver-gpu \
        --set controller.args.v=4 \
        --set kubeletPlugin.args.v=4 \
        --version="25.8.0" \
        --create-namespace \
        --namespace nvidia-dra-driver-gpu \
        -f <(cat <<EOF
    nvidiaDriverRoot: /home/kubernetes/bin/nvidia
    resources:
      gpus:
        enabled: false
    controller:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: "nvidia.com/gpu"
                operator: "DoesNotExist"
    kubeletPlugin:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cloud.google.com/gke-accelerator
                operator: In
                values:
                - nvidia-gb300
              - key: kubernetes.io/arch
                operator: In
                values:
                - arm64
      tolerations:
      - key: nvidia.com/gpu
        operator: Equal
        value: present
        effect: NoSchedule
      - key: kubernetes.io/arch
        operator: Equal
        value: arm64
        effect: NoSchedule
    EOF
    )

Configure your workload manifest for RDMA and IMEX domain

  1. Add a node affinity rule to schedule the workload on Arm nodes:

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values:
                - arm64
    
  2. Add the following volume to the Pod specification:

    spec:
      volumes:
      - name: library-dir-host
        hostPath:
          path: /home/kubernetes/bin/nvidia
    
  3. Add the following volume mounts, environment variable, and resource to the container that requests GPUs. Your workload container must request all four GPUs:

    containers:
    - name: my-container
      volumeMounts:
      - name: library-dir-host
        mountPath: /usr/local/nvidia
      env:
      - name: LD_LIBRARY_PATH
        value: /usr/local/nvidia/lib64
      resources:
        limits:
          nvidia.com/gpu: 4
    
  4. Create the ComputeDomain resource for the workload:

    apiVersion: resource.nvidia.com/v1beta1
    kind: ComputeDomain
    metadata:
      name: a4x-max-compute-domain
    spec:
      numNodes: NUM_NODES
      channel:
        resourceClaimTemplate:
          name: a4x-max-compute-domain-channel
    

    Replace NUM_NODES with the number of nodes the workload requires.

  5. Create a ResourceClaimTemplate to allocate network resources by using DRANET and request RDMA devices for your Pod:

    apiVersion: resource.k8s.io/v1
    kind: ResourceClaimTemplate
    metadata:
      name: all-mrdma
    spec:
      spec:
        devices:
          requests:
          - name: req-mrdma
            exactly:
              deviceClassName: mrdma.google.com
              allocationMode: ExactCount
              count: 8
    
  6. Specify the ResourceClaimTemplate that the Pod uses:

    spec:
      ...
      volumes:
      ...
      containers:
      - name: my-container
        ...
        resources:
          limits:
            nvidia.com/gpu: 4
          claims:
          - name: compute-domain-channel
          - name: rdma
      ...
      resourceClaims:
      - name: compute-domain-channel
        resourceClaimTemplateName: a4x-max-compute-domain-channel
      - name: rdma
        resourceClaimTemplateName: all-mrdma
    
  7. Ensure that the userspace libraries and the libnccl packages are installed in the user container image:

    apt update -y
    apt install -y curl
    export DOCA_URL="https://linux.mellanox.com/public/repo/doca/3.1.0/ubuntu22.04/arm64-sbsa/"
    BASE_URL=$([ "${DOCA_PREPUBLISH:-false}" = "true" ] && echo https://doca-repo-prod.nvidia.com/public/repo/doca || echo https://linux.mellanox.com/public/repo/doca)
    DOCA_SUFFIX=${DOCA_URL#*public/repo/doca/}; DOCA_URL="$BASE_URL/$DOCA_SUFFIX"
    curl $BASE_URL/GPG-KEY-Mellanox.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/GPG-KEY-Mellanox.pub
    echo "deb [signed-by=/etc/apt/trusted.gpg.d/GPG-KEY-Mellanox.pub] $DOCA_URL ./" > /etc/apt/sources.list.d/doca.list
    apt update
    apt -y install doca-ofed-userspace
    # The installed libnccl2 is 2.27.7; upgrade to the recommended 2.28.9
    apt install --only-upgrade --allow-change-held-packages -y libnccl2 libnccl-dev
    

A completed Pod specification looks like the following:

  apiVersion: resource.nvidia.com/v1beta1
  kind: ComputeDomain
  metadata:
    name: a4x-max-compute-domain
  spec:
    numNodes: NUM_NODES
    channel:
      resourceClaimTemplate:
        name: a4x-max-compute-domain-channel
  ---
  apiVersion: v1
  kind: Pod
  metadata:
    name: my-pod
    labels:
      k8s-app: my-pod
  spec:
    ...
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/arch
              operator: In
              values:
              - arm64
    volumes:
    - name: library-dir-host
      hostPath:
        path: /home/kubernetes/bin/nvidia
    hostNetwork: true
    containers:
    - name: my-container
      volumeMounts:
      - name: library-dir-host
        mountPath: /usr/local/nvidia
      env:
      - name: LD_LIBRARY_PATH
        value: /usr/local/nvidia/lib64
      resources:
        limits:
          nvidia.com/gpu: 4
        claims:
        - name: compute-domain-channel
        - name: rdma
    ...
    resourceClaims:
    - name: compute-domain-channel
      resourceClaimTemplateName: a4x-max-compute-domain-channel
    - name: rdma
      resourceClaimTemplateName: all-mrdma

Test network performance

We recommend that you validate the functionality of provisioned clusters. To do so, use NCCL/gIB tests, which are NVIDIA Collective Communications Library (NCCL) tests that are optimized for the Google environment.

For more information, see Run NCCL on custom GKE clusters that use A4X Max.

What's next
