Create a GKE Cluster with Pathways

You can use the Accelerated Processing Kit (XPK) to create pre-configured Google Kubernetes Engine (GKE) clusters for Pathways-based workloads. You can also use gcloud to create GKE clusters for Pathways-based workloads manually.

Before you begin

Make sure you have:

Set up your local environment

Log in with your Google Cloud credentials.

    gcloud auth application-default login

Define the following environment variables with values appropriate to your workload.

Required variables

Create a GKE cluster

In the following example, you create a cluster with two v5e 2x4 node pools. You can create a cluster using XPK or the gcloud command.

XPK

  1. Set some environment variables

     CLUSTER_NODEPOOL_COUNT=CLUSTER_NODEPOOL_COUNT
     PROJECT=PROJECT_ID
     ZONE=ZONE
     CLUSTER=GKE_CLUSTER_NAME
     TPU_TYPE="v5litepod-8"
     PW_CPU_MACHINE_TYPE="n2-standard-64"
     NETWORK=NETWORK
     SUBNETWORK=SUB_NETWORK

    Replace the following:

    • CLUSTER_NODEPOOL_COUNT : the maximum number of node pools a workload can use
    • PROJECT_ID : your Google Cloud project name
    • ZONE : the zone where you are creating resources
    • CLUSTER : the GKE cluster name
    • TPU_TYPE : the TPU type. For more information, see supported types in XPK
    • PW_CPU_MACHINE_TYPE : the CPU node type for the Pathways controller
    • NETWORK : [Optional] the name of a Virtual Private Cloud network to use with XPK. The network must exist before you create the cluster
    • SUBNETWORK : [Optional] the name of a subnetwork to use with XPK. The subnetwork must exist before you create the cluster
  2. Use XPK to create a GKE Pathways cluster. This command can take several minutes to provision the capacity. Once completed, your capacity is allocated and you will start incurring charges.

    xpk cluster create-pathways \
      --num-slices=${CLUSTER_NODEPOOL_COUNT} \
      --tpu-type=${TPU_TYPE} \
      --pathways-gce-machine-type=${PW_CPU_MACHINE_TYPE} \
      --on-demand \
      --project=${PROJECT} \
      --zone=${ZONE} \
      --cluster=${CLUSTER} \
      --custom-cluster-arguments="--network=${NETWORK} --subnetwork=${SUBNETWORK} --enable-ip-alias"

Once the cluster is created, you can create and delete workloads as needed. You don't need to re-provision the TPU capacity.
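
Because provisioning incurs charges once it completes, it can help to confirm that the variables from step 1 are set before you run the xpk command. The following is a minimal sketch, not part of XPK: the function name `check_required_vars` is hypothetical, and the choice of which variables count as required is an assumption based on the list above.

```shell
# Hypothetical helper: report which of the named variables are unset or empty.
check_required_vars() {
  missing=""
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing required variables:$missing" >&2
    return 1
  fi
  return 0
}

# NETWORK and SUBNETWORK are marked optional, so they are not checked here.
check_required_vars CLUSTER_NODEPOOL_COUNT PROJECT ZONE CLUSTER \
  TPU_TYPE PW_CPU_MACHINE_TYPE \
  || echo "Set the variables listed above before running xpk." >&2
```

The helper only reports missing names and returns a non-zero status. It does not stop the shell, so you decide how to react to a failed check.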

gcloud

  1. Set some environment variables

     CLUSTER=GKE_CLUSTER_NAME
     PROJECT=PROJECT_ID
     ZONE=ZONE
     REGION=REGION
     CLUSTER_VERSION=GKE_CLUSTER_VERSION
     PW_CPU_MACHINE_TYPE="n2-standard-64"
     NETWORK=NETWORK
     SUBNETWORK=SUB_NETWORK
     CLUSTER_NODEPOOL_COUNT=3
     TPU_MACHINE_TYPE="ct5lp-hightpu-4t"
     WORKERS_PER_SLICE=2
     TOPOLOGY="2x4"
     NUM_CPU_NODES=1

    Replace the following:

    • CLUSTER : the GKE cluster name
    • PROJECT_ID : your Google Cloud project name
    • ZONE : the zone where you are creating resources
    • REGION : the region where you are creating resources
    • CLUSTER_VERSION : [Optional] the GKE cluster version. Use 1.32.2-gke.1475000 or later
    • PW_CPU_MACHINE_TYPE : the CPU node type for the Pathways controller
    • NETWORK : [Optional] the name of a Virtual Private Cloud network. The network must exist before you create the cluster
    • SUBNETWORK : [Optional] the name of a subnetwork. The subnetwork must exist before you create the cluster
    • CLUSTER_NODEPOOL_COUNT : the maximum number of node pools a workload can use
    • TPU_MACHINE_TYPE : the TPU machine type you want to use
    • WORKERS_PER_SLICE : the number of nodes per node pool
    • GKE_ACCELERATOR_TYPE : the Google Kubernetes Engine accelerator type, see Choose a TPU version
    • TOPOLOGY : the TPU topology
    • NUM_CPU_NODES : the Pathways CPU node pool size
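
The values of TPU_MACHINE_TYPE, TOPOLOGY, and WORKERS_PER_SLICE depend on each other: the number of nodes in a node pool is the number of chips in the topology divided by the number of chips per VM. The sketch below works through that arithmetic for the example values. It assumes that ct5lp-hightpu-4t provides 4 TPU chips per VM; the helper `chips_in_topology` and the variable `CHIPS_PER_VM` are hypothetical names introduced for illustration.

```shell
# Assumption: ct5lp-hightpu-4t exposes 4 TPU chips per VM.
TOPOLOGY="2x4"
CHIPS_PER_VM=4

# Hypothetical helper: multiply the dimensions of a topology such as "2x4".
chips_in_topology() {
  total=1
  for dim in $(printf '%s' "$1" | tr 'x' ' '); do
    total=$((total * dim))
  done
  echo "$total"
}

TOTAL_CHIPS=$(chips_in_topology "$TOPOLOGY")
WORKERS_PER_SLICE=$((TOTAL_CHIPS / CHIPS_PER_VM))
echo "WORKERS_PER_SLICE=${WORKERS_PER_SLICE}"   # prints WORKERS_PER_SLICE=2
```

For the 2x4 topology this gives 8 chips and 2 nodes per node pool, which matches the WORKERS_PER_SLICE value used in this example.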

The following steps explain how to create a GKE cluster and set it up for running Pathways workloads.

  1. Create a GKE cluster:

    gcloud beta container clusters create ${CLUSTER} \
      --project=${PROJECT} \
      --zone=${ZONE} \
      --cluster-version=${CLUSTER_VERSION} \
      --scopes=storage-full,gke-default,cloud-platform \
      --machine-type ${PW_CPU_MACHINE_TYPE} \
      --network=${NETWORK} \
      --subnetwork=${SUBNETWORK}
  2. Create TPU node pools:

    for i in $(seq 1 ${CLUSTER_NODEPOOL_COUNT}); do
      gcloud container node-pools create "tpu-np-${i}" \
        --project=${PROJECT} \
        --zone=${ZONE} \
        --cluster=${CLUSTER} \
        --machine-type=${TPU_MACHINE_TYPE} \
        --num-nodes=${WORKERS_PER_SLICE} \
        --placement-type=COMPACT \
        --tpu-topology=${TOPOLOGY} \
        --scopes=storage-full,gke-default,cloud-platform \
        --workload-metadata=GCE_METADATA
    done
  3. Create a CPU node pool:

    gcloud container node-pools create "cpu-pathways-np" \
      --project ${PROJECT} \
      --zone ${ZONE} \
      --cluster ${CLUSTER} \
      --machine-type ${PW_CPU_MACHINE_TYPE} \
      --num-nodes ${NUM_CPU_NODES} \
      --scopes=storage-full,gke-default,cloud-platform \
      --workload-metadata=GCE_METADATA
  4. Install the JobSet and PathwaysJob APIs

    Get credentials for the cluster and add them to your local kubectl context.

    # For a regional cluster, use --region=${REGION} instead of --zone=${ZONE}.
    gcloud container clusters get-credentials ${CLUSTER} \
      --zone=${ZONE} \
      --project=${PROJECT} \
      && kubectl config set-context --current --namespace=default

    To use the Pathways architecture on your GKE cluster, you need to install the JobSet API and the PathwaysJob API.

    kubectl apply --server-side -f https://github.com/kubernetes-sigs/jobset/releases/download/v0.8.0/manifests.yaml
    kubectl apply --server-side -f https://github.com/google/pathways-job/releases/download/v0.1.2/install.yaml
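
After you finish the gcloud steps, you might want to confirm that every node pool exists. The sketch below prints the node pool names that steps 2 and 3 create, so you can compare them with the node pools reported for your cluster. The function name `expected_node_pools` is hypothetical, and the default pool that GKE creates with the cluster is not included in the output.

```shell
# Hypothetical helper: print the node pool names created in steps 2 and 3.
expected_node_pools() {
  count=$1
  i=1
  # One TPU node pool per slice, named tpu-np-1 through tpu-np-<count>.
  while [ "$i" -le "$count" ]; do
    echo "tpu-np-$i"
    i=$((i + 1))
  done
  # The single CPU node pool for the Pathways controller.
  echo "cpu-pathways-np"
}

# Falls back to 3, the example value of CLUSTER_NODEPOOL_COUNT, if unset.
expected_node_pools "${CLUSTER_NODEPOOL_COUNT:-3}"
```

With CLUSTER_NODEPOOL_COUNT set to 3, the output lists tpu-np-1, tpu-np-2, tpu-np-3, and cpu-pathways-np, one name per line.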
    

What's next
