About quicker workload startup with fast-starting nodes


This page shows you how to deploy and scale workloads more quickly in Google Kubernetes Engine (GKE) clusters using fast-starting nodes. Fast-starting nodes are used in GKE with Autopilot mode on a best-effort basis when workloads use compatible configurations.

Fast-starting GKE nodes have significantly lower startup time for compatible machine families. The accelerated startup time provides you with the following benefits:

  • Faster cold start
  • Faster autoscaling
  • Improved Pod scheduling long-tail latency
  • Improved infrastructure cost efficiency

With fast-starting nodes, GKE pre-initializes hardware resources to accelerate startup. The pre-initialized resources are available on a best-effort basis, so a surge of requests might be only partially served from them. Without fast-starting nodes, GKE initializes resources on demand, and nodes start in the typical startup time.

Requirements

Fast-starting nodes require no additional configuration. GKE automatically uses fast-starting nodes if your workloads use compatible configurations. You must meet all of the following requirements to use fast-starting nodes:

  • Use Autopilot clusters.
  • Use any version in the Rapid release channel.
  • Use any of the following compatible compute resources, with a maximum compatible boot disk size of 500 GiB:

  • Use the pd-balanced boot disk type.

  • Don't use any features that are incompatible with fast-starting nodes. For more information, see Limitations.

Limitations

The following features aren't compatible with fast-starting GKE nodes. If you use any of these features, GKE provisions nodes with the typical startup time:

Autopilot GPU workloads

Requesting compatible GPUs in your Autopilot clusters results in up to four times faster node startup time and up to two times faster Pod scheduling time than similar requests in GKE Standard clusters, because the Autopilot GPU workloads can use fast-starting nodes.

The following are some example use cases. However, any Pods meeting the conditions from the Requirements section are compatible with fast-starting nodes.

ComputeClass

Request a compatible accelerator type and count in a ComputeClass, like in the following example:

  apiVersion: cloud.google.com/v1
  kind: ComputeClass
  metadata:
    name: ACCELERATOR_COMPUTE_CLASS_NAME
  spec:
    priorities:
    - gpu:
        type: ACCELERATOR_TYPE
        count: ACCELERATOR_COUNT
    nodePoolAutoCreation:
      enabled: true

When you select this ComputeClass in a Pod, like in the following example, GKE uses fast-starting nodes:

  apiVersion: v1
  kind: Pod
  metadata:
    name: POD_NAME
  spec:
    nodeSelector:
      # Select a ComputeClass that requests compatible GPUs
      cloud.google.com/compute-class: ACCELERATOR_COMPUTE_CLASS_NAME
    containers:
    - name: my-container
      image: gcr.io/google_containers/pause
      resources:
        limits:
          nvidia.com/gpu: ACCELERATOR_COUNT

Replace the following values:

  • ACCELERATOR_COMPUTE_CLASS_NAME: the name of the ComputeClass that requests the accelerators.
  • ACCELERATOR_TYPE: the type of accelerator.
  • ACCELERATOR_COUNT: the number of accelerators required by the Pod. This value must be less than or equal to the value in the spec.priorities.gpu.count field in the ComputeClass.
  • POD_NAME: the name of your Pod.

For more information about ComputeClass, see About custom compute classes.
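The same ComputeClass selector can also sit in a Pod template rather than a standalone Pod. The following is a minimal sketch, assuming the cluster meets the conditions in the Requirements section: a Deployment whose replicas each select the ComputeClass and request GPUs. The Deployment name gpu-demo, the app label, and the replica count are hypothetical values chosen for illustration. The placeholders are the same ones described in the preceding list.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-demo            # hypothetical name
spec:
  replicas: 3               # hypothetical replica count
  selector:
    matchLabels:
      app: gpu-demo         # hypothetical label
  template:
    metadata:
      labels:
        app: gpu-demo
    spec:
      nodeSelector:
        # Select a ComputeClass that requests compatible GPUs
        cloud.google.com/compute-class: ACCELERATOR_COMPUTE_CLASS_NAME
      containers:
      - name: my-container
        image: gcr.io/google_containers/pause
        resources:
          limits:
            nvidia.com/gpu: ACCELERATOR_COUNT
```

Because every replica carries the same selector and GPU limit as the single Pod example, scaling the Deployment up creates Pods with the same configuration. Fast-starting nodes are still used on a best-effort basis.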

Pod specification

Select a compatible accelerator type and count in your Pod specification, like in the following example:

  apiVersion: v1
  kind: Pod
  metadata:
    name: POD_NAME
  spec:
    nodeSelector:
      cloud.google.com/gke-accelerator: ACCELERATOR_NAME
    containers:
    - name: my-container
      image: gcr.io/google_containers/pause
      resources:
        limits:
          nvidia.com/gpu: ACCELERATOR_COUNT

Replace the following values:

  • POD_NAME: the name of your Pod.
  • ACCELERATOR_NAME: the name of the accelerator required by the Pod.
  • ACCELERATOR_COUNT: the number of accelerators required by the Pod.
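The accelerator selector works the same way in the Pod template of a batch workload, where cold-start time matters most. The following is a minimal sketch, assuming the cluster meets the conditions in the Requirements section. The Job name gpu-batch-job and the JOB_IMAGE placeholder are hypothetical and stand in for your own values. ACCELERATOR_NAME and ACCELERATOR_COUNT have the same meaning as in the preceding list.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-batch-job       # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never  # Jobs require Never or OnFailure
      nodeSelector:
        cloud.google.com/gke-accelerator: ACCELERATOR_NAME
      containers:
      - name: my-container
        image: JOB_IMAGE    # hypothetical placeholder: an image that runs to completion
        resources:
          limits:
            nvidia.com/gpu: ACCELERATOR_COUNT
```

The sketch uses an image placeholder instead of the pause image from the Pod example, because a pause container never exits and the Job would never complete.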

Pricing

Fast-starting nodes are available in GKE Autopilot at no extra charge. For more information about GKE Autopilot pricing, see the Autopilot mode section in Google Kubernetes Engine pricing.

What's next
