This page shows you how to deploy and scale workloads more quickly in Google Kubernetes Engine (GKE) clusters using fast-starting nodes. Fast-starting nodes are used in GKE with Autopilot mode on a best-effort basis when workloads use compatible configurations.
Fast-starting GKE nodes have significantly lower startup time for compatible machine families. The accelerated startup time provides you with the following benefits:
- Faster cold start
- Faster autoscaling
- Improved long-tail Pod scheduling latency
- Improved infrastructure cost efficiency
With fast-starting nodes, GKE pre-initializes hardware resources to accelerate startup time. The pre-initialized resources are available on a best-effort basis. Surge requests might only be partially served. Without fast-starting nodes, resources are initialized on-demand, and nodes are served at normal startup time.
Requirements
Fast-starting nodes require no additional configuration. GKE automatically uses fast-starting nodes if your workloads use compatible configurations. You must meet all of the following requirements to use fast-starting nodes:
- Use Autopilot clusters.
- Use any version in the Rapid release channel (one way to create such a cluster is sketched after this list).
- Use any of the following compatible compute resources, with a maximum compatible boot disk size of 500 GiB:
  - NVIDIA L4 GPUs (G2 machine series)
- Use the `pd-balanced` boot disk type.
- Don't use any features that are incompatible with fast-starting nodes. For more information, see Limitations.
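If you don't already have a cluster that meets the first two requirements, the following commands are a minimal sketch of one way to create such a cluster and confirm its release channel. The cluster name and location are placeholders, not values from this page:

```sh
# Create an Autopilot cluster enrolled in the Rapid release channel.
# CLUSTER_NAME and LOCATION are placeholders; replace them with your own values.
gcloud container clusters create-auto CLUSTER_NAME \
    --location=LOCATION \
    --release-channel=rapid

# Confirm which release channel an existing cluster is enrolled in.
gcloud container clusters describe CLUSTER_NAME \
    --location=LOCATION \
    --format="value(releaseChannel.channel)"
```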
Limitations
The following features aren't compatible with fast-starting GKE nodes. If you use any of these features, GKE provisions nodes with the typical startup time:
- Customer-managed encryption keys (CMEK)
- Spot VMs
- Local SSDs
- Placement policies
- Multi-network support
Autopilot GPU workloads
Requesting compatible GPUs in your Autopilot clusters results in up to four times faster node startup time and up to two times faster Pod scheduling time than similar requests in GKE Standard clusters, because the Autopilot GPU workloads can use fast-starting nodes.
The following are some example use cases. However, any Pod that meets the conditions in the Requirements section is compatible with fast-starting nodes.
ComputeClass
Request a compatible accelerator type and count in a ComputeClass, like in the following example:
```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: ACCELERATOR_COMPUTE_CLASS_NAME
spec:
  priorities:
  - gpu:
      type: ACCELERATOR_TYPE
      count: ACCELERATOR_COUNT
  nodePoolAutoCreation:
    enabled: true
```
When you select this ComputeClass in a Pod, like in the following example, GKE uses fast-starting nodes:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: POD_NAME
spec:
  nodeSelector:
    # Select a ComputeClass that requests compatible GPUs
    cloud.google.com/compute-class: ACCELERATOR_COMPUTE_CLASS_NAME
  containers:
  - name: my-container
    image: gcr.io/google_containers/pause
    resources:
      limits:
        nvidia.com/gpu: ACCELERATOR_COUNT
```
Replace the following values:
- `ACCELERATOR_COMPUTE_CLASS_NAME`: the name of the ComputeClass that requests the accelerators.
- `ACCELERATOR_TYPE`: the type of accelerator.
- `ACCELERATOR_COUNT`: the number of accelerators required by the Pod. This value must be less than or equal to the value in the `spec.priorities.gpu.count` field in the ComputeClass.
- `POD_NAME`: the name of your Pod.
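As a usage sketch, you can save the two manifests to files and apply them with kubectl. The file names below are placeholders, not files provided by GKE:

```sh
# Apply the ComputeClass, then the Pod that selects it.
# The file names are placeholders for wherever you saved the manifests.
kubectl apply -f accelerator-compute-class.yaml
kubectl apply -f gpu-pod.yaml

# Check that the Pod is scheduled onto an auto-created GPU node.
kubectl get pod POD_NAME -o wide
```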
For more information about ComputeClass, see About custom compute classes.
Pod specification
Select a compatible accelerator type and count in your Pod specification, like in the following example:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: POD_NAME
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: ACCELERATOR_NAME
  containers:
  - name: my-container
    image: gcr.io/google_containers/pause
    resources:
      limits:
        nvidia.com/gpu: ACCELERATOR_COUNT
```
Replace the following values:
- `POD_NAME`: the name of your Pod.
- `ACCELERATOR_NAME`: the name of the accelerator required by the Pod.
- `ACCELERATOR_COUNT`: the number of accelerators required by the Pod.
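To see the effect on startup latency, one option (a sketch, assuming the Pod manifest is saved to a placeholder file and the Pod is named POD_NAME) is to apply the manifest and inspect the Pod's events, which record node provisioning and scheduling times:

```sh
# Apply the Pod manifest; the file name is a placeholder.
kubectl apply -f gpu-pod.yaml

# Events show node scale-up, image pulls, and scheduling,
# which you can use to compare startup latency.
kubectl describe pod POD_NAME

# Alternatively, list only the events that reference this Pod.
kubectl get events --field-selector involvedObject.name=POD_NAME
```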
Pricing
Fast-starting nodes are available in GKE Autopilot at no extra charge. For more information about GKE Autopilot pricing, see the Autopilot mode section in Google Kubernetes Engine pricing.