Scale container resource requests and limits


This page explains how you can analyze and adjust the CPU requests and memory requests of a container in a Google Kubernetes Engine (GKE) cluster using vertical Pod autoscaling.

You can scale container resources manually through the Google Cloud console, analyze resources using a VerticalPodAutoscaler object, or configure automatic scaling using vertical Pod autoscaling.

Before you begin

Before you start, make sure that you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update.

Analyze resource requests

The Vertical Pod Autoscaler automatically analyzes your containers and provides suggested resource requests. You can view these suggestions using the Google Cloud console, Cloud Monitoring, or the Google Cloud CLI.

Console

To view suggested resource requests in the Google Cloud console, you must have an existing workload deployed that is at least 24 hours old. Some suggestions might not be available or relevant for certain workloads, such as those created within the last 24 hours, standalone Pods, and apps written in Java.

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the workload you want to scale.

  3. Click Actions > Scale > Edit resource requests.

    The Analyze resource utilization data section shows historic usage data that the Vertical Pod Autoscaler controller analyzed to create the suggested resource requests in the Adjust resource requests and limits section.

Cloud Monitoring

To view suggested resource requests in Cloud Monitoring, you must have an existing workload deployed.

  1. Go to the Metrics Explorer page in the Google Cloud console.

    Go to Metrics Explorer

  2. Click Configuration.

  3. Expand the Select a metric menu.

  4. In the Resource menu, select Kubernetes Scale.

  5. In the Metric category menu, select Autoscaler.

  6. In the Metric menu, select Recommended per replica request bytes and Recommended per replica request cores.

  7. Click Apply.

gcloud CLI

To view suggested resource requests, you must create a VerticalPodAutoscaler object and a Deployment.

  1. For Standard clusters, enable vertical Pod autoscaling for your cluster. For Autopilot clusters, vertical Pod autoscaling is enabled by default.

     gcloud container clusters update CLUSTER_NAME \
         --enable-vertical-pod-autoscaling

    Replace CLUSTER_NAME with the name of your cluster.
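
    Optionally, to confirm that the feature is enabled, you can inspect the cluster configuration. This check is a suggested extra, not part of the original procedure:

     gcloud container clusters describe CLUSTER_NAME \
         --location=LOCATION \
         --format="value(verticalPodAutoscaling.enabled)"

    Replace LOCATION with the Compute Engine zone or region of your cluster. The command prints True when vertical Pod autoscaling is enabled.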

  2. Save the following manifest as my-rec-deployment.yaml:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: my-rec-deployment
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: my-rec-deployment
        template:
          metadata:
            labels:
              app: my-rec-deployment
          spec:
            containers:
            - name: my-rec-container
              image: nginx

    This manifest describes a Deployment that does not have CPU or memory requests. The Pods in this Deployment belong to the VerticalPodAutoscaler that you create in a later step, because that object's targetRef field names my-rec-deployment.

  3. Apply the manifest to the cluster:

     kubectl create -f my-rec-deployment.yaml
  4. Save the following manifest as my-rec-vpa.yaml:

      apiVersion: autoscaling.k8s.io/v1
      kind: VerticalPodAutoscaler
      metadata:
        name: my-rec-vpa
      spec:
        targetRef:
          apiVersion: "apps/v1"
          kind: Deployment
          name: my-rec-deployment
        updatePolicy:
          updateMode: "Off"

    This manifest describes a VerticalPodAutoscaler. The updateMode value of Off means that when Pods are created, the Vertical Pod Autoscaler controller analyzes the CPU and memory needs of the containers and records those recommendations in the status field of the resource. The Vertical Pod Autoscaler controller does not automatically update the resource requests for running containers.

  5. Apply the manifest to the cluster:

     kubectl create -f my-rec-vpa.yaml

  6. After some time, view the VerticalPodAutoscaler:

     kubectl get vpa my-rec-vpa --output yaml

    The output is similar to the following:

     ...
      recommendation:
        containerRecommendations:
        - containerName: my-rec-container
          lowerBound:
            cpu: 25m
            memory: 262144k
          target:
            cpu: 25m
            memory: 262144k
          upperBound:
            cpu: 7931m
            memory: 8291500k
    ... 
    

    This output shows recommendations for CPU and memory requests.
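
    If you want only the recommended target values, for example to use them in a script, you can extract them with a jsonpath query instead of reading the full YAML output. This optional query assumes the same VerticalPodAutoscaler as the previous step:

     kubectl get vpa my-rec-vpa \
         --output jsonpath='{.status.recommendation.containerRecommendations[0].target}'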

Set Pod resource requests manually

You can set Pod resource requests manually using the Google Cloud console or kubectl. Use the following best practices for setting your container resource requests and limits; an illustrative sketch follows the list:

  • Memory: Set the same amount of memory for the request and limit.
  • CPU: For the request, specify the minimum CPU needed to ensure correct operation, according to your own SLOs. Set an unbounded CPU limit.
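
For example, a container spec that follows these practices might look like the following sketch. The values are illustrative placeholders, not recommendations for your workload:

 resources:
   requests:
     cpu: 250m        # minimum CPU that your SLOs require
     memory: 256Mi    # memory request matches the limit
   limits:
     memory: 256Mi    # same value as the memory request
                      # no cpu limit is set, so CPU stays unbounded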

Console

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the workload you want to scale.

  3. Click Actions > Scale > Edit resource requests.

    The Adjust resource requests and limits section shows the current CPU and memory requests for each container, as well as suggested CPU and memory requests.

  4. Click Apply Latest Suggestions to view suggested requests for each container.

  5. Click Save Changes.

  6. Click Confirm.

kubectl

Vertically scale your workload with minimal disruption

Preview

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Starting in Kubernetes version 1.33, you can use the kubectl patch command to vertically scale your workload by updating the resources that are assigned to a container, without recreating the Pod. For more information, including limitations, see the Kubernetes documentation for resizing CPU and memory resources.

To use the kubectl patch command, specify the updated resource request in the --patch flag. For example, to scale my-app to 800 mCPU, run the following command:

 kubectl patch pod my-app --subresource resize --patch \
   '{"spec":{"containers":[{"name":"pause", "resources":{"requests":{"cpu":"800m"}, "limits":{"cpu":"800m"}}}]}}'
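
To confirm that the resize was applied in place, you can read the container's resources back from the Pod spec. This is a suggested follow-up check that assumes the same example Pod as the command above:

 kubectl get pod my-app \
     --output jsonpath='{.spec.containers[0].resources}'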

Vertically scale your workload

To set resource requests for a Pod, set the resources.requests.cpu and resources.requests.memory values in your Deployment manifest. In this example, you manually modify the Deployment created in Analyze resource requests to use the suggested resource requests.

  1. Save the following example manifest as my-adjusted-deployment.yaml:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: my-rec-deployment
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: my-rec-deployment
        template:
          metadata:
            labels:
              app: my-rec-deployment
          spec:
            containers:
            - name: my-rec-container
              image: nginx
              resources:
                requests:
                  cpu: 25m
                  memory: 256Mi

    This manifest describes a Deployment that has two Pods. Each Pod has one container that requests 25 milliCPU and 256 MiB of memory.

  2. Apply the manifest to the cluster:

     kubectl apply -f my-adjusted-deployment.yaml
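
    After the rollout completes, you can optionally verify that the new Pods carry the updated requests. The label selector matches the example manifest, and POD_NAME stands for one of the Pod names that the first command prints:

     kubectl get pods -l app=my-rec-deployment
     kubectl get pod POD_NAME \
         --output jsonpath='{.spec.containers[0].resources.requests}'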

Set Pod resource requests automatically

Vertical Pod autoscaling uses the VerticalPodAutoscaler object to automatically set resource requests on Pods when the updateMode is Auto. You can configure a VerticalPodAutoscaler using the gcloud CLI or the Google Cloud console.

Console

To set resource requests automatically, you must have a cluster with the vertical Pod autoscaling feature enabled. Autopilot clusters have the vertical Pod autoscaling feature enabled by default.

Enable Vertical Pod Autoscaling

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the cluster you want to modify.

  3. In the Automation section, click Edit for the Vertical Pod Autoscaling option.

  4. Select the Enable Vertical Pod Autoscaling checkbox.

  5. Click Save changes.

Configure Vertical Pod Autoscaling

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the Deployment you want to configure vertical Pod autoscaling for.

  3. Click Actions > Autoscale > Vertical pod autoscaling.

  4. Choose an autoscaling mode:

    • Auto mode: Vertical Pod autoscaling updates CPU and memory requests during the life of a Pod.
    • Initial mode: Vertical Pod autoscaling assigns resource requests only at Pod creation and never changes them later.
  5. (Optional) Set container policies. This option lets you ensure that the recommendation is never set above or below a specified resource request.

    1. Click Add Policy.
    2. Select Auto for Edit container mode.
    3. In Controlled resources, select which resources you want to autoscale the container on.
    4. Click Add Rule to set one or more minimum or maximum ranges for the container's resource requests:
      • Min. allowed Memory: the minimum amount of memory that the container should always have, in MiB.
      • Min. allowed CPU: the minimum amount of CPU that the container should always have, in mCPU.
      • Max allowed Memory: the maximum amount of memory that the container can have, in MiB.
      • Max allowed CPU: the maximum amount of CPU that the container can have, in mCPU.
  6. Click Done.

  7. Click Save.

gcloud

To set resource requests automatically, you must use a cluster that has the vertical Pod autoscaling feature enabled. Autopilot clusters have the feature enabled by default.

  1. For Standard clusters, enable vertical Pod autoscaling for your cluster:

     gcloud container clusters update CLUSTER_NAME \
         --enable-vertical-pod-autoscaling

    Replace CLUSTER_NAME with the name of your cluster.

  2. Save the following manifest as my-auto-deployment.yaml:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: my-auto-deployment
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: my-auto-deployment
        template:
          metadata:
            labels:
              app: my-auto-deployment
          spec:
            containers:
            - name: my-container
              image: registry.k8s.io/ubuntu-slim:0.14
              resources:
                requests:
                  cpu: 100m
                  memory: 50Mi
              command: ["/bin/sh"]
              args: ["-c", "while true; do timeout 0.5s yes > /dev/null; sleep 0.5s; done"]

    This manifest describes a Deployment that has two Pods. Each Pod has one container that requests 100 milliCPU and 50 MiB of memory.

  3. Apply the manifest to the cluster:

     kubectl create -f my-auto-deployment.yaml

  4. List the running Pods:

     kubectl get pods

    The output shows the names of the Pods in my-auto-deployment:

     NAME                                 READY     STATUS    RESTARTS   AGE
     my-auto-deployment-cbcdd49fb-d6bf9   1/1       Running   0          8s
     my-auto-deployment-cbcdd49fb-th288   1/1       Running   0          8s
    
  5. Save the following manifest as my-vpa.yaml:

      apiVersion: autoscaling.k8s.io/v1
      kind: VerticalPodAutoscaler
      metadata:
        name: my-vpa
      spec:
        targetRef:
          apiVersion: "apps/v1"
          kind: Deployment
          name: my-auto-deployment
        updatePolicy:
          updateMode: "Auto"

    This manifest describes a VerticalPodAutoscaler with the following properties:

    • targetRef.name: specifies that any Pod that is controlled by the Deployment named my-auto-deployment belongs to this VerticalPodAutoscaler.
    • updateMode: Auto: specifies that the Vertical Pod Autoscaler controller can delete a Pod, adjust the CPU and memory requests, and then start a new Pod.

    You can also configure vertical Pod autoscaling to assign resource requests only at Pod creation time, using updateMode: "Initial".
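
    If you prefer creation-time-only recommendations, the only change to the manifest above is the update policy. A minimal sketch:

      updatePolicy:
        updateMode: "Initial"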

  6. Apply the manifest to the cluster:

     kubectl create -f my-vpa.yaml

  7. Wait a few minutes, and view the running Pods again:

     kubectl get pods

    The output shows that the Pod names have changed:

     NAME                                 READY     STATUS             RESTARTS   AGE
    my-auto-deployment-89dc45f48-5bzqp   1/1       Running            0          8s
    my-auto-deployment-89dc45f48-scm66   1/1       Running            0          8s 
    

    If the Pod names have not changed, wait a bit longer, and then view the running Pods again.
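
    Instead of polling manually, you can watch the Pod list until the controller replaces the Pods (press Ctrl+C to stop watching):

     kubectl get pods --watch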

View information about a Vertical Pod Autoscaler

To view details about a Vertical Pod Autoscaler, do the following:

  1. Get detailed information about one of your running Pods:

     kubectl get pod POD_NAME --output yaml

    Replace POD_NAME with the name of one of your Pods that you retrieved in the previous step.

    The output is similar to the following:

     apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        vpaUpdates: 'Pod resources updated by my-vpa: container 0: cpu capped to node capacity, memory capped to node capacity, cpu request, memory request'
    ...
    spec:
      containers:
      ...
        resources:
          requests:
            cpu: 510m
            memory: 262144k
        ... 
    

    This output shows that the Vertical Pod Autoscaler controller set a memory request of 262144k and a CPU request of 510 milliCPU for the container.

  2. Get detailed information about the VerticalPodAutoscaler :

     kubectl get vpa my-vpa --output yaml

    The output is similar to the following:

     ...
      recommendation:
        containerRecommendations:
        - containerName: my-container
          lowerBound:
            cpu: 536m
            memory: 262144k
          target:
            cpu: 587m
            memory: 262144k
          upperBound:
            cpu: 27854m
            memory: "545693548" 
    

    This output shows recommendations for CPU and memory requests and includes the following properties:

    • target: specifies that, for the container to run optimally, it should request 587 milliCPU and 262144k of memory.
    • lowerBound and upperBound: vertical Pod autoscaling uses these properties to decide whether to delete a Pod and replace it with a new Pod. If a Pod has requests less than the lower bound or greater than the upper bound, the Vertical Pod Autoscaler deletes the Pod and replaces it with a Pod that meets the target attribute.
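
    To see the same recommendation together with the conditions and events that the controller reports, you can also describe the object. This is an optional alternative to reading the raw YAML:

     kubectl describe vpa my-vpa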

Opt out specific containers

You can opt out specific containers from vertical Pod autoscaling using the gcloud CLI or the Google Cloud console.

Console

To opt out specific containers from vertical Pod autoscaling, you must have a cluster with the vertical Pod autoscaling feature enabled. Autopilot clusters have the vertical Pod autoscaling feature enabled by default.

Enable Vertical Pod Autoscaling

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the cluster you want to modify.

  3. In the Automation section, click Edit for the Vertical Pod Autoscaling option.

  4. Select the Enable Vertical Pod Autoscaling checkbox.

  5. Click Save changes.

Configure Vertical Pod Autoscaling

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the Deployment you want to configure vertical Pod autoscaling for.

  3. Click Actions > Autoscale > Vertical pod autoscaling.

  4. Choose an autoscaling mode:

    • Auto mode: Vertical Pod autoscaling updates CPU and memory requests during the life of a Pod.
    • Initial mode: Vertical Pod autoscaling assigns resource requests only at Pod creation and never changes them later.
  5. Click Add Policy.

  6. Select the container you want to opt out.

  7. For Edit container mode, select Off.

  8. Click Done.

  9. Click Save.

gcloud

To opt out specific containers from vertical Pod autoscaling, perform the following steps:

  1. Save the following manifest as my-opt-vpa.yaml:

      apiVersion: autoscaling.k8s.io/v1
      kind: VerticalPodAutoscaler
      metadata:
        name: my-opt-vpa
      spec:
        targetRef:
          apiVersion: "apps/v1"
          kind: Deployment
          name: my-opt-deployment
        updatePolicy:
          updateMode: "Auto"
        resourcePolicy:
          containerPolicies:
          - containerName: my-opt-sidecar
            mode: "Off"

    This manifest describes a VerticalPodAutoscaler. The mode: "Off" value turns off recommendations for the container my-opt-sidecar.

  2. Apply the manifest to the cluster:

     kubectl apply -f my-opt-vpa.yaml
  3. Save the following manifest as my-opt-deployment.yaml:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: my-opt-deployment
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: my-opt-deployment
        template:
          metadata:
            labels:
              app: my-opt-deployment
          spec:
            containers:
            - name: my-opt-container
              image: nginx
            - name: my-opt-sidecar
              image: busybox
              command: ["sh", "-c", "while true; do echo Doing sidecar stuff!; sleep 60; done"]
  4. Apply the manifest to the cluster:

     kubectl apply -f my-opt-deployment.yaml
  5. After some time, view the Vertical Pod Autoscaler:

     kubectl get vpa my-opt-vpa --output yaml

    The output shows recommendations for CPU and memory requests:

     ...
      recommendation:
        containerRecommendations:
        - containerName: my-opt-container
    ... 
    

    In this output, there are only recommendations for one container. There are no recommendations for my-opt-sidecar .

    The Vertical Pod Autoscaler never updates the resources of opted-out containers. If you wait a few minutes, the Pod is recreated, but only one container has updated resource requests.
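
    One way to confirm the opt-out from the command line is to list the container names that received recommendations; only my-opt-container should appear. This optional query uses standard kubectl jsonpath:

     kubectl get vpa my-opt-vpa \
         --output jsonpath='{.status.recommendation.containerRecommendations[*].containerName}'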

Identify workloads without resource requests or limits

You might want to identify workloads that don't have configured resource requests and limits. GKE recommends setting requests and limits for all workloads as a best practice, because doing so avoids abrupt Pod termination under node resource pressure and improves the accuracy of cost allocation. Running BestEffort Pods, or Pods with Burstable memory, might lead to reliability issues when a node experiences memory pressure. Use the following best practices for setting your container resource requests and limits:

  • Memory: Set the same amount of memory for the request and limit.
  • CPU: For the request, specify the minimum CPU needed to ensure correct operation, according to your own SLOs. Set an unbounded CPU limit.

GKE generates insights and recommendations for workloads running without resource requests and limits.

GKE detects the following resource configuration scenarios, all under the insight subtype REQUEST_OR_LIMIT_NOT_SET:

  • No configured memory request and limit (MEMORY_REQUEST_AND_LIMIT_NOT_SET): Pods are running without memory requests and limits set for their containers. GKE cannot throttle memory usage and might abruptly terminate such Pods if a node experiences memory pressure, which might cause reliability issues.
  • No configured memory limit (MEMORY_LIMIT_NOT_SET): Pods are running without memory limits set for their containers. GKE cannot throttle memory usage and might abruptly terminate such Pods if a node experiences memory pressure and the Pods' memory usage exceeds their requests, which might cause reliability issues. Set the same amount of memory for requests and limits to prevent Pods from using more memory than they request.
  • No configured CPU request and limit (CPU_REQUEST_AND_LIMIT_NOT_SET): Pods are running without CPU requests and limits set for their containers. This increases the chance of node resource exhaustion, makes Pods more likely to be throttled when node CPU utilization is close to its limit, and might cause performance issues.

For more information about these insights, follow the instructions to view insights and recommendations.
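
If you prefer to audit workloads from the command line instead of the console, the following query is a rough sketch that prints each Deployment's container resources; containers that have no requests or limits set show an empty resources value. The query is illustrative, not an official GKE tool:

 kubectl get deployments --all-namespaces \
     --output jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{": "}{.spec.template.spec.containers[*].resources}{"\n"}{end}'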

Manually check resource requests and limits

You might want to manually review which resource requests and limits are missing for a given workload so that you can update its configuration as recommended.

To review or update resource requests and limits configuration for a specified workload, do the following:

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the workload you want to inspect.

  3. Click Actions > Scale > Edit resource requests.

    The Adjust resource requests and limits section shows the current CPU and memory requests for each container.
