Investigate a cluster's state with kubectl


Diagnosing the root cause of Google Kubernetes Engine (GKE) issues often requires inspecting the live state, configuration, and events of your Kubernetes resources in detail. To move beyond surface-level symptoms, you need tools to directly query and interact with the cluster's control plane.

Use this page to learn essential kubectl commands for investigating the live state of your cluster. Learning these commands lets you gather detailed information directly from the Kubernetes control plane, helping you understand why a problem is occurring.

This information is important for Platform admins and operators who need to perform in-depth cluster health checks, manage resources, and troubleshoot infrastructure issues at a granular level. It's also essential for Application developers for debugging application behavior, inspecting Pod logs and events, and verifying the exact state of their deployments within the Kubernetes environment. For more information about the common roles and example tasks that we reference in Google Cloud content, see Common GKE user roles and tasks.

Before you begin

Before you start, perform the following tasks:

  • Install kubectl.
  • Configure the kubectl command-line tool to communicate with your cluster:

    gcloud container clusters get-credentials CLUSTER_NAME \
        --location=LOCATION

    Replace the following:

    • CLUSTER_NAME: the name of your cluster.
    • LOCATION: the Compute Engine location of the control plane of your cluster. Provide a region for regional clusters, or a zone for zonal clusters.
  • Review your permissions. To see if you have the required permissions to run kubectl commands, use the kubectl auth can-i command. For example, to see if you have permission to run kubectl get nodes, run the kubectl auth can-i get nodes command.

    If you have the required permissions, the command returns yes; otherwise, the command returns no.

    If you lack permission to run a kubectl command, you might see an error message similar to the following:

    Error from server (Forbidden): pods "POD_NAME" is forbidden: User
    "USERNAME@DOMAIN.com" cannot list resource "pods" in API group "" in the
    namespace "default"

    If you don't have the required permissions, ask your cluster administrator to assign the necessary roles to you.
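
To check several permissions in one pass, you can loop over the verb and resource pairs that you plan to use. The following is a minimal sketch rather than an official procedure: the function name check_permissions and the example pairs are hypothetical, and the sketch relies only on the kubectl auth can-i command described in the preceding list.

```shell
# Hypothetical helper: report whether the current user can run each
# verb/resource pair. Pass each pair as one "VERB RESOURCE" string.
check_permissions() {
  for pair in "$@"; do
    verb=${pair%% *}       # text before the first space
    resource=${pair#* }    # text after the first space
    # kubectl auth can-i prints "yes" or "no". "|| true" keeps the loop
    # going if the command exits with a non-zero status.
    answer=$(kubectl auth can-i "$verb" "$resource" 2>/dev/null || true)
    printf '%-6s %-12s %s\n' "$verb" "$resource" "${answer:-unknown}"
  done
}

# Example usage (requires access to a cluster):
# check_permissions "get nodes" "list pods" "get events"
```

A row that shows no, or unknown if the command produced no answer, points to a permission to raise with your cluster administrator.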

Get an overview of what's running

The kubectl get command gives you an overview of what's happening in your cluster. Use the following commands to see the status of two of the most important cluster components, nodes and Pods:

  1. To check if your nodes are healthy, view details about all nodes and their statuses:

    kubectl get nodes

    The output is similar to the following:

    NAME                                        STATUS   ROLES    AGE     VERSION
    gke-cs-cluster-default-pool-8b8a777f-224a   Ready    <none>   4d23h   v1.32.3-gke.1785003
    gke-cs-cluster-default-pool-8b8a777f-egb2   Ready    <none>   4d22h   v1.32.3-gke.1785003
    gke-cs-cluster-default-pool-8b8a777f-p5bn   Ready    <none>   4d22h   v1.32.3-gke.1785003

    Any status other than Ready requires additional investigation.

  2. To check if your Pods are healthy, view details about all Pods and their statuses:

    kubectl get pods --all-namespaces

    The output is similar to the following:

    NAMESPACE     NAME         READY   STATUS    RESTARTS   AGE
    kube-system   netd-6nbsq   3/3     Running   0          4d23h
    kube-system   netd-g7tpl   3/3     Running   0          4d23h

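    Any status other than Running usually requires additional investigation. One exception is Completed, which is normal for Pods whose containers ran to completion, such as Pods created by a finished Job. Here are some common statuses that you might see: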

    • Running: a healthy, running state.
    • Pending: the Pod is waiting to be scheduled on a node.
    • CrashLoopBackOff: the containers in the Pod are repeatedly crashing in a loop because the app starts, exits with an error, and is then restarted by Kubernetes.
    • ImagePullBackOff: the Pod can't pull the container image.
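
On a busy cluster, it can help to show only the rows that need attention. The following sketch filters the output of the two commands from the preceding steps by their STATUS column. The function names not_ready_nodes and unhealthy_pods are hypothetical, and treating Completed as healthy is an assumption of this sketch.

```shell
# Hypothetical filters: read `kubectl get ... --no-headers` output on stdin
# and keep only the rows whose STATUS column needs a closer look.

# Nodes: STATUS is the second column of `kubectl get nodes`.
not_ready_nodes() {
  awk '$2 != "Ready"'
}

# Pods: with --all-namespaces, STATUS is the fourth column.
# Completed is treated as healthy because Pods from finished Jobs report it.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed"'
}

# Example usage (requires access to a cluster):
# kubectl get nodes --no-headers | not_ready_nodes
# kubectl get pods --all-namespaces --no-headers | unhealthy_pods
```

Empty output from either filter means that every node or Pod passed the check.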

The preceding commands are only two examples of how you can use the kubectl get command. You can also use the command to learn more about many types of Kubernetes resources. For a full list of the resources that you can explore, see kubectl get in the Kubernetes documentation.
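
As an illustration of other resource types, the following sketch lists Deployments, Services, and events in a single namespace. The function name namespace_overview and the choice of resource types are hypothetical; substitute the types that matter for your workload.

```shell
# Hypothetical helper: list a few common resource types in one namespace.
namespace_overview() {
  ns=$1
  for kind in deployments services events; do
    echo "== $kind in $ns =="
    kubectl get "$kind" --namespace "$ns"
  done
}

# Example usage (requires access to a cluster):
# namespace_overview default
#
# To see which resource types your cluster supports:
# kubectl api-resources
```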

Learn more about specific resources

After you identify a problem, such as a Pod whose status isn't Running, you need more details to find the cause. To get those details, use the kubectl describe command.

For example, to describe a specific Pod, run the following command:

    kubectl describe pod POD_NAME -n NAMESPACE_NAME

Replace the following:

  • POD_NAME: the name of the Pod experiencing issues.
  • NAMESPACE_NAME: the namespace that the Pod is in. If you're not sure what the namespace is, review the NAMESPACE column in the output of the kubectl get pods --all-namespaces command.

The output of the kubectl describe command includes detailed information about your resource. Here are some of the most helpful sections to review when you troubleshoot a Pod:

  • Status: the current status of the Pod.
  • Conditions: the overall health and readiness of the Pod.
  • Restart Count: how many times the containers in the Pod have restarted. High numbers can be a cause for concern.
  • Events: a log of important things that have happened to this Pod, like being scheduled to a node, pulling its container image, and whether any errors occurred. The Events section is often where you can find the most direct clues about why a Pod is failing.
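
Because the describe output can be long, you might want to jump straight to the sections listed above. The following sketch extracts the Events section and the restart counts from the output. The function names events_section and restart_counts are hypothetical, and the sketch assumes that Events is the last section of the output, as it typically is for Pods.

```shell
# Hypothetical filters: read `kubectl describe pod` output on stdin.

# Print the Events section, from the "Events:" line to the end of the output.
events_section() {
  awk '/^Events:/ { show = 1 } show'
}

# Print the restart count of each container, one number per line.
restart_counts() {
  awk '/Restart Count:/ { print $3 }'
}

# Example usage (requires access to a cluster):
# kubectl describe pod POD_NAME -n NAMESPACE_NAME | events_section
# kubectl describe pod POD_NAME -n NAMESPACE_NAME | restart_counts
```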

As with the kubectl get command, you can use the kubectl describe command to learn more about many types of resources. For a full list of the resources that you can explore, see kubectl describe in the Kubernetes documentation.
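
For example, the same approach applies to nodes. The following sketch describes every node whose status isn't Ready by combining kubectl get nodes with kubectl describe node. The function name describe_not_ready_nodes is hypothetical.

```shell
# Hypothetical helper: describe each node whose STATUS column is not "Ready".
describe_not_ready_nodes() {
  kubectl get nodes --no-headers |
    awk '$2 != "Ready" { print $1 }' |
    while read -r node; do
      kubectl describe node "$node"
    done
}

# Example usage (requires access to a cluster):
# describe_not_ready_nodes
```

If every node is Ready, the function prints nothing.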

What's next
