When a Pod fails or a service doesn't work as expected in
Google Kubernetes Engine (GKE), understanding the sequence of events leading up to
the issue is critical. Inspecting the current state isn't always enough to find
the root cause, making historical log data invaluable.
Use this page to learn how to use Cloud Logging to investigate past
failures (such as why a Pod failed to start or who deleted a critical
Deployment) by querying and analyzing GKE logs.
This information is important for Platform admins and operators who need to
perform root cause analysis on cluster-wide issues, audit changes, and
understand system behavior trends. It's also essential for
Application developers for debugging application-specific errors, tracing
request paths, and understanding how their code behaves in the
GKE environment over time. For more information about the common
roles and example tasks that we reference in Google Cloud content, see [Common GKE user roles and tasks](/kubernetes-engine/enterprise/docs/concepts/roles-tasks).
Understand key log types for troubleshooting
To help you troubleshoot, Cloud Logging automatically collects and aggregates
several key log types from your GKE clusters, containerized apps, and other
Google Cloud services:
Node and runtime logs (`kubelet`, `containerd`): the logs from the
underlying node services. Because the `kubelet` manages the lifecycle of all
Pods on the node, its logs are essential for troubleshooting issues like
container startups, Out of Memory (OOM) events, probe failures, and volume
mount errors. These logs are also crucial for diagnosing node-level
problems, such as a node that has a `NotReady` status.
Because containerd manages the lifecycle of your containers, including
pulling images, its logs are crucial for troubleshooting issues that happen
before the kubelet can start the container. The containerd logs help you
diagnose node-level problems in GKE, because they document the
specific activities and potential errors of the container runtime.
App logs (`stdout`, `stderr`): the standard output and error streams
from your containerized processes. These logs are essential for debugging
app-specific issues like crashes, errors, or unexpected behavior.
Audit logs: these logs answer "who did what, where, and when?" for
your cluster. They track administrative actions and API calls made to the
Kubernetes API server, which is useful for diagnosing issues caused by
configuration changes or unauthorized access.
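
As an illustration of how these log types map to filters, the following minimal sketch of a Logs Explorer query narrows results to the `stderr` stream of a single container. The NAMESPACE_NAME and CONTAINER_NAME values are placeholders for your own resources:

```
resource.type="k8s_container"
resource.labels.namespace_name="NAMESPACE_NAME"
resource.labels.container_name="CONTAINER_NAME"
log_id("stderr")
```

Changing the `log_id` value (for example, to `stdout`) or the `resource.type` (for example, to `k8s_node`) targets the other log types described in this section.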
Common troubleshooting scenarios
After you identify an issue, you can query these logs to find out what
happened. To get you started, here are some common issues that these logs can
help you investigate:
If a node has a `NotReady` status, review its node logs. The `kubelet` and
`containerd` logs often reveal the underlying cause, such as network
problems or resource constraints.
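
As a starting point, a query like the following sketch pulls both streams for a single node. It assumes the node logs are exported under the log IDs `kubelet` and `container-runtime`, which is typical for GKE nodes but worth verifying in your project; NODE_NAME is a placeholder:

```
resource.type="k8s_node"
resource.labels.node_name="NODE_NAME"
log_id("kubelet") OR log_id("container-runtime")
```

Adding `severity>=WARNING` trims the output to the entries most likely to explain the `NotReady` status.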
If a new node fails to provision and join the cluster, review the node's [serial port logs](/compute/docs/troubleshooting/viewing-serial-port-output).
These logs capture early boot and kubelet startup activity before the node's
logging agents are fully active.
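
If serial port logging to Cloud Logging is enabled for the node's VM, a sketch like the following surfaces those entries in Logs Explorer. It assumes the standard Compute Engine serial console log ID; INSTANCE_ID is a placeholder for the VM's numeric instance ID:

```
resource.type="gce_instance"
resource.labels.instance_id="INSTANCE_ID"
log_id("serialconsole.googleapis.com/serial_port_1_output")
```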
If a Pod failed to start in the past, review the app logs for that Pod to
check for crashes. If the logs are empty or the Pod can't be scheduled,
check the audit logs for relevant events or the node logs on the target node
for clues about resource pressure or image pull errors.
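
When the Pod's own logs are empty, a free-text search of the target node's logs can surface image pull or resource-pressure messages that mention the Pod. The following is a rough sketch; NODE_NAME and POD_NAME are placeholders, and the bare quoted string matches the Pod name anywhere in each log entry:

```
resource.type="k8s_node"
resource.labels.node_name="NODE_NAME"
"POD_NAME"
```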
If a critical Deployment was deleted and no one knows why, query the Admin
Activity audit logs. These logs can help you identify which user or service
account issued the delete API call, providing a clear starting point for
your investigation.
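
For the Deployment example, a query along these lines is a reasonable sketch. It assumes the Kubernetes audit entry's method name follows the usual `io.k8s.<group>.<version>.<resource>.<verb>` pattern; DEPLOYMENT_NAME is a placeholder, and the acting identity appears in the entry's `protoPayload.authenticationInfo.principalEmail` field:

```
resource.type="k8s_cluster"
log_id("cloudaudit.googleapis.com/activity")
protoPayload.methodName="io.k8s.apps.v1.deployments.delete"
protoPayload.resourceName:"DEPLOYMENT_NAME"
```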
How to access logs
Use Logs Explorer to query, view, and analyze GKE logs
in the Google Cloud console. Logs Explorer provides powerful filtering
options that help you to isolate your issue.
To access and use Logs Explorer, complete the following steps:
1. In the Google Cloud console, go to the **Logs Explorer** page.

   [Go to Logs Explorer](https://console.cloud.google.com/logs)

2. In the **query pane**, enter a query. Use the Logging query language to
   write targeted queries. Here are some common filters to get you started:

   | Filter type | Description | Example value |
   |---|---|---|
   | `resource.type` | The type of Kubernetes resource. | `k8s_cluster`, `k8s_node`, `k8s_pod`, `k8s_container` |
   | `log_id` | The log stream from the resource. | `stdout`, `stderr` |
   | `resource.labels.RESOURCE_TYPE.name` | Filter for resources with a specific name. Replace RESOURCE_TYPE with the name of the resource that you want to query, for example, `namespace` or `pod`. | `example-namespace-name`, `example-pod-name` |
   | `severity` | The log severity level. | `DEFAULT`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` |
   | `jsonPayload.message=~` | A regular expression search for text within the log message. | `scale.down.error.failed.to.delete.node.min.size.reached` |

   For example, to troubleshoot a specific Pod, you might want to isolate its
   error logs. To see only logs with an `ERROR` severity for that Pod, use the
   following query:

   ```
   resource.type="k8s_container"
   resource.labels.pod_name="POD_NAME"
   resource.labels.namespace_name="NAMESPACE_NAME"
   severity=ERROR
   ```

   Replace the following:

   - POD_NAME: the name of the Pod experiencing issues.
   - NAMESPACE_NAME: the namespace that the Pod is in. If you're not sure
     what the namespace is, review the `Namespace` column from the output of
     the `kubectl get pods` command.

   For more examples, see
   [Kubernetes-related queries](/logging/docs/view/query-library#kubernetes-filters)
   in the Google Cloud Observability documentation.

3. Click **Run query**.

4. To see the full log message, including the JSON payload, metadata, and
   timestamp, click the log entry.

For more information about GKE logs, see
[About GKE logs](/kubernetes-engine/docs/concepts/about-logs).

What's next

- Read [Perform proactive monitoring with Cloud Monitoring](/kubernetes-engine/docs/troubleshooting/introduction-monitoring)
  (the next page in this series).
- See these concepts applied in the
  [example troubleshooting scenario](/kubernetes-engine/docs/troubleshooting/introduction-example).
- For advice about resolving specific problems, review
  [GKE's troubleshooting guides](/kubernetes-engine/docs/troubleshooting).
- If you can't find a solution to your problem in the documentation, see
  [Get support](/kubernetes-engine/docs/getting-support) for further help,
  including advice on the following topics:
  - Opening a support case by contacting [Cloud Customer Care](/support-hub).
  - Getting support from the community by
    [asking questions on StackOverflow](http://stackoverflow.com/questions/tagged/google-kubernetes-engine)
    and using the `google-kubernetes-engine` tag to search for similar issues.
    You can also join the
    [`#kubernetes-engine` Slack channel](https://googlecloud-community.slack.com/messages/C0B9GKTKJ/)
    for more community support.
  - Opening bugs or feature requests by using the
    [public issue tracker](/support/docs/issue-trackers).