Version 1.14. This version is no longer supported. For information about how to upgrade to version 1.15, see Upgrading Anthos on bare metal in the 1.15 documentation. For more information about supported and unsupported versions, see the Version history page in the latest documentation.
When you need to repair or maintain nodes, you should first put the nodes into
maintenance mode. Putting nodes into maintenance mode safely drains their
pods/workloads and excludes the nodes from pod scheduling. In maintenance mode,
you can work on your nodes without a risk of disrupting pod traffic.
How it works
Google Distributed Cloud provides a way to place nodes into maintenance mode. This
approach lets other cluster components correctly know that the node is in
maintenance mode. When you place a node in maintenance mode, no additional pods
can be scheduled on the node, and existing pods are stopped.
Instead of using maintenance mode, you can manually use Kubernetes commands such
as kubectl cordon and kubectl drain on a specific node. If you run
Google Distributed Cloud version 1.12.0 (anthosBareMetalVersion: 1.12.0) or
lower, see the known issue "Nodes uncordoned if you don't use the maintenance mode procedure".
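For reference, the manual alternative amounts to two kubectl invocations per node. The following Python sketch only builds those command lines and does not run them. The helper name manual_drain_commands and the node name are hypothetical, and --ignore-daemonsets is included on the assumption that the node runs DaemonSet-managed pods, which kubectl drain otherwise refuses to proceed past.

```python
import shlex
from typing import List


def manual_drain_commands(node_name: str) -> List[List[str]]:
    """Return the kubectl commands for manually cordoning and draining a node.

    Hypothetical helper for illustration; it builds argument lists only.
    """
    return [
        # Mark the node unschedulable so that no new pods land on it.
        ["kubectl", "cordon", node_name],
        # Evict the pods already running there; DaemonSet pods are left alone.
        ["kubectl", "drain", node_name, "--ignore-daemonsets"],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for command in manual_drain_commands("user-anthos-baremetal-04"):
        print(shlex.join(command))
```

Unlike maintenance mode, these commands do not tell other cluster components why the node is unavailable, which is the gap the known issue above relates to.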
When you use the maintenance mode process, Google Distributed Cloud does the
following:
Node taints are added to the specified nodes to indicate that no pods can be
scheduled or executed on the nodes.
A 20-minute timeout is enforced to ensure nodes don't get stuck waiting for
pods to stop. Pods might not stop if they are configured to tolerate all taints
or they have finalizers.
Google Distributed Cloud attempts to stop all pods, but if the timeout is
exceeded, the node is put into maintenance mode anyway. This timeout prevents
running pods from blocking upgrades.
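To make the remark about pods that tolerate all taints more concrete, here is a minimal Python sketch of the standard Kubernetes rule for matching a toleration against a taint, applied to the two maintenance taints that this page lists. The function names are hypothetical, tolerationSeconds is ignored, and the sketch says nothing about how Google Distributed Cloud drains pods internally.

```python
MAINTENANCE_KEY = "baremetal.cluster.gke.io/maintenance"

# The two taints this page lists for nodes in maintenance mode.
MAINTENANCE_TAINTS = [
    {"key": MAINTENANCE_KEY, "effect": "NoExecute"},
    {"key": MAINTENANCE_KEY, "effect": "NoSchedule"},
]


def tolerates(toleration: dict, taint: dict) -> bool:
    """Simplified Kubernetes toleration matching (tolerationSeconds ignored)."""
    # An empty effect on the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key combined with Exists matches every taint key.
        return key == "" or key == taint["key"]
    # Operator Equal: both key and value must match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def pod_tolerates_maintenance(tolerations: list) -> bool:
    """True if every maintenance taint is matched by at least one toleration."""
    return all(
        any(tolerates(toleration, taint) for toleration in tolerations)
        for taint in MAINTENANCE_TAINTS
    )


if __name__ == "__main__":
    # A toleration with no key and operator Exists tolerates all taints.
    print(pod_tolerates_maintenance([{"operator": "Exists"}]))
    # A pod without tolerations does not tolerate the maintenance taints.
    print(pod_tolerates_maintenance([]))
```

A pod for which pod_tolerates_maintenance returns True is the kind of workload that might still be running when the 20-minute timeout expires.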
Put a node into maintenance mode
Choose the nodes you want to put into maintenance mode by specifying IP ranges
for the selected nodes under maintenanceBlocks in your cluster configuration
file. The nodes you choose must be in a ready state, and functioning in the
cluster.
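Before editing the configuration, it can help to confirm which node addresses a given entry would select. The sketch below assumes each entry is either a single IPv4 address or a START-END range such as 172.16.128.1-172.16.128.64. CIDR notation is not handled, because this page shows only those two forms. The function names are hypothetical.

```python
import ipaddress
from typing import Iterable, List


def block_contains(block: str, node_ip: str) -> bool:
    """Return True if node_ip falls inside a single address or START-END range."""
    ip = ipaddress.ip_address(node_ip)
    start_text, separator, end_text = block.partition("-")
    start = ipaddress.ip_address(start_text.strip())
    # An entry without "-" is a single address, so start and end coincide.
    end = ipaddress.ip_address(end_text.strip()) if separator else start
    return start <= ip <= end


def nodes_selected(blocks: Iterable[str], node_ips: Iterable[str]) -> List[str]:
    """Return the node IPs covered by at least one maintenance block entry."""
    blocks = list(blocks)
    return [ip for ip in node_ips if any(block_contains(b, ip) for b in blocks)]


if __name__ == "__main__":
    entries = ["172.16.128.1-172.16.128.64"]
    candidates = ["172.16.128.10", "172.16.128.65"]
    # Only addresses inside the range are reported.
    print(nodes_selected(entries, candidates))
```

The range bounds are treated as inclusive here, which is an assumption rather than something this page states.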
To put nodes into maintenance mode:
Edit the cluster configuration file to select the nodes you want to put into
maintenance mode.
You can edit the configuration file with an editor of your choice, or you
can edit the cluster custom resource directly by running the following
command:
kubectl -n CLUSTER_NAMESPACE edit cluster CLUSTER_NAME
Replace the following:
CLUSTER_NAMESPACE: the namespace of the cluster.
CLUSTER_NAME: the name of the cluster.
Add the maintenanceBlocks section to the cluster configuration file to
specify either a single IP address, or an address range, for nodes you want
to put into maintenance mode.
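The following sample shows how to select multiple nodes by specifying a
range of IP addresses:
metadata:
  name: my-cluster
  namespace: cluster-my-cluster
spec:
  maintenanceBlocks:
    cidrBlocks:
    - 172.16.128.1-172.16.128.64
Save and apply the updated cluster configuration.
Google Distributed Cloud starts putting the nodes into maintenance mode.
Run the following command to get the status of the nodes in your cluster:
kubectl get nodes --kubeconfig=KUBECONFIG
The response is something like the following:
NAME                       STATUS   ROLES           AGE     VERSION
user-anthos-baremetal-01   Ready    control-plane   2d22h   v1.25.10-gke.2100
user-anthos-baremetal-04   Ready    worker          2d22h   v1.25.10-gke.2100
user-anthos-baremetal-05   Ready    worker          2d22h   v1.25.10-gke.2100
user-anthos-baremetal-06   Ready    worker          2d22h   v1.25.10-gke.2100
Note that the nodes are still schedulable, but taints keep any pods (without
an appropriate toleration) from being scheduled on the node.
Run the following command to get the number of nodes in maintenance mode:
kubectl get nodepools
The response should look something like the following output:
NAME   READY   RECONCILING   STALLED   UNDERMAINTENANCE   UNKNOWN
np1    3       0             0         1                  0
The UNDERMAINTENANCE column in this sample shows that one node is in
maintenance mode.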
Google Distributed Cloud also adds the following taints to nodes when they are
put into maintenance mode:
baremetal.cluster.gke.io/maintenance:NoExecute
baremetal.cluster.gke.io/maintenance:NoSchedule
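If you want to read the UNDERMAINTENANCE count from a script rather than by eye, the tabular output of kubectl get nodepools can be parsed by column name. The following Python sketch assumes whitespace-separated columns with no empty cells, as in the sample output on this page. The function name is hypothetical.

```python
from typing import Dict

# Sample output in the shape shown on this page.
SAMPLE_OUTPUT = """\
NAME   READY   RECONCILING   STALLED   UNDERMAINTENANCE   UNKNOWN
np1    3       0             0         1                  0
"""


def under_maintenance_counts(table: str) -> Dict[str, int]:
    """Map each node pool name to its UNDERMAINTENANCE count."""
    lines = [line for line in table.strip().splitlines() if line.strip()]
    header = lines[0].split()
    # Look columns up by name so a changed column order does not break parsing.
    name_column = header.index("NAME")
    maintenance_column = header.index("UNDERMAINTENANCE")
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        counts[fields[name_column]] = int(fields[maintenance_column])
    return counts


if __name__ == "__main__":
    print(under_maintenance_counts(SAMPLE_OUTPUT))
```

Looking the column up by header name instead of by position keeps the sketch usable if other columns are added, although that robustness is a design choice here and not a documented guarantee about the output format.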
Remove a node from maintenance mode
To remove nodes from maintenance mode:
Edit the cluster configuration file to clear the nodes you want to remove
from maintenance mode.
You can edit the configuration file with an editor of your choice, or you
can edit the cluster custom resource directly by running the following
command:
kubectl -n CLUSTER_NAMESPACE edit cluster CLUSTER_NAME
Replace the following:
CLUSTER_NAMESPACE: the namespace of the cluster.
CLUSTER_NAME: the name of the cluster.
Either edit the IP addresses to remove specific nodes from maintenance mode,
or remove the maintenanceBlocks section to remove all nodes from maintenance
mode.
Save and apply the updated cluster configuration.
Use kubectl commands to check the status of your nodes.
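One way to confirm that a node has left maintenance mode is to check that it no longer carries the maintenance taint. The Python sketch below inspects a Node object that has already been parsed from JSON, for example from the output of kubectl get node NODE_NAME -o json, where NODE_NAME is a placeholder. The function name and the trimmed example document are hypothetical, and treating the absence of the taint as the success signal is an assumption.

```python
import json

MAINTENANCE_KEY = "baremetal.cluster.gke.io/maintenance"


def has_maintenance_taint(node: dict) -> bool:
    """Return True if a parsed Node object still carries the maintenance taint."""
    # spec.taints is absent (or null) when a node has no taints at all.
    taints = node.get("spec", {}).get("taints") or []
    return any(taint.get("key") == MAINTENANCE_KEY for taint in taints)


# Trimmed, made-up Node document standing in for real kubectl output.
EXAMPLE_NODE = json.loads("""
{
  "metadata": {"name": "user-anthos-baremetal-04"},
  "spec": {"taints": [
    {"key": "baremetal.cluster.gke.io/maintenance", "effect": "NoSchedule"}
  ]}
}
""")


if __name__ == "__main__":
    # The example node is still tainted, so it has not left maintenance mode.
    print(has_maintenance_taint(EXAMPLE_NODE))
```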