This document describes how a cluster behaves if vCenter Server is down.
While vCenter Server is down:
- The machines are in the `Available` state.
- The nodes are in the `Ready` state.
- The Pods are in the `Running` state.
- There are some expected errors in Pods that connect to vCenter Server; for example, the `vsphere-controller-manager` and `cluster-health-controller` Pods.
- Stateless Pods can be created and deleted.
- The creation of a stateful Pod will fail, because attaching a disk requires access to vCenter Server. These Pods will be in the `Pending` state.
- The `gkectl diagnose` command will fail with an error similar to the following:
  `Exit with error: failed to prepare diagnose parameters: failed to create vSphere client: Post "https://my-server": dial tcp 203.0.113.1:443: connect: connection timed out`
- Auto repair is not triggered, because connection errors to vCenter Server do not change the machine or node states.
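
When scripting around `gkectl diagnose`, it can help to tell a vCenter Server connection failure apart from other failures. The following is a minimal sketch, not part of any official tooling: the helper name `parse_vcenter_connection_error` is hypothetical, and the marker strings are taken only from the sample error shown in this document, so they are an assumption rather than a stable output format.

```python
import re

# Assumption: these markers come from the sample error in this document,
# not from a documented, guaranteed gkectl output format.
VSPHERE_CLIENT_MARKER = "failed to create vSphere client"
ENDPOINT_PATTERN = re.compile(r"dial tcp (?P<host>[^\s:]+):(?P<port>\d+)")


def parse_vcenter_connection_error(output: str):
    """Return (host, port) if the text looks like a vCenter Server
    connection failure, otherwise return None."""
    if VSPHERE_CLIENT_MARKER not in output:
        return None
    match = ENDPOINT_PATTERN.search(output)
    if match is None:
        return None
    return match.group("host"), int(match.group("port"))


# The sample error message quoted in this document.
sample = (
    "Exit with error: failed to prepare diagnose parameters: "
    'failed to create vSphere client: Post "https://my-server": '
    "dial tcp 203.0.113.1:443: connect: connection timed out"
)

print(parse_vcenter_connection_error(sample))
```

For the sample message, the helper returns the address and port that could not be reached. Text without the vSphere client marker returns `None`, so unrelated failures are not misreported as a vCenter Server outage. The pattern only handles IPv4 addresses and host names.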
After vCenter Server comes back online (versions < 7.0U2)
- The machines go to the `Unavailable` state, and auto repair or a manual workaround is needed to restore the correct states.
- The cluster functions correctly even though the machines are in the `Unavailable` state.
After vCenter Server comes back online (versions >= 7.0U2)
- No extra steps are needed, and the cluster is healthy again.
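
Which of the two recovery behaviors applies depends on the version threshold of 7.0U2. The sketch below shows one way to compare a version string against that threshold. It assumes that the versions in the headings refer to vCenter Server releases written as `<major>.<minor>` with an optional `U<update>` suffix (for example `6.7U3`); the function names are hypothetical and the parsing is illustrative only.

```python
import re

# Assumed format: "<major>.<minor>[U<update>[letter]]", e.g. "6.7U3" or "7.0U2".
VERSION_PATTERN = re.compile(r"^(\d+)\.(\d+)(?:U(\d+)[a-z]?)?$")

# 7.0U2 is the threshold named in this document.
FIRST_VERSION_WITHOUT_EXTRA_STEPS = (7, 0, 2)


def parse_vcenter_version(version: str) -> tuple:
    """Turn a string such as "7.0U2" into a comparable tuple (7, 0, 2)."""
    match = VERSION_PATTERN.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, update = match.groups()
    # A missing update suffix is treated as update 0.
    return int(major), int(minor), int(update or 0)


def needs_machine_repair(version: str) -> bool:
    """True if machines are expected to go to the Unavailable state
    after vCenter Server comes back online (versions < 7.0U2)."""
    return parse_vcenter_version(version) < FIRST_VERSION_WITHOUT_EXTRA_STEPS


for v in ("6.7U3", "7.0U1", "7.0U2", "8.0"):
    print(v, needs_machine_repair(v))
```

Comparing tuples instead of raw strings avoids ordering mistakes such as treating `7.0U10` as older than `7.0U2`.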

