This document gives troubleshooting guidance for storage issues.
## Volume fails to attach
This issue can occur if a virtual disk is attached to the wrong virtual machine, which may be due to Issue #32727 in Kubernetes 1.12.
The output of `gkectl diagnose cluster` looks like this:

```
Checking cluster object...PASS
Checking machine objects...PASS
Checking control plane pods...PASS
Checking gke-connect pods...PASS
Checking kube-system pods...PASS
Checking gke-system pods...PASS
Checking storage...FAIL
    PersistentVolume pvc-776459c3-d350-11e9-9db8-e297f465bc84: virtual disk "[datastore_nfs] kubevols/kubernetes-dynamic-pvc-776459c3-d350-11e9-9db8-e297f465bc84.vmdk" IS attached to machine "gsl-test-user-9b46dbf9b-9wdj7" but IS NOT listed in the Node.Status
1 storage errors
```
One or more Pods are stuck in the `ContainerCreating` state with warnings like this:

```
Events:
  Type     Reason              Age               From                     Message
  ----     ------              ----              ----                     -------
  Warning  FailedAttachVolume  6s (x6 over 31s)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-776459c3-d350-11e9-9db8-e297f465bc84" : Failed to add disk 'scsi0:6'.
```
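A quick way to find the affected Pods is to filter on their status; this is a generic `kubectl` sketch, not a `gkectl` feature:

```
# List Pods in all namespaces that are stuck in ContainerCreating.
kubectl get pods --all-namespaces | grep ContainerCreating
```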
To resolve this issue, if a virtual disk is attached to the wrong virtual machine, you might need to manually detach it:

1.  Drain the node, as shown in the sketch after this list. See Safely draining a node in the Kubernetes documentation. You might want to include the `--ignore-daemonsets` and `--delete-local-data` flags in your `kubectl drain` command.

2.  Edit the VM's hardware configuration in vCenter to remove the volume.
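For example, assuming the affected node is named `node-1` (a placeholder), the drain might look like this:

```
# Cordon the node and evict its Pods. --ignore-daemonsets lets the drain
# proceed past DaemonSet-managed Pods; --delete-local-data allows eviction
# of Pods that use emptyDir volumes (newer kubectl versions call this flag
# --delete-emptydir-data).
kubectl drain node-1 --ignore-daemonsets --delete-local-data

# After the disk is detached in vCenter, make the node schedulable again.
kubectl uncordon node-1
```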
## Volume is lost
This issue can occur if a virtual disk was permanently deleted. This can happen if an operator manually deletes a virtual disk or the virtual machine it is attached to. If you see a "not found" error related to your VMDK file, it is likely that the virtual disk was permanently deleted.
The output of `gkectl diagnose cluster` looks like this:

```
Checking cluster object...PASS
Checking machine objects...PASS
Checking control plane pods...PASS
Checking gke-connect pods...PASS
Checking kube-system pods...PASS
Checking gke-system pods...PASS
Checking storage...FAIL
    PersistentVolume pvc-52161704-d350-11e9-9db8-e297f465bc84: virtual disk "[datastore_nfs] kubevols/kubernetes-dynamic-pvc-52161704-d350-11e9-9db8-e297f465bc84.vmdk" IS NOT found
1 storage errors
```
One or more Pods are stuck in the `ContainerCreating` state:

```
Events:
  Type     Reason              Age                 From                     Message
  ----     ------              ----                ----                     -------
  Warning  FailedAttachVolume  71s (x28 over 42m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-52161704-d350-11e9-9db8-e297f465bc84" : File []/vmfs/volumes/43416d29-03095e58/kubevols/kubernetes-dynamic-pvc-52161704-d350-11e9-9db8-e297f465bc84.vmdk was not found
```
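To confirm that the virtual disk is really gone, one option is to list the `kubevols` folder on the datastore with `govc` (assuming the tool is installed and configured with `GOVC_URL` credentials); the datastore and folder names here are taken from the error above:

```
# List the kubevols directory on the datastore named in the error.
# If the VMDK does not appear, it was permanently deleted.
govc datastore.ls -ds datastore_nfs kubevols | grep 52161704
```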
To prevent this issue from occurring, manage your virtual machines as described in Resizing a user cluster and Upgrading clusters.
To resolve this issue, you might need to manually clean up related Kubernetes resources, as shown in the sketch after this list:

1.  Delete the PVC that referenced the PV by running `kubectl delete pvc [PVC_NAME]`.

2.  Delete the Pod that referenced the PVC by running `kubectl delete pod [POD_NAME]`.

3.  Repeat step 2. Yes, really. See Kubernetes issue 74374.
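As a concrete sketch, with `pvc-xxxxx` and `my-pod` as placeholder names:

```
# 1. Delete the PVC that referenced the lost PV.
kubectl delete pvc pvc-xxxxx

# 2. Delete the Pod that referenced the PVC.
kubectl delete pod my-pod

# 3. Repeat the Pod deletion (see Kubernetes issue 74374).
kubectl delete pod my-pod
```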
## vSphere CSI volume fails to detach

This issue occurs if the `CNS > Searchable` privilege has not been granted to the vSphere user.
If you find Pods stuck in the `ContainerCreating` phase with `FailedAttachVolume` warnings, it could be due to a failed detach on a different node.
To check for CSI detach errors:

```
kubectl get volumeattachments -o=custom-columns=NAME:metadata.name,DETACH_ERROR:status.detachError.message
```
The output is similar to the following:

```
NAME                                                                   DETACH_ERROR
csi-0e80d9be14dc09a49e1997cc17fc69dd8ce58254bd48d0d8e26a554d930a91e5   rpc error: code = Internal desc = QueryVolume failed for volumeID: "57549b5d-0ad3-48a9-aeca-42e64a773469". ServerFaultCode: NoPermission
csi-164d56e3286e954befdf0f5a82d59031dbfd50709c927a0e6ccf21d1fa60192d
csi-8d9c3d0439f413fa9e176c63f5cc92bd67a33a1b76919d42c20347d52c57435c
csi-e40d65005bc64c45735e91d7f7e54b2481a2bd41f5df7cc219a2c03608e8e7a8
```
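To see which PersistentVolume and node each stuck VolumeAttachment refers to, you can extend the custom columns; `spec.source.persistentVolumeName` and `spec.nodeName` are standard fields of the VolumeAttachment API object:

```
kubectl get volumeattachments -o=custom-columns=NAME:metadata.name,PV:spec.source.persistentVolumeName,NODE:spec.nodeName,DETACH_ERROR:status.detachError.message
```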
To resolve this issue, add the `CNS > Searchable` privilege to your vCenter user account.
The detach operation automatically retries until it succeeds.
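After granting the privilege, you can watch the VolumeAttachments until the errors clear; each object is removed once its detach succeeds:

```
# Watch VolumeAttachments; successfully detached volumes disappear from the list.
kubectl get volumeattachments --watch
```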
## CSI volume creation fails with `NotSupported` error
This issue occurs when an ESXi host in the vSphere cluster is running a version lower than ESXi 6.7U3.
The output of `kubectl describe pvc` includes this error:

```
Failed to provision volume with StorageClass: rpc error: code = Internal desc = Failed to create volume. Error: CnsFault error: CNS: Failed to create disk.:Fault cause: vmodl.fault.NotSupported
```
To resolve this issue, upgrade your ESXi hosts to version 6.7U3 or later.
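One way to check the version of each host, assuming you have SSH access to the ESXi hosts, is `esxcli`:

```
# Run on each ESXi host. Version 6.7.0 with Update 3 (or any later
# release) satisfies the requirement.
esxcli system version get
```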
## vSphere CSI volume fails to attach
This known issue in the open-source vSphere CSI driver occurs when a node is shut down, deleted, or fails.
The output of `kubectl describe pod` looks like this:

```
Events:
  Type     Reason              Age    From                     Message
  ----     ------              ----   ----                     -------
  Warning  FailedAttachVolume  2m30s  attachdetach-controller  Multi-Attach error for volume "pvc-xxxxx"
           Volume is already exclusively attached to one node and can't be attached to another
```
To resolve this issue:

1.  Note the name of the PersistentVolumeClaim (PVC) in the preceding output.

2.  Find the VolumeAttachments that are associated with that PVC. For example:

    ```
    kubectl get volumeattachments | grep pvc-xxxxx
    ```

    The output shows the names of the VolumeAttachments. For example:

    ```
    csi-yyyyy   csi.vsphere.vmware.com   pvc-xxxxx   node-zzzzz ...
    ```

3.  Describe the VolumeAttachments. For example:

    ```
    kubectl describe volumeattachments csi-yyyyy | grep "Deletion Timestamp"
    ```

    Make a note of the deletion timestamp in the output. For example:

    ```
    Deletion Timestamp:   2021-03-10T22:14:58Z
    ```

4.  Wait until the time specified by the deletion timestamp, and then force delete the VolumeAttachment. To do this, edit the VolumeAttachment object and delete the finalizer. For example:

    ```
    kubectl edit volumeattachment csi-yyyyy
    ```

    In the editor, remove the finalizer:

    ```
    Finalizers:
      external-attacher/csi-vsphere-vmware-com
    ```

    For a non-interactive alternative, see the sketch after this list.
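If you prefer not to use `kubectl edit`, a merge patch can clear the finalizers; `csi-yyyyy` is the same placeholder name used above:

```
# Remove all finalizers so the pending deletion can complete.
kubectl patch volumeattachment csi-yyyyy --type=merge -p '{"metadata":{"finalizers":null}}'
```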