This page explains how to enable permissive mode on a backup plan.
During backup execution, if Backup for GKE detects conditions that are likely to cause restore to fail, the backup itself fails. The reason for the failure is provided in the backup's state_reason field. In the Google Cloud console, this field is labeled Status reason.
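You can also read this field from the command line with the backups describe command. The following is a minimal sketch; the stateReason output field name is assumed from the Backup for GKE API, and BACKUP, BACKUP_PLAN, PROJECT_ID, and LOCATION are placeholders for your own values:

    # Print the state and failure reason of a backup (field names assumed
    # from the Backup for GKE API; adjust if your gcloud version differs)
    gcloud beta container backup-restore backups describe BACKUP \
        --backup-plan=BACKUP_PLAN \
        --project=PROJECT_ID \
        --location=LOCATION \
        --format="value(state,stateReason)"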
About permissive mode
When backup failures aren't acceptable and it's not possible to address the underlying issues, you can enable permissive mode. Permissive mode ensures that backups complete successfully, even if GKE resources that could potentially cause restore failures are detected during the backup process. Details about the issues are provided in the backup's Status reason field.
We recommend using this option only if you understand the issues and can implement workarounds during the restoration process. For a list of potential error messages in the backup's Status reason field with recommended actions, see Troubleshoot backup failures.
Enable permissive mode
Use the following instructions to enable permissive mode:
gcloud
To enable permissive mode, run the gcloud beta container backup-restore backup-plans update command:

    gcloud beta container backup-restore backup-plans update BACKUP_PLAN \
        --project=PROJECT_ID \
        --location=LOCATION \
        --permissive-mode
Replace the following:

- BACKUP_PLAN: the name of the backup plan that you want to update.
- PROJECT_ID: the ID of your Google Cloud project.
- LOCATION: the compute region for the resource, for example us-central1. See About resource locations.

For a full list of options, refer to the gcloud beta container backup-restore backup-plans update documentation.
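To confirm that the update took effect, you can describe the plan and check its backup configuration. A sketch; the backupConfig.permissiveMode field name is assumed from the Backup for GKE API:

    # Verify that permissive mode is now enabled on the plan
    gcloud beta container backup-restore backup-plans describe BACKUP_PLAN \
        --project=PROJECT_ID \
        --location=LOCATION \
        --format="value(backupConfig.permissiveMode)"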
Console
Use the following instructions to enable permissive mode in the Google Cloud console:
- In the Google Cloud console, go to the Google Kubernetes Engine page.
- In the navigation menu, click Backup for GKE.
- Click the Backup plans tab.
- Expand the cluster and click the plan name.
- Click the Details tab to edit the plan details.
- Click Edit to edit the section with Backup mode.
- Click the Permissive mode checkbox and click Save changes.
Terraform
Update the existing google_gke_backup_backup_plan resource:

    resource "google_gke_backup_backup_plan" "NAME" {
      ...
      backup_config {
        permissive_mode = true
        ...
      }
    }
Replace the following:

- NAME: the name of the google_gke_backup_backup_plan resource that you want to update.

For more information, see gke_backup_backup_plan.
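For context, a complete resource might look like the following minimal sketch. The resource name, cluster reference, location, and the other backup_config fields shown here are illustrative assumptions, not required values:

    # Illustrative backup plan with permissive mode enabled. All names and
    # values other than permissive_mode are placeholders.
    resource "google_gke_backup_backup_plan" "example" {
      name     = "my-backup-plan"
      cluster  = "projects/PROJECT_ID/locations/LOCATION/clusters/CLUSTER"
      location = "us-central1"

      backup_config {
        include_volume_data = true
        include_secrets     = true
        all_namespaces      = true
        permissive_mode     = true
      }
    }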
Troubleshoot backup failures
The following table provides explanations and recommended actions for the backup failure messages displayed in the backup's Status reason field.
CustomResourceDefinitions "..." have invalid schemas

Reason: The CRD was created with apiextensions.k8s.io/v1beta1 and lacks a structural schema required in apiextensions.k8s.io/v1. Backup for GKE cannot automatically define the structural schema. Restoring the CRD in Kubernetes v1.22+ clusters, where apiextensions.k8s.io/v1beta1 is not available, causes the restore to fail. This failure happens when restoring custom resources defined by the CRD.

Recommended actions:

- If you manage the CRD, follow the steps in the Kubernetes documentation to specify a structural schema for your CRD.
- If it's a GKE-managed CRD, you can run kubectl delete crd if there are no existing resources served by the CRD. If there are existing resources served by the CRD, you can enable permissive mode with an understanding of the restore behavior. For recommendations on common CRDs, see the documentation.
- If it's a third-party CRD, consult the relevant documentation to migrate to apiextensions.k8s.io/v1.
When permissive mode is enabled, the CRD without a structural schema won't be backed up in a Kubernetes v1.22+ cluster. To successfully restore such a backup, you need to exclude the resources served by the CRD from restore or create the CRD in the target cluster before starting the restore.
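To find affected CRDs before running a backup, you can look for the NonStructuralSchema condition that the Kubernetes API server sets on such CRDs. A sketch, assuming jq is installed:

    # List CRDs flagged by the API server as lacking a structural schema
    kubectl get crds -o json | jq -r \
      '.items[]
       | select(.status.conditions[]?
                | select(.type == "NonStructuralSchema" and .status == "True"))
       | .metadata.name'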
PersistentVolumeClaims "..." are bound to PersistentVolumes of unsupported types "..." and cannot be backed up

Reason: Backup for GKE only supports backing up Persistent Disk volume data. Non-Persistent Disk PVCs restored using the Provision new volumes and restore volume data from backup policy will not have any volume data restored. However, the Reuse existing volumes containing your data policy allows PVCs to be reconnected to the original volume handle. This is useful for volume types that are backed by an external server, like NFS.
When permissive mode is enabled, the PVC configuration is backed up, but the volume data is not.
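To see which volumes in the cluster are not backed by Persistent Disk, you can list each PersistentVolume's CSI driver. A sketch; the column names are arbitrary:

    # Persistent Disk volumes use the pd.csi.storage.gke.io CSI driver;
    # other drivers, or an empty value for in-tree volume types, indicate
    # volumes whose data Backup for GKE can't back up
    kubectl get pv -o custom-columns='NAME:.metadata.name,CSI_DRIVER:.spec.csi.driver,CLAIM:.spec.claimRef.name'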
PersistentVolumeClaims "..." are not bound to PersistentVolumes and cannot be backed up
Reason: Backup for GKE can back up the PVC, but there is no volume data to back up. This situation might indicate a misconfiguration or a mismatch between requested and available storage.
When permissive mode is enabled, the PVC configuration is backed up, but there is no volume data to be backed up.
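You can list unbound claims across all namespaces to investigate. A sketch, assuming jq is installed:

    # Show PVCs whose phase is not Bound, with their current phase
    kubectl get pvc --all-namespaces -o json | jq -r \
      '.items[]
       | select(.status.phase != "Bound")
       | "\(.metadata.namespace)/\(.metadata.name): \(.status.phase)"'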
Failed to query API resources ...
Reason: Backup for GKE is unable to back up any resources served by the unavailable API.
Check the aggregated API server referenced in the APIService's spec.service field to make sure it is ready.

When permissive mode is enabled, resources from the API groups that failed to load won't be backed up.
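To identify the unavailable API, you can inspect the cluster's APIService objects, which report availability for aggregated APIs. The APIService name below is only a hypothetical example:

    # List APIServices and their availability; False entries point to the
    # API group that failed to load
    kubectl get apiservices
    # Inspect the backing service of an unavailable aggregated API
    # (v1beta1.metrics.k8s.io is only an example name)
    kubectl get apiservice v1beta1.metrics.k8s.io -o jsonpath='{.spec.service}{"\n"}'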
Secret ... is an auto-generated token from ServiceAccount ... referenced in Pod specs
Reason: If Backup for GKE attempts to restore a service account along with its auto-generated secret and a Pod that mounts the secret volume, the restore appears to be successful. However, Kubernetes removes the secret, which causes the Pod to get stuck in container creation and fail to start.
Remove the volume that mounts the secret from the Pod spec and instead reference the service account through the spec.serviceAccountName field in the Pod. This action ensures that the token is automatically mounted on /var/run/secrets/kubernetes.io/serviceaccount in the containers. For more information, refer to the Configure Service Accounts for Pods documentation.

When permissive mode is enabled, the secret is backed up but can't be mounted in Pods in Kubernetes v1.24+ clusters.
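As an illustration of the recommended fix, the following hypothetical Pod relies on spec.serviceAccountName instead of mounting the auto-generated token secret explicitly; all names are placeholders:

    # Hypothetical Pod: the token is mounted automatically because the
    # service account is referenced by name, so no secret volume is needed
    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Pod
    metadata:
      name: sa-token-example
    spec:
      serviceAccountName: my-service-account
      containers:
      - name: app
        image: nginx
    EOF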
Common Custom Resource Definitions (CRDs) with issues and recommended actions
Here are some common CRDs that have backup issues and the actions we recommend to address the issues:
- capacityrequests.internal.autoscaling.k8s.io: This CRD was used temporarily in v1.21 clusters. Run kubectl delete crd capacityrequests.internal.autoscaling.k8s.io to remove the CRD.
- scalingpolicies.scalingpolicy.kope.io: This CRD was used to control fluentd resources, but GKE has migrated to fluentbit. Run kubectl delete crd scalingpolicies.scalingpolicy.kope.io to remove the CRD.
- memberships.hub.gke.io: Run kubectl delete crd memberships.hub.gke.io to remove the CRD if there are no membership resources. Enable permissive mode if there are membership resources. A quick check is shown after this list.
- applications.app.k8s.io: Enable permissive mode with an understanding of the restore behavior.
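For the memberships.hub.gke.io case, you can check for existing Membership resources before deleting the CRD. A minimal sketch:

    # If this returns "No resources found", there are no Membership resources
    kubectl get memberships.hub.gke.io

    # Safe only when no Membership resources exist; otherwise enable
    # permissive mode instead
    kubectl delete crd memberships.hub.gke.io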