Data backup and recovery for Parallelstore on Google Kubernetes Engine

Parallelstore is available by invitation only. If you'd like to request access to Parallelstore in your Google Cloud project, contact your sales representative.

This guide describes how to back up the data in your Google Kubernetes Engine (GKE)-connected Parallelstore instance to a Cloud Storage bucket, and how to prevent potential data loss by configuring a GKE CronJob that backs up the data automatically on a schedule. It also describes how to recover data for a Parallelstore instance.

Before you begin

Follow Create and connect to a Parallelstore instance from GKE to set up your GKE cluster and Parallelstore instance.

Data backup

The following section describes how to set up a GKE CronJob that periodically backs up data from a Parallelstore instance in the GKE cluster to prevent data loss.

Connect to your GKE cluster

Get the credentials for your GKE cluster:

   
gcloud container clusters get-credentials CLUSTER_NAME \
    --project=PROJECT_ID \
    --location=CLUSTER_LOCATION

Replace the following:

  • CLUSTER_NAME : the GKE cluster name.
  • PROJECT_ID : the Google Cloud project ID.
  • CLUSTER_LOCATION : the Compute Engine zone containing the cluster. Your cluster must be in a supported zone for the Parallelstore CSI driver.
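As an optional sanity check after fetching credentials, you can confirm that kubectl now targets the intended cluster. This sketch assumes the standard GKE context naming scheme `gke_<project>_<location>_<cluster>`; the helper function is illustrative, not part of GKE.

```shell
# Extract the cluster name from a GKE kubectl context string of the form
# gke_<project>_<location>_<cluster>.
cluster_from_context() {
  echo "$1" | cut -d_ -f4-
}

# Live check (requires kubectl):
# cluster_from_context "$(kubectl config current-context)"
cluster_from_context "gke_my-project_us-central1-a_my-cluster"   # prints my-cluster
```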

Provision required permissions

Your GKE CronJob needs the roles/parallelstore.admin and roles/storage.admin roles to import and export data between Cloud Storage and Parallelstore.

Create a Google Cloud service account

   
gcloud iam service-accounts create parallelstore-sa \
    --project=PROJECT_ID

Grant the Google Cloud service account roles

Grant the Parallelstore Admin and Cloud Storage Admin roles to the service account.

   
# Grant the Parallelstore Admin role.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=serviceAccount:parallelstore-sa@PROJECT_ID.iam.gserviceaccount.com \
    --role=roles/parallelstore.admin

# Grant the Cloud Storage Admin role.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=serviceAccount:parallelstore-sa@PROJECT_ID.iam.gserviceaccount.com \
    --role=roles/storage.admin

Set up a GKE service account

You need to set up a GKE service account and allow it to impersonate the Google Cloud service account. Use the following steps to bind the GKE service account to the Google Cloud service account.

  1. Create the following parallelstore-sa.yaml service account manifest:

       
     # GKE service account used by the workload; it will have access to Parallelstore and Cloud Storage.
     apiVersion: v1
     kind: ServiceAccount
     metadata:
       name: parallelstore-sa
       namespace: default

    Next, deploy it to your GKE cluster using this command:

       
     kubectl apply -f parallelstore-sa.yaml
  2. Allow the GKE service account to impersonate the Google Cloud service account.

       
     # Bind the Google Cloud service account and the GKE service account.
     gcloud iam service-accounts add-iam-policy-binding parallelstore-sa@PROJECT_ID.iam.gserviceaccount.com \
         --role=roles/iam.workloadIdentityUser \
         --member="serviceAccount:PROJECT_ID.svc.id.goog[default/parallelstore-sa]"

     # Annotate the GKE service account with the Google Cloud service account.
     kubectl annotate serviceaccount parallelstore-sa \
         --namespace default \
         iam.gke.io/gcp-service-account=parallelstore-sa@PROJECT_ID.iam.gserviceaccount.com

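To verify that the Workload Identity annotation took effect, you can inspect the GKE service account. This is a hypothetical verification sketch: the helper only greps for the annotation key in YAML piped to it.

```shell
# Check whether piped YAML contains the Workload Identity annotation key.
has_wi_annotation() {
  grep -q 'iam.gke.io/gcp-service-account'
}

# Live check (requires cluster access):
# kubectl get serviceaccount parallelstore-sa -n default -o yaml | has_wi_annotation \
#   && echo "annotation present"

# Demonstration with inline YAML:
printf 'annotations:\n  iam.gke.io/gcp-service-account: parallelstore-sa@example.iam.gserviceaccount.com\n' \
  | has_wi_annotation && echo "annotation present"
```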
Grant permissions to the Parallelstore Agent service account

   
gcloud storage buckets add-iam-policy-binding GCS_BUCKET \
    --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-parallelstore.iam.gserviceaccount.com \
    --role=roles/storage.admin

Replace the following:

  • GCS_BUCKET : The Cloud Storage bucket URI, in the format gs://<bucket_name> .
  • PROJECT_NUMBER : The Google Cloud project number.

Start the CronJob

Configure and start a GKE CronJob that periodically exports data from Parallelstore to Cloud Storage.

Create the configuration file ps-to-gcs-backup.yaml for the CronJob:

   
apiVersion: batch/v1
kind: CronJob
metadata:
  name: ps-to-gcs-backup
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 1
  schedule: "0 * * * *"
  successfulJobsHistoryLimit: 3
  suspend: false
  jobTemplate:
    spec:
      template:
        metadata:
          annotations:
            gke-parallelstore/cpu-limit: "0"
            gke-parallelstore/ephemeral-storage-limit: "0"
            gke-parallelstore/memory-limit: "0"
            gke-parallelstore/volumes: "true"
        spec:
          serviceAccountName: parallelstore-sa
          containers:
          - name: pstore-backup
            image: google/cloud-sdk:slim
            imagePullPolicy: IfNotPresent
            command:
            - /bin/bash
            - -c
            - |
              #!/bin/bash
              set -ex

              # Retrieve the modification timestamp of the latest folder, up to the minute.
              latest_folder_timestamp=$(find $PSTORE_MOUNT_PATH/$SOURCE_PARALLELSTORE_PATH -type d -printf '%T@ %p\n' | sort -n | tail -1 | cut -d' ' -f2- | xargs -I{} stat -c %x {} | xargs -I{} date -d {} +"%Y-%m-%d %H:%M")

              # Start exporting from Parallelstore to Cloud Storage.
              operation=$(gcloud beta parallelstore instances export-data $PSTORE_NAME \
                --location=$PSTORE_LOCATION \
                --source-parallelstore-path=$SOURCE_PARALLELSTORE_PATH \
                --destination-gcs-bucket-uri=$DESTINATION_GCS_URI \
                --async \
                --format="value(name)")

              # Wait until the operation completes.
              while true; do
                status=$(gcloud beta parallelstore operations describe $operation \
                  --location=$PSTORE_LOCATION \
                  --format="value(done)")
                if [ "$status" == "True" ]; then
                  break
                fi
                sleep 60
              done

              # Check whether the export succeeded.
              error=$(gcloud beta parallelstore operations describe $operation \
                --location=$PSTORE_LOCATION \
                --format="value(error)")
              if [ "$error" != "" ]; then
                echo "!!! ERROR while exporting data !!!"
              fi

              # Delete the old files from Parallelstore if requested.
              # This will not delete the folder with the latest modification timestamp.
              if $DELETE_AFTER_BACKUP && [ "$error" == "" ]; then
                find $PSTORE_MOUNT_PATH/$SOURCE_PARALLELSTORE_PATH -mindepth 1 -type d |
                while read dir; do
                  # Only delete folders modified earlier than the latest modification timestamp.
                  folder_timestamp=$(stat -c %y "$dir")
                  if [ $(date -d "$folder_timestamp" +%s) -lt $(date -d "$latest_folder_timestamp" +%s) ]; then
                    echo "Deleting $dir"
                    rm -rf "$dir"
                  fi
                done
              fi
            env:
            - name: PSTORE_MOUNT_PATH
              # mount path of the Parallelstore instance; should match the volumeMount defined for this container
              value: "PSTORE_MOUNT_PATH"
            - name: PSTORE_NAME
              # name of the Parallelstore instance to back up
              value: "PSTORE_NAME"
            - name: PSTORE_LOCATION
              # location/zone of the Parallelstore instance to back up
              value: "PSTORE_LOCATION"
            - name: SOURCE_PARALLELSTORE_PATH
              # absolute path on the Parallelstore instance, without the volume mount path
              value: "SOURCE_PARALLELSTORE_PATH"
            - name: DESTINATION_GCS_URI
              # Cloud Storage bucket URI used for storing backups, starting with "gs://"
              value: "DESTINATION_GCS_URI"
            - name: DELETE_AFTER_BACKUP
              # will delete old data from Parallelstore if true
              value: "DELETE_AFTER_BACKUP"
            volumeMounts:
            - mountPath: PSTORE_MOUNT_PATH # should match the value of the env var PSTORE_MOUNT_PATH
              name: PSTORE_PV_NAME
          dnsPolicy: ClusterFirst
          restartPolicy: OnFailure
          terminationGracePeriodSeconds: 30
          volumes:
          - name: PSTORE_PV_NAME
            persistentVolumeClaim:
              claimName: PSTORE_PVC_NAME

Replace the following variables:

  • PSTORE_MOUNT_PATH : The mount path of the Parallelstore instance; it must match the volumeMount defined for this container.
  • PSTORE_PV_NAME : The name of the GKE PersistentVolume that points to your Parallelstore instance. This should have been set up in your GKE cluster as part of the prerequisites.
  • PSTORE_PVC_NAME : The name of the GKE PersistentVolumeClaim that requests the Parallelstore PersistentVolume . This should have been set up in your GKE cluster as part of the prerequisites.
  • PSTORE_NAME : The name of the Parallelstore instance to back up.
  • PSTORE_LOCATION : The location of the Parallelstore instance to back up.
  • SOURCE_PARALLELSTORE_PATH : The absolute path on the Parallelstore instance, without the volume mount path; it must start with / .
  • DESTINATION_GCS_URI : The URI of a Cloud Storage bucket, or a path within a bucket, in the format gs://<bucket_name>/<optional_path_inside_bucket> .
  • DELETE_AFTER_BACKUP : Whether to delete old data from Parallelstore after the backup to free up space; supported values are true and false .

Deploy the CronJob to your GKE cluster using the following command:

   
kubectl  
apply  
-f  
ps-to-gcs-backup.yaml 

See CronJob for more details about setting up a CronJob.
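Before relying on the schedule, it can be useful to trigger a one-off run of the CronJob to verify permissions and the export path. Job names must be unique within a namespace, so this sketch appends a timestamp; the helper function and job name are illustrative, not part of GKE.

```shell
# Derive a unique Job name from the CronJob name and a timestamp.
job_name() {
  echo "$1-manual-$(date +%Y%m%d%H%M%S)"
}

name=$(job_name ps-to-gcs-backup)
echo "$name"
# Trigger and watch the run (requires cluster access):
# kubectl create job --from=cronjob/ps-to-gcs-backup "$name"
# kubectl logs -f "job/$name"
```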

Detecting data loss

When the state of a Parallelstore instance is FAILED , the data on the instance may no longer be accessible. You can use the following Google Cloud CLI command to check the state of the Parallelstore instance:

   
gcloud beta parallelstore instances describe PARALLELSTORE_NAME \
    --location=PARALLELSTORE_LOCATION \
    --format="value(state)"

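If you script this health check, you can treat the FAILED state as fatal. A minimal sketch, in which the function name and messages are illustrative:

```shell
# Interpret a Parallelstore instance state string; fail on FAILED.
check_state() {
  if [ "$1" = "FAILED" ]; then
    echo "Parallelstore instance is FAILED; data may be lost" >&2
    return 1
  fi
  echo "Instance state: $1"
}

# Live check (requires gcloud access):
# check_state "$(gcloud beta parallelstore instances describe PARALLELSTORE_NAME \
#   --location=PARALLELSTORE_LOCATION --format='value(state)')"
check_state "ACTIVE"   # prints "Instance state: ACTIVE"
```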
Data recovery

When a disaster happens or the Parallelstore instance fails for any reason, you can either use the GKE Volume Populator to automatically preload data from Cloud Storage into a GKE-managed Parallelstore instance, or manually create a new Parallelstore instance and import data from a Cloud Storage backup.

If you are recovering from a checkpoint of your workload, you need to decide which checkpoint to recover from by providing the path inside the Cloud Storage bucket.

The Parallelstore export in Cloud Storage might have partial data if the Parallelstore instance failed in the middle of the export operation. Check the data for completeness in the target Cloud Storage location before importing it to Parallelstore and resuming your workload.
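One rough completeness signal is comparing file counts between the source tree and the backup prefix. This helper and the demo directory are illustrative assumptions; adapt the listing command to your bucket layout.

```shell
# Count regular files under a directory tree.
count_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

demo=$(mktemp -d)
touch "$demo/a" "$demo/b"
count_files "$demo"   # prints 2

# Against the backup (requires gcloud access), something like:
# gcloud storage ls -r "gs://<bucket>/<path>/**" | grep -c .
```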

GKE Volume Populator

GKE Volume Populator can be used to preload data from a Cloud Storage bucket path into a newly created Parallelstore instance. For instructions, see Preload Parallelstore.

Manual recovery

You can also create a Parallelstore instance manually and import data from a Cloud Storage bucket with the following steps.

  1. Create a new Parallelstore instance:

       
     gcloud beta parallelstore instances create PARALLELSTORE_NAME \
         --capacity-gib=CAPACITY_GIB \
         --location=PARALLELSTORE_LOCATION \
         --network=NETWORK_NAME \
         --project=PROJECT_ID

  2. Import data from Cloud Storage:

       
     gcloud beta parallelstore instances import-data PARALLELSTORE_NAME \
         --location=PARALLELSTORE_LOCATION \
         --source-gcs-bucket-uri=SOURCE_GCS_URI \
         --destination-parallelstore-path=DESTINATION_PARALLELSTORE_PATH \
         --async

Replace the following:

  • PARALLELSTORE_NAME : The name of the new Parallelstore instance.
  • CAPACITY_GIB : The storage capacity of the Parallelstore instance in GiB; a value from 12000 to 100000 , in multiples of 4000 .
  • PARALLELSTORE_LOCATION : The location of the new Parallelstore instance; it must be in a supported zone .
  • NETWORK_NAME : The name of the VPC network that you created during Configure a VPC network ; it must be the same network your GKE cluster uses and have private services access enabled.
  • SOURCE_GCS_URI : The URI of the Cloud Storage bucket, or a path within a bucket, that contains the data you want to import, in the format gs://<bucket_name>/<optional_path_inside_bucket> .
  • DESTINATION_PARALLELSTORE_PATH : The absolute path on the Parallelstore instance to which you want to import the data; it must start with / .

For more details about importing data into a Parallelstore instance, see Transfer data to or from Cloud Storage.
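Because the import runs with --async, you can poll the returned long-running operation until it reports done, mirroring the wait loop in the backup CronJob. This sketch assumes you captured the operation name (for example with --format="value(name)"); the helper only interprets the `done` field.

```shell
# Succeed when the operation's done field equals "True".
op_done() {
  [ "$1" = "True" ]
}

# Polling loop (requires gcloud access), with OPERATION set to the operation name:
# while ! op_done "$(gcloud beta parallelstore operations describe "$OPERATION" \
#     --location=PARALLELSTORE_LOCATION --format='value(done)')"; do
#   sleep 60
# done
op_done "True" && echo "operation complete"
```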
