Mount Cloud Storage buckets as CSI ephemeral volumes


This guide shows you how to use CSI ephemeral volumes backed by your Cloud Storage buckets to automatically manage storage resources for your Kubernetes Pods or Jobs on Google Kubernetes Engine (GKE). CSI ephemeral volumes are tied to the Pod or Job lifecycle, and you don't need to manually handle PersistentVolume and PersistentVolumeClaim objects.

This guide is for platform administrators and operators who want to simplify storage management for their GKE applications.

Before reading this page, ensure you're familiar with CSI ephemeral volumes, Kubernetes Pods and Jobs, and Cloud Storage buckets.

If you are already familiar with PersistentVolumes and want consistency with your existing deployments that rely on this resource type, see Mount Cloud Storage buckets as persistent volumes.

Before you begin

Make sure you have completed these prerequisites:

How CSI ephemeral storage for Cloud Storage buckets works

CSI ephemeral volumes simplify storage management for your applications on GKE. You define CSI ephemeral volumes directly within your Pod or Job specification. Using CSI ephemeral volumes eliminates the need for separate PersistentVolume and PersistentVolumeClaim objects.

Using a CSI ephemeral volume involves these operations:

  1. Storage definition: You specify the storage in your Pod or Job's YAML file, including the CSI driver to use and any required parameters. For the Cloud Storage FUSE CSI driver, you specify the bucket name and other relevant details.

    Optionally, you can fine-tune the performance of your CSI driver by using the file caching feature. File caching can improve the performance of GKE applications by caching frequently accessed Cloud Storage files on a faster disk.

    Additionally, you can use the parallel download feature, which uses multi-threaded downloads to accelerate reads of large files from Cloud Storage. You can use this feature to improve model load times, especially for reads larger than 1 GB.

  2. Driver invocation: When you create the Pod or Job, GKE detects the ephemeral volume request and calls the Cloud Storage FUSE CSI driver.

  3. Volume mount and attachment: The CSI driver mounts the CSI ephemeral volume (which points to the underlying Cloud Storage bucket) and makes it available to the Pod or Job, so that your application can access it. To fine-tune how buckets are mounted in the file system, you can use mount options. You can also use volume attributes to configure specific behavior of the Cloud Storage FUSE CSI driver.

  4. Lifecycle management: The ephemeral volume exists for the lifetime of the Pod or Job. When the Pod is deleted or the Job completes, the CSI driver automatically handles cleanup and unmounts the volume.

Attach the CSI ephemeral volume

Follow these instructions, depending on whether you want to attach the CSI ephemeral volume to a Pod or Job.

Pod

To attach the CSI ephemeral volume in a Pod, follow these steps:

  1. Create a Pod YAML manifest with the following specification:

      apiVersion: v1
      kind: Pod
      metadata:
        name: gcs-fuse-csi-example-ephemeral
        namespace: NAMESPACE
        annotations:
          gke-gcsfuse/volumes: "true"
      spec:
        terminationGracePeriodSeconds: 60
        containers:
        - image: busybox
          name: busybox
          command: ["sleep"]
          args: ["infinity"]
          volumeMounts:
          - name: gcs-fuse-csi-ephemeral
            mountPath: /data
            readOnly: true
        serviceAccountName: KSA_NAME
        volumes:
        - name: gcs-fuse-csi-ephemeral
          csi:
            driver: gcsfuse.csi.storage.gke.io
            readOnly: true
            volumeAttributes:
              bucketName: BUCKET_NAME
              mountOptions: "implicit-dirs"

    Replace the following values:

    • NAMESPACE : the Kubernetes namespace where you want to deploy your Pod.
    • KSA_NAME : the name of the Kubernetes ServiceAccount you specified when configuring access to the Cloud Storage buckets.
    • BUCKET_NAME : the Cloud Storage bucket name you specified when configuring access to the Cloud Storage buckets. You can specify an underscore ( _ ) to mount all buckets that the Kubernetes ServiceAccount can access. To learn more, see Dynamic mounting in the Cloud Storage FUSE documentation.

    The example manifest shows these required settings:

    • metadata.annotations : the annotation gke-gcsfuse/volumes: "true" is required. See Configure the sidecar container for optional annotations.
    • spec.volumes[n].csi.driver : use gcsfuse.csi.storage.gke.io as the CSI driver name.

    Optionally, you can adjust these variables:

    • spec.terminationGracePeriodSeconds : By default, this is set to 30. If you need to write large files to the Cloud Storage bucket, increase this value to make sure that Cloud Storage FUSE has enough time to flush the data after your application exits. To learn more, see Kubernetes best practices: terminating with grace.
    • spec.volumes[n].csi.volumeAttributes.mountOptions : Pass mount options to Cloud Storage FUSE. Specify the flags in one string separated by commas, without spaces.
    • spec.volumes[n].csi.volumeAttributes : Pass additional volume attributes to Cloud Storage FUSE.
    • spec.volumes[n].csi.readOnly : Specify true if all the volume mounts are read-only.
    • spec.containers[n].volumeMounts[m].readOnly : Specify true if only a specific volume mount is read-only.
  2. Run the following command to apply the manifest to your cluster:

      kubectl apply -f FILE_PATH

    Replace FILE_PATH with the path to your YAML file.
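
    The placeholder replacement described above can also be scripted. The following is a minimal sketch, not part of kubectl or GKE: the render_manifest helper and the sample values default, my-ksa, and my-bucket are hypothetical, and the helper assumes the values contain no "/" or "&" characters, which sed would treat specially.

```shell
#!/bin/sh
# Hypothetical helper: reads a manifest on stdin and replaces the
# NAMESPACE, KSA_NAME, and BUCKET_NAME placeholders with the arguments.
# Assumes the values contain no "/" or "&" characters.
render_manifest() {
  sed -e "s/BUCKET_NAME/$3/g" -e "s/KSA_NAME/$2/g" -e "s/NAMESPACE/$1/g"
}

# Render a fragment of the Pod manifest with sample (assumed) values.
render_manifest default my-ksa my-bucket <<'EOF'
metadata:
  name: gcs-fuse-csi-example-ephemeral
  namespace: NAMESPACE
spec:
  serviceAccountName: KSA_NAME
  volumes:
  - name: gcs-fuse-csi-ephemeral
    csi:
      driver: gcsfuse.csi.storage.gke.io
      volumeAttributes:
        bucketName: BUCKET_NAME
EOF
```

    The helper prints the rendered manifest to standard output. Because kubectl apply -f - reads a manifest from standard input, you could pipe the rendered output to that command instead of saving it to a file first.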

Pod (file caching)

To attach the CSI ephemeral volume with file caching in a Pod, follow these steps:

  1. Follow the steps in Create a cluster or node pool with Local SSD-backed ephemeral storage to create a cluster or node pool that uses Local SSD-backed ephemeral storage.

  2. Create a Pod YAML manifest with the following specification:

      apiVersion: v1
      kind: Pod
      metadata:
        name: gcs-fuse-csi-file-cache-example
        namespace: NAMESPACE
        annotations:
          gke-gcsfuse/volumes: "true"
          gke-gcsfuse/ephemeral-storage-limit: "50Gi"
      spec:
        nodeSelector:
          cloud.google.com/gke-ephemeral-storage-local-ssd: "true"
        restartPolicy: Never
        initContainers:
        - name: data-loader
          image: gcr.io/google.com/cloudsdktool/google-cloud-cli:slim
          resources:
            limits:
              cpu: 500m
              memory: 1Gi
            requests:
              cpu: 500m
              memory: 1Gi
          command:
            - "/bin/sh"
            - "-c"
            - |
              mkdir -p /test_files
              for i in $(seq 1 1000); do dd if=/dev/zero of=/test_files/file_$i.txt bs=1024 count=64; done
              gcloud storage cp /test_files gs://BUCKET_NAME --recursive
        containers:
        - name: data-validator
          image: busybox
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
            requests:
              cpu: 500m
              memory: 512Mi
          command:
            - "/bin/sh"
            - "-c"
            - |
              echo "first read with cache miss"
              time cat /data/test_files/file_* > /dev/null
              echo "second read from local cache"
              time cat /data/test_files/file_* > /dev/null
          volumeMounts:
          - name: gcs-fuse-csi-ephemeral
            mountPath: /data
        serviceAccountName: KSA_NAME
        volumes:
        - name: gcs-fuse-csi-ephemeral
          csi:
            driver: gcsfuse.csi.storage.gke.io
            volumeAttributes:
              bucketName: BUCKET_NAME
              mountOptions: "implicit-dirs,file-cache:max-size-mb:-1"

    Replace the following values:

    • NAMESPACE : the Kubernetes namespace where you want to deploy your Pod.
    • KSA_NAME : the name of the Kubernetes ServiceAccount you specified when configuring access to the Cloud Storage buckets.
    • BUCKET_NAME : the Cloud Storage bucket name you specified when configuring access to the Cloud Storage buckets. You can specify an underscore ( _ ) to mount all buckets that the Kubernetes ServiceAccount can access. To learn more, see Dynamic mounting in the Cloud Storage FUSE documentation.

    In the example manifest, the init container data-loader generates 1,000 files of 64 KiB each and uploads them to a Cloud Storage bucket. The main container data-validator reads all the files from the bucket twice and logs the duration of each read.

  3. Run the following command to apply the manifest to your cluster:

      kubectl apply -f FILE_PATH

    Replace FILE_PATH with the path to your YAML file.

  4. To view the log output, run the following command:

      kubectl logs -n NAMESPACE gcs-fuse-csi-file-cache-example -c data-validator

    Replace NAMESPACE with the namespace of your workload.

    The output should look similar to the following:

      first read with cache miss
      real    0m 54.68s
      ...
      second read from local cache
      real    0m 0.38s
      ...

    The output shows that the second read with local cache is much faster than the first read with a cache miss.
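
    To put a number on the difference, you can divide the first duration by the second. The following is a small sketch that uses the two sample durations from the output above; the speedup helper is a hypothetical name, and your own timings will differ.

```shell
#!/bin/sh
# Hypothetical helper: prints how many times faster the second read was,
# given the first and second read durations in seconds.
speedup() {
  awk -v first="$1" -v second="$2" 'BEGIN { printf "%.1f\n", first / second }'
}

# Durations taken from the sample log output: 54.68s and 0.38s.
speedup 54.68 0.38   # prints 143.9
```

    With the sample values, the cached read is roughly 144 times faster than the read that misses the cache.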

Pod (parallel download)

To attach the CSI ephemeral volume with parallel download in a Pod, follow these steps:

  1. Create a Pod YAML manifest with the following specification:

      apiVersion: v1
      kind: Pod
      metadata:
        name: gcs-fuse-csi-example-ephemeral
        namespace: NAMESPACE
        annotations:
          gke-gcsfuse/volumes: "true"
          gke-gcsfuse/ephemeral-storage-limit: "50Gi"
      spec:
        containers:
        ...
        volumes:
        - name: gcs-fuse-csi-ephemeral
          csi:
            driver: gcsfuse.csi.storage.gke.io
            volumeAttributes:
              bucketName: BUCKET_NAME
              mountOptions: "implicit-dirs,file-cache:enable-parallel-downloads:true,file-cache:max-size-mb:-1"
              fileCacheCapacity: "-1"

    Replace the following values:

    • NAMESPACE : the Kubernetes namespace where you want to deploy your Pod.
    • BUCKET_NAME : the Cloud Storage bucket name you specified when configuring access to the Cloud Storage buckets. You can specify an underscore ( _ ) to mount all buckets that the Kubernetes ServiceAccount can access. To learn more, see Dynamic mounting in the Cloud Storage FUSE documentation.
  2. Run the following command to apply the manifest to your cluster:

      kubectl apply -f FILE_PATH

    Replace FILE_PATH with the path to your YAML file.

Job

To attach the CSI ephemeral volume in a Job, follow these steps:

  1. Create a Job YAML manifest with the following specification:

      apiVersion: batch/v1
      kind: Job
      metadata:
        name: gcs-fuse-csi-job-example
        namespace: NAMESPACE
      spec:
        template:
          metadata:
            annotations:
              gke-gcsfuse/volumes: "true"
          spec:
            serviceAccountName: KSA_NAME
            containers:
            - name: writer
              image: busybox
              command:
                - "/bin/sh"
                - "-c"
                - touch /data/test && echo $(date) >> /data/test && sleep 10
              volumeMounts:
              - name: gcs-fuse-csi-ephemeral
                mountPath: /data
            - name: reader
              image: busybox
              command:
                - "/bin/sh"
                - "-c"
                - sleep 10 && cat /data/test
              volumeMounts:
              - name: gcs-fuse-csi-ephemeral
                mountPath: /data
                readOnly: true
            volumes:
            - name: gcs-fuse-csi-ephemeral
              csi:
                driver: gcsfuse.csi.storage.gke.io
                volumeAttributes:
                  bucketName: BUCKET_NAME
            restartPolicy: Never
        backoffLimit: 1

    Replace the following values:

    • NAMESPACE : the Kubernetes namespace where you want to deploy your Job.
    • KSA_NAME : the name of the Kubernetes ServiceAccount you specified when configuring access to the Cloud Storage buckets.
    • BUCKET_NAME : the Cloud Storage bucket name you specified when configuring access to the Cloud Storage buckets. You can specify an underscore ( _ ) to mount all buckets that the Kubernetes ServiceAccount can access. To learn more, see Dynamic mounting in the Cloud Storage FUSE documentation.

    The example manifest shows these required settings:

    • metadata.annotations : the annotation gke-gcsfuse/volumes: "true" is required. In a Job, set it on the Pod template (under spec.template), as the example manifest does. See Configure the sidecar container for optional annotations.
    • spec.volumes[n].csi.driver : use gcsfuse.csi.storage.gke.io as the CSI driver name.

    Optionally, you can adjust these variables:

    • spec.volumes[n].csi.volumeAttributes.mountOptions : Pass mount options to Cloud Storage FUSE. Specify the flags in one string separated by commas, without spaces.
    • spec.volumes[n].csi.volumeAttributes : Pass additional volume attributes to Cloud Storage FUSE.
    • spec.volumes[n].csi.readOnly : Specify true if all the volume mounts are read-only.
    • spec.containers[n].volumeMounts[m].readOnly : Specify true if only a specific volume mount is read-only.
  2. Run the following command to apply the manifest to your cluster:

      kubectl apply -f FILE_PATH

    Replace FILE_PATH with the path to your YAML file.
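
    The mountOptions attribute expects all flags in one comma-separated string without spaces. As a minimal sketch, a small shell function can assemble that string from individual flags. The join_mount_options helper is a hypothetical name, not part of the CSI driver or kubectl; the flags in the example are the ones used in the manifests on this page.

```shell
#!/bin/sh
# Hypothetical helper: joins its arguments with commas and no spaces,
# the format expected by the mountOptions volume attribute.
# The function body runs in a subshell so the IFS change stays local.
join_mount_options() (
  IFS=','
  printf '%s\n' "$*"
)

# Build the mountOptions value from individual flags.
join_mount_options implicit-dirs file-cache:enable-parallel-downloads:true file-cache:max-size-mb:-1
```

    The function prints a single line that you can paste, in quotes, as the mountOptions value in your manifest.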

Troubleshoot issues

For more information about troubleshooting the Cloud Storage FUSE CSI driver, see the troubleshooting guide in the GitHub project documentation.

What's next
