Mount a Network File System share

You can configure your custom training jobs to mount Network File System (NFS) shares to the container where your code is running. This lets your jobs access remote files as if they were local, enabling high throughput and low latency.

This guide shows how to mount a Network File System share when running a custom training job.

Before you begin

  1. Create an NFS share in a Virtual Private Cloud (VPC). Your share must be accessible without authentication.

    You can use a Filestore instance as your NFS share. If you are using Filestore and plan to use VPC peering for Vertex AI in the next step, select private service access as the connect mode when you create an instance. For an example, see Create instances in the Filestore documentation.

  2. To connect Vertex AI with the VPC that hosts your NFS share, follow the instructions in Use Private Service Connect interface for Vertex AI (recommended) or Set up VPC Network Peering.

Network File System information for custom training

When you create a custom training job that mounts an NFS share, you must specify the following:

  • The name of the network for Vertex AI to access. The way that you specify the network name differs depending on the type of custom training job. For details, see Perform custom training.

  • Your NFS configuration in the WorkerPoolSpec field. Include the following fields:

    Field                 Description
    nfsMounts.server      The IP address of your NFS server. This must be a private address in your VPC.
    nfsMounts.path        The NFS share path. This must be an absolute path that begins with /.
    nfsMounts.mountPoint  The local mount point. This must be a valid UNIX directory name. The share is mounted under /mnt/nfs/ on the training VM instance. For example, if the local mount point is sourceData, your code accesses the share at /mnt/nfs/sourceData.

    For more information, see Where to specify compute resources .
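The field constraints above can be checked locally before you submit a job. The following is a minimal sketch, not part of Vertex AI: the function name `validate_nfs_mount` is hypothetical, and "valid UNIX directory name" is interpreted here as a single non-empty path component, which is an assumption. The private-address test is only a coarse check and cannot confirm that the address belongs to your VPC.

```python
import ipaddress
from pathlib import PurePosixPath

# Mount root taken from the /mnt/nfs/sourceData example in the field table.
NFS_MOUNT_ROOT = "/mnt/nfs"


def validate_nfs_mount(server: str, path: str, mount_point: str) -> str:
    """Check one nfsMounts entry and return the path the training code would read.

    Hypothetical helper: it mirrors the documented field rules only.
    """
    # server must be an IP address (not a hostname or URL) ...
    try:
        address = ipaddress.ip_address(server)
    except ValueError as exc:
        raise ValueError(f"server must be an IP address, got {server!r}") from exc
    # ... and a private one. is_private is a coarse test (it also accepts
    # loopback and link-local ranges), so treat it as a sanity check only.
    if not address.is_private:
        raise ValueError(f"server must be a private address, got {server!r}")

    # path must be absolute.
    if not path.startswith("/"):
        raise ValueError(f"path must begin with '/', got {path!r}")

    # Assumption: a valid directory name is one path component, so no '/',
    # no NUL byte, and not '.' or '..'.
    if (
        not mount_point
        or "/" in mount_point
        or "\0" in mount_point
        or mount_point in (".", "..")
    ):
        raise ValueError(f"mountPoint must be a directory name, got {mount_point!r}")

    return str(PurePosixPath(NFS_MOUNT_ROOT) / mount_point)


print(validate_nfs_mount("10.0.0.2", "/vol1", "sourceData"))
# prints /mnt/nfs/sourceData
```

The address `10.0.0.2` and share path `/vol1` are placeholder values chosen for illustration.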

Example: create a custom job using the gcloud CLI

  1. Follow the steps in Create a Python training application for a prebuilt container to build a training application to run on Vertex AI.

  2. Create a file named config.yaml that describes the network connection (Private Service Connect interface, or VPC Network Peering with private services access, PSA) and the NFS mount settings for your training job. Use one of the following formats:

Private Service Connect interface

Preview: Private Service Connect interface

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

  1. To use Private Service Connect interface:

        pscInterfaceConfig:
          network_attachment: NETWORK_ATTACHMENT_NAME
        workerPoolSpecs:
          - machineSpec:
              machineType: MACHINE_TYPE
            replicaCount: 1
            pythonPackageSpec:
              executorImageUri: PYTHON_PACKAGE_EXECUTOR_IMAGE_URI or PRE_BUILT_CONTAINER_IMAGE_URI
              packageUris:
                - PYTHON_PACKAGE_URIS
              pythonModule: PYTHON_MODULE
            nfsMounts:
              - server: NFS_SERVER_IP
                path: NFS_SHARE_NAME
                mountPoint: LOCAL_FOLDER

    Replace the following:

    • NETWORK_ATTACHMENT_NAME: The name of your network attachment.

    • MACHINE_TYPE: The identifier of your virtual machine type.

    • PYTHON_PACKAGE_EXECUTOR_IMAGE_URI or PRE_BUILT_CONTAINER_IMAGE_URI: The URI of a container image in Artifact Registry that will run the provided Python package. Vertex AI provides a wide range of executor images with pre-installed packages to meet users' various use cases.

    • PYTHON_PACKAGE_URIS: A comma-separated list of Cloud Storage URIs that specify the Python package files that make up the training program and its dependent packages. The maximum number of package URIs is 100.

    • PYTHON_MODULE: The Python module name to run after installing the packages.

    • NFS_SERVER_IP: The IP address of your NFS server.

    • NFS_SHARE_NAME: The NFS share path, which is an absolute path that begins with /.

    • LOCAL_FOLDER: The local mount point (UNIX directory name).

    Make sure that your network name is formatted correctly and that your NFS share exists in the specified network.

  2. Create your custom job and pass your config.yaml file to the --config parameter.

        gcloud ai custom-jobs create \
          --region=LOCATION \
          --display-name=JOB_NAME \
          --config=config.yaml

    Replace the following:

    • LOCATION: The region to create the job in.

    • JOB_NAME: A name for the custom job.
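If you submit jobs from a script rather than by hand, the same command can be assembled as an argument list. The following is a minimal sketch: the helper name `build_create_command` is hypothetical, and the region and job name are example values. It only builds and prints the command. Running it (for example, with `subprocess.run`) assumes that the gcloud CLI is installed and authenticated.

```python
import shlex


def build_create_command(
    location: str, job_name: str, config_path: str = "config.yaml"
) -> list[str]:
    """Return the argv for `gcloud ai custom-jobs create` with the three
    flags used in this guide. Hypothetical helper for illustration."""
    return [
        "gcloud", "ai", "custom-jobs", "create",
        f"--region={location}",
        f"--display-name={job_name}",
        f"--config={config_path}",
    ]


# Example values only; substitute your own region and job name.
command = build_create_command("us-central1", "nfs-mount-demo")
print(shlex.join(command))
# prints gcloud ai custom-jobs create --region=us-central1 --display-name=nfs-mount-demo --config=config.yaml
```

Passing the arguments as a list avoids shell quoting problems if a job name contains spaces.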

VPC peering

  1. To use VPC Network Peering (private services access, PSA):

        network: projects/PROJECT_NUMBER/global/networks/NETWORK_NAME
        workerPoolSpecs:
          - machineSpec:
              machineType: MACHINE_TYPE
            replicaCount: 1
            pythonPackageSpec:
              executorImageUri: PYTHON_PACKAGE_EXECUTOR_IMAGE_URI or PRE_BUILT_CONTAINER_IMAGE_URI
              packageUris:
                - PYTHON_PACKAGE_URIS
              pythonModule: PYTHON_MODULE
            nfsMounts:
              - server: NFS_SERVER_IP
                path: NFS_SHARE_NAME
                mountPoint: LOCAL_FOLDER

    Replace the following:

    • PROJECT_NUMBER: The project number of your Google Cloud project.

    • NETWORK_NAME: The name of your private or Shared VPC.

    • MACHINE_TYPE: The identifier of your virtual machine type.

    • PYTHON_PACKAGE_EXECUTOR_IMAGE_URI or PRE_BUILT_CONTAINER_IMAGE_URI: The URI of a container image in Artifact Registry that will run the provided Python package. Vertex AI provides a wide range of executor images with pre-installed packages to meet users' various use cases.

    • PYTHON_PACKAGE_URIS: A comma-separated list of Cloud Storage URIs that specify the Python package files that make up the training program and its dependent packages. The maximum number of package URIs is 100.

    • PYTHON_MODULE: The Python module name to run after installing the packages.

    • NFS_SERVER_IP: The IP address of your NFS server.

    • NFS_SHARE_NAME: The NFS share path, which is an absolute path that begins with /.

    • LOCAL_FOLDER: The local mount point (UNIX directory name).

    Make sure that your network name is formatted correctly and that your NFS share exists in the specified network.

  2. Create your custom job and pass your config.yaml file to the --config parameter.

        gcloud ai custom-jobs create \
          --region=LOCATION \
          --display-name=JOB_NAME \
          --config=config.yaml

    Replace the following:

    • LOCATION: The region to create the job in.

    • JOB_NAME: A name for the custom job.
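Once the job is running, your training code reads the share like any local directory. The following is a minimal sketch of how a training module might enumerate its input files. It assumes the /mnt/nfs/ mount root from the nfsMounts.mountPoint example, and the function name `list_mounted_files` is hypothetical.

```python
from pathlib import Path


def list_mounted_files(mount_point: str, root: str = "/mnt/nfs") -> list[str]:
    """Return the sorted relative paths of all regular files under the share.

    mount_point is the LOCAL_FOLDER value from config.yaml. The root default
    is an assumption based on the /mnt/nfs/sourceData example in this guide.
    """
    base = Path(root) / mount_point
    # Fail early with a clear message if the share was not mounted.
    if not base.is_dir():
        raise FileNotFoundError(f"NFS mount not found at {base}")
    # rglob("*") walks the whole tree; keep regular files only.
    return sorted(str(p.relative_to(base)) for p in base.rglob("*") if p.is_file())
```

For example, `list_mounted_files("sourceData")` lists the files under /mnt/nfs/sourceData. Checking for the directory before training starts turns a mount misconfiguration into an immediate, readable error instead of a failure partway through the job.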

Limitations

  • You must mount your NFS share using an IP address that is internal to your VPC; using public URLs isn't allowed.

  • Training jobs mount NFS shares without authentication, and will fail if a username and password are required.

    To secure your data, set permissions on your NFS share. If you are using Filestore, see access control in the Filestore documentation.

  • You can't run two training jobs that mount NFS shares from different VPC networks at the same time. This is due to the network peering restriction.
