Manage GPU workloads
This page describes how to manage graphics processing unit (GPU) workloads on
Google Distributed Cloud connected. To take advantage of this functionality,
you must have a Distributed Cloud connected hardware configuration
that contains GPUs. For more information, see Plan the hardware configuration.
Note: GPU workloads are only supported on legacy Distributed Cloud connected rack hardware ("Config 2") and on Distributed Cloud connected servers. GPU workloads are not supported on the refreshed Distributed Cloud connected rack hardware.
Distributed Cloud connected workloads can run in containers and on
virtual machines:
GPU workloads running in containers. All GPU resources on your
Distributed Cloud connected cluster are initially allocated to
workloads running in containers. The GPU driver for running GPU-based containerized
workloads is included in Distributed Cloud connected. Within each
container, GPU libraries are mounted at /opt/nvidia.
GPU workloads running on virtual machines. To run a GPU-based workload
on a virtual machine, you must allocate GPU resources on the target
Distributed Cloud connected node to virtual machines, as described
later on this page. Doing so bypasses the built-in GPU driver and passes the
GPUs directly through to virtual machines. You must manually install a
compatible GPU driver on each virtual machine's guest operating system. You
must also secure all the licensing required to run specialized GPU drivers on
your virtual machines.
To confirm that GPUs are present on a Distributed Cloud connected
node, verify that the node has the vm.cluster.gke.io.gpu=true label. If the
label is not present on the node, then there are no GPUs installed on the
corresponding Distributed Cloud connected physical machine.
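For example, one quick way to confirm this is to list nodes filtered by that label with a standard kubectl label selector (a minimal sketch; the label is the one documented above):

```bash
# List only the nodes that report GPUs; an empty result means no node carries the label.
kubectl get nodes -l vm.cluster.gke.io.gpu=true
```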
Allocate GPU resources
By default, all GPU resources on each node in the cluster are allocated to
containerized workloads. To customize the allocation of GPU resources on each node,
complete the steps in this section.
Configure GPU resource allocation
To allocate GPU resources on a Distributed Cloud connected node,
use the following command to edit the GPUAllocation custom resource on the
target node:
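```bash
kubectl edit gpuallocation NODE_NAME --namespace vm-system
```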
Replace NODE_NAME with the name of the target
Distributed Cloud node.
In the following example, the command's output shows the factory-default GPU
resource allocation. By default, all GPU resources are allocated to
containerized (pod) workloads, and no GPU resources are allocated to
virtual machine (vm) workloads:
```yaml
...
spec:
  pod: 2 # Number of GPUs allocated for container workloads
  vm: 0  # Number of GPUs allocated for VM workloads
```
Set your GPU resource allocations as follows:
- To allocate a GPU resource to containerized workloads, increase the value of the pod field and decrease the value of the vm field by the same amount.
- To allocate a GPU resource to virtual machine workloads, increase the value of the vm field and decrease the value of the pod field by the same amount.
The total number of allocated GPU resources must not exceed the number of GPUs
installed on the physical Distributed Cloud connected machine on
which the node runs; otherwise, the node rejects the invalid allocation.
In the following example, two GPU resources have been reallocated from
containerized (pod) workloads to virtual machine (vm) workloads:
```yaml
...
spec:
  pod: 0 # Number of GPUs allocated for container workloads
  vm: 2  # Number of GPUs allocated for VM workloads
```
When you finish, apply the modified GPUAllocation resource to your cluster
and wait for its status to change to AllocationFulfilled.
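If you want to script the wait, a minimal sketch is to poll the allocation's conditions; this assumes the condition fields use the standard lowercase type and reason names that underlie the kubectl describe output shown in the next section:

```bash
# Prints AllocationFulfilled once the new allocation has been applied on the node.
kubectl get gpuallocation NODE_NAME --namespace vm-system \
  -o jsonpath='{.status.conditions[?(@.type=="AllocationStatus")].reason}'
```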
Check GPU resource allocation
To check your GPU resource allocation, use the following command:
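```bash
kubectl describe gpuallocations NODE_NAME --namespace vm-system
```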
Replace NODE_NAME with the name of the target
Distributed Cloud connected node.
The command returns output similar to the following example:
```
Name:         mynode1
...
spec:
  node: mynode1
  pod:  2 # Number of GPUs allocated for container workloads
  vm:   0 # Number of GPUs allocated for VM workloads
Status:
  Allocated:  true
  Conditions:
    Last Transition Time:  2022-09-23T03:14:10Z
    Message:
    Observed Generation:   1
    Reason:                AllocationFulfilled
    Status:                True
    Type:                  AllocationStatus
    Last Transition Time:  2022-09-23T03:14:16Z
    Message:
    Observed Generation:   1
    Reason:                DeviceStateUpdated
    Status:                True
    Type:                  DeviceStateUpdated
  Consumption:
    pod:  0/2 # Number of GPUs currently consumed by container workloads
    vm:   0/0 # Number of GPUs currently consumed by VM workloads
  Device Model: Tesla T4
Events:         <none>
```
Configure a container to use GPU resources
To configure a container running on Distributed Cloud connected to
use GPU resources, configure its specification as shown in the following
example, and then apply it to your cluster:
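```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-gpu-pod
spec:
  containers:
  - name: my-gpu-container
    image: CUDA_TOOLKIT_IMAGE
    command: ["/bin/bash", "-c", "--"]
    args: ["while true; do sleep 600; done;"]
    env:
    resources:
      requests:
        GPU_MODEL: 2
      limits:
        GPU_MODEL: 2
  nodeSelector:
    kubernetes.io/hostname: NODE_NAME
```

Replace the following: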
CUDA_TOOLKIT_IMAGE: the full path and name of the
NVIDIA CUDA toolkit image. The version of the CUDA toolkit must match the
version of the NVIDIA driver running on your
Distributed Cloud connected cluster.
To determine your NVIDIA driver version, see the Distributed Cloud release notes.
To find the matching CUDA toolkit version, see CUDA Compatibility.
NODE_NAME: the name of the target
Distributed Cloud connected node.
GPU_MODEL: the model of NVIDIA GPU installed in the target Distributed Cloud connected
machine. Valid values are:
- nvidia.com/gpu-pod-NVIDIA_L4 for the NVIDIA L4 GPU
- nvidia.com/gpu-pod-TESLA_T4 for the NVIDIA Tesla T4 GPU
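For example, on a node with Tesla T4 GPUs, the relevant container fields of the manifest above might look like the following; the image tag is only an illustration, so substitute a CUDA toolkit image that matches your driver version:

```yaml
    # Illustrative values only: a hypothetical CUDA toolkit tag and the Tesla T4 resource name.
    image: nvidia/cuda:12.2.0-base-ubuntu22.04
    resources:
      requests:
        nvidia.com/gpu-pod-TESLA_T4: 2
      limits:
        nvidia.com/gpu-pod-TESLA_T4: 2
```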
Configure a virtual machine to use GPU resources
To configure a virtual machine running on
Distributed Cloud connected to use GPU resources, configure its VirtualMachine resource specification as shown in the following example,
and then apply it to your cluster:
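```yaml
apiVersion: vm.cluster.gke.io/v1
kind: VirtualMachine
...
spec:
  ...
  gpu:
    model: GPU_MODEL
    quantity: 2
```

Replace GPU_MODEL with the model of NVIDIA GPU installed in the target Distributed Cloud connected machine. Valid values are:
- nvidia.com/gpu-pod-NVIDIA_L4 for the NVIDIA L4 GPU
- nvidia.com/gpu-pod-TESLA_T4 for the NVIDIA Tesla T4 GPU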