This document describes the pre-configured, bootable operating system (OS) images that Cluster Director uses to deploy Compute Engine instances in your clusters.
The OS images that you can use in Cluster Director come with pre-configured machine learning (ML) frameworks and libraries. These frameworks and libraries remove the need for manual installation, as well as simplify model creation and training for large-scale workloads. By understanding the available OS images in Cluster Director and their lifecycle, you can choose the right OS image for your workload, keep your clusters secure, and prevent workload disruptions caused by obsolete software.
Slurm OS images for Cluster Director
When you deploy a Slurm cluster, Cluster Director provides OS images with pre-configured ML frameworks and libraries for the nodes in your cluster. These OS images offer the following benefits:
- The OS images provision every node in your cluster with a consistent software stack, which includes the required NVIDIA drivers and CUDA versions.
- The OS images are extensions of the Ubuntu LTS OS images, and they include all necessary system software for cluster and workload management.
Included software
The custom OS images for Cluster Director include the following software components by default:
- OS images: The following table lists the supported image families and image versions for each machine series.

  | Machine series | Image family | Image version |
  | --- | --- | --- |
  | A4X | Ubuntu 24.04 LTS with NVIDIA driver version 580 and CUDA 13 | projects/clusterdirector-public-images/global/images/family/a4x-ubuntu-2404-arm64-nvidia-580-slurm-2505-v20251118 |
  | A4, A3 Ultra, or N2 | Ubuntu 22.04 LTS with NVIDIA driver version 580 and CUDA 13, or version 570 and CUDA 12 | projects/clusterdirector-public-images/global/images/family/common-ubuntu-2204-amd64-nvidia-580-slurm-2505-v20251113 (default) |
  | | | projects/clusterdirector-public-images/global/images/family/common-ubuntu-2204-amd64-nvidia-570-slurm-2505-v20250918 |
  | A3 Mega | Ubuntu 22.04 LTS with NVIDIA driver version 570 and CUDA 12 | projects/clusterdirector-public-images/global/images/family/a3m-ubuntu-2204-amd64-nvidia-570-slurm-2505-v20251114 (default) |
  | | | projects/clusterdirector-public-images/global/images/family/a3m-ubuntu-2204-amd64-nvidia-570-slurm-2505-v20250918 |

- Orchestration: Slurm version 25.05 and its dependencies, such as MariaDB.
- Containerization tools: the NVIDIA enroot container runtime and NVIDIA pyxis, used for running containerized workloads on Slurm clusters.
- Drivers: NVIDIA driver version 580 with CUDA Toolkit 13, or version 570 with CUDA Toolkit 12.
- Libraries: libraries for GPUDirect RDMA, including ibverbs-utils and rdma-core.
- Parallel computing libraries: Open MPI and PMIx for managing parallel processing tasks across the cluster.
- Google Cloud integrations: the Ops Agent for monitoring and logging, and Cloud Storage FUSE for accessing Cloud Storage buckets from the cluster nodes.
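The image names listed above appear to encode the machine prefix, Ubuntu release, CPU architecture, NVIDIA driver version, Slurm version, and a date suffix. The following Python sketch splits a name into those parts so that you can, for example, check which driver version a given image carries. The naming pattern is inferred only from the names in this document and is not a documented contract. The `ImageInfo` class and `parse_image_path` function are hypothetical helpers, and reading the `v` suffix as a build date is an assumption.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime

# Pattern inferred from the image names in this document (an assumption,
# not a documented naming scheme).
_IMAGE_NAME = re.compile(
    r"^(?P<prefix>[a-z0-9]+)"
    r"-ubuntu-(?P<uy>\d{2})(?P<um>\d{2})"
    r"-(?P<arch>[a-z0-9]+)"
    r"-nvidia-(?P<driver>\d+)"
    r"-slurm-(?P<sy>\d{2})(?P<sm>\d{2})"
    r"-v(?P<date>\d{8})$"
)


@dataclass(frozen=True)
class ImageInfo:
    """Hypothetical container for the fields encoded in an image name."""
    machine_prefix: str
    ubuntu_version: str
    architecture: str
    nvidia_driver: int
    slurm_version: str
    build_date: date  # assumed meaning of the trailing vYYYYMMDD suffix


def parse_image_path(path: str) -> ImageInfo:
    """Parse the last path segment of an image path into its components."""
    name = path.rsplit("/", 1)[-1]
    m = _IMAGE_NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognized image name: {name!r}")
    return ImageInfo(
        machine_prefix=m["prefix"],
        ubuntu_version=f"{m['uy']}.{m['um']}",
        architecture=m["arch"],
        nvidia_driver=int(m["driver"]),
        slurm_version=f"{m['sy']}.{m['sm']}",
        build_date=datetime.strptime(m["date"], "%Y%m%d").date(),
    )


if __name__ == "__main__":
    info = parse_image_path(
        "projects/clusterdirector-public-images/global/images/family/"
        "a4x-ubuntu-2404-arm64-nvidia-580-slurm-2505-v20251118"
    )
    print(info)
```

A name that does not match the inferred pattern raises `ValueError` rather than returning partial data, so a future change to the naming scheme surfaces immediately instead of producing wrong values.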
Image family lifecycle
To help ensure that your clusters remain secure and compatible with the latest Google Cloud features, Cluster Director releases new image families approximately every six months. Each image family that Cluster Director supports goes through the following lifecycle:
- Supported: for 12 months after the release date, the image family is fully supported. It receives critical security and bug fixes. We recommend that you use supported image families for all new deployments.
- Deprecated: for approximately six months after the supported phase ends, the image family is deprecated. You can still use the image family to create nodes in your cluster, but we don't recommend it because the image family no longer receives security patches.
- Obsolete: after the deprecated phase ends, the image family becomes obsolete. You can no longer create nodes in your cluster by using this image family. Existing nodes continue to run, but they risk being incompatible with the cluster controller.
- Deleted: three months after an image family becomes obsolete, the underlying Compute Engine image resource is permanently deleted. Running nodes that use that image family are unaffected.
For information about the exact end of support dates for an image family, see Image family release notes.
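To plan upgrades, you can estimate which phase an image family is in from its release date. The following Python sketch applies the durations described above: 12 months supported, then about six months deprecated, then obsolete, with deletion three months after that. The `add_months` and `lifecycle_phase` functions are hypothetical helpers, the release date in the example is made up, and the six-month deprecated period is treated as exact even though the text says "approximately". The results are therefore estimates, and the release notes remain the source for exact dates.

```python
import calendar
from datetime import date

SUPPORTED_MONTHS = 12   # fully supported after release
DEPRECATED_MONTHS = 6   # "approximately six months"; treated as exact here
DELETION_MONTHS = 3     # obsolete period before the image is deleted


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's end."""
    total = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(total, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def lifecycle_phase(release: date, today: date) -> str:
    """Estimate the lifecycle phase of an image family on a given day."""
    if today < release:
        raise ValueError("today is before the release date")
    supported_end = add_months(release, SUPPORTED_MONTHS)
    deprecated_end = add_months(supported_end, DEPRECATED_MONTHS)
    deleted_on = add_months(deprecated_end, DELETION_MONTHS)
    if today < supported_end:
        return "supported"
    if today < deprecated_end:
        return "deprecated"
    if today < deleted_on:
        return "obsolete"
    return "deleted"


if __name__ == "__main__":
    # Hypothetical release date, for illustration only.
    release = date(2025, 1, 15)
    for day in (date(2025, 6, 1), date(2026, 3, 1), date(2026, 8, 1)):
        print(day, lifecycle_phase(release, day))
```

Each boundary date counts as the first day of the next phase. That choice is arbitrary, so adjust the comparisons if the release notes define the boundaries differently.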

