Use a custom OS image

You can use a custom OS image for your TPU VMs to pre-load software, use a specific OS distribution, or apply custom kernel modifications. Creating a custom image involves making specific system modifications during the image creation process and configuring the image to handle boot-time tasks required for TPU functionality.

Keep the following disclaimers in mind if you use a custom OS image with TPUs:

Google provides default TPU-optimized Ubuntu long-term support (LTS) images. The OS changes listed on this page are only validated for the Google-supported, TPU-optimized Ubuntu LTS images.
You are responsible for extrapolating the required OS changes for any other OS distribution or custom images. Google doesn't guarantee that the modifications for Ubuntu listed on this page work with other OS distributions or another Ubuntu image with a custom kernel.
Google doesn't build or provide testing for any OS images other than the default TPU-optimized Ubuntu LTS images. You must build and test your custom OS image.

For more information about the default TPU-optimized Ubuntu LTS images, see TPU OS images.

Prerequisites

Your base image must have the following components installed:

Python 3
gcloud CLI

Make modifications during image creation

Apply the following modifications while building your custom Ubuntu image.

Bind TPU devices to VFIO

To allow the guest OS to access TPU hardware, you must bind TPU devices to the vfio-pci driver.

Create a udev rules file named 99-tpu-vfiopci.rules in /etc/udev/rules.d/:

    # Rules for binding vfio-enabled TPU devices to vfio-pci.
    # v5p
    SUBSYSTEM=="pci", ACTION=="add", ATTRS{vendor}=="0x1ae0", ATTRS{device}=="0x0062", ATTRS{subsystem_vendor}=="0x1ae0", ATTRS{subsystem_device}=="0x00ad", DRIVER!="vfio-pci", TAG+="bind_to_vfio_pci"
    # v6e
    SUBSYSTEM=="pci", ACTION=="add", ATTRS{vendor}=="0x1ae0", ATTRS{device}=="0x006f", ATTRS{subsystem_vendor}=="0x1ae0", ATTRS{subsystem_device}=="0x00d1", DRIVER!="vfio-pci", TAG+="bind_to_vfio_pci"
    # TPU7x
    SUBSYSTEM=="pci", ACTION=="add", ATTRS{vendor}=="0x1ae0", ATTRS{device}=="0x0076", ATTRS{subsystem_vendor}=="0x1ae0", ATTRS{subsystem_device}=="0x00f2", DRIVER!="vfio-pci", TAG+="bind_to_vfio_pci"
    # Bind all 'bind_to_vfio_pci' tagged devices to vfio-pci.
    TAG=="bind_to_vfio_pci", RUN+="/lib/udev/bind_to_vfio_pci.sh $kernel"

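When testing a custom image, it can help to confirm which PCI devices carry the vendor and device IDs that the rules above match. The following is a minimal sketch, not part of any TPU tooling: the function names are hypothetical, and it only reads the ID attributes of each device directory under sysfs, whereas udev's ATTRS{} keys can also match parent devices.

```python
from pathlib import Path

# (vendor, device, subsystem_vendor, subsystem_device) copied from the udev rules.
TPU_PCI_IDS = {
    ("0x1ae0", "0x0062", "0x1ae0", "0x00ad"): "v5p",
    ("0x1ae0", "0x006f", "0x1ae0", "0x00d1"): "v6e",
    ("0x1ae0", "0x0076", "0x1ae0", "0x00f2"): "TPU7x",
}

ID_ATTRIBUTES = ("vendor", "device", "subsystem_vendor", "subsystem_device")


def _read_attr(device_dir: Path, name: str) -> str:
    """Return a sysfs attribute as a stripped string, or '' if it is unreadable."""
    try:
        return (device_dir / name).read_text().strip()
    except OSError:
        return ""


def find_tpu_devices(sysfs_root: str = "/sys/bus/pci/devices") -> dict:
    """Map each PCI address whose IDs match a rule to its TPU generation label."""
    found = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return found
    for device_dir in sorted(root.iterdir()):
        ids = tuple(_read_attr(device_dir, attr) for attr in ID_ATTRIBUTES)
        if ids in TPU_PCI_IDS:
            found[device_dir.name] = TPU_PCI_IDS[ids]
    return found
```

Calling find_tpu_devices() on a VM returns a dictionary keyed by PCI address; an empty dictionary suggests that none of the listed IDs are present on that machine.
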
Create a script named bind_to_vfio_pci.sh in /lib/udev/:

    #!/bin/bash
    # Run ./bind_to_vfio_pci.sh <DBDF>
    # Binds the device at <DBDF> to vfio-pci.
    # If the device is already bound to a driver, unbinds it first.

    # Load the vfio-pci module into the kernel. No-op if already loaded.
    modprobe vfio-pci

    DBDF_REGEX="^[[:xdigit:]]{4}:[[:xdigit:]]{2}:[[:xdigit:]]{2}.[[:xdigit:]]$"
    unset BDF
    if [[ $1 =~ $DBDF_REGEX ]]; then
      BDF=$1
    else
      echo "Error: BDF arg ($1) is not in form dddd:bb:dd.f"
      exit 1
    fi

    PCI_PATH="/sys/bus/pci/devices/$BDF"
    echo "vfio-pci" > "$PCI_PATH/driver_override"

    PCI_DRIVER_PATH="$PCI_PATH/driver"
    if [[ -d "$PCI_DRIVER_PATH" ]]; then
      curr_driver=$(readlink "$PCI_DRIVER_PATH")
      curr_driver=${curr_driver##*/}
      if [[ $curr_driver == "vfio-pci" ]]; then
        echo "$BDF already bound to vfio-pci"
        exit 0
      else
        echo "$BDF" > "$PCI_DRIVER_PATH/unbind"
        if [[ -d "$PCI_DRIVER_PATH" ]]; then
          echo "Error: Unable to unbind $PCI_DRIVER_PATH"
          exit 1
        fi
        echo "Unbound $BDF from driver $curr_driver"
      fi
    fi

    echo "$BDF" > /sys/bus/pci/drivers_probe
    echo "Bound $BDF to vfio-pci"

    # Grant read/write access on VFIO device to all users
    IOMMU_GROUP=$(readlink "$PCI_PATH/iommu_group" | xargs basename)
    VFIO_DEV="/dev/vfio/$IOMMU_GROUP"
    if [[ -c "$VFIO_DEV" ]]; then
      chmod 0666 "$VFIO_DEV"
    else
      echo "$VFIO_DEV not found"
      exit 1
    fi

    # Set allow_unsafe_interrupts for x86 platforms.
    (uname -a | grep -q x86_64) && echo 1 > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts

    # This is only needed to avoid non-zero exit code from previous command.
    echo "All Done!"

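After the script runs, you may want to confirm that a device actually ended up on vfio-pci. The sketch below is one hypothetical way to check it from Python: it validates an address in the dddd:bb:dd.f form the script expects (with the dot matched literally) and resolves the device's driver symlink in sysfs. The function names are illustrative only.

```python
import os
import re

# Domain:bus:device.function, four/two/two/one hexadecimal digits.
DBDF_PATTERN = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]")


def is_valid_dbdf(address: str) -> bool:
    """Return True if the whole string is a PCI address such as 0000:00:05.0."""
    return DBDF_PATTERN.fullmatch(address) is not None


def bound_driver(address: str, sysfs_root: str = "/sys/bus/pci/devices"):
    """Return the name of the driver bound to the device, or None if unbound."""
    driver_link = os.path.join(sysfs_root, address, "driver")
    if not os.path.islink(driver_link):
        return None
    # The link target ends with the driver name, for example .../drivers/vfio-pci.
    return os.path.basename(os.readlink(driver_link))
```

For a successfully bound device, bound_driver(address) is expected to return "vfio-pci"; None means no driver is attached.
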
Make the script executable:

    chmod +x /lib/udev/bind_to_vfio_pci.sh

Grant all users on the system access to the TPU device:

    echo 'KERNEL=="accel*" MODE="0666"' >> /etc/udev/rules.d/99-tpu.rules

Modify the image to enhance performance

To ensure optimal performance, adjust the following system limits and parameters.

Memory limits

Allow a single process to lock unlimited memory by updating /etc/security/limits.conf:

    echo '*  hard  memlock  unlimited' >> /etc/security/limits.conf
    echo '*  soft  memlock  unlimited' >> /etc/security/limits.conf

File limits

Increase the number of open files by updating /etc/security/limits.conf:

    echo "*    soft    nofile       100000" >> /etc/security/limits.conf
    echo "*    hard    nofile       100000" >> /etc/security/limits.conf
    echo "root soft    nofile       100000" >> /etc/security/limits.conf
    echo "root hard    nofile       100000" >> /etc/security/limits.conf
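
One way to check that the memlock and nofile entries took effect is to read the limits of a fresh login session with Python's standard resource module. This is a minimal sketch with hypothetical function names; because limits.conf is applied by PAM when a session starts, run it from a new login rather than from a shell that was already open.

```python
import resource


def describe_limit(value: int) -> str:
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits() -> dict:
    """Return the soft and hard memlock and nofile limits of this process."""
    limits = {}
    for name, which in (
        ("memlock", resource.RLIMIT_MEMLOCK),
        ("nofile", resource.RLIMIT_NOFILE),
    ):
        soft, hard = resource.getrlimit(which)
        limits[name] = {"soft": describe_limit(soft), "hard": describe_limit(hard)}
    return limits
```

With the settings above applied, current_limits() would be expected to report "unlimited" for memlock and "100000" for nofile.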

Kernel parameters

Update your GRUB configuration (typically in /etc/default/grub) to include the following parameters in GRUB_CMDLINE_LINUX:

idle=poll: Prevents the CPU from entering low-power idle states.
intel_iommu=on,sm_on: Enables the Intel Input-Output Memory Management Unit (IOMMU). Required for TPU7x and v5p architectures.
transparent_hugepage=always: Enables Transparent Huge Pages (THP).

The following steps show how to update these kernel parameters:

Prevent the CPU from moving into a low power idle state by setting the following variable, which you will use in the next step.
```
kernel_cmdline="idle=poll"
```

Enable the Intel Input-Output Memory Management Unit (IOMMU). This step is required for TPU7x and TPU v5p.

    kernel_cmdline="${kernel_cmdline} intel_iommu=on,sm_on"
    sed -i "s/GRUB_CMDLINE_LINUX=\"\"/GRUB_CMDLINE_LINUX=\"${kernel_cmdline}\"/" /etc/default/grub
    echo "Status: New kernel cmdline: $(cat /etc/default/grub | grep -e '^GRUB_CMDLINE_LINUX=')"
    update-grub

Enable Transparent Huge Pages (THP):

    echo "Status: Enabling THP"
    sed -i -r 's/GRUB_CMDLINE_LINUX="[a-zA-Z0-9_=, ]*/& transparent_hugepage=always/' /etc/default/grub

    update-grub
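
After rebooting into the new image, the kernel command line in /proc/cmdline shows whether the GRUB changes were picked up. The following sketch, with hypothetical names, reports which of the three parameters from this section are absent. It compares whole whitespace-separated tokens, so a parameter that was spliced into the middle of another one counts as missing.

```python
EXPECTED_PARAMS = (
    "idle=poll",
    "intel_iommu=on,sm_on",
    "transparent_hugepage=always",
)


def missing_kernel_params(cmdline: str, expected=EXPECTED_PARAMS) -> list:
    """Return the expected parameters that are not tokens of the command line."""
    present = set(cmdline.split())
    return [param for param in expected if param not in present]


def check_running_kernel(path: str = "/proc/cmdline") -> list:
    """Read the running kernel's command line and list missing parameters."""
    with open(path) as handle:
        return missing_kernel_params(handle.read())
```

An empty list from check_running_kernel() indicates that all three parameters are active. On architectures that do not need intel_iommu=on,sm_on, pass a shorter expected tuple to missing_kernel_params.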

Install vBar agent

The vBar agent is required for the inter-chip interconnect (ICI) network to function.

To install the vBar agent, run the following commands:

Authenticate Docker with Artifact Registry:

    gcloud auth configure-docker us-docker.pkg.dev

Pull the Docker image from Artifact Registry:

    docker pull gcr.io/cloud-tpu-v2-images/vbar_control_agent:0.0.1

Run a container using the vBar agent image:

    docker run --privileged --net=host vbar_control_agent:0.0.1

Optional: Install and run AI Telemetry Collector

The AI Telemetry Collector runs inside the TPU VM and lets you access runtime and infrastructure metrics through Cloud Monitoring or through your own Prometheus-based monitoring pipeline. You can use the AI Telemetry Collector with a custom OS by using the ai-telemetry-collector Docker image. You can install the image onto your custom OS and use a config.yaml file to dictate the collection intervals, enable or disable specific metrics, or change the export destinations.

To install the AI Telemetry Collector, run the following commands:

Authenticate Docker with Artifact Registry:

    gcloud auth configure-docker us-docker.pkg.dev

Pull the Docker image from Artifact Registry:

    docker pull gcr.io/cloud-tpu-v2-images/ai-telemetry-collector:latest

Run a container using the AI Telemetry Collector image with the default configuration:
```
docker run --privileged --net=host ai-telemetry-collector:latest
```

For information about using a custom configuration file or adding additional configuration files, see AI Telemetry Collector.

Make boot time modifications

Configure your image to perform the tasks in the following sections every time a VM boots. You can use the cloud-init tool to configure boot-time tasks by passing metadata to your instances. The configurations in the following sections use modules such as write_files and runcmd. Include snippets that define files to be written under the write_files: key, and include commands that run at boot time under the runcmd: key in your cloud-init configuration.

Start the vBar agent

Initiate the vBar control agent with the appropriate user and group IDs:

    vbar_control_agent --logtostderr --gid= --uid= --chroot= --census_enabled=false --loas_pwd_fallback_in_corp

Configure environment variables

To ensure your environment is correctly initialized for TPU workloads, you must retrieve runtime configuration variables from the Compute Engine metadata server during the system boot process. To do this, add the following snippet to the write_files: section of your cloud-init configuration, which creates a script named /var/scripts/configure-env-vars.sh. This script automates retrieval of attributes from the tpu-env metadata key and saves them in /${HOME}/tpu-env to be used by the TPU software stack.

   
    - path: /var/scripts/configure-env-vars.sh
      permissions: 0444
      owner: root
      content: |
        grep -q CLOUDSDK_PYTHON /etc/environment || echo "CLOUDSDK_PYTHON=/usr/bin/python3" >> /etc/environment
        export HOME=/home/tpu-runtime
        curl -s 'http://metadata.google.internal/computeMetadata/v1/instance/attributes/tpu-env' -H 'Metadata-Flavor: Google' > /tmp/tpu-env.yaml
        eval $(python3 -c '''
        import yaml
        stream_in=open("/tmp/tpu-env.yaml", "r")
        for k,v in yaml.safe_load(stream_in).items():
          print("{var}=\"{value}\"".format(var = k, value = str(v)))
        ''' > "/${HOME}/tpu-env"
        )
        rm -f "/tmp/tpu-env.yaml"
        printenv
        cat ${HOME}/tpu-env
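
The inline Python in the script above turns each key in the tpu-env document into a KEY="value" line. The sketch below isolates that formatting step so the output shape is easier to see. The function name and the sample keys are illustrative assumptions; the real keys come from whatever the metadata server returns for tpu-env.

```python
def to_env_lines(attributes: dict) -> list:
    """Render each key/value pair as KEY="value", one string per entry."""
    return ['{var}="{value}"'.format(var=key, value=str(value))
            for key, value in attributes.items()]


# Hypothetical sample input standing in for the parsed tpu-env YAML mapping.
sample = {"ACCELERATOR_TYPE": "v5p-8", "WORKER_ID": 0}
env_lines = to_env_lines(sample)
```

Non-string values such as the integer above are converted with str() before quoting, which matches the str(v) call in the script.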

Get VM metadata

The following snippet creates a script named /var/scripts/get-vm-metadata.py, a Python utility to programmatically query the metadata server for specific instance attributes and custom metadata tags. Add the following to the write_files: section of your cloud-init configuration:

   
    - path: /var/scripts/get-vm-metadata.py
      permissions: 0444
      owner: root
      content: |
        import sys, requests, os
        if len(sys.argv) < 2:
          sys.stderr.write('Must provide key')
          os._exit(1)
        key = sys.argv[1]
        default = None
        if len(sys.argv) > 2:
          default = sys.argv[2]
        attribute_type = 'attributes'
        if len(sys.argv) > 3:
          attribute_type = sys.argv[3]
        request = requests.get("http://metadata.google.internal/computeMetadata/v1/instance/{}/{}".format(attribute_type, key), headers={'Metadata-Flavor': 'Google'})
        if request.status_code == 200:
          print(request.content)
        elif request.status_code == 404 or request.status_code == 403:
          sys.stderr.write('Metadata key: {} does not exist\n'.format(key))
          if default:
            print(default)
        else:
          sys.stderr.write('Lookup failed with: {}'.format(request))
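
If the base image does not ship the requests package, the same lookup can be sketched with only the Python standard library. This is an alternative illustration under that assumption, not the utility described above; the function names are hypothetical, and the key, default, and attribute type mirror the script's three positional arguments.

```python
import urllib.error
import urllib.request

METADATA_BASE = "http://metadata.google.internal/computeMetadata/v1/instance"


def metadata_url(key: str, attribute_type: str = "attributes") -> str:
    """Build the metadata server URL for one instance key."""
    return "{}/{}/{}".format(METADATA_BASE, attribute_type, key)


def get_metadata(key, default=None, attribute_type="attributes", timeout=5.0):
    """Return the value for key, or default if the server answers 403 or 404."""
    request = urllib.request.Request(
        metadata_url(key, attribute_type),
        headers={"Metadata-Flavor": "Google"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code in (403, 404):
            return default
        raise
```

For example, get_metadata("tpu-env") requests the same URL that the environment-variable script fetches with curl. The call only succeeds from inside a Compute Engine VM, where the metadata server is reachable.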

Increase Cloud Storage timeouts

If your workload interacts with Cloud Storage, increase timeout durations by adding timeout values to /etc/environment. To do this, add the following snippet to the write_files: section of your cloud-init configuration, which creates a script named /var/scripts/configure-gcs-timeouts.sh.

   
    - path: /var/scripts/configure-gcs-timeouts.sh
      permissions: 0444
      owner: root
      content: |
        echo "GCS_RESOLVE_REFRESH_SECS=60" >> /etc/environment
        echo "GCS_REQUEST_CONNECTION_TIMEOUT_SECS=300" >> /etc/environment
        echo "GCS_METADATA_REQUEST_TIMEOUT_SECS=300" >> /etc/environment
        echo "GCS_READ_REQUEST_TIMEOUT_SECS=300" >> /etc/environment
        echo "GCS_WRITE_REQUEST_TIMEOUT_SECS=600" >> /etc/environment
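
To confirm that the timeouts landed in /etc/environment, you can parse the file and compare it against the values above. The sketch below uses hypothetical function names. Because the script appends with >>, repeated boots can leave duplicate lines, so the parser keeps the last value it sees for each name.

```python
EXPECTED_GCS_SETTINGS = {
    "GCS_RESOLVE_REFRESH_SECS": "60",
    "GCS_REQUEST_CONNECTION_TIMEOUT_SECS": "300",
    "GCS_METADATA_REQUEST_TIMEOUT_SECS": "300",
    "GCS_READ_REQUEST_TIMEOUT_SECS": "300",
    "GCS_WRITE_REQUEST_TIMEOUT_SECS": "600",
}


def parse_environment(text: str) -> dict:
    """Parse NAME=value lines, skipping blanks and comments; last value wins."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip().strip('"')
    return values


def gcs_timeout_mismatches(text: str, expected=EXPECTED_GCS_SETTINGS) -> dict:
    """Return {name: actual_value_or_None} for settings that differ from expected."""
    actual = parse_environment(text)
    return {name: actual.get(name)
            for name, want in expected.items() if actual.get(name) != want}
```

Passing the contents of /etc/environment to gcs_timeout_mismatches returns an empty dictionary when all five settings match.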

What's next

Review available TPU OS images.
Learn how to Manage TPU VMs.