Manage queued resources

Queued resources enable you to request Cloud TPU resources in a queued manner. When you request queued resources, the request is added to a queue maintained by the Cloud TPU service. When the requested resource becomes available, it's assigned to your Google Cloud project for your immediate exclusive use. It will remain assigned to your project unless you delete it or it's preempted. Only TPU Spot VMs and preemptible TPUs are eligible for preemption.

You can specify an optional start time and end time in a queued resource request. The start time is the earliest time at which the request can be filled. If the request has not been filled by the specified end time, it expires. An expired request remains in the queue.
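
The start and end time rules can be sketched as a small Python function. This is an illustration of the semantics described above, not part of any Cloud TPU API: `window_status` and its return labels are hypothetical names, and treating a request as expired at exactly the end time is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def window_status(
    now: datetime,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    filled: bool = False,
) -> str:
    """Classify a request against its optional start and end times.

    Hypothetical helper: the labels returned here are illustrative and
    are not state names used by the Cloud TPU service.
    """
    if filled:
        return "filled"
    # Assumption: a request still unfilled at the end time counts as expired.
    if end is not None and now >= end:
        return "expired"
    # The request cannot be filled before its start time.
    if start is not None and now < start:
        return "not_yet_eligible"
    return "eligible"


start = datetime(2025, 1, 2, tzinfo=timezone.utc)
end = datetime(2025, 1, 3, tzinfo=timezone.utc)
print(window_status(datetime(2025, 1, 1, tzinfo=timezone.utc), start, end))
```

Both bounds are optional, so a request without a start or end time is always reported as eligible until it is filled.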

Queued resource requests can be in one of the following states:

WAITING_FOR_RESOURCES: The request has passed initial validation and has been added to the queue. It remains in this state until there are enough free resources to begin provisioning your request or the allocation interval elapses. When demand is high, not all requests can be provisioned immediately. If you need more reliable access to TPUs, consider purchasing a reservation.
Important: WAITING_FOR_RESOURCES replaced the ACCEPTED state. If your code has logic that waits for queued resources to enter the ACCEPTED state, you may need to update the code to wait for the WAITING_FOR_RESOURCES state.
PROVISIONING: The request has been selected from the queue and its resources are being allocated.
ACTIVE: The request has been allocated. When queued resource requests are in the ACTIVE state, you can manage your TPU VMs as described in Manage TPUs.
FAILED: The request couldn't be completed, either because there is a problem with the request or because the requested resources were not available within the allocation interval. The request remains in the queue until it is explicitly deleted.
SUSPENDING: The resources associated with the request are being deleted.
SUSPENDED: The resources specified in the request have been deleted. When a request is in the SUSPENDED state, it's no longer eligible for further allocation.
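
The states above lend themselves to a simple polling loop. The following Python sketch is illustrative only: `QueuedResourceState`, `normalize`, and `wait_for_active` are hypothetical names, and `get_state` stands in for whatever call you use to read the request's current state name. It treats the legacy ACCEPTED value as equivalent to WAITING_FOR_RESOURCES, and it assumes that FAILED, SUSPENDING, and SUSPENDED mean waiting for ACTIVE can no longer succeed.

```python
import time
from enum import Enum
from typing import Callable


class QueuedResourceState(Enum):
    """States listed in this guide (ACCEPTED is the legacy name)."""

    ACCEPTED = "ACCEPTED"
    WAITING_FOR_RESOURCES = "WAITING_FOR_RESOURCES"
    PROVISIONING = "PROVISIONING"
    ACTIVE = "ACTIVE"
    FAILED = "FAILED"
    SUSPENDING = "SUSPENDING"
    SUSPENDED = "SUSPENDED"


# Assumption: once a request reaches one of these states, stop waiting.
_GIVE_UP = {
    QueuedResourceState.FAILED,
    QueuedResourceState.SUSPENDING,
    QueuedResourceState.SUSPENDED,
}


def normalize(state_name: str) -> QueuedResourceState:
    """Map a state name to the enum, folding ACCEPTED into its replacement."""
    state = QueuedResourceState(state_name)
    if state is QueuedResourceState.ACCEPTED:
        return QueuedResourceState.WAITING_FOR_RESOURCES
    return state


def wait_for_active(
    get_state: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the request is ACTIVE; return False on failure or timeout."""
    for _ in range(max_polls):
        state = normalize(get_state())
        if state is QueuedResourceState.ACTIVE:
            return True
        if state in _GIVE_UP:
            return False
        sleep(poll_seconds)  # Still queued or provisioning: keep waiting.
    return False
```

Passing `get_state` and `sleep` as parameters keeps the loop independent of any particular client and makes it easy to exercise without real waiting.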

Prerequisites

Before you run the commands in this guide, you must install the Google Cloud CLI, create a Google Cloud project, and enable the Cloud TPU API. For instructions, see Set up the Cloud TPU environment.

If you're using one of the Cloud Client Libraries, follow the setup instructions for the language you're using:

Python
Java

Request an on-demand queued resource

On-demand resources won't be preempted, but on-demand quota doesn't guarantee there will be enough available Cloud TPU resources to satisfy your request. For more information about on-demand resources, see Quota types.

gcloud

gcloud compute tpus queued-resources create your-queued-resource-id \
    --node-id your-node-id \
    --project your-project-id \
    --zone us-central1-a \
    --accelerator-type v5litepod-8 \
    --runtime-version v2-alpha-tpuv5-lite

Command parameter descriptions

queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-assigned ID of the TPU which is created when the queued resource request is allocated.
project: Your Google Cloud project.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
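
If you script this command, it can help to assemble the argument list programmatically instead of concatenating strings. The following Python sketch assumes only the flags shown in the command above; `build_create_command` is a hypothetical helper name, and the values are the same placeholders used in this guide.

```python
import shlex
from typing import List


def build_create_command(
    queued_resource_id: str,
    node_id: str,
    project: str,
    zone: str,
    accelerator_type: str,
    runtime_version: str,
) -> List[str]:
    """Return the gcloud argument list for an on-demand queued resource."""
    return [
        "gcloud", "compute", "tpus", "queued-resources", "create",
        queued_resource_id,
        "--node-id", node_id,
        "--project", project,
        "--zone", zone,
        "--accelerator-type", accelerator_type,
        "--runtime-version", runtime_version,
    ]


cmd = build_create_command(
    queued_resource_id="your-queued-resource-id",
    node_id="your-node-id",
    project="your-project-id",
    zone="us-central1-a",
    accelerator_type="v5litepod-8",
    runtime_version="v2-alpha-tpuv5-lite",
)
# Print a shell-quoted form for review before running anything.
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems when an ID contains unexpected characters.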

curl

curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "{
        'tpu': {
            'node_spec': {
                'parent': 'projects/your-project-number/locations/us-central1-a',
                'node_id': 'your-node-id',
                'node': {
                    'accelerator_type': 'v5litepod-8',
                    'runtime_version': 'v2-alpha-tpuv5-lite',
                }
            }
        }
    }" \
    https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-central1-a/queuedResources?queued_resource_id=your-queued-resource-id

Command parameter descriptions

queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-assigned ID of the TPU which is created when the queued resource request is allocated.
project: Your Google Cloud project.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
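
The same request can be assembled in Python before handing it to an HTTP client. This sketch only builds the URL and JSON body and does not send anything. It mirrors the field names and endpoint in the curl command above; `build_create_request` is a hypothetical helper, and using a single `project` value for both the URL and the `parent` field is an assumption.

```python
import json
from typing import Tuple
from urllib.parse import urlencode

API_ROOT = "https://tpu.googleapis.com/v2alpha1"


def build_create_request(
    project: str,
    zone: str,
    queued_resource_id: str,
    node_id: str,
    accelerator_type: str,
    runtime_version: str,
) -> Tuple[str, str]:
    """Return (url, json_body) for a create-queued-resource request."""
    parent = f"projects/{project}/locations/{zone}"
    query = urlencode({"queued_resource_id": queued_resource_id})
    url = f"{API_ROOT}/{parent}/queuedResources?{query}"
    # Body layout copied from the curl example in this guide.
    body = {
        "tpu": {
            "node_spec": {
                "parent": parent,
                "node_id": node_id,
                "node": {
                    "accelerator_type": accelerator_type,
                    "runtime_version": runtime_version,
                },
            }
        }
    }
    return url, json.dumps(body)


url, body = build_create_request(
    project="your-project-id",
    zone="us-central1-a",
    queued_resource_id="your-queued-resource-id",
    node_id="your-node-id",
    accelerator_type="v5litepod-8",
    runtime_version="v2-alpha-tpuv5-lite",
)
print(url)
print(body)
```

Using `json.dumps` produces strictly valid JSON with double-quoted keys and no trailing commas. To send the request, add the `Authorization: Bearer` and `Content-Type: application/json` headers shown in the curl command.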

Console

In the Google Cloud console, go to the TPUs page:

Go to TPUs
Click Create TPU.
In the Name field, enter a name for your TPU.
In the Zone box, select the zone where you want to create the TPU.
In the TPU type box, select an accelerator type. The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
In the TPU software version box, select a software version. When creating a Cloud TPU VM, the TPU software version specifies the version of the TPU runtime to install. For more information, see TPU software versions.
Click the Enable queueing toggle.
In the Queued resource name field, enter a name for your queued resource request.
Click Create to create your queued resource request.

Java

To authenticate to Cloud TPU, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.tpu.v2alpha1.CreateQueuedResourceRequest;
import com.google.cloud.tpu.v2alpha1.Node;
import com.google.cloud.tpu.v2alpha1.QueuedResource;
import com.google.cloud.tpu.v2alpha1.TpuClient;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateQueuedResource {
  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Google Cloud project you want to create a node.
    String projectId = "YOUR_PROJECT_ID";
    // The zone in which to create the TPU.
    // For more information about supported TPU types for specific zones,
    // see https://cloud.google.com/tpu/docs/regions-zones
    String zone = "us-central1-a";
    // The name for your TPU.
    String nodeName = "YOUR_NODE_ID";
    // The accelerator type that specifies the version and size of the Cloud TPU you want to create.
    // For more information about supported accelerator types for each TPU version,
    // see https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#versions.
    String tpuType = "v5litepod-4";
    // Software version that specifies the version of the TPU runtime to install.
    // For more information see https://cloud.google.com/tpu/docs/runtimes
    String tpuSoftwareVersion = "v2-tpuv5-litepod";
    // The name for your Queued Resource.
    String queuedResourceId = "QUEUED_RESOURCE_ID";

    createQueuedResource(
        projectId, zone, queuedResourceId, nodeName, tpuType, tpuSoftwareVersion);
  }

  // Creates a Queued Resource
  public static QueuedResource createQueuedResource(
      String projectId,
      String zone,
      String queuedResourceId,
      String nodeName,
      String tpuType,
      String tpuSoftwareVersion)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    String resource =
        String.format(
            "projects/%s/locations/%s/queuedResources/%s", projectId, zone, queuedResourceId);
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (TpuClient tpuClient = TpuClient.create()) {
      String parent = String.format("projects/%s/locations/%s", projectId, zone);
      Node node =
          Node.newBuilder()
              .setName(nodeName)
              .setAcceleratorType(tpuType)
              .setRuntimeVersion(tpuSoftwareVersion)
              .setQueuedResource(resource)
              .build();

      QueuedResource queuedResource =
          QueuedResource.newBuilder()
              .setName(queuedResourceId)
              .setTpu(
                  QueuedResource.Tpu.newBuilder()
                      .addNodeSpec(
                          QueuedResource.Tpu.NodeSpec.newBuilder()
                              .setParent(parent)
                              .setNode(node)
                              .setNodeId(nodeName)
                              .build())
                      .build())
              .build();

      CreateQueuedResourceRequest request =
          CreateQueuedResourceRequest.newBuilder()
              .setParent(parent)
              .setQueuedResourceId(queuedResourceId)
              .setQueuedResource(queuedResource)
              .build();

      return tpuClient.createQueuedResourceAsync(request).get(1, TimeUnit.MINUTES);
    }
  }
}
