Manage queued resources
Note: This page applies to the Cloud TPU API. For Ironwood (TPU7x), you must use
Google Kubernetes Engine (GKE). For more information, see About TPUs in GKE.
Queued resources enable you to request Cloud TPU resources in a queued manner.
When you create a queued resource request, the request is added to a queue
maintained by the Cloud TPU service. When the resource you requested becomes
available, it's assigned to your Google Cloud project for your immediate and
exclusive use. It will remain assigned to your project unless you delete it or
it's preempted. Only TPU Spot VMs and preemptible TPUs are eligible for
preemption.
Note: The examples shown in this document use TPU v5e. If you want to use a
different TPU version, update the --accelerator-type and --runtime-version
flags. For more information about TPU versions, see TPU versions. For more
information about runtime versions, see TPU runtime versions.
You can specify an optional start and end time in a queued resource request.
The start time specifies the earliest time at which the request can be filled.
If a request has not been filled by the specified end time, the request expires.
The request remains in the queue after it has expired, but it is no longer
eligible for allocation.
Queued resource requests can be in one of the following states:
WAITING_FOR_RESOURCES: The request has passed initial validation and has been added to the queue. It remains in this state until there are sufficient free resources to begin provisioning your request or the allocation interval elapses. When demand is high, not all requests can be immediately provisioned. If you need more reliable access to TPUs, consider purchasing a reservation.
Important: WAITING_FOR_RESOURCES replaced the ACCEPTED state. If your code has logic that waits for queued resources to enter the ACCEPTED state, you may need to update the code to wait for the WAITING_FOR_RESOURCES state.
PROVISIONING: The request has been selected from the queue and its resources are being allocated.
ACTIVE: The request has been allocated. When queued resource requests are in the ACTIVE state, you can manage your TPU VMs as described in Manage TPUs.
FAILED: The request couldn't be completed, either because there is a problem with the request or because the requested resources were not available within the allocation interval. The request remains in the queue until it is explicitly deleted.
SUSPENDING: The resources associated with the request are being deleted.
SUSPENDED: The resources specified in the request have been deleted. When a request is in the SUSPENDED state, it's no longer eligible for further allocation.
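Code that polls a queued resource usually needs to decide whether a state is still transitional. The following Python sketch shows one way to model the states listed above, including the ACCEPTED to WAITING_FOR_RESOURCES rename. The names QueuedResourceState, parse_state, and is_settled are hypothetical helpers for illustration, not part of any Cloud TPU library, and the grouping of ACTIVE, FAILED, and SUSPENDED as "settled" is an assumption about what a polling loop would want.

```python
from enum import Enum


class QueuedResourceState(Enum):
    """States described in this document (hypothetical helper type)."""
    WAITING_FOR_RESOURCES = "WAITING_FOR_RESOURCES"
    PROVISIONING = "PROVISIONING"
    ACTIVE = "ACTIVE"
    FAILED = "FAILED"
    SUSPENDING = "SUSPENDING"
    SUSPENDED = "SUSPENDED"


# ACCEPTED was replaced by WAITING_FOR_RESOURCES, so map the old name forward.
_LEGACY_NAMES = {"ACCEPTED": "WAITING_FOR_RESOURCES"}

# Assumption: a polling loop can stop once one of these states is reached.
_SETTLED = {
    QueuedResourceState.ACTIVE,
    QueuedResourceState.FAILED,
    QueuedResourceState.SUSPENDED,
}


def parse_state(name: str) -> QueuedResourceState:
    """Convert a state name to the enum, accepting the legacy ACCEPTED name."""
    normalized = name.strip().upper()
    return QueuedResourceState(_LEGACY_NAMES.get(normalized, normalized))


def is_settled(name: str) -> bool:
    """Return True when the request is no longer moving between states."""
    return parse_state(name) in _SETTLED
```

A polling loop could call is_settled on each reported state and keep waiting while it returns False.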
Prerequisites
Before you run the commands in this guide, you must install the Google Cloud CLI,
create a Google Cloud project, and enable the Cloud TPU API. For
instructions, see Set up the Cloud TPU environment.
If you're using one of the Cloud Client Libraries, follow the setup
instructions for the language you're using.
Request an on-demand queued resource
On-demand resources won't be preempted, but on-demand quota doesn't guarantee
there will be enough available Cloud TPU resources to satisfy your request.
For more information about on-demand resources, see Quota types.
gcloud
    gcloud compute tpus queued-resources create your-queued-resource-id \
        --node-id your-node-id \
        --project your-project-id \
        --zone us-central1-a \
        --accelerator-type v5litepod-8 \
        --runtime-version v2-alpha-tpuv5-lite
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-assigned ID of the TPU which is created when the queued resource request is allocated.
project: Your Google Cloud project.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
curl
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -d '{
          "tpu": {
            "node_spec": {
              "parent": "projects/your-project-number/locations/us-central1-a",
              "node_id": "your-node-id",
              "node": {
                "accelerator_type": "v5litepod-8",
                "runtime_version": "v2-alpha-tpuv5-lite"
              }
            }
          }
        }' \
        "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-central1-a/queuedResources?queued_resource_id=your-queued-resource-id"
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-assigned ID of the TPU which is created when the queued resource request is allocated.
project: Your Google Cloud project.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click Create TPU.
In the Name field, enter a name for your TPU.
In the Zone box, select the zone where you want to create the TPU.
In the TPU type box, select an accelerator type. The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
In the TPU software version box, select a software version. When creating a Cloud TPU VM, the TPU software version specifies the version of the TPU runtime to install. For more information, see TPU software versions.
Click the Enable queueing toggle.
In the Queued resource name field, enter a name for your queued
resource request.
Click Create to create your queued resource request.
Important: Queued resources consume quota regardless of the state of the queued
resource request. You should delete queued resources after use to avoid blocking
future requests on quota limits. For more information about TPU quotas, see TPU quota.
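If you assemble the REST request in code rather than by hand, the URL and JSON body from the curl example can be built programmatically. The following Python sketch mirrors that example. The function name build_create_request is hypothetical, and as a simplifying assumption it uses a single project value for both the parent field and the URL, where the curl example uses the project number in one and the project ID in the other.

```python
import json

# Endpoint root taken from the curl example in this section.
API_ROOT = "https://tpu.googleapis.com/v2alpha1"


def build_create_request(project, zone, queued_resource_id, node_id,
                         accelerator_type, runtime_version):
    """Return (url, json_body) for an on-demand queued resource request.

    Hypothetical helper: it only builds strings and sends nothing.
    """
    parent = f"projects/{project}/locations/{zone}"
    body = {
        "tpu": {
            "node_spec": {
                "parent": parent,
                "node_id": node_id,
                "node": {
                    "accelerator_type": accelerator_type,
                    "runtime_version": runtime_version,
                },
            }
        }
    }
    url = (f"{API_ROOT}/{parent}/queuedResources"
           f"?queued_resource_id={queued_resource_id}")
    return url, json.dumps(body)


url, payload = build_create_request(
    "your-project-id", "us-central1-a", "your-queued-resource-id",
    "your-node-id", "v5litepod-8", "v2-alpha-tpuv5-lite")
print(url)
print(payload)
```

Using json.dumps guarantees that the body is strictly valid JSON, which avoids quoting mistakes when values contain special characters.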
Request a queued resource using a reservation
You can request a queued resource using a reservation. To purchase a
reservation, contact your Google Cloud account team.
gcloud
    gcloud compute tpus queued-resources create your-queued-resource-id \
        --node-id your-node-id \
        --project your-project-id \
        --zone us-central1-a \
        --accelerator-type v5litepod-8 \
        --runtime-version v2-alpha-tpuv5-lite \
        --reserved
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-assigned ID of the TPU which is created when the queued resource request is allocated.
project: Your Google Cloud project.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
reserved: Use this flag when requesting queued resources as part of a Cloud TPU reservation.
curl
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -d '{
          "tpu": {
            "node_spec": {
              "parent": "projects/your-project-number/locations/us-central1-a",
              "node_id": "your-node-id",
              "node": {
                "accelerator_type": "v5litepod-8",
                "runtime_version": "v2-alpha-tpuv5-lite"
              }
            }
          },
          "guaranteed": {
            "reserved": true
          }
        }' \
        "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-central1-a/queuedResources?queued_resource_id=your-queued-resource-id"
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-assigned ID of the TPU which is created when the queued resource request is allocated.
project: Your Google Cloud project.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
reserved: Set this field to true when requesting queued resources as part of a Cloud TPU reservation.
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click Create TPU.
In the Name field, enter a name for your TPU.
In the Zone box, select the zone where you want to create the TPU.
In the TPU type box, select an accelerator type. The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
In the TPU software version box, select a software version. When creating a Cloud TPU VM, the TPU software version specifies the version of the TPU runtime to install. For more information, see TPU software versions.
Click the Enable queueing toggle.
In the Queued resource name field, enter a name for your queued
resource request.
Expand the Management section.
Select the Use existing reservation checkbox.
Click Create to create your queued resource request.
Request a TPU Spot VM queued resource
A Spot VM is a resource that can be preempted and assigned to another workload
at any time. Spot VM resources cost less, and you might get access to resources
sooner compared to a non-Spot VM request. For more information about TPU
Spot VMs, see Manage TPU Spot VMs.
gcloud
    gcloud compute tpus queued-resources create your-queued-resource-id \
        --node-id your-node-id \
        --project your-project-id \
        --zone us-central1-a \
        --accelerator-type v5litepod-8 \
        --runtime-version v2-alpha-tpuv5-lite \
        --spot
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The ID of the project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
spot: A boolean flag specifying that the queued resource is a Spot VM.
curl
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -d '{
          "tpu": {
            "node_spec": {
              "parent": "projects/your-project-number/locations/us-central1-a",
              "node_id": "your-node-id",
              "node": {
                "accelerator_type": "v5litepod-8",
                "runtime_version": "v2-alpha-tpuv5-lite"
              }
            }
          },
          "spot": {}
        }' \
        "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-central1-a/queuedResources?queued_resource_id=your-queued-resource-id"
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The ID of the project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
spot: An empty object specifying that the queued resource is a Spot VM.
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click Create TPU.
In the Name field, enter a name for your TPU.
In the Zone box, select the zone where you want to create the TPU.
In the TPU type box, select an accelerator type. The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
In the TPU software version box, select a software version. When creating a Cloud TPU VM, the TPU software version specifies the version of the TPU runtime to install. For more information, see TPU software versions.
Click the Enable queueing toggle.
In the Queued resource name field, enter a name for your queued
resource request.
Expand the Management section.
Select the Make this a TPU Spot VM checkbox.
Click Create.
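In the REST examples in this document, the on-demand, reservation, and Spot requests differ only in one top-level field: a reservation request adds guaranteed with reserved set to true, and a Spot request adds an empty spot object. The following Python sketch captures that difference. The function name apply_tier and the tier labels "on-demand", "reserved", and "spot" are hypothetical names chosen for illustration.

```python
def apply_tier(body: dict, tier: str) -> dict:
    """Return a copy of a request body with the field for the given tier.

    Hypothetical helper. The field names come from the curl examples in
    this document; the tier labels are illustrative only.
    """
    result = dict(body)  # shallow copy so the caller's dict is unchanged
    if tier == "reserved":
        result["guaranteed"] = {"reserved": True}
    elif tier == "spot":
        result["spot"] = {}
    elif tier != "on-demand":
        raise ValueError(f"unknown tier: {tier!r}")
    return result


base = {"tpu": {"node_spec": {"node_id": "your-node-id"}}}
print(apply_tier(base, "reserved"))
print(apply_tier(base, "spot"))
```

Keeping the tier logic in one place makes it harder to send a request that sets both fields by accident.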
Request a queued resource to be allocated before or after a specified time
You can specify an optional start time or end time in a queued resource
request. The start time or start duration specifies the earliest time at which
the request can be filled. The end time or end duration specifies how long the
request remains valid. If a request hasn't been filled by the specified end time
or within the specified duration, the request expires. After the request has
expired, it remains in the queue but is no longer eligible for allocation.
You can also specify an allocation interval by specifying a start time or
duration and an end time or duration.
For a list of supported timestamp and duration formats, see Datetime.
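The start and end rules described above can be expressed as a small decision function. The following Python sketch is only a model of the documented behavior for reasoning about your own requests; it is not how the Cloud TPU service is implemented. The name allocation_status and its return labels are hypothetical, and whether a request is still eligible exactly at the end time is an assumption (the sketch treats it as expired).

```python
from datetime import datetime, timezone
from typing import Optional


def allocation_status(now: datetime,
                      start: Optional[datetime] = None,
                      end: Optional[datetime] = None) -> str:
    """Classify a request relative to its optional allocation interval."""
    if end is not None and now >= end:
        # Expired requests stay in the queue but can no longer be filled.
        return "expired"
    if start is not None and now < start:
        # The request can't be filled before its start time.
        return "not-yet-eligible"
    return "eligible"


start = datetime(2022, 12, 10, 14, 30, tzinfo=timezone.utc)
end = datetime(2022, 12, 14, 9, 0, tzinfo=timezone.utc)
print(allocation_status(datetime(2022, 12, 12, tzinfo=timezone.utc), start, end))
```

Both bounds are optional, which matches the three cases covered in the following sections: start only, end only, or a full interval.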
Request a queued resource to be fulfilled after a specified time
In a queued resource request, you can specify a time or duration after which a
resource should be allocated.
gcloud
The following command requests a v5p-4096 TPU to be allocated after 9:00 AM on
December 14, 2022.
    gcloud compute tpus queued-resources create your-queued-resource-id \
        --node-id your-node-id \
        --project your-project-id \
        --zone us-east5-a \
        --accelerator-type v5p-4096 \
        --runtime-version v2-alpha-tpuv5 \
        --valid-after-time 2022-12-14T09:00:00Z
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The Google Cloud project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
valid-after-time: The time after which the resource should be allocated. For more information about timestamp formats, see Google Cloud CLI topic datetime.
curl
The following command requests a v5p-4096 TPU to be allocated after 9:00 AM on
December 14, 2022. The time is expressed in seconds since the Unix epoch.
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -d '{
          "tpu": {
            "node_spec": {
              "parent": "projects/your-project-number/locations/us-east5-a",
              "node_id": "your-node-id",
              "node": {
                "accelerator_type": "v5p-4096",
                "runtime_version": "v2-alpha-tpuv5"
              }
            }
          },
          "queueing_policy": {
            "valid_after_time": {
              "seconds": 1671008400
            }
          }
        }' \
        "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-east5-a/queuedResources?queued_resource_id=your-queued-resource-id"
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The Google Cloud project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
valid-after-time: The time after which the resource should be allocated. For more information about timestamp formats, see Google Cloud CLI topic datetime.
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click Create TPU.
In the Name field, enter a name for your TPU.
In the Zone box, select the zone where you want to create the TPU.
In the TPU type box, select an accelerator type. The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
In the TPU software version box, select a software version. When creating a Cloud TPU VM, the TPU software version specifies the version of the TPU runtime to install. For more information, see TPU software versions.
Click the Enable queueing toggle.
In the Queued resource name field, enter a name for your queued
resource request.
In the Start request on field, enter the time after which the
resource should be allocated.
Click Create to create your queued resource request.
The following example requests a v5p-32 to be allocated after six hours.
Note: Specifying a duration for the start time of the queued resource request is
not supported in the Google Cloud console.
gcloud
    gcloud compute tpus queued-resources create your-queued-resource-id \
        --node-id your-node-id \
        --project your-project-id \
        --zone us-east5-a \
        --accelerator-type v5p-32 \
        --runtime-version v2-alpha-tpuv5 \
        --valid-after-duration 6h
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The Google Cloud project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
valid-after-duration: The duration before which the TPU must not be provisioned. For more information on duration formats, see Google Cloud CLI topic datetime.
curl
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -d '{
          "tpu": {
            "node_spec": {
              "parent": "projects/your-project-number/locations/us-east5-a",
              "node_id": "your-node-id",
              "node": {
                "accelerator_type": "v5p-32",
                "runtime_version": "v2-alpha-tpuv5"
              }
            }
          },
          "queueing_policy": {
            "valid_after_duration": {
              "seconds": 21600
            }
          }
        }' \
        "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-east5-a/queuedResources?queued_resource_id=your-queued-resource-id"
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The Google Cloud project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
valid-after-duration: The duration before which the TPU must not be provisioned. For more information on duration formats, see Google Cloud CLI topic datetime.
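The gcloud example expresses the delay as 6h, while the curl example expresses the same delay as 21600 seconds. The following Python sketch converts between the two. It handles only a simplified subset of duration strings (whole numbers followed by d, h, m, or s), which is an assumption for illustration and not the full format that the Google Cloud CLI accepts. The function name duration_to_seconds is hypothetical.

```python
import re

# Seconds per unit for the simplified duration syntax handled here.
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}
_TOKEN = re.compile(r"(\d+)([dhms])")


def duration_to_seconds(text: str) -> int:
    """Convert a string such as '6h' or '5h30m' to a number of seconds."""
    position = 0
    total = 0
    for match in _TOKEN.finditer(text):
        if match.start() != position:
            # Something between tokens did not match the expected syntax.
            raise ValueError(f"unsupported duration: {text!r}")
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        position = match.end()
    if position == 0 or position != len(text):
        raise ValueError(f"unsupported duration: {text!r}")
    return total


print(duration_to_seconds("6h"))     # 21600
print(duration_to_seconds("5h30m"))  # 19800
```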
Request a queued resource that expires after a specified time
In a queued resource request, you can specify how long a queued resource request
remains valid. If the request hasn't been fulfilled by the time or duration you
specify, the request expires.
gcloud
The following command requests a v5p-4096 TPU. If the request isn't fulfilled by
December 14, 2022 at 9:00 AM, the request expires.
    gcloud compute tpus queued-resources create your-queued-resource-id \
        --node-id your-node-id \
        --project your-project-id \
        --zone us-east5-a \
        --accelerator-type v5p-4096 \
        --runtime-version v2-alpha-tpuv5 \
        --valid-until-time 2022-12-14T09:00:00Z
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The ID of the project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
valid-until-time: The time after which the request is canceled. For more information about timestamp formats, see Google Cloud CLI topic datetime.
curl
The following command requests a v5p-4096 TPU. If the request isn't fulfilled by
December 14, 2022 at 9:00 AM, the request expires. The time is expressed in
seconds since the Unix epoch.
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -d '{
          "tpu": {
            "node_spec": {
              "parent": "projects/your-project-number/locations/us-east5-a",
              "node_id": "your-node-id",
              "node": {
                "accelerator_type": "v5p-4096",
                "runtime_version": "v2-alpha-tpuv5"
              }
            }
          },
          "queueing_policy": {
            "valid_until_time": {
              "seconds": 1671008400
            }
          }
        }' \
        "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-east5-a/queuedResources?queued_resource_id=your-queued-resource-id"
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The ID of the project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
valid-until-time: The time after which the request is canceled. For more information about timestamp formats, see Google Cloud CLI topic datetime.
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click Create TPU.
In the Name field, enter a name for your TPU.
In the Zone box, select the zone where you want to create the TPU.
In the TPU type box, select an accelerator type. The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
In the TPU software version box, select a software version. When creating a Cloud TPU VM, the TPU software version specifies the version of the TPU runtime to install. For more information, see TPU software versions.
Click the Enable queueing toggle.
In the Queued resource name field, enter a name for your queued
resource request.
In the Cancel request on field, enter the time when the queued
resource request should expire if not filled.
Click Create to create your queued resource request.
The following example requests a v5p-32. The request expires if it's not filled
in six hours.
Note: Specifying a duration for the time the request expires is not supported in
the Google Cloud console.
gcloud
    gcloud compute tpus queued-resources create your-queued-resource-id \
        --node-id your-node-id \
        --project your-project-id \
        --zone us-east5-a \
        --accelerator-type v5p-32 \
        --runtime-version v2-alpha-tpuv5 \
        --valid-until-duration 6h
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The Google Cloud project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
valid-until-duration: The duration for which the request is valid. For more information on duration formats, see Google Cloud CLI topic datetime.
curl
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -d '{
          "tpu": {
            "node_spec": {
              "parent": "projects/your-project-number/locations/us-east5-a",
              "node_id": "your-node-id",
              "node": {
                "accelerator_type": "v5p-32",
                "runtime_version": "v2-alpha-tpuv5"
              }
            }
          },
          "queueing_policy": {
            "valid_until_duration": {
              "seconds": 21600
            }
          }
        }' \
        "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-east5-a/queuedResources?queued_resource_id=your-queued-resource-id"
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The Google Cloud project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
valid-until-duration: The duration for which the request is valid. For more information on duration formats, see Google Cloud CLI topic datetime.
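The gcloud examples in this section take an expiration time as a UTC timestamp string, while the curl examples take a seconds value counted from the Unix epoch. A mismatch between the two is easy to introduce by hand, so it can help to convert in code. The following Python sketch uses only the standard library; the function names are hypothetical, and it assumes timestamps in the exact form YYYY-MM-DDTHH:MM:SSZ.

```python
from datetime import datetime, timezone

_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def utc_timestamp_to_epoch_seconds(timestamp: str) -> int:
    """Convert a UTC timestamp such as '2022-12-14T09:00:00Z' to epoch seconds."""
    parsed = datetime.strptime(timestamp, _FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def epoch_seconds_to_utc_timestamp(seconds: int) -> str:
    """Convert epoch seconds back to a UTC timestamp string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(_FORMAT)


print(utc_timestamp_to_epoch_seconds("2022-12-14T09:00:00Z"))  # 1671008400
print(epoch_seconds_to_utc_timestamp(1671008400))  # 2022-12-14T09:00:00Z
```

Converting a seconds value back to a timestamp is a quick way to confirm that a request body expires when you expect it to.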
Request a queued resource to be allocated within a specified interval
You can specify an allocation interval by specifying both the start time or
duration and end time or duration.
gcloud
The following command requests a v5p-32 to be allocated no sooner than 5 hours
and 30 minutes from the current time and no later than December 14, 2022 at
9:00 AM.
    gcloud compute tpus queued-resources create your-queued-resource-id \
        --node-id your-node-id \
        --project your-project-id \
        --zone us-east5-a \
        --accelerator-type v5p-32 \
        --runtime-version v2-alpha-tpuv5 \
        --valid-after-duration 5h30m \
        --valid-until-time 2022-12-14T09:00:00Z
Command flag descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The ID of the project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
valid-until-time: The time after which the request is canceled. For more information about timestamp formats, see Google Cloud CLI topic datetime.
valid-after-duration: The duration before which the TPU must not be provisioned. For more information on duration formats, see Google Cloud CLI topic datetime.
curl
The following command requests a v5p-32 to be allocated no sooner than
December 10, 2022 at 2:30 PM UTC and no later than December 14, 2022 at
9:00 AM UTC.
    curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -d '{
          "tpu": {
            "node_spec": {
              "parent": "projects/your-project-number/locations/us-east5-a",
              "node_id": "your-node-id",
              "node": {
                "accelerator_type": "v5p-32",
                "runtime_version": "v2-alpha-tpuv5"
              }
            }
          },
          "queueing_policy": {
            "validInterval": {
              "startTime": "2022-12-10T14:30:00Z",
              "endTime": "2022-12-14T09:00:00Z"
            }
          }
        }' \
        "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-east5-a/queuedResources?queued_resource_id=your-queued-resource-id"
Command flag descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The ID of the project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
validInterval: The interval during which the request is valid. startTime is the time after which the resource should be allocated, and endTime is the time after which the request is canceled. For more information about timestamp formats, see Google Cloud CLI topic datetime.
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click Create TPU.
In the Name field, enter a name for your TPU.
In the Zone box, select the zone where you want to create the TPU.
In the TPU type box, select an accelerator type. The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
In the TPU software version box, select a software version. When creating a Cloud TPU VM, the TPU software version specifies the version of the TPU runtime to install. For more information, see TPU software versions.
Click the Enable queueing toggle.
In the Queued resource name field, enter a name for your queued
resource request.
In the Start request on field, enter the time after which the
resource should be allocated.
In the Cancel request on field, enter the time when the queued
resource request should expire if not filled.
Click Create to create your queued resource request.
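When a request body needs absolute start and end times but you think in terms of "start in 5 hours and 30 minutes", the interval can be computed from the current time. The following Python sketch builds a validInterval object with the startTime and endTime field names used by the curl example. The function name build_valid_interval is hypothetical, and rejecting an interval whose start is not before its end is a sanity check added here, not a documented service rule.

```python
from datetime import datetime, timedelta, timezone

_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_valid_interval(now: datetime, start_after: timedelta,
                         end_time: datetime) -> dict:
    """Build a validInterval object from a relative start and absolute end.

    Both datetimes must be timezone-aware; they are rendered in UTC.
    """
    start_time = now + start_after
    if start_time >= end_time:
        raise ValueError("the interval start must be earlier than its end")
    return {
        "validInterval": {
            "startTime": start_time.astimezone(timezone.utc).strftime(_FORMAT),
            "endTime": end_time.astimezone(timezone.utc).strftime(_FORMAT),
        }
    }


now = datetime(2022, 12, 10, 9, 0, tzinfo=timezone.utc)
end = datetime(2022, 12, 14, 9, 0, tzinfo=timezone.utc)
print(build_valid_interval(now, timedelta(hours=5, minutes=30), end))
```

With a current time of 9:00 AM UTC on December 10, 2022, the computed start time is 2022-12-10T14:30:00Z.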
Request a queued resource with a startup script
You can specify a script to be run on a queued resource after it has been
provisioned.
gcloud
When using the gcloud command, you can use either the --metadata or
--metadata-from-file flag to specify a script command or a file containing the
script code, respectively. The following example creates a queued resource
request that runs the startup-script.sh script.
    gcloud compute tpus queued-resources create your-queued-resource-id \
        --node-id your-node-id \
        --project your-project-id \
        --zone us-central1-a \
        --accelerator-type v5litepod-8 \
        --runtime-version v2-alpha-tpuv5-lite \
        --metadata-from-file='startup-script=startup-script.sh'
Command flag descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-defined ID of the TPU created in response to the request.
project: The ID of the project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
metadata-from-file: Specifies a file that contains metadata. If you don't specify a fully qualified path to the metadata file, the command assumes it is located in the current directory. In this example, the file contains a startup script that is run when the queued resource is provisioned.
metadata: Specifies metadata for the request. Use this flag instead of metadata-from-file to pass a startup script command inline.
curl
When using curl
, you must include the script code in the JSON content.
The following example includes an inline script in the JSON body.
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "{
      'tpu': {
        'node_spec': {
          'parent': 'projects/your-project-number/locations/us-central1-a',
          'node_id': 'your-node-id',
          'node': {
            'accelerator_type': 'v5e-8',
            'runtime_version': 'v2-alpha-tpuv5-lite',
            'metadata': {
              'startup-script': '#! /bin/bash\npwd > /tmp/out.txt\nwhoami >> /tmp/out.txt'
            }
          }
        }
      },
      'queueing_policy': {
        'validInterval': {
          'startTime': '2022-12-10T14:30:00Z',
          'endTime': '2022-12-14T09:00:00Z'
        }
      }
    }" \
    "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-central1-a/queuedResources?queued_resource_id=your-queued-resource-id"
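Embedding a multi-line script in a hand-written JSON string makes quoting mistakes easy. The following is a minimal Python sketch that builds a body shaped like the curl example above and lets json.dumps handle the escaping. The function name build_create_body and its parameters are hypothetical, and the sketch assumes the valid interval is only sent when both a start and an end time are given.

```python
import json


def build_create_body(project_number, zone, node_id, accelerator_type,
                      runtime_version, startup_script,
                      start_time=None, end_time=None):
    """Build a request body shaped like the curl example (hypothetical helper)."""
    body = {
        "tpu": {
            "node_spec": {
                "parent": f"projects/{project_number}/locations/{zone}",
                "node_id": node_id,
                "node": {
                    "accelerator_type": accelerator_type,
                    "runtime_version": runtime_version,
                    # The script is stored as a plain string; json.dumps
                    # escapes its newlines when the body is serialized.
                    "metadata": {"startup-script": startup_script},
                },
            }
        }
    }
    # Assumption: only attach the interval when both ends are provided.
    if start_time and end_time:
        body["queueing_policy"] = {
            "validInterval": {"startTime": start_time, "endTime": end_time}
        }
    return body


script = "#! /bin/bash\npwd > /tmp/out.txt\nwhoami >> /tmp/out.txt"
payload = json.dumps(build_create_body(
    "your-project-number", "us-central1-a", "your-node-id",
    "v5e-8", "v2-alpha-tpuv5-lite", script,
    start_time="2022-12-10T14:30:00Z", end_time="2022-12-14T09:00:00Z"))
print(payload)
```

The printed string can be saved to a file and passed to curl with -d @file, which avoids nesting quotes inside a shell string.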
Request a queued resource with a specified network and subnetwork
In a queued resource request, you can specify a network and subnetwork that
you want to connect your TPU to.
gcloud
gcloud compute tpus queued-resources create your-queued-resource-id \
    --node-id your-node-id \
    --project your-project-id \
    --zone us-central1-a \
    --accelerator-type v5e-8 \
    --runtime-version v2-alpha-tpuv5-lite \
    --network network-name \
    --subnetwork subnetwork-name
Command parameter descriptions
queued-resource-id: The user-assigned ID of the queued resource request.
node-id: The user-assigned ID of the TPU which is created when the queued resource request is allocated.
project: Your Google Cloud project.
zone: The zone where you plan to create your Cloud TPU.
accelerator-type: The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
runtime-version: The Cloud TPU software version.
reserved: Use this flag when requesting queued resources as part of a Cloud TPU reservation.
network: A network that the queued resource will be a part of.
subnetwork: A subnetwork that the queued resource will be a part of.
curl
curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "{
      'tpu': {
        'node_spec': {
          'parent': 'projects/your-project-number/locations/us-central1-a',
          'node_id': 'your-node-id',
          'node': {
            'accelerator_type': 'v5e-8',
            'runtime_version': 'v2-alpha-tpuv5-lite',
            'network_config': {
              'network': 'network-name',
              'subnetwork': 'subnetwork-name',
              'enable_external_ips': true
            }
          }
        }
      },
      'guaranteed': {
        'reserved': true
      }
    }" \
    "https://tpu.googleapis.com/v2alpha1/projects/your-project-id/locations/us-central1-a/queuedResources?queued_resource_id=your-queued-resource-id"
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click Create TPU.
In the Name field, enter a name for your TPU.
In the Zone box, select the zone where you want to create the TPU.
In the TPU type box, select an accelerator type. The accelerator type specifies the version and size of the Cloud TPU you want to create. For more information about supported accelerator types for each TPU version, see TPU versions.
In the TPU software version box, select a software version. When creating a Cloud TPU VM, the TPU software version specifies the version of the TPU runtime to install. For more information, see TPU software versions.
Click the Enable queueing toggle.
In the Queued resource name field, enter a name for your queued resource request.
Expand the Network section.
In the Network and Subnetwork fields, select the network and subnetwork you want to use.
Click Create to create your queued resource request.
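When the request body is assembled in code instead of by hand, the network settings from the curl example in this section can be layered onto an existing node definition. The sketch below is illustrative only: with_network is a hypothetical helper, and the field names simply mirror the network_config block shown in that curl request.

```python
def with_network(node, network, subnetwork, enable_external_ips=True):
    """Return a copy of a node dict with a network_config block added.

    Hypothetical helper; field names mirror the curl example in this section.
    """
    updated = dict(node)  # shallow copy so the caller's dict is not modified
    updated["network_config"] = {
        "network": network,
        "subnetwork": subnetwork,
        "enable_external_ips": enable_external_ips,
    }
    return updated


base_node = {
    "accelerator_type": "v5e-8",
    "runtime_version": "v2-alpha-tpuv5-lite",
}
networked_node = with_network(base_node, "network-name", "subnetwork-name")
print(networked_node)
```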
Delete a queued resource request
You can delete a queued resource request and the TPU associated with the request
by deleting the queued resource request:
gcloud
Pass the --force flag to the queued-resources delete command:
Note: The command can take two to five minutes to complete. You can run the command asynchronously by passing the --async flag to the gcloud command, as shown in the following command.
gcloud compute tpus queued-resources delete your-queued-resource-id \
    --project your-project-id \
    --zone us-central1-a \
    --force \
    --async
Command flag descriptions
your-queued-resource-id: The user-assigned ID of the queued resource request.
project: The Google Cloud project where the queued resource is allocated.
zone: The zone of the Cloud TPU to delete.
force: Delete both the TPU VM and the queued resource request.
curl
Use the query parameter force=true in your curl request:
curl -X DELETE -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://tpu.googleapis.com/v2/projects/your-project-id/locations/us-central1-a/queuedResources/your-queued-resource-id?force=true"
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click the Queued resources tab.
Select the checkbox next to your queued resource request.
Click Delete.
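The forced delete shown in the curl request above differs from a plain delete only by the force=true query parameter. As a rough sketch, the URL and an unsent DELETE request could be built in Python with the standard library as follows. The helper names queued_resource_url and build_delete_request are hypothetical, and ACCESS_TOKEN is a placeholder for the output of gcloud auth print-access-token.

```python
import urllib.request
from urllib.parse import quote, urlencode

API_ROOT = "https://tpu.googleapis.com/v2"


def queued_resource_url(project_id, zone, queued_resource_id, force=False):
    """Build the REST URL for one queued resource, optionally with force=true."""
    parts = [quote(p, safe="") for p in (project_id, zone, queued_resource_id)]
    url = (f"{API_ROOT}/projects/{parts[0]}/locations/{parts[1]}"
           f"/queuedResources/{parts[2]}")
    if force:
        url += "?" + urlencode({"force": "true"})
    return url


def build_delete_request(url, access_token):
    """Return an unsent DELETE request; urllib.request.urlopen would send it."""
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {access_token}"},
    )


url = queued_resource_url("your-project-id", "us-central1-a",
                          "your-queued-resource-id", force=True)
request = build_delete_request(url, "ACCESS_TOKEN")  # placeholder token
print(request.get_method(), request.full_url)
```

Nothing is sent until the request object is passed to urllib.request.urlopen, so the URL can be inspected first.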
If you delete the TPU directly, you also need to delete the queued resource, as shown in the following example. When you delete the TPU, the queued resource request transitions to the SUSPENDED state, after which the queued resource request can be deleted.
gcloud
Delete the TPU:
gcloud compute tpus tpu-vm delete your-node-id \
    --project=your-project-id \
    --zone=us-central1-a \
    --quiet
Command flag descriptions
project: The Google Cloud project where the queued resource is allocated.
zone: The zone of the Cloud TPU to delete.
your-node-id: The name of the TPU you want to delete.
When you delete your TPU, the associated queued resource goes into the SUSPENDING state, then the SUSPENDED state. When your queued resource is in the SUSPENDED state, you can delete it:
gcloud compute tpus queued-resources delete your-queued-resource-id \
    --project your-project-id \
    --zone us-central1-a
Command flag descriptions
your-queued-resource-id: The user-assigned ID of the queued resource request.
project: The Google Cloud project where the queued resource is allocated.
zone: The zone of the Cloud TPU to delete.
curl
Delete the TPU:
curl -X DELETE -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://tpu.googleapis.com/v2/projects/your-project-id/locations/us-central1-a/nodes/your-node-id
Command flag descriptions
project: The Google Cloud project where the queued resource is allocated.
zone: The zone of the Cloud TPU to delete.
your-node-id: The name of the TPU you want to delete.
When you delete your TPU, the associated queued resource goes into the SUSPENDING state, then the SUSPENDED state. When your queued resource is in the SUSPENDED state, you can delete it:
curl -X DELETE -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://tpu.googleapis.com/v2/projects/your-project-id/locations/us-central1-a/queuedResources/your-queued-resource-id
Console
Delete your TPU:
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Select the checkbox next to your TPU.
Click Delete.
When you delete your TPU, the associated queued resource goes into the Suspending state, then the Suspended state. When your queued resource is in the Suspended state, you can delete it:
Click the Queued resources tab.
Select the checkbox next to your queued resource request.
Click Delete.
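Because the queued resource passes through SUSPENDING before it reaches SUSPENDED, a script that deletes the TPU directly needs to wait before issuing the second delete. The following is a minimal polling sketch under stated assumptions: wait_for_state is a hypothetical helper, get_state stands in for whatever call returns the current state string (for example, a describe request), and the timeout and poll interval are arbitrary defaults. The demo uses a canned sequence of states instead of a real API call.

```python
import time


def wait_for_state(get_state, target="SUSPENDED", timeout_s=600, poll_s=10,
                   sleep=time.sleep):
    """Poll get_state() until it returns target.

    Returns the distinct states observed, in order. Raises TimeoutError if
    the target is not reached within roughly timeout_s seconds.
    """
    seen = []
    waited = 0
    while True:
        state = get_state()
        if not seen or seen[-1] != state:
            seen.append(state)  # record each transition once
        if state == target:
            return seen
        if waited >= timeout_s:
            raise TimeoutError(f"still {state} after {waited}s")
        sleep(poll_s)
        waited += poll_s


# Demo with canned states; a real caller would query the API instead.
fake_states = iter(["SUSPENDING", "SUSPENDING", "SUSPENDED"])
history = wait_for_state(lambda: next(fake_states), sleep=lambda _: None)
print(history)
```

Once the helper returns, the queued resource request can be deleted without the force option.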
Retrieve the state and diagnostic information about a queued resource request:
gcloud
gcloud compute tpus queued-resources describe queued-resource-request-id \
    --project your-project-id \
    --zone us-central1-a
Command flag descriptions
queued-resource-request-id: The user-assigned ID of the queued resource request.
project: The ID of the project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
curl
curl -X GET -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://tpu.googleapis.com/v2/projects/your-project-id/locations/us-central1-a/queuedResources/your-queued-resource-id
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click the Queued resources tab.
Click the name of your queued resource request.
After your TPU has been provisioned, you can also view details about your queued resource request by going to the TPUs page, finding your TPU, and clicking on the name of the corresponding queued resource request.
If the request fails, the output will contain error information. For a request
that is waiting for resources, the output looks similar to the following:
gcloud
name: projects/your-project-id/locations/us-central1-a/queuedResources/your-queued-resource-id
state:
  state: WAITING_FOR_RESOURCES
tpu:
  nodeSpec:
  - node:
      acceleratorType: v4-8
      bootDisk: {}
      networkConfig:
        enableExternalIps: true
      queuedResource: projects/your-project-number/locations/us-central1-a/queuedResources/your-queued-resource-id
      runtimeVersion: v2-alpha-tpuv5-lite
      schedulingConfig: {}
      serviceAccount: {}
      shieldedInstanceConfig: {}
      useTpuVm: true
    nodeId: your-node-id
    parent: projects/your-project-number/locations/us-central1-a
Console
The Queued resource status field displays Waiting for resources.
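Scripts usually need only the state and the node IDs from the describe output. The sketch below assumes the same structure as the gcloud output above, loaded into a Python dict (for example, from the JSON returned by the curl request, or from gcloud with the --format=json flag). The summarize function and the UNKNOWN fallback are hypothetical.

```python
def summarize(queued_resource):
    """Return (state, node_ids) from a describe-style response dict.

    Assumes the nested layout shown in the gcloud output above:
    state.state and tpu.nodeSpec[].nodeId.
    """
    state = queued_resource.get("state", {}).get("state", "UNKNOWN")
    node_specs = queued_resource.get("tpu", {}).get("nodeSpec", [])
    node_ids = [spec.get("nodeId") for spec in node_specs]
    return state, node_ids


# Sample mirroring the output above, trimmed to the fields that are read.
sample = {
    "name": "projects/your-project-id/locations/us-central1-a/"
            "queuedResources/your-queued-resource-id",
    "state": {"state": "WAITING_FOR_RESOURCES"},
    "tpu": {
        "nodeSpec": [
            {
                "node": {"acceleratorType": "v4-8"},
                "nodeId": "your-node-id",
                "parent": "projects/your-project-number/locations/us-central1-a",
            }
        ]
    },
}
summary = summarize(sample)
print(summary)
```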
List queued resource requests in your project
List the queued resource requests in your project:
gcloud
gcloud compute tpus queued-resources list --project your-project-id \
    --zone us-central1-a
Command flag descriptions
project: The Google Cloud project where the queued resource is allocated.
zone: The zone where you plan to create your Cloud TPU.
curl
curl -X GET -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    https://tpu.googleapis.com/v2/projects/your-project-id/locations/your-zone/queuedResources
Console
In the Google Cloud console, go to the TPUs page:
Go to TPUs
Click the Queued resources tab.
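When listing through the REST API from code, a project with many requests may return results across several pages. The following sketch is an assumption-laden illustration, not taken from this page: it assumes the list response carries a queuedResources array and a nextPageToken field, as is conventional for Google Cloud list methods, and fetch_page is a hypothetical callable that performs the GET request for a given page token. The demo substitutes canned pages for real responses.

```python
def list_all(fetch_page):
    """Collect items from every page returned by fetch_page(page_token).

    Assumes each page is a dict with an optional 'queuedResources' list and
    an optional 'nextPageToken' string (assumed field names).
    """
    items = []
    token = None
    while True:
        page = fetch_page(token)
        items.extend(page.get("queuedResources", []))
        token = page.get("nextPageToken")
        if not token:
            return items


# Demo with canned pages keyed by page token; None is the first request.
fake_pages = {
    None: {"queuedResources": [{"name": "qr-1"}], "nextPageToken": "t1"},
    "t1": {"queuedResources": [{"name": "qr-2"}]},
}
all_names = [qr["name"] for qr in list_all(lambda token: fake_pages[token])]
print(all_names)
```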
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-05-29 UTC.