This document lists the Compute Engine rate quotas, which define the number of requests you can make to Compute Engine API methods.
Rate quotas
Rate quotas (also known as API rate limits or API quotas) define the number of requests that can be made to the Compute Engine API. These quotas apply on a per-project basis. Each quota applies to a group of one or more Compute Engine API methods. When you use the gcloud CLI or the Google Cloud console, you are also making requests to the API, and these requests count towards your rate quota. Requests made through service accounts also count towards your rate quota.
Google enforces rate quotas at a per-minute (60 seconds) interval for each group. That means if your project reaches the maximum number of API requests anytime within 60 seconds, you must wait for that rate quota to refill before making more requests in that group. If your project exceeds a rate quota, you receive a 403 error with the reason rateLimitExceeded. To resolve this error, wait a minute then try your request again; the quota should be refilled at the start of the next interval.
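The wait-and-retry guidance above can be sketched in Python as follows. This is a minimal illustration, not an official client pattern: the RateLimitExceeded exception and the call_with_retry helper are hypothetical names standing in for however your client surfaces a 403 response with the reason rateLimitExceeded, and the 60-second wait simply mirrors the per-minute interval described above.

```python
import time


class RateLimitExceeded(Exception):
    """Hypothetical stand-in for a 403 response with reason rateLimitExceeded."""


def call_with_retry(request, max_attempts=3, wait_seconds=60.0, sleep=time.sleep):
    """Call request(); on RateLimitExceeded, wait one quota interval and retry.

    request is any zero-argument callable that performs the API call.
    sleep is injectable so the helper can be exercised without real waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RateLimitExceeded:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            sleep(wait_seconds)  # wait for the per-minute quota to refill
```

Bounding the number of attempts keeps a persistently throttled caller from retrying forever; the right bound depends on your workload.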
The number of requests that you can make to the Compute Engine API is defined by API quotas as described in the following tables. Each group is counted separately, so you can achieve the maximum quota in each group simultaneously.
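To illustrate what "each group is counted separately" means, here is a small client-side sketch in Python that tracks usage per group in 60-second windows. It is only a model for reasoning about your own request budget: the PerGroupMinuteCounter class is hypothetical, the limits passed to it are made-up numbers (this document does not state the limit values), and aligning windows to multiples of 60 seconds is an assumption. Actual enforcement happens on Google's side.

```python
class PerGroupMinuteCounter:
    """Counts requests per quota group in fixed 60-second windows."""

    def __init__(self, limits):
        # limits maps a group name to a hypothetical requests-per-minute limit.
        self._limits = dict(limits)
        self._windows = {}  # group -> (window index, requests used)

    def try_acquire(self, group, now_seconds):
        """Return True and count the request if the group still has budget."""
        window = int(now_seconds // 60)
        seen_window, used = self._windows.get(group, (window, 0))
        if seen_window != window:
            used = 0  # a new interval started, so the budget is refilled
        if used >= self._limits[group]:
            return False
        self._windows[group] = (window, used + 1)
        return True
```

For example, exhausting a group keyed as compute.googleapis.com/global_reads in this model leaves a group keyed as compute.googleapis.com/global_writes unaffected, because each key has its own counter.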
Rate quotas for global methods
The following sections list the quotas that define the number of requests you can make to the global Compute Engine API methods.
Simplified API quotas for global methods
To improve the discoverability and manageability of Compute Engine API quota, Google Cloud has reduced the number of quota metrics by consolidating metrics into fewer quotas. These simplified quota metrics also offer higher limits for each quota.
To view a complete list of global methods and the quota metrics that track the cost of each method, see Compute Engine API quota metrics reference.
The following table lists all simplified API quota metrics for Compute Engine API:
Quota | Quota description | Metric name |
---|---|---|
Read requests per minute (GlobalReadsPerMinutePerProject) | Quota for global get and list methods. | compute.googleapis.com/global_reads |
Write requests per minute (GlobalWritesPerMinutePerProject) | Quota for global write methods that are not included under other quota metrics. | compute.googleapis.com/global_writes |
List usable requests per minute (GlobalListUsablePerMinutePerProject) | Quota for global ListUsable methods. | compute.googleapis.com/list_usable_requests |
Cache invalidation requests per minute (GlobalCacheInvalidationRequestsPerMinutePerProject) | Quota for global UrlMapsService.InvalidateCache methods. | compute.googleapis.com/global_cache_invalidation_requests |
Filtered list cost overhead (ListRequestsFilterCostOverheadPerMinutePerProject) | Quota for … Google Cloud charges quota usage for this metric in addition to the quotas against the … Rate per project: 750k resources filtered out of the list requests per region per minute. As the quota is charged for every 10k resources, you see the limit as 75 (750k/10k) when you query for this quota limit on the Google Cloud console. | compute.googleapis.com/filtered_list_cost_overhead |
Requests per minute (GlobalRequestsPerMinutePerProject) | Quota for all global read and write requests. | compute.googleapis.com/global_requests |

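The arithmetic behind the filtered list cost overhead limit (750k resources shown as 75 in the console) can be checked with a short Python sketch. The function names are hypothetical, and the use of floor division is an assumption: this document says the quota is charged for every 10k resources but does not say how a partial block of 10k is rounded.

```python
RESOURCES_PER_UNIT = 10_000          # quota is charged for every 10k resources
LIMIT_RESOURCES_PER_MINUTE = 750_000  # 750k filtered-out resources per minute


def console_limit_units():
    """Limit as displayed in the console: 750k / 10k."""
    return LIMIT_RESOURCES_PER_MINUTE // RESOURCES_PER_UNIT


def overhead_units(filtered_out_resources):
    """Estimated quota units for one filtered list request.

    Assumes only whole blocks of 10k filtered-out resources are charged;
    the actual rounding rule is not stated in this document.
    """
    return filtered_out_resources // RESOURCES_PER_UNIT
```

Under these assumptions, a list request that filters out 25,000 resources would be estimated at 2 units against a per-minute limit of 75 units.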
Legacy API quotas for global methods
The following table lists all Compute Engine API quotas for global methods.
Quota | Quota description | Metric name |
---|---|---|
defaultPerMinutePerProject | Quota for global list and mutation methods that are not included under other quota metrics. The following get methods also use this default metric: networkFirewallPolicies.get, projects.getXpnHost, and projects.getXpnResources. | compute.googleapis.com/default |
ReadRequestsPerMinutePerProject | Quota for global *.get methods. | compute.googleapis.com/read_requests |
ListRequestsPerMinutePerProject | Quota for global *.list methods. | compute.googleapis.com/list_requests |
OperationReadRequestsPerMinutePerProject | Quota for the globalOperations.get method. | compute.googleapis.com/operation_read_requests |
GlobalResourceWriteRequestsPerMinutePerProject | Quota for images.delete, images.deprecate, images.insert, images.setLabels, snapshots.delete, snapshots.insert, snapshots.setLabels, machineImages.insert, and machineImages.delete methods. | compute.googleapis.com/global_resource_write_requests |
HeavyWeightWriteRequestsPerMinutePerProject | Quota for patch, delete, and insert methods for the interconnects resources. | compute.googleapis.com/heavy_weight_write_requests |
HeavyWeightReadRequestsPerMinutePerProject | Quota for *.aggregatedList methods. | compute.googleapis.com/heavy_weight_read_requests |

The following quotas apply to global APIs with per-method quotas:
Quota description | Metric name | Limit |
---|---|---|
Quota for the licenses.insert method. | compute.googleapis.com/license_insert_requests | Quota per project (LicenseInsertRequestsPerMinutePerProject): 2.5 requests/second (150 requests/minute). Quota per day per project (LicenseInsertRequestsPerDayPerProject): 30 requests/day. |
Quota for the projects.setCommonInstanceMetadata method. | compute.googleapis.com/project_set_common_instance_metadata_requests | ProjectSetCommonInstanceMetadataRequestsPerMinutePerProject: 36 requests/minute. |

Rate quotas for regional and zonal methods
The following sections list all quotas that apply to methods that use regional metrics.
Simplified API quotas for regional and zonal methods
The following table lists all simplified quotas for Compute Engine API regional and zonal methods. To view a complete list of regional and zonal methods, and the quota metrics that track the usage of each method, see Compute Engine API quota metrics reference.
Quota | Quota description | Metric name |
---|---|---|
Read requests per minute per region (ReadRequestsPerMinutePerProjectPerRegion) | Quota for regional and zonal `get` and `list` methods. | compute.googleapis.com/reads_per_region |
Write requests per region (WritesPerMinutePerProjectPerRegion) | Quota for regional and zonal write methods that are not included under other quota metrics. | compute.googleapis.com/writes_per_region |
List usable requests per region (ListUsablePerMinutePerProjectPerRegion) | Quota for regional and zonal ListUsable methods. | compute.googleapis.com/list_usable_requests_per_region |
Filtered list cost overhead per region (ListRequestsFilterCostOverheadPerMinutePerProjectPerRegion) | Quota for … Google Cloud charges quota usage for this metric in addition to the quotas against the … Rate per project: 750k resources filtered out of the list requests per region per minute. As the quota is charged for every 10k resources, you see the limit as 75 (750k/10k) when you query for this quota limit on the Google Cloud console. | compute.googleapis.com/filtered_list_cost_overhead_per_region |
Requests per minute per region (RequestsPerMinutePerProjectPerRegion) | Quota for all regional read and write requests. | compute.googleapis.com/requests_per_region |

Legacy API quotas for regional methods
The following table lists all Compute Engine API quotas for regional and zonal methods.
Quota | Quota description | Metric name |
---|---|---|
QueriesPerMinutePerRegion | Quota for regional and zonal methods that create, modify, or delete Compute Engine resources. For example, instances.insert, disks.update, and instances.delete methods. The following get, list, and patch methods also use this default_per_region metric: projects.listXpnHosts, instances.getScreenshot, instances.getGuestAttributes, instances.getShieldedInstanceIdentity, instances.getEffectiveFirewalls, instanceGroupManagers.listManagedInstances, instanceGroupManagers.listErrors, instanceGroupManagers.listPerInstanceConfigs, regionInstanceGroupManagers.listManagedInstances, regionInstanceGroupManagers.listErrors, regionInstanceGroupManagers.listPerInstanceConfigs, and resourcePolicies.patch. | compute.googleapis.com/default_per_region |
ReadRequestsPerMinutePerRegion | Quota for regional and zonal get methods such as autoscalers.get, disks.get, instances.get, and machineTypes.get. | compute.googleapis.com/read_requests_per_region |
ListRequestsPerMinutePerRegion | Quota for regional and zonal list methods such as autoscalers.list, disks.list, instances.list, and machineTypes.list. | compute.googleapis.com/list_requests_per_region |
ListRequestsFilterCostOverheadPerMinutePerProjectPerRegion | Quota for *.list and *.aggregatedList methods with filters. Google Cloud charges quota usage for this metric in addition to the quotas against the compute.googleapis.com/list_requests_per_region and compute.googleapis.com/heavy_weight_read_requests_per_region metrics. You incur quota charges if there are more than 10k resources filtered out of the list requests. Compute Engine API rejects the list requests if you exceed this quota limit. Rate per project: 750k resources filtered out of the list requests per region per minute. As the quota is charged for every 10k resources, you see the limit as 75 (750k/10k) when you query for this quota limit on the Google Cloud console. | compute.googleapis.com/filtered_list_cost_overhead_per_region |
OperationReadRequestsPerMinutePerRegion | Quota for regionOperations.get and zoneOperations.get methods. | compute.googleapis.com/operation_read_requests_per_region |
GlobalResourceWriteRequestsPerMinutePerProjectPerRegion | Quota for disks.createSnapshot and regionDisks.createSnapshot methods. | compute.googleapis.com/global_resource_write_requests_per_region |
GetSerialPortOutputRequestsPerMinutePerProjectPerRegion | Quota for the instances.getSerialPortOutput method. | compute.googleapis.com/get_serial_port_output_requests_per_region |
HeavyWeightReadRequestsPerMinutePerRegion | Quota for regionOperations.wait, zoneOperations.wait, and regionNetworkFirewallPolicies.getEffectiveFirewalls methods. | compute.googleapis.com/heavy_weight_read_requests_per_region |
HeavyWeightWriteRequestsPerMinutePerProjectPerRegion | Quota for patch, delete, and insert methods for the interconnectAttachments resource and for the networks.updatePeering method. | compute.googleapis.com/heavy_weight_write_requests_per_region |
SimulateMaintenanceEventRequestsPerMinutePerProjectPerRegion | Quota for the instances.simulateMaintenanceEvent method. | compute.googleapis.com/simulate_maintenance_event_requests_per_region |
InstanceListReferrersRequestsPerMinutePerProjectPerRegion | Quota for the instances.listReferrers method. | compute.googleapis.com/instance_list_referrers_requests_per_region |
NetworkEndpointWriteRequestsPerMinutePerProjectPerRegion | Quota for networkEndpointGroups.attachNetworkEndpoints and networkEndpointGroups.detachNetworkEndpoints methods. | compute.googleapis.com/network_endpoint_write_requests_per_region |
NetworkEndpointListRequestsPerMinutePerProjectPerRegion | Quota for the networkEndpointGroups.listNetworkEndpoints method. | compute.googleapis.com/network_endpoint_list_requests_per_region |
RegionalNetworkEndpointWriteRequestsPerMinutePerProjectPerRegion | Quota for regionNetworkEndpointGroups.attachNetworkEndpoints and regionNetworkEndpointGroups.detachNetworkEndpoints methods. | compute.googleapis.com/regional_network_endpoint_write_requests_per_region |
RegionalNetworkEndpointListRequestsPerMinutePerProjectPerRegion | Quota for the regionNetworkEndpointGroups.listNetworkEndpoints method. | compute.googleapis.com/regional_network_endpoint_list_requests_per_region |

Request an increase in rate quotas
If you need a higher quota for making API requests, you can request an increase in the API quota from the Google Cloud console. For instructions, see Request a quota adjustment.
Best practices
To mitigate the effects of rate quotas, follow the Compute Engine API best practices for preserving rate quotas.
What's next
- Learn about Monitoring API usage.
- Learn how to set up quota alerts .