This page shows you how to identify and troubleshoot latency issues in your Spanner components. To learn more about possible latency points in a Spanner request, see Latency points in a Spanner request.
You can measure and compare the request latencies between different components and the database to determine which component is causing the latency. These latencies include end-to-end latency, Google Front End (GFE) latency, Spanner API request latency, and query latency.
-
In the client application that uses your service, confirm that end-to-end latency has increased. Check the following dimensions from your client-side metrics. For more information, see Client-side metrics descriptions.
-
client_name: the client library name and version.
-
location: the Google Cloud region where the client-side metrics are published. If your application is deployed outside Google Cloud, then the metrics are published to the global region.
-
method: the RPC method name, for example, spanner.commit.
-
status: the RPC status, for example, OK or INTERNAL.
Group by these dimensions to see if the issue is limited to a specific client, status, or method. For dual-region or multi-regional workloads, see if the issue is limited to a specific client or Spanner region.
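The grouping step can be sketched in a few lines of Python. This is a minimal illustration, not a Spanner API: it assumes you have already exported latency samples into plain dictionaries, and the field name latency_ms, the client name client-a, and the method name spanner.other_method are hypothetical placeholders.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_by_dimension(samples, dimensions, pct=99):
    """Group samples by the given dimension names and return one percentile per group."""
    groups = defaultdict(list)
    for sample in samples:
        key = tuple(sample[name] for name in dimensions)
        groups[key].append(sample["latency_ms"])
    return {key: percentile(values, pct) for key, values in groups.items()}


# Hypothetical samples; real values would come from your client-side metrics.
samples = [
    {"client_name": "client-a", "location": "global", "method": "spanner.commit", "status": "OK", "latency_ms": 12.0},
    {"client_name": "client-a", "location": "global", "method": "spanner.commit", "status": "OK", "latency_ms": 15.0},
    {"client_name": "client-a", "location": "global", "method": "spanner.commit", "status": "INTERNAL", "latency_ms": 480.0},
    {"client_name": "client-a", "location": "global", "method": "spanner.other_method", "status": "OK", "latency_ms": 9.0},
]

by_method_and_status = latency_by_dimension(samples, ["method", "status"])
```

If one group stands out, as the INTERNAL commits do in this made-up data, the issue is probably limited to that method or status rather than spread across all requests.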
-
-
Check your client application health, especially the computing infrastructure on the client side (for example, VM CPU or memory utilization, connections, and file descriptors).
-
Check latency in Spanner components by viewing the client-side metrics:
a. Check end-to-end latency using the spanner.googleapis.com/client/operation_latencies metric.
b. Check Google Front End (GFE) latency using the spanner.googleapis.com/client/gfe_latencies metric.
-
Check the following dimensions for Spanner metrics:
-
database: the Spanner database name.
-
method: the RPC method name, for example, spanner.commit.
-
status: the RPC status, for example, OK or INTERNAL.
Group by these dimensions to see if the issue is limited to a specific database, status, or method. For dual-region or multi-regional workloads, check to see if the issue is limited to a specific region.
-
-
Check Spanner API request latency using the spanner.googleapis.com/api/request_latencies metric. For more information, see Spanner metrics.
If you have high end-to-end latency but low GFE latency and low Spanner API request latency, the application code might have an issue. It could also indicate a networking issue between the client and the regional GFE. If your application has a performance issue that causes some code paths to be slow, then the end-to-end latency for each API request might increase. There might also be an issue in the client computing infrastructure that was not detected in the previous step.
If you have a high GFE latency, but a low Spanner API request latency, it might have one of the following causes:
-
Accessing a database from another region. This action can lead to high GFE latency and low Spanner API request latency. For example, traffic from a client in the us-east1 region to an instance in the us-central1 region might have a high GFE latency but a lower Spanner API request latency.
-
There's an issue at the GFE layer. Check the Google Cloud Status Dashboard to see if there are any ongoing networking issues in your region. If there aren't any issues, then open a support case and include this information so that support engineers can help with troubleshooting the GFE.
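The comparison described above can be summarized as a small decision function. This is only a sketch of the reasoning on this page: the function name and return labels are hypothetical, deciding what counts as "high" is left to you (for example, by comparing against your own baseline), and the branch for high Spanner API request latency assumes that the remaining checks on this page apply to that case.

```python
def localize_latency(e2e_high: bool, gfe_high: bool, api_high: bool) -> str:
    """Map which latencies are elevated to the most likely area to investigate."""
    if api_high:
        # Assumption: elevated API request latency points at the Spanner side
        # (CPU utilization, hotspots, queries, transactions).
        return "spanner"
    if gfe_high:
        # High GFE latency with low API request latency: cross-region access
        # or an issue at the GFE layer.
        return "gfe_or_cross_region"
    if e2e_high:
        # Only end-to-end latency is high: application code, the network path
        # to the regional GFE, or client computing infrastructure.
        return "client_or_network"
    return "no_latency_increase"


suspect = localize_latency(e2e_high=True, gfe_high=False, api_high=False)
```

The checks run from the innermost component outward because a slow inner layer also inflates every latency measured around it.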
-
-
Check the CPU utilization of the instance. If the CPU utilization of the instance is above the recommended level, manually add more nodes or set up autoscaling. For more information, see Autoscaling overview.
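As a rough aid for the manual option, the following sketch estimates how many nodes to add. It is a back-of-the-envelope calculation, not a Spanner API: it assumes load spreads evenly so that utilization falls in proportion to node count, which hotspots can invalidate, and the recommended_max argument is a value you supply from the recommendation for your instance configuration.

```python
import math


def nodes_to_add(current_nodes: int, cpu_utilization: float, recommended_max: float) -> int:
    """Estimate extra nodes needed to bring CPU utilization to recommended_max or lower.

    Utilization values are fractions between 0 and 1. Assumes load is spread
    evenly, so total CPU demand stays constant as nodes are added.
    """
    if current_nodes < 1:
        raise ValueError("current_nodes must be at least 1")
    if not 0 < recommended_max <= 1:
        raise ValueError("recommended_max must be in the range (0, 1]")
    if cpu_utilization <= recommended_max:
        return 0
    needed = math.ceil(current_nodes * cpu_utilization / recommended_max)
    return needed - current_nodes


# Hypothetical numbers: 3 nodes at 90% CPU with a 65% target.
extra = nodes_to_add(current_nodes=3, cpu_utilization=0.90, recommended_max=0.65)
```

Treat the result as a starting point and recheck CPU utilization after scaling.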
-
Observe and troubleshoot potential hotspots or unbalanced access patterns using Key Visualizer. Try to roll back any application code changes that strongly correlate with the issue timeframe.
-
Check any traffic pattern changes.
-
Check Query insights and Transaction insights to see if there might be any query or transaction performance bottlenecks.
-
Use the procedures in Oldest active queries to find any expensive queries that might cause a performance bottleneck, and cancel the queries as needed.
-
Use procedures in the troubleshooting sections in the following topics to troubleshoot the issue further using Spanner introspection tools:
What's next
- Now that you've identified the component that contains the latency, explore the problem further using the built-in client-side metrics.
- Learn how to use metrics to diagnose latency.
- Learn how to troubleshoot Spanner deadline exceeded errors.

