This document describes best practices for tuning Spanner Graph query performance, which include the following optimizations:
- Avoid a full scan of the input table for nodes and edges.
- Reduce the amount of data the query needs to read from storage.
- Reduce the size of intermediate data.
Start from lower cardinality nodes
Write the path traversal so that it starts with the lower cardinality nodes. This approach keeps the intermediate result set small, and speeds up query execution.
For example, the following queries have the same semantics:
-
Forward edge traversal:
GRAPH FinGraph MATCH ( p : Person { name : "Alex" } ) - [: Owns ] - > ( a : Account { is_blocked : true } ) RETURN p . id AS person_id , a . id AS account_id ; -
Reverse edge traversal:
GRAPH FinGraph MATCH ( a : Account { is_blocked : true } ) < - [: Owns ] - ( p : Person { name : "Alex" } ) RETURN p . id AS person_id , a . id AS account_id ;
Assuming that there are fewer people with the name Alex
than there are
blocked accounts, we recommend that you write this query in the forward
edge traversal.
Starting from lower cardinality nodes is especially important for variable-length path traversal. The following example shows the recommended way to find accounts that are within three transfers of a given account.
GRAPH
FinGraph
MATCH
(:
Account
{
id
:
7
}
)
-
[:
Transfers
]
-
> {
1
,
3
}
(
a
:
Account
)
RETURN
a
.
id
;
Specify all labels by default
Spanner Graph infers the qualifying nodes and edge labels if labels are omitted. We recommend that you specify labels for all nodes and edges where possible, because this inference might not always be possible and it might cause more labels than necessary to be scanned.
Single MATCH statement
The following example finds accounts linked by at most 3 transfers from the given account:
GRAPH
FinGraph
MATCH
(
src
:
Account
{
id
:
7
}
)
-
[:
Transfers
]
-
> {
1
,
3
}
(
dst
:
Account
)
RETURN
dst
.
id
;
Across MATCH statements
Specify labels on nodes and edges when they refer to the same element but are
across MATCH
statements.
The following example shows this recommended approach:
GRAPH
FinGraph
MATCH
(
acct
:
Account
{
id
:
7
}
)
-
[:
Transfers
]
-
> {
1
,
3
}
(
other_acct
:
Account
)
RETURN
acct
,
COUNT
(
DISTINCT
other_acct
)
AS
related_accts
GROUP
BY
acct
NEXT
MATCH
(
acct
:
Account
)
< -
[:
Owns
]
-
(
p
:
Person
)
RETURN
p
.
id
AS
person
,
acct
.
id
AS
acct
,
related_accts
;
Use IS_FIRST
to optimize queries
You can use the IS_FIRST
function to improve query performance by sampling edges and limiting traversals
in graphs. This function helps handle high-cardinality nodes and optimize
multi-hop queries.
If your specified sample size is too small, the query might return no data. Because of this, you might need to try different sample sizes to find the optimal balance of returned data and improved query performance.
These IS_FIRST
examples use FinGraph
, a financial graph with Account
nodes
and Transfers
edges for money transfers. To create the FinGraph
and use it
to run the sample queries, see Set up and query Spanner Graph
.
Limit traversed edges to improve query performance
When you query graphs, some nodes can have a significantly larger number of incoming or outgoing edges compared to other nodes. These high-cardinality nodes are sometimes called super nodes or hub nodes. Super nodes can cause performance issues because traversals through them might involve processing huge amounts of data, which leads to data skew and long execution times.
To optimize a query of a graph with super nodes, use the IS_FIRST
function
within a FILTER
clause to limit the number of edges the query traverses from a
node. Because accounts in FinGraph
might have significantly higher numbers of
transactions than others, you might use IS_FIRST
to prevent an inefficient
query. This technique is particularly useful when you don't need a complete
enumeration of all connections from a super node.
The following query finds accounts ( a2
) that either directly or indirectly
receive transfers from blocked accounts ( a1
). The query uses IS_FIRST
to
prevent slow performance when an account has many transfers by limiting the
number of Transfers
edges to consider for each Account
.
GRAPH
FinGraph
MATCH
(
a1
:
Account
{
is_blocked
:
true
}
)
-
[
e
:
Transfers
WHERE
e
IN
{
MATCH
-
[
selected_e
:
Transfers
]
-
>
FILTER
IS_FIRST
(
@
max_transfers_per_account
)
OVER
(
PARTITION
BY
SOURCE_NODE_ID
(
selected_e
)
ORDER
BY
selected_e
.
create_time
DESC
)
RETURN
selected_e
}
]
-
> {
1
,
5
}
(
a2
:
Account
)
RETURN
a1
.
id
AS
src_id
,
a2
.
id
AS
dst_id
;
This example uses the following:
-
@max_transfers_per_account: A query parameter that specifies the maximum number ofTransfersedges to consider for each account (a1). -
PARTITION BY SOURCE_NODE_ID(selected_e): Ensures that theIS_FIRSTlimit applies independently for each account (a1). -
ORDER BY selected_e.create_time DESC: Specifies that the most recent transfers are returned.
Sample intermediate nodes to optimize multi-hop queries
You can also improve query efficiency by using IS_FIRST
to sample intermediate
nodes in multi-hop queries. This technique improves efficiency by limiting the
number of paths the query considers for each intermediate node. To do this,
break a multi-hop query into multiple MATCH
statements separated by NEXT
, and
apply IS_FIRST
at the midpoint where you need to sample:
GRAPH
FinGraph
MATCH
(
a1
:
Account
{
is_blocked
:
true
}
)
-
[
e1
:
Transfers
]
-
> (
a2
:
Account
)
FILTER
IS_FIRST
(
1
)
OVER
(
PARTITION
BY
a2
)
RETURN
a1
,
a2
NEXT
MATCH
(
a2
)
-
[
e2
:
Transfers
]
-
> (
a3
:
Account
)
RETURN
a1
.
id
AS
src_id
,
a2
.
id
AS
mid_id
,
a3
.
id
AS
dst_id
;
To understand how IS_FIRST
optimizes this query:
-
The clause
FILTER IS_FIRST(1) OVER (PARTITION BY a2)is applied in the firstMATCHstatement. -
For each intermediate account node (
a2),IS_FIRSTconsiders only the first incomingTransfersedge (e1), reducing the number of paths to explore in the secondMATCHstatement. -
The overall two-hop query's efficiency is improved because the second
MATCHdoesn't process unnecessary data, especially whena2has many incoming transfers.
What's next
- Learn how to query property graphs in Spanner Graph .
- Migrate to Spanner Graph .

