GoogleSQL for BigQuery supports the following syntax to use graphs within SQL queries.
Language list
| Name | Summary |
|---|---|
GRAPH_EXPAND
|
A TVF that returns a flattened version of the input graph. |
GRAPH_TABLE
operator
|
Performs an operation on a graph in the FROM
clause of a SQL
query and then produces a table with the results. |
GRAPH_EXPAND
GRAPH_EXPAND( graph_name )
Description
Provides a flattened table representation of the data in the input graph.
The GRAPH_EXPAND
TVF also supports querying graph properties defined by a
measure. For more information and examples of how to use
this function, see Work with measures
.
Definitions
-
graph_name: ASTRINGvalue that contains the name of the graph to expand.
Details
The GRAPH_EXPAND
function produces a flattened table by applying a series
of LEFT JOIN
operations to node and edge tables. Because of this join behavior,
the function only accepts certain types of property graphs.
Input limitations
The schema of a property graph naturally forms a directed graph itself. Define the schema relationship graph for a property graph as follows:
- Each node table defines a node.
-
A directed edge from node table
Ato node tableBexists if the following conditions are met:- An edge table exists whose source key references one of the node tables and whose destination key references the other node table. The direction of the edge in the schema relationship graph might not match the direction of the edge defined on the property graph.
- The edge table defines a many-to-one relationship from table
Ato tableB. In other words, every row in tableAcorresponds to at most one row in tableB.
If the direction of the edge is ambiguous or there is a many-to-many relationship between the tables, omit the edge.
A property graph is valid input to the GRAPH_EXPAND
function if it meets the
following requirements:
- The property graph has at least one node table.
- Every node table has at least one key column defined.
- At least one property is defined on a node or edge table.
- If the graph contains measure definitions, then a property name can't match a key column name unless the property's expression is identical to the name itself.
- Every node or edge table uses the default label.
-
The following conditions hold for the schema relationship graph of the property graph:
- There is a single root node , defined as a node table with in-degree zero.
- The graph is acyclic.
- Every node is reachable from the root node.
- The in-degree of every node is at most one.
Output
The GRAPH_EXPAND
TVF returns a table that generally contains one column for
each property in the graph. However, if an edge table's properties are identical
to the properties of one of its adjacent node tables, then the redundant columns
are omitted.
The output column names are constructed by concatenating the
label name and property name with an underscore ( _
). For example, if
a node table has the label Person
and a property called age
, then the
column name corresponding to that property is Person_age
.
To select columns that correspond to properties defined by a measure, you
must wrap them in the AGG
function
.
You can't directly select columns that correspond to properties defined by a
measure.
Examples
The following example creates a graph called StoreGraph
based on the Stores
and Locations
tables.
CREATE
OR
REPLACE
TABLE
mydataset
.
Stores
(
name
STRING
PRIMARY
KEY
NOT
ENFORCED
,
location_id
INT64
REFERENCES
mydataset
.
Locations
(
id
)
NOT
ENFORCED
)
AS
(
SELECT
'Store 1'
AS
name
,
101
AS
location_id
UNION
ALL
SELECT
'Store 2'
AS
name
,
101
AS
location_id
);
CREATE
OR
REPLACE
TABLE
mydataset
.
Locations
(
id
INT64
PRIMARY
KEY
NOT
ENFORCED
,
name
STRING
,
population
INT64
)
AS
(
SELECT
101
AS
id
,
'Anytown'
AS
name
,
1000
AS
population
UNION
ALL
SELECT
102
AS
id
,
'Sometown'
AS
name
,
500
AS
population
);
CREATE
OR
REPLACE
PROPERTY
GRAPH
mydataset
.
StoreGraph
NODE
TABLES
(
mydataset
.
Stores
AS
S
,
mydataset
.
Locations
AS
L
PROPERTIES
(
id
,
name
,
population
,
MEASURE
(
SUM
(
population
))
AS
total_population
)
)
EDGE
TABLES
(
mydataset
.
Stores
AS
SL
SOURCE
KEY
(
location_id
)
REFERENCES
L
(
id
)
DESTINATION
KEY
(
name
)
REFERENCES
S
(
name
)
);
The property graph consists of four nodes that represent Store 1
, Store 2
, Anytown
, and Sometown
. The property graph contains two edges: one from Anytown
to Store 1
and another from Anytown
to Store 2
.
The schema relationship graph has a single edge
from the Stores
node table to the Locations
node table, because there is a
many-to-one relationship between them: many stores can belong to a single
location. This relationship is also reflected by the fact that location_id
is a foreign key in the Stores
table.
The following query calls the GRAPH_EXPAND
function and omits the L_total_population
column from the output because you can't directly select
a column for a property defined by a measure without using the AGG
function:
SELECT
*
EXCEPT
(
L_total_population
)
FROM
GRAPH_EXPAND
(
'mydataset.StoreGraph'
);
/*---------------+---------+------+---------+--------------+
| S_location_id | S_name | L_id | L_name | L_population |
+---------------+---------+------+---------+--------------+
| 101 | Store 2 | 101 | Anytown | 1000 |
| 101 | Store 1 | 101 | Anytown | 1000 |
+---------------+---------+------+---------+--------------*/
The Sometown
location doesn't appear in the output because it's
not referenced by the location_id
foreign key in the Stores
table, so it's dropped from the LEFT JOIN
that produces the output.
The columns SL_location_id
and SL_name
don't appear because the properties
of the edge table SL
are identical to the properties of its node table S
.
The following query shows the difference between aggregating a measure and
a regular value. When you apply the AGG
function to the L_total_population
measure, population is counted exactly once per distinct location_id
value.
If you call the SUM
function on L_population
, then the L_population
column contributes the population for every row in the table with a given
location ID.
SELECT
S_location_id
,
AGG
(
L_total_population
)
AS
true_total_population
,
SUM
(
L_population
)
AS
overcounted_population
FROM
GRAPH_EXPAND
(
'mydataset.StoreGraph'
)
GROUP
BY
S_location_id
;
/*---------------+-----------------------+------------------------+
| S_location_id | true_total_population | overcounted_population |
+---------------+-----------------------+------------------------+
| 101 | 1000 | 2000 |
+---------------+-----------------------+------------------------*/
GRAPH_TABLE
operator
FROM GRAPH_TABLE ( property_graph_name multi_linear_query_statement ) [ [ AS ] alias ]
Description
Performs an operation on a graph in the FROM
clause of a SQL query and then
produces a table with the results.
With the GRAPH_TABLE
operator, you can use the GQL syntax
to query a property graph. The result of this operation is produced as a table that
you can use in the rest of the query.
Definitions
-
property_graph_name: The name of the property graph to query for patterns. -
multi_linear_query_statement: You can use GQL to query a property graph for patterns. For more information, see Graph query language . -
alias: An optional alias, which you can use to refer to the table produced by theGRAPH_TABLEoperator elsewhere in the query.
Examples
You can use the RETURN
statement to return specific node and edge properties.
For example:
SELECT
name
,
id
FROM
GRAPH_TABLE
(
graph_db
.
FinGraph
MATCH
(
n
:
Person
)
RETURN
n
.
name
AS
name
,
n
.
id
AS
id
);
/*-----------+
| name | id |
+-----------+
| Alex | 1 |
| Dana | 2 |
| Lee | 3 |
+-----------*/
The following query produces an error because id
isn't
included in the RETURN
statement, even though this property exists for
element n
:
SELECT
name
,
id
FROM
GRAPH_TABLE
(
graph_db
.
FinGraph
MATCH
(
n
:
Person
)
RETURN
n
.
name
);
The following query produces an error because directly outputting the graph
element n
is not supported. Convert n
to its JSON representation using the TO_JSON
function for successful output.
-- Error
SELECT
n
FROM
GRAPH_TABLE
(
graph_db
.
FinGraph
MATCH
(
n
:
Person
)
RETURN
n
);
SELECT
TO_JSON
(
n
)
as
json_node
FROM
GRAPH_TABLE
(
graph_db
.
FinGraph
MATCH
(
n
:
Person
)
RETURN
n
);
/*---------------------------+
| json_node |
+---------------------------+
| {"identifier":"mUZpbk...} |
| {"identifier":"mUZpbk...} |
| {"identifier":"mUZpbk...} |
+--------------------------*/

