This page explains how to use Spanner Data Boost when you run federated queries from BigQuery to a Spanner database. With Data Boost, federated queries run with minimal impact on existing workloads in the provisioned Spanner instance. Data Boost queries from BigQuery can also join BigQuery data with Spanner data.
You can run federated queries from BigQuery to Spanner using Data Boost with one of the following methods:
- Run a federated Data Boost query, which uses the `EXTERNAL_QUERY` federated query function.
- Use Data Boost with external datasets.
Spanner federation lets BigQuery query data residing in Spanner in real time, without copying or moving data. To learn more, see Spanner federated queries. To learn about Data Boost, see Data Boost overview.
Before you begin
Before you can run federated queries with Data Boost, you need to complete the following tasks:
Create a Spanner instance and database
If you don't have a Spanner instance and database, follow the steps in Create and query a database using the Google Cloud console to create them.
Enable the BigQuery connection API
The BigQuery connection API lets you manage BigQuery connections to external data sources such as a Spanner database.
- Enable the BigQuery connection API.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (`roles/serviceusage.serviceUsageAdmin`), which contains the `serviceusage.services.enable` permission. Learn how to grant roles.
For more information, see BigQuery connection API in the BigQuery documentation.
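As a sketch of one way to enable the API from the command line (assuming you have the gcloud CLI installed and the role described above; replace PROJECT_ID with your project ID):

```
gcloud services enable bigqueryconnection.googleapis.com --project=PROJECT_ID
```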
Grant IAM permissions for Data Boost to principals
A principal must be granted the following permissions to run federated queries with Data Boost:
- `spanner.instances.get`: lets you get the configuration of an instance.
- `spanner.databases.useDataBoost`: lets you use Spanner Data Boost compute resources to process partitioned queries.
For more information about Spanner permissions, see Identity and Access Management (IAM) permissions .
To grant these required permissions, we recommend that you use the Cloud Spanner Database Reader with DataBoost (`roles/spanner.databaseReaderWithDataBoost`) IAM role. You can add that role to any principal that needs to run federated queries with Data Boost. To learn more about predefined roles in Spanner, see Predefined roles. To learn how to create a custom IAM role, see Create a custom role.
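As a sketch of how that grant might look from the command line (the instance ID and principal address are placeholders), you can add the role at the instance level with the gcloud CLI:

```
gcloud spanner instances add-iam-policy-binding INSTANCE_ID \
    --member="user:USER_EMAIL" \
    --role="roles/spanner.databaseReaderWithDataBoost"
```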
Run a federated Data Boost query
To run a Data Boost query from BigQuery to an external source, you need a BigQuery connection to that source and the ID of the connection. When you run a federated Spanner query with Data Boost, the external source is a Spanner database. After you create the connection, BigQuery uses its connection ID to run Data Boost queries against the Spanner database.
Use one of the following options to create a BigQuery connection ID, and then use the connection ID to run a Data Boost query from BigQuery:
- Start in Spanner - Create the BigQuery external connection ID in the Spanner console. After your connection ID is created in the Spanner console, you're redirected to the BigQuery console to run a federated Data Boost query to a Spanner database.
- Start in BigQuery - Create the Data Boost external connection ID in the BigQuery console or by using the `bq` command-line tool. After you create the connection ID, you stay in the BigQuery console to run a federated Data Boost query to a Spanner database.
Start in Spanner to run a Data Boost query
To run a federated Data Boost query starting in the Spanner Studio, do the following:
- Go to the Spanner Instances page in the Google Cloud console. The console shows a list of your Spanner instances.
- Select a Spanner instance, and then select a database.
- On the Database overview page, in the navigation menu, click Spanner Studio.
- Click View in BigQuery.
- In the View in BigQuery dialog, enter a connection ID. The connection ID is used to create a new BigQuery external connection to your Spanner database. You reference your external connection using the following pattern: `PROJECT-ID.LOCATION.CONNECTION-ID`. An error occurs if the ID already exists.
- Fill in the rest of the dialog and do the following:
  - Select Read data in parallel.
  - Select Use Spanner Data Boost.
- Click View in BigQuery. BigQuery Studio opens with the following query:

  ```sql
  SELECT *
  FROM EXTERNAL_QUERY(
    "PROJECT-ID.LOCATION.CONNECTION-ID",
    "SELECT * FROM INFORMATION_SCHEMA.TABLES;");
  ```

  You can replace this with your own federated query. For example, the following query reads from a table named `orders` in a Spanner database and joins the results with a BigQuery table named `mydataset.customers`:

  ```sql
  SELECT c.customer_id, c.name, rq.first_order_date
  FROM mydataset.customers AS c
  LEFT OUTER JOIN EXTERNAL_QUERY(
    'my-project.us.example-db',
    '''SELECT customer_id, MIN(order_date) AS first_order_date
    FROM orders
    GROUP BY customer_id''') AS rq
  ON rq.customer_id = c.customer_id
  GROUP BY c.customer_id, c.name, rq.first_order_date;
  ```
Start in BigQuery to run a Data Boost query
To create an external data connection from BigQuery to a Spanner database and use that connection to run a federated Data Boost query from BigQuery, select one of the following options:
Console
- Go to Create Spanner connections in the BigQuery documentation and follow the instructions on the Console tab.
- In the External data source pane, do the following:
  - Select Read data in parallel.
  - Select Use Spanner Data Boost.
bq
- Go to Create Spanner connections in the BigQuery documentation and follow the instructions on the bq tab.
- Set the following connection properties to `true`:
  - `useParallelism`
  - `useDataBoost`
- The following example uses the `bq mk` command to create a new connection named `my_connection` with the two required properties for Data Boost:

  ```
  bq mk --connection --connection_type='CLOUD_SPANNER' --location='us' \
    --properties='{"database":"projects/my-project/instances/my-instance/databases/my-database", "useParallelism":true, "useDataBoost": true}' \
    my_connection
  ```
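After the connection exists, you can reference it in an `EXTERNAL_QUERY` call. The following is a sketch that assumes the `my_connection` connection above (created in project `my-project`, location `us`) and a hypothetical Spanner table named `orders`:

```sql
SELECT *
FROM EXTERNAL_QUERY(
  'my-project.us.my_connection',
  'SELECT order_id, order_date FROM orders LIMIT 10;');
```

Because `useDataBoost` is set to `true` on the connection, the Spanner portion of this query is processed by Data Boost compute resources instead of the provisioned instance.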
Use Data Boost with external datasets
To run a Data Boost query from BigQuery to Spanner as an external source, you can create an external dataset (also known as a federated dataset) in BigQuery that's linked to an existing GoogleSQL or PostgreSQL database in Spanner.
Use a CLOUD_RESOURCE connection
By default, Spanner external datasets use end-user credentials (EUC), which requires that users have direct access to their Spanner databases. Users can query these datasets if they have access granted in Spanner.
Optionally, Spanner external datasets can use a CLOUD_RESOURCE connection to interact with your Spanner database, so that you can give users access to Spanner data through BigQuery without giving them direct access to the Spanner database. Because the service account of the CLOUD_RESOURCE connection retrieves the data from Spanner, you only have to grant users access to the Spanner external dataset. This access delegation decouples access to the external dataset from direct access to the underlying Spanner tables, so users can query the Spanner tables through the external dataset even if they don't have access granted in Spanner.
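As an illustrative sketch (the dataset and table names below are hypothetical), a user who has been granted access only to the external dataset can query its tables like any other BigQuery tables:

```sql
-- The connection's service account reads from Spanner;
-- the querying user needs access only to the BigQuery external dataset.
SELECT singer_id, first_name
FROM my_spanner_dataset.singers
LIMIT 10;
```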
Before you create Spanner external datasets with a CLOUD_RESOURCE connection, do the following:
Create a connection
You can create a new CLOUD_RESOURCE connection or use an existing one to connect to Spanner. Make sure to create the connection in the same location where you plan to create your Spanner external dataset.
Console
- Go to the BigQuery page.
- In the left pane, click Explorer. If you don't see the left pane, click Expand left pane to open the pane.
- In the Explorer pane, expand your project name, and then click Connections.
- On the Connections page, click Create connection.
- For Connection type, choose Vertex AI remote models, remote functions, BigLake and Spanner (Cloud Resource).
- In the Connection ID field, enter a name for your connection.
- For Location type, select a location for your connection. The connection should be colocated with your other resources, such as datasets.
- Click Create connection.
- Click Go to connection.
- In the Connection info pane, copy the service account ID for use in a later step.
bq
- In a command-line environment, create a connection:

  ```
  bq mk --connection --location=REGION --project_id=PROJECT_ID \
      --connection_type=CLOUD_RESOURCE CONNECTION_ID
  ```

  The `--project_id` parameter overrides the default project. Replace the following:
  - REGION: your connection region
  - PROJECT_ID: your Google Cloud project ID
  - CONNECTION_ID: an ID for your connection

  When you create a connection resource, BigQuery creates a unique system service account and associates it with the connection.

  Troubleshooting: If you get the following connection error, update the Google Cloud SDK:

  ```
  Flags parsing error: flag --connection_type=CLOUD_RESOURCE: value should be one of...
  ```
- Retrieve and copy the service account ID for use in a later step:

  ```
  bq show --connection PROJECT_ID.REGION.CONNECTION_ID
  ```

  The output is similar to the following:

  ```
  name                       properties
  1234.REGION.CONNECTION_ID  {"serviceAccountId": "connection-1234-9u56h9@gcp-sa-bigquery-condel.iam.gserviceaccount.com"}
  ```
Python
To learn how to install and use the client library for Spanner, see Spanner client libraries .
To authenticate to Spanner, set up Application Default Credentials. For more information, see Set up authentication for a local development environment .
Node.js
To learn how to install and use the client library for Spanner, see Spanner client libraries .
To authenticate to Spanner, set up Application Default Credentials. For more information, see Set up authentication for a local development environment .
Terraform
Use the `google_bigquery_connection` resource.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries .
The following example creates a Cloud resource connection named `my_cloud_resource_connection` in the US region:
To apply your Terraform configuration in a Google Cloud project, complete the steps in the following sections.
Prepare Cloud Shell
- Launch Cloud Shell.
- Set the default Google Cloud project where you want to apply your Terraform configurations. You only need to run this command once per project, and you can run it in any directory.

  ```
  export GOOGLE_CLOUD_PROJECT=PROJECT_ID
  ```

  Environment variables are overridden if you set explicit values in the Terraform configuration file.

Prepare the directory

Each Terraform configuration file must have its own directory (also called a root module).

- In Cloud Shell, create a directory and a new file within that directory. The filename must have the `.tf` extension—for example, `main.tf`. In this tutorial, the file is referred to as `main.tf`.

  ```
  mkdir DIRECTORY && cd DIRECTORY && touch main.tf
  ```

- If you are following a tutorial, you can copy the sample code in each section or step. Copy the sample code into the newly created `main.tf`. Optionally, copy the code from GitHub. This is recommended when the Terraform snippet is part of an end-to-end solution.
- Review and modify the sample parameters to apply to your environment.
- Save your changes.
- Initialize Terraform. You only need to do this once per directory.

  ```
  terraform init
  ```

  Optionally, to use the latest Google provider version, include the `-upgrade` option:

  ```
  terraform init -upgrade
  ```

Apply the changes

- Review the configuration and verify that the resources that Terraform is going to create or update match your expectations:

  ```
  terraform plan
  ```

  Make corrections to the configuration as necessary.

- Apply the Terraform configuration by running the following command and entering `yes` at the prompt:

  ```
  terraform apply
  ```

  Wait until Terraform displays the "Apply complete!" message.

- Open your Google Cloud project to view the results. In the Google Cloud console, navigate to your resources in the UI to make sure that Terraform has created or updated them.
After you create the connection, open it, and in the Connection info pane, copy the service account ID. You need this ID when you configure permissions for the connection. When you create a connection resource, BigQuery creates a unique system service account and associates it with the connection.
Set up access
You must give the service account that is associated with the new connection read access to your Spanner instance or database. We recommend using the Cloud Spanner Database Reader with DataBoost (`roles/spanner.databaseReaderWithDataBoost`) predefined IAM role.
Follow these steps to grant access to database-level roles for the service account that you copied earlier from the connection:
- Go to the Spanner Instances page.
- Click the name of the instance that contains your database to go to the Instance details page.
- In the Overview tab, select the checkbox for your database. The Info panel appears.
- Click Add principal.
- In the Add principals panel, in New principals, enter the service account ID that you copied earlier.
- In the Select a role field, select the Cloud Spanner Database Reader with DataBoost role.
- Click Save.
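Alternatively, as a sketch using the gcloud CLI (the database, instance, and service account values are placeholders), you can grant the same role at the database level:

```
gcloud spanner databases add-iam-policy-binding DATABASE_ID \
    --instance=INSTANCE_ID \
    --member="serviceAccount:SERVICE_ACCOUNT_ID" \
    --role="roles/spanner.databaseReaderWithDataBoost"
```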
Create an external dataset
To create an external dataset, do the following:
Console
- Open the BigQuery page in the Google Cloud console.
- In the left pane, click Explorer. If you don't see the left pane, click Expand left pane to open the pane.
- In the Explorer pane, select the project where you want to create the dataset.
- Click View actions, and then click Create dataset.
- On the Create dataset page, do the following:
  - For Dataset ID, enter a unique dataset name.
  - For Location type, choose a location for the dataset, such as `us-central1` or the `us` multi-region. After you create a dataset, the location can't be changed.
  - For External Dataset, do the following:
    - Check the box next to Link to an external dataset.
    - For External dataset type, select Spanner.
    - For External source, enter the full identifier of your Spanner database in the following format: `projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE`. For example: `projects/my_project/instances/my_instance/databases/my_database`.
    - Optionally, for Database role, enter the name of a Spanner database role. For more information, read about database roles used for creating Spanner connections.
    - Optionally, check the box next to Use a Cloud Resource connection to create the external dataset with a connection.
  - Leave the other default settings as they are.
- Click Create dataset.
SQL
Use the `CREATE EXTERNAL SCHEMA` data definition language (DDL) statement.
- In the Google Cloud console, go to the BigQuery page.
- In the query editor, enter the following statement:

  ```sql
  CREATE EXTERNAL SCHEMA DATASET_NAME
  OPTIONS (
    external_source = 'SPANNER_EXTERNAL_SOURCE',
    location = 'LOCATION');

  /* Alternatively, create with a connection: */
  CREATE EXTERNAL SCHEMA DATASET_NAME
  WITH CONNECTION PROJECT_ID.LOCATION.CONNECTION_NAME
  OPTIONS (
    external_source = 'SPANNER_EXTERNAL_SOURCE',
    location = 'LOCATION');
  ```

  Replace the following:
  - DATASET_NAME: the name of your new dataset in BigQuery.
  - SPANNER_EXTERNAL_SOURCE: the fully qualified Spanner database name, with a prefix identifying the source, in the following format: `google-cloudspanner://[DATABASE_ROLE@]/projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE`. For example: `google-cloudspanner://admin@/projects/my_project/instances/my_instance/databases/my_database` or `google-cloudspanner:/projects/my_project/instances/my_instance/databases/my_database`.
  - LOCATION: the location of your new dataset in BigQuery—for example, `us-central1`. After you create a dataset, you can't change its location.
  - CONNECTION_NAME (optional): the name of your Cloud resource connection.
- Click Run.

For more information about how to run queries, see Run an interactive query.
bq
In a command-line environment, create an external dataset by using the `bq mk` command:

```
bq --location=LOCATION mk --dataset \
    --external_source SPANNER_EXTERNAL_SOURCE \
    DATASET_NAME
```

Alternatively, create it with a connection:

```
bq --location=LOCATION mk --dataset \
    --external_source SPANNER_EXTERNAL_SOURCE \
    --connection_id PROJECT_ID.LOCATION.CONNECTION_NAME \
    DATASET_NAME
```

Replace the following:
- LOCATION: the location of your new dataset in BigQuery—for example, `us-central1`. After you create a dataset, you can't change its location. You can set a default location value by using the `.bigqueryrc` file.
- SPANNER_EXTERNAL_SOURCE: the fully qualified Spanner database name, with a prefix identifying the source, in the following format: `google-cloudspanner://[DATABASE_ROLE@]/projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE`. For example: `google-cloudspanner://admin@/projects/my_project/instances/my_instance/databases/my_database` or `google-cloudspanner:/projects/my_project/instances/my_instance/databases/my_database`.
- DATASET_NAME: the name of your new dataset in BigQuery. To create a dataset in a project other than your default project, add the project ID to the dataset name in the following format: `PROJECT_ID:DATASET_NAME`.
- CONNECTION_NAME (optional): the name of your Cloud resource connection.
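For example, with hypothetical project, instance, and database names filled in, the command might look like the following:

```
bq --location=us mk --dataset \
    --external_source google-cloudspanner:/projects/my_project/instances/my_instance/databases/my_database \
    my_spanner_dataset
```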
Terraform
Use the `google_bigquery_dataset` resource.
To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries .
The following example creates a Spanner external dataset:
To apply your Terraform configuration in a Google Cloud project, complete the steps in the following sections.
Prepare Cloud Shell
- Launch Cloud Shell.
- Set the default Google Cloud project where you want to apply your Terraform configurations. You only need to run this command once per project, and you can run it in any directory.

  ```
  export GOOGLE_CLOUD_PROJECT=PROJECT_ID
  ```

  Environment variables are overridden if you set explicit values in the Terraform configuration file.

Prepare the directory

Each Terraform configuration file must have its own directory (also called a root module).

- In Cloud Shell, create a directory and a new file within that directory. The filename must have the `.tf` extension—for example, `main.tf`. In this tutorial, the file is referred to as `main.tf`.

  ```
  mkdir DIRECTORY && cd DIRECTORY && touch main.tf
  ```

- If you are following a tutorial, you can copy the sample code in each section or step. Copy the sample code into the newly created `main.tf`. Optionally, copy the code from GitHub. This is recommended when the Terraform snippet is part of an end-to-end solution.
- Review and modify the sample parameters to apply to your environment.
- Save your changes.
- Initialize Terraform. You only need to do this once per directory.

  ```
  terraform init
  ```

  Optionally, to use the latest Google provider version, include the `-upgrade` option:

  ```
  terraform init -upgrade
  ```

Apply the changes

- Review the configuration and verify that the resources that Terraform is going to create or update match your expectations:

  ```
  terraform plan
  ```

  Make corrections to the configuration as necessary.

- Apply the Terraform configuration by running the following command and entering `yes` at the prompt:

  ```
  terraform apply
  ```

  Wait until Terraform displays the "Apply complete!" message.

- Open your Google Cloud project to view the results. In the Google Cloud console, navigate to your resources in the UI to make sure that Terraform has created or updated them.
API
Call the `datasets.insert` method with a defined dataset resource and `externalDatasetReference` field for your Spanner database.

Note that table names in external datasets are case-insensitive.

When you create external datasets with a CLOUD_RESOURCE connection, you need the `bigquery.connections.delegate` permission (available from the BigQuery Connection Admin role) on the connection that is used by the external datasets.
Create a non-incremental materialized view based on tables from an external dataset
Before you proceed, you must create the underlying Spanner external dataset using a CLOUD_RESOURCE connection.

You can create non-incremental materialized views that reference Spanner external dataset tables by using the `allow_non_incremental_definition` option. The following example uses a base Spanner external dataset table:

```sql
/* You must create the spanner_external_dataset with a CLOUD_RESOURCE connection. */
CREATE MATERIALIZED VIEW sample_dataset.sample_spanner_mv
OPTIONS (
  enable_refresh = true,
  refresh_interval_minutes = 60,
  max_staleness = INTERVAL "24" HOUR,
  allow_non_incremental_definition = true)
AS SELECT COUNT(*) cnt FROM spanner_external_dataset.spanner_table;
```
Only BigQuery non-incremental materialized views can have Spanner external dataset tables as base tables. If a non-incremental materialized view's last refresh occurred outside the `max_staleness` interval, then the query reads the base Spanner external dataset tables. Learn more about BigQuery non-incremental materialized views.
What's next
- Learn about Data Boost in Data Boost overview.
- Use Data Boost in your applications.
- Monitor Data Boost usage.
- Monitor and manage Data Boost quota usage.
- Learn more about Spanner external datasets.
- Learn more about creating materialized views over Spanner external datasets.

