Use SQL cells
This guide describes how to use SQL cells to query data from within a Colab Enterprise notebook.
Overview
A SQL cell is a code cell for writing, editing, and running SQL queries from within your Colab Enterprise notebook. SQL cells provide an alternative workflow to IPython Magics for BigQuery .
Capabilities
SQL cells provide the following capabilities:
- Dry-run support: SQL statement validation and an approximation of the number of bytes processed by the query
- Formatting: Keyword linting and syntax highlighting
- BigQuery DataFrame output variable naming: Refer to the output variable from within other notebook cells
- Variable replacement: Refer to Python variables and SQL cells to support parameterization and the ability to query the results of a previous query
- Result set viewer: Lightweight tabular result set viewer with pagination for large result sets
Supported SQL dialect and data source
-  Colab Enterprise SQL cells support GoogleSQL. 
-  You can run SQL queries on BigQuery data. 
Limitations
Consider the following limitations when you plan your project:
- You can run multiple SQL statements in a single SQL cell, but only the results of the last SQL statement are saved to a DataFrame.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
-  In the Google Cloud console, on the project selector page, select or create a Google Cloud project. Roles required to select or create a project - Select a project : Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
-  Create a project 
: To create a project, you need the Project Creator
      ( roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.createpermission. Learn how to grant roles .
 
-  Verify that billing is enabled for your Google Cloud project . 
-  Enable the BigQuery, Compute Engine, Dataform, and Vertex AI APIs. Roles required to enable APIs To enable APIs, you need the Service Usage Admin IAM role ( roles/serviceusage.serviceUsageAdmin), which contains theserviceusage.services.enablepermission. Learn how to grant roles .
-  In the Google Cloud console, on the project selector page, select or create a Google Cloud project. Roles required to select or create a project - Select a project : Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
-  Create a project 
: To create a project, you need the Project Creator
      ( roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.createpermission. Learn how to grant roles .
 
-  Verify that billing is enabled for your Google Cloud project . 
-  Enable the BigQuery, Compute Engine, Dataform, and Vertex AI APIs. Roles required to enable APIs To enable APIs, you need the Service Usage Admin IAM role ( roles/serviceusage.serviceUsageAdmin), which contains theserviceusage.services.enablepermission. Learn how to grant roles .
Required roles
To get the permissions that you need to create a Colab Enterprise notebook, run the notebook's code on a runtime, and use BigQuery data in the notebook, ask your administrator to grant you the following IAM roles on the project:
-  BigQuery User 
( roles/bigquery.user)
-  Colab Enterprise User 
( roles/aiplatform.colabEnterpriseUser)
For more information about granting roles, see Manage access to projects, folders, and organizations .
You might also be able to get the required permissions through custom roles or other predefined roles .
Create a SQL cell
To create a SQL cell in Colab Enterprise, do the following:
-  In the Google Cloud console, go to the Colab Enterprise My notebookspage. 
-  In the Regionmenu, select the region that contains your notebook. 
-  Click the notebook that you want to open. If you haven't created a notebook yet, create a notebook . 
-  In the toolbar, to add a SQL cell, click the Insert code cell options menu and select Add SQL cell .   Your SQL cell is added to your notebook. 
Enter and run a query
-  In your SQL cell, enter a SQL query. For an overview of supported statements and SQL dialects, see Introduction to SQL in BigQuery . You can refer to Python variables in expressions by enclosing the variable name in braces ( { }). For example, if you specified a value in a Python variable namedmy_threshold, you might limit your result set with a query similar to the following:SELECT * FROM my_dataset . my_table WHERE x > { my_threshold } ; 
-  Hold the pointer over the SQL cell that you want to run, and then click the Run cell button. 
The output of the query is automatically saved as a BigQuery DataFrame with the same name as the title of the SQL cell.
Interact with the result set
You can interact with the result set as a BigQuery DataFrame or a pandas DataFrame.
You can chain SQL statements using the same SQL cell variable name. For
example, you can use BigQuery DataFrames generated by the result set
as tables in a following query by enclosing the DataFrame name in braces
( { } 
). See the following example, which references a previous query's
output that was saved as a DataFrame named df 
:
SELECT * FROM { df } ;

 Run cell
 Run cell 
