Create and run a workflow in Dataform

This quickstart walks you through creating a workflow in Dataform and running it in BigQuery.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the BigQuery and Dataform APIs.

    Enable the APIs

Required roles

To get the permissions that you need to create and run a workflow in Dataform, ask your administrator to grant you the required IAM roles on the project that will host your Dataform repository.

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Create a Dataform repository

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Click Create repository.

  3. On the Create repository page, do the following:

    1. In the Repository ID field, enter quickstart-repository.

    2. In the Region list, select europe-west4.

    3. Click Create.

Create and initialize a Dataform development workspace

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Click quickstart-repository.

  3. Click Create development workspace.

  4. In the Create development workspace window, do the following:

    1. In the Workspace ID field, enter quickstart-workspace.

    2. Click Create.

    The development workspace page appears.

  5. Click Initialize workspace.

Create a view

In the following sections, you define a view that you will later use as a data source for a table.

Create a SQLX file for defining a view

  1. In the Files pane, next to definitions/, click the More menu.

  2. Click Create file.

  3. In the Create new file pane, do the following:

    1. In the Add a file path field, enter definitions/quickstart-source.sqlx.

    2. Click Create file.

Define a view

  1. In the Files pane, expand the definitions folder.

  2. Click definitions/quickstart-source.sqlx.

  3. In the file, enter the following code snippet:

      config {
        type: "view"
      }

      SELECT
        "apples" AS fruit,
        2 AS count
      UNION ALL
      SELECT
        "oranges" AS fruit,
        5 AS count
      UNION ALL
      SELECT
        "pears" AS fruit,
        1 AS count
      UNION ALL
      SELECT
        "bananas" AS fruit,
        0 AS count
    
  4. Click Format.

Create a table

In the following sections, you define the table type in a SQLX file, and then write a SELECT statement to define the table structure within the same file.

Create a SQLX file for table definition

  1. In the Files pane, next to definitions/, click the More menu, and then select Create file.

  2. In the Add a file path field, enter definitions/quickstart-table.sqlx.

  3. Click Create file.

Define the table type, structure, and dependencies

  1. In the Files pane, expand the definitions/ directory.

  2. Select quickstart-table.sqlx, and then enter the following table type and SELECT statement:

      config {
        type: "table"
      }

      SELECT
        fruit,
        SUM(count) as count
      FROM ${ref("quickstart-source")}
      GROUP BY 1
    
  3. Click Format.

After you define the table type, Dataform reports a query validation error because quickstart-source does not exist in BigQuery yet. This error is resolved when you run the workflow later in this tutorial.
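
To illustrate what the ref function does: when Dataform compiles the file, ref resolves to the fully qualified name of the referenced view. A minimal sketch of the compiled query, assuming a hypothetical project ID my-project and the default dataset dataform (your project ID will differ):

```sql
-- Sketch of the compiled SQL for quickstart-table.sqlx.
-- `my-project` is a hypothetical project ID; `dataform` is the default dataset.
SELECT
  fruit,
  SUM(count) AS count
FROM `my-project.dataform.quickstart-source`
GROUP BY 1
```

Because the table references quickstart-source through ref, Dataform treats the view as a dependency and runs it before the table.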

Run the workflow in BigQuery

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. On the quickstart-workspace page, click Start execution.

  3. Click All actions.

  4. Click Start execution.

  5. In the dialog that opens, click Allow to give permission to BigQuery Pipelines to access your Google Account.

    Dataform uses the default repository settings to create the contents of your workflow in a BigQuery dataset called dataform.
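
Once the execution finishes, one way to check the result is to query the new table in BigQuery. A sketch, assuming the default dataset name dataform and that the query runs in the same project as the repository:

```sql
-- Assumes the workflow wrote to the default dataset `dataform`
-- in the project that this query runs in.
SELECT
  fruit,
  count
FROM `dataform.quickstart-table`
ORDER BY fruit;
```

Given the view defined earlier, the table should contain one row per fruit: apples 2, bananas 0, oranges 5, and pears 1.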

View execution logs in Dataform

  1. On the quickstart-repository page, click Workflow Execution Logs.

  2. To view details of your execution, click the latest execution.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

Delete the dataset created in BigQuery

To avoid incurring charges for BigQuery assets, delete the dataset called dataform.

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer panel, expand your project and select dataform.

  3. Click the Actions menu, and then select Delete.

  4. In the Delete dataset dialog, enter delete into the field, and then click Delete.
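
As an alternative to the console steps above, the dataset can also be removed with a BigQuery DDL statement. A sketch, assuming the dataset is named dataform and belongs to the project that the query runs in. CASCADE also deletes the tables and views inside the dataset, so double-check the name before running it:

```sql
-- Deletes the dataset `dataform` together with its tables and views.
-- Assumes the default dataset name from this quickstart.
DROP SCHEMA IF EXISTS dataform CASCADE;
```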

Delete the Dataform development workspace

Creating a Dataform development workspace incurs no costs. To delete the development workspace, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Click quickstart-repository.

  3. In the Development workspaces tab, click the More menu by quickstart-workspace, and then select Delete.

  4. To confirm, click Delete.

Delete the Dataform repository

Creating a Dataform repository incurs no costs. To delete the repository, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. By quickstart-repository, click the More menu, and then select Delete.

  3. In the Delete repository window, enter the name of the repository to confirm deletion.

  4. To confirm, click Delete.

What's next
