Schedule production runs

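This quickstart walks you through the following steps to schedule production runs in Dataform:

  1. Grant the required roles.
  2. Create a Dataform repository.
  3. Create a release configuration and a workflow configuration.
  4. View past production compilation results and workflow runs.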

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the BigQuery and Dataform APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the APIs

Additionally, select or create a custom service account to run workflows in BigQuery.

Required roles

To get the permissions that you need to perform all tasks in this tutorial, ask your administrator to grant you the following IAM roles:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Grant required roles

To run workflows in BigQuery, you can use a custom service account or your Google Account (Preview). Custom service account credentials are the default option for scheduled runs; using Google Account user credentials for scheduled runs is discouraged.

To run workflows in BigQuery, your custom service account must have the following required roles:

  • BigQuery Data Editor (roles/bigquery.dataEditor) on projects to which Dataform needs both read and write access, which usually includes the project hosting your Dataform repository.
  • BigQuery Data Viewer (roles/bigquery.dataViewer) on projects to which Dataform needs read-only access.
  • BigQuery Job User (roles/bigquery.jobUser) on the project hosting your Dataform repository.
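
The role assignments above can be written down as data before you apply them in the console. The following is a minimal Python sketch, not an official client: the function name, the project IDs, and the service account email are hypothetical placeholders, and the code only builds IAM-style binding dictionaries (a role plus a `serviceAccount:` member) without calling any Google Cloud API.

```python
from typing import Dict, List

# Roles named in this quickstart for the custom service account.
READ_WRITE_ROLE = "roles/bigquery.dataEditor"
READ_ONLY_ROLE = "roles/bigquery.dataViewer"
JOB_ROLE = "roles/bigquery.jobUser"


def plan_bindings(
    service_account_email: str,
    repository_project: str,
    read_write_projects: List[str],
    read_only_projects: List[str],
) -> Dict[str, List[Dict[str, object]]]:
    """Return, per project, the role bindings the service account needs."""
    member = f"serviceAccount:{service_account_email}"
    plan: Dict[str, List[Dict[str, object]]] = {}

    def add(project: str, role: str) -> None:
        # Skip roles that are already planned for this project.
        bindings = plan.setdefault(project, [])
        if not any(b["role"] == role for b in bindings):
            bindings.append({"role": role, "members": [member]})

    # Job User goes on the project that hosts the Dataform repository.
    add(repository_project, JOB_ROLE)
    for project in read_write_projects:
        add(project, READ_WRITE_ROLE)
    for project in read_only_projects:
        add(project, READ_ONLY_ROLE)
    return plan
```

Calling `plan_bindings` with the repository project listed as a read-write project yields two bindings for that project (Job User and Data Editor) and one Data Viewer binding for each read-only project, which you can compare against what the IAM page shows.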

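To let Dataform use your custom service account, the default Dataform service agent must have the following roles on the custom service account resource:

  • Service Account User (roles/iam.serviceAccountUser)
  • Service Account Token Creator (roles/iam.serviceAccountTokenCreator)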

To grant these roles, follow these steps:

  1. In the Google Cloud console, go to the IAM page.

    Go to IAM

  2. Click Grant access.

  3. In the New principals field, enter your custom service account ID.

  4. In the Select a role menu, select the following roles one by one, using Add another role for each additional role:

    • BigQuery Data Editor
    • BigQuery Data Viewer
    • BigQuery Job User
  5. Click Save.

  6. In the Google Cloud console, go to the Service accounts page.

    Go to Service accounts

  7. Select your custom service account.

  8. Go to Principals with access, and then click Grant access.

  9. In the New principals field, enter your default Dataform service agent ID.

    Your default Dataform service agent ID is in the following format:

    service-PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com

  10. In the Select a role list, add the following roles:

    • Service Account User
    • Service Account Token Creator
  11. Click Save.
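
The service agent ID format in step 9 is easy to mistype, so it can help to build it from the project number. This is a small Python sketch under stated assumptions: the function name is hypothetical, and the numeric check reflects that Google Cloud project numbers (unlike project IDs) consist only of digits.

```python
import re

SERVICE_AGENT_DOMAIN = "gcp-sa-dataform.iam.gserviceaccount.com"


def dataform_service_agent(project_number: str) -> str:
    """Build the default Dataform service agent ID for a project number."""
    number = str(project_number).strip()
    # Guard against passing a project ID (for example "my-project") by mistake.
    if not re.fullmatch(r"[0-9]+", number):
        raise ValueError(f"project number must be numeric, got {project_number!r}")
    return f"service-{number}@{SERVICE_AGENT_DOMAIN}"


# Example with a made-up project number.
print(dataform_service_agent("123456789012"))
```

The returned string is the value to enter in the New principals field when you grant the Service Account User and Service Account Token Creator roles.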

Create a Dataform repository

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Click Create repository.

  3. On the Create repository page, do the following:

    1. In the Repository ID field, enter quickstart-production.

    2. In the Region list, select europe-west4.

    3. In the Service account field, click Enter manually, and then enter the name of your custom service account.

    4. Click Create.

  4. Click Go to repositories.
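
For orientation, the values entered above can be collected in one place. The sketch below is illustrative only: the class name and the project ID are hypothetical, and the resource name assumes the usual Google Cloud pattern of `projects/.../locations/.../repositories/...`; check the Dataform API reference before relying on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepositorySettings:
    """Values entered on the Create repository page in this quickstart."""

    project_id: str  # hypothetical; use your own project ID
    region: str = "europe-west4"
    repository_id: str = "quickstart-production"
    service_account: str = ""  # email of your custom service account

    def resource_name(self) -> str:
        # Assumed resource name pattern, not confirmed by this quickstart.
        return (
            f"projects/{self.project_id}"
            f"/locations/{self.region}"
            f"/repositories/{self.repository_id}"
        )


settings = RepositorySettings(project_id="my-project")
print(settings.resource_name())
```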

Create a release configuration and workflow configuration

To create production compilation results of the quickstart-production repository and schedule a run of production tables, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Click quickstart-production.

  3. Click Releases & scheduling, then click Create production release.

  4. In the Create release configuration pane, configure the following settings:

    1. In the Release ID field, enter production.
    2. In the Git commitish field, leave the default value main.
    3. In the Schedule frequency section, in the Repeats menu, select Custom.
    4. In the Custom schedule field, enter 0 16 * * *.
    5. In the Timezone menu, select a UTC+1 timezone, for example, Central European Standard Time (CET).

      Every day at 4 PM UTC+1, Dataform compiles the quickstart-production repository and applies the compilation settings configured in this release configuration to create production compilation results.

  5. Click Create.

    The production release configuration creates a compilation result of the entire quickstart-production repository every day at 4 PM UTC+1.

  6. Ensure that you're on the Releases & scheduling tab. Go to the Workflow configurations section and click Create.

  7. In the Create workflow configuration pane, configure the following settings:

    1. In the Configuration ID field, enter production.
    2. In the Release configuration menu, select production.
    3. In the Schedule frequency section, in the Repeats menu, select Custom.
    4. In the Custom schedule field, enter 0 17 * * *.
    5. In the Timezone menu, select a UTC+1 timezone, for example, Central European Standard Time (CET).

      Every day at 5 PM UTC+1, Dataform runs the latest production compilation result of the quickstart-production repository.

    6. Click All actions.

      Dataform runs all the workflow actions in the production compilation result.

  8. Click Create.

    The production workflow configuration runs the entire latest compilation result created by the production release configuration every day at 5 PM UTC+1.
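
To see how the two cron schedules relate, the following Python sketch computes the next compilation and the next workflow run from a given moment. It is a simplified illustration, not how Dataform evaluates schedules: the function name is hypothetical, it handles only daily schedules of the form `M H * * *`, and it models UTC+1 as a fixed offset that ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+1 offset; a real CET time zone would also observe daylight saving.
UTC_PLUS_1 = timezone(timedelta(hours=1))


def next_daily_run(cron: str, after: datetime) -> datetime:
    """Return the first run strictly after `after` for a daily 'M H * * *' schedule."""
    minute, hour, day, month, weekday = cron.split()
    if (day, month, weekday) != ("*", "*", "*"):
        raise ValueError("this sketch only handles daily schedules")
    candidate = after.replace(
        hour=int(hour), minute=int(minute), second=0, microsecond=0
    )
    if candidate <= after:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


# Example: it is 16:30 UTC+1, after today's compilation but before today's run.
now = datetime(2024, 1, 15, 16, 30, tzinfo=UTC_PLUS_1)
next_compilation = next_daily_run("0 16 * * *", now)
next_workflow_run = next_daily_run("0 17 * * *", now)
print(next_compilation.isoformat(), next_workflow_run.isoformat())
```

Because the release configuration is scheduled one hour before the workflow configuration, each 5 PM run has a compilation result from 4 PM the same day available to it.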

View past production compilation results

To view past scheduled production compilation results, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Select the quickstart-production repository.

  3. Click Releases & scheduling.

  4. In the Release configurations section, click production.

View past production workflow runs

To view past production workflow runs, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Select the quickstart-production repository.

  3. Click Workflow Execution Logs.

  4. Select a workflow run to see more detailed information, including the status of each action and any logs.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

Delete the dataset created in BigQuery

To avoid incurring charges for BigQuery assets, follow these steps to delete the dataset called dataform_production:

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer panel, expand your project and select dataform_production.

  3. Click the Actions menu, and then select Delete.

  4. In the Delete dataset dialog, enter delete, and then click Delete.

Delete the Dataform release configuration

There are no costs associated with creating Dataform release configurations. However, if you want to delete the production release configuration, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Click quickstart-production.

  3. Click Releases & scheduling, and go to the Release configurations section.

  4. By the production release configuration, click the More menu, and then click Delete.

  5. In the Delete release configuration dialog, click Delete.

Delete the Dataform workflow configuration

To avoid incurring charges for BigQuery assets, follow these steps to delete the Dataform production workflow configuration:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Click quickstart-production.

  3. Click Releases & scheduling, and go to the Workflow configurations section.

  4. By the production workflow configuration, click the More menu, and then click Delete.

  5. In the Delete workflow configuration dialog, click Delete.

Delete the Dataform repository

There are no costs associated with creating Dataform repositories. However, if you want to delete a repository and all its contents, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. By quickstart-production, click the More menu, and then select Delete.

  3. In the Delete repository window, enter the name of the repository to confirm deletion.

  4. To confirm, click Delete.
