Connect to a third-party Git repository

This document shows you how to connect a remote repository to a Dataform repository. After you connect the repositories, the changes you make in a Dataform development workspace can be pushed to and pulled from the remote Git repository.

You can connect a remote repository through Developer Connect, SSH, or HTTPS.

The following table lists supported Git providers and connection methods that are available for their repositories:

Git provider             Connection method
Azure DevOps Services    SSH
Bitbucket                Developer Connect (recommended), SSH
GitHub                   Developer Connect (recommended), SSH, or HTTPS
GitLab                   Developer Connect (recommended), SSH, or HTTPS

Before you begin

  1. If your organization or project restricts remote Git repositories with the dataform.restrictGitRemotes Organization Policy, ensure that the remote Git repository is added to the allowlist in the policy before you create the Dataform repository that you want to connect to a remote repository. For more information, see Restrict remote repositories.

  2. Select or create a Dataform repository. You need the repository later to share a secret with your default Dataform service agent.

  3. Ensure that you have the necessary permissions in your Git provider and that your Dataform repository is in a supported region.

  4. Enable the Developer Connect API.

  5. If you're connecting Developer Connect to a Bitbucket repository, ensure that your authorizer access token has write access to your repositories so that Dataform can push changes. For more information, see Create access tokens.

Required roles

To get the permissions that you need to connect a Dataform repository to a remote Git repository, ask your administrator to grant you the following IAM roles on repositories:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

To connect a remote repository using Developer Connect, grant the Developer Connect Token Accessor (roles/developerconnect.tokenAccessor) and Developer Connect Git Proxy User (roles/developerconnect.gitProxyUser) roles to the default Dataform service agent. Your default Dataform service agent ID is in the following format:

 service-PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com
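
As a rough sketch of the grant described above, the following shell snippet builds the service agent address from a project number and prints the two `gcloud projects add-iam-policy-binding` commands for review instead of running them. The project ID `my-project`, the project number `123456789012`, and both helper function names are hypothetical. Granting the roles at the project level is also an assumption; your organization might scope them differently.

```shell
#!/bin/sh
# Build the default Dataform service agent address from a project number.
dataform_service_agent() {
  printf 'service-%s@gcp-sa-dataform.iam.gserviceaccount.com\n' "$1"
}

# Print (do not run) the IAM binding commands for the two Developer Connect roles.
# Arguments: PROJECT_ID PROJECT_NUMBER (placeholders in the example call below).
print_developer_connect_grants() {
  agent=$(dataform_service_agent "$2")
  for role in roles/developerconnect.tokenAccessor roles/developerconnect.gitProxyUser; do
    printf 'gcloud projects add-iam-policy-binding %s --member="serviceAccount:%s" --role="%s"\n' \
      "$1" "$agent" "$role"
  done
}

print_developer_connect_grants my-project 123456789012
```

Printing the commands first lets you check the member and role values before you apply them.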

Connect a remote repository using Developer Connect

Developer Connect streamlines the integration of external Git providers with Dataform repositories by providing a guided interface that removes the need for manual secrets management.

By using Developer Connect, you can connect to remote repositories in privately hosted networks—such as on-premises environments or virtual private clouds (VPCs)—without exposing them to the public internet.

Prepare for private network connections

If you're connecting to a remote repository in a private network, ensure you have the following information and resources ready before starting the setup:

  • The namespace and service name of the Service Directory service that points to your internal Git host.
  • A certification authority certificate in PEM format (with a maximum size of 10 KB) if your host uses a private or self-signed certificate authority.
  • Verification that your Git host's SSL certificate includes a subject alternative name (SAN) that matches the host URI.
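
Before you start the setup, you can check the certificate file locally. The following is a minimal sketch, not an official validation tool: the function name `check_ca_certificate` is hypothetical, and it assumes that "10 KB" means 10 * 1024 bytes. It checks only the file size and the presence of a PEM certificate marker. It does not verify the subject alternative name on your Git host's certificate.

```shell
#!/bin/sh
# Assumption: "10 KB" is read as 10 * 1024 bytes; adjust if the limit is decimal.
MAX_CA_CERT_BYTES=10240

# Succeeds when FILE exists, fits within the size limit, and contains a PEM
# certificate marker line.
check_ca_certificate() {
  file=$1
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  size=$(wc -c < "$file" | tr -d ' ')
  if [ "$size" -gt "$MAX_CA_CERT_BYTES" ]; then
    echo "too large: $size bytes" >&2
    return 1
  fi
  grep -q -- '-----BEGIN CERTIFICATE-----' "$file" || {
    echo "not PEM: $file" >&2
    return 1
  }
}
```

For example, `check_ca_certificate ./ca.pem && echo ok`, where `./ca.pem` is a placeholder path.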

For more information on configuring connections for privately hosted networks, see the following:

Create a new Developer Connect connection

To connect a remote repository to a new Developer Connect connection, select one of the following options:

Bitbucket

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Select your repository.

  3. Go to Settings, and then click Connect with Git.

  4. In the Link to remote repository pane, in the Remote Git repository protocol section, select Developer Connect.

  5. In the repository selection menu, click Link new repository.

  6. In the Link Git repositories via Developer Connect pane, select Create new connection.

  7. Select Bitbucket as your provider.

  8. Specify the connection region and connection name.

  9. Provide the workspace name, authorizer access token, and read access token.

  10. Click Continue, then select the remote repositories to link and click Link.

  11. In Dataform, select your remote repository and the default branch.

  12. To complete the setup, click Link.

GitHub

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Select your repository.

  3. Go to Settings, and then click Connect with Git.

  4. In the Link to remote repository pane, in the Remote Git repository protocol section, select Developer Connect.

  5. In the repository selection menu, click Link new repository.

  6. In the Link Git repositories via Developer Connect pane, select Create new connection.

  7. Select GitHub as your provider.

  8. Specify the connection region and connection name.

  9. To trigger the OAuth authentication flow, click Continue, and then do the following:

    1. Click I understand and continue.
    2. Select Install the GitHub App on another GitHub account and follow the prompts to authorize access to your GitHub account and specific repositories.
    3. Select the account on which you want to install the Dataform GitHub application.
    4. In the Repository access section, select whether you want to give access to all repositories or only a selection of repositories.
    5. Click Save.
  10. Select the remote repositories to link and click Link.

  11. In Dataform, select your remote repository and the default branch.

  12. To complete the setup, click Link.

GitLab

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Select your repository.

  3. Go to Settings, and then click Connect with Git.

  4. In the Link to remote repository pane, in the Remote Git repository protocol section, select Developer Connect.

  5. In the repository selection menu, click Link new repository.

  6. In the Link Git repositories via Developer Connect pane, select Create new connection.

  7. Select GitLab as your provider.

  8. Specify the connection region and connection name.

  9. Provide the API access token and the read API access token. For information on how to do this, see Create access tokens.

  10. Click Continue, then select the remote repositories to link and click Link.

  11. In Dataform, select your remote repository and the default branch.

  12. To complete the setup, click Link.

Use an existing Developer Connect connection

To connect a remote repository to an existing Developer Connect connection, do the following:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Select your repository.

  3. Go to Settings, and then click Connect with Git.

  4. In the Link to remote repository pane, in the Remote Git repository protocol section, select Developer Connect.

  5. In the repository selection menu, select a repository that belongs to an existing Developer Connect connection.

  6. In Dataform, select your remote repository and the default branch.

  7. To complete the setup, click Link.

Connect a remote repository through SSH

To connect a remote repository through SSH, you need to generate an SSH key and a Secret Manager secret. The SSH key consists of a public SSH key and a private SSH key. You need to share the public SSH key with your Git provider, and create a Secret Manager secret with the private SSH key. Then, share the secret with your default Dataform service agent.

Dataform uses the secret with the private SSH key to sign in to your Git provider to commit changes on behalf of the developers. Dataform makes these commits using the developer's Google Cloud email address so you can tell who made each commit.
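
The key and secret preparation described above could look roughly like the following sketch. It runs in dry-run mode by default, so it only prints the commands. The project number, secret name, and key path are hypothetical placeholders, and the `run` and `setup_ssh_secret` helpers are illustrative names. The Ed25519 key type and the empty passphrase are assumptions of this sketch; check which key types your Git provider accepts, because some providers might accept only RSA keys.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Placeholder values (hypothetical): replace them with your own.
PROJECT_NUMBER=${PROJECT_NUMBER:-123456789012}
SECRET_NAME=${SECRET_NAME:-dataform-git-ssh-key}
KEY_FILE=${KEY_FILE:-./dataform_git_key}
KEY_TYPE=${KEY_TYPE:-ed25519}   # assumption: your provider accepts this key type

setup_ssh_secret() {
  agent="service-${PROJECT_NUMBER}@gcp-sa-dataform.iam.gserviceaccount.com"

  # 1. Generate the key pair (the empty passphrase is an assumption of this sketch).
  run ssh-keygen -t "$KEY_TYPE" -N "" -C dataform -f "$KEY_FILE"

  # 2. Store the private key as the value of a new Secret Manager secret.
  run gcloud secrets create "$SECRET_NAME" --data-file="$KEY_FILE"

  # 3. Let the default Dataform service agent read the secret.
  run gcloud secrets add-iam-policy-binding "$SECRET_NAME" \
    --member="serviceAccount:${agent}" \
    --role="roles/secretmanager.secretAccessor"

  # The public key to upload to your Git provider is in "$KEY_FILE.pub".
}

setup_ssh_secret
```

In dry-run mode the empty passphrase argument is not visible in the printed `ssh-keygen` line, so treat the printed output as a preview, not as text to copy and paste.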

To connect a remote repository to a Dataform repository through SSH, follow these steps:

  1. In your Git provider, do one of the following:

    Azure DevOps Services

    1. In Azure DevOps Services, create a private SSH key.
    2. Upload the public SSH key to your Azure DevOps Services repository.

    Bitbucket

    1. In Bitbucket, create a private SSH key.
    2. Upload the public SSH key to your Bitbucket repository.

    GitHub

    1. In GitHub, create a private SSH key.
    2. Upload the GitHub public SSH key to your GitHub repository.

    GitLab

    1. In GitLab, create a private SSH key.
    2. Upload the GitLab public SSH key to your GitLab repository.

  2. In Secret Manager, create a secret and set your private SSH key as the secret value.

    1. Grant access to the secret to your default Dataform service agent.

      Your default Dataform service agent is in the following format:

       service-PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com

    2. Grant the roles/secretmanager.secretAccessor role to the service agent or service account.

  3. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  4. Select the Dataform repository that you want to connect to the remote repository.

  5. On the repository page, click Settings > Connect with Git.

  6. In the Link to remote repository pane, in the Remote Git repository URL field, enter the URL of the remote Git repository, ending with .git.

    The URL of the remote Git repository must be in one of the following formats:

    • Absolute URL: ssh://git@{host_name}[:{port}]/{repository_path} (the port is optional).
    • SCP-like URL: git@{host_name}:{repository_path}

  7. In the Default remote branch name field, enter the name of the main development branch of the remote Git repository.

  8. In the Secret drop-down, select your secret for the remote Git repository.

  9. In the SSH public host key value field, enter the public host key of your Git provider.

    Azure DevOps Services

    1. To retrieve the Azure DevOps Services public host key, run the following command in the terminal:

       ssh-keyscan -t rsa ssh.dev.azure.com 
      
    2. Copy one of the keys from the output, omitting ssh.dev.azure.com from the beginning of the line. The value that you copy must be in the following format:

       ALGORITHM BASE64_KEY_VALUE

      For example:

       ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7Hr1oTWqNqOlzGJOfGJ4NakVyIzf1rXYd4d7wo6jBlkLvCA4odBlL0mDUyZ0/QUfTTqeu+tm22gOsv+VrVTMk6vwRU75gY/y9ut5Mb3bR5BV58dKXyq9A9UeB5Cakehn5Zgm6x1mKoVyf+FFn26iYqXJRgzIZZcZ5V6hrE0Qg39kZm4az48o0AUbf6Sp4SLdvnuMa2sVNwHBboS7EJkm57XQPVU3/QpyNLHbWDdzwtrlS+ez30S3AdYhLKEOxAG8weOnyrtLJAUen9mTkol8oII1edf7mWWbWVf0nBmly21+nZcmCTISQBtdcyPaEno7fFQMDD26/s0lfKob4Kw8H 
      

      Verify that this key is still up to date with Azure DevOps Services.

    Bitbucket

    1. To retrieve the Bitbucket public host key, run the following command in the terminal:

       curl https://bitbucket.org/site/ssh 
      
    2. The command returns a list of public host keys. Choose one of the keys from the list, and copy it, omitting bitbucket.org from the beginning of the line. The value that you copy must be in the following format:

       ALGORITHM BASE64_KEY_VALUE

      For example:

       ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIazEu89wgQZ4bqs3d63QSMzYVa0MuJ2e2gKTKqu+UUO 
      

      Verify that this key is still up to date with Bitbucket.

    GitHub

    1. To retrieve the GitHub public host key, see GitHub's SSH key fingerprints.
    2. The page contains a list of public host keys. Choose one of them, and copy it, omitting github.com from the beginning of the line. The value that you copy must be in the following format:

       ALGORITHM BASE64_KEY_VALUE

      For example:

       ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOMqqnkVzrm0SdG6UOoqKLsabgH5C9okWi0dh2l9GKJl 
      

      Verify that this key is still up to date with GitHub.

    GitLab

    1. To retrieve the GitLab public host key, see SSH known_hosts entries.
    2. The page contains a list of public host keys. Choose one of them, and copy it, omitting gitlab.com from the beginning of the line. The value that you copy must be in the following format:

       ALGORITHM BASE64_KEY_VALUE

      For example:

       ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAfuCHKVTjquxvt6CM6tdG4SLp1Btn/nOeHHE5UOzRdf 
      

      Verify that this key is still up to date with GitLab.

  10. Click Link.
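
Two of the steps above involve exact string formats: the remote URL and the host key value. The following sketch shows one way to check and prepare those values in a terminal. Both function names are hypothetical, and the regular expressions are an approximation of the two URL formats listed above, including the `.git` suffix that the instructions mention. They are not the validation that Dataform itself performs.

```shell
#!/bin/sh
# Succeeds when URL matches one of the two documented SSH forms and ends in .git.
is_valid_ssh_remote() {
  url=$1
  # Absolute form: ssh://git@{host_name}[:{port}]/{repository_path}
  absolute='^ssh://git@[^/:@]+(:[0-9]+)?/.+\.git$'
  # SCP-like form: git@{host_name}:{repository_path}
  scp_like='^git@[^/:@]+:.+\.git$'
  printf '%s\n' "$url" | grep -Eq -e "$absolute" -e "$scp_like"
}

# Read known_hosts-style lines on stdin and drop the leading host field,
# leaving "ALGORITHM BASE64_KEY_VALUE". Comment lines (starting with #) are skipped.
strip_host_from_key_line() {
  awk '!/^#/ && NF >= 3 { print $2, $3 }'
}
```

For example, `ssh-keyscan -t rsa ssh.dev.azure.com | strip_host_from_key_line` prints the key lines without the host name, ready to compare against your provider's published keys.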

Connect a remote repository through HTTPS

To connect a remote repository through HTTPS, you need to create a Secret Manager secret with a personal access token, and share the secret with your default Dataform service agent.

Dataform then uses the access token to sign in to your Git provider to commit changes on behalf of the developers. Dataform makes these commits using the developer's Google Cloud email address so you can tell who made each commit.

To connect a remote repository to a Dataform repository through HTTPS, follow these steps:

  1. In your Git provider, do the following:

    GitHub

    1. In GitHub, create a fine-grained personal access token or a classic personal access token.

      • For a fine-grained GitHub personal access token, do the following:

        1. Select repository access to only selected repositories, then select the repository that you want to connect to.
        2. Grant read and write access on contents of the repository.
        3. Set a token expiration time appropriate to your needs.

      • For a classic GitHub personal access token, do the following:

        1. Grant Dataform the repo permission.
        2. Set a token expiration time appropriate to your needs.

    2. If your organization uses SAML single sign-on (SSO), authorize the token.

    GitLab

    1. In GitLab, create a GitLab personal access token .

    2. Name the token dataform.

      The GitLab personal access token must be named dataform.

    3. Grant Dataform the api, read_repository, and write_repository permissions.

    4. Set a token expiration time appropriate to your needs.

  2. In Secret Manager, create a secret containing the personal access token of your remote repository.

  3. Grant access to the secret to your default Dataform service agent.

    Your default Dataform service agent is in the following format:

     service-PROJECT_NUMBER@gcp-sa-dataform.iam.gserviceaccount.com

    1. Grant the roles/secretmanager.secretAccessor role to the service agent.

  4. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  5. Select the Dataform repository that you want to connect to the remote repository.

  6. On the repository page, click Settings > Connect with Git.

  7. In the Link to remote repository pane, in the Remote Git repository URL field, enter the URL of the remote Git repository, ending with .git.

    The URL of the remote Git repository cannot contain usernames or passwords.

  8. In the Default remote branch name field, enter the name of the main development branch of the remote Git repository.

  9. In the Secret drop-down, select your secret for the remote Git repository.

  10. Click Link.
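
The URL rules in the steps above (an HTTPS URL that ends with `.git` and contains no username or password) can be checked before you paste the value into the console. This is a minimal sketch with a hypothetical function name and example URLs. The pattern is an approximation and is not the check that Dataform itself applies.

```shell
#!/bin/sh
# Succeeds when URL is an HTTPS remote that ends in .git and carries no
# credentials (no "user@" or "user:password@" before the host).
is_valid_https_remote() {
  printf '%s\n' "$1" | grep -Eq '^https://[^/@]+/.+\.git$'
}

# Example: report which of these hypothetical URLs pass the check.
for url in \
  'https://github.com/example-org/example-repo.git' \
  'https://user:token@github.com/example-org/example-repo.git'
do
  if is_valid_https_remote "$url"; then
    echo "accepted: $url"
  else
    echo "rejected: $url"
  fi
done
```

Keeping credentials out of the URL matters here because the token is supplied through the Secret Manager secret instead.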

Edit the remote repository connection

To edit a connection between a Dataform repository and a remote Git repository, follow these steps:

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Click the repository that you want to edit.

  3. On the repository page, click Settings > Edit Git connection.

  4. In the Link to remote repository pane, edit the connection settings.

  5. Click Update.

What's next
