Use the Bigtable change stream to BigQuery template

In this quickstart, you learn how to set up a Bigtable table with a change stream enabled, run a change stream pipeline, make changes to your table, and then view the streamed changes in BigQuery.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  2. Verify that billing is enabled for your Google Cloud project.

  3. Enable the Dataflow, Cloud Bigtable, Cloud Bigtable Admin, and BigQuery APIs. (A gcloud alternative is sketched after this list.)

    Enable the APIs

  4. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell
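
If you prefer to work from the command line, you can also enable these APIs from Cloud Shell with gcloud. This is a minimal sketch; it assumes PROJECT_ID is the ID of your project and uses the standard service names for the four APIs.

    # Enable the APIs used in this quickstart.
    gcloud services enable \
        dataflow.googleapis.com \
        bigtable.googleapis.com \
        bigtableadmin.googleapis.com \
        bigquery.googleapis.com \
        --project=PROJECT_ID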

Create a BigQuery dataset

Use the Google Cloud console to create a dataset that stores the data. (A command-line alternative is sketched after these steps.)

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, click your project name.

  3. Expand the Actions option and click Create dataset.

  4. On the Create dataset page, do the following:

    1. For Dataset ID, enter bigtable_bigquery_quickstart.
    2. Leave the remaining default settings as they are, and click Create dataset.
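
If you prefer the command line, the dataset can also be created from Cloud Shell with the bq tool. A minimal sketch, assuming the default dataset location is acceptable:

    # Create the dataset that the changelog table is written to.
    bq mk --dataset PROJECT_ID:bigtable_bigquery_quickstart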

Create a table with a change stream enabled

  1. In the Google Cloud console, go to the Bigtable Instances page.

    Go to Instances

  2. Click the ID of the instance that you are using for this quickstart.

    If you don't have an instance available, create an instance with the default configurations in a region near you.

  3. In the left navigation pane, click Tables.

  4. Click Create a table.

  5. Name the table bigquery-changestream-quickstart.

  6. Add a column family named cf.

  7. Select Enable change stream.

  8. Click Create. (A gcloud equivalent of steps 4 through 8 is sketched after this list.)

  9. On the Bigtable Tables page, find your table bigquery-changestream-quickstart.

  10. In the Change stream column, click Connect.

  11. In the dialog, select BigQuery.

  12. Click Create Dataflow job.

  13. In the provided parameter fields, enter your parameter values. You don't need to provide any optional parameters.

    1. Set the Bigtable application profile ID to default.
    2. Set the BigQuery dataset to bigtable_bigquery_quickstart.
  14. Click Run job.

  15. Wait until the job status is Starting or Running before proceeding. The job takes around 5 minutes to start after it's queued.

  16. Keep the job open in a browser tab so that you can stop it when you clean up your resources.
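
If you'd rather script the table creation (steps 4 through 8), the same table can be created with gcloud. This is a sketch rather than the documented path; 7d is the maximum change stream retention period, and a shorter value also works.

    # Create the table with a cf column family and a change stream enabled.
    gcloud bigtable instances tables create bigquery-changestream-quickstart \
        --project=PROJECT_ID \
        --instance=BIGTABLE_INSTANCE_ID \
        --column-families=cf \
        --change-stream-retention-period=7d

The Connect flow in steps 10 through 14 still needs to be completed in the console to launch the Dataflow job.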

Write some data to Bigtable

  1. In Cloud Shell, write a few rows to Bigtable so that the change stream has data to write to BigQuery. As long as you write the data after the Dataflow job is created, the changes appear; you don't have to wait for the job status to become Running. (An optional read-back check is sketched after the commands.)

     cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
         set bigquery-changestream-quickstart user123 cf:col1=abc
     cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
         set bigquery-changestream-quickstart user546 cf:col1=def
     cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
         set bigquery-changestream-quickstart user789 cf:col1=ghi
    

    Replace the following:

    • PROJECT_ID: the ID of the project that you are using
    • BIGTABLE_INSTANCE_ID: the ID of the instance that contains the bigquery-changestream-quickstart table
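
    To confirm that the writes landed before you look at BigQuery, you can optionally read the rows back with cbt:

     # Optional: read the rows back to verify the writes.
     cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
         read bigquery-changestream-quickstart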

View the change logs in BigQuery

  1. In the Google Cloud console, go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, expand your project and the dataset bigtable_bigquery_quickstart.

  3. Click the table bigquery-changestream-quickstart_changelog.

  4. To see the change log, click Preview. Alternatively, you can query the table from Cloud Shell, as sketched after these steps.

    Change log preview in BigQuery
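
If you prefer Cloud Shell over the Preview tab, you can also query the changelog table with bq. A minimal sketch, assuming the default changelog table name shown above:

    # Query the changelog table from Cloud Shell.
    bq query --use_legacy_sql=false \
        'SELECT * FROM `PROJECT_ID.bigtable_bigquery_quickstart.bigquery-changestream-quickstart_changelog` LIMIT 10'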

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

  1. Disable the change stream on the table:

     gcloud bigtable instances tables update bigquery-changestream-quickstart \
         --project=PROJECT_ID --instance=BIGTABLE_INSTANCE_ID \
         --clear-change-stream-retention-period
    
  2. Delete the table bigquery-changestream-quickstart:

     cbt --instance=BIGTABLE_INSTANCE_ID --project=PROJECT_ID \
         deletetable bigquery-changestream-quickstart
    
  3. Stop the change stream pipeline (a Cloud Shell alternative to this step and the next is sketched after this list):

    1. In the Google Cloud console, go to the Dataflow Jobs page.

      Go to Jobs

    2. Select your streaming job from the job list.

    3. In the navigation, click Stop.

    4. In the Stop job dialog, select Cancel, and then click Stop job.

  4. Delete the BigQuery dataset:

    1. In the Google Cloud console, go to the BigQuery page.

      Go to BigQuery

    2. In the Explorer panel, find the dataset bigtable_bigquery_quickstart and click it.

    3. Click Delete, type delete, and then click Delete to confirm.

  5. Optional: Delete the instance if you created a new one for this quickstart:

     cbt deleteinstance BIGTABLE_INSTANCE_ID
     
    
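If you prefer to stay in Cloud Shell, steps 3 and 4 can also be done from the command line. This is a sketch; JOB_ID and REGION are placeholders for the Dataflow job you started earlier, and bq rm asks for confirmation before deleting the dataset.

    # Cancel the streaming pipeline (step 3).
    gcloud dataflow jobs cancel JOB_ID --project=PROJECT_ID --region=REGION

    # Delete the dataset and the changelog table it contains (step 4).
    bq rm -r -d PROJECT_ID:bigtable_bigquery_quickstart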

What's next
