Create a change stream-enabled table and capture changes
Learn how to set up a Bigtable table with a change stream enabled, run a change stream pipeline, make changes to your table, and then see the changes streamed.
Before you begin
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  Roles required to select or create a project
  - Select a project: Selecting a project doesn't require a specific IAM role. You can select any project that you've been granted a role on.
  - Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Dataflow, Cloud Bigtable, and Cloud Bigtable Admin APIs.
  Roles required to enable APIs
  To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
- In the Google Cloud console, activate Cloud Shell.
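The APIs can also be enabled from Cloud Shell instead of the console. The following is a minimal sketch rather than part of the quickstart itself. It assumes the usual service names for the three APIs, and `enable_quickstart_apis` is a hypothetical helper name. The function is only defined here, so nothing changes in your project until you call it.

```shell
# Service names assumed for the Dataflow, Bigtable, and Bigtable Admin APIs.
QUICKSTART_SERVICES="dataflow.googleapis.com bigtable.googleapis.com bigtableadmin.googleapis.com"

# Hypothetical helper: enables the three APIs in the project passed as $1.
enable_quickstart_apis() {
  # $QUICKSTART_SERVICES is left unquoted on purpose so it splits into three arguments.
  gcloud services enable $QUICKSTART_SERVICES --project="$1"
}
```

To use it, run `enable_quickstart_apis PROJECT_ID` with your own project ID.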
Create a table with a change stream enabled
- In the Google Cloud console, go to the Bigtable Instances page.
- Click the ID of the instance that you are using for this quickstart.
  If you don't have an instance available, create an instance with the default configurations in a region near you.
- In the left navigation pane, click Tables.
- Click Create a table.
- Name the table change-streams-quickstart.
- Add a column family named cf.
- Select Enable change stream.
- Click Create.
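The console steps above can be approximated from the command line. This is a sketch under stated assumptions, not the documented path for this quickstart: `create_quickstart_table` is a hypothetical helper name, and the `1d` retention period is an arbitrary choice (the console may apply a different default). The function is defined but not invoked.

```shell
# Hypothetical helper: creates the quickstart table, with a change stream,
# in the Bigtable instance whose ID is passed as $1.
create_quickstart_table() {
  # Setting a retention period is what enables the change stream.
  # 1d is an assumed value; pick the retention that suits your use case.
  gcloud bigtable instances tables create change-streams-quickstart \
    --instance="$1" \
    --column-families=cf \
    --change-stream-retention-period=1d
}
```

Call it as `create_quickstart_table BIGTABLE_INSTANCE_ID`. The table name and column family match the ones used in the console steps, so the rest of the quickstart works unchanged.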
Initialize a data pipeline to capture the change stream
- In Cloud Shell, run the following commands to download the sample code and start the pipeline.

    git clone https://github.com/GoogleCloudPlatform/java-docs-samples.git
    cd java-docs-samples/bigtable/beam/change-streams
    mvn compile exec:java -Dexec.mainClass=ChangeStreamsHelloWorld \
    "-Dexec.args=--project=PROJECT_ID --bigtableProjectId=PROJECT_ID \
    --bigtableInstanceId=BIGTABLE_INSTANCE_ID --bigtableTableId=change-streams-quickstart \
    --runner=dataflow --region=BIGTABLE_REGION --experiments=use_runner_v2"

  Replace the following:
  - PROJECT_ID: the ID of the project that you are using
  - BIGTABLE_INSTANCE_ID: the ID of the instance that contains the change-streams-quickstart table
  - BIGTABLE_REGION: the region that your Bigtable instance is in, such as us-east5
- In the Google Cloud console, go to the Dataflow page.
- Click the job with a name that begins with changestreamquickstart.
- At the bottom of the screen, click Show to open the logs panel.
- Click Worker logs to monitor the output of the change stream.
- In Cloud Shell, write some data to Bigtable so that you can see the change stream process the changes.

    cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID \
    import change-streams-quickstart quickstart-data.csv column-family=cf

- In the Google Cloud console, make sure that Severity is set to at least Info.
- The worker logs contain output like this:

    Change captured: user123#2023,USER,SetCell,cf,col1,abc
    Change captured: user546#2023,USER,SetCell,cf,col1,def
    Change captured: user789#2023,USER,SetCell,cf,col1,ghi
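Each log line carries one change as a comma-separated record. As an illustration, the sketch below splits such a line into fields. The field meanings (row key, change source, mutation type, column family, column qualifier, value) are inferred from the sample output above rather than stated by the quickstart, and `parse_change` is a hypothetical helper name. It also assumes that no field contains a comma.

```shell
# Hypothetical helper: splits one "Change captured: ..." log line into fields.
# The body runs in a subshell (parentheses), so the IFS and globbing changes
# do not leak into the calling shell.
parse_change() (
  payload=${1#"Change captured: "}   # drop the log prefix
  set -f                             # keep characters such as * literal
  IFS=,
  set -- $payload                    # $1..$6 now hold the six fields
  # Assumed layout: row key, source, mutation type, family, qualifier, value.
  printf 'row=%s family=%s qualifier=%s value=%s\n' "$1" "$4" "$5" "$6"
)

parse_change 'Change captured: user123#2023,USER,SetCell,cf,col1,abc'
# prints: row=user123#2023 family=cf qualifier=col1 value=abc
```

A value that itself contains a comma would be split incorrectly, so treat this as a quick inspection aid rather than a robust parser.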
Clean up
To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.
- Disable the change stream on the table:

    gcloud bigtable instances tables update change-streams-quickstart --instance=BIGTABLE_INSTANCE_ID \
    --clear-change-stream-retention-period

- Delete the table change-streams-quickstart:

    cbt -instance=BIGTABLE_INSTANCE_ID -project=PROJECT_ID deletetable change-streams-quickstart

- Stop the change stream pipeline:
  - In the Google Cloud console, go to the Dataflow Jobs page.
  - Select your streaming job from the job list.
  - In the navigation, click Stop.
  - In the Stop job dialog, cancel your pipeline, and then click Stop job.
- Optional: Delete the instance if you created a new one for this quickstart:

    cbt deleteinstance BIGTABLE_INSTANCE_ID
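The pipeline can also be stopped from Cloud Shell instead of the Dataflow Jobs page. The following is a sketch, not part of the original steps: `list_active_jobs` and `cancel_job` are hypothetical helper names, and the job ID must be taken from the list output. Both functions are defined without being called.

```shell
# Hypothetical helper: lists active Dataflow jobs in the region passed as $1,
# so that you can find the ID of the changestreamquickstart job.
list_active_jobs() {
  gcloud dataflow jobs list --region="$1" --status=active
}

# Hypothetical helper: cancels the job whose ID is $1 in the region passed as $2.
cancel_job() {
  gcloud dataflow jobs cancel "$1" --region="$2"
}
```

For example, run `list_active_jobs BIGTABLE_REGION`, copy the job ID for the quickstart pipeline, and then run `cancel_job JOB_ID BIGTABLE_REGION`. Here JOB_ID is a placeholder for the ID you copied.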

