Query a Bigtable change log in BigQuery

This page provides guidance and example queries to help you process a Bigtable change log in BigQuery.

This page is intended for users who have completed the following:

This guide assumes some knowledge of BigQuery. To learn more, you can follow the quickstart that shows how to load and query data .

Open the change log table

  1. In the Google Cloud console, go to the BigQuerypage.

    Go to BigQuery

  2. In the Explorerpane, expand your project.

  3. Expand your dataset.

  4. Click the table with the suffix: _changelog .

Table format

The entire output schema contains several columns. This guide focuses on connecting the rows to their columns and values, and parsing values into analyzable formats.

Basic queries

The examples in this section use a Bigtable table for tracking credit card sales. The table has one column family ( cf ) and the following columns:

  • Row key with the format credit card number # transaction timestamp
  • Merchant
  • Amount
  • Category
  • Transaction date

Query one column

Filter the results to just one column family and one column using a WHERE clause.

  SELECT 
  
 row_key 
 , 
  
 column_family 
 , 
  
 column 
 , 
  
 value 
 , 
  
 timestamp 
 , 
 FROM 
  
 your_dataset 
 . 
 your_table 
 WHERE 
  
 mod_type 
 = 
 "SET_CELL" 
  
 AND 
  
 column_family 
 = 
 "cf" 
  
 AND 
  
 column 
 = 
 "merchant" 
 LIMIT 
  
 1000 
 

Parse values

All values are stored as strings or byte strings. You can cast a value to its intended type with conversion functions .

  SELECT 
  
 row_key 
 , 
  
 column_family 
 , 
  
 column 
 , 
  
 value 
 , 
  
 CAST 
 ( 
 value 
  
 AS 
  
 NUMERIC 
 ) 
  
 AS 
  
 amount 
 FROM 
  
 your_dataset 
 . 
 your_table 
 WHERE 
  
 mod_type 
 = 
 "SET_CELL" 
  
 AND 
  
 column_family 
 = 
 "cf" 
  
 AND 
  
 column 
 = 
 "amount" 
 LIMIT 
  
 1000 
 

Perform aggregations

You can perform more operations, such as aggregations on numeric values.

  SELECT 
  
 SUM 
 ( 
 CAST 
 ( 
 value 
  
 AS 
  
 NUMERIC 
 )) 
  
 as 
  
 total_amount 
 FROM 
  
 your_dataset 
 . 
 your_table 
 WHERE 
  
 mod_type 
 = 
 "SET_CELL" 
  
 AND 
  
 column_family 
 = 
 "cf" 
  
 AND 
  
 column 
 = 
 "amount" 
 

Pivot the data

To perform queries that involve multiple Bigtable columns, you need to pivot the table. Each new BigQuery row includes one data change record returned by the change stream from its corresponding row in your Bigtable table. Depending on your schema, you can use a combination of the row key and timestamp to group the data.

  SELECT 
  
 * 
  
 FROM 
  
 ( 
  
 SELECT 
  
 row_key 
 , 
  
 timestamp 
 , 
  
 column 
 , 
  
 value 
  
 FROM 
  
 your_dataset 
 . 
 your_table 
 ) 
 PIVOT 
  
 ( 
  
 MAX 
 ( 
 value 
 ) 
  
 FOR 
  
 column 
  
 in 
  
 ( 
 "merchant" 
 , 
  
 "amount" 
 , 
  
 "category" 
 , 
  
 "transaction_date" 
 ) 
 ) 
 

Pivoting with a dynamic columnset

If you have a dynamic set of columns, you can do some additional processing to get all the columns and put them into the query programmatically.

  DECLARE 
  
 cols 
  
 STRING 
 ; 
 SET 
  
 cols 
  
 = 
  
 ( 
  
 SELECT 
  
 CONCAT 
 ( 
 '("' 
 , 
  
 STRING_AGG 
 ( 
 DISTINCT 
  
 column 
 , 
  
 '", "' 
 ), 
  
 '")' 
 ), 
  
 FROM 
  
 your_dataset 
 . 
 your_table 
 ); 
 EXECUTE 
  
 IMMEDIATE 
  
 format 
 ( 
 " 
 "" 
 SELECT * FROM ( 
 SELECT row_key, timestamp, column, value 
 FROM your_dataset.your_table 
 ) 
 PIVOT ( 
 MAX(value) 
 FOR column in %s 
 ) 
 "" 
 " 
 , 
  
 cols 
 ); 
 

JSON Data

If you are setting all values with JSON, you need to parse them and extract the values based on the keys. You can use parsing functions once you have derived the value from the JSON object. These examples use the credit card sales data introduced earlier , but instead of writing data to multiple columns, the data is written as a single column as a JSON object.

  SELECT 
  
 row_key 
 , 
  
 JSON_VALUE 
 ( 
 value 
 , 
  
 "$.category" 
 ) 
  
 as 
  
 category 
 , 
  
 CAST 
 ( 
 JSON_VALUE 
 ( 
 value 
 , 
  
 "$.amount" 
 ) 
  
 AS 
  
 NUMERIC 
 ) 
  
 as 
  
 amount 
 FROM 
  
 your_dataset 
 . 
 your_table 
 LIMIT 
  
 1000 
 

Aggregation queries with JSON

You can perform aggregation queries with JSON values.

  SELECT 
  
 JSON_VALUE 
 ( 
 value 
 , 
  
 "$.category" 
 ) 
  
 as 
  
 category 
 , 
  
 SUM 
 ( 
 CAST 
 ( 
 JSON_VALUE 
 ( 
 value 
 , 
  
 "$.amount" 
 ) 
  
 AS 
  
 NUMERIC 
 )) 
  
 as 
  
 total_amount 
 FROM 
  
 your_dataset 
 . 
 your_table 
 GROUP 
  
 BY 
  
 category 
 

What's next

Create a Mobile Website
View Site in Mobile | Classic
Share by: