Export data to Bigtable (reverse ETL)

This document describes how you can set up reverse ETL (RETL) from BigQuery to Bigtable. You can do this by using the EXPORT DATA statement to export data from a BigQuery table to a Bigtable table.
You can use a RETL workflow to Bigtable to combine BigQuery's analytics capabilities with Bigtable's low latency and high throughput. This workflow lets you serve data to application users without exhausting BigQuery quotas and limits.
Characteristics of Bigtable tables
Bigtable tables are different from BigQuery tables in several ways:
- Both Bigtable and BigQuery tables are made of rows, but a Bigtable row is made of a row key and column families, each containing an arbitrary number of columns.
- Column families for a given table are created at table creation time but can also be added or removed later. When a column family is created, the columns that belong to it don't need to be specified.
- Bigtable columns don't need to be defined ahead of time, and they can be used to store data in their name (also known as a qualifier) within the data size limits within tables.
- Bigtable columns can have any binary value within the data size limits within tables.
- Bigtable columns always have a temporal dimension (also known as version). Any number of values can be stored in a row for the same column, as long as their timestamps are not identical.
- A Bigtable timestamp is measured in microseconds since Unix epoch time; for example, 0 represents 1970-01-01T00:00:00 UTC. Timestamps must be a non-negative number of microseconds with millisecond granularity (only multiples of 1000us are accepted). The default Bigtable timestamp is 0. For a way to truncate BigQuery timestamps to this granularity, see the sketch after this list.
- Data in Bigtable is read by row key, multiple row keys, a range of row keys, or by using a filter. At least one row key or row key range is required in all types of read requests except for a full table scan.
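For example, here is a minimal sketch (assuming a hypothetical my_dataset.events table with a TIMESTAMP column event_ts) that truncates timestamps to the millisecond granularity that Bigtable accepts:

```sql
-- Truncate a TIMESTAMP to millisecond granularity (multiples of 1000us),
-- the granularity that Bigtable accepts. Table and column are hypothetical.
SELECT
  TIMESTAMP_TRUNC(event_ts, MILLISECOND) AS event_ts_ms
FROM my_dataset.events;
```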
For information about preparing BigQuery results for export to Bigtable, see Prepare query results for export.
Before you begin
You must create a Bigtable instance and a Bigtable table to receive the exported data.
Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.
Required roles
To get the permissions that you need to export BigQuery data to Bigtable, ask your administrator to grant you the following IAM roles on your project:
- Export data from a BigQuery table: BigQuery Data Viewer (roles/bigquery.dataViewer)
- Run an extract job: BigQuery User (roles/bigquery.user)
- Write data to a Bigtable table: Bigtable User (roles/bigtable.user)
- Auto-create new column families for a Bigtable table: Bigtable Administrator (roles/bigtable.admin)
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
Limitations
- Encoding is limited to BINARY and TEXT only.
- The destination Bigtable app profile must be configured with single-cluster routing and a low request priority level.
- The Bigtable app profile must be configured to route data to a Bigtable cluster colocated with the BigQuery dataset. For more information, see Location considerations.
- Exports to Bigtable are only supported for the BigQuery Enterprise or Enterprise Plus editions. The BigQuery Standard edition and on-demand compute are not supported.
- Exports to Bigtable are only supported for reservations with a QUERY assignment.
Location considerations
- If your BigQuery dataset is in a multi-region, your Bigtable app profile must be configured to route data to a Bigtable cluster within that multi-region. For example, if your BigQuery dataset is in the US multi-region, the Bigtable cluster can be located in the us-west1 (Oregon) region, which is within the United States.
- If your BigQuery dataset is in a single region, your Bigtable app profile must be configured to route data to a Bigtable cluster in the same region. For example, if your BigQuery dataset is in the asia-northeast1 (Tokyo) region, your Bigtable cluster must also be in the asia-northeast1 (Tokyo) region.

For more information, see Bigtable locations.
Supported BigQuery types
The following types of data are supported when they're written to Bigtable:
| BigQuery type | Bigtable value written |
|---|---|
| BYTES | Exported as is. |
| STRING | Converted to BYTES. |
| INTEGER | If bigtable_options.column_families.encoding is set to BINARY, then the value is written in an 8-byte, big-endian format (most significant byte first). If bigtable_options.column_families.encoding is set to TEXT, then the value is written as a human-readable string representing a number. |
| FLOAT | Writes the value in the IEEE 754 8-byte output format. |
| BOOLEAN | If bigtable_options.column_families.encoding is set to BINARY, then the value is written as a 1-byte value (false = 0x00 or true = 0x01). If bigtable_options.column_families.encoding is set to TEXT, then the value is written as text ("true" or "false"). |
| JSON | An exported column of JSON type is interpreted as a group of columns belonging to a specific Bigtable column family. Members of the JSON object are interpreted as columns, and their values are written to Bigtable. The name of the column to be written can be adjusted using the bigtable_options configuration. For example: JSON '{"FIELD1": "VALUE1", "FIELD2": "VALUE2"}' AS MY_COLUMN_FAMILY, where the values VALUE1 and VALUE2 are written to Bigtable as columns FIELD1 and FIELD2 in the column family MY_COLUMN_FAMILY. |
| STRUCT | An exported column of STRUCT type is interpreted as a group of columns belonging to a specific Bigtable column family. Members of the struct are interpreted as columns, and their values are written to Bigtable. The name of the column to be written can be adjusted using the bigtable_options configuration. For example: STRUCT<FIELD1 STRING, FIELD2 INTEGER> AS MY_COLUMN_FAMILY, where the values of FIELD1 and FIELD2 are written to Bigtable as columns FIELD1 and FIELD2 in the column family MY_COLUMN_FAMILY. |
These supported data types are similar to those used when reading from external Bigtable tables in BigQuery.
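For example, here is a minimal sketch (hypothetical table and column names) of how a JSON column maps to a column family; the members of the JSON object become the Bigtable columns color and size in the attrs column family:

```sql
-- Sketch with hypothetical names: the JSON object's members become
-- Bigtable columns `color` and `size` in the `attrs` column family.
EXPORT DATA OPTIONS (...) AS
SELECT
  product_id AS rowkey,
  JSON '{"color": "red", "size": "XL"}' AS attrs
FROM my_dataset.products;
```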
NULL values in Bigtable

NULL values in Bigtable have the following constraints:
- Bigtable has no analog for NULL values. Exporting a NULL value for a given column family and column in Bigtable deletes the present values from a Bigtable row (a scalar example follows this list).
- If a Bigtable value with a given row key, column family, column qualifier, and timestamp doesn't exist prior to the export, the exported NULL values have no effect on the Bigtable row.
- When exporting a NULL value of the STRUCT or JSON type, all column values belonging to the corresponding column family of the affected row are deleted. You should cast the NULL value to the STRUCT or JSON type so that the SQL engine attaches the correct type to it. The following query deletes all data from column family column_family1 for a set of given row keys:

  ```sql
  EXPORT DATA OPTIONS (...) AS
  SELECT rowkey, CAST(NULL AS STRUCT<INT64>) AS column_family1
  FROM T
  ```

- Rows with NULL row keys are skipped during the export. The number of skipped rows is returned in export statistics to the caller.
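The same deletion behavior applies to standalone scalar columns. As a minimal sketch (reusing the hypothetical table T), exporting a typed scalar NULL deletes only the single cell that the standalone column maps to, which is column family column_family1 with an empty column qualifier:

```sql
-- Sketch: a typed scalar NULL deletes the cell at column family
-- `column_family1`, empty ('') qualifier, for each exported row key.
EXPORT DATA OPTIONS (...) AS
SELECT rowkey, CAST(NULL AS STRING) AS column_family1
FROM T
```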
Configure exports with bigtable_options

You can use the bigtable_options configuration during an export to bridge the differences between the BigQuery and Bigtable storage models. The configuration is expressed in the form of a JSON string, as seen in the following example:
```sql
EXPORT DATA OPTIONS (
  uri="https://bigtable.googleapis.com/projects/PROJECT_ID/instances/INSTANCE_ID/appProfiles/APP_PROFILE_ID/tables/TABLE",
  bigtable_options = """{
    "columnFamilies": [{
      "familyId": "COLUMN_FAMILY_NAME",
      "encoding": "ENCODING_VALUE",
      "columns": [
        {
          "qualifierString": "BIGTABLE_COLUMN_QUALIFIER",
          ["qualifierEncoded": "BASE_64_ENCODED_VALUE",]
          "fieldName": "BIGQUERY_RESULT_FIELD_NAME"
        }
      ]
    }]
  }"""
)
```
The following table describes the possible fields used in a bigtable_options configuration:
| Field name | Description |
|---|---|
| columnFamilies | An array of column family descriptors. |
| columnFamilies.familyId | Identifier of the Bigtable column family. |
| columnFamilies.encoding | Value can be set to BINARY or TEXT. For information about how types are encoded, see Supported BigQuery types. |
| columnFamilies.columns | An array of Bigtable column mappings. |
| columnFamilies.columns.qualifierString | Optional: a Bigtable column qualifier. Specify this value if the column qualifier has no non-UTF-8 codes. The fields qualifierString and qualifierEncoded are mutually exclusive. If neither qualifierString nor qualifierEncoded is specified, fieldName is used as the column qualifier. |
| columnFamilies.columns.qualifierEncoded | Optional: a Base64-encoded column qualifier. Similar to qualifierString, for cases where the column qualifier must have non-UTF-8 codes (see the example after this table). |
| columnFamilies.columns.fieldName | Required: the BigQuery result set field name. Can be an empty string in certain cases. For an example of how an empty fieldName value is used with fields of simple types, see Prepare query results for export. |
Prepare query results for export
To export query results to Bigtable, the results must meet the following requirements:
- The result set must contain a column rowkey of the type STRING or BYTES.
- Row keys, column qualifiers, values, and timestamps must not exceed the Bigtable data size limits within tables.
- At least one column other than rowkey must be present in the result set.
- Each result set column must be of one of the supported BigQuery types. Any unsupported column types must be converted to one of the supported types before exporting to Bigtable, as shown in the sketch after this list.
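For example, here is a minimal sketch (hypothetical my_dataset.users table with DATE and NUMERIC columns, neither of which is a supported type) that casts unsupported types before export:

```sql
-- Sketch with hypothetical names: DATE and NUMERIC aren't supported for
-- Bigtable exports, so cast them to supported types first.
EXPORT DATA OPTIONS (...) AS
SELECT
  user_id AS rowkey,
  STRUCT(
    CAST(signup_date AS STRING) AS signup_date,  -- DATE -> STRING
    CAST(balance AS FLOAT64) AS balance          -- NUMERIC -> FLOAT64
  ) AS profile                                   -- `profile` column family
FROM my_dataset.users;
```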
Bigtable doesn't require column qualifiers to be valid BigQuery column names, and Bigtable supports using any bytes. For information about overriding target column qualifiers for an export, see Configure exports with bigtable_options.

If you use exported values with Bigtable APIs, such as ReadModifyWriteRow, any numerical values must use the correct binary encoding.
By default, standalone result columns of types other than STRUCT or JSON are interpreted as values for a destination column family equal to the result column name, and a column qualifier equal to an empty string.

To demonstrate how these data types are written, consider the following SQL example, where column1 and column2 are standalone result columns:
```sql
SELECT
  x as column1,
  y as column2
FROM table
```
In this example query, SELECT x as column1 writes values to Bigtable under the column1 column family and '' (empty string) column qualifier when handling types other than JSON or STRUCT.
You can change how these types are written in an export by using the bigtable_options configuration, as seen in the following example:
```sql
EXPORT DATA OPTIONS (
  …
  bigtable_options = """{
    "columnFamilies": [{
      "familyId": "ordered_at",
      "columns": [
        {"qualifierString": "order_time", "fieldName": ""}
      ]
    }]
  }"""
) AS
SELECT
  order_id as rowkey,
  STRUCT(product, amount) AS sales_info,
  EXTRACT(MILLISECOND FROM order_timestamp AT TIME ZONE "UTC") AS ordered_at
FROM T
```
In this example, BigQuery table T contains the following row:
| order_id | order_timestamp | product | amount |
|---|---|---|---|
| 101 | 2023-03-28T10:40:54Z | Joystick | 2 |
If you use the preceding bigtable_options configuration with table T, the following data is written to Bigtable:

| rowkey | sales_info (column family) | ordered_at (column family) |
|---|---|---|
| 101 | product: Joystick, amount: 2 | order_time: 1680000054000 |

1680000054000 represents 2023-03-28T10:40:54Z in milliseconds since Unix epoch time in the UTC time zone.
Auto-create new column families

To auto-create new column families in a Bigtable table, set the auto_create_column_families option in the EXPORT DATA statement to true. This option requires the bigtable.tables.update permission, which is included in roles such as Bigtable Administrator (roles/bigtable.admin).
```sql
EXPORT DATA OPTIONS (
  uri="https://bigtable.googleapis.com/projects/PROJECT-ID/instances/INSTANCE-ID/appProfiles/APP_PROFILE_ID/tables/TABLE",
  format="CLOUD_BIGTABLE",
  auto_create_column_families = true
) AS
SELECT
  order_id as rowkey,
  STRUCT(product, amount) AS sales_info
FROM T
```
Set timestamp for all cells in a row using _CHANGE_TIMESTAMP

You can add a _CHANGE_TIMESTAMP column of the TIMESTAMP type to the result for export. Every cell written to Bigtable uses the timestamp value from the _CHANGE_TIMESTAMP column of the exported result row.

Bigtable doesn't support timestamps earlier than the Unix epoch (1970-01-01T00:00:00Z). If the _CHANGE_TIMESTAMP value is NULL, the Unix epoch time of 0 is used as the default timestamp value.
The following query writes cells for the product and amount columns with the timestamp specified in the order_timestamp column of table T.
```sql
EXPORT DATA OPTIONS (...) AS
SELECT
  rowkey,
  STRUCT(product, amount) AS sales_info,
  order_timestamp as _CHANGE_TIMESTAMP
FROM T
```
Export continuously
If you want to continually process an export query, you can configure it as a continuous query.
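The SQL itself doesn't change; continuous execution is a job-level setting (for example, the continuous query mode in query settings), not SQL syntax. A minimal sketch with hypothetical names:

```sql
-- Sketch: run this statement as a BigQuery continuous query so new rows
-- arriving in `orders` keep streaming to Bigtable. Names are hypothetical.
EXPORT DATA OPTIONS (...) AS
SELECT
  order_id AS rowkey,
  STRUCT(product, amount) AS sales_info
FROM my_dataset.orders;
```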
Export multiple results with the same rowkey value

When you export a result containing multiple rows with the same rowkey value, the values written to Bigtable end up in the same Bigtable row.
You can use this method to generate multiple versions of column values in the same row. In this example, the orders table in BigQuery contains the following data:
| id | customer | order_timestamp | amount_spent |
|---|---|---|---|
| 100 | Bob | 2023-01-01T10:10:54Z | 10.99 |
| 101 | Alice | 2023-01-02T12:10:50Z | 102.7 |
| 102 | Bob | 2023-01-04T15:17:01Z | 11.1 |
The user then executes the following EXPORT DATA statement:
```sql
EXPORT DATA OPTIONS (
  uri="https://bigtable.googleapis.com/projects/PROJECT-ID/instances/INSTANCE-ID/appProfiles/APP_PROFILE_ID/tables/TABLE",
  format="CLOUD_BIGTABLE"
) AS
SELECT
  customer as rowkey,
  STRUCT(amount_spent) as orders_column_family,
  order_timestamp as _CHANGE_TIMESTAMP
FROM orders
```
Using this statement with the BigQuery orders table results in the following data written to Bigtable:

| rowkey | orders_column_family (column family) |
|---|---|
| Alice | amount_spent: 102.7 (timestamp 2023-01-02T12:10:50Z) |
| Bob | amount_spent: 11.1 (timestamp 2023-01-04T15:17:01Z), amount_spent: 10.99 (timestamp 2023-01-01T10:10:54Z) |

Exporting to Bigtable merges new values into the table instead of replacing entire rows. If values are already present in Bigtable for a row key, then new values can partially or fully override earlier values, depending on the column family, column names, and timestamps of the cells being written.
Export multiple columns as Protocol Buffer (Protobuf) values
Protocol buffers provide a flexible and efficient mechanism for serializing structured data. Exporting as Protobuf can be beneficial given the differences in how types are handled between BigQuery and Bigtable. You can use BigQuery user-defined functions (UDFs) to export data as Protobuf binary values to Bigtable. For more information, see Export data as Protobuf columns.
Export optimization
You can change the throughput at which records are exported from BigQuery to Bigtable by modifying the number of nodes in the Bigtable destination cluster. Throughput (rows written per second) scales linearly with the number of nodes in the destination cluster. For example, if you double the number of nodes in your destination cluster, your export throughput roughly doubles.
Pricing
When you export data in a standard query, you are billed using data extraction pricing.

When you export data in a continuous query, you are billed using BigQuery capacity compute pricing. To run continuous queries, you must have a reservation that uses the Enterprise or Enterprise Plus edition, and a reservation assignment that uses the CONTINUOUS job type.

After the data is exported, you're charged for storing the data in Bigtable. For more information, see Bigtable pricing.

