Cloud Bigtable

Cloud Bigtable is a petabyte-scale, fully managed NoSQL database service for large analytical and operational workloads. Ideal for ad tech, fintech, and IoT, Cloud Bigtable offers consistent sub-10ms latency. Replication provides higher availability, higher durability, and resilience in the face of zonal failures. Cloud Bigtable is designed with a storage engine for machine learning applications and provides easy integration with open source big data tools.

For more information about Cloud Bigtable, read the Cloud Bigtable Documentation.

The goal of google-cloud is to provide an API that is comfortable to Rubyists. Your authentication credentials are detected automatically in Google Cloud Platform (GCP), including Google Compute Engine (GCE), Google Kubernetes Engine (GKE), Google App Engine (GAE), Google Cloud Functions (GCF) and Cloud Run. In other environments you can configure authentication easily, either directly in your code or via environment variables. Read more about the options for connecting in the Authentication Guide.
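For example, a minimal sketch of passing credentials directly in code; the project ID and keyfile path are placeholders for your own values:

require "google/cloud/bigtable"

# The project ID and keyfile path are placeholders; in GCP environments
# both can be omitted and will be detected automatically.
bigtable = Google::Cloud::Bigtable.new(
  project_id: "my-project",
  credentials: "/path/to/keyfile.json"
)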

Creating instances and clusters

When you first use Cloud Bigtable, you must create an instance, which is an allocation of resources that are used by Cloud Bigtable. When you create an instance, you must specify at least one cluster. Clusters describe where your data is stored and how many nodes are used for your data.

Use Project#create_instance to create an instance. The following example creates a production instance with one cluster and three nodes:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

job = bigtable.create_instance(
  "my-instance",
  display_name: "Instance for user data",
  labels: { "env" => "dev" }
) do |clusters|
  clusters.add("test-cluster", "us-east1-b", nodes: 3, storage_type: :SSD)
end

job.done? #=> false

# To block until the operation completes.
job.wait_until_done!
job.done? #=> true

if job.error?
  status = job.error
else
  instance = job.instance
end

You can also create a low-cost development instance for development and testing, with performance limited to the equivalent of a one-node cluster. There are no monitoring or throughput guarantees; replication is not available; and the SLA does not apply. When creating a development instance, you do not specify nodes for your clusters:

 require "google/cloud/bigtable"

bigtable = Google::Cloud:: Bigtable 
. new 
job = bigtable.create_instance(
 "my-instance",
 display_name: "Instance for user data",
 type: :DEVELOPMENT,
 labels: { "env" => "dev"}
) do |clusters|
 clusters.add("test-cluster", "us-east1-b") # nodes not allowed
end

job. done? 
#=> false

# Reload job until completion.
job. wait_until_done! 
job. done? 
#=> true

if job. error? 
status = job. error 
else
 instance = job.instance
end 

You can upgrade a development instance to a production instance at any time.
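For example, a minimal sketch of such an upgrade, assuming Instance#type= and Instance#save work as in the instance examples above (with save returning a long-running job, as create_instance does):

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

# Upgrade a development instance in place. Note that the upgrade
# cannot be reversed: a production instance cannot be downgraded.
instance = bigtable.instance("my-instance")
instance.type = :PRODUCTION
job = instance.save

job.wait_until_done!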

Creating tables

Cloud Bigtable stores data in massively scalable tables, each of which is a sorted key/value map. The table is composed of rows, each of which typically describes a single entity, and columns, which contain individual values for each row. Each row is indexed by a single row key, and columns that are related to one another are typically grouped together into a column family. Each column is identified by a combination of the column family and a column qualifier, which is a unique name within the column family.

Each row/column intersection can contain multiple cells, or versions, at different timestamps, providing a record of how the stored data has been altered over time. Cloud Bigtable tables are sparse; if a cell does not contain any data, it does not take up any space.
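To make the model concrete, here is a minimal sketch of how a cell is addressed once a table contains data (Table#read_row is covered under Reading data below; the instance, table, and column family names are placeholders):

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new
table = bigtable.table("my-instance", "my-table")

# Each cell is identified by row key, column family, and qualifier,
# and can hold multiple timestamped versions.
row = table.read_row("user-1")
row.cells["cf1"].each do |cell|
  puts "#{cell.family}:#{cell.qualifier} @ #{cell.timestamp} => #{cell.value}"
end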

Use Project#create_table or Instance#create_table to create a table:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.create_table("my-instance", "my-table")
puts table.name

When you create a table, you may specify the column families to use in the table, as well as a list of row keys that will be used to initially split the table into several tablets (tablets are similar to HBase regions):

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

initial_splits = ["user-00001", "user-100000", "others"]
table = bigtable.create_table("my-instance", "my-table", initial_splits: initial_splits) do |cfm|
  cfm.add('cf1', gc_rule: Google::Cloud::Bigtable::GcRule.max_versions(5))
  cfm.add('cf2', gc_rule: Google::Cloud::Bigtable::GcRule.max_age(600))

  gc_rule = Google::Cloud::Bigtable::GcRule.union(
    Google::Cloud::Bigtable::GcRule.max_age(1800),
    Google::Cloud::Bigtable::GcRule.max_versions(3)
  )
  cfm.add('cf3', gc_rule: gc_rule)
end

puts table

You may also add, update, and delete column families later by passing a block to Table#column_families:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table", perform_lookup: true)

table.column_families do |cfm|
  cfm.add "cf4", gc_rule: Google::Cloud::Bigtable::GcRule.max_age(600)
  cfm.add "cf5", gc_rule: Google::Cloud::Bigtable::GcRule.max_versions(5)

  rule_1 = Google::Cloud::Bigtable::GcRule.max_versions(3)
  rule_2 = Google::Cloud::Bigtable::GcRule.max_age(600)
  rule_union = Google::Cloud::Bigtable::GcRule.union(rule_1, rule_2)
  cfm.update "cf2", gc_rule: rule_union

  cfm.delete "cf3"
end

puts table.column_families["cf3"] #=> nil

Writing data

The Table class allows you to perform the following types of writes:

  • Simple writes
  • Increments and appends
  • Conditional writes
  • Batch writes

See Cloud Bigtable writes for detailed information about writing data.

Simple writes

Use Table#mutate_row to make one or more mutations to a single row:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table")

entry = table.new_mutation_entry("user-1")
entry.set_cell(
  "cf1", "field1", "XYZ",
  timestamp: (Time.now.to_f * 1000000).round(-3) # microseconds
).delete_cells("cf2", "field02")

table.mutate_row(entry)

Increments and appends

If you want to append data to an existing value or increment an existing numeric value, use Table#read_modify_write_row:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table")

rule_1 = table.new_read_modify_write_rule("cf", "field01")
rule_1.append("append-xyz")

rule_2 = table.new_read_modify_write_rule("cf", "field01")
rule_2.increment(1)

row = table.read_modify_write_row("user01", [rule_1, rule_2])

puts row.cells

Do not use read_modify_write_row if you are using an app profile that has multi-cluster routing. (See AppProfile#routing_policy.)
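If you are not sure which routing policy an app profile uses, you can inspect it first. A minimal sketch, assuming an app profile with the hypothetical ID "my-app-profile" and that multi-cluster routing is represented by the MultiClusterRoutingUseAny class:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

# "my-app-profile" is a hypothetical app profile ID.
instance = bigtable.instance("my-instance")
app_profile = instance.app_profile("my-app-profile")

if app_profile.routing_policy.is_a? Google::Cloud::Bigtable::MultiClusterRoutingUseAny
  puts "Multi-cluster routing: avoid read_modify_write_row"
end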

Conditional writes

To check a row for a condition and then, depending on the result, write data to that row, use Table#check_and_mutate_row:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table")

predicate_filter = Google::Cloud::Bigtable::RowFilter.key("user-10")

on_match_mutations = Google::Cloud::Bigtable::MutationEntry.new
on_match_mutations.set_cell(
  "cf1", "field1", "XYZ",
  timestamp: (Time.now.to_f * 1000000).round(-3) # microseconds
).delete_cells("cf2", "field02")

otherwise_mutations = Google::Cloud::Bigtable::MutationEntry.new
otherwise_mutations.delete_from_family("cf3")

predicate_matched = table.check_and_mutate_row(
  "user01",
  predicate_filter,
  on_match: on_match_mutations,
  otherwise: otherwise_mutations
)

if predicate_matched
  puts "All predicates matched"
end

Do not use check_and_mutate_row if you are using an app profile that has multi-cluster routing. (See AppProfile#routing_policy.)

Batch writes

You can write more than one row in a single RPC using Table#mutate_rows:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table")

entries = []
entries << table.new_mutation_entry("row-1").set_cell("cf1", "field1", "xyz")
entries << table.new_mutation_entry("row-2").set_cell("cf1", "field1", "abc")

responses = table.mutate_rows(entries)

responses.each do |response|
  puts response.status.description
end

Each entry in the request is atomic, but the request as a whole is not. As shown above, Cloud Bigtable returns a list of responses corresponding to the entries.
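For example, a minimal sketch of retrying only the failed entries, continuing from the example above and assuming that the responses line up with the entries by index and that a zero status code means success:

# Continuing from the batch write example above: collect the entries
# whose mutations failed (nonzero status code) and retry just those.
failed_entries = entries.zip(responses)
                        .reject { |_entry, response| response.status.code.zero? }
                        .map(&:first)

table.mutate_rows(failed_entries) unless failed_entries.empty?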

Reading data

The Table class also enables you to read data.

Use Table#read_row to read a single row by key:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table")

row = table.read_row("user-1")

If desired, you can apply a filter:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table")

filter = Google::Cloud::Bigtable::RowFilter.cells_per_row(3)

row = table.read_row("user-1", filter: filter)

For multiple rows, the Table#read_rows method streams back the contents of all requested rows in key order:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table")

table.read_rows(keys: ["user-1", "user-2"]).each do |row|
  puts row
end

Instead of specifying individual keys (or a range), you can often just use a filter:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table")

filter = table.filter.key("user-*")
# OR
# filter = Google::Cloud::Bigtable::RowFilter.key("user-*")

table.read_rows(filter: filter).each do |row|
  puts row
end

Deleting rows, tables, and instances

Use Table#drop_row_range to delete some or all of the rows in a table:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

table = bigtable.table("my-instance", "my-table")

# Delete rows using a row key prefix.
table.drop_row_range(row_key_prefix: "user-100")

# Delete all data, with a timeout of 120 seconds.
table.drop_row_range(delete_all_data: true, timeout: 120)

Delete tables and instances using Table#delete and Instance#delete, respectively:

require "google/cloud/bigtable"

bigtable = Google::Cloud::Bigtable.new

instance = bigtable.instance("my-instance")
table = instance.table("my-table")

table.delete
instance.delete

Additional information

Google Cloud Bigtable can be configured to use an emulator or to enable gRPC's logging. To learn more, see the Emulator guide and the Logging guide.
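For example, a minimal sketch of pointing the client at a locally running emulator via the BIGTABLE_EMULATOR_HOST environment variable; the host, port, and project ID below are placeholders:

require "google/cloud/bigtable"

# Start the emulator first, e.g. with `gcloud beta emulators bigtable start`.
# The client picks up BIGTABLE_EMULATOR_HOST automatically.
ENV["BIGTABLE_EMULATOR_HOST"] = "localhost:8086"

bigtable = Google::Cloud::Bigtable.new project_id: "emulator-project"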
