Create and manage protobuf schemas

This document describes how to create and perform operations on schema bundles.

In Bigtable, you can use protocol buffer (protobuf) schemas to query individual fields within protobuf messages stored as bytes in your columns. You do this by uploading your schemas in a schema bundle, a table-level resource that contains one or more of your protobuf schemas.

Using schema bundles offers the following benefits:

  • Saves time and effort: With protocol buffers, you define your data structure once in a proto file, and then use the generated source code to write and read your data.
  • Improves data consistency: By using a proto file as a single source of truth, you can ensure that all applications and services are using the same data model.
  • Eliminates data duplication: You can use protocol buffers across projects by defining message types in proto files that reside outside of a specific project's codebase.

The process of using schemas in Bigtable starts with your proto files. A proto file is a text file where you define the structure of your data. You use the protobuf compiler tool, also referred to as protoc, to generate a protobuf file descriptor set, which is a machine-readable schema of your proto file. You then use this descriptor set to create a schema bundle.

For examples of proto files and their corresponding descriptor sets, see Example data.

The following diagram shows the process of using schemas in Bigtable:

Figure 1. The process of using protobuf schemas in Bigtable.

You can create schema bundles using the Google Cloud CLI. After you upload a schema bundle to Bigtable, you can query your data using the Bigtable Studio query builder, GoogleSQL for Bigtable, or Bigtable external tables in BigQuery.

Before you begin

Take the following steps if you plan to use the gcloud CLI:

  1. Install the Google Cloud CLI.
  2. Initialize the gcloud CLI:

         gcloud init

Required roles

To get the permissions that you need to create and manage schema bundles, ask your administrator to grant you the Bigtable Admin (roles/bigtable.admin) Identity and Access Management (IAM) role on the table.

This predefined role contains the permissions that Bigtable requires to work with schema bundles. The exact permissions that are required are listed in the Required permissions section:

Required permissions

  • bigtable.schemaBundles.create
  • bigtable.schemaBundles.update
  • bigtable.schemaBundles.delete
  • bigtable.schemaBundles.get
  • bigtable.schemaBundles.list

You might also be able to get these permissions with custom roles or other predefined roles.

For more information about Bigtable roles and permissions, see Access control with IAM.

Generate a protobuf file descriptor set

Before you can create a schema bundle, you need to generate a descriptor set from your proto files with the protobuf compiler tool.

  1. To install the compiler, download the package and follow the instructions in the README file.
  2. Run the compiler:

         protoc --proto_path=IMPORT_PATH --include_imports \
             --descriptor_set_out=DESCRIPTOR_OUTPUT_LOCATION PATH_TO_PROTO


    Replace the following:

    • IMPORT_PATH: the directory where the protoc compiler searches for proto files.
    • DESCRIPTOR_OUTPUT_LOCATION: the path of the file where the protoc compiler saves the generated descriptor set.
    • PATH_TO_PROTO: the path to your proto file.

For example, to create a descriptor set named library.pb for the library.proto file in the current directory, you can use the following command:

    protoc --include_imports --descriptor_set_out=library.pb library.proto
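
If you generate descriptor sets from a Java build step, the same command can be assembled programmatically. The following is a minimal sketch, assuming that protoc is installed and on your PATH. The class and method names (ProtocCommand, build) are hypothetical, and only the flags shown in this section are used.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles the protoc invocation described in this section. */
final class ProtocCommand {

  /**
   * Builds the argument list for protoc.
   *
   * @param importPath directory that protoc searches for proto files, or null to omit --proto_path
   * @param descriptorOut file that receives the generated descriptor set
   * @param protoFile path to the proto file to compile
   */
  static List<String> build(String importPath, String descriptorOut, String protoFile) {
    List<String> args = new ArrayList<>();
    args.add("protoc");
    if (importPath != null) {
      args.add("--proto_path=" + importPath);
    }
    args.add("--include_imports");
    args.add("--descriptor_set_out=" + descriptorOut);
    args.add(protoFile);
    return args;
  }

  public static void main(String[] argv) {
    List<String> command = build(null, "library.pb", "library.proto");
    // Prints: protoc --include_imports --descriptor_set_out=library.pb library.proto
    System.out.println(String.join(" ", command));
    // To actually run the compiler (assumes protoc is installed and on PATH):
    // int exitCode = new ProcessBuilder(command).inheritIO().start().waitFor();
  }
}
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.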

Create a schema bundle

gcloud

To create a schema bundle, use the gcloud bigtable schema-bundles create command:

    gcloud bigtable schema-bundles create SCHEMA_BUNDLE_ID \
        --instance=INSTANCE_ID \
        --table=TABLE_ID \
        --proto-descriptors-file=PROTO_DESCRIPTORS_FILE

Replace the following:

  • SCHEMA_BUNDLE_ID: a unique ID for the new schema bundle. The ID can't contain the dot (.) character.
  • INSTANCE_ID: the ID of the instance to create the schema bundle in.
  • TABLE_ID: the ID of the table to create the schema bundle in.
  • PROTO_DESCRIPTORS_FILE: the path to the descriptor set generated in the previous step.
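
To catch an invalid ID before you send a create request, you can check the dot rule on the client side. The following is a minimal sketch with a hypothetical helper (SchemaBundleIds.isPlausible). It checks only the constraint stated above, plus the assumption that an ID must be non-empty. Bigtable might enforce additional rules, and uniqueness can only be verified by the service.

```java
/** Hypothetical client-side pre-check for schema bundle IDs. */
final class SchemaBundleIds {

  /**
   * Returns true if the ID is non-empty (an assumption) and contains no dot
   * character (the rule stated in this document).
   */
  static boolean isPlausible(String id) {
    return id != null && !id.isEmpty() && id.indexOf('.') < 0;
  }

  public static void main(String[] args) {
    System.out.println(isPlausible("library_schemas")); // true
    System.out.println(isPlausible("library.v1"));      // false: contains a dot
  }
}
```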

Java

To create a schema bundle, use the createSchemaBundle method:

To learn how to install and use the client library for Bigtable, see Bigtable client libraries.

To authenticate to Bigtable, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

    try {
      InputStream in = getClass().getClassLoader().getResourceAsStream(PROTO_FILE_PATH);
      CreateSchemaBundleRequest request =
          CreateSchemaBundleRequest.of(tableId, schemaBundleId)
              .setProtoSchema(ByteString.readFrom(in));
      SchemaBundle schemaBundle = adminClient.createSchemaBundle(request);
      System.out.printf("Schema bundle: %s created successfully%n", schemaBundle.getId());
    } catch (NotFoundException e) {
      System.err.println(
          "Failed to create a schema bundle from a non-existent table: " + e.getMessage());
    } catch (IOException e) {
      throw new RuntimeException(e);
    }

View information about schema bundles

Before you can view information about schema bundles, you need to have a Bigtable table with at least one schema bundle. You can get information about schema bundles in a table by retrieving a single schema bundle's definition or by listing all schema bundles in a table.

Get schema bundle definition

gcloud

To get details about a schema bundle, use the gcloud bigtable schema-bundles describe command:

    gcloud bigtable schema-bundles describe SCHEMA_BUNDLE_ID \
        --instance=INSTANCE_ID \
        --table=TABLE_ID

Replace the following:

  • SCHEMA_BUNDLE_ID: the ID of the schema bundle.
  • INSTANCE_ID: the ID of the instance.
  • TABLE_ID: the ID of the table.

Java

To get the definition of a schema bundle, use the getSchemaBundle method. This method returns a SchemaBundle object that contains the schema definition.

The following example shows how to get a schema bundle and deserialize the descriptor set to print the contents of the schema:

To learn how to install and use the client library for Bigtable, see Bigtable client libraries.

To authenticate to Bigtable, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

    SchemaBundle schemaBundle = null;
    try {
      schemaBundle = adminClient.getSchemaBundle(tableId, schemaBundleId);
      // Deserialize and print the FileDescriptorSet
      DescriptorProtos.FileDescriptorSet fileDescriptorSet =
          DescriptorProtos.FileDescriptorSet.parseFrom(schemaBundle.getProtoSchema());
      System.out.println("--------- Deserialized FileDescriptorSet ---------");
      for (DescriptorProtos.FileDescriptorProto fileDescriptorProto :
          fileDescriptorSet.getFileList()) {
        System.out.println("File: " + fileDescriptorProto.getName());
        System.out.println("  Package: " + fileDescriptorProto.getPackage());
        for (DescriptorProtos.DescriptorProto messageType :
            fileDescriptorProto.getMessageTypeList()) {
          System.out.println("  Message: " + messageType.getName());
        }
      }
      System.out.println("--------------------------------------------------");
    } catch (InvalidProtocolBufferException e) {
      System.err.println("Failed to parse FileDescriptorSet: " + e.getMessage());
    } catch (NotFoundException e) {
      System.err.println(
          "Failed to retrieve metadata from a non-existent schema bundle: " + e.getMessage());
    }

The output is similar to the following:

    --------- Deserialized FileDescriptorSet ---------
    File: my_schema.proto
      Package: my_package
      Message: MyMessage
    --------------------------------------------------

List schema bundles in a table

gcloud

To see a list of schema bundles for a table, use the gcloud bigtable schema-bundles list command:

    gcloud bigtable schema-bundles list \
        --instance=INSTANCE_ID \
        --table=TABLE_ID

Replace the following:

  • INSTANCE_ID: the ID of the instance.
  • TABLE_ID: the ID of the table.

Java

To see a list of all schema bundles in a table, use the listSchemaBundles method. This method returns a list of schema bundle IDs.

The following example shows how to list the schema bundles in a table:

To learn how to install and use the client library for Bigtable, see Bigtable client libraries.

To authenticate to Bigtable, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

    List<String> schemaBundleIds = new ArrayList<>();
    try {
      schemaBundleIds = adminClient.listSchemaBundles(tableId);
      for (String schemaBundleId : schemaBundleIds) {
        System.out.println(schemaBundleId);
      }
    } catch (NotFoundException e) {
      System.err.println(
          "Failed to list schema bundles from a non-existent table: " + e.getMessage());
    }

The output is similar to the following:

    my-schema-bundle-1
    my-schema-bundle-2

Update a schema bundle

When you update a schema bundle, Bigtable checks if the new descriptor set is backward compatible with the existing one. If it is incompatible, the update fails with a FailedPrecondition error. We recommend that you reserve deleted field numbers to prevent their reuse. For more information, see Proto Best Practices in the protobuf documentation.

If you're sure that the incompatible changes are safe and want to force an update, you can use the --ignore-warnings flag with the gcloud CLI.
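
To illustrate the recommendation to reserve deleted field numbers, the following sketch flags field numbers that were removed from a message without being reserved. It is not Bigtable's compatibility check, whose exact rules aren't described here. The class name, method name, and the map-based representation of a message (field number to field name) are hypothetical simplifications.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical pre-check for deleted field numbers that were not reserved. */
final class FieldNumberReuseCheck {

  /**
   * Returns the field numbers that exist in the old message but are neither
   * declared nor reserved in the new message. These numbers could be reused
   * by a later edit, which is what reserving them prevents.
   */
  static Set<Integer> unreservedDeletions(
      Map<Integer, String> oldFields, Map<Integer, String> newFields, Set<Integer> reserved) {
    Set<Integer> result = new TreeSet<>(oldFields.keySet());
    result.removeAll(newFields.keySet());
    result.removeAll(reserved);
    return result;
  }

  public static void main(String[] args) {
    Map<Integer, String> oldFields = Map.of(1, "title", 2, "author", 3, "isbn");
    Map<Integer, String> newFields = Map.of(1, "title", 2, "author");

    // Field 3 was deleted and not reserved.
    System.out.println(unreservedDeletions(oldFields, newFields, Set.of())); // [3]
    // Field 3 was deleted and reserved, so nothing is flagged.
    System.out.println(unreservedDeletions(oldFields, newFields, Set.of(3))); // []
  }
}
```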

gcloud

To update a schema bundle to use a different descriptor set, use the gcloud bigtable schema-bundles update command:

    gcloud bigtable schema-bundles update SCHEMA_BUNDLE_ID \
        --instance=INSTANCE_ID \
        --table=TABLE_ID \
        --proto-descriptors-file=PROTO_DESCRIPTORS_FILE

Replace the following:

  • SCHEMA_BUNDLE_ID: the ID of the schema bundle to update.
  • INSTANCE_ID: the ID of the instance that contains the schema bundle.
  • TABLE_ID: the ID of the table that contains the schema bundle.
  • PROTO_DESCRIPTORS_FILE: the path to the new descriptor set file.

Optional: To force the update even if there are incompatible changes, append the --ignore-warnings flag to the command.

Java

To update a schema bundle, use the updateSchemaBundle method:

To learn how to install and use the client library for Bigtable, see Bigtable client libraries.

To authenticate to Bigtable, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

    try {
      InputStream in = getClass().getClassLoader().getResourceAsStream(PROTO_FILE_PATH);
      UpdateSchemaBundleRequest request =
          UpdateSchemaBundleRequest.of(tableId, schemaBundleId)
              .setProtoSchema(ByteString.readFrom(in));
      SchemaBundle schemaBundle = adminClient.updateSchemaBundle(request);
      System.out.printf("Schema bundle: %s updated successfully%n", schemaBundle.getId());
    } catch (NotFoundException e) {
      System.err.println("Failed to modify a non-existent schema bundle: " + e.getMessage());
    } catch (IOException e) {
      throw new RuntimeException(e);
    }

Delete a schema bundle

gcloud

To delete a schema bundle, use the gcloud bigtable schema-bundles delete command:

    gcloud bigtable schema-bundles delete SCHEMA_BUNDLE_ID \
        --instance=INSTANCE_ID \
        --table=TABLE_ID

Replace the following:

  • SCHEMA_BUNDLE_ID: the ID of the schema bundle to delete.
  • INSTANCE_ID: the ID of the instance that contains the schema bundle.
  • TABLE_ID: the ID of the table that contains the schema bundle.

Java

To delete a schema bundle, use the deleteSchemaBundle method:

To learn how to install and use the client library for Bigtable, see Bigtable client libraries.

To authenticate to Bigtable, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

    try {
      adminClient.deleteSchemaBundle(tableId, schemaBundleId);
      System.out.printf("SchemaBundle: %s deleted successfully%n", schemaBundleId);
    } catch (NotFoundException e) {
      System.err.println("Failed to delete a non-existent schema bundle: " + e.getMessage());
    }

Limitations

Schema bundles have the following limitations:

  • You can create a maximum of 10 schema bundles per table.
  • The total size of the serialized protocol buffer descriptors within a schema bundle can't exceed 4 MB. There is no direct limit on the number of individual schemas you can include in a bundle, as long as the total size of the bundle doesn't exceed this limit.
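
Because the size limit applies to the serialized descriptors, you can check a descriptor set file before you upload it. The following is a minimal sketch with hypothetical names (DescriptorSizeCheck, fitsInBundle). It assumes that 4 MB means 4 * 1024 * 1024 bytes and that the size of the descriptor set file is a reasonable proxy for the size that Bigtable measures. The service remains the authority on the limit.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical pre-upload size check for a descriptor set file. */
final class DescriptorSizeCheck {

  // Assumption: "4 MB" is interpreted as 4 * 1024 * 1024 bytes.
  static final long MAX_BUNDLE_BYTES = 4L * 1024 * 1024;

  /** Returns true if a descriptor set of the given size is within the assumed limit. */
  static boolean fitsInBundle(long descriptorSetBytes) {
    return descriptorSetBytes >= 0 && descriptorSetBytes <= MAX_BUNDLE_BYTES;
  }

  public static void main(String[] args) throws IOException {
    // Defaults to the library.pb example file name used earlier in this document.
    Path descriptorSet = Path.of(args.length > 0 ? args[0] : "library.pb");
    if (!Files.exists(descriptorSet)) {
      System.err.println("Descriptor set not found: " + descriptorSet);
      return;
    }
    long size = Files.size(descriptorSet);
    System.out.printf("%s: %d bytes, fits=%b%n", descriptorSet, size, fitsInBundle(size));
  }
}
```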
