Modifying table schemas

This document describes how to modify the schema definitions for existing BigQuery tables.

You can make most schema modifications described in this document by using SQL data definition language (DDL) statements. These statements don't incur charges.

You can modify a table schema in all the ways described on this page by exporting your table data to Cloud Storage, and then loading the data into a new table with the modified schema definition. BigQuery load and extract jobs are free, but you incur costs for storing the exported data in Cloud Storage. The following sections describe other ways of performing various types of schema modifications.

Schema updates in BigQuery don't cause data loss.

Add a column

You can add columns to an existing table's schema definition by using one of the following options:

Add a new empty column.
Overwrite a table with a load or query job.
Append data to a table with a load or query job.

Any column you add must adhere to BigQuery's rules for column names. For more information on creating schema components, see Specifying a schema.

It isn't possible to add columns in the middle of a table schema. New columns and nested fields are always added at the end of the table or field. The only way to create a new column in the middle of a table schema is to create a new table with the chosen schema and copy the data from the original table.

Add an empty column

If you add new columns to an existing table schema, the columns must be NULLABLE or REPEATED. You cannot add a REQUIRED column to an existing table schema; attempting to do so through the API or the bq command-line tool causes an error. However, you can create a nested REQUIRED column as part of a new RECORD field. REQUIRED columns can be added only when you create a table while loading data, or when you create an empty table with a schema definition.
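
The mode rule above can be checked locally before you submit a schema change. The following is a minimal sketch, not part of any BigQuery client library: the function name `invalid_new_columns` is hypothetical, and the dictionary layout (keys `name`, `type`, `mode`) is assumed to mirror the JSON schema file format. A column without a `mode` key is treated as NULLABLE.

```python
ALLOWED_NEW_MODES = {"NULLABLE", "REPEATED"}


def invalid_new_columns(new_columns):
    """Return the names of top-level columns whose mode is not allowed.

    Only top-level columns are checked. Nested fields of a new RECORD
    column are not inspected, because they may be REQUIRED.
    """
    bad = []
    for column in new_columns:
        # Assumption for this sketch: a missing mode means NULLABLE.
        mode = column.get("mode", "NULLABLE").upper()
        if mode not in ALLOWED_NEW_MODES:
            bad.append(column["name"])
    return bad


if __name__ == "__main__":
    candidates = [
        {"name": "phone", "type": "STRING"},
        {"name": "tags", "type": "STRING", "mode": "REPEATED"},
        {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
    ]
    print(invalid_new_columns(candidates))  # prints ['id']
```

This check only catches the mode problem early; BigQuery still performs its own validation when the update is submitted.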

To add empty columns to a table's schema definition:

Console

In the Google Cloud console, go to the BigQuery page.

Go to BigQuery
In the left pane, click Explorer.

If you don't see the left pane, click Expand left pane to open the pane.
In the Explorer pane, expand your project, click Datasets, and then select a dataset.
Click Overview > Tables, and then select the table.
In the details pane, click the Schema tab.
Click Edit schema. You might need to scroll to see this button.
On the Current schema page, under New fields, click Add field.
- For Name, type the column name.
- For Type, choose the data type.
- For Mode, choose NULLABLE or REPEATED.
When you are done adding columns, click Save.

SQL

Use the ALTER TABLE ADD COLUMN DDL statement:

In the Google Cloud console, go to the BigQuery page.

Go to BigQuery

In the query editor, enter the following statement:

ALTER TABLE mydataset.mytable
ADD COLUMN new_column STRING;

Click Run.

For more information about how to run queries, see Run an interactive query.

bq

Issue the bq update command and provide a JSON schema file. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: PROJECT_ID:DATASET.

bq update PROJECT_ID:DATASET.TABLE SCHEMA

Replace the following:

PROJECT_ID: your project ID.
DATASET: the name of the dataset that contains the table you're updating.
TABLE: the name of the table you're updating.
SCHEMA: the path to the JSON schema file on your local machine.

When you specify an inline schema, you cannot specify the column description, mode, or RECORD (STRUCT) type. All column modes default to NULLABLE. As a result, if you are adding a new nested column to a RECORD, you must supply a JSON schema file.

If you attempt to add columns using an inline schema definition, you must supply the entire schema definition, including the new columns. Because you cannot specify column modes using an inline schema definition, the update changes any existing REPEATED column to NULLABLE, which produces the following error: BigQuery error in update operation: Provided Schema does not match Table PROJECT_ID:dataset.table. Field field has changed mode from REPEATED to NULLABLE.
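
To see why an inline schema triggers this error, the following sketch simulates the mode comparison locally. It is an illustration under stated assumptions, not how the bq tool is implemented: the helper names `parse_inline_schema` and `repeated_to_nullable` are hypothetical, the inline schema is assumed to use the `name:type,name:type` form with a type given for every column, and existing columns are represented as dictionaries shaped like JSON schema file entries.

```python
def parse_inline_schema(inline):
    """Parse 'name:type,name:type' into field dicts, all with mode NULLABLE."""
    fields = []
    for part in inline.split(","):
        name, _, field_type = part.strip().partition(":")
        # An inline schema cannot carry a mode, so every column is NULLABLE.
        fields.append({"name": name, "type": field_type.upper(), "mode": "NULLABLE"})
    return fields


def repeated_to_nullable(existing_fields, inline):
    """Return names of existing REPEATED columns that the inline schema
    would redefine as NULLABLE."""
    inline_names = {field["name"] for field in parse_inline_schema(inline)}
    return [
        field["name"]
        for field in existing_fields
        if field.get("mode") == "REPEATED" and field["name"] in inline_names
    ]


if __name__ == "__main__":
    existing = [
        {"name": "column1", "type": "STRING", "mode": "REQUIRED"},
        {"name": "column3", "type": "STRING", "mode": "REPEATED"},
    ]
    inline = "column1:STRING,column3:STRING,column4:STRING"
    print(repeated_to_nullable(existing, inline))  # prints ['column3']
```

Any name returned by `repeated_to_nullable` is a column that would hit the mode-change error, which is a sign to switch to a JSON schema file.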

The preferred method of adding columns to an existing table using the bq command-line tool is to supply a JSON schema file.

To add empty columns to a table's schema using a JSON schema file:

First, issue the bq show command with the --schema flag and write the existing table schema to a file. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: PROJECT_ID:DATASET.
```
bq show \
  --schema \
  --format=prettyjson \
  PROJECT_ID:DATASET.TABLE > SCHEMA
```
Replace the following:
- PROJECT_ID: your project ID.
- DATASET: the name of the dataset that contains the table you're updating.
- TABLE: the name of the table you're updating.
- SCHEMA: the schema definition file written to your local machine.
For example, to write the schema definition of mydataset.mytable to a file, enter the following command. mydataset.mytable is in your default project.
```
 bq show \
   --schema \
   --format=prettyjson \
   mydataset.mytable > /tmp/myschema.json 
```

Open the schema file in a text editor. The schema should look like the following:

[
  {
    "mode": "REQUIRED",
    "name": "column1",
    "type": "STRING"
  },
  {
    "mode": "REQUIRED",
    "name": "column2",
    "type": "FLOAT"
  },
  {
    "mode": "REPEATED",
    "name": "column3",
    "type": "STRING"
  }
]

Add the new columns to the end of the schema definition. If you attempt to add new columns elsewhere in the array, the following error is returned: BigQuery error in update operation: Precondition Failed. Modifying schema order after table creation doesn't have an effect on column or nested field order.
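
Before running the update, you can confirm locally that an edited schema only appends columns. The following is a minimal sketch under stated assumptions: `is_append_only` is a hypothetical helper, both arguments are lists of dictionaries as loaded from a JSON schema file, and the comparison looks at top-level column names only. It does not reproduce BigQuery's server-side validation.

```python
def is_append_only(old_fields, new_fields):
    """Return True if new_fields keeps every old column, in the same order,
    and places any additional columns at the end."""
    if len(new_fields) < len(old_fields):
        # A column was dropped, so this is not a pure append.
        return False
    old_names = [field["name"] for field in old_fields]
    new_prefix = [field["name"] for field in new_fields[: len(old_fields)]]
    return old_names == new_prefix


if __name__ == "__main__":
    old = [{"name": "column1"}, {"name": "column2"}, {"name": "column3"}]
    appended = old + [{"name": "column4"}]
    inserted = [old[0], {"name": "column4"}, old[1], old[2]]
    print(is_append_only(old, appended))  # prints True
    print(is_append_only(old, inserted))  # prints False
```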

Using a JSON file, you can specify descriptions, NULLABLE or REPEATED modes, and RECORD types for new columns. For example, using the schema definition from the previous step, your new JSON array would look like the following. In this example, a new NULLABLE column named column4 is added, and it includes a description.
```
[
  {
    "mode": "REQUIRED",
    "name": "column1",
    "type": "STRING"
  },
  {
    "mode": "REQUIRED",
    "name": "column2",
    "type": "FLOAT"
  },
  {
    "mode": "REPEATED",
    "name": "column3",
    "type": "STRING"
  },
  {
    "description": "my new column",
    "mode": "NULLABLE",
    "name": "column4",
    "type": "STRING"
  }
]
```
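
If you prefer to script the edit instead of using a text editor, the same change can be made with a few lines of Python. This is a sketch rather than an official tool: `append_column` is a hypothetical helper, and it assumes the schema file contains a top-level JSON array like the one written by bq show --schema.

```python
import json


def append_column(schema_path, column):
    """Append one column definition to the end of a JSON schema file.

    Returns the updated list of fields. Raises ValueError if a top-level
    column with the same name is already present.
    """
    with open(schema_path, "r", encoding="utf-8") as handle:
        fields = json.load(handle)
    if any(field["name"] == column["name"] for field in fields):
        raise ValueError(f"column {column['name']!r} already exists")
    # New columns always go at the end of the array.
    fields.append(column)
    with open(schema_path, "w", encoding="utf-8") as handle:
        json.dump(fields, handle, indent=2)
        handle.write("\n")
    return fields
```

For example, calling `append_column("/tmp/myschema.json", {"description": "my new column", "mode": "NULLABLE", "name": "column4", "type": "STRING"})` would produce a file equivalent to the array shown above.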
For more information on working with JSON schema files, see Specifying a JSON schema file.
After updating your schema file, issue the following command to update the table's schema. If the table you're updating is in a project other than your default project, add the project ID to the dataset name in the following format: PROJECT_ID:DATASET.
```
bq update PROJECT_ID:DATASET.TABLE SCHEMA
```
Replace the following:
- PROJECT_ID: your project ID.
- DATASET: the name of the dataset that contains the table you're updating.
- TABLE: the name of the table you're updating.
- SCHEMA: the schema definition file written to your local machine.
For example, enter the following command to update the schema definition of mydataset.mytable in your default project. The path to the schema file on your local machine is /tmp/myschema.json.
```
 bq update mydataset.mytable /tmp/myschema.json 
```
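
If you automate this workflow, it can help to build the two bq invocations in one place. The following sketch only assembles the argument lists for the bq show and bq update commands described above; it doesn't run them. The helper names are hypothetical, and the flags are limited to the ones used in this section.

```python
def qualified_table(dataset, table, project_id=None):
    """Build DATASET.TABLE, or PROJECT_ID:DATASET.TABLE when a project is given."""
    name = f"{dataset}.{table}"
    return f"{project_id}:{name}" if project_id else name


def bq_show_schema_args(table_ref):
    """Arguments for writing the existing schema to standard output as JSON."""
    return ["bq", "show", "--schema", "--format=prettyjson", table_ref]


def bq_update_args(table_ref, schema_path):
    """Arguments for updating the table schema from a JSON schema file."""
    return ["bq", "update", table_ref, schema_path]


if __name__ == "__main__":
    ref = qualified_table("mydataset", "mytable")
    print(bq_show_schema_args(ref))
    print(bq_update_args(ref, "/tmp/myschema.json"))
```

Each list could then be passed to `subprocess.run`, with standard output redirected to the schema file for the bq show step.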

API

Call the tables.patch method and use the schema property to add empty columns to your schema definition. Because the tables.update method replaces the entire table resource, the tables.patch method is preferred.
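
As an illustration of what a patch request carries, the following sketch builds a request body that contains only the schema property. It is a simplified example, not a client library call: `build_patch_body` is a hypothetical helper, the field dictionaries are assumed to follow the JSON schema file layout, and sending the request (including authentication) is left out. The body lists the existing fields followed by the new ones, because the schema you send must be the complete schema.

```python
import json


def build_patch_body(existing_fields, new_columns):
    """Build a tables.patch request body that touches only the schema.

    The existing fields are kept in their original order and the new
    columns are placed at the end.
    """
    return {"schema": {"fields": list(existing_fields) + list(new_columns)}}


if __name__ == "__main__":
    existing = [
        {"name": "column1", "type": "STRING", "mode": "REQUIRED"},
        {"name": "column3", "type": "STRING", "mode": "REPEATED"},
    ]
    added = [{"name": "column4", "type": "STRING", "mode": "NULLABLE"}]
    print(json.dumps(build_patch_body(existing, added), indent=2))
```

Because the body has no other table properties, a patch built this way leaves the rest of the table resource unchanged.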

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import (
	"context"
	"fmt"

	"cloud.google.com/go/bigquery"
)

// updateTableAddColumn demonstrates modifying the schema of a table to append an additional column.
func updateTableAddColumn(projectID, datasetID, tableID string) error {
	// projectID := "my-project-id"
	// datasetID := "mydataset"
	// tableID := "mytable"
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("bigquery.NewClient: %v", err)
	}
	defer client.Close()

	tableRef := client.Dataset(datasetID).Table(tableID)
	meta, err := tableRef.Metadata(ctx)
	if err != nil {
		return err
	}
	newSchema := append(meta.Schema,
		&bigquery.FieldSchema{Name: "phone", Type: bigquery.StringFieldType},
	)
	update := bigquery.TableMetadataToUpdate{
		Schema: newSchema,
	}
	if _, err := tableRef.Update(ctx, update, meta.ETag); err != nil {
		return err
	}
	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Field;
import com.google.cloud.bigquery.FieldList;
import com.google.cloud.bigquery.LegacySQLTypeName;
import com.google.cloud.bigquery.Schema;
import com.google.cloud.bigquery.StandardTableDefinition;
import com.google.cloud.bigquery.Table;
import java.util.ArrayList;
import java.util.List;

public class AddEmptyColumn {

  public static void runAddEmptyColumn() {
    // TODO(developer): Replace these variables before running the sample.
    String datasetName = "MY_DATASET_NAME";
    String tableId = "MY_TABLE_NAME";
    String newColumnName = "NEW_COLUMN_NAME";
    addEmptyColumn(newColumnName, datasetName, tableId);
  }

  public static void addEmptyColumn(String newColumnName, String datasetName, String tableId) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      Table table = bigquery.getTable(datasetName, tableId);
      Schema schema = table.getDefinition().getSchema();
      FieldList fields = schema.getFields();

      // Create the new field/column
      Field newField = Field.of(newColumnName, LegacySQLTypeName.STRING);

      // Create a new schema adding the current fields, plus the new one
      List<Field> fieldList = new ArrayList<Field>();
      fields.forEach(fieldList::add);
      fieldList.add(newField);
      Schema newSchema = Schema.of(fieldList);

      // Update the table with the new schema
      Table updatedTable =
          table.toBuilder().setDefinition(StandardTableDefinition.of(newSchema)).build();
      updatedTable.update();
      System.out.println("Empty column successfully added to table");
    } catch (BigQueryException e) {
      System.out.println("Empty column was not added. \n" + e.toString());
    }
  }
}

Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

// Import the Google Cloud client library and create a client
const {BigQuery} = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

async function addEmptyColumn() {
  // Adds an empty column to the schema.

  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  // const datasetId = 'my_dataset';
  // const tableId = 'my_table';
  const column = {name: 'size', type: 'STRING'};

  // Retrieve current table metadata
  const table = bigquery.dataset(datasetId).table(tableId);
  const [metadata] = await table.getMetadata();

  // Update table schema
  const schema = metadata.schema;
  const new_schema = schema;
  new_schema.fields.push(column);
  metadata.schema = new_schema;

  const [result] = await table.setMetadata(metadata);
  console.log(result.schema.fields);
}
