You can modify a table schema in all the ways described on this page by exporting your table data to
Cloud Storage,
and then loading the data into a new table with the modified schema definition.
BigQuery load and export jobs are
free, but you incur costs for storing the exported data in
Cloud Storage.
The following sections describe other ways of performing various types of
schema modifications.
Add a column
You can add columns to an existing table's schema definition by using one of the
following options:
Add a new empty column.
Overwrite a table with a load or query job.
Append data to a table with a load or query job.
Any column you add must adhere to BigQuery's rules for column names. For more information on
creating schema components, see Specifying a schema.
Add an empty column
If you add new columns to an existing table schema, the columns must be NULLABLE or REPEATED. You cannot add a REQUIRED column to an existing
table schema. Adding a REQUIRED column to an existing table
schema in the API or bq command-line tool causes an error. However, you can create a
nested REQUIRED column as part of a new RECORD field. REQUIRED columns can be added only when you
create a table while loading data, or when you create an empty table with a
schema definition.
To add empty columns to a table's schema definition:
Console
In the Google Cloud console, go to the BigQuery page.
bq
Issue the bq update command and provide a JSON schema file. If the table
you're updating is in a project other than your default project, add the
project ID to the dataset name in the following format: PROJECT_ID:DATASET.
bq update PROJECT_ID:DATASET.TABLE SCHEMA
Replace the following:
PROJECT_ID: your project ID.
DATASET: the name of the dataset that contains the table
you're updating.
TABLE: the name of the table you're updating.
SCHEMA: the path to the JSON schema file on your local
machine.
When you specify an inline schema, you cannot specify the column
description, mode, and RECORD (STRUCT)
type. All column modes default to NULLABLE. As a result, if you are
adding a new nested column to a RECORD, you must supply a JSON schema file.
If you attempt to add columns using an inline schema definition, you must
supply the entire schema definition including the new columns. Because you
cannot specify column modes using an inline schema definition, the update
changes any existing REPEATED column to NULLABLE, which
produces the following error: BigQuery error in update
operation: Provided Schema does not match Table PROJECT_ID:dataset.table. Field field has changed mode
from REPEATED to NULLABLE.
The preferred method of adding columns to an existing table using the bq command-line tool is
to supply a JSON schema file.
To add empty columns to a table's schema using a JSON schema file:
First, issue the bq show command with the --schema flag and write the
existing table schema to a file. If the table you're updating is in a
project other than your default project, add the project ID to the
dataset name in the following format: PROJECT_ID:DATASET.
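For example, a command along these lines writes the schema of mydataset.mytable in your default project to a local file (the output path is illustrative):
bq show \
--schema \
--format=prettyjson \
mydataset.mytable > /tmp/myschema.json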
Add the new columns to the end of the schema definition. If you attempt
to add new columns elsewhere in the array, the following error is
returned: BigQuery error in update operation: Precondition
Failed.
Using a JSON file, you can specify descriptions, NULLABLE or REPEATED modes, and RECORD types for new columns. For example,
using the schema definition from the previous step, your new JSON array
would look like the following. In this example, a new NULLABLE column
is added named column4. column4 includes a description.
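A sketch of such a schema file; the existing column names and types (column1 through column3) are illustrative, and column4 is the new NULLABLE column with a description:
[
  {
    "mode": "REQUIRED",
    "name": "column1",
    "type": "STRING"
  },
  {
    "mode": "REQUIRED",
    "name": "column2",
    "type": "FLOAT"
  },
  {
    "mode": "REPEATED",
    "name": "column3",
    "type": "STRING"
  },
  {
    "description": "my new column",
    "mode": "NULLABLE",
    "name": "column4",
    "type": "STRING"
  }
]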
After updating your schema file, issue the following command to update
the table's schema. If the table you're updating is in a project other
than your default project, add the project ID to the dataset name in the
following format: PROJECT_ID:DATASET.
bq update PROJECT_ID:DATASET.TABLE SCHEMA
Replace the following:
PROJECT_ID: your project ID.
DATASET: the name of the dataset that contains the table
you're updating.
TABLE: the name of the table you're updating.
SCHEMA: the schema definition file written to your
local machine.
For example, enter the following command to update the schema definition
of mydataset.mytable in your default project. The path to the schema
file on your local machine is /tmp/myschema.json.
bq update mydataset.mytable /tmp/myschema.json
API
Call the tables.patch method and use the schema property to add empty columns to your schema
definition. Because the tables.update method replaces the entire table
resource, the tables.patch method is preferred.
import("context""fmt""cloud.google.com/go/bigquery")// updateTableAddColumn demonstrates modifying the schema of a table to append an additional column.funcupdateTableAddColumn(projectID,datasetID,tableIDstring)error{// projectID := "my-project-id"// datasetID := "mydataset"// tableID := "mytable"ctx:=context.Background()client,err:=bigquery.NewClient(ctx,projectID)iferr!=nil{returnfmt.Errorf("bigquery.NewClient: %v",err)}deferclient.Close()tableRef:=client.Dataset(datasetID).Table(tableID)meta,err:=tableRef.Metadata(ctx)iferr!=nil{returnerr}newSchema:=append(meta.Schema,&bigquery.FieldSchema{Name:"phone",Type:bigquery.StringFieldType},)update:=bigquery.TableMetadataToUpdate{Schema:newSchema,}if_,err:=tableRef.Update(ctx,update,meta.ETag);err!=nil{returnerr}returnnil}
importcom.google.cloud.bigquery.BigQuery;importcom.google.cloud.bigquery.BigQueryException;importcom.google.cloud.bigquery.BigQueryOptions;importcom.google.cloud.bigquery.Field;importcom.google.cloud.bigquery.FieldList;importcom.google.cloud.bigquery.LegacySQLTypeName;importcom.google.cloud.bigquery.Schema;importcom.google.cloud.bigquery.StandardTableDefinition;importcom.google.cloud.bigquery.Table;importjava.util.ArrayList;importjava.util.List;publicclassAddEmptyColumn{publicstaticvoidrunAddEmptyColumn(){// TODO(developer): Replace these variables before running the sample.StringdatasetName="MY_DATASET_NAME";StringtableId="MY_TABLE_NAME";StringnewColumnName="NEW_COLUMN_NAME";addEmptyColumn(newColumnName,datasetName,tableId);}publicstaticvoidaddEmptyColumn(StringnewColumnName,StringdatasetName,StringtableId){try{// Initialize client that will be used to send requests. This client only needs to be created// once, and can be reused for multiple requests.BigQuerybigquery=BigQueryOptions.getDefaultInstance().getService();Tabletable=bigquery.getTable(datasetName,tableId);Schemaschema=table.getDefinition().getSchema();FieldListfields=schema.getFields();// Create the new field/columnFieldnewField=Field.of(newColumnName,LegacySQLTypeName.STRING);// Create a new schema adding the current fields, plus the new oneList<Field>fieldList=newArrayList<Field>();fields.forEach(fieldList::add);fieldList.add(newField);SchemanewSchema=Schema.of(fieldList);// Update the table with the new schemaTableupdatedTable=table.toBuilder().setDefinition(StandardTableDefinition.of(newSchema)).build();updatedTable.update();System.out.println("Empty column successfully added to table");}catch(BigQueryExceptione){System.out.println("Empty column was not added. \n"+e.toString());}}}
// Import the Google Cloud client library and create a clientconst{BigQuery}=require('@google-cloud/bigquery');constbigquery=newBigQuery();asyncfunctionaddEmptyColumn(){// Adds an empty column to the schema./*** TODO(developer): Uncomment the following lines before running the sample.*/// const datasetId = 'my_dataset';// const tableId = 'my_table';constcolumn={name:'size',type:'STRING'};// Retrieve current table metadataconsttable=bigquery.dataset(datasetId).table(tableId);const[metadata]=awaittable.getMetadata();// Update table schemaconstschema=metadata.schema;constnew_schema=schema;new_schema.fields.push(column);metadata.schema=new_schema;const[result]=awaittable.setMetadata(metadata);console.log(result.schema.fields);}
from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the table
#                  to add an empty column.
# table_id = "your-project.your_dataset.your_table_name"

table = client.get_table(table_id)  # Make an API request.

original_schema = table.schema
new_schema = original_schema[:]  # Creates a copy of the schema.
new_schema.append(bigquery.SchemaField("phone", "STRING"))

table.schema = new_schema
table = client.update_table(table, ["schema"])  # Make an API request.

if len(table.schema) == len(original_schema) + 1 == len(new_schema):
    print("A new column has been added.")
else:
    print("The column has not been added.")
Add a nested column to a RECORD column
In addition to adding new columns to a table's schema, you can also add new
nested columns to a RECORD column. The process for adding a new nested column is
similar to the process for adding a new column.
Console
Adding a new nested field to an existing RECORD column is not
supported by the Google Cloud console.
SQL
Adding a new nested field to an existing RECORD column by using a SQL DDL
statement is not supported.
bq
Issue the bq update command and provide a JSON schema file that adds the
nested field to the existing RECORD column's schema definition. If the
table you're updating is in a project other than your default project, add
the project ID to the dataset name in the following format: PROJECT_ID:DATASET.
bq update PROJECT_ID:DATASET.TABLE SCHEMA
Replace the following:
PROJECT_ID: your project ID.
DATASET: the name of the dataset that contains the table
you're updating.
TABLE: the name of the table you're updating.
SCHEMA: the path to the JSON schema file on your local
machine.
When you specify an inline schema, you cannot specify the column
description, mode, and RECORD (STRUCT)
type. All column modes default to NULLABLE. As a result, if you are
adding a new nested column to a RECORD, you must supply a JSON schema file.
To add a nested column to a RECORD using a JSON schema file:
First, issue the bq show command with the --schema flag and write the
existing table schema to a file. If the table you're updating is in a
project other than your default project, add the project ID to the
dataset name in the following format: PROJECT_ID:DATASET.
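Based on the example later in this section, the command takes roughly this form, where the placeholders are described below:
bq show \
--schema \
--format=prettyjson \
PROJECT_ID:DATASET.TABLE > SCHEMA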
DATASET: the name of the dataset that contains the table
you're updating.
TABLE: the name of the table you're updating.
SCHEMA: the schema definition file written to your
local machine.
For example, to write the schema definition of mydataset.mytable to a
file, enter the following command. mydataset.mytable is in your
default project.
bq show \
--schema \
--format=prettyjson \
mydataset.mytable > /tmp/myschema.json
Open the schema file in a text editor. The schema should look like the
following. In this example, column3 is a nested repeated column. The
nested columns are nested1 and nested2. The fields array lists
the fields nested within column3.
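A sketch of such a schema file; the types and the other column names are illustrative. To add a new nested field, append it to the fields array of column3:
[
  {
    "mode": "REQUIRED",
    "name": "column1",
    "type": "STRING"
  },
  {
    "mode": "REQUIRED",
    "name": "column2",
    "type": "FLOAT"
  },
  {
    "fields": [
      {
        "mode": "NULLABLE",
        "name": "nested1",
        "type": "STRING"
      },
      {
        "mode": "NULLABLE",
        "name": "nested2",
        "type": "STRING"
      }
    ],
    "mode": "REPEATED",
    "name": "column3",
    "type": "RECORD"
  }
]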
After updating your schema file, issue the following command to update
the table's schema. If the table you're updating is in a project other
than your default project, add the project ID to the dataset name in the
following format: PROJECT_ID:DATASET.
bq update PROJECT_ID:DATASET.TABLE SCHEMA
Replace the following:
PROJECT_ID: your project ID.
DATASET: the name of the dataset that contains the table
you're updating.
TABLE: the name of the table you're updating.
SCHEMA: the path to the JSON schema file on your local
machine.
For example, enter the following command to update the schema definition
of mydataset.mytable in your default project. The path to the schema
file on your local machine is /tmp/myschema.json.
bq update mydataset.mytable /tmp/myschema.json
API
Call the tables.patch method and use the schema property to add the nested columns to your
schema definition. Because the tables.update method replaces the entire
table resource, the tables.patch method is preferred.
Add columns when you overwrite or append data
You can add new columns to an existing table when you load data into it and
choose to overwrite the existing table. When you overwrite an existing table,
the schema of the data you're loading is used to overwrite the existing table's
schema. For information on overwriting a table using a load job, see the
documentation for your data's format.
You can add columns to a table when you append data to it in a load job. The
new schema is determined by one of the following:
Autodetection (for CSV and JSON files)
A schema specified in a JSON schema file (for CSV and JSON files)
The self-describing source data for Avro, ORC, Parquet and
Datastore export files
If you specify the schema in a JSON file, the new columns must be defined in it.
If the new column definitions are missing, an error is returned when
you attempt to append the data.
When you add new columns during an append operation,
the values in the new columns are set to NULL for existing rows.
To add a new column when you append data to a table during a load job, use
one of the following options:
bq
Use the bq load command to load your data and specify the --noreplace flag to indicate that you are appending the data to an existing table.
If the data you're appending is in CSV or newline-delimited JSON format,
specify the --autodetect flag to use schema auto-detection or supply the schema in a JSON schema file. The added columns can be
automatically inferred from Avro or Datastore export files.
Set the --schema_update_option flag to ALLOW_FIELD_ADDITION to indicate
that the data you're appending contains new columns.
If the table you're appending is in a dataset in a project other than your
default project, add the project ID to the dataset name in the following
format: PROJECT_ID:DATASET.
(Optional) Supply the --location flag and set the value to your location.
LOCATION: the name of your location. The --location flag is
optional. For example, if you are using BigQuery in the
Tokyo region, set the flag's value to asia-northeast1. You can set a
default value for the location using the .bigqueryrc file.
FORMAT: the format of the data you're appending: NEWLINE_DELIMITED_JSON, CSV, AVRO, PARQUET, ORC, or DATASTORE_BACKUP.
PROJECT_ID: your project ID.
DATASET: the name of the dataset that contains the table.
TABLE: the name of the table you're appending.
PATH_TO_SOURCE: a fully-qualified Cloud Storage URI,
a comma-separated list of URIs, or the path to a data file on your
local machine.
SCHEMA: the path to a local JSON schema file. A schema file
is required only for CSV and JSON files when --autodetect is
unspecified. Avro and Datastore schemas are inferred from the
source data.
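Putting the flags and placeholders together, the load command takes roughly this form (a sketch; supply either --autodetect or SCHEMA, not both):
bq --location=LOCATION load \
--noreplace \
--autodetect \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=FORMAT \
PROJECT_ID:DATASET.TABLE \
PATH_TO_SOURCE \
SCHEMA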
Examples:
Enter the following command to append a local Avro data file,
/tmp/mydata.avro, to mydataset.mytable using a load job. Because schemas
can be automatically inferred from Avro data, you don't need to use
the --autodetect flag. mydataset is in your default project.
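A command along these lines does this (a sketch):
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=AVRO \
mydataset.mytable \
/tmp/mydata.avro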
Enter the following command to append a newline-delimited JSON data file in
Cloud Storage to mydataset.mytable using a load job. The --autodetect flag is used to detect the new columns. mydataset is in your default
project.
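A sketch of such a command; the Cloud Storage URI is illustrative:
bq load \
--noreplace \
--autodetect \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=NEWLINE_DELIMITED_JSON \
mydataset.mytable \
gs://mybucket/mydata.json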
Enter the following command to append a newline-delimited JSON data file in
Cloud Storage to mydataset.mytable using a load job. The schema
containing the new columns is specified in a local JSON schema file,
/tmp/myschema.json. mydataset is in myotherproject, not your default
project.
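A sketch of such a command; the Cloud Storage URI is illustrative:
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_ADDITION \
--source_format=NEWLINE_DELIMITED_JSON \
myotherproject:mydataset.mytable \
gs://mybucket/mydata.json \
/tmp/myschema.json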
import("context""fmt""os""cloud.google.com/go/bigquery")// createTableAndWidenLoad demonstrates augmenting a table's schema to add a new column via a load job.funccreateTableAndWidenLoad(projectID,datasetID,tableID,filenamestring)error{// projectID := "my-project-id"// datasetID := "mydataset"// tableID := "mytable"ctx:=context.Background()client,err:=bigquery.NewClient(ctx,projectID)iferr!=nil{returnfmt.Errorf("bigquery.NewClient: %v",err)}deferclient.Close()sampleSchema:=bigquery.Schema{{Name:"full_name",Type:bigquery.StringFieldType},}meta:=&bigquery.TableMetadata{Schema:sampleSchema,}tableRef:=client.Dataset(datasetID).Table(tableID)iferr:=tableRef.Create(ctx,meta);err!=nil{returnerr}// Now, import data from a local file, but specify field additions are allowed.// Because the data has a second column (age), the schema is amended as part of// the load.f,err:=os.Open(filename)iferr!=nil{returnerr}source:=bigquery.NewReaderSource(f)source.AutoDetect=true// Allow BigQuery to determine schema.source.SkipLeadingRows=1// CSV has a single header line.loader:=client.Dataset(datasetID).Table(tableID).LoaderFrom(source)loader.SchemaUpdateOptions=[]string{"ALLOW_FIELD_ADDITION"}job,err:=loader.Run(ctx)iferr!=nil{returnerr}status,err:=job.Wait(ctx)iferr!=nil{returnerr}iferr:=status.Err();err!=nil{returnerr}returnnil}
importcom.google.cloud.bigquery.BigQuery;importcom.google.cloud.bigquery.BigQueryException;importcom.google.cloud.bigquery.BigQueryOptions;importcom.google.cloud.bigquery.Field;importcom.google.cloud.bigquery.FormatOptions;importcom.google.cloud.bigquery.Job;importcom.google.cloud.bigquery.JobId;importcom.google.cloud.bigquery.JobInfo;importcom.google.cloud.bigquery.JobInfo.SchemaUpdateOption;importcom.google.cloud.bigquery.JobInfo.WriteDisposition;importcom.google.cloud.bigquery.LegacySQLTypeName;importcom.google.cloud.bigquery.LoadJobConfiguration;importcom.google.cloud.bigquery.Schema;importcom.google.cloud.bigquery.TableId;importcom.google.common.collect.ImmutableList;importjava.util.UUID;publicclassAddColumnLoadAppend{publicstaticvoidrunAddColumnLoadAppend()throwsException{// TODO(developer): Replace these variables before running the sample.StringdatasetName="MY_DATASET_NAME";StringtableName="MY_TABLE_NAME";StringsourceUri="/path/to/file.csv";addColumnLoadAppend(datasetName,tableName,sourceUri);}publicstaticvoidaddColumnLoadAppend(StringdatasetName,StringtableName,StringsourceUri)throwsException{try{// Initialize client that will be used to send requests. This client only needs to be created// once, and can be reused for multiple requests.BigQuerybigquery=BigQueryOptions.getDefaultInstance().getService();TableIdtableId=TableId.of(datasetName,tableName);// Add a new column to a BigQuery table while appending rows via a load job.// 'REQUIRED' fields cannot be added to an existing schema, so the additional column must be// 'NULLABLE'.SchemanewSchema=Schema.of(Field.newBuilder("name",LegacySQLTypeName.STRING).setMode(Field.Mode.REQUIRED).build(),// Adding below additional column during the load jobField.newBuilder("post_abbr",LegacySQLTypeName.STRING).setMode(Field.Mode.NULLABLE).build());LoadJobConfigurationloadJobConfig=LoadJobConfiguration.builder(tableId,sourceUri).setFormatOptions(FormatOptions.csv()).setWriteDisposition(WriteDisposition.WRITE_APPEND).setSchema(newSchema).setSchemaUpdateOptions(ImmutableList.of(SchemaUpdateOption.ALLOW_FIELD_ADDITION)).build();// Create a job ID so that we can safely retry.JobIdjobId=JobId.of(UUID.randomUUID().toString());JobloadJob=bigquery.create(JobInfo.newBuilder(loadJobConfig).setJobId(jobId).build());// Load data from a GCS parquet file into the table// Blocks until this load table job completes its execution, either failing or succeeding.JobcompletedJob=loadJob.waitFor();// Check for errorsif(completedJob==null){thrownewException("Job not executed since it no longer exists.");}elseif(completedJob.getStatus().getError()!=null){// You can also look at queryJob.getStatus().getExecutionErrors() for all// errors, not just the latest one.thrownewException("BigQuery was unable to load into the table due to an error: \n"+loadJob.getStatus().getError());}System.out.println("Column successfully added during load append job");}catch(BigQueryException|InterruptedExceptione){System.out.println("Column not added during load append \n"+e.toString());}}}
// Import the Google Cloud client librariesconst{BigQuery}=require('@google-cloud/bigquery');// Instantiate clientconstbigquery=newBigQuery();asyncfunctionaddColumnLoadAppend(){// Adds a new column to a BigQuery table while appending rows via a load job./*** TODO(developer): Uncomment the following lines before running the sample.*/// const fileName = '/path/to/file.csv';// const datasetId = 'my_dataset';// const tableId = 'my_table';// In this example, the existing table contains only the 'Name', 'Age',// & 'Weight' columns. 'REQUIRED' fields cannot be added to an existing// schema, so the additional column must be 'NULLABLE'.constschema='Name:STRING, Age:INTEGER, Weight:FLOAT, IsMagic:BOOLEAN';// Retrieve destination table referenceconst[table]=awaitbigquery.dataset(datasetId).table(tableId).get();constdestinationTableRef=table.metadata.tableReference;// Set load job optionsconstoptions={schema:schema,schemaUpdateOptions:['ALLOW_FIELD_ADDITION'],writeDisposition:'WRITE_APPEND',destinationTable:destinationTableRef,};// Load data from a local file into the tableconst[job]=awaitbigquery.dataset(datasetId).table(tableId).load(fileName,options);console.log(`Job${job.id}completed.`);console.log(`New Schema:`);console.log(job.configuration.load.schema.fields);// Check the job's status for errorsconsterrors=job.status.errors;if(errors&&errors.length>0){throwerrors;}}
# from google.cloud import bigquery
# client = bigquery.Client()
# project = client.project
# dataset_ref = bigquery.DatasetReference(project, 'my_dataset')
# filepath = 'path/to/your_file.csv'

# Retrieves the destination table and checks the length of the schema
table_id = "my_table"
table_ref = dataset_ref.table(table_id)
table = client.get_table(table_ref)
print("Table {} contains {} columns.".format(table_id, len(table.schema)))

# Configures the load job to append the data to the destination table,
# allowing field addition
job_config = bigquery.LoadJobConfig()
job_config.write_disposition = bigquery.WriteDisposition.WRITE_APPEND
job_config.schema_update_options = [
    bigquery.SchemaUpdateOption.ALLOW_FIELD_ADDITION
]
# In this example, the existing table contains only the 'full_name' column.
# 'REQUIRED' fields cannot be added to an existing schema, so the
# additional column must be 'NULLABLE'.
job_config.schema = [
    bigquery.SchemaField("full_name", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("age", "INTEGER", mode="NULLABLE"),
]
job_config.source_format = bigquery.SourceFormat.CSV
job_config.skip_leading_rows = 1

with open(filepath, "rb") as source_file:
    job = client.load_table_from_file(
        source_file,
        table_ref,
        location="US",  # Must match the destination dataset location.
        job_config=job_config,
    )  # API request

job.result()  # Waits for table load to complete.
print(
    "Loaded {} rows into {}:{}.".format(
        job.output_rows, dataset_id, table_ref.table_id
    )
)

# Checks the updated length of the schema
table = client.get_table(table)
print("Table {} now contains {} columns.".format(table_id, len(table.schema)))
Add columns in a query append job
You can add columns to a table when you append query results to it.
When you add columns using an append operation in a query job, the schema of the
query results is used to update the schema of the destination table. Note that
you cannot query a table in one location and write the results to a table in
another location.
To add a new column when you append data to a table during a query job, select
one of the following options:
bq
Use the bq query command to query your data and specify the --destination_table flag to indicate which table you're appending.
To specify that you are appending query results to an existing destination
table, specify the --append_table flag.
Set the --schema_update_option flag to ALLOW_FIELD_ADDITION to indicate
that the query results you're appending contain new columns.
Specify the use_legacy_sql=false flag to use GoogleSQL syntax for the
query.
If the table you're appending is in a dataset in a project other than your
default project, add the project ID to the dataset name in the following
format: PROJECT_ID:DATASET. Note that the table
you're querying and the destination table must be in the same location.
(Optional) Supply the --location flag and set the value to your location.
LOCATION: the name of your location. The --location flag is
optional. For example, if you are using BigQuery in the
Tokyo region, set the flag's value to asia-northeast1. You can set a
default value for the location using the .bigqueryrc file.
Note that you cannot append query results to a table in another
location.
PROJECT_ID: your project ID.
DATASET: the name of the dataset that contains the table
you're appending.
TABLE: the name of the table you're appending.
QUERY: a query in GoogleSQL syntax.
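Putting the flags and placeholders together, the query command takes roughly this form (a sketch):
bq --location=LOCATION query \
--destination_table PROJECT_ID:DATASET.TABLE \
--append_table \
--schema_update_option=ALLOW_FIELD_ADDITION \
--use_legacy_sql=false \
'QUERY'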
Examples:
Enter the following command to query mydataset.mytable in your default
project and to append the query results to mydataset.mytable2 (also in
your default project).
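A command along these lines does this (a sketch; the column list in the query is illustrative):
bq query \
--destination_table mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_ADDITION \
--use_legacy_sql=false \
'SELECT column1, column2, column3 FROM mydataset.mytable'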
import("context""fmt""cloud.google.com/go/bigquery")// createTableAndWidenQuery demonstrates how the schema of a table can be modified to add columns by appending// query results that include the new columns.funccreateTableAndWidenQuery(projectID,datasetID,tableIDstring)error{// projectID := "my-project-id"// datasetID := "mydataset"// tableID := "mytable"ctx:=context.Background()client,err:=bigquery.NewClient(ctx,projectID)iferr!=nil{returnfmt.Errorf("bigquery.NewClient: %v",err)}deferclient.Close()// First, we create a sample table.sampleSchema:=bigquery.Schema{{Name:"full_name",Type:bigquery.StringFieldType,Required:true},{Name:"age",Type:bigquery.IntegerFieldType,Required:true},}original:=&bigquery.TableMetadata{Schema:sampleSchema,}tableRef:=client.Dataset(datasetID).Table(tableID)iferr:=tableRef.Create(ctx,original);err!=nil{returnerr}// Our table has two columns. We'll introduce a new favorite_color column via// a subsequent query that appends to the table.q:=client.Query("SELECT \"Timmy\" as full_name, 85 as age, \"Blue\" as favorite_color")q.SchemaUpdateOptions=[]string{"ALLOW_FIELD_ADDITION"}q.QueryConfig.Dst=client.Dataset(datasetID).Table(tableID)q.WriteDisposition=bigquery.WriteAppendq.Location="US"job,err:=q.Run(ctx)iferr!=nil{returnerr}_,err=job.Wait(ctx)iferr!=nil{returnerr}returnnil}
importcom.google.cloud.bigquery.BigQuery;importcom.google.cloud.bigquery.BigQueryException;importcom.google.cloud.bigquery.BigQueryOptions;importcom.google.cloud.bigquery.Job;importcom.google.cloud.bigquery.JobInfo;importcom.google.cloud.bigquery.JobInfo.SchemaUpdateOption;importcom.google.cloud.bigquery.JobInfo.WriteDisposition;importcom.google.cloud.bigquery.QueryJobConfiguration;importcom.google.cloud.bigquery.TableId;importcom.google.cloud.bigquery.TableResult;importcom.google.common.collect.ImmutableList;publicclassRelaxTableQuery{publicstaticvoidrunRelaxTableQuery()throwsException{// TODO(developer): Replace these variables before running the sample.StringprojectId="MY_PROJECT_ID";StringdatasetName="MY_DATASET_NAME";StringtableName="MY_TABLE_NAME";relaxTableQuery(projectId,datasetName,tableName);}// To relax all columns in a destination table when you append data to it during a query jobpublicstaticvoidrelaxTableQuery(StringprojectId,StringdatasetName,StringtableName)throwsException{try{// Initialize client that will be used to send requests. This client only needs to be created// once, and can be reused for multiple requests.BigQuerybigquery=BigQueryOptions.getDefaultInstance().getService();TableIdtableId=TableId.of(datasetName,tableName);StringsourceTable="`"+projectId+"."+datasetName+"."+tableName+"`";Stringquery="SELECT word FROM "+sourceTable+" WHERE word like '%is%'";QueryJobConfigurationqueryConfig=QueryJobConfiguration.newBuilder(query)// Use standard SQL syntax for queries.// See: https://cloud.google.com/bigquery/sql-reference/.setUseLegacySql(false).setSchemaUpdateOptions(ImmutableList.of(SchemaUpdateOption.ALLOW_FIELD_RELAXATION)).setWriteDisposition(WriteDisposition.WRITE_APPEND).setDestinationTable(tableId).build();JobqueryJob=bigquery.create(JobInfo.newBuilder(queryConfig).build());queryJob=queryJob.waitFor();// Check for errorsif(queryJob==null){thrownewException("Job no longer exists");}elseif(queryJob.getStatus().getError()!=null){// You can also look at queryJob.getStatus().getExecutionErrors() for all// errors, not just the latest one.thrownewException(queryJob.getStatus().getError().toString());}// Get the results.TableResultresults=queryJob.getQueryResults();// Print all pages of the results.results.iterateAll().forEach(rows->{rows.forEach(row->System.out.println("row: "+row.toString()));});System.out.println("Successfully relaxed all columns in destination table during query job");}catch(BigQueryException|InterruptedExceptione){System.out.println("Columns not relaxed during query job \n"+e.toString());}}}
// Import the Google Cloud client librariesconst{BigQuery}=require('@google-cloud/bigquery');// Instantiate clientconstbigquery=newBigQuery();asyncfunctionaddColumnQueryAppend(){// Adds a new column to a BigQuery table while appending rows via a query job./*** TODO(developer): Uncomment the following lines before running the sample.*/// const datasetId = 'my_dataset';// const tableId = 'my_table';// Retrieve destination table referenceconst[table]=awaitbigquery.dataset(datasetId).table(tableId).get();constdestinationTableRef=table.metadata.tableReference;// In this example, the existing table contains only the 'name' column.// 'REQUIRED' fields cannot be added to an existing schema,// so the additional column must be 'NULLABLE'.constquery=`SELECT name, yearFROM \`bigquery-public-data.usa_names.usa_1910_2013\`WHERE state = 'TX'LIMIT 10`;// Set load job optionsconstoptions={query:query,schemaUpdateOptions:['ALLOW_FIELD_ADDITION'],writeDisposition:'WRITE_APPEND',destinationTable:destinationTableRef,// Location must match that of the dataset(s) referenced in the query.location:'US',};const[job]=awaitbigquery.createQueryJob(options);console.log(`Job${job.id}started.`);// Wait for the query to finishconst[rows]=awaitjob.getQueryResults();console.log(`Job${job.id}completed.`);// Print the resultsconsole.log('Rows:');rows.forEach(row=>console.log(row));}
from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the destination table.
# table_id = "your-project.your_dataset.your_table_name"

# Retrieves the destination table and checks the length of the schema.
table = client.get_table(table_id)  # Make an API request.
print("Table {} contains {} columns".format(table_id, len(table.schema)))

# Configures the query to append the results to a destination table,
# allowing field addition.
job_config = bigquery.QueryJobConfig(
    destination=table_id,
    schema_update_options=[bigquery.SchemaUpdateOption.ALLOW_FIELD_ADDITION],
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

# Start the query, passing in the extra configuration.
client.query_and_wait(
    # In this example, the existing table contains only the 'full_name' and
    # 'age' columns, while the results of this query will contain an
    # additional 'favorite_color' column.
    'SELECT "Timmy" as full_name, 85 as age, "Blue" as favorite_color;',
    job_config=job_config,
)  # Make an API request and wait for job to complete.

# Checks the updated length of the schema.
table = client.get_table(table_id)  # Make an API request.
print("Table {} now contains {} columns".format(table_id, len(table.schema)))
For more
information about ALTER TABLE RENAME COLUMN statements, see DDL details.
Change a column's data type
Changing a column's data type isn't supported by the Google Cloud console, the
bq command-line tool, or the BigQuery API. If you attempt to update a table by
applying a schema
that specifies a new data type for a column, an error is returned.
Change a column's data type with a DDL statement
You can use GoogleSQL to make certain changes to the data type of a
column. For more information and a complete list of supported data type
conversions, see the ALTER COLUMN SET DATA TYPE DDL statement.
The following example creates a table with a column of type INT64, then
updates the type to NUMERIC:
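A minimal sketch of that example (table and column names are illustrative):
CREATE TABLE mydataset.mytable(c1 INT64);

ALTER TABLE mydataset.mytable
ALTER COLUMN c1 SET DATA TYPE NUMERIC;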
For complex nested schema changes, like altering a field within an array of
STRUCTs, the ALTER TABLE DDL statement isn't supported.
As a workaround, you can use the CREATE OR REPLACE TABLE statement with a SELECT statement
to apply the nested schema changes.
The following example demonstrates how to transform a column within an array of STRUCTs.
Consider a table samples.test with the following schema and data:
+----------------------------------------------------------------------------+
| D |
+----------------------------------------------------------------------------+
| {"L":[{"R":"r1","U":"u1","V":"v1"},{"R":"r2","U":"u2","V":"v2"}],"F":"f1"} |
+----------------------------------------------------------------------------+
Suppose you need to change the type of field U within the nested array of STRUCTs to STRUCT<W STRING>. The following SQL statement demonstrates how to
accomplish this:
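A sketch of such a statement, consistent with the explanation and the resulting schema shown below (exact formatting may vary):
CREATE OR REPLACE TABLE samples.new_table
AS
SELECT
  STRUCT(
    ARRAY(
      SELECT AS STRUCT
        tmp.R,
        STRUCT(tmp.U AS W) AS U,
        tmp.V
      FROM UNNEST(t.D.L) AS tmp
    ) AS L,
    t.D.F AS F
  ) AS D
FROM samples.test AS t;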
This statement creates a new table, samples.new_table, with the target schema.
The UNNEST function expands the array of STRUCTs within t.D.L. The
expression STRUCT(tmp.U AS W) AS U constructs the new STRUCT with field W,
populated by the value from the original U field. The resulting table,
samples.new_table, has the following schema and data:
+----------------------------------------------------------------------------------------+
| D |
+----------------------------------------------------------------------------------------+
| {"L":[{"R":"r1","U":{"W":"u1"},"V":"v1"},{"R":"r2","U":{"W":"u2"},"V":"v2"}],"F":"f1"} |
+----------------------------------------------------------------------------------------+
Cast a column's data type
To change a column's data type into a castable type,
use a SQL query to select the table data, cast the relevant column, and overwrite the table. Casting
and overwriting is not recommended for very large tables because it requires a
full table scan.
The following example shows a SQL query that selects all the data from column_two and column_three in mydataset.mytable and casts column_one from DATE to STRING. The query result is used to overwrite the existing
table. The overwritten table stores column_one as a STRING data type.
When using CAST, a query can fail if BigQuery is unable to
perform the cast. For details on casting rules in GoogleSQL, see Casting.
Console
In the Google Cloud console, go to the BigQuery page.
In the Query editor, enter the following query to select all of the
data from column_two and column_three in mydataset.mytable and to
cast column_one from DATE to STRING. The query uses an alias to
cast column_one with the same name. mydataset.mytable is in
your default project.
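The statement looks like the following (the same query appears in the bq example later in this section):
SELECT
  column_two,
  column_three,
  CAST(column_one AS STRING) AS column_one
FROM
  mydataset.mytable;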
When the query job completes, the data type of column_one is STRING.
bq
Enter the following bq query command to select all of the data from column_two and column_three in mydataset.mytable and to cast column_one from DATE to STRING. The query uses an alias to cast column_one with the same name. mydataset.mytable is in your default
project.
The query results are written to mydataset.mytable using the --destination_table flag, and the --replace flag is used to overwrite mytable. Specify the use_legacy_sql=false flag to use
GoogleSQL syntax.
Optionally, supply the --location flag and set the value to your location.
bq query \
--destination_table mydataset.mytable \
--replace \
--use_legacy_sql=false \
'SELECT
column_two,
column_three,
CAST(column_one AS STRING) AS column_one
FROM
mydataset.mytable'
API
To select all of the data from column_two and column_three in mydataset.mytable and to cast column_one from DATE to STRING, call
the jobs.insert method and configure a query job. Optionally, specify your location in the location property in the jobReference section.
The SQL query used in the query job would be SELECT column_two,
column_three, CAST(column_one AS STRING) AS column_one FROM
mydataset.mytable. The query uses an alias to cast column_one with the
same name.
To overwrite mytable with the query results, include mydataset.mytable in the configuration.query.destinationTable property, and specify WRITE_TRUNCATE in the configuration.query.writeDisposition property.
Change a column's mode
The only supported modification you can make to a column's mode is
changing it from REQUIRED to NULLABLE. Changing a column's mode from REQUIRED to NULLABLE is also called column relaxation. You can also relax a
column when you load data to overwrite an existing table,
or when you append data to an existing table. You can't change a column's mode
from NULLABLE to REQUIRED.
Make a column NULLABLE in an existing table
To change a column's mode from REQUIRED to NULLABLE, select one of
the following options:
bq
First, issue the bq show command with the --schema flag and write the
existing table schema to a file. If the table you're updating is in a
project other than your default project, add the project ID to the dataset
name in the following format: PROJECT_ID:DATASET. Then, in the schema file,
change the mode of the columns you want to relax from REQUIRED to NULLABLE.
After updating your schema file, issue the following command to update
the table's schema. If the table you're updating is in a project other than
your default project, add the project ID to the dataset name in the
following format: PROJECT_ID:DATASET.
bq update PROJECT_ID:DATASET.TABLE SCHEMA
Replace the following:
PROJECT_ID: your project ID.
DATASET: the name of the dataset that contains the table
you're updating.
TABLE: the name of the table you're updating.
SCHEMA: the path to the JSON schema file on your local
machine.
For example, enter the following command to update the schema definition
of mydataset.mytable in your default project. The path to the schema
file on your local machine is /tmp/myschema.json.
bq update mydataset.mytable /tmp/myschema.json
API
Call tables.patch and
use the schema property to change a REQUIRED column to NULLABLE in
your schema definition. Because the tables.update method replaces the
entire table resource, the tables.patch method is preferred.
import("context""fmt""cloud.google.com/go/bigquery")// relaxTableAPI demonstrates modifying the schema of a table to remove the requirement that columns allow// no NULL values.funcrelaxTableAPI(projectID,datasetID,tableIDstring)error{// projectID := "my-project-id"// datasetID := "mydatasetid"// tableID := "mytableid"ctx:=context.Background()client,err:=bigquery.NewClient(ctx,projectID)iferr!=nil{returnfmt.Errorf("bigquery.NewClient: %v",err)}deferclient.Close()// Setup: We first create a table with a schema that's restricts NULL values.sampleSchema:=bigquery.Schema{{Name:"full_name",Type:bigquery.StringFieldType,Required:true},{Name:"age",Type:bigquery.IntegerFieldType,Required:true},}original:=&bigquery.TableMetadata{Schema:sampleSchema,}iferr:=client.Dataset(datasetID).Table(tableID).Create(ctx,original);err!=nil{returnerr}tableRef:=client.Dataset(datasetID).Table(tableID)meta,err:=tableRef.Metadata(ctx)iferr!=nil{returnerr}// Iterate through the schema to set all Required fields to false (nullable).varrelaxedbigquery.Schemafor_,v:=rangemeta.Schema{v.Required=falserelaxed=append(relaxed,v)}newMeta:=bigquery.TableMetadataToUpdate{Schema:relaxed,}if_,err:=tableRef.Update(ctx,newMeta,meta.ETag);err!=nil{returnerr}returnnil}
importcom.google.cloud.bigquery.BigQuery;importcom.google.cloud.bigquery.BigQueryException;importcom.google.cloud.bigquery.BigQueryOptions;importcom.google.cloud.bigquery.Field;importcom.google.cloud.bigquery.LegacySQLTypeName;importcom.google.cloud.bigquery.Schema;importcom.google.cloud.bigquery.StandardTableDefinition;importcom.google.cloud.bigquery.Table;publicclassRelaxColumnMode{publicstaticvoidrunRelaxColumnMode(){// TODO(developer): Replace these variables before running the sample.StringdatasetName="MY_DATASET_NAME";StringtableId="MY_TABLE_NAME";relaxColumnMode(datasetName,tableId);}publicstaticvoidrelaxColumnMode(StringdatasetName,StringtableId){try{// Initialize client that will be used to send requests. This client only needs to be created// once, and can be reused for multiple requests.BigQuerybigquery=BigQueryOptions.getDefaultInstance().getService();Tabletable=bigquery.getTable(datasetName,tableId);// Create new relaxed schema based on the existing table schemaSchemarelaxedSchema=Schema.of(// The only supported modification you can make to a column's mode is changing it from// REQUIRED to NULLABLE// Changing a column's mode from REQUIRED to NULLABLE is also called column relaxation// INFO: LegacySQLTypeName will be updated to StandardSQLTypeName in release 1.103.0Field.newBuilder("word",LegacySQLTypeName.STRING).setMode(Field.Mode.NULLABLE).build(),Field.newBuilder("word_count",LegacySQLTypeName.STRING).setMode(Field.Mode.NULLABLE).build(),Field.newBuilder("corpus",LegacySQLTypeName.STRING).setMode(Field.Mode.NULLABLE).build(),Field.newBuilder("corpus_date",LegacySQLTypeName.STRING).setMode(Field.Mode.NULLABLE).build());// Update the table with the new schemaTableupdatedTable=table.toBuilder().setDefinition(StandardTableDefinition.of(relaxedSchema)).build();updatedTable.update();System.out.println("Table schema successfully relaxed.");}catch(BigQueryExceptione){System.out.println("Table schema not relaxed \n"+e.toString());}}}
// Import the Google Cloud client library and create a clientconst{BigQuery}=require('@google-cloud/bigquery');constbigquery=newBigQuery();asyncfunctionrelaxColumn(){/*** Changes columns from required to nullable.* Assumes existing table with the following schema:* [{name: 'Name', type: 'STRING', mode: 'REQUIRED'},* {name: 'Age', type: 'INTEGER'},* {name: 'Weight', type: 'FLOAT'},* {name: 'IsMagic', type: 'BOOLEAN'}];*//*** TODO(developer): Uncomment the following lines before running the sample.*/// const datasetId = 'my_dataset';// const tableId = 'my_table';constnewSchema=[{name:'Name',type:'STRING',mode:'NULLABLE'},{name:'Age',type:'INTEGER'},{name:'Weight',type:'FLOAT'},{name:'IsMagic',type:'BOOLEAN'},];// Retrieve current table metadataconsttable=bigquery.dataset(datasetId).table(tableId);const[metadata]=awaittable.getMetadata();// Update schemametadata.schema=newSchema;const[apiResponse]=awaittable.setMetadata(metadata);console.log(apiResponse.schema.fields);}
from google.cloud import bigquery

client = bigquery.Client()

# TODO(dev): Change table_id to full name of the table you want to create.
table_id = "your-project.your_dataset.your_table"

table = client.get_table(table_id)
new_schema = []
for field in table.schema:
    if field.mode != "REQUIRED":
        new_schema.append(field)
    else:
        # SchemaField properties cannot be edited after initialization.
        # To make changes, construct new SchemaField objects.
        new_field = field.to_api_repr()
        new_field["mode"] = "NULLABLE"
        relaxed_field = bigquery.SchemaField.from_api_repr(new_field)
        new_schema.append(relaxed_field)

table.schema = new_schema
table = client.update_table(table, ["schema"])

print(f"Updated {table_id} schema: {table.schema}.")
Make a column NULLABLE with an appending load job
You can relax a column's mode when you append data to a table in a load job.
Select one of the following based on the type of file:
When appending data from CSV and JSON files, relax the mode for individual
columns by specifying a JSON schema file.
When appending data from Avro, ORC, or Parquet files, relax columns to NULL in your schema and let schema inference detect the relaxed columns.
To relax a column from REQUIRED to NULLABLE when you append data to a table
during a load job, select one of the following options:
Console
You cannot relax a column's mode using the Google Cloud console.
bq
Use the bq load command to load your data and specify the --noreplace flag to indicate that you are appending the data to an existing table.
If the data you're appending is in CSV or newline-delimited JSON format,
specify the relaxed columns in a local JSON schema file or use the --autodetect flag to use schema detection to discover relaxed columns in the source data.
Relaxed columns can be automatically inferred from Avro, ORC, and Parquet
files. Column relaxation does not apply to Datastore export
appends. The columns in tables created by loading Datastore export
files are alwaysNULLABLE.
Set the --schema_update_option flag to ALLOW_FIELD_RELAXATION to
indicate that the data you're appending contains relaxed columns.
If the table you're appending is in a dataset in a project other than your
default project, add the project ID to the dataset name in the following
format: PROJECT_ID:DATASET.
(Optional) Supply the --location flag and set the value to your location.
LOCATION: the name of your location. The --location flag is
optional. For example, if you are using BigQuery in the
Tokyo region, set the flag's value to asia-northeast1. You can set a
default value for the location using the .bigqueryrc file.
FORMAT: NEWLINE_DELIMITED_JSON, CSV, PARQUET, ORC, or AVRO. DATASTORE_BACKUP files don't require column relaxation. The
columns in tables created from Datastore export files are
always NULLABLE.
PROJECT_ID: your project ID.
DATASET: the name of the dataset that
contains the table.
TABLE: the name of the table you're
appending.
PATH_TO_SOURCE: a fully-qualified Cloud Storage URI,
a comma-separated list of URIs, or the path to a data file on your
local machine.
SCHEMA: the path to a local JSON schema file. This option is
used only for CSV and JSON files. Relaxed columns are automatically
inferred from Avro files.
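Putting the flags and placeholders together, the load command takes roughly this form (a sketch; supply either --autodetect or SCHEMA, not both):
bq --location=LOCATION load \
--noreplace \
--autodetect \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=FORMAT \
PROJECT_ID:DATASET.TABLE \
PATH_TO_SOURCE \
SCHEMA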
Examples:
Enter the following command to append a local Avro data file,
/tmp/mydata.avro, to mydataset.mytable using a load job. Since relaxed
columns can be automatically inferred from Avro data, you don't need to
specify a schema file. mydataset is in your default project.
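A command along these lines does this (a sketch):
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=AVRO \
mydataset.mytable \
/tmp/mydata.avro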
Enter the following command to append data from a newline-delimited JSON
file in Cloud Storage to mydataset.mytable using a load job. The
schema containing the relaxed columns is in a local JSON schema file,
/tmp/myschema.json. mydataset is in your default project.
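A sketch of such a command; the Cloud Storage URI is illustrative:
bq load \
--noreplace \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=NEWLINE_DELIMITED_JSON \
mydataset.mytable \
gs://mybucket/mydata.json \
/tmp/myschema.json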
Enter the following command to append data in a CSV file on your local
machine to mydataset.mytable using a load job. The command uses schema
auto-detection to discover relaxed columns in the source data. mydataset is in myotherproject, not your default project.
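A sketch of such a command; the local CSV path is illustrative:
bq load \
--noreplace \
--autodetect \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--source_format=CSV \
myotherproject:mydataset.mytable \
mydata.csv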
import("context""fmt""os""cloud.google.com/go/bigquery")// relaxTableImport demonstrates amending the schema of a table to relax columns from// not allowing NULL values to allowing them.funcrelaxTableImport(projectID,datasetID,tableID,filenamestring)error{// projectID := "my-project-id"// datasetID := "mydataset"// tableID := "mytable"ctx:=context.Background()client,err:=bigquery.NewClient(ctx,projectID)iferr!=nil{returnfmt.Errorf("bigquery.NewClient: %v",err)}deferclient.Close()sampleSchema:=bigquery.Schema{{Name:"full_name",Type:bigquery.StringFieldType,Required:true},{Name:"age",Type:bigquery.IntegerFieldType,Required:true},}meta:=&bigquery.TableMetadata{Schema:sampleSchema,}tableRef:=client.Dataset(datasetID).Table(tableID)iferr:=tableRef.Create(ctx,meta);err!=nil{returnerr}// Now, import data from a local file, but specify relaxation of required// fields as a side effect while the data is appended.f,err:=os.Open(filename)iferr!=nil{returnerr}source:=bigquery.NewReaderSource(f)source.AutoDetect=true// Allow BigQuery to determine schema.source.SkipLeadingRows=1// CSV has a single header line.loader:=client.Dataset(datasetID).Table(tableID).LoaderFrom(source)loader.SchemaUpdateOptions=[]string{"ALLOW_FIELD_RELAXATION"}job,err:=loader.Run(ctx)iferr!=nil{returnerr}status,err:=job.Wait(ctx)iferr!=nil{returnerr}iferr:=status.Err();err!=nil{returnerr}returnnil}
importcom.google.cloud.bigquery.BigQuery;importcom.google.cloud.bigquery.BigQueryException;importcom.google.cloud.bigquery.BigQueryOptions;importcom.google.cloud.bigquery.CsvOptions;importcom.google.cloud.bigquery.Field;importcom.google.cloud.bigquery.Job;importcom.google.cloud.bigquery.JobInfo;importcom.google.cloud.bigquery.LoadJobConfiguration;importcom.google.cloud.bigquery.Schema;importcom.google.cloud.bigquery.StandardSQLTypeName;importcom.google.cloud.bigquery.Table;importcom.google.cloud.bigquery.TableId;importcom.google.common.collect.ImmutableList;// Sample to append relax column in a table.publicclassRelaxColumnLoadAppend{publicstaticvoidmain(String[]args){// TODO(developer): Replace these variables before running the sample.StringdatasetName="MY_DATASET_NAME";StringtableName="MY_TABLE_NAME";StringsourceUri="gs://cloud-samples-data/bigquery/us-states/us-states.csv";relaxColumnLoadAppend(datasetName,tableName,sourceUri);}publicstaticvoidrelaxColumnLoadAppend(StringdatasetName,StringtableName,StringsourceUri){try{// Initialize client that will be used to send requests. This client only needs to be created// once, and can be reused for multiple requests.BigQuerybigquery=BigQueryOptions.getDefaultInstance().getService();// Retrieve destination table referenceTabletable=bigquery.getTable(TableId.of(datasetName,tableName));// column as a 'REQUIRED' field.Fieldname=Field.newBuilder("name",StandardSQLTypeName.STRING).setMode(Field.Mode.REQUIRED).build();FieldpostAbbr=Field.newBuilder("post_abbr",StandardSQLTypeName.STRING).setMode(Field.Mode.REQUIRED).build();Schemaschema=Schema.of(name,postAbbr);// Skip header row in the file.CsvOptionscsvOptions=CsvOptions.newBuilder().setSkipLeadingRows(1).build();// Set job optionsLoadJobConfigurationloadConfig=LoadJobConfiguration.newBuilder(table.getTableId(),sourceUri).setSchema(schema).setFormatOptions(csvOptions).setSchemaUpdateOptions(ImmutableList.of(JobInfo.SchemaUpdateOption.ALLOW_FIELD_RELAXATION)).setWriteDisposition(JobInfo.WriteDisposition.WRITE_APPEND).build();// Create a load job and wait for it to complete.Jobjob=bigquery.create(JobInfo.of(loadConfig));job=job.waitFor();// Check the job's status for errorsif(job.isDone()&&job.getStatus().getError()==null){System.out.println("Relax column append successfully loaded in a table");}else{System.out.println("BigQuery was unable to load into the table due to an error:"+job.getStatus().getError());}}catch(BigQueryException|InterruptedExceptione){System.out.println("Column not added during load append \n"+e.toString());}}}
// Import the Google Cloud client librariesconst{BigQuery}=require('@google-cloud/bigquery');// Instantiate clientconstbigquery=newBigQuery();asyncfunctionrelaxColumnLoadAppend(){// Changes required column to nullable in load append job./*** TODO(developer): Uncomment the following lines before running the sample.*/// const fileName = '/path/to/file.csv';// const datasetId = 'my_dataset';// const tableId = 'my_table';// In this example, the existing table contains the 'Name'// column as a 'REQUIRED' field.constschema='Age:INTEGER, Weight:FLOAT, IsMagic:BOOLEAN';// Retrieve destination table referenceconst[table]=awaitbigquery.dataset(datasetId).table(tableId).get();constdestinationTableRef=table.metadata.tableReference;// Set load job optionsconstoptions={schema:schema,schemaUpdateOptions:['ALLOW_FIELD_RELAXATION'],writeDisposition:'WRITE_APPEND',destinationTable:destinationTableRef,};// Load data from a local file into the tableconst[job]=awaitbigquery.dataset(datasetId).table(tableId).load(fileName,options);console.log(`Job${job.id}completed.`);// Check the job's status for errorsconsterrors=job.status.errors;if(errors&&errors.length>0){throwerrors;}}
# from google.cloud import bigquery
# client = bigquery.Client()
# project = client.project
# dataset_ref = bigquery.DatasetReference(project, 'my_dataset')
# filepath = 'path/to/your_file.csv'

# Retrieves the destination table and checks the number of required fields
table_id = "my_table"
table_ref = dataset_ref.table(table_id)
table = client.get_table(table_ref)
original_required_fields = sum(field.mode == "REQUIRED" for field in table.schema)
# In this example, the existing table has 3 required fields.
print("{} fields in the schema are required.".format(original_required_fields))

# Configures the load job to append the data to a destination table,
# allowing field relaxation
job_config = bigquery.LoadJobConfig()
job_config.write_disposition = bigquery.WriteDisposition.WRITE_APPEND
job_config.schema_update_options = [
    bigquery.SchemaUpdateOption.ALLOW_FIELD_RELAXATION
]
# In this example, the existing table contains three required fields
# ('full_name', 'age', and 'favorite_color'), while the data to load
# contains only the first two fields.
job_config.schema = [
    bigquery.SchemaField("full_name", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("age", "INTEGER", mode="REQUIRED"),
]
job_config.source_format = bigquery.SourceFormat.CSV
job_config.skip_leading_rows = 1

with open(filepath, "rb") as source_file:
    job = client.load_table_from_file(
        source_file,
        table_ref,
        location="US",  # Must match the destination dataset location.
        job_config=job_config,
    )  # API request

job.result()  # Waits for table load to complete.
print(
    "Loaded {} rows into {}:{}.".format(
        job.output_rows, dataset_id, table_ref.table_id
    )
)

# Checks the updated number of required fields
table = client.get_table(table)
current_required_fields = sum(field.mode == "REQUIRED" for field in table.schema)
print("{} fields in the schema are now required.".format(current_required_fields))
Make all columns NULLABLE with an append job
You can relax all columns in a table when you append query results to it. You
can relax all required fields in the destination table by setting the --schema_update_option flag to ALLOW_FIELD_RELAXATION. You cannot relax
individual columns in a destination table by using a query append. To relax
individual columns with a load append job, see Make a column NULLABLE with an appending load job.
To relax all columns when you append query results to a destination table,
select one of the following options:
Console
You cannot relax a column's mode using the Google Cloud console.
bq
Use the bq query command to query your data and specify the --destination_table flag to indicate which table you're appending.
To specify that you are appending query results to an existing destination
table, specify the --append_table flag.
Set the --schema_update_option flag to ALLOW_FIELD_RELAXATION to
indicate that all REQUIRED columns in the table you're appending should be
changed to NULLABLE.
Specify the use_legacy_sql=false flag to use GoogleSQL syntax for the
query.
If the table you're appending is in a dataset in a project other than your
default project, add the project ID to the dataset name in the following
format: PROJECT_ID:DATASET.
(Optional) Supply the --location flag and set the value to your location.
LOCATION: the name of your location. The --location flag is
optional. For example, if you are using BigQuery in the
Tokyo region, set the flag's value to asia-northeast1. You can set a
default value for the location using the .bigqueryrc file.
PROJECT_ID: your project ID.
DATASET: the name of the dataset that contains the table
you're appending.
TABLE: the name of the table you're appending.
QUERY: a query in GoogleSQL syntax.
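Putting the flags and placeholders together, the query command takes roughly this form (a sketch):
bq --location=LOCATION query \
--destination_table PROJECT_ID:DATASET.TABLE \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'QUERY'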
Examples:
Enter the following command to query mydataset.mytable in your default
project and append the query results to mydataset.mytable2 (also in
your default project). The command changes all REQUIRED columns in the
destination table to NULLABLE.
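A command along these lines does this (a sketch; the column list in the query is illustrative):
bq query \
--destination_table mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'SELECT column1, column2 FROM mydataset.mytable'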
Enter the following command to query mydataset.mytable in your default
project and to append the query results to mydataset.mytable2 in myotherproject. The command changes all REQUIRED columns in the
destination table to NULLABLE.
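A sketch of such a command, again with an illustrative column list:
bq query \
--destination_table myotherproject:mydataset.mytable2 \
--append_table \
--schema_update_option=ALLOW_FIELD_RELAXATION \
--use_legacy_sql=false \
'SELECT column1, column2 FROM mydataset.mytable'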
import("context""fmt""cloud.google.com/go/bigquery")// relaxTableQuery demonstrates relaxing the schema of a table by appending query results to// enable the table to allow NULL values.funcrelaxTableQuery(projectID,datasetID,tableIDstring)error{// projectID := "my-project-id"// datasetID := "mydataset"// tableID := "mytable"ctx:=context.Background()client,err:=bigquery.NewClient(ctx,projectID)iferr!=nil{returnfmt.Errorf("bigquery.NewClient: %v",err)}deferclient.Close()sampleSchema:=bigquery.Schema{{Name:"full_name",Type:bigquery.StringFieldType,Required:true},{Name:"age",Type:bigquery.IntegerFieldType,Required:true},}meta:=&bigquery.TableMetadata{Schema:sampleSchema,}tableRef:=client.Dataset(datasetID).Table(tableID)iferr:=tableRef.Create(ctx,meta);err!=nil{returnerr}// Now, append a query result that includes nulls, but allow the job to relax// all required columns.q:=client.Query("SELECT \"Beyonce\" as full_name")q.QueryConfig.Dst=client.Dataset(datasetID).Table(tableID)q.SchemaUpdateOptions=[]string{"ALLOW_FIELD_RELAXATION"}q.WriteDisposition=bigquery.WriteAppendq.Location="US"job,err:=q.Run(ctx)iferr!=nil{returnerr}_,err=job.Wait(ctx)iferr!=nil{returnerr}returnnil}
importcom.google.cloud.bigquery.BigQuery;importcom.google.cloud.bigquery.BigQueryException;importcom.google.cloud.bigquery.BigQueryOptions;importcom.google.cloud.bigquery.Job;importcom.google.cloud.bigquery.JobInfo;importcom.google.cloud.bigquery.JobInfo.SchemaUpdateOption;importcom.google.cloud.bigquery.JobInfo.WriteDisposition;importcom.google.cloud.bigquery.QueryJobConfiguration;importcom.google.cloud.bigquery.TableId;importcom.google.cloud.bigquery.TableResult;importcom.google.common.collect.ImmutableList;publicclassRelaxTableQuery{publicstaticvoidrunRelaxTableQuery()throwsException{// TODO(developer): Replace these variables before running the sample.StringprojectId="MY_PROJECT_ID";StringdatasetName="MY_DATASET_NAME";StringtableName="MY_TABLE_NAME";relaxTableQuery(projectId,datasetName,tableName);}// To relax all columns in a destination table when you append data to it during a query jobpublicstaticvoidrelaxTableQuery(StringprojectId,StringdatasetName,StringtableName)throwsException{try{// Initialize client that will be used to send requests. This client only needs to be created// once, and can be reused for multiple requests.BigQuerybigquery=BigQueryOptions.getDefaultInstance().getService();TableIdtableId=TableId.of(datasetName,tableName);StringsourceTable="`"+projectId+"."+datasetName+"."+tableName+"`";Stringquery="SELECT word FROM "+sourceTable+" WHERE word like '%is%'";QueryJobConfigurationqueryConfig=QueryJobConfiguration.newBuilder(query)// Use standard SQL syntax for queries.// See: https://cloud.google.com/bigquery/sql-reference/.setUseLegacySql(false).setSchemaUpdateOptions(ImmutableList.of(SchemaUpdateOption.ALLOW_FIELD_RELAXATION)).setWriteDisposition(WriteDisposition.WRITE_APPEND).setDestinationTable(tableId).build();JobqueryJob=bigquery.create(JobInfo.newBuilder(queryConfig).build());queryJob=queryJob.waitFor();// Check for errorsif(queryJob==null){thrownewException("Job no longer exists");}elseif(queryJob.getStatus().getError()!=null){// You can also look at queryJob.getStatus().getExecutionErrors() for all// errors, not just the latest one.thrownewException(queryJob.getStatus().getError().toString());}// Get the results.TableResultresults=queryJob.getQueryResults();// Print all pages of the results.results.iterateAll().forEach(rows->{rows.forEach(row->System.out.println("row: "+row.toString()));});System.out.println("Successfully relaxed all columns in destination table during query job");}catch(BigQueryException|InterruptedExceptione){System.out.println("Columns not relaxed during query job \n"+e.toString());}}}
from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the destination table.
# table_id = "your-project.your_dataset.your_table_name"

# Retrieves the destination table and checks the number of required fields.
table = client.get_table(table_id)  # Make an API request.
original_required_fields = sum(field.mode == "REQUIRED" for field in table.schema)
# In this example, the existing table has 2 required fields.
print("{} fields in the schema are required.".format(original_required_fields))

# Configures the query to append the results to a destination table,
# allowing field relaxation.
job_config = bigquery.QueryJobConfig(
    destination=table_id,
    schema_update_options=[bigquery.SchemaUpdateOption.ALLOW_FIELD_RELAXATION],
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

# Start the query, passing in the extra configuration.
client.query_and_wait(
    # In this example, the existing table contains 'full_name' and 'age' as
    # required columns, but the query results will omit the second column.
    'SELECT "Beyonce" as full_name;',
    job_config=job_config,
)  # Make an API request and wait for job to complete.

# Checks the updated number of required fields.
table = client.get_table(table_id)  # Make an API request.
current_required_fields = sum(field.mode == "REQUIRED" for field in table.schema)
print("{} fields in the schema are now required.".format(current_required_fields))
Change a column's default value
To change the default value for a column, select one of the following options:
Console
In the Google Cloud console, go to the BigQuery page.
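In SQL, the ALTER COLUMN SET DEFAULT DDL statement handles this; a minimal sketch (the table, column, and default expression are illustrative):
ALTER TABLE mydataset.mytable
ALTER COLUMN column_one SET DEFAULT CURRENT_TIMESTAMP();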
You can generate column descriptions with Gemini in
BigQuery by using data insights. Data insights is an automated
way to explore, understand, and curate your data.
For more information about data insights, including setup steps, required
IAM roles, and best practices to improve the accuracy of the
generated insights, see Generate data insights in BigQuery.
In the Google Cloud console, go to the BigQuery page.
In the Explorer panel, expand your project and dataset, then select
the table.
In the details panel, click the Schema tab.
Click Generate.
Gemini generates column descriptions and insights about
the table. It takes a few minutes for the information to be
populated. You can view the generated insights on the table's Insights tab.
To edit and save the generated column descriptions, do the following:
Click View column descriptions.
In the Column descriptions section, click Save to schema.
The generated column descriptions are populated in the New description field for each column.
Edit the column descriptions as necessary, and then click Save.
The column descriptions are updated immediately.
To close the Preview descriptions panel, click Close.
The statement does not immediately free up the storage that is associated with
the dropped column. To learn more about the impact on storage when you drop a
column, see ALTER TABLE DROP COLUMN statement
details.
There are two options for immediately reclaiming storage:
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-04 UTC."],[[["\u003cp\u003eBigQuery table schemas can be modified by adding, renaming, or changing the data types and modes of columns through SQL DDL statements, console operations, \u003ccode\u003ebq\u003c/code\u003e commands, or API methods.\u003c/p\u003e\n"],["\u003cp\u003eAdding columns is generally done by specifying \u003ccode\u003eNULLABLE\u003c/code\u003e or \u003ccode\u003eREPEATED\u003c/code\u003e attributes, and several methods can be utilized such as using the console, SQL \u003ccode\u003eALTER TABLE ADD COLUMN\u003c/code\u003e, or the \u003ccode\u003ebq update\u003c/code\u003e command with a JSON schema file.\u003c/p\u003e\n"],["\u003cp\u003eRenaming columns is directly supported by SQL DDL with the \u003ccode\u003eALTER TABLE RENAME COLUMN\u003c/code\u003e command, while changing column data types requires SQL's \u003ccode\u003eALTER COLUMN SET DATA TYPE\u003c/code\u003e or using \u003ccode\u003eCAST\u003c/code\u003e within a query to overwrite the table, with the latter being less recommended for large tables.\u003c/p\u003e\n"],["\u003cp\u003eChanging column modes from \u003ccode\u003eREQUIRED\u003c/code\u003e to \u003ccode\u003eNULLABLE\u003c/code\u003e is supported and can be achieved through the console, SQL \u003ccode\u003eALTER COLUMN DROP NOT NULL\u003c/code\u003e, or \u003ccode\u003ebq update\u003c/code\u003e with a modified JSON schema, but the reverse operation is not allowed.\u003c/p\u003e\n"],["\u003cp\u003eSchema relaxation, the act of changing columns from \u003ccode\u003eREQUIRED\u003c/code\u003e to \u003ccode\u003eNULLABLE\u003c/code\u003e during a load operation, is facilitated through the use of the \u003ccode\u003eALLOW_FIELD_RELAXATION\u003c/code\u003e option in \u003ccode\u003ebq\u003c/code\u003e commands and the API, alongside \u003ccode\u003eWRITE_APPEND\u003c/code\u003e to add to the table rather than overwrite it.\u003c/p\u003e\n"]]],[],null,[]]