This page describes how to change schemas in Cloud Data Fusion when you replicate data from a Microsoft SQL Server database to a BigQuery table. To ensure the replication job continues to replicate data from a SQL Server source after the schema changes, perform these steps.
Change the schema
-
Go to your instance:
-
In the Google Cloud console, go to the Cloud Data Fusion page.
-
To open the instance in the Cloud Data Fusion Studio, click Instances, and then click View instance.
-
-
In the Cloud Data Fusion web interface, click Replication.
-
Stop the replication job.
-
Apply a supported schema change to the source table, such as Add nullable column.
-
Create a new capture table using
sys.sp_cdc_enable_table
procedure with a unique value for parameter@capture_instance
. -
Apply the same schema change on the target table in BigQuery.
-
Restart the replication job.
-
Optional: When the job starts streaming new changes that occur after you stop the job, you can drop the old capture table using the
sys.sp_cdc_disable_table
stored procedure. Set the parameter@capture_instance
to the old capture instance name.
After you apply a schema change to the source table and before you create a new capture table (as you did in the previous steps), all changes are captured by the old capture with the old schema. If you added a column, any change event during that time doesn't contain data for the new column. If your application cannot handle such a transition, follow these steps:
-
Suspend all applications generating database records.
-
Ensure that the replication job has processed all existing events.
-
Stop the replication job.
-
Apply the schema change to the source table, such as Add nullable column.
-
Create a new capture table using the
sys.sp_cdc_enable_table
procedure with a unique value for parameter@capture_instance
. -
Resume the applications you suspended in the first step.
-
Apply the same schema change on the target table in BigQuery.
-
Restart the replication job.
-
Optional: When the job starts streaming new changes that occur after you created the new capture table in the previous step, you can drop the old capture table using the
sys.sp_cdc_disable_table
stored procedure. Set the parameter@capture_instance
to the old capture instance name.
What's next
- Learn more about Replication .
- Refer to the supported schema changes .
- Refer to the Replication API .