This page describes how to manage read replicas. These operations include disabling and enabling replication, promoting a replica, configuring parallel replication, and checking the replication status.
For more information about how replication works, see Replication in Cloud SQL.
Disable replication
By default, a replica starts with replication enabled. However, you can disable replication, for example, to debug or analyze the state of an instance. When you are ready, you explicitly re-enable replication. Disabling or re-enabling replication doesn't restart the replica instance.
Disabling replication does not stop the replica instance; it becomes a read-only instance that is no longer replicating from its primary instance. You continue to be charged for the instance. On the disabled replica, you can re-enable replication, delete the replica, or promote the replica to a stand-alone instance.
When you disable the replication for an extended period of time, your disk storage requirements might increase. For example, your instance might accumulate transactional logs to let you resume replication when you re-enable replication. To avoid increasing disk storage requirements, instead of disabling replication for an extended period of time, consider promoting the replica or creating a clone of the primary instance.
To disable replication:
Console
- In the Google Cloud console, go to the Cloud SQL Instances page.
- Select a replica instance by clicking its name.
- Click Disable replication in the button bar.
- Click OK.
gcloud
gcloud sql instances patch REPLICA_NAME \
  --no-enable-database-replication
REST v1
To execute this cURL command at a command line prompt, you acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:patch page to send the REST API request.
Before using any of the request data, make the following replacements:
- project-id : The project ID
- replica-name : The name of the replica instance
HTTP method and URL:
PATCH https://sqladmin.googleapis.com/v1/projects/project-id/instances/replica-name
Request JSON body:
{ "settings": { "databaseReplicationEnabled": "False" } }
REST v1beta4
To execute this cURL command at a command line prompt, you acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:patch page to send the REST API request.
Before using any of the request data, make the following replacements:
- project-id : The project ID
- replica-name : The name of the replica instance
HTTP method and URL:
PATCH https://sqladmin.googleapis.com/sql/v1beta4/projects/project-id/instances/replica-name
Request JSON body:
{ "settings": { "databaseReplicationEnabled": "False" } }
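The PATCH request described above can also be assembled programmatically. The following is a minimal Python sketch rather than an official client: `build_replication_patch` is a hypothetical helper name, the access token is assumed to come from `gcloud auth print-access-token`, and the body mirrors the request JSON shown on this page. The sketch targets the v1 endpoint and only builds the request; sending it is left to the caller.

```python
import json
import urllib.request

# Base URL of the v1 endpoint, as shown in the REST v1 request on this page.
SQLADMIN_V1 = "https://sqladmin.googleapis.com/v1"


def build_replication_patch(project_id, replica_name, enabled, access_token):
    """Build (but do not send) the PATCH request that toggles replication.

    Hypothetical helper for illustration. access_token is assumed to be the
    output of `gcloud auth print-access-token`.
    """
    url = f"{SQLADMIN_V1}/projects/{project_id}/instances/{replica_name}"
    # Mirror the request body from this page ("True" / "False" as strings).
    flag = "True" if enabled else "False"
    body = {"settings": {"databaseReplicationEnabled": flag}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="PATCH",
    )
```

Passing `enabled=True` builds the request that re-enables replication. To actually send either request, the caller could pass the returned object to `urllib.request.urlopen`.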
Enable replication
If a replica has not been replicating for a long time, it will take longer for it to catch up to the primary instance. In this case, delete the replica and create a new one.
To enable replication:
Console
- In the Google Cloud console, go to the Cloud SQL Instances page.
- Select a replica instance by clicking its name.
- Click Enable replication.
- Click OK.
gcloud
gcloud sql instances patch REPLICA_NAME \
  --enable-database-replication
REST v1
To execute this cURL command at a command line prompt, you acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:patch page to send the REST API request.
Before using any of the request data, make the following replacements:
- project-id : The project ID
- replica-name : The name of the replica instance
HTTP method and URL:
PATCH https://sqladmin.googleapis.com/v1/projects/project-id/instances/replica-name
Request JSON body:
{ "settings": { "databaseReplicationEnabled": "True" } }
REST v1beta4
To execute this cURL command at a command line prompt, you acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:patch page to send the REST API request.
Before using any of the request data, make the following replacements:
- project-id : The project ID
- replica-name : The name of the replica instance
HTTP method and URL:
PATCH https://sqladmin.googleapis.com/sql/v1beta4/projects/project-id/instances/replica-name
Request JSON body:
{ "settings": { "databaseReplicationEnabled": "True" } }
Promote a replica
Promoting a read replica stops replication and converts the instance to a standalone Cloud SQL primary instance with read and write capabilities.
When promoted, read replicas are automatically configured with backups, but they aren't automatically configured as high availability (HA) instances. You can enable high availability after promoting the replica, the same way as for any non-replica instance. Learn more about configuring the instance for high availability.
Before promoting a read replica, if the primary is still available and serving clients, you should do the following:
- Stop all writes to the primary instance.
- Check the replication status of the replica (follow the instructions in the psql Client tab).
- Verify that the replica is replicating, and then wait until the replication lag reported by the replay_lag metric is 0.
Otherwise, a newly promoted instance may be missing some transactions that were committed to the primary instance.
To promote a replica to a standalone instance:
Console
- In the Google Cloud console, go to the Cloud SQL Instances page.
- Select a replica instance by clicking its name.
- Click Promote replica.
- Click OK.
gcloud
gcloud sql instances promote-replica REPLICA_NAME
REST v1
To execute this cURL command at a command line prompt, you acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:promoteReplica page to send the REST API request.
Before using any of the request data, make the following replacements:
- project-id : The project ID
- replica-name : The name of the replica instance
HTTP method and URL:
POST https://sqladmin.googleapis.com/v1/projects/project-id/instances/replica-name/promoteReplica
REST v1beta4
To execute this cURL command at a command line prompt, you acquire an access token by using the gcloud auth print-access-token command. You can also use the APIs Explorer on the Instances:promoteReplica page to send the REST API request.
Before using any of the request data, make the following replacements:
- project-id : The project ID
- replica-name : The name of the replica instance
HTTP method and URL:
POST https://sqladmin.googleapis.com/sql/v1beta4/projects/project-id/instances/replica-name/promoteReplica
Confirm that the promoted instance is configured correctly. In particular, consider configuring the instance for high availability if needed.
Check replication status
When you view a replica instance using the Google Cloud console or log into the instance using an administration client, you get details about replication, including status and metrics. When you use the gcloud CLI, you get a brief summary of the replication configuration.
Before checking the replication status for a Cloud SQL replica instance, use the gcloud sql instances describe command to display the status of the instance. As a result, you can see whether replication is enabled for the replica instance.
The following metrics are available for replica instances. (Learn more about additional metrics available for all instances, including non-replica instances.)
(cloudsql.googleapis.com/database/replication/state)
Indicates whether replication is actively streaming logs from the primary to the replica. Possible values are:
- Running
- Stopped
- Error
This metric reports Running if:
- pg_catalog.pg_stat_wal_receiver reports a status of "streaming", and
- pg_catalog.pg_is_wal_replay_paused() reports "f" (false).
For more information, see The Statistics Collector and System Administration Functions in the PostgreSQL Reference Manual.
(cloudsql.googleapis.com/database/replication/replica_lag)
The amount of time that the replica's state is lagging behind the state of the primary instance. This is the difference between (1) the current time and (2) the original timestamp at which the primary committed the transaction that is currently being applied on the replica. In particular, writes may be counted as lagging even if they have been received by the replica, if the replica has not yet applied the write to the database.
For cascading replicas, each primary-replica pair is monitored separately and there's no single metric that yields the end-to-end (primary to replica) lag.
For more information, see Replication lag.
(cloudsql.googleapis.com/database/postgresql/replication/replica_byte_lag)
Reports the number of bytes by which the read replica lags the primary. Four time series are produced for each replica, showing the number of bytes in the primary's write-ahead log that have not yet been:
- sent_location: sent to the replica
- write_location: written to disk by the replica
- flush_location: flushed to disk by the replica
- replay_location: replayed by the replica
These metrics serve different purposes; for example, replay_location gives an indication of the replication lag (the number of transactions committed to the primary that have not yet been applied to the replica), while flush_location gives an indication of the number of transactions that have not been recorded durably on the replica instance.
These metrics are computed by comparing pg_catalog.pg_current_wal_lsn() to one of the following fields from pg_stat_replication: sent_lsn, write_lsn, flush_lsn, or replay_lsn. For more information, see The Statistics Collector in the PostgreSQL Reference Manual.
(cloudsql.googleapis.com/database/postgresql/external_sync/max_replica_byte_lag)
For a replica of an external primary, reports the maximum replication lag (in bytes) over all databases that are being replicated to this instance. For each database, this is defined as the number of bytes in the primary's write-ahead log that have not been confirmed to be received by the replica.
This metric is computed by sending a query to the primary to compare pg_catalog.pg_current_wal_lsn() to the value of confirmed_flush_lsn for each database being replicated to this replica instance. For more information, see The Statistics Collector in the PostgreSQL Reference Manual.
To check replication status:
Console
Cloud SQL reports the Replication State metric on the default Cloud SQL monitoring dashboard.
To view other metrics for in-region and cross-region replicas, and replicas of external servers, create a custom dashboard and add the metrics you want to monitor to it:
- In the Google Cloud console, go to the Monitoring page.
- Select the Dashboards tab.
- Click Create dashboard.
- Give the dashboard a name and click OK.
- Click Add chart.
- For Resource Type, select Cloud SQL Database.
- Do any of the following:
  - To monitor the replication state metric: in the Select a metric field, type Replication state. Then add a filter for state = "Running". The chart shows 1 if replication is running and 0 otherwise.
  - To monitor the replication lag, in bytes, for a read replica: in the Select a metric field, type Lag Bytes. Then add a filter on replica_lag_type = "replay_location". The chart shows the number of bytes associated with transactions that have been committed on the primary but have not yet been replayed on the replica.
  - To monitor the replication lag, in bytes, for a replica of an external primary: in the Select a metric field, type Max Lag Bytes. The chart shows the number of bytes associated with transactions that have been committed on the primary but have not yet been confirmed received by the replica.
gcloud
For a replica instance, check the replication status with:
gcloud sql instances describe REPLICA_NAME
In the output, look for the properties databaseReplicationEnabled and masterInstanceName.
For a primary instance, check if there are replicas with:
gcloud sql instances describe PRIMARY_INSTANCE_NAME
In the output, look for the property replicaNames.
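The two describe commands above can be scripted. The following is a minimal Python sketch under stated assumptions: `describe_instance` and `summarize_replication` are hypothetical helper names, `--format=json` is the general gcloud output flag, and `databaseReplicationEnabled` is assumed to sit under `settings` (as in the REST request body on this page) while `masterInstanceName` and `replicaNames` are assumed to be top-level properties.

```python
import json
import subprocess


def summarize_replication(instance):
    """Pick the replication-related properties out of a describe result.

    `instance` is the parsed JSON of `gcloud sql instances describe`.
    The property locations used here are assumptions, not a documented schema.
    """
    settings = instance.get("settings", {})
    return {
        "databaseReplicationEnabled": settings.get("databaseReplicationEnabled"),
        "masterInstanceName": instance.get("masterInstanceName"),
        "replicaNames": instance.get("replicaNames", []),
    }


def describe_instance(instance_name):
    """Run `gcloud sql instances describe` and return its parsed JSON output."""
    result = subprocess.run(
        ["gcloud", "sql", "instances", "describe", instance_name, "--format=json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

For a replica, `summarize_replication(describe_instance("REPLICA_NAME"))` would surface the first two properties; for a primary, the `replicaNames` entry lists any replicas.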
psql Client
Some replication status metrics are produced by the primary and some are produced by the replica. For the following steps, connect to the replica or primary instance (as directed below) with a PostgreSQL client.
For information, see Connection options for external applications.
- To check the replica's status from the primary instance:
select * from pg_stat_replication;
  - client_addr: The IP address of the replica instance.
  - state: The current state of the WAL sender process that serves the replica. The value is streaming when replication is running.
  - replay_lag: The time elapsed between the primary flushing recent WAL and the replica reporting that it has replayed it. On a healthy replica, the value is 0 or a small interval.
- To check the replica's status from the replica instance:
select * from pg_stat_wal_receiver;
Look for the following metrics in the output of the command:
  - sender_host: The IP address of the primary instance.
  - status: The activity status of the WAL receiver process. The value is streaming when replication is running.
  - last_msg_send_time and last_msg_receipt_time: The difference between these two timestamps is the lag time.
To check whether replication has been paused:
select pg_is_wal_replay_paused();
The value is t if replication is paused and f otherwise.
To check whether there are transactions that have been received from the primary but not yet applied:
-- For PostgreSQL 9.6
select pg_catalog.pg_last_xlog_receive_location(), pg_catalog.pg_last_xlog_replay_location();
-- For PostgreSQL 10 and above
select pg_catalog.pg_last_wal_receive_lsn(), pg_catalog.pg_last_wal_replay_lsn();
If the two values are equal, then the replica has processed all of the transactions it has received from the primary.
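When the two values differ, the gap can be expressed in bytes. The following is a minimal Python sketch, assuming the values are read as text in PostgreSQL's usual LSN notation (two hexadecimal numbers separated by a slash); `lsn_to_int` and `unapplied_bytes` are hypothetical helper names, not part of any Cloud SQL or PostgreSQL client.

```python
def lsn_to_int(lsn):
    """Convert an LSN such as '0/3000060' to an absolute byte position.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def unapplied_bytes(receive_lsn, replay_lsn):
    """Bytes of WAL the replica has received but not yet replayed."""
    return lsn_to_int(receive_lsn) - lsn_to_int(replay_lsn)
```

A result of 0 corresponds to the case described above, where the replica has processed everything it has received.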
Troubleshoot
First, check that the value of the max_connections flag is greater than or equal to the value on the primary.
If the max_connections flag is set appropriately, inspect the logs in Cloud Logging to find the actual error.
If the error is: set Service Networking service account as servicenetworking.serviceAgent role on consumer project, then disable and re-enable the Service Networking API. This action creates the service account necessary to continue with the process.
pg_replication_slots system view and filtering on the active column. Unused slots can be dropped to remove WAL segments using the pg_drop_replication_slot command.
Restart the replica instance to reclaim the temporary memory space.
Edit the instance to enable automatic storage increase.
- Slow queries on the replica. Find and fix them.
- All tables must have a unique/primary key. Every update on a table without a unique/primary key causes full table scans on the replica.
- Queries like DELETE ... WHERE field < 50000000 cause replication lag with row-based replication since a huge number of updates are piled up on the replica.
Some possible solutions include:
- Edit the instance to increase the size of the replica.
- Reduce the load on the database.
- Send read traffic to the read replica.
- Index the tables.
- Identify and fix slow write queries.
- Recreate the replica.
If you must use hash indexes, upgrade to PostgreSQL 10+. Otherwise, if you also want to use replicas, don't use hash indexes in PostgreSQL 9.6.
SELECT * from pg_stat_activity where state = 'active' and pid = XXXX and username = 'cloudsqlreplica' is expected to run continuously on your primary instance.
Recreate the replica after stopping all running queries.
To resolve this issue, complete the following steps:
- Turn on the log_duration flag and set the log_statement parameter to ddl. This provides you with both the queries and the run time on the database. However, depending on your workload, this might cause performance issues.
- On both the primary instance and the read replica, run explain analyze for the queries.
- Compare the query plan and check for differences.
If this is a specific query, then modify the query. For example, you can change the order of the joins to see if you get better performance.
What's next
- Learn how to create a read replica.
- Learn more about requirements and best practices for replication.