Overcome transaction ID (TXID) wraparound protection

This page describes what you can do when your database runs into Transaction ID Wraparound protection in PostgreSQL. It manifests as an ERROR message, as follows:

  database is not accepting commands to avoid wraparound data loss in database dbname
  Stop the postmaster and vacuum that database in single-user mode.
  You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

Alternatively, a WARNING message as follows might appear:

  database dbname must be vacuumed within 10985967 transactions.
  To avoid a database shutdown, execute a database-wide VACUUM in that database.

Overview of steps

  • Find out which database and which tables are causing the wraparound.
  • Check if there's anything holding back (AUTO)VACUUM (for example, a stuck transaction ID).
  • Measure the speed of AUTOVACUUM. If it is slow, you can optionally try to speed it up.
  • If needed, run a few more VACUUM commands manually.
  • Investigate other ways to speed up the vacuum. Sometimes the fastest way is to drop the table or some indexes.

Many of the recommendations for values of flags are purposefully not exact because they depend on many database parameters. Read the documents linked at the end of this page for a deeper analysis on this topic.

Find the database and table causing the wraparound

Finding the database

To find out which database or databases contain the tables that are causing the wraparound, run the following query:

  SELECT datname,
         age(datfrozenxid),
         2^31 - 1000000 - age(datfrozenxid) AS remaining
  FROM pg_database
  ORDER BY 3;

The database with the remaining value close to 0 is the one causing the problem.

Finding the table

Connect to that database and run the following query:

  SELECT c.relnamespace::regnamespace AS schema_name,
         c.relname AS table_name,
         greatest(age(c.relfrozenxid), age(t.relfrozenxid)) AS age,
         2^31 - 1000000 - greatest(age(c.relfrozenxid), age(t.relfrozenxid)) AS remaining
  FROM pg_class c
  LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
  WHERE c.relkind IN ('r', 'm')
  ORDER BY 4;

This query returns the table or tables causing the problem.

For TEMPORARY tables

If the schema_name starts with pg_temp_, then the only way to resolve the problem is to drop the table, because PostgreSQL doesn't let you VACUUM temporary tables created in other sessions. Sometimes, if that session is open and accessible, you can vacuum the table there, but this is often not the case. Use the following SQL statements to drop the temp table:

  SET cloudsql.enable_maintenance_mode = 'on'; /* get extra transaction ids */
  DROP TABLE pg_temp_<N>.<tablename>;

If this was the only blocker, then in about a minute, the autovacuum picks up this change and moves the datfrozenxid forward in pg_database. This resolves the wraparound protection read-only state.
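To confirm that the cleanup succeeded, you can re-check the age of the current database; a minimal check, assuming you're still connected to the affected database:

  -- The age should drop well below the 2-billion limit once
  -- autovacuum has advanced datfrozenxid.
  SELECT datname, age(datfrozenxid) AS age
  FROM pg_database
  WHERE datname = current_database();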

Normal tables

For normal (that is, non-temporary) tables, continue with the next steps to see if anything is blocking the clean-up, whether VACUUM is running fast enough, and whether the most important table is being vacuumed.

Check for a stuck transaction ID

One possible reason why the system can run out of transaction IDs is that PostgreSQL can't freeze (that is, mark as visible to all transactions) any transaction IDs created after the oldest currently running transaction started. This is because of multiversion concurrency control (MVCC) rules. In extreme cases, such transactions can become so old that they make it impossible for VACUUM to clean up any old transactions within the entire 2-billion transaction ID wraparound limit, causing the whole system to stop accepting new DML. You typically also see warnings in the log file saying WARNING: oldest xmin is far in the past.

You should move on to optimization only after the stuck transaction ID has been remediated.

Here are four potential reasons why there might be a stuck transaction ID, with information on how to mitigate each of them:

  • Long-running transactions: Identify them and cancel or terminate the backend to unblock the vacuum.
  • Orphaned prepared transactions: Roll back these transactions.
  • Abandoned replication slots: Drop the abandoned slots.
  • Long-running transactions on a replica with hot_standby_feedback = on: Identify them and cancel or terminate the backend to unblock the vacuum.

For these scenarios, the following query returns the age of the oldest transaction and the number of transactions left until wraparound:

  
  WITH q AS (
      SELECT (SELECT max(age(backend_xmin))
              FROM pg_stat_activity
              WHERE state != 'idle') AS oldest_running_xact_age,
             (SELECT max(age(transaction))
              FROM pg_prepared_xacts) AS oldest_prepared_xact_age,
             (SELECT max(greatest(age(catalog_xmin), age(xmin)))
              FROM pg_replication_slots) AS oldest_replication_slot_age,
             (SELECT max(age(backend_xmin))
              FROM pg_stat_replication) AS oldest_replica_xact_age
  )
  SELECT *,
         2^31 - oldest_running_xact_age AS oldest_running_xact_left,
         2^31 - oldest_prepared_xact_age AS oldest_prepared_xact_left,
         2^31 - oldest_replication_slot_age AS oldest_replication_slot_left,
         2^31 - oldest_replica_xact_age AS oldest_replica_xact_left
  FROM q;

This query might return *_left values close to, or less than, 1 million away from wraparound. One million is the wraparound protection limit at which PostgreSQL stops accepting new write commands. In this case, see either Remove VACUUM blockers or Configure VACUUM.

For example, the preceding query might return:

  ┌─[ RECORD 1 ]─────────────────┬────────────┐
  │ oldest_running_xact_age      │ 2146483655 │
  │ oldest_prepared_xact_age     │ 2146483655 │
  │ oldest_replication_slot_age  │ ¤          │
  │ oldest_replica_xact_age      │ ¤          │
  │ oldest_running_xact_left     │ 999993     │
  │ oldest_prepared_xact_left    │ 999993     │
  │ oldest_replication_slot_left │ ¤          │
  │ oldest_replica_xact_left     │ ¤          │
  └──────────────────────────────┴────────────┘

where oldest_running_xact_left and oldest_prepared_xact_left are within the 1 million wraparound protection limit. In this case, you must first remove the blockers for the VACUUM to be able to proceed.

Remove VACUUM blockers

Long-running transactions

In the preceding query, if oldest_running_xact_age is equal to oldest_prepared_xact_age, go to the Orphaned prepared transactions section first, because the oldest running value also includes the prepared transactions.

You might first need to run the following command as the postgres user:

  GRANT pg_signal_backend TO postgres;

If the offending transaction belongs to any of the system users (starting with cloudsql...), you can't cancel it directly. You must restart the database to cancel it.

To identify a long-running query and cancel or terminate it to unblock the vacuum, first select a few of the oldest queries. The LIMIT 10 line helps fit the result on the screen. You might need to repeat this after resolving the oldest running queries.

  SELECT pid,
         age(backend_xid) AS age_in_xids,
         now() - xact_start AS xact_age,
         now() - query_start AS query_age,
         state,
         query
  FROM pg_stat_activity
  WHERE state != 'idle'
  ORDER BY 2 DESC
  LIMIT 10;

If age_in_xids comes back as NULL, the transaction has not been allocated a permanent transaction ID and can be safely ignored.

Cancel the queries where the number of transaction IDs left until wraparound (2^31 - age_in_xids) is approaching 1 million.

If state is active, then the query can be cancelled using SELECT pg_cancel_backend(pid);. Otherwise, you need to terminate the whole connection using SELECT pg_terminate_backend(pid);, where pid is the pid from the previous query.
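For example, assuming a hypothetical pid of 12345 taken from the previous query:

  -- Cancel only the running query; the connection stays open.
  SELECT pg_cancel_backend(12345);

  -- If cancelling is not enough, terminate the whole connection.
  SELECT pg_terminate_backend(12345);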

Orphaned prepared transactions

List all prepared transactions:

  DB_NAME=> SELECT age(transaction), * FROM pg_prepared_xacts;
  ┌─[ RECORD 1 ]─┬───────────────────────────────┐
  │ age          │ 2146483656                    │
  │ transaction  │ 2455493932                    │
  │ gid          │ trx_id_pin                    │
  │ prepared     │ 2021-03-03 16:54:07.923158+00 │
  │ owner        │ postgres                      │
  │ database     │ DB_NAME                       │
  └──────────────┴───────────────────────────────┘

Roll back the oldest orphaned prepared transaction(s) by using the gid from the previous query (in this case, trx_id_pin) as the transaction identifier:

  ROLLBACK PREPARED 'trx_id_pin';

Alternatively, commit it:

  COMMIT PREPARED 'trx_id_pin';

See the SQL ROLLBACK PREPARED documentation for a full explanation.

Abandoned replication slots

If the replication slot is abandoned because the existing replica is stopped, paused, or has some other issue, you can delete the replica from gcloud or the Google Cloud console.

First, check that the replica is not disabled as described in Managing read replicas. If the replica is disabled, enable it again. If the lag still stays high, delete the replica.

The replication slots are visible in the pg_replication_slots system view.

The following query fetches the relevant info:

  SELECT *, age(xmin) AS age FROM pg_replication_slots;
  ┌─[ RECORD 1 ]────────┬─────────────────────────────────────────────────┐
  │ slot_name           │ cloudsql_1_355b114b_9ff4_4bc3_ac88_6a80199bd738 │
  │ plugin              │ ¤                                               │
  │ slot_type           │ physical                                        │
  │ datoid              │ ¤                                               │
  │ database            │ ¤                                               │
  │ active              │ t                                               │
  │ active_pid          │ 1126                                            │
  │ xmin                │ 2453745071                                      │
  │ catalog_xmin        │ ¤                                               │
  │ restart_lsn         │ C0/BEF7C2D0                                     │
  │ confirmed_flush_lsn │ ¤                                               │
  │ age                 │ 59                                              │
  └─────────────────────┴─────────────────────────────────────────────────┘

In this example, the replication slot is healthy (age = 59). If the age were near 2 billion, you would want to delete the slot. There is no easy way to know which replica is which if the query returns multiple records, so check all replicas for long-running transactions.
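On a self-managed PostgreSQL server, you could instead drop a stale, inactive slot directly; a minimal sketch, assuming the slot name from the query above (on Cloud SQL, delete the replica through gcloud or the console as described earlier):

  -- Fails if the slot is still active; only drop slots that are truly abandoned.
  SELECT pg_drop_replication_slot('cloudsql_1_355b114b_9ff4_4bc3_ac88_6a80199bd738');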

Long-running transactions on replicas

Check replicas for the oldest running transaction with hot_standby_feedback set to on; if you find one, disable hot_standby_feedback on that replica.

The backend_xmin column in the pg_stat_replication view has the oldest TXID needed on the replica.
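To check this per replica, a quick query on the primary along these lines:

  -- Shows the oldest TXID each replica still needs; the largest
  -- xmin_age points at the replica holding VACUUM back.
  SELECT application_name, client_addr, age(backend_xmin) AS xmin_age
  FROM pg_stat_replication
  ORDER BY 3 DESC NULLS LAST;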

To move it forward, stop the query that holds it back on the replica. To discover which query is holding it back, use the query in Long-running transactions, but this time run it on the replica.

Another option is to restart the replica.

Configure VACUUM

Set the following two flags:

  • autovacuum_vacuum_cost_delay = 0
  • autovacuum_work_mem = 1048576

The first disables any disk throttling for vacuuming by PostgreSQL so VACUUM can run at full speed. By default, autovacuum is throttled so it does not use up all disk IO on the slowest servers.

The second flag, autovacuum_work_mem , decreases the number of index cleanup passes. If possible, it should be large enough to store all IDs of dead rows in a table that VACUUM is going to clean up. When setting this value, consider that this is the maximum amount of local memory each running VACUUM can allocate. Make sure that you're not allowing more than is available, with some left in reserve. If you leave the database running in read-only mode, then also consider the local memory used for read-only queries.

On most systems, use the maximum value (1 GB, or 1048576 kB as shown in the sample). Because each dead tuple pointer takes 6 bytes, this value fits up to about 178 million dead tuples. Any more than that still causes multiple index scan passes.

These and other flags are explained in more detail in Optimizing, monitoring, and troubleshooting VACUUM operations in PostgreSQL.

After setting these flags, restart the database so that autovacuum starts with the new values.
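After the restart, you can verify that the new values are active:

  SHOW autovacuum_vacuum_cost_delay;  -- expected: 0
  SHOW autovacuum_work_mem;           -- expected: 1GB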

You can use the pg_stat_progress_vacuum view to monitor the progress of autovacuum-started VACUUMs. This view shows the VACUUMs running in all databases, but for relations in other databases you can't look up the table name from the relid column.
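For example, a quick progress check might look like the following (relid::regclass resolves to a table name only for the database you're connected to):

  SELECT pid, datname, relid::regclass AS table_name, phase,
         heap_blks_scanned, heap_blks_total
  FROM pg_stat_progress_vacuum;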

To identify the databases and tables that need vacuuming next, use the queries from Optimizing, monitoring, and troubleshooting VACUUM operations in PostgreSQL. If the server VM is powerful enough and has the bandwidth to run more parallel VACUUM processes than autovacuum starts, you can start some manual vacuums.

Check VACUUM speed

This section describes how to check VACUUM speed and how to accelerate it, if needed.

Check running autovacuums

All backends running VACUUM are visible in the system view pg_stat_progress_vacuum .

If the current phase is scanning heap, you can monitor progress by watching changes in the heap_blks_scanned column. Unfortunately, there is no easy way to determine scan speed in other phases.

Estimate the VACUUM scan speed

To estimate the scan speed, first store the base values, then calculate the change over time to estimate the completion time. Save a snapshot of heap_blks_scanned together with a timestamp by using the following snapshot query:

  SELECT set_config('save.ts', clock_timestamp()::text, false),
         set_config('save.heap_blks_scanned', heap_blks_scanned::text, false)
  FROM pg_stat_progress_vacuum
  WHERE datname = 'DB_NAME';

Because we can't save anything in tables of a database that's already in wraparound protection, use set_config(setting, value) to set two user-defined settings, save.ts and save.heap_blks_scanned, to the current values from pg_stat_progress_vacuum.

In the next query, we use these two as the comparison base to determine speed and estimate completion time.

NOTE: WHERE datname = 'DB_NAME' restricts the investigation to one database at a time. This is enough if only one autovacuum is running in that database. If more than one row per database is returned, add extra filter conditions (for example, AND relid = ...) to the WHERE clause to select a single autovacuum row. This is also true for the next query.

Once you've saved the base values, you can run the following query:

  WITH q AS (
      SELECT datname,
             phase,
             heap_blks_total,
             heap_blks_scanned,
             clock_timestamp() - current_setting('save.ts')::timestamp AS ts_delta,
             heap_blks_scanned - current_setting('save.heap_blks_scanned')::bigint AS scanned_delta
      FROM pg_stat_progress_vacuum
      WHERE datname = 'DB_NAME'
  ), q2 AS (
      SELECT *,
             scanned_delta / extract('epoch' FROM ts_delta) AS pages_per_second
      FROM q
  )
  SELECT *,
         (heap_blks_total - heap_blks_scanned) / pages_per_second AS remaining_time
  FROM q2;
  ┌─[ RECORD 1 ]──────┬──────────────────┐
  │ datname           │ DB_NAME          │
  │ phase             │ scanning heap    │
  │ heap_blks_total   │ 9497174          │
  │ heap_blks_scanned │ 18016            │
  │ ts_delta          │ 00:00:40.30126   │
  │ scanned_delta     │ 11642            │
  │ pages_per_second  │ 288.87434288655  │
  │ remaining_time    │ 32814.1222418038 │
  └───────────────────┴──────────────────┘

This query compares the current values to the saved base values and calculates pages_per_second and remaining_time, which lets you decide whether VACUUM is running fast enough or whether you want to speed it up. The remaining_time value covers only the scanning heap phase; other phases also take time, sometimes even more. Further reading on the more complex aspects of vacuuming is linked at the end of this page.

Speed up VACUUM

The easiest and fastest way to make the VACUUM scan faster is to set autovacuum_vacuum_cost_delay=0. This can be done from the Google Cloud console.

Unfortunately, the already running VACUUM does not pick up this value and you might need to restart the database.
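As an alternative to a full restart, and assuming losing the current vacuum's progress is acceptable, you can terminate the running autovacuum worker; the autovacuum launcher starts a new one that picks up the new settings:

  -- Terminates the autovacuum worker for the affected database;
  -- autovacuum picks the table up again shortly, now unthrottled.
  SELECT pg_terminate_backend(pid)
  FROM pg_stat_progress_vacuum
  WHERE datname = 'DB_NAME';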

After the restart (or the worker restart), you might see a result similar to the following:

  ┌─[ RECORD 1 ]──────┬──────────────────┐
  │ datname           │ DB_NAME          │
  │ phase             │ scanning heap    │
  │ heap_blks_total   │ 9497174          │
  │ heap_blks_scanned │ 222382           │
  │ ts_delta          │ 00:00:21.422615  │
  │ scanned_delta     │ 138235           │
  │ pages_per_second  │ 6452.76031894332 │
  │ remaining_time    │ 1437.33713040171 │
  └───────────────────┴──────────────────┘

In this sample, the speed increased from <300 pages/sec to ~6500 pages/sec, and the expected remaining time for the heap scanning phase decreased from 9 hours to 23 minutes.

The scan speed of the other phases is not as easy to measure, but they should show a similar speedup.

Also consider making autovacuum_work_mem as large as possible to avoid multiple passes over indexes. An index pass happens each time the memory is filled with dead tuple pointers.
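To see how many index passes have already happened and how full the dead-tuple store is, you can check the same progress view (column names as of PostgreSQL 13; later versions rename them):

  SELECT relid::regclass AS table_name, index_vacuum_count,
         num_dead_tuples, max_dead_tuples
  FROM pg_stat_progress_vacuum;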

If the database is not being used otherwise, set autovacuum_work_mem to have ~80% of memory free after allowing the required amount for shared_buffers . This is the upper limit for each of the autovacuum-started VACUUM processes. If you want to continue running read-only workloads, use less memory.

Other ways to improve speed

Avoid vacuuming indexes

For huge tables, VACUUM spends most of the time cleaning up indexes.

PostgreSQL 14 has special optimizations for avoiding index cleanup if the system is in danger of wraparound.

In PostgreSQL 12 and 13, you can manually run the following statement:

  VACUUM (INDEX_CLEANUP OFF, TRUNCATE OFF) <tablename>;

In versions 11 and older, you can DROP the index before running vacuum and recreate it later.

Dropping an index while an autovacuum is already running on that table requires cancelling the running vacuum and then immediately executing the DROP INDEX command before autovacuum starts vacuuming that table again.

First, run the following statement to find the PID of the autovacuum process you need to terminate:

  SELECT pid, query
  FROM pg_stat_activity
  WHERE state != 'idle' AND query ILIKE '%vacuum%';

Then run the following statements to terminate the running vacuum and drop one or more indexes:

  SET cloudsql.enable_maintenance_mode = 'on'; /* get extra transaction ids */
  SELECT pg_terminate_backend(<pid>);
  DROP INDEX <index1>;
  DROP INDEX <index2>;
  ...

Drop the offending table

In some rare cases, you can drop the table, for example, when it's easy to restore from another source such as a backup or another database.

You still need to use cloudsql.enable_maintenance_mode = 'on' and likely also terminate the VACUUM on that table as shown in the previous section.
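A minimal sketch, mirroring the previous section; the pid and table name are placeholders:

  SET cloudsql.enable_maintenance_mode = 'on'; /* get extra transaction ids */
  SELECT pg_terminate_backend(<pid>);  -- pid of the VACUUM running on the table
  DROP TABLE <tablename>;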

VACUUM FULL

In rare cases, it's faster to run VACUUM FULL FREEZE, usually when the table has only a small proportion of live tuples. You can check this proportion in the pg_stat_user_tables view (unless a crash has wiped out the statistics).

The VACUUM FULL command copies live tuples to a new file, so enough space has to be available for the new file and its indexes.
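A sketch of how you might check the live-to-dead proportion and then rewrite the table; the table name is a placeholder:

  -- Find tables where dead tuples vastly outnumber live ones.
  SELECT relname, n_live_tup, n_dead_tup
  FROM pg_stat_user_tables
  ORDER BY n_dead_tup DESC
  LIMIT 10;

  -- Rewrites the table into a new file, freezing all surviving rows.
  VACUUM (FULL, FREEZE) <tablename>;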

What's next

  • Read Optimizing, monitoring, and troubleshooting VACUUM operations in PostgreSQL for a deeper analysis of this topic.