Best practices for managing memory usage

This page describes how to configure memory usage for a Cloud SQL instance.

Introduction

When you create a Cloud SQL instance, you select an amount of memory for the instance. As a PostgreSQL database's workload increases, the instance's memory usage increases. Instances that consume lots of memory can create a performance bottleneck that can sometimes lead to out-of-memory issues.

When a Cloud SQL instance runs out of memory because of an increased demand, it can cause database downtime. Therefore, it's important to configure the instance memory and the memory-related database flags properly and monitor the memory usage so that the instance operates in a healthy state.

PostgreSQL memory components are broadly divided into two sections:

  • Global memory:this is shared across all processes to execute queries; for example, shared_buffers and max_connections .
  • Local memory:this is dedicated memory assigned to each connection; for example, work_mem , maintenance_work_mem , and temp_buffers .

For other configuration considerations, see General best practices and Operational guidelines .

Memory usage and flags

Whenever there's high memory usage by Cloud SQL instances, the following questions might arise:

  • Which query or process is using high memory?
  • Are the memory settings adequate for the database activity?
  • How do you change the memory settings?

When a PostgreSQL database operates, most memory usage occurs in a few areas:

  • Shared buffer: this is the shared memory that PostgreSQL allocates to hold table data for read and write operations. For the read operation, any data that's requested from disk is first fetched to RAM and then it's given to the client. Similarly, in PostgreSQL, when the data is requested (for example, SELECT * from emp ), it's first fetched from disk to shared_buffers for caching, and then it's given to the client. The same thing happens with the write operation.

    Shared buffer is also the shared memory area for all processes and connections for database activities such as data caching, connection caching, and Data Manipulation Language (DML) operations. The maximum that this area can allocate is specified by the shared_buffers flag and the default is 33% of the instance's memory. If the value of shared_buffers is high, the size of the data cached in memory is high.

  • Query work memory: as a query is run, PostgreSQL allocates local memory for each operation such as sorting and hashing. The maximum it can allocate for each operation of a query before writing to temporary disk files is configured by the work_mem flag, and the default value is 4 MB. If the value of work_mem is high, the amount of data that can be sorted in the memory is high.
  • Maintenance work memory: some maintenance operations such as VACUUM , CREATE INDEX , ALTER TABLE , and ADD FOREIGN KEY require separate local memory that PostgreSQL allocates. The maximum amount for the back-end process that these operations use can be configured by the maintenance_work_mem flag and the default value is 64 MB. Note that autovacuum workers also use maintenance work memory and the maximum can be overridden by the autovacuum_work_mem flag. If the value of maintenance_work_mem is high, the performance speed of the VACUUM operation is high.
  • Temporary buffers: when a temporary table is used in a database session, PostgreSQL allocates temporary buffers to hold the session-local temporary table. The maximum amount can be specified by the temp_buffers flag and the default value is 8 MB.
  • Database connection: when a client connects to the database, PostgreSQL creates a back-end process to serve the client session. Besides the memory to run the query, PostgreSQL allocates additional memory to maintain information such as system catalog cache and prepared query plans. The maximum number of concurrent connections allowed to the database server can be configured by the max_connections flag. Each idle connection uses approximately 2 MB to 3 MB of shared memory. If the value of max_connections is high, the instance can make more connections, but at the cost of the memory.

For the complete list of memory components in PostgreSQL, see the PostgreSQL documentation . To change or modify the flags listed in this section, see Configure database flags .

Monitor memory usage

Monitor your instance's memory in Cloud Monitoring regularly and keep it below the memory limit. A good practice is to set an alert in Cloud Monitoring to alert when the usage exceeds 90% of the limit for 6 hours. This alert can warn you when your memory usage is close to the limit constantly.

Additionally, monitor for out-of-memory incidents. To do that, set up a log-based metric for the server process .* was terminated by signal 9: Killed message in Cloud Monitoring to count the out-of-memory events, and then alert each time such an event happens.

If your instance operates constantly above 90% of the memory's limit or an out-of-memory event occurs, you can increase the instance's memory. Alternatively, you can reduce the memory usage by limiting the number of database connections or lowering database flags such as shared_buffers , work_mem , or max_connections . Lowering these flags can limit the performance of your instance.

Out of memory

When there's insufficient memory to handle the database workload, as a last resort, the underlying Linux operating system uses the out-of-memory (OOM) killer to end a process to release memory. Cloud SQL is configured so that the OOM killer targets only the PostgreSQL worker processes. The postmaster process is preserved in this situation so that it only has to end all existing database connections and run a recovery to protect the integrity of the database. If this happens, there are moments of service disruption and downtime to the database. In the PostgreSQL database log, messages like the following appear:

 2021 
-10-24  
 23 
:34:22.265  
UTC  
 [ 
 7 
 ] 
:  
 [ 
 663 
-1 ] 
  
 db 
 = 
,user = 
  
LOG:  
server  
process  
 ( 
PID  
 1255039 
 ) 
  
was  
terminated  
by  
signal  
 9 
:  
Killed 2021 
-10-24  
 23 
:34:22.265  
UTC  
 [ 
 7 
 ] 
:  
 [ 
 664 
-1 ] 
  
 db 
 = 
,user = 
  
DETAIL:  
Failed  
process  
was  
running:  
SELECT  
*  
FROM  
tab  
ORDER  
BY  
col 2021 
-10-24  
 23 
:34:22.277  
UTC  
 [ 
 7 
 ] 
:  
 [ 
 665 
-1 ] 
  
 db 
 = 
,user = 
  
LOG:  
terminating  
any  
other  
active  
server  
processes 2021 
-10-24  
 23 
:34:22.278  
UTC  
 [ 
 1255458 
 ] 
:  
 [ 
 1 
-1 ] 
  
 db 
 = 
postgres,user = 
postgres  
WARNING:  
terminating  
connection  
because  
of  
crash  
of  
another  
server  
process 2021 
-10-24  
 23 
:34:22.278  
UTC  
 [ 
 1255458 
 ] 
:  
 [ 
 2 
-1 ] 
  
 db 
 = 
postgres,user = 
postgres  
DETAIL:  
The  
postmaster  
has  
commanded  
this  
server  
process  
to  
roll  
back  
the  
current  
transaction  
and  
exit,  
because  
another  
server  
process  
exited  
abnormally  
and  
possibly  
corrupted  
shared  
memory. 2021 
-10-24  
 23 
:34:22.278  
UTC  
 [ 
 1255458 
 ] 
:  
 [ 
 3 
-1 ] 
  
 db 
 = 
postgres,user = 
postgres  
HINT:  
In  
a  
moment  
you  
should  
be  
able  
to  
reconnect  
to  
the  
database  
and  
repeat  
your  
command. 2021 
-10-24  
 23 
:34:22.278  
UTC  
 [ 
 1255458 
 ] 
:  
 [ 
 4 
-1 ] 
  
 db 
 = 
postgres,user = 
postgres  
CONTEXT:  
 while 
  
updating  
tuple  
 ( 
 27 
,18 ) 
  
 in 
  
relation  
 "tab" 
... 2021 
-10-24  
 23 
:34:22.558  
UTC  
 [ 
 1255477 
 ] 
:  
 [ 
 1 
-1 ] 
  
 db 
 = 
postgres,user = 
postgres  
FATAL:  
the  
database  
system  
is  
 in 
  
recovery  
mode
... 2021 
-10-24  
 23 
:34:25.579  
UTC  
 [ 
 7 
 ] 
:  
 [ 
 666 
-1 ] 
  
 db 
 = 
,user = 
  
LOG:  
all  
server  
processes  
terminated ; 
  
reinitializing
... 2021 
-10-24  
 23 
:34:25.691  
UTC  
 [ 
 1255482 
 ] 
:  
 [ 
 1 
-1 ] 
  
 db 
 = 
,user = 
  
LOG:  
database  
system  
was  
interrupted ; 
  
last  
known  
up  
at  
 2021 
-10-24  
 23 
:31:53  
UTC 2021 
-10-24  
 23 
:34:25.776  
UTC  
 [ 
 1255482 
 ] 
:  
 [ 
 2 
-1 ] 
  
 db 
 = 
,user = 
  
LOG:  
database  
system  
was  
not  
properly  
shut  
down ; 
  
automatic  
recovery  
 in 
  
progress 2021 
-10-24  
 23 
:34:25.789  
UTC  
 [ 
 1255482 
 ] 
:  
 [ 
 3 
-1 ] 
  
 db 
 = 
,user = 
  
LOG:  
redo  
starts  
at  
 227 
/AB359400 2021 
-10-24  
 23 
:34:38.957  
UTC  
 [ 
 1255482 
 ] 
:  
 [ 
 4 
-1 ] 
  
 db 
 = 
,user = 
  
LOG:  
redo  
 done 
  
at  
 229 
/4621F508 2021 
-10-24  
 23 
:34:38.959  
UTC  
 [ 
 1255482 
 ] 
:  
 [ 
 5 
-1 ] 
  
 db 
 = 
,user = 
  
LOG:  
last  
completed  
transaction  
was  
at  
log  
 time 
  
 2021 
-10-24  
 23 
:34:18.5535+00 2021 
-10-24  
 23 
:34:39.290  
UTC  
 [ 
 7 
 ] 
:  
 [ 
 667 
-1 ] 
  
 db 
 = 
,user = 
  
LOG:  
database  
system  
is  
ready  
to  
accept  
connections

What's next

Create a Mobile Website
View Site in Mobile | Classic
Share by: