How performance snapshot reports work
Performance snapshot reports are a built-in AlloyDB Omni tool that captures and analyzes performance data to help you identify the cause of performance issues.
Performance snapshot reports display database metrics between two timestamps in a single report. You can use the performance snapshot report information to identify performance issues with your performance snapshot report instance, like decreased database performance during certain times of the day or decreased performance over a certain time period.
Using the performance snapshot report, you compare the metrics to a performance baseline to gain insights into workload performance metrics, which you can use to optimize or troubleshoot database performance. A baseline is a customized set of database snapshots that measure the standard performance and behavior of a database for a specific configuration and workload.
For information about wait events in performance snapshot report, see Database performance snapshot report reference .
Required roles
Ensure that you have the pg_monitor
role
.
This role is granted to superusers, who can then grant pg_monitor
to other users.
For example, postgres
is the superuser by default. You can run GRANT pg_monitor TO my_user
as postgres
to allow my_user
to use the performance snapshot tool as described in this document.
Install performance snapshot report
perfsnap
is the schema name that contains SQL functions that allow users to capture snapshots or generate reports. This schema is part of the AlloyDB Omni g_stats
extension. Use the superuser role to install performance snapshot report.
To use the perfsnap
APIs, connect to any database where users want to install the extension, and create the g_stats
extension with the following command:
CREATE
EXTENSION
IF
NOT
EXISTS
g_stats
;
Create a snapshot of system metrics
Create a snapshot at the beginning and end of the workload you're interested in. The time interval between the two snapshots allows enough time for the workload to progress so that the system can accumulate metrics that reflect the workload. After you obtain metrics from the resulting performance snapshot report, you can take another set of snapshots and repeat the process.
- Connect a
psql
client to an AlloyDB instance . -
Run
SELECT perfsnap.snap()
. The output looks similar to the following:postgres=# select perfsnap.snap(); snap ------ 1 (1 row)
View a list of snapshots
- Connect a
psql
client to an AlloyDB instance . -
Run
SELECT * FROM perfsnap.g$snapshots
. The output looks similar to the following:postgres=# select * from perfsnap.g$snapshots; snap_id | snap_time | instance_id | node_id | snap_description | snap_type | is_baseline ---------+-------------------------------+-------------+---------+--------------------+-----------+------------- 1 | 2023-11-13 22:13:43.159237+00 | sr-primary | | Manual snapshot | Manual | f 2 | 2023-11-13 22:53:40.49565+00 | sr-primary | | Automatic snapshot | Automatic | f (2 rows)
Generate a snapshot report
To generate a report that captures the difference between snapshots 1
and 2, for example, run
SELECT perfsnap.report(1,2)
.
The second snapshot in a differential operation doesn't need to immediately follow the first snapshot. However, make sure you capture the second snapshot in the differential after the first snapshot.
The generated performance snapshot report looks similar to the following abridged example:
Example performance snapshot report
$ psql -d postgres -U alloydbsuperuser postgres=> select perfsnap.report(22, 23); report -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- PGSNAP DB Report for: Snapshot details -------------------------------------- Host i841-sr-primary-2a34f46e-06bc Release 14.12 Startup Time 2024-10-08 03:23:15+00 Snap Id Snap Time ------------ ---------- ------------------------ Begin Snap: 22 24.10.2024 04:33:56 (UTC) Automatic snapshot End Snap: 23 25.10.2024 04:38:56 (UTC) Automatic snapshot Elapsed: 1 day 00:04:59.979321 Database Cache sizes ~~~~~~~~~~~~~ Shared Buffers: 31 GB Block Size: 8192 Effective Cache Size: 25 GB WAL Buffers: 16384 Host CPU ~~~~~~~~~~ %User %Nice %System %Idle %WIO %IRQ %SIRQ %Steal %Guest ------- ------- ------- ------- ------- ------- ------- ------- ------- 1.07 0.22 0.91 97.40 0.09 0.00 0.31 0.00 0.00 Host Memory ~~~~~~~~~~~~ Total Memory: 63 GB Available Memory: 11 GB Free Memory: 726 MB Buffers Memory: 3706 MB Load profile (in bytes) ~~~~~~~~~~~~~~~~~~~~~~~ Per Second Per Transaction ------------ --------------- Redo size: 63083.64 4489.93 Logical reads: 1961.21 139.59 ... Response Time Profile (in s) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ CPU time: 5399 ( 0.39%) Wait time: 1386906 ( 99.61%) Total time: 1392306 Backend Processes Wait Class Breakdown (in s) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ IO 119.300 ( 98.91%) LWLock 1.305 ( 1.08%) IPC .010 ( 0.01%) Lock .000 ( 0.00%) Backend Processes Wait Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Event Class Waits Time (us) Avg (us) -------------------------------------- ------------- ------------- -------------- ------------- CPU 1995948632 WALInsert LWLock 1 6656 6656 Vacuum Information ~~~~~~~~~~~~~~~~~~~ Num Analyze operations: 1976 Num Vacuum operations: 3435 Per Database Information ~~~~~~~~~~~~~~~~~~~~~~~~~ Name Commits Rollbacks BlkRds Blkhits TempFiles TempBytes ------------------------- ------------- ------------- ------------- ------------- ------------- ------------- bench 27939 0 0 7823038 0 0 bytes postgres 39792 0 7 11089243 0 0 bytes Per Database DML & DQL Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Name Tuples returned Tuples fetched Tuples inserted Tuples updated Tuples deleted Index splits Index Only heap fetches HOT updates ------------------------- ---------------- ---------------- ---------------- ---------------- ---------------- ---------------- ------------------------- ---------------- bench 16119481 4843262 0 0 0 0 16 0 postgres 25415473 6327188 0 10 0 0 0 8 Per Database Conflict Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Name Lock Timeout Old Snapshot Buffer Pins Deadlock ------------------------- ------------- ------------- ------------- ------------- bench 0 0 0 0 postgres 0 0 0 0 Per Database Vacuum Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Name Frozen XID % Consumed Aggregate Vacuum Gap ------------------------- ------------- ------------- -------------------- bench 179460916 9.00% 20539084 postgres 179339239 9.00% 20660761 Per Database Sizing Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Conn. Name Collation Limit Tablespace DB Size Growth -------------------- ------------- ------- -------------------- ---------- ---------- bench C.UTF-8 -1 pg_default 80 GB 0 bytes postgres C.UTF-8 -1 pg_default 135 MB 0 bytes Backend Wait Event Histogram ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Event Class Waits <= 1us <= 2us <= 4us <= 8us <= 16us <= 32us <= 64us <= 128us <= 256us <= 512us -------------------------------------- ------------- ----------- --------- --------- --------- --------- --------- --------- --------- --------- --------- -------- WALInsert LWLock 1 0 0 0 0 0 0 0 0 0 0 Background Wait Event Histogram ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Event Class Waits <= 1us <= 2us <= 4us <= 8us <= 16us <= 32us <= 64us <= 128us <= 256us <= 512us -------------------------------------- ------------- ----------- --------- --------- --------- --------- --------- --------- --------- --------- --------- -------- WALInsert LWLock 542 107 174 39 113 93 8 1 1 0 1 Write Ahead Log (WAL) Statistics -------------------------------- Records Full Page Images Bytes Buffers Full Write Sync Write Time Sync Time ----------- ---------------- ----------- ------------ ----------- ----------- ----------- ----------- 2936305 100 805989345 0 0 0 0 0 Summary Stats (across all databases) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Name Value -------------------------------- ---------------------------------- Buffers evicted 0 Commits 1216693 ... Parameter Settings ~~~~~~~~~~~~~~~~~~~ Parameter Value --------------------------------- -------------------------------------------------------------- DateStyle ISO, MDY TimeZone UTC autovacuum on work_mem 4096 Columnar Engine available size Columnar Engine configured size ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 14959MB 19293MB Columnar Engine Statistics ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ name count ---------------------------------------------------------- ------------ CU Populations/Refreshes 13197 CU Auto Refreshes 10975 ... Columnar Engine Ultra-fast Cache Statistics ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Ultra-fast Cache Size (MB): 19200 Ultra-fast Cache Used Size (MB): 0 Ultra-fast Cache Block Size (MB): 80 ---------------------------------------------------- Created by G_STATS v1.0.100 ---------------------------------------------------- (xxx rows)
For information about report fields and performance optimization recommendations, see Database performance optimization recommendations . For more information about wait events in performance snapshot reports, see Database performance snapshot report reference .
Delete a snapshot
Before you can delete snapshots that are part of an existing baseline, you must clear the baseline .
To delete a snapshot, run SELECT perfsnap.delete(n)
. After you delete a
snapshot, you can't recover it.
Mark a snapshot as a performance baseline
To mark all snapshots with IDs between 1 and 3, for example, as a system
performance baseline, run
SELECT perfsnap.make_baseline(1, 3)
.
Clear performance baselines
To clear all baselines with IDs between 1 and 3, for example, run SELECT perfsnap.clear_baseline(1, 3)
.
Optimize database performance using snapshot report results
Follow these steps to optimize AlloyDB database performance:
- Create baseline snapshots when your database is idle or when it's experiencing an average load.
- Start the workload or query whose performance you want to improve.
- When the workload or query reaches peak resource usage, create another set of snapshots. We recommend that you use the same interval for both reports.
- Compare the reports that you created with both set of snapshots and identify changes that might improve performance. For more information about performance recommendations, see Database performance optimization recommendations .
Database performance optimization recommendations
The following table lists performance snapshot report sections and recommended improvements for each report section. For more information about performance snapshot report sections and wait events, see Database performance snapshot report reference .
Sizing Information
- Current age of
datfrozenxid
(oldest unfrozen XIDs) of each database, or the number of transactions fromdatfrozenxid
to the current transaction's XID.
- Unfrozen transaction IDs consumed out of all transaction IDs.
- Result of
autovacuum_freeze_max_age - age(pg_database.datfrozenxid)
, which indicates the approximate age gaps (in transactions) at the second snapshot time, when autovacuum is triggered to prevent wraparounds on a database-aggregated level.
Background Processes Information
wal_buffers
available to the database.Limitations
-
To avoid space bloat, you can manually create a maximum of 2500 snapshots on one instance. Space bloat ensures that snapshots don't occupy too much storage space in your database.
-
If the number of snapshots exceeds the snapshot limit, then AlloyDB Omni deletes all manual snapshots older than 90 days. To remain within the snapshot limit, you must clean up unnecessary snapshots before you take a new one.
-
AlloyDB Omni periodically cleans up manual snapshots that are older than 90 days.
What's next
- Learn about wait events in performance snapshot reports .