PMU overview

The performance monitoring unit (PMU) is a hardware component within the CPU core that monitors how the processor runs code. When you enable the PMU in an A4X, C4A, C4, or M4 Compute Engine instance, you can access the performance counters in the PMU by using performance-monitoring software. This approach helps you identify and address performance bottlenecks in performance-sensitive workloads, such as high-performance computing (HPC) or machine learning (ML).

This document explains the behavior, billing, and limitations of the PMU. To learn how to enable the PMU in an A4X, C4A, C4, or M4 instance, see Enable the PMU in Compute Engine instances.

Understand the PMU

The PMU is composed of a set of hardware counters called performance monitoring counters (PMCs). These counters are model-specific registers that count each time a low-level processor event occurs within the CPU, such as a branch misprediction or cache miss. You can read and configure PMCs in the PMU by using performance-monitoring software such as Intel VTune Profiler.

After you enable the PMU in a compute instance, the PMU runs in the background, continuously monitoring performance events using PMCs. You can optionally configure thresholds for specific PMCs using your preferred performance-monitoring software. If a PMC exceeds its designated threshold, then the PMU notifies the software.
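
The threshold behavior described above can be pictured with a small software model. The following is a minimal sketch only: it doesn't access real hardware, and the `SimulatedCounter` class and its callback are hypothetical names introduced for illustration.

```python
from typing import Callable


class SimulatedCounter:
    """Toy software model of one PMC; it does not touch real hardware."""

    def __init__(self, threshold: int, on_threshold: Callable[[int], None]):
        self.value = 0
        self.threshold = threshold
        self.on_threshold = on_threshold
        self.notified = False

    def record(self, events: int = 1) -> None:
        """Count events and notify once when the value exceeds the threshold."""
        self.value += events
        if not self.notified and self.value > self.threshold:
            self.notified = True
            self.on_threshold(self.value)


# Collect notifications in a list that stands in for the monitoring software.
notifications = []
counter = SimulatedCounter(threshold=3, on_threshold=notifications.append)
for _ in range(5):
    counter.record()  # one simulated event, for example a cache miss
```

In this model the callback fires a single time, on the first event that pushes the count past the threshold. Real performance-monitoring software typically uses such notifications to take a sample and then rearms the counter.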

PMU events

By default, the PMU is disabled in compute instances. When you enable it, you choose which low-level CPU events to track by selecting one of the following PMU types:

  • Architectural (ARCHITECTURAL): you can measure the following architectural performance events:
    • Branch instructions retired: the number of branch instructions retired. Use this event to measure your code's execution and identify potential performance bottlenecks.
    • Branch misses retired: the number of branch instructions that were mispredicted, causing the processor to stall and discard fetched instructions. A high count relative to the number of branch instructions retired suggests that branch-heavy code is a candidate for optimization.
    • Instructions retired: the number of instructions the CPU successfully processes. Use this event to measure the CPU's instruction throughput.
    • Top down slots: the number of available slots within a processor's pipeline that are used to simultaneously execute instructions. Use this event to understand how efficiently your code is using the processor's resources.
    • Unhalted core cycles: the number of core cycles when the thread isn't halted (for example, due to power management or interrupts). Use this event to evaluate the overall usage of the processor.
    • Unhalted reference cycles: the number of reference cycles when the core isn't halted (for example, when fetching data or instructions). The core is halted when it runs the HLT or MWAIT instructions. Reference cycles operate at a fixed frequency, providing a stable time reference even when the speed of the processor changes to preserve energy. Use this event to measure the time spent on a task and identify performance bottlenecks in your code.
  • Standard (STANDARD): you can measure all events from the Architectural PMU type and any local events inside the CPU core, including level 2 (L2) cache events.
  • Enhanced (ENHANCED): you can measure all events from the Standard PMU type and any local events outside the CPU core, including level 3 (L3) cache events.
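
The architectural events are most useful when combined into ratios. The following sketch shows two commonly used derived metrics, instructions per cycle and branch misprediction rate. The function names and the sample counter values are hypothetical; the counter readings themselves would come from your performance-monitoring software.

```python
def instructions_per_cycle(instructions_retired: int, unhalted_core_cycles: int) -> float:
    """Instructions retired divided by unhalted core cycles."""
    if unhalted_core_cycles <= 0:
        raise ValueError("unhalted_core_cycles must be positive")
    return instructions_retired / unhalted_core_cycles


def branch_misprediction_rate(branch_misses_retired: int,
                              branch_instructions_retired: int) -> float:
    """Fraction of retired branch instructions that were mispredicted."""
    if branch_instructions_retired <= 0:
        raise ValueError("branch_instructions_retired must be positive")
    return branch_misses_retired / branch_instructions_retired


# Hypothetical counter readings for one measurement interval.
ipc = instructions_per_cycle(2_000_000, 1_000_000)
miss_rate = branch_misprediction_rate(5_000, 100_000)
print(f"IPC: {ipc:.2f}, branch misprediction rate: {miss_rate:.1%}")
```

Ratios like these are easier to compare across runs than raw counts, because they don't depend on how long the measurement interval was.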

Supported machine series and CPU platforms

You can enable the PMU on compute instances that use the following machine series and CPU platforms:

CPU platform | Machine series | PMU tracking types
Google Axion Processor | C4A | ARCHITECTURAL and STANDARD
Intel Xeon Scalable Processor (Emerald Rapids) 5th generation | C4 and M4 | ARCHITECTURAL, STANDARD, and ENHANCED
Intel Xeon Scalable Processor (Granite Rapids) 6th generation | C4 | ARCHITECTURAL, STANDARD, and ENHANCED
NVIDIA Grace Processor | A4X | ARCHITECTURAL and STANDARD
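
If you script instance creation, you might encode the preceding table as a lookup. The following is a minimal sketch; the `SUPPORTED_PMU_TYPES` mapping and the `supports_pmu_type` helper are hypothetical names, and the check reflects only this table, not the additional machine-type restrictions on the Enhanced PMU type that are listed under Limitations.

```python
# Machine series mapped to the PMU tracking types from the table above.
SUPPORTED_PMU_TYPES = {
    "C4A": ("ARCHITECTURAL", "STANDARD"),
    "C4": ("ARCHITECTURAL", "STANDARD", "ENHANCED"),
    "M4": ("ARCHITECTURAL", "STANDARD", "ENHANCED"),
    "A4X": ("ARCHITECTURAL", "STANDARD"),
}


def supports_pmu_type(machine_series: str, pmu_type: str) -> bool:
    """Return True if the table lists pmu_type for the given machine series."""
    supported = SUPPORTED_PMU_TYPES.get(machine_series.upper(), ())
    return pmu_type.upper() in supported


print(supports_pmu_type("C4", "ENHANCED"))
print(supports_pmu_type("C4A", "ENHANCED"))
```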

Supported PMU events by CPU platform

The following tables list the supported PMU events by CPU platform:

Google Axion

The following table lists the supported PMU events for Google Axion processors with Neoverse V2 Armv9 cores.

Event code | PMU tracking type | Event name
0x0 | ARCHITECTURAL | SW_INCR
0x3 | ARCHITECTURAL | L1D_CACHE_REFILL
0x4 | ARCHITECTURAL | L1D_CACHE
0x8 | ARCHITECTURAL | INST_RETIRED
0x10 | ARCHITECTURAL | BR_MIS_PRED
0x11 | ARCHITECTURAL | CPU_CYCLES
0x12 | ARCHITECTURAL | BR_PRED
0x1b | ARCHITECTURAL | INST_SPEC
0x1e | ARCHITECTURAL | CHAIN
0x23 | ARCHITECTURAL | STALL_FRONTEND
0x24 | ARCHITECTURAL | STALL_BACKEND
0x39 | ARCHITECTURAL | L1D_CACHE_LMISS_RD
0x3c | ARCHITECTURAL | STALL
0x40 | ARCHITECTURAL | L1D_CACHE_RD
0x4006 | ARCHITECTURAL | L1I_CACHE_LMISS
0x8006 | ARCHITECTURAL | SVE_INST_SPEC
0x0 | STANDARD | SW_INCR
0x1 | STANDARD | L1I_CACHE_REFILL
0x2 | STANDARD | L1I_TLB_REFILL
0x3 | STANDARD | L1D_CACHE_REFILL
0x4 | STANDARD | L1D_CACHE
0x5 | STANDARD | L1D_TLB_REFILL
0x8 | STANDARD | INST_RETIRED
0x9 | STANDARD | EXC_TAKEN
0xa | STANDARD | EXC_RETURN
0xb | STANDARD | CID_WRITE_RETIRED
0x10 | STANDARD | BR_MIS_PRED
0x11 | STANDARD | CPU_CYCLES
0x12 | STANDARD | BR_PRED
0x13 | STANDARD | MEM_ACCESS
0x14 | STANDARD | L1I_CACHE
0x15 | STANDARD | L1D_CACHE_WB
0x16 | STANDARD | L2D_CACHE
0x17 | STANDARD | L2D_CACHE_REFILL
0x18 | STANDARD | L2D_CACHE_WB
0x19 | STANDARD | BUS_ACCESS
0x1b | STANDARD | INST_SPEC
0x1c | STANDARD | TTBR_WRITE_RETIRED
0x1d | STANDARD | BUS_CYCLES
0x1e | STANDARD | CHAIN
0x20 | STANDARD | L2D_CACHE_ALLOCATE
0x21 | STANDARD | BR_RETIRED
0x22 | STANDARD | BR_MIS_PRED_RETIRED
0x23 | STANDARD | STALL_FRONTEND
0x24 | STANDARD | STALL_BACKEND
0x25 | STANDARD | L1D_TLB
0x26 | STANDARD | L1I_TLB
0x2d | STANDARD | L2D_TLB_REFILL
0x2f | STANDARD | L2D_TLB
0x31 | STANDARD | REMOTE_ACCESS
0x34 | STANDARD | DTLB_WALK
0x35 | STANDARD | ITLB_WALK
0x39 | STANDARD | L1D_CACHE_LMISS_RD
0x3a | STANDARD | OP_RETIRED
0x3b | STANDARD | OP_SPEC
0x3c | STANDARD | STALL
0x3d | STANDARD | STALL_SLOT_BACKEND
0x3e | STANDARD | STALL_SLOT_FRONTEND
0x3f | STANDARD | STALL_SLOT
0x40 | STANDARD | L1D_CACHE_RD
0x41 | STANDARD | L1D_CACHE_WR
0x42 | STANDARD | L1D_CACHE_REFILL_RD
0x43 | STANDARD | L1D_CACHE_REFILL_WR
0x44 | STANDARD | L1D_CACHE_REFILL_INNER
0x45 | STANDARD | L1D_CACHE_REFILL_OUTER
0x46 | STANDARD | L1D_CACHE_WB_VICTIM
0x47 | STANDARD | L1D_CACHE_WB_CLEAN
0x48 | STANDARD | L1D_CACHE_INVAL
0x4c | STANDARD | L1D_TLB_REFILL_RD
0x4d | STANDARD | L1D_TLB_REFILL_WR
0x4e | STANDARD | L1D_TLB_RD
0x4f | STANDARD | L1D_TLB_WR
0x50 | STANDARD | L2D_CACHE_RD
0x51 | STANDARD | L2D_CACHE_WR
0x52 | STANDARD | L2D_CACHE_REFILL_RD
0x53 | STANDARD | L2D_CACHE_REFILL_WR
0x56 | STANDARD | L2D_CACHE_WB_VICTIM
0x57 | STANDARD | L2D_CACHE_WB_CLEAN
0x58 | STANDARD | L2D_CACHE_INVAL
0x5c | STANDARD | L2D_TLB_REFILL_RD
0x5d | STANDARD | L2D_TLB_REFILL_WR
0x5e | STANDARD | L2D_TLB_RD
0x5f | STANDARD | L2D_TLB_WR
0x60 | STANDARD | BUS_ACCESS_RD
0x61 | STANDARD | BUS_ACCESS_WR
0x66 | STANDARD | MEM_ACCESS_RD
0x67 | STANDARD | MEM_ACCESS_WR
0x68 | STANDARD | UNALIGNED_LD_SPEC
0x69 | STANDARD | UNALIGNED_ST_SPEC
0x6a | STANDARD | UNALIGNED_LDST_SPEC
0x6c | STANDARD | LDREX_SPEC
0x6d | STANDARD | STREX_PASS_SPEC
0x6e | STANDARD | STREX_FAIL_SPEC
0x6f | STANDARD | STREX_SPEC
0x70 | STANDARD | LD_SPEC
0x71 | STANDARD | ST_SPEC
0x73 | STANDARD | DP_SPEC
0x74 | STANDARD | ASE_SPEC
0x75 | STANDARD | VFP_SPEC
0x76 | STANDARD | PC_WRITE_SPEC
0x77 | STANDARD | CRYPTO_SPEC
0x78 | STANDARD | BR_IMMED_SPEC
0x79 | STANDARD | BR_RETURN_SPEC
0x7a | STANDARD | BR_INDIRECT_SPEC
0x7c | STANDARD | ISB_SPEC
0x7d | STANDARD | DSB_SPEC
0x7e | STANDARD | DMB_SPEC
0x81 | STANDARD | EXC_UNDEF
0x82 | STANDARD | EXC_SVC
0x83 | STANDARD | EXC_PABORT
0x84 | STANDARD | EXC_DABORT
0x86 | STANDARD | EXC_IRQ
0x87 | STANDARD | EXC_FIQ
0x88 | STANDARD | EXC_SMC
0x8a | STANDARD | EXC_HVC
0x8b | STANDARD | EXC_TRAP_PABORT
0x8c | STANDARD | EXC_TRAP_DABORT
0x8d | STANDARD | EXC_TRAP_OTHER
0x8e | STANDARD | EXC_TRAP_IRQ
0x8f | STANDARD | EXC_TRAP_FIQ
0x90 | STANDARD | RC_LD_SPEC
0x91 | STANDARD | RC_ST_SPEC
0x4004 | STANDARD | CNT_CYCLES
0x4005 | STANDARD | STALL_BACKEND_MEM
0x4006 | STANDARD | L1I_CACHE_LMISS
0x4009 | STANDARD | L2D_CACHE_LMISS_RD
0x4020 | STANDARD | LDST_ALIGN_LAT
0x4021 | STANDARD | LD_ALIGN_LAT
0x4022 | STANDARD | ST_ALIGN_LAT
0x8005 | STANDARD | ASE_INST_SPEC
0x8006 | STANDARD | SVE_INST_SPEC
0x8014 | STANDARD | FP_HP_SPEC
0x8018 | STANDARD | FP_SP_SPEC
0x801c | STANDARD | FP_DP_SPEC
0x8074 | STANDARD | SVE_PRED_SPEC
0x8075 | STANDARD | SVE_PRED_EMPTY_SPEC
0x8076 | STANDARD | SVE_PRED_FULL_SPEC
0x8077 | STANDARD | SVE_PRED_PARTIAL_SPEC
0x8079 | STANDARD | SVE_PRED_NOT_FULL_SPEC
0x80bc | STANDARD | SVE_LDFF_SPEC
0x80bd | STANDARD | SVE_LDFF_FAULT_SPEC
0x80c0 | STANDARD | FP_SCALE_OPS_SPEC
0x80c1 | STANDARD | FP_FIXED_OPS_SPEC
0x80e3 | STANDARD | ASE_SVE_INT8_SPEC
0x80e7 | STANDARD | ASE_SVE_INT16_SPEC
0x80eb | STANDARD | ASE_SVE_INT32_SPEC
0x80ef | STANDARD | ASE_SVE_INT64_SPEC
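
On Arm-based instances, Linux `perf` accepts a raw event written as `r` followed by the event code in hexadecimal. The following sketch converts an event code from the preceding table into that form. The `arm_raw_event` helper is a hypothetical name, and whether a given event can actually be counted still depends on the PMU type that you enabled.

```python
def arm_raw_event(event_code: int) -> str:
    """Format an Arm PMU event code as a perf raw event string, such as 'r11'."""
    if not 0 <= event_code <= 0xFFFF:
        raise ValueError("event code must fit in 16 bits")
    return f"r{event_code:x}"


# CPU_CYCLES (0x11) and L1I_CACHE_LMISS (0x4006) from the table above.
print(arm_raw_event(0x11))
print(arm_raw_event(0x4006))
```

You could then pass the result to a command such as `perf stat -e r11`, assuming that `perf` is installed in the guest operating system.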

Intel Emerald Rapids

The following table lists the supported PMU events for Intel 5th generation Emerald Rapids (EMR) processors.

Event code | UMask | PMU tracking type | Event name
0x03 | 0x04 | STANDARD | LD_BLOCKS.ADDRESS_ALIAS
0x03 | 0x82 | STANDARD | LD_BLOCKS.STORE_FORWARD
0x03 | 0x88 | STANDARD | LD_BLOCKS.NO_SR
0x11 | 0x02 | STANDARD | ITLB_MISSES.WALK_COMPLETED_4K
0x11 | 0x04 | STANDARD | ITLB_MISSES.WALK_COMPLETED_2M_4M
0x11 | 0x0E | STANDARD | ITLB_MISSES.WALK_COMPLETED
0x11 | 0x10 | STANDARD | ITLB_MISSES.WALK_ACTIVE
0x11 | 0x20 | STANDARD | ITLB_MISSES.STLB_HIT
0x12 | 0x02 | STANDARD | DTLB_LOAD_MISSES.WALK_COMPLETED_4K
0x12 | 0x04 | STANDARD | DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M
0x12 | 0x08 | STANDARD | DTLB_LOAD_MISSES.WALK_COMPLETED_1G
0x12 | 0x0E | STANDARD | DTLB_LOAD_MISSES.WALK_COMPLETED
0x12 | 0x10 | STANDARD | DTLB_LOAD_MISSES.WALK_ACTIVE
0x12 | 0x20 | STANDARD | DTLB_LOAD_MISSES.STLB_HIT
0x13 | 0x02 | STANDARD | DTLB_STORE_MISSES.WALK_COMPLETED_4K
0x13 | 0x04 | STANDARD | DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M
0x13 | 0x08 | STANDARD | DTLB_STORE_MISSES.WALK_COMPLETED_1G
0x13 | 0x0E | STANDARD | DTLB_STORE_MISSES.WALK_COMPLETED
0x13 | 0x10 | STANDARD | DTLB_STORE_MISSES.WALK_ACTIVE
0x13 | 0x20 | STANDARD | DTLB_STORE_MISSES.STLB_HIT
0x20 | 0x04 | ENHANCED | OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO
0x20 | 0x08 | ENHANCED | OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD
0x21 | 0x01 | ENHANCED | OFFCORE_REQUESTS.DEMAND_DATA_RD
0x21 | 0x08 | ENHANCED | OFFCORE_REQUESTS.DATA_RD
0x21 | 0x80 | ENHANCED | OFFCORE_REQUESTS.ALL_REQUESTS
0x24 | 0x21 | STANDARD | L2_RQSTS.DEMAND_DATA_RD_MISS
0x24 | 0x22 | STANDARD | L2_RQSTS.RFO_MISS
0x24 | 0x24 | STANDARD | L2_RQSTS.CODE_RD_MISS
0x24 | 0x27 | STANDARD | L2_RQSTS.ALL_DEMAND_MISS
0x24 | 0x28 | STANDARD | L2_RQSTS.SWPF_MISS
0x24 | 0x3F | STANDARD | L2_RQSTS.MISS
0x24 | 0xC1 | STANDARD | L2_RQSTS.DEMAND_DATA_RD_HIT
0x24 | 0xC2 | STANDARD | L2_RQSTS.RFO_HIT
0x24 | 0xC4 | STANDARD | L2_RQSTS.CODE_RD_HIT
0x24 | 0xC8 | STANDARD | L2_RQSTS.SWPF_HIT
0x24 | 0xE1 | STANDARD | L2_RQSTS.ALL_DEMAND_DATA_RD
0x24 | 0xE2 | STANDARD | L2_RQSTS.ALL_RFO
0x24 | 0xE4 | STANDARD | L2_RQSTS.ALL_CODE_RD
0x24 | 0xE7 | STANDARD | L2_RQSTS.ALL_DEMAND_REFERENCES
0x24 | 0xFF | STANDARD | L2_RQSTS.REFERENCES
0x25 | 0x1F | STANDARD | L2_LINES_IN.ALL
0x26 | 0x01 | STANDARD | L2_LINES_OUT.SILENT
0x26 | 0x02 | STANDARD | L2_LINES_OUT.NON_SILENT
0x2D | 0x01 | ENHANCED | XQ.FULL_CYCLES
0x2E | 0x41 | ENHANCED | LONGEST_LAT_CACHE.MISS
0x3C | 0x00 | ARCHITECTURAL | CPU_CLK_UNHALTED.THREAD_P
0x40 | 0x01 | STANDARD | SW_PREFETCH_ACCESS.NTA
0x40 | 0x02 | STANDARD | SW_PREFETCH_ACCESS.T0
0x40 | 0x04 | STANDARD | SW_PREFETCH_ACCESS.T1_T2
0x40 | 0x08 | STANDARD | SW_PREFETCH_ACCESS.PREFETCHW
0x43 | 0xFD | STANDARD | MEM_LOAD_COMPLETED.L1_MISS_ANY
0x44 | 0x01 | STANDARD | MEM_STORE_RETIRED.L2_HIT
0x47 | 0x02 | STANDARD | MEMORY_ACTIVITY.CYCLES_L1D_MISS
0x47 | 0x03 | STANDARD | MEMORY_ACTIVITY.STALLS_L1D_MISS
0x47 | 0x05 | STANDARD | MEMORY_ACTIVITY.STALLS_L2_MISS
0x47 | 0x09 | ENHANCED | MEMORY_ACTIVITY.STALLS_L3_MISS
0x48 | 0x01 | STANDARD | L1D_PEND_MISS.PENDING
0x48 | 0x02 | STANDARD | L1D_PEND_MISS.FB_FULL
0x48 | 0x04 | STANDARD | L1D_PEND_MISS.L2_STALL
0x4C | 0x01 | STANDARD | LOAD_HIT_PREFETCH.SWPF
0x51 | 0x01 | STANDARD | L1D.REPLACEMENT
0x54 | 0x01 | ENHANCED | TX_MEM.ABORT_CONFLICT
0x54 | 0x02 | ENHANCED | TX_MEM.ABORT_CAPACITY_WRITE
0x54 | 0x80 | ENHANCED | TX_MEM.ABORT_CAPACITY_READ
0x61 | 0x02 | STANDARD | DSB2MITE_SWITCHES.PENALTY_CYCLES
0x75 | 0x01 | STANDARD | INST_DECODED.DECODERS
0x76 | 0x01 | STANDARD | UOPS_DECODED.DEC0_UOPS
0x79 | 0x04 | STANDARD | IDQ.MITE_CYCLES_ANY
0x79 | 0x08 | STANDARD | IDQ.DSB_CYCLES_ANY
0x79 | 0x20 | STANDARD | IDQ.MS_CYCLES_ANY
0x80 | 0x04 | STANDARD | ICACHE_DATA.STALLS
0x83 | 0x04 | STANDARD | ICACHE_TAG.STALLS
0x87 | 0x01 | STANDARD | DECODE.LCP
0x9C | 0x01 | STANDARD | IDQ_UOPS_NOT_DELIVERED.CORE
0xA2 | 0x02 | STANDARD | RESOURCE_STALLS.SCOREBOARD
0xA2 | 0x08 | STANDARD | RESOURCE_STALLS.SB
0xA3 | 0x01 | STANDARD | CYCLE_ACTIVITY.CYCLES_L2_MISS
0xA3 | 0x04 | STANDARD | CYCLE_ACTIVITY.STALLS_TOTAL
0xA3 | 0x05 | STANDARD | CYCLE_ACTIVITY.STALLS_L2_MISS
0xA3 | 0x06 | ENHANCED | CYCLE_ACTIVITY.STALLS_L3_MISS
0xA3 | 0x08 | STANDARD | CYCLE_ACTIVITY.CYCLES_L1D_MISS
0xA3 | 0x0C | STANDARD | CYCLE_ACTIVITY.STALLS_L1D_MISS
0xA3 | 0x10 | STANDARD | CYCLE_ACTIVITY.CYCLES_MEM_ANY
0xA4 | 0x01 | ARCHITECTURAL | TOPDOWN.SLOTS_P
0xA4 | 0x02 | STANDARD | TOPDOWN.BACKEND_BOUND_SLOTS
0xA4 | 0x04 | STANDARD | TOPDOWN.BAD_SPEC_SLOTS
0xA4 | 0x08 | STANDARD | TOPDOWN.BR_MISPREDICT_SLOTS
0xA4 | 0x10 | STANDARD | TOPDOWN.MEMORY_BOUND_SLOTS
0xA5 | 0x07 | STANDARD | RS_EMPTY.CYCLES
0xA6 | 0x02 | STANDARD | EXE_ACTIVITY.1_PORTS_UTIL
0xA6 | 0x04 | STANDARD | EXE_ACTIVITY.2_PORTS_UTIL
0xA6 | 0x08 | STANDARD | EXE_ACTIVITY.3_PORTS_UTIL
0xA6 | 0x10 | STANDARD | EXE_ACTIVITY.4_PORTS_UTIL
0xA6 | 0x21 | STANDARD | EXE_ACTIVITY.BOUND_ON_LOADS
0xA6 | 0x40 | STANDARD | EXE_ACTIVITY.BOUND_ON_STORES
0xA6 | 0x80 | STANDARD | EXE_ACTIVITY.EXE_BOUND_0_PORTS
0xA8 | 0x01 | STANDARD | LSD.CYCLES_ACTIVE
0xAD | 0x01 | STANDARD | INT_MISC.RECOVERY_CYCLES
0xAD | 0x10 | STANDARD | INT_MISC.UOP_DROPPING
0xAD | 0x20 | STANDARD | INT_MISC.MBA_STALLS
0xAD | 0x80 | STANDARD | INT_MISC.CLEAR_RESTEER_CYCLES
0xAE | 0x01 | STANDARD | UOPS_ISSUED.ANY
0xB0 | 0x01 | STANDARD | ARITH.FP_DIVIDER_ACTIVE
0xB0 | 0x08 | STANDARD | ARITH.IDIV_ACTIVE
0xB0 | 0x09 | STANDARD | ARITH.DIVIDER_ACTIVE
0xB1 | 0x01 | STANDARD | UOPS_EXECUTED.CYCLES_GE_1
0xB1 | 0x02 | STANDARD | UOPS_EXECUTED.CORE
0xB1 | 0x10 | STANDARD | UOPS_EXECUTED.X87
0xB2 | 0x01 | STANDARD | UOPS_DISPATCHED.PORT_0
0xB2 | 0x02 | STANDARD | UOPS_DISPATCHED.PORT_1
0xB2 | 0x04 | STANDARD | UOPS_DISPATCHED.PORT_2_3_10
0xB2 | 0x10 | STANDARD | UOPS_DISPATCHED.PORT_4_9
0xB2 | 0x20 | STANDARD | UOPS_DISPATCHED.PORT_5_11
0xB2 | 0x40 | STANDARD | UOPS_DISPATCHED.PORT_6
0xB2 | 0x80 | STANDARD | UOPS_DISPATCHED.PORT_7_8
0xB3 | 0x01 | STANDARD | FP_ARITH_DISPATCHED.PORT_0
0xB3 | 0x02 | STANDARD | FP_ARITH_DISPATCHED.PORT_1
0xB3 | 0x04 | STANDARD | FP_ARITH_DISPATCHED.PORT_5
0xB7 | 0x02 | STANDARD | EXE.AMX_BUSY
0xC0 | 0x00 | ARCHITECTURAL | INST_RETIRED.ANY_P
0xC0 | 0x02 | STANDARD | INST_RETIRED.NOP
0xC0 | 0x08 | STANDARD | INST_RETIRED.REP_ITERATION
0xC0 | 0x10 | STANDARD | INST_RETIRED.MACRO_FUSED
0xC1 | 0x02 | STANDARD | ASSISTS.FP
0xC1 | 0x08 | STANDARD | ASSISTS.PAGE_FAULT
0xC1 | 0x10 | STANDARD | ASSISTS.SSE_AVX_MIX
0xC1 | 0x1B | STANDARD | ASSISTS.ANY
0xC2 | 0x01 | STANDARD | UOPS_RETIRED.HEAVY
0xC2 | 0x02 | STANDARD | UOPS_RETIRED.SLOTS
0xC3 | 0x01 | STANDARD | MACHINE_CLEARS.COUNT
0xC3 | 0x02 | STANDARD | MACHINE_CLEARS.MEMORY_ORDERING
0xC3 | 0x04 | STANDARD | MACHINE_CLEARS.SMC
0xC4 | 0x00 | ARCHITECTURAL | BR_INST_RETIRED.ALL_BRANCHES
0xC4 | 0x01 | STANDARD | BR_INST_RETIRED.COND_TAKEN
0xC4 | 0x02 | STANDARD | BR_INST_RETIRED.NEAR_CALL
0xC4 | 0x08 | STANDARD | BR_INST_RETIRED.NEAR_RETURN
0xC4 | 0x10 | STANDARD | BR_INST_RETIRED.COND_NTAKEN
0xC4 | 0x11 | STANDARD | BR_INST_RETIRED.COND
0xC4 | 0x20 | STANDARD | BR_INST_RETIRED.NEAR_TAKEN
0xC4 | 0x40 | STANDARD | BR_INST_RETIRED.FAR_BRANCH
0xC4 | 0x80 | STANDARD | BR_INST_RETIRED.INDIRECT
0xC5 | 0x00 | ARCHITECTURAL | BR_MISP_RETIRED.ALL_BRANCHES
0xC5 | 0x01 | STANDARD | BR_MISP_RETIRED.COND_TAKEN
0xC5 | 0x02 | STANDARD | BR_MISP_RETIRED.INDIRECT_CALL
0xC5 | 0x08 | STANDARD | BR_MISP_RETIRED.RET
0xC5 | 0x10 | STANDARD | BR_MISP_RETIRED.COND_NTAKEN
0xC5 | 0x11 | STANDARD | BR_MISP_RETIRED.COND
0xC5 | 0x20 | STANDARD | BR_MISP_RETIRED.NEAR_TAKEN
0xC5 | 0x80 | STANDARD | BR_MISP_RETIRED.INDIRECT
0xC7 | 0x01 | STANDARD | FP_ARITH_INST_RETIRED.SCALAR_DOUBLE
0xC7 | 0x02 | STANDARD | FP_ARITH_INST_RETIRED.SCALAR_SINGLE
0xC7 | 0x04 | STANDARD | FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE
0xC7 | 0x08 | STANDARD | FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE
0xC7 | 0x10 | STANDARD | FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE
0xC7 | 0x20 | STANDARD | FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE
0xC7 | 0x40 | STANDARD | FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE
0xC7 | 0x80 | STANDARD | FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE
0xC9 | 0x01 | ENHANCED | RTM_RETIRED.START
0xC9 | 0x02 | ENHANCED | RTM_RETIRED.COMMIT
0xC9 | 0x04 | ENHANCED | RTM_RETIRED.ABORTED
0xC9 | 0x08 | ENHANCED | RTM_RETIRED.ABORTED_MEM
0xC9 | 0x20 | ENHANCED | RTM_RETIRED.ABORTED_UNFRIENDLY
0xC9 | 0x40 | ENHANCED | RTM_RETIRED.ABORTED_MEMTYPE
0xC9 | 0x80 | ENHANCED | RTM_RETIRED.ABORTED_EVENTS
0xCE | 0x01 | STANDARD | AMX_OPS_RETIRED.INT8
0xCE | 0x02 | STANDARD | AMX_OPS_RETIRED.BF16
0xCF | 0x01 | STANDARD | FP_ARITH_INST_RETIRED2.SCALAR_HALF
0xCF | 0x02 | STANDARD | FP_ARITH_INST_RETIRED2.COMPLEX_SCALAR_HALF
0xCF | 0x03 | STANDARD | FP_ARITH_INST_RETIRED2.SCALAR
0xCF | 0x04 | STANDARD | FP_ARITH_INST_RETIRED2.128B_PACKED_HALF
0xCF | 0x08 | STANDARD | FP_ARITH_INST_RETIRED2.256B_PACKED_HALF
0xCF | 0x10 | STANDARD | FP_ARITH_INST_RETIRED2.512B_PACKED_HALF
0xCF | 0x1C | STANDARD | FP_ARITH_INST_RETIRED2.VECTOR
0xD0 | 0x11 | STANDARD | MEM_INST_RETIRED.STLB_MISS_LOADS
0xD0 | 0x12 | STANDARD | MEM_INST_RETIRED.STLB_MISS_STORES
0xD0 | 0x21 | STANDARD | MEM_INST_RETIRED.LOCK_LOADS
0xD0 | 0x41 | STANDARD | MEM_INST_RETIRED.SPLIT_LOADS
0xD0 | 0x42 | STANDARD | MEM_INST_RETIRED.SPLIT_STORES
0xD0 | 0x81 | STANDARD | MEM_INST_RETIRED.ALL_LOADS
0xD0 | 0x82 | STANDARD | MEM_INST_RETIRED.ALL_STORES
0xD0 | 0x83 | STANDARD | MEM_INST_RETIRED.ANY
0xD1 | 0x01 | STANDARD | MEM_LOAD_RETIRED.L1_HIT
0xD1 | 0x02 | STANDARD | MEM_LOAD_RETIRED.L2_HIT
0xD1 | 0x04 | ENHANCED | MEM_LOAD_RETIRED.L3_HIT
0xD1 | 0x08 | STANDARD | MEM_LOAD_RETIRED.L1_MISS
0xD1 | 0x10 | STANDARD | MEM_LOAD_RETIRED.L2_MISS
0xD1 | 0x20 | ENHANCED | MEM_LOAD_RETIRED.L3_MISS
0xD1 | 0x40 | STANDARD | MEM_LOAD_RETIRED.FB_HIT
0xD2 | 0x01 | ENHANCED | MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS
0xD2 | 0x02 | ENHANCED | MEM_LOAD_L3_HIT_RETIRED.XSNP_NO_FWD
0xD2 | 0x04 | ENHANCED | MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD
0xD2 | 0x08 | ENHANCED | MEM_LOAD_L3_HIT_RETIRED.XSNP_NONE
0xD3 | 0x01 | ENHANCED | MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM
0xD3 | 0x02 | ENHANCED | MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM
0xD3 | 0x04 | ENHANCED | MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM
0xD3 | 0x08 | ENHANCED | MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD
0xD4 | 0x04 | STANDARD | MEM_LOAD_MISC_RETIRED.UC
0xE0 | 0x20 | STANDARD | MISC2_RETIRED.LFENCE
0xE5 | 0x03 | STANDARD | MEM_UOP_RETIRED.ANY
0xE7 | 0x03 | STANDARD | INT_VEC_RETIRED.ADD_128
0xE7 | 0x0C | STANDARD | INT_VEC_RETIRED.ADD_256
0xE7 | 0x10 | STANDARD | INT_VEC_RETIRED.VNNI_128
0xE7 | 0x13 | STANDARD | INT_VEC_RETIRED.128BIT
0xE7 | 0x20 | STANDARD | INT_VEC_RETIRED.VNNI_256
0xE7 | 0x40 | STANDARD | INT_VEC_RETIRED.SHUFFLES
0xE7 | 0x80 | STANDARD | INT_VEC_RETIRED.MUL_256
0xE7 | 0xAC | STANDARD | INT_VEC_RETIRED.256BIT
0xEC | 0x10 | STANDARD | CPU_CLK_UNHALTED.C01
0xEC | 0x20 | STANDARD | CPU_CLK_UNHALTED.C02
0xEC | 0x40 | STANDARD | CPU_CLK_UNHALTED.PAUSE
0xEC | 0x70 | STANDARD | CPU_CLK_UNHALTED.C0_WAIT
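
On Intel-based instances, Linux `perf` accepts an event either as a raw hexadecimal value, with the UMask in the upper byte and the event code in the lower byte, or in the `cpu/event=...,umask=.../` form. The following sketch builds both strings from a table row. The helper names are hypothetical, and the sketch models only the event code and UMask: some events also need extra fields, such as a counter mask, which aren't covered here.

```python
def _check_byte(name: str, value: int) -> None:
    """Reject values that don't fit in one byte."""
    if not 0 <= value <= 0xFF:
        raise ValueError(f"{name} must be between 0x00 and 0xFF")


def intel_raw_event(event_code: int, umask: int) -> str:
    """Return the perf raw form: 'r' + UMask byte + event code byte, in hex."""
    _check_byte("event_code", event_code)
    _check_byte("umask", umask)
    return f"r{(umask << 8) | event_code:04x}"


def intel_pmu_event(event_code: int, umask: int) -> str:
    """Return the equivalent 'cpu/event=...,umask=.../' form."""
    _check_byte("event_code", event_code)
    _check_byte("umask", umask)
    return f"cpu/event={event_code:#04x},umask={umask:#04x}/"


# LONGEST_LAT_CACHE.MISS: event code 0x2E, UMask 0x41.
print(intel_raw_event(0x2E, 0x41))
print(intel_pmu_event(0x2E, 0x41))
```

Either string can be passed to `perf stat -e`, assuming that `perf` is installed in the guest operating system and that the enabled PMU type includes the event.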

Intel Granite Rapids

The following table lists the supported PMU events for Intel 6th generation Granite Rapids processors.

Event code | UMask | PMU tracking type | Event name
0x03 | 0x04 | STANDARD | LD_BLOCKS.ADDRESS_ALIAS
0x03 | 0x82 | STANDARD | LD_BLOCKS.STORE_FORWARD
0x03 | 0x88 | STANDARD | LD_BLOCKS.NO_SR
0x11 | 0x02 | STANDARD | ITLB_MISSES.WALK_COMPLETED_4K
0x11 | 0x04 | STANDARD | ITLB_MISSES.WALK_COMPLETED_2M_4M
0x11 | 0x0E | STANDARD | ITLB_MISSES.WALK_COMPLETED
0x11 | 0x10 | STANDARD | ITLB_MISSES.WALK_ACTIVE
0x11 | 0x20 | STANDARD | ITLB_MISSES.STLB_HIT
0x12 | 0x02 | STANDARD | DTLB_LOAD_MISSES.WALK_COMPLETED_4K
0x12 | 0x04 | STANDARD | DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M
0x12 | 0x08 | STANDARD | DTLB_LOAD_MISSES.WALK_COMPLETED_1G
0x12 | 0x0E | STANDARD | DTLB_LOAD_MISSES.WALK_COMPLETED
0x12 | 0x10 | STANDARD | DTLB_LOAD_MISSES.WALK_ACTIVE
0x12 | 0x20 | STANDARD | DTLB_LOAD_MISSES.STLB_HIT
0x13 | 0x02 | STANDARD | DTLB_STORE_MISSES.WALK_COMPLETED_4K
0x13 | 0x04 | STANDARD | DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M
0x13 | 0x08 | STANDARD | DTLB_STORE_MISSES.WALK_COMPLETED_1G
0x13 | 0x0E | STANDARD | DTLB_STORE_MISSES.WALK_COMPLETED
0x13 | 0x10 | STANDARD | DTLB_STORE_MISSES.WALK_ACTIVE
0x13 | 0x20 | STANDARD | DTLB_STORE_MISSES.STLB_HIT
0x20 | 0x01 | ENHANCED | OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD
0x20 | 0x02 | ENHANCED | OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD
0x20 | 0x04 | ENHANCED | OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO
0x20 | 0x08 | ENHANCED | OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD
0x20 | 0x10 | ENHANCED | OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_L3_MISS_DEMAND_DATA_RD
0x21 | 0x01 | ENHANCED | OFFCORE_REQUESTS.DEMAND_DATA_RD
0x21 | 0x02 | ENHANCED | OFFCORE_REQUESTS.DEMAND_CODE_RD
0x21 | 0x04 | ENHANCED | OFFCORE_REQUESTS.DEMAND_RFO
0x21 | 0x08 | ENHANCED | OFFCORE_REQUESTS.DATA_RD
0x21 | 0x10 | ENHANCED | OFFCORE_REQUESTS.L3_MISS_DEMAND_DATA_RD
0x21 | 0x80 | ENHANCED | OFFCORE_REQUESTS.ALL_REQUESTS
0x23 | 0x40 | STANDARD | L2_TRANS.L2_WB
0x24 | 0x21 | STANDARD | L2_RQSTS.DEMAND_DATA_RD_MISS
0x24 | 0x22 | STANDARD | L2_RQSTS.RFO_MISS
0x24 | 0x24 | STANDARD | L2_RQSTS.CODE_RD_MISS
0x24 | 0x27 | STANDARD | L2_RQSTS.ALL_DEMAND_MISS
0x24 | 0x28 | STANDARD | L2_RQSTS.SWPF_MISS
0x24 | 0x30 | STANDARD | L2_RQSTS.HWPF_MISS
0x24 | 0x3F | STANDARD | L2_RQSTS.MISS
0x24 | 0xC1 | STANDARD | L2_RQSTS.DEMAND_DATA_RD_HIT
0x24 | 0xC2 | STANDARD | L2_RQSTS.RFO_HIT
0x24 | 0xC4 | STANDARD | L2_RQSTS.CODE_RD_HIT
0x24 | 0xC8 | STANDARD | L2_RQSTS.SWPF_HIT
0x24 | 0xDF | STANDARD | L2_RQSTS.HIT
0x24 | 0xE1 | STANDARD | L2_RQSTS.ALL_DEMAND_DATA_RD
0x24 | 0xE2 | STANDARD | L2_RQSTS.ALL_RFO
0x24 | 0xE4 | STANDARD | L2_RQSTS.ALL_CODE_RD
0x24 | 0xE7 | STANDARD | L2_RQSTS.ALL_DEMAND_REFERENCES
0x24 | 0xF0 | STANDARD | L2_RQSTS.ALL_HWPF
0x24 | 0xFF | STANDARD | L2_RQSTS.REFERENCES
0x25 | 0x1F | STANDARD | L2_LINES_IN.ALL
0x26 | 0x01 | STANDARD | L2_LINES_OUT.SILENT
0x26 | 0x02 | STANDARD | L2_LINES_OUT.NON_SILENT
0x26 | 0x04 | STANDARD | L2_LINES_OUT.USELESS_HWPF
0x2D | 0x01 | ENHANCED | XQ.FULL_CYCLES
0x2E | 0x41 | ENHANCED | LONGEST_LAT_CACHE.MISS
0x2E | 0x4F | ENHANCED | LONGEST_LAT_CACHE.REFERENCE
0x3C | 0x00 | ARCHITECTURAL | CPU_CLK_UNHALTED.THREAD_P
0x3C | 0x01 | ARCHITECTURAL | CPU_CLK_UNHALTED.REF_TSC_P
0x40 | 0x01 | STANDARD | SW_PREFETCH_ACCESS.NTA
0x40 | 0x02 | STANDARD | SW_PREFETCH_ACCESS.T0
0x40 | 0x04 | STANDARD | SW_PREFETCH_ACCESS.T1_T2
0x40 | 0x08 | STANDARD | SW_PREFETCH_ACCESS.PREFETCHW
0x40 | 0x0F | STANDARD | SW_PREFETCH_ACCESS.ANY
0x43 | 0xFD | STANDARD | MEM_LOAD_COMPLETED.L1_MISS_ANY
0x44 | 0x01 | STANDARD | MEM_STORE_RETIRED.L2_HIT
0x47 | 0x02 | STANDARD | MEMORY_ACTIVITY.CYCLES_L1D_MISS
0x47 | 0x03 | STANDARD | MEMORY_ACTIVITY.STALLS_L1D_MISS
0x47 | 0x05 | STANDARD | MEMORY_ACTIVITY.STALLS_L2_MISS
0x47 | 0x09 | ENHANCED | MEMORY_ACTIVITY.STALLS_L3_MISS
0x48 | 0x01 | STANDARD | L1D_PEND_MISS.PENDING
0x48 | 0x02 | STANDARD | L1D_PEND_MISS.FB_FULL
0x48 | 0x04 | STANDARD | L1D_PEND_MISS.L2_STALLS
0x4C | 0x01 | STANDARD | LOAD_HIT_PREFETCH.SWPF
0x51 | 0x01 | STANDARD | L1D.REPLACEMENT
0x51 | 0x20 | STANDARD | L1D.HWPF_MISS
0x54 | 0x01 | ENHANCED | TX_MEM.ABORT_CONFLICT
0x54 | 0x02 | ENHANCED | TX_MEM.ABORT_CAPACITY_WRITE
0x54 | 0x80 | ENHANCED | TX_MEM.ABORT_CAPACITY_READ
0x60 | 0x01 | STANDARD | BACLEARS.ANY
0x61 | 0x02 | STANDARD | DSB2MITE_SWITCHES.PENALTY_CYCLES
0x75 | 0x01 | STANDARD | INST_DECODED.DECODERS
0x76 | 0x01 | STANDARD | UOPS_DECODED.DEC0_UOPS
0x79 | 0x04 | STANDARD | IDQ.MITE_CYCLES_ANY
0x79 | 0x08 | STANDARD | IDQ.DSB_CYCLES_ANY
0x79 | 0x20 | STANDARD | IDQ.MS_CYCLES_ANY
0x80 | 0x04 | STANDARD | ICACHE_DATA.STALLS
0x83 | 0x04 | STANDARD | ICACHE_TAG.STALLS
0x87 | 0x01 | STANDARD | DECODE.LCP
0x87 | 0x02 | STANDARD | DECODE.MS_BUSY
0x9C | 0x01 | STANDARD | IDQ_UOPS_NOT_DELIVERED.CORE
0xA2 | 0x02 | STANDARD | RESOURCE_STALLS.SCOREBOARD
0xA2 | 0x08 | STANDARD | RESOURCE_STALLS.SB
0xA3 | 0x01 | STANDARD | CYCLE_ACTIVITY.CYCLES_L2_MISS
0xA3 | 0x02 | ENHANCED | CYCLE_ACTIVITY.CYCLES_L3_MISS
0xA3 | 0x04 | STANDARD | CYCLE_ACTIVITY.STALLS_TOTAL
0xA3 | 0x05 | STANDARD | CYCLE_ACTIVITY.STALLS_L2_MISS
0xA3 | 0x06 | ENHANCED | CYCLE_ACTIVITY.STALLS_L3_MISS
0xA3 | 0x08 | STANDARD | CYCLE_ACTIVITY.CYCLES_L1D_MISS
0xA3 | 0x0C | STANDARD | CYCLE_ACTIVITY.STALLS_L1D_MISS
0xA3 | 0x10 | STANDARD | CYCLE_ACTIVITY.CYCLES_MEM_ANY
0xA4 | 0x01 | ARCHITECTURAL | TOPDOWN.SLOTS_P
0xA4 | 0x02 | STANDARD | TOPDOWN.BACKEND_BOUND_SLOTS
0xA4 | 0x04 | STANDARD | TOPDOWN.BAD_SPEC_SLOTS
0xA4 | 0x08 | STANDARD | TOPDOWN.BR_MISPREDICT_SLOTS
0xA4 | 0x10 | STANDARD | TOPDOWN.MEMORY_BOUND_SLOTS
0xA5 | 0x01 | STANDARD | RS.EMPTY_RESOURCE
0xA5 | 0x07 | STANDARD | RS.EMPTY
0xA6 | 0x02 | STANDARD | EXE_ACTIVITY.1_PORTS_UTIL
0xA6 | 0x04 | STANDARD | EXE_ACTIVITY.2_PORTS_UTIL
0xA6 | 0x08 | STANDARD | EXE_ACTIVITY.3_PORTS_UTIL
0xA6 | 0x10 | STANDARD | EXE_ACTIVITY.4_PORTS_UTIL
0xA6 | 0x21 | STANDARD | EXE_ACTIVITY.BOUND_ON_LOADS
0xA6 | 0x40 | STANDARD | EXE_ACTIVITY.BOUND_ON_STORES
0xA6 | 0x80 | STANDARD | EXE_ACTIVITY.EXE_BOUND_0_PORTS
0xA6 | 0x0C | STANDARD | EXE_ACTIVITY.2_3_PORTS_UTIL
0xA8 | 0x01 | STANDARD | LSD.CYCLES_ACTIVE
0xAD | 0x01 | STANDARD | INT_MISC.RECOVERY_CYCLES
0xAD | 0x10 | STANDARD | INT_MISC.UOP_DROPPING
0xAD | 0x20 | STANDARD | INT_MISC.MBA_STALLS
0xAD | 0x80 | STANDARD | INT_MISC.CLEAR_RESTEER_CYCLES
0xAE | 0x01 | STANDARD | UOPS_ISSUED.ANY
0xB0 | 0x01 | STANDARD | ARITH.FPDIV_ACTIVE
0xB0 | 0x08 | STANDARD | ARITH.IDIV_ACTIVE
0xB0 | 0x09 | STANDARD | ARITH.DIV_ACTIVE
0xB1 | 0x01 | STANDARD | UOPS_EXECUTED.CYCLES_GE_1
0xB1 | 0x02 | STANDARD | UOPS_EXECUTED.CORE
0xB1 | 0x10 | STANDARD | UOPS_EXECUTED.X87
0xB2 | 0x01 | STANDARD | UOPS_DISPATCHED.PORT_0
0xB2 | 0x02 | STANDARD | UOPS_DISPATCHED.PORT_1
0xB2 | 0x04 | STANDARD | UOPS_DISPATCHED.PORT_2_3_10
0xB2 | 0x10 | STANDARD | UOPS_DISPATCHED.PORT_4_9
0xB2 | 0x20 | STANDARD | UOPS_DISPATCHED.PORT_5_11
0xB2 | 0x40 | STANDARD | UOPS_DISPATCHED.PORT_6
0xB2 | 0x80 | STANDARD | UOPS_DISPATCHED.PORT_7_8
0xB3 | 0x01 | STANDARD | FP_ARITH_DISPATCHED.PORT_0
0xB3 | 0x02 | STANDARD | FP_ARITH_DISPATCHED.PORT_1
0xB3 | 0x04 | STANDARD | FP_ARITH_DISPATCHED.PORT_5
0xB7 | 0x02 | STANDARD | EXE.AMX_BUSY
0xC0 | 0x00 | ARCHITECTURAL | INST_RETIRED.ANY_P
0xC0 | 0x02 | STANDARD | INST_RETIRED.NOP
0xC0 | 0x08 | STANDARD | INST_RETIRED.REP_ITERATION
0xC0 | 0x10 | STANDARD | INST_RETIRED.MACRO_FUSED
0xC1 | 0x02 | STANDARD | ASSISTS.FP
0xC1 | 0x04 | STANDARD | ASSISTS.HARDWARE
0xC1 | 0x08 | STANDARD | ASSISTS.PAGE_FAULT
0xC1 | 0x10 | STANDARD | ASSISTS.SSE_AVX_MIX
0xC1 | 0x1B | STANDARD | ASSISTS.ANY
0xC2 | 0x01 | STANDARD | UOPS_RETIRED.HEAVY
0xC2 | 0x02 | STANDARD | UOPS_RETIRED.SLOTS
0xC3 | 0x01 | STANDARD | MACHINE_CLEARS.COUNT
0xC3 | 0x02 | STANDARD | MACHINE_CLEARS.MEMORY_ORDERING
0xC3 | 0x04 | STANDARD | MACHINE_CLEARS.SMC
0xC4 | 0x00 | ARCHITECTURAL | BR_INST_RETIRED.ALL_BRANCHES
0xC4 | 0x01 | STANDARD | BR_INST_RETIRED.COND_TAKEN
0xC4 | 0x02 | STANDARD | BR_INST_RETIRED.NEAR_CALL
0xC4 | 0x08 | STANDARD | BR_INST_RETIRED.NEAR_RETURN
0xC4 | 0x10 | STANDARD | BR_INST_RETIRED.COND_NTAKEN
0xC4 | 0x11 | STANDARD | BR_INST_RETIRED.COND
0xC4 | 0x20 | STANDARD | BR_INST_RETIRED.NEAR_TAKEN
0xC4 | 0x40 | STANDARD | BR_INST_RETIRED.FAR_BRANCH
0xC4 | 0x80 | STANDARD | BR_INST_RETIRED.INDIRECT
0xC5 | 0x00 | ARCHITECTURAL | BR_MISP_RETIRED.ALL_BRANCHES
0xC5 | 0x01 | STANDARD | BR_MISP_RETIRED.COND_TAKEN
0xC5 | 0x02 | STANDARD | BR_MISP_RETIRED.INDIRECT_CALL
0xC5 | 0x08 | STANDARD | BR_MISP_RETIRED.RET
0xC5 | 0x10 | STANDARD | BR_MISP_RETIRED.COND_NTAKEN
0xC5 | 0x11 | STANDARD | BR_MISP_RETIRED.COND
0xC5 | 0x20 | STANDARD | BR_MISP_RETIRED.NEAR_TAKEN
0xC5 | 0x41 | STANDARD | BR_MISP_RETIRED.COND_TAKEN_COST
0xC5 | 0x42 | STANDARD | BR_MISP_RETIRED.INDIRECT_CALL_COST
0xC5 | 0x44 | STANDARD | BR_MISP_RETIRED.ALL_BRANCHES_COST
0xC5 | 0x48 | STANDARD | BR_MISP_RETIRED.RET_COST
0xC5 | 0x50 | STANDARD | BR_MISP_RETIRED.COND_NTAKEN_COST
0xC5 | 0x51 | STANDARD | BR_MISP_RETIRED.COND_COST
0xC5 | 0x60 | STANDARD | BR_MISP_RETIRED.NEAR_TAKEN_COST
0xC5 | 0x80 | STANDARD | BR_MISP_RETIRED.INDIRECT
0xC5 | 0xC0 | STANDARD | BR_MISP_RETIRED.INDIRECT_COST
0xC7 | 0x01 | STANDARD | FP_ARITH_INST_RETIRED.SCALAR_DOUBLE
0xC7 | 0x02 | STANDARD | FP_ARITH_INST_RETIRED.SCALAR_SINGLE
0xC7 | 0x03 | STANDARD | FP_ARITH_INST_RETIRED.SCALAR
0xC7 | 0x04 | STANDARD | FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE
0xC7 | 0x08 | STANDARD | FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE
0xC7 | 0x10 | STANDARD | FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE
0xC7 | 0x18 | STANDARD | FP_ARITH_INST_RETIRED.4_FLOPS
0xC7 | 0x20 | STANDARD | FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE
0xC7 | 0x40 | STANDARD | FP_ARITH_INST_RETIRED.512B_PACKED_DOUBLE
0xC7 | 0x60 | STANDARD | FP_ARITH_INST_RETIRED.8_FLOPS
0xC7 | 0x80 | STANDARD | FP_ARITH_INST_RETIRED.512B_PACKED_SINGLE
0xC7 | 0xFC | STANDARD | FP_ARITH_INST_RETIRED.VECTOR
0xC9 | 0x01 | ENHANCED | RTM_RETIRED.START
0xC9 | 0x02 | ENHANCED | RTM_RETIRED.COMMIT
0xC9 | 0x04 | ENHANCED | RTM_RETIRED.ABORTED
0xC9 | 0x08 | ENHANCED | RTM_RETIRED.ABORTED_MEM
0xC9 | 0x20 | ENHANCED | RTM_RETIRED.ABORTED_UNFRIENDLY
0xC9 | 0x40 | ENHANCED | RTM_RETIRED.ABORTED_MEMTYPE
0xC9 | 0x80 | ENHANCED | RTM_RETIRED.ABORTED_EVENTS
0xCF | 0x01 | STANDARD | FP_ARITH_INST_RETIRED2.SCALAR_HALF
0xCF | 0x02 | STANDARD | FP_ARITH_INST_RETIRED2.COMPLEX_SCALAR_HALF
0xCF | 0x03 | STANDARD | FP_ARITH_INST_RETIRED2.SCALAR
0xCF | 0x04 | STANDARD | FP_ARITH_INST_RETIRED2.128B_PACKED_HALF
0xCF | 0x08 | STANDARD | FP_ARITH_INST_RETIRED2.256B_PACKED_HALF
0xCF | 0x10 | STANDARD | FP_ARITH_INST_RETIRED2.512B_PACKED_HALF
0xCF | 0x1C | STANDARD | FP_ARITH_INST_RETIRED2.VECTOR
0xD0 | 0x09 | STANDARD | MEM_INST_RETIRED.STLB_HIT_LOADS
0xD0 | 0x0A | STANDARD | MEM_INST_RETIRED.STLB_HIT_STORES
0xD0 | 0x11 | STANDARD | MEM_INST_RETIRED.STLB_MISS_LOADS
0xD0 | 0x12 | STANDARD | MEM_INST_RETIRED.STLB_MISS_STORES
0xD0 | 0x21 | STANDARD | MEM_INST_RETIRED.LOCK_LOADS
0xD0 | 0x41 | STANDARD | MEM_INST_RETIRED.SPLIT_LOADS
0xD0 | 0x42 | STANDARD | MEM_INST_RETIRED.SPLIT_STORES
0xD0 | 0x81 | STANDARD | MEM_INST_RETIRED.ALL_LOADS
0xD0 | 0x82 | STANDARD | MEM_INST_RETIRED.ALL_STORES
0xD0 | 0x83 | STANDARD | MEM_INST_RETIRED.ANY
0xD1 | 0x01 | STANDARD | MEM_LOAD_RETIRED.L1_HIT
0xD1 | 0x02 | STANDARD | MEM_LOAD_RETIRED.L2_HIT
0xD1 | 0x04 | ENHANCED | MEM_LOAD_RETIRED.L3_HIT
0xD1 | 0x08 | STANDARD | MEM_LOAD_RETIRED.L1_MISS
0xD1 | 0x10 | STANDARD | MEM_LOAD_RETIRED.L2_MISS
0xD1 | 0x20 | ENHANCED | MEM_LOAD_RETIRED.L3_MISS
0xD1 | 0x40 | STANDARD | MEM_LOAD_RETIRED.FB_HIT
0xD2 | 0x01 | ENHANCED | MEM_LOAD_L3_HIT_RETIRED.XSNP_MISS
0xD2 | 0x02 | ENHANCED | MEM_LOAD_L3_HIT_RETIRED.XSNP_NO_FWD
0xD2 | 0x04 | ENHANCED | MEM_LOAD_L3_HIT_RETIRED.XSNP_FWD
0xD2 | 0x08 | ENHANCED | MEM_LOAD_L3_HIT_RETIRED.XSNP_NONE
0xD3 | 0x01 | ENHANCED | MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM
0xD3 | 0x02 | ENHANCED | MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM
0xD3 | 0x04 | ENHANCED | MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM
0xD3 | 0x08 | ENHANCED | MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD
0xD4 | 0x04 | STANDARD | MEM_LOAD_MISC_RETIRED.UC
0xE0 | 0x20 | STANDARD | MISC2_RETIRED.LFENCE
0xE5 | 0x03 | STANDARD | MEM_UOP_RETIRED.ANY
0xE7 | 0x03 | STANDARD | INT_VEC_RETIRED.ADD_128
0xE7 | 0x0C | STANDARD | INT_VEC_RETIRED.ADD_256
0xE7 | 0x10 | STANDARD | INT_VEC_RETIRED.VNNI_128
0xE7 | 0x13 | STANDARD | INT_VEC_RETIRED.128BIT
0xE7 | 0x20 | STANDARD | INT_VEC_RETIRED.VNNI_256
0xE7 | 0x40 | STANDARD | INT_VEC_RETIRED.SHUFFLES
0xE7 | 0x80 | STANDARD | INT_VEC_RETIRED.MUL_256
0xE7 | 0xAC | STANDARD | INT_VEC_RETIRED.256BIT
0xEC | 0x10 | STANDARD | CPU_CLK_UNHALTED.C01
0xEC | 0x20 | STANDARD | CPU_CLK_UNHALTED.C02
0xEC | 0x40 | STANDARD | CPU_CLK_UNHALTED.PAUSE
0xEC | 0x70 | STANDARD | CPU_CLK_UNHALTED.C0_WAIT

NVIDIA Grace

The following table lists the supported PMU events for NVIDIA Grace processors with Arm Neoverse V2 cores.

Event code | PMU tracking type | Event name
0x0 | ARCHITECTURAL | SW_INCR
0x3 | ARCHITECTURAL | L1D_CACHE_REFILL
0x4 | ARCHITECTURAL | L1D_CACHE
0x8 | ARCHITECTURAL | INST_RETIRED
0x10 | ARCHITECTURAL | BR_MIS_PRED
0x11 | ARCHITECTURAL | CPU_CYCLES
0x12 | ARCHITECTURAL | BR_PRED
0x1b | ARCHITECTURAL | INST_SPEC
0x1e | ARCHITECTURAL | CHAIN
0x23 | ARCHITECTURAL | STALL_FRONTEND
0x24 | ARCHITECTURAL | STALL_BACKEND
0x39 | ARCHITECTURAL | L1D_CACHE_LMISS_RD
0x3c | ARCHITECTURAL | STALL
0x40 | ARCHITECTURAL | L1D_CACHE_RD
0x4006 | ARCHITECTURAL | L1I_CACHE_LMISS
0x8006 | ARCHITECTURAL | SVE_INST_SPEC
0x0 | STANDARD | SW_INCR
0x1 | STANDARD | L1I_CACHE_REFILL
0x2 | STANDARD | L1I_TLB_REFILL
0x3 | STANDARD | L1D_CACHE_REFILL
0x4 | STANDARD | L1D_CACHE
0x5 | STANDARD | L1D_TLB_REFILL
0x8 | STANDARD | INST_RETIRED
0x9 | STANDARD | EXC_TAKEN
0xa | STANDARD | EXC_RETURN
0xb | STANDARD | CID_WRITE_RETIRED
0x10 | STANDARD | BR_MIS_PRED
0x11 | STANDARD | CPU_CYCLES
0x12 | STANDARD | BR_PRED
0x13 | STANDARD | MEM_ACCESS
0x14 | STANDARD | L1I_CACHE
0x15 | STANDARD | L1D_CACHE_WB
0x16 | STANDARD | L2D_CACHE
0x17 | STANDARD | L2D_CACHE_REFILL
0x18 | STANDARD | L2D_CACHE_WB
0x19 | STANDARD | BUS_ACCESS
0x1b | STANDARD | INST_SPEC
0x1c | STANDARD | TTBR_WRITE_RETIRED
0x1d | STANDARD | BUS_CYCLES
0x1e | STANDARD | CHAIN
0x20 | STANDARD | L2D_CACHE_ALLOCATE
0x21 | STANDARD | BR_RETIRED
0x22 | STANDARD | BR_MIS_PRED_RETIRED
0x23 | STANDARD | STALL_FRONTEND
0x24 | STANDARD | STALL_BACKEND
0x25 | STANDARD | L1D_TLB
0x26 | STANDARD | L1I_TLB
0x2d | STANDARD | L2D_TLB_REFILL
0x2f | STANDARD | L2D_TLB
0x31 | STANDARD | REMOTE_ACCESS
0x34 | STANDARD | DTLB_WALK
0x35 | STANDARD | ITLB_WALK
0x39 | STANDARD | L1D_CACHE_LMISS_RD
0x3a | STANDARD | OP_RETIRED
0x3b | STANDARD | OP_SPEC
0x3c | STANDARD | STALL
0x3d | STANDARD | STALL_SLOT_BACKEND
0x3e | STANDARD | STALL_SLOT_FRONTEND
0x3f | STANDARD | STALL_SLOT
0x40 | STANDARD | L1D_CACHE_RD
0x41 | STANDARD | L1D_CACHE_WR
0x42 | STANDARD | L1D_CACHE_REFILL_RD
0x43 | STANDARD | L1D_CACHE_REFILL_WR
0x44 | STANDARD | L1D_CACHE_REFILL_INNER
0x45 | STANDARD | L1D_CACHE_REFILL_OUTER
0x46 | STANDARD | L1D_CACHE_WB_VICTIM
0x47 | STANDARD | L1D_CACHE_WB_CLEAN
0x48 | STANDARD | L1D_CACHE_INVAL
0x4c | STANDARD | L1D_TLB_REFILL_RD
0x4d | STANDARD | L1D_TLB_REFILL_WR
0x4e | STANDARD | L1D_TLB_RD
0x4f | STANDARD | L1D_TLB_WR
0x50 | STANDARD | L2D_CACHE_RD
0x51 | STANDARD | L2D_CACHE_WR
0x52 | STANDARD | L2D_CACHE_REFILL_RD
0x53 | STANDARD | L2D_CACHE_REFILL_WR
0x56 | STANDARD | L2D_CACHE_WB_VICTIM
0x57 | STANDARD | L2D_CACHE_WB_CLEAN
0x58 | STANDARD | L2D_CACHE_INVAL
0x5c | STANDARD | L2D_TLB_REFILL_RD
0x5d | STANDARD | L2D_TLB_REFILL_WR
0x5e | STANDARD | L2D_TLB_RD
0x5f | STANDARD | L2D_TLB_WR
0x60 | STANDARD | BUS_ACCESS_RD
0x61 | STANDARD | BUS_ACCESS_WR
0x66 | STANDARD | MEM_ACCESS_RD
0x67 | STANDARD | MEM_ACCESS_WR
0x68 | STANDARD | UNALIGNED_LD_SPEC
0x69 | STANDARD | UNALIGNED_ST_SPEC
0x6a | STANDARD | UNALIGNED_LDST_SPEC
0x6c | STANDARD | LDREX_SPEC
0x6d | STANDARD | STREX_PASS_SPEC
0x6e | STANDARD | STREX_FAIL_SPEC
0x6f | STANDARD | STREX_SPEC
0x70 | STANDARD | LD_SPEC
0x71 | STANDARD | ST_SPEC
0x73 | STANDARD | DP_SPEC
0x74 | STANDARD | ASE_SPEC
0x75 | STANDARD | VFP_SPEC
0x76 | STANDARD | PC_WRITE_SPEC
0x77 | STANDARD | CRYPTO_SPEC
0x78 | STANDARD | BR_IMMED_SPEC
0x79 | STANDARD | BR_RETURN_SPEC
0x7a | STANDARD | BR_INDIRECT_SPEC
0x7c | STANDARD | ISB_SPEC
0x7d | STANDARD | DSB_SPEC
0x7e | STANDARD | DMB_SPEC
0x81 | STANDARD | EXC_UNDEF
0x82 | STANDARD | EXC_SVC
0x83 | STANDARD | EXC_PABORT
0x84 | STANDARD | EXC_DABORT
0x86 | STANDARD | EXC_IRQ
0x87 | STANDARD | EXC_FIQ
0x88 | STANDARD | EXC_SMC
0x8a | STANDARD | EXC_HVC
0x8b | STANDARD | EXC_TRAP_PABORT
0x8c | STANDARD | EXC_TRAP_DABORT
0x8d | STANDARD | EXC_TRAP_OTHER
0x8e | STANDARD | EXC_TRAP_IRQ
0x8f | STANDARD | EXC_TRAP_FIQ
0x90 | STANDARD | RC_LD_SPEC
0x91 | STANDARD | RC_ST_SPEC
0x4004 | STANDARD | CNT_CYCLES
0x4005 | STANDARD | STALL_BACKEND_MEM
0x4006 | STANDARD | L1I_CACHE_LMISS
0x4009 | STANDARD | L2D_CACHE_LMISS_RD
0x4020 | STANDARD | LDST_ALIGN_LAT
0x4021 | STANDARD | LD_ALIGN_LAT
0x4022 | STANDARD | ST_ALIGN_LAT
0x8005 | STANDARD | ASE_INST_SPEC
0x8006 | STANDARD | SVE_INST_SPEC
0x8014 | STANDARD | FP_HP_SPEC
0x8018 | STANDARD | FP_SP_SPEC
0x801c | STANDARD | FP_DP_SPEC
0x8074 | STANDARD | SVE_PRED_SPEC
0x8075 | STANDARD | SVE_PRED_EMPTY_SPEC
0x8076 | STANDARD | SVE_PRED_FULL_SPEC
0x8077 | STANDARD | SVE_PRED_PARTIAL_SPEC
0x8079 | STANDARD | SVE_PRED_NOT_FULL_SPEC
0x80bc | STANDARD | SVE_LDFF_SPEC
0x80bd | STANDARD | SVE_LDFF_FAULT_SPEC
0x80c0 | STANDARD | FP_SCALE_OPS_SPEC
0x80c1 | STANDARD | FP_FIXED_OPS_SPEC
0x80e3 | STANDARD | ASE_SVE_INT8_SPEC
0x80e7 | STANDARD | ASE_SVE_INT16_SPEC
0x80eb | STANDARD | ASE_SVE_INT32_SPEC
0x80ef | STANDARD | ASE_SVE_INT64_SPEC

Pricing

There are no costs associated with using the PMU in a compute instance.

Limitations

You can enable the Enhanced (ENHANCED) PMU type only in C4 or M4 instances that use one of the following machine types:

  • Any C4 machine type with 144 or 288 vCPUs
  • One of the following M4 machine types:
    • m4-megamem-112
    • m4-megamem-224
    • m4-ultramem-56
    • m4-ultramem-112
    • m4-ultramem-224
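
The preceding rule can be checked in a script before you request the Enhanced PMU type. The following is a minimal sketch: the `supports_enhanced_pmu` helper is a hypothetical name, the vCPU count is passed in explicitly rather than parsed from the machine type name, and the C4 machine type name in the usage line is only an example.

```python
# vCPU counts and machine types taken from the limitation above.
ENHANCED_C4_VCPUS = {144, 288}
ENHANCED_M4_MACHINE_TYPES = {
    "m4-megamem-112",
    "m4-megamem-224",
    "m4-ultramem-56",
    "m4-ultramem-112",
    "m4-ultramem-224",
}


def supports_enhanced_pmu(machine_type: str, vcpus: int) -> bool:
    """Return True if the Enhanced PMU type can be enabled for this machine type."""
    name = machine_type.lower()
    if name.startswith("c4-"):  # excludes C4A, whose names start with "c4a-"
        return vcpus in ENHANCED_C4_VCPUS
    if name.startswith("m4-"):
        return name in ENHANCED_M4_MACHINE_TYPES
    return False


print(supports_enhanced_pmu("c4-standard-144", 144))
print(supports_enhanced_pmu("m4-ultramem-56", 56))
```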
