linux-loongson/tools/perf/pmu-events/arch/x86/amdzen3/floating-point.json
Smita Koralahalli da66658638 perf vendor events amd: Add Zen3 events
Add PMU events for AMD Zen3 processors as documented in the AMD Processor
Programming Reference for Family 19h and Model 01h [1].

Below are the events which are new on Zen3:

  PMCx041 ls_mab_alloc.{all_allocations|hardware_prefetcher_allocations|load_store_allocations}
  PMCx043 ls_dmnd_fills_from_sys.ext_cache_local
  PMCx044 ls_any_fills_from_sys.{mem_io_remote|ext_cache_remote|mem_io_local|ext_cache_local|int_cache|lcl_l2}
  PMCx047 ls_misal_loads.{ma4k|ma64}
  PMCx059 ls_sw_pf_dc_fills.ext_cache_local
  PMCx05a ls_hw_pf_dc_fills.ext_cache_local
  PMCx05f ls_alloc_mab_count
  PMCx085 bp_l1_tlb_miss_l2_tlb_miss.coalesced_4k
  PMCx0ab de_dis_cops_from_decoder.disp_op_type.{any_integer_dispatch|any_fp_dispatch}
  PMCx0cc ex_ret_ind_brch_instr
  PMCx18e ic_tag_hit_miss.{all_instruction_cache_accesses|instruction_cache_miss|instruction_cache_hit}
  PMCx1c7 ex_ret_msprd_brnch_instr_dir_msmtch
  PMCx28f op_cache_hit_miss.{all_op_cache_accesses|op_cache_miss|op_cache_hit}

Section 2.1.17.2 "Performance Measurement" of "PPR for AMD Family 19h,
Model 01h, Revision B1 Processors - 55898 Rev 0.35 - Feb 5, 2021." lists
new metrics. Add them.

Preserve the events for Zen3 if they are measurable and non-zero as taken
from Zen2 directory even if the PPR of Zen3 [1] omits them. Those events
are the following:

  PMCx000 fpu_pipe_assignment.{total|total0|total1|total2|total3}
  PMCx004 fp_num_mov_elim_scal_op.{optimized|opt_potential|sse_mov_ops_elim|sse_mov_ops}
  PMCx02D ls_rdtsc
  PMCx040 ls_dc_accesses
  PMCx046 ls_tablewalker.{iside|ic_type1|ic_type0|dside|dc_type1|dc_type0}
  PMCx061 l2_request_g2.{group1|ls_rd_sized|ls_rd_sized_nc|ic_rd_sized|ic_rd_sized_nc|smc_inval|bus_lock_originator|bus_locks_responses}
  PMCx062 l2_latency.l2_cycles_waiting_on_fills
  PMCx063 l2_wcb_req.{wcb_write|wcb_close|zero_byte_store|cl_zero}
  PMCx06d l2_fill_pending.l2_fill_busy
  PMCx080 ic_fw32
  PMCx081 ic_fw32_miss
  PMCx086 bp_snp_re_sync
  PMCx087 ic_fetch_stall.{ic_stall_any|ic_stall_dq_empty|ic_stall_back_pressure}
  PMCx08a bp_l1_btb_correct
  PMCx08c ic_cache_inval.{l2_invalidating_probe|fill_invalidated}
  PMCx099 bp_tlb_rel
  PMCx0a9 de_dis_uop_queue_empty_di0
  PMCx0c7 ex_ret_brn_resync
  PMCx28a ic_oc_mode_switch.{oc_ic_mode_switch|ic_oc_mode_switch}
  L3PMCx01 l3_request_g1.caching_l3_cache_accesses
  L3PMCx06 l3_comb_clstr_state.{other_l3_miss_typs|request_miss}

[1] Processor Programming Reference (PPR) for AMD Family 19h, Model 01h,
Revision B1 Processors - 55898 Rev 0.35 - Feb 5, 2021.

[2] Processor Programming Reference (PPR) for AMD Family 17h Model 71h,
Revision B0 Processors, 56176 Rev 3.06 - Jul 17, 2019.

[3] Processor Programming Reference (PPR) for AMD Family 17h Models
01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019.

All of the PPRs can be found at:

https://bugzilla.kernel.org/show_bug.cgi?id=206537

Reviewed-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: linux-perf-users@vger.kernel.org
Link: https://lore.kernel.org/r/20210406215944.113332-5-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-04-08 14:24:39 -03:00

140 lines
8.5 KiB
JSON

[
{
"EventName": "fpu_pipe_assignment.total",
"EventCode": "0x00",
"BriefDescription": "Total number of fp uOps.",
"PublicDescription": "Total number of fp uOps. The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS.",
"UMask": "0x0f"
},
{
"EventName": "fpu_pipe_assignment.total3",
"EventCode": "0x00",
"BriefDescription": "Total number uOps assigned to pipe 3.",
"PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one-cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to pipe 3.",
"UMask": "0x08"
},
{
"EventName": "fpu_pipe_assignment.total2",
"EventCode": "0x00",
"BriefDescription": "Total number uOps assigned to pipe 2.",
"PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to pipe 2.",
"UMask": "0x04"
},
{
"EventName": "fpu_pipe_assignment.total1",
"EventCode": "0x00",
"BriefDescription": "Total number uOps assigned to pipe 1.",
"PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to pipe 1.",
"UMask": "0x02"
},
{
"EventName": "fpu_pipe_assignment.total0",
"EventCode": "0x00",
"BriefDescription": "Total number of fp uOps on pipe 0.",
"PublicDescription": "The number of operations (uOps) dispatched to each of the 4 FPU execution pipelines. This event reflects how busy the FPU pipelines are and may be used for workload characterization. This includes all operations performed by x87, MMX, and SSE instructions, including moves. Each increment represents a one- cycle dispatch event. This event is a speculative event. Since this event includes non-numeric operations it is not suitable for measuring MFLOPS. Total number uOps assigned to pipe 0.",
"UMask": "0x01"
},
{
"EventName": "fp_ret_sse_avx_ops.all",
"EventCode": "0x03",
"BriefDescription": "All FLOPS. This is a retire-based event. The number of retired SSE/AVX FLOPS. The number of events logged per cycle can vary from 0 to 64. This event can count above 15.",
"UMask": "0xff"
},
{
"EventName": "fp_ret_sse_avx_ops.mac_flops",
"EventCode": "0x03",
"BriefDescription": "Multiply-Accumulate FLOPs. Each MAC operation is counted as 2 FLOPS. This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event requires the use of the MergeEvent since it can count above 15 events per cycle. See 2.1.17.3 [Large Increment per Cycle Events]. It does not provide a useful count without the use of the MergeEvent.",
"UMask": "0x08"
},
{
"EventName": "fp_ret_sse_avx_ops.div_flops",
"EventCode": "0x03",
"BriefDescription": "Divide/square root FLOPs. This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event requires the use of the MergeEvent since it can count above 15 events per cycle. See 2.1.17.3 [Large Increment per Cycle Events]. It does not provide a useful count without the use of the MergeEvent.",
"UMask": "0x04"
},
{
"EventName": "fp_ret_sse_avx_ops.mult_flops",
"EventCode": "0x03",
"BriefDescription": "Multiply FLOPs. This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event requires the use of the MergeEvent since it can count above 15 events per cycle. See 2.1.17.3 [Large Increment per Cycle Events]. It does not provide a useful count without the use of the MergeEvent.",
"UMask": "0x02"
},
{
"EventName": "fp_ret_sse_avx_ops.add_sub_flops",
"EventCode": "0x03",
"BriefDescription": "Add/subtract FLOPs. This is a retire-based event. The number of retired SSE/AVX FLOPs. The number of events logged per cycle can vary from 0 to 64. This event requires the use of the MergeEvent since it can count above 15 events per cycle. See 2.1.17.3 [Large Increment per Cycle Events]. It does not provide a useful count without the use of the MergeEvent.",
"UMask": "0x01"
},
{
"EventName": "fp_num_mov_elim_scal_op.optimized",
"EventCode": "0x04",
"BriefDescription": "Number of Scalar Ops optimized. This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes.",
"UMask": "0x08"
},
{
"EventName": "fp_num_mov_elim_scal_op.opt_potential",
"EventCode": "0x04",
"BriefDescription": "Number of Ops that are candidates for optimization (have Z-bit either set or pass). This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes.",
"UMask": "0x04"
},
{
"EventName": "fp_num_mov_elim_scal_op.sse_mov_ops_elim",
"EventCode": "0x04",
"BriefDescription": "Number of SSE Move Ops eliminated. This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes.",
"UMask": "0x02"
},
{
"EventName": "fp_num_mov_elim_scal_op.sse_mov_ops",
"EventCode": "0x04",
"BriefDescription": "Number of SSE Move Ops. This is a dispatch based speculative event, and is useful for measuring the effectiveness of the Move elimination and Scalar code optimization schemes.",
"UMask": "0x01"
},
{
"EventName": "fp_retired_ser_ops.sse_bot_ret",
"EventCode": "0x05",
"BriefDescription": "SSE/AVX bottom-executing ops retired. The number of serializing Ops retired.",
"UMask": "0x08"
},
{
"EventName": "fp_retired_ser_ops.sse_ctrl_ret",
"EventCode": "0x05",
"BriefDescription": "SSE/AVX control word mispredict traps. The number of serializing Ops retired.",
"UMask": "0x04"
},
{
"EventName": "fp_retired_ser_ops.x87_bot_ret",
"EventCode": "0x05",
"BriefDescription": "x87 bottom-executing ops retired. The number of serializing Ops retired.",
"UMask": "0x02"
},
{
"EventName": "fp_retired_ser_ops.x87_ctrl_ret",
"EventCode": "0x05",
"BriefDescription": "x87 control word mispredict traps due to mispredictions in RC or PC, or changes in mask bits. The number of serializing Ops retired.",
"UMask": "0x01"
},
{
"EventName": "fp_disp_faults.ymm_spill_fault",
"EventCode": "0x0e",
"BriefDescription": "Floating Point Dispatch Faults. YMM spill fault.",
"UMask": "0x08"
},
{
"EventName": "fp_disp_faults.ymm_fill_fault",
"EventCode": "0x0e",
"BriefDescription": "Floating Point Dispatch Faults. YMM fill fault.",
"UMask": "0x04"
},
{
"EventName": "fp_disp_faults.xmm_fill_fault",
"EventCode": "0x0e",
"BriefDescription": "Floating Point Dispatch Faults. XMM fill fault.",
"UMask": "0x02"
},
{
"EventName": "fp_disp_faults.x87_fill_fault",
"EventCode": "0x0e",
"BriefDescription": "Floating Point Dispatch Faults. x87 fill fault.",
"UMask": "0x01"
}
]