* [PATCH v2 0/9] Update Intel events and make optane events dynamic
From: Ian Rogers @ 2023-03-23 19:20 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Zhengjun Xing, Kan Liang, linux-perf-users, linux-kernel,
Edward Baker, Dan Williams, perry.taylor, caleb.biggers,
samantha.alt, weilin.wang
Cc: Ian Rogers
Update events from:
https://github.com/intel/perfmon/pull/62
Add a new #has_optane literal so that Optane metrics can decide at
runtime whether Optane events are enabled. Update the CLX, ICX and SPR
metrics to use it, based on this PR:
https://github.com/intel/perfmon/pull/63
Ian Rogers (9):
perf vendor events: Broadwell v27 events
perf vendor events: Broadwellde v9 events
perf vendor events: Broadwellx v20 events
perf vendor events: Haswell v33 events
perf vendor events: Haswellx v27 events
perf vendor events: Jaketown v23 events
perf vendor events: Sandybridge v19 events
perf metrics: Add has_optane literal
perf vendor events: Update metrics to detect optane memory at runtime
.../pmu-events/arch/x86/broadwell/cache.json | 296 +++++-----
.../arch/x86/broadwell/floating-point.json | 7 +
.../arch/x86/broadwell/frontend.json | 18 +-
.../pmu-events/arch/x86/broadwell/memory.json | 248 ++++-----
.../arch/x86/broadwell/pipeline.json | 22 +-
.../arch/x86/broadwell/uncore-other.json | 2 +-
.../arch/x86/broadwellde/cache.json | 105 ++--
.../arch/x86/broadwellde/floating-point.json | 45 +-
.../arch/x86/broadwellde/frontend.json | 18 +-
.../arch/x86/broadwellde/memory.json | 64 ++-
.../arch/x86/broadwellde/pipeline.json | 79 +--
.../arch/x86/broadwellde/uncore-cache.json | 72 +--
.../arch/x86/broadwellde/uncore-memory.json | 256 ++++++++-
.../arch/x86/broadwellde/uncore-other.json | 27 +-
.../arch/x86/broadwellde/uncore-power.json | 10 +-
.../pmu-events/arch/x86/broadwellx/cache.json | 16 +-
.../arch/x86/broadwellx/frontend.json | 18 +-
.../arch/x86/broadwellx/pipeline.json | 20 +-
.../arch/x86/broadwellx/uncore-cache.json | 156 ++----
.../x86/broadwellx/uncore-interconnect.json | 84 +--
.../arch/x86/broadwellx/uncore-memory.json | 522 +++++++++---------
.../arch/x86/broadwellx/uncore-other.json | 44 +-
.../arch/x86/broadwellx/uncore-power.json | 10 +-
.../arch/x86/cascadelakex/clx-metrics.json | 10 +-
.../pmu-events/arch/x86/haswell/cache.json | 38 +-
.../pmu-events/arch/x86/haswell/memory.json | 38 +-
.../pmu-events/arch/x86/haswell/pipeline.json | 8 +
.../pmu-events/arch/x86/haswellx/cache.json | 2 +-
.../arch/x86/haswellx/pipeline.json | 8 +
.../arch/x86/haswellx/uncore-cache.json | 16 +-
.../arch/x86/haswellx/uncore-other.json | 6 +-
.../arch/x86/icelakex/icx-metrics.json | 10 +-
.../arch/x86/jaketown/pipeline.json | 8 +
tools/perf/pmu-events/arch/x86/mapfile.csv | 14 +-
.../arch/x86/sandybridge/pipeline.json | 8 +
.../arch/x86/sapphirerapids/spr-metrics.json | 10 +-
tools/perf/util/expr.c | 19 +
37 files changed, 1323 insertions(+), 1011 deletions(-)
--
2.40.0.348.gf938b09366-goog
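
The idea behind a runtime literal such as #has_optane can be sketched as follows. This is only an illustration in Python, not the code added to tools/perf/util/expr.c: the probe path and the event names are assumptions chosen for the example, and the real detection logic may differ.

```python
import os
from functools import lru_cache

# ASSUMPTION: a filesystem probe stands in for "is Optane memory present?".
# This path is illustrative only and is not taken from the patch series.
ASSUMED_PROBE_PATH = "/sys/firmware/acpi/tables/NFIT"


@lru_cache(maxsize=None)
def has_optane(probe_path: str = ASSUMED_PROBE_PATH) -> float:
    """Resolve the literal to 1.0 or 0.0; the probe runs once per path."""
    return 1.0 if os.path.exists(probe_path) else 0.0


def events_for_metric(optane: float) -> list:
    """Pick the events a metric needs, depending on the literal's value.

    The event names below are hypothetical placeholders.
    """
    events = ["dram_reads", "dram_writes"]
    if optane > 0:
        # Only request Optane events when the literal says they can count.
        events += ["optane_reads", "optane_writes"]
    return events


if __name__ == "__main__":
    print(events_for_metric(has_optane()))
```

Caching the probe result keeps repeated metric evaluations from touching the filesystem each time, which is one reasonable way to make such a literal cheap to use.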
* [PATCH v2 4/9] perf vendor events: Haswell v33 events
From: Ian Rogers @ 2023-03-23 19:20 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Zhengjun Xing, Kan Liang, linux-perf-users, linux-kernel,
Edward Baker, Dan Williams, perry.taylor, caleb.biggers,
samantha.alt, weilin.wang
Cc: Ian Rogers
Updates descriptions and encodings. Adds the BR_MISP_EXEC.INDIRECT event.
Signed-off-by: Ian Rogers <irogers@google.com>
---
.../pmu-events/arch/x86/haswell/cache.json | 38 +++++++++----------
.../pmu-events/arch/x86/haswell/memory.json | 38 +++++++++----------
.../pmu-events/arch/x86/haswell/pipeline.json | 8 ++++
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
4 files changed, 47 insertions(+), 39 deletions(-)
diff --git a/tools/perf/pmu-events/arch/x86/haswell/cache.json b/tools/perf/pmu-events/arch/x86/haswell/cache.json
index 5a1489e79859..0831f14b3cc6 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/cache.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/cache.json
@@ -8,7 +8,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
+ "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers unavailability.",
"CounterMask": "1",
"EventCode": "0x48",
"EventName": "L1D_PEND_MISS.FB_FULL",
@@ -643,7 +643,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch code readshit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
+ "BriefDescription": "Counts all demand & prefetch code reads hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_CODE_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
"MSRIndex": "0x1a6,0x1a7",
@@ -652,7 +652,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch data readshit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
+ "BriefDescription": "Counts all demand & prefetch data reads hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.HITM_OTHER_CORE",
"MSRIndex": "0x1a6,0x1a7",
@@ -661,7 +661,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch data readshit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
+ "BriefDescription": "Counts all demand & prefetch data reads hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
"MSRIndex": "0x1a6,0x1a7",
@@ -688,7 +688,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all requestshit in the L3",
+ "BriefDescription": "Counts all requests hit in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_REQUESTS.L3_HIT.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -697,7 +697,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch RFOshit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
+ "BriefDescription": "Counts all demand & prefetch RFOs hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.HITM_OTHER_CORE",
"MSRIndex": "0x1a6,0x1a7",
@@ -706,7 +706,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch RFOshit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
+ "BriefDescription": "Counts all demand & prefetch RFOs hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD",
"MSRIndex": "0x1a6,0x1a7",
@@ -715,7 +715,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand code readshit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
+ "BriefDescription": "Counts all demand code reads hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.HITM_OTHER_CORE",
"MSRIndex": "0x1a6,0x1a7",
@@ -724,7 +724,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand code readshit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
+ "BriefDescription": "Counts all demand code reads hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
"MSRIndex": "0x1a6,0x1a7",
@@ -733,7 +733,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts demand data readshit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
+ "BriefDescription": "Counts demand data reads hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.HITM_OTHER_CORE",
"MSRIndex": "0x1a6,0x1a7",
@@ -742,7 +742,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts demand data readshit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
+ "BriefDescription": "Counts demand data reads hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
"MSRIndex": "0x1a6,0x1a7",
@@ -751,7 +751,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand data writes (RFOs)hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
+ "BriefDescription": "Counts all demand data writes (RFOs) hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.HITM_OTHER_CORE",
"MSRIndex": "0x1a6,0x1a7",
@@ -760,7 +760,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand data writes (RFOs)hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
+ "BriefDescription": "Counts all demand data writes (RFOs) hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD",
"MSRIndex": "0x1a6,0x1a7",
@@ -769,7 +769,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all prefetch (that bring data to LLC only) code readshit in the L3",
+ "BriefDescription": "Counts all prefetch (that bring data to LLC only) code reads hit in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L2_CODE_RD.L3_HIT.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -778,7 +778,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts prefetch (that bring data to L2) data readshit in the L3",
+ "BriefDescription": "Counts prefetch (that bring data to L2) data reads hit in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -787,7 +787,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all prefetch (that bring data to L2) RFOshit in the L3",
+ "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs hit in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -796,7 +796,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts prefetch (that bring data to LLC only) code readshit in the L3",
+ "BriefDescription": "Counts prefetch (that bring data to LLC only) code reads hit in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L3_CODE_RD.L3_HIT.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -805,7 +805,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all prefetch (that bring data to LLC only) data readshit in the L3",
+ "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads hit in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -814,7 +814,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOshit in the L3",
+ "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs hit in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
diff --git a/tools/perf/pmu-events/arch/x86/haswell/memory.json b/tools/perf/pmu-events/arch/x86/haswell/memory.json
index 9fb63e1dab08..2fc25e22a42a 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/memory.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/memory.json
@@ -179,7 +179,7 @@
"UMask": "0x2"
},
{
- "BriefDescription": "Counts all demand & prefetch code readsmiss in the L3",
+ "BriefDescription": "Counts all demand & prefetch code reads miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_CODE_RD.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -188,7 +188,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch code readsmiss the L3 and the data is returned from local dram",
+ "BriefDescription": "Counts all demand & prefetch code reads miss the L3 and the data is returned from local dram",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_CODE_RD.L3_MISS.LOCAL_DRAM",
"MSRIndex": "0x1a6,0x1a7",
@@ -197,7 +197,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch data readsmiss in the L3",
+ "BriefDescription": "Counts all demand & prefetch data reads miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -206,7 +206,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch data readsmiss the L3 and the data is returned from local dram",
+ "BriefDescription": "Counts all demand & prefetch data reads miss the L3 and the data is returned from local dram",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.LOCAL_DRAM",
"MSRIndex": "0x1a6,0x1a7",
@@ -233,7 +233,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all requestsmiss in the L3",
+ "BriefDescription": "Counts all requests miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_REQUESTS.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -242,7 +242,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch RFOsmiss in the L3",
+ "BriefDescription": "Counts all demand & prefetch RFOs miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -251,7 +251,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand & prefetch RFOsmiss the L3 and the data is returned from local dram",
+ "BriefDescription": "Counts all demand & prefetch RFOs miss the L3 and the data is returned from local dram",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.LOCAL_DRAM",
"MSRIndex": "0x1a6,0x1a7",
@@ -260,7 +260,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand code readsmiss in the L3",
+ "BriefDescription": "Counts all demand code reads miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -269,7 +269,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand code readsmiss the L3 and the data is returned from local dram",
+ "BriefDescription": "Counts all demand code reads miss the L3 and the data is returned from local dram",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.LOCAL_DRAM",
"MSRIndex": "0x1a6,0x1a7",
@@ -278,7 +278,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts demand data readsmiss in the L3",
+ "BriefDescription": "Counts demand data reads miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -287,7 +287,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts demand data readsmiss the L3 and the data is returned from local dram",
+ "BriefDescription": "Counts demand data reads miss the L3 and the data is returned from local dram",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.LOCAL_DRAM",
"MSRIndex": "0x1a6,0x1a7",
@@ -296,7 +296,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand data writes (RFOs)miss in the L3",
+ "BriefDescription": "Counts all demand data writes (RFOs) miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -305,7 +305,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all demand data writes (RFOs)miss the L3 and the data is returned from local dram",
+ "BriefDescription": "Counts all demand data writes (RFOs) miss the L3 and the data is returned from local dram",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.LOCAL_DRAM",
"MSRIndex": "0x1a6,0x1a7",
@@ -314,7 +314,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all prefetch (that bring data to LLC only) code readsmiss in the L3",
+ "BriefDescription": "Counts all prefetch (that bring data to LLC only) code reads miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L2_CODE_RD.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -323,7 +323,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts prefetch (that bring data to L2) data readsmiss in the L3",
+ "BriefDescription": "Counts prefetch (that bring data to L2) data reads miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -332,7 +332,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all prefetch (that bring data to L2) RFOsmiss in the L3",
+ "BriefDescription": "Counts all prefetch (that bring data to L2) RFOs miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -341,7 +341,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts prefetch (that bring data to LLC only) code readsmiss in the L3",
+ "BriefDescription": "Counts prefetch (that bring data to LLC only) code reads miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L3_CODE_RD.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -350,7 +350,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all prefetch (that bring data to LLC only) data readsmiss in the L3",
+ "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
@@ -359,7 +359,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOsmiss in the L3",
+ "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs miss in the L3",
"EventCode": "0xB7, 0xBB",
"EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.ANY_RESPONSE",
"MSRIndex": "0x1a6,0x1a7",
diff --git a/tools/perf/pmu-events/arch/x86/haswell/pipeline.json b/tools/perf/pmu-events/arch/x86/haswell/pipeline.json
index 9ac36c1c24b6..540f4372623c 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/pipeline.json
@@ -194,6 +194,14 @@
"SampleAfterValue": "200003",
"UMask": "0xc4"
},
+ {
+ "BriefDescription": "Speculative mispredicted indirect branches",
+ "EventCode": "0x89",
+ "EventName": "BR_MISP_EXEC.INDIRECT",
+ "PublicDescription": "Counts speculatively miss-predicted indirect branches at execution time. Counts for indirect near CALL or JMP instructions (RET excluded).",
+ "SampleAfterValue": "200003",
+ "UMask": "0xe4"
+ },
{
"BriefDescription": "Not taken speculative and retired mispredicted macro conditional branches.",
"EventCode": "0x89",
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index dfed265c95ab..927e60f3417d 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -10,7 +10,7 @@ GenuineIntel-6-9[6C],v1.03,elkhartlake,core
GenuineIntel-6-5[CF],v13,goldmont,core
GenuineIntel-6-7A,v1.01,goldmontplus,core
GenuineIntel-6-A[DE],v1.01,graniterapids,core
-GenuineIntel-6-(3C|45|46),v32,haswell,core
+GenuineIntel-6-(3C|45|46),v33,haswell,core
GenuineIntel-6-3F,v26,haswellx,core
GenuineIntel-6-(7D|7E|A7),v1.17,icelake,core
GenuineIntel-6-6[AC],v1.19,icelakex,core
--
2.40.0.348.gf938b09366-goog
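
As a rough illustration of what the new BR_MISP_EXEC.INDIRECT entry encodes, the sketch below packs the JSON fields into an x86 core PMU raw config value, with the event select in bits 0-7, the unit mask in bits 8-15 and the counter mask in bits 24-31. The helper name raw_config is hypothetical; perf's own JSON processing is more involved and handles many more fields.

```python
def raw_config(event_code: str, umask: str, counter_mask: str = "0") -> int:
    """Pack EventCode/UMask/CounterMask strings into a raw config value.

    Hypothetical helper for illustration; not part of perf.
    """
    event = int(event_code, 16)   # e.g. "0x89"
    mask = int(umask, 16)         # e.g. "0xe4"
    cmask = int(counter_mask, 0)  # the JSON stores this as decimal, e.g. "1"
    return (cmask << 24) | (mask << 8) | event


# Fields copied from the entry added to haswell/pipeline.json.
entry = {
    "EventName": "BR_MISP_EXEC.INDIRECT",
    "EventCode": "0x89",
    "UMask": "0xe4",
}

config = raw_config(entry["EventCode"], entry["UMask"])

if __name__ == "__main__":
    print(f"{entry['EventName']}: r{config:x}")  # prints "BR_MISP_EXEC.INDIRECT: re489"
```

The same event can also be written out field by field as cpu/event=0x89,umask=0xe4/ on the perf command line.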
* [PATCH v2 5/9] perf vendor events: Haswellx v27 events
From: Ian Rogers @ 2023-03-23 19:20 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Zhengjun Xing, Kan Liang, linux-perf-users, linux-kernel,
Edward Baker, Dan Williams, perry.taylor, caleb.biggers,
samantha.alt, weilin.wang
Cc: Ian Rogers
Updates descriptions and encodings. Adds the BR_MISP_EXEC.INDIRECT event.
Signed-off-by: Ian Rogers <irogers@google.com>
---
.../perf/pmu-events/arch/x86/haswellx/cache.json | 2 +-
.../pmu-events/arch/x86/haswellx/pipeline.json | 8 ++++++++
.../arch/x86/haswellx/uncore-cache.json | 16 ++++++++--------
.../arch/x86/haswellx/uncore-other.json | 6 +++---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
5 files changed, 21 insertions(+), 13 deletions(-)
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/cache.json b/tools/perf/pmu-events/arch/x86/haswellx/cache.json
index 1836ed62694e..a6c81010b394 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/cache.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/cache.json
@@ -8,7 +8,7 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
+ "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers unavailability.",
"CounterMask": "1",
"EventCode": "0x48",
"EventName": "L1D_PEND_MISS.FB_FULL",
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/pipeline.json b/tools/perf/pmu-events/arch/x86/haswellx/pipeline.json
index 9ac36c1c24b6..540f4372623c 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/pipeline.json
@@ -194,6 +194,14 @@
"SampleAfterValue": "200003",
"UMask": "0xc4"
},
+ {
+ "BriefDescription": "Speculative mispredicted indirect branches",
+ "EventCode": "0x89",
+ "EventName": "BR_MISP_EXEC.INDIRECT",
+ "PublicDescription": "Counts speculatively miss-predicted indirect branches at execution time. Counts for indirect near CALL or JMP instructions (RET excluded).",
+ "SampleAfterValue": "200003",
+ "UMask": "0xe4"
+ },
{
"BriefDescription": "Not taken speculative and retired mispredicted macro conditional branches.",
"EventCode": "0x89",
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json b/tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json
index 183bcac99642..e969dc71bea1 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/uncore-cache.json
@@ -3114,7 +3114,7 @@
"Unit": "HA"
},
{
- "BriefDescription": "Tracker Occupancy Accumultor; Local InvItoE Requests",
+ "BriefDescription": "Tracker Occupancy Accumulator; Local InvItoE Requests",
"EventCode": "0x4",
"EventName": "UNC_H_TRACKER_OCCUPANCY.INVITOE_LOCAL",
"PerPkg": "1",
@@ -3123,7 +3123,7 @@
"Unit": "HA"
},
{
- "BriefDescription": "Tracker Occupancy Accumultor; Remote InvItoE Requests",
+ "BriefDescription": "Tracker Occupancy Accumulator; Remote InvItoE Requests",
"EventCode": "0x4",
"EventName": "UNC_H_TRACKER_OCCUPANCY.INVITOE_REMOTE",
"PerPkg": "1",
@@ -3132,7 +3132,7 @@
"Unit": "HA"
},
{
- "BriefDescription": "Tracker Occupancy Accumultor; Local Read Requests",
+ "BriefDescription": "Tracker Occupancy Accumulator; Local Read Requests",
"EventCode": "0x4",
"EventName": "UNC_H_TRACKER_OCCUPANCY.READS_LOCAL",
"PerPkg": "1",
@@ -3141,7 +3141,7 @@
"Unit": "HA"
},
{
- "BriefDescription": "Tracker Occupancy Accumultor; Remote Read Requests",
+ "BriefDescription": "Tracker Occupancy Accumulator; Remote Read Requests",
"EventCode": "0x4",
"EventName": "UNC_H_TRACKER_OCCUPANCY.READS_REMOTE",
"PerPkg": "1",
@@ -3150,7 +3150,7 @@
"Unit": "HA"
},
{
- "BriefDescription": "Tracker Occupancy Accumultor; Local Write Requests",
+ "BriefDescription": "Tracker Occupancy Accumulator; Local Write Requests",
"EventCode": "0x4",
"EventName": "UNC_H_TRACKER_OCCUPANCY.WRITES_LOCAL",
"PerPkg": "1",
@@ -3159,7 +3159,7 @@
"Unit": "HA"
},
{
- "BriefDescription": "Tracker Occupancy Accumultor; Remote Write Requests",
+ "BriefDescription": "Tracker Occupancy Accumulator; Remote Write Requests",
"EventCode": "0x4",
"EventName": "UNC_H_TRACKER_OCCUPANCY.WRITES_REMOTE",
"PerPkg": "1",
@@ -3168,7 +3168,7 @@
"Unit": "HA"
},
{
- "BriefDescription": "Data Pending Occupancy Accumultor; Local Requests",
+ "BriefDescription": "Data Pending Occupancy Accumulator; Local Requests",
"EventCode": "0x5",
"EventName": "UNC_H_TRACKER_PENDING_OCCUPANCY.LOCAL",
"PerPkg": "1",
@@ -3177,7 +3177,7 @@
"Unit": "HA"
},
{
- "BriefDescription": "Data Pending Occupancy Accumultor; Remote Requests",
+ "BriefDescription": "Data Pending Occupancy Accumulator; Remote Requests",
"EventCode": "0x5",
"EventName": "UNC_H_TRACKER_PENDING_OCCUPANCY.REMOTE",
"PerPkg": "1",
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/uncore-other.json b/tools/perf/pmu-events/arch/x86/haswellx/uncore-other.json
index 4c3e2a794117..d30e3b16c1af 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/uncore-other.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/uncore-other.json
@@ -474,7 +474,7 @@
"EventCode": "0xD",
"EventName": "UNC_I_TxR_REQUEST_OCCUPANCY",
"PerPkg": "1",
- "PublicDescription": "Accumultes the number of outstanding outbound requests from the IRP to the switch (towards the devices). This can be used in conjuection with the allocations event in order to calculate average latency of outbound requests.",
+ "PublicDescription": "Accumulates the number of outstanding outbound requests from the IRP to the switch (towards the devices). This can be used in conjunction with the allocations event in order to calculate average latency of outbound requests.",
"Unit": "IRP"
},
{
@@ -2256,7 +2256,7 @@
"EventCode": "0x33",
"EventName": "UNC_R3_VNA_CREDITS_ACQUIRED.AD",
"PerPkg": "1",
- "PublicDescription": "Number of QPI VNA Credit acquisitions. This event can be used in conjunction with the VNA In-Use Accumulator to calculate the average lifetime of a credit holder. VNA credits are used by all message classes in order to communicate across QPI. If a packet is unable to acquire credits, it will then attempt to use credts from the VN0 pool. Note that a single packet may require multiple flit buffers (i.e. when data is being transferred). Therefore, this event will increment by the number of credits acquired in each cycle. Filtering based on message class is not provided. One can count the number of packets transferred in a given message class using an qfclk event.; Filter for the Home (HOM) message class. HOM is generally used to send requests, request responses, and snoop responses.",
+ "PublicDescription": "Number of QPI VNA Credit acquisitions. This event can be used in conjunction with the VNA In-Use Accumulator to calculate the average lifetime of a credit holder. VNA credits are used by all message classes in order to communicate across QPI. If a packet is unable to acquire credits, it will then attempt to use credits from the VN0 pool. Note that a single packet may require multiple flit buffers (i.e. when data is being transferred). Therefore, this event will increment by the number of credits acquired in each cycle. Filtering based on message class is not provided. One can count the number of packets transferred in a given message class using an qfclk event.; Filter for the Home (HOM) message class. HOM is generally used to send requests, request responses, and snoop responses.",
"UMask": "0x1",
"Unit": "R3QPI"
},
@@ -2265,7 +2265,7 @@
"EventCode": "0x33",
"EventName": "UNC_R3_VNA_CREDITS_ACQUIRED.BL",
"PerPkg": "1",
- "PublicDescription": "Number of QPI VNA Credit acquisitions. This event can be used in conjunction with the VNA In-Use Accumulator to calculate the average lifetime of a credit holder. VNA credits are used by all message classes in order to communicate across QPI. If a packet is unable to acquire credits, it will then attempt to use credts from the VN0 pool. Note that a single packet may require multiple flit buffers (i.e. when data is being transferred). Therefore, this event will increment by the number of credits acquired in each cycle. Filtering based on message class is not provided. One can count the number of packets transferred in a given message class using an qfclk event.; Filter for the Home (HOM) message class. HOM is generally used to send requests, request responses, and snoop responses.",
+ "PublicDescription": "Number of QPI VNA Credit acquisitions. This event can be used in conjunction with the VNA In-Use Accumulator to calculate the average lifetime of a credit holder. VNA credits are used by all message classes in order to communicate across QPI. If a packet is unable to acquire credits, it will then attempt to use credits from the VN0 pool. Note that a single packet may require multiple flit buffers (i.e. when data is being transferred). Therefore, this event will increment by the number of credits acquired in each cycle. Filtering based on message class is not provided. One can count the number of packets transferred in a given message class using an qfclk event.; Filter for the Home (HOM) message class. HOM is generally used to send requests, request responses, and snoop responses.",
"UMask": "0x4",
"Unit": "R3QPI"
},
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 927e60f3417d..e1a609401fff 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -11,7 +11,7 @@ GenuineIntel-6-5[CF],v13,goldmont,core
GenuineIntel-6-7A,v1.01,goldmontplus,core
GenuineIntel-6-A[DE],v1.01,graniterapids,core
GenuineIntel-6-(3C|45|46),v33,haswell,core
-GenuineIntel-6-3F,v26,haswellx,core
+GenuineIntel-6-3F,v27,haswellx,core
GenuineIntel-6-(7D|7E|A7),v1.17,icelake,core
GenuineIntel-6-6[AC],v1.19,icelakex,core
GenuineIntel-6-3A,v23,ivybridge,core
--
2.40.0.348.gf938b09366-goog
* [PATCH v2 6/9] perf vendor events: Jaketown v23 events
From: Ian Rogers @ 2023-03-23 19:20 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Zhengjun Xing, Kan Liang, linux-perf-users, linux-kernel,
Edward Baker, Dan Williams, perry.taylor, caleb.biggers,
samantha.alt, weilin.wang
Cc: Ian Rogers
Adds BR_MISP_EXEC.INDIRECT event.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/arch/x86/jaketown/pipeline.json | 8 ++++++++
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/arch/x86/jaketown/pipeline.json b/tools/perf/pmu-events/arch/x86/jaketown/pipeline.json
index 85c04fe7632a..d0edfdec9f01 100644
--- a/tools/perf/pmu-events/arch/x86/jaketown/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/jaketown/pipeline.json
@@ -202,6 +202,14 @@
"SampleAfterValue": "200003",
"UMask": "0xc4"
},
+ {
+ "BriefDescription": "Speculative mispredicted indirect branches",
+ "EventCode": "0x89",
+ "EventName": "BR_MISP_EXEC.INDIRECT",
+ "PublicDescription": "Counts speculatively miss-predicted indirect branches at execution time. Counts for indirect near CALL or JMP instructions (RET excluded).",
+ "SampleAfterValue": "200003",
+ "UMask": "0xe4"
+ },
{
"BriefDescription": "Not taken speculative and retired mispredicted macro conditional branches.",
"EventCode": "0x89",
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index e1a609401fff..e41c289fa427 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -16,7 +16,7 @@ GenuineIntel-6-(7D|7E|A7),v1.17,icelake,core
GenuineIntel-6-6[AC],v1.19,icelakex,core
GenuineIntel-6-3A,v23,ivybridge,core
GenuineIntel-6-3E,v22,ivytown,core
-GenuineIntel-6-2D,v22,jaketown,core
+GenuineIntel-6-2D,v23,jaketown,core
GenuineIntel-6-(57|85),v10,knightslanding,core
GenuineIntel-6-A[AC],v1.01,meteorlake,core
GenuineIntel-6-1[AEF],v3,nehalemep,core
--
2.40.0.348.gf938b09366-goog
* [PATCH v2 7/9] perf vendor events: Sandybridge v19 events
2023-03-23 19:20 [PATCH v2 0/9] Update Intel events and make optane events dynamic Ian Rogers
` (2 preceding siblings ...)
2023-03-23 19:20 ` [PATCH v2 6/9] perf vendor events: Jaketown v23 events Ian Rogers
@ 2023-03-23 19:20 ` Ian Rogers
2023-03-23 19:20 ` [PATCH v2 8/9] perf metrics: Add has_optane literal Ian Rogers
2023-03-23 19:20 ` [PATCH v2 9/9] perf vendor events: Update metrics to detect optane memory at runtime Ian Rogers
5 siblings, 0 replies; 10+ messages in thread
From: Ian Rogers @ 2023-03-23 19:20 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Zhengjun Xing, Kan Liang, linux-perf-users, linux-kernel,
Edward Baker, Dan Williams, perry.taylor, caleb.biggers,
samantha.alt, weilin.wang
Cc: Ian Rogers
Adds BR_MISP_EXEC.INDIRECT event.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json | 8 ++++++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index e41c289fa427..41d755d570e6 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -21,7 +21,7 @@ GenuineIntel-6-(57|85),v10,knightslanding,core
GenuineIntel-6-A[AC],v1.01,meteorlake,core
GenuineIntel-6-1[AEF],v3,nehalemep,core
GenuineIntel-6-2E,v3,nehalemex,core
-GenuineIntel-6-2A,v18,sandybridge,core
+GenuineIntel-6-2A,v19,sandybridge,core
GenuineIntel-6-(8F|CF),v1.11,sapphirerapids,core
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v55,skylake,core
diff --git a/tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json b/tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json
index 54454e5e262c..ecaf94ccc9c7 100644
--- a/tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/sandybridge/pipeline.json
@@ -210,6 +210,14 @@
"SampleAfterValue": "200003",
"UMask": "0xc4"
},
+ {
+ "BriefDescription": "Speculative mispredicted indirect branches",
+ "EventCode": "0x89",
+ "EventName": "BR_MISP_EXEC.INDIRECT",
+ "PublicDescription": "Counts speculatively miss-predicted indirect branches at execution time. Counts for indirect near CALL or JMP instructions (RET excluded).",
+ "SampleAfterValue": "200003",
+ "UMask": "0xe4"
+ },
{
"BriefDescription": "Not taken speculative and retired mispredicted macro conditional branches.",
"EventCode": "0x89",
--
2.40.0.348.gf938b09366-goog
* [PATCH v2 8/9] perf metrics: Add has_optane literal
2023-03-23 19:20 [PATCH v2 0/9] Update Intel events and make optane events dynamic Ian Rogers
` (3 preceding siblings ...)
2023-03-23 19:20 ` [PATCH v2 7/9] perf vendor events: Sandybridge v19 events Ian Rogers
@ 2023-03-23 19:20 ` Ian Rogers
2023-03-23 20:31 ` Liang, Kan
2023-03-23 19:20 ` [PATCH v2 9/9] perf vendor events: Update metrics to detect optane memory at runtime Ian Rogers
5 siblings, 1 reply; 10+ messages in thread
From: Ian Rogers @ 2023-03-23 19:20 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Zhengjun Xing, Kan Liang, linux-perf-users, linux-kernel,
Edward Baker, Dan Williams, perry.taylor, caleb.biggers,
samantha.alt, weilin.wang
Cc: Ian Rogers
Add literal so that if optane memory isn't installed we can record
fewer events. The file detection mechanism was suggested by Dan
Williams <dan.j.williams@intel.com> in:
https://lore.kernel.org/linux-perf-users/641bbe1eced26_1b98bb29440@dwillia2-xfh.jf.intel.com.notmuch/
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/expr.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index d46a1878bc9e..a43cdda0b044 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -14,6 +14,7 @@
#include "util/hashmap.h"
#include "smt.h"
#include "tsc.h"
+#include <api/fs/fs.h>
#include <linux/err.h>
#include <linux/kernel.h>
#include <linux/zalloc.h>
@@ -400,6 +401,20 @@ double arch_get_tsc_freq(void)
}
#endif
+static double has_optane(void)
+{
+ static bool has_optane, cached;
+ const char *sysfs = sysfs__mountpoint();
+ char path[PATH_MAX];
+
+ if (!cached) {
+ snprintf(path, sizeof(path), "%s/firmware/acpi/tables/NFIT", sysfs);
+ has_optane = access(path, F_OK) == 0;
+ cached = true;
+ }
+ return has_optane ? 1.0 : 0.0;
+}
+
double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx)
{
const struct cpu_topology *topology;
@@ -449,6 +464,10 @@ double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx
result = perf_pmu__cpu_slots_per_cycle();
goto out;
}
+ if (!strcmp("#has_optane", literal)) {
+ result = has_optane();
+ goto out;
+ }
pr_err("Unrecognized literal '%s'", literal);
out:
--
2.40.0.348.gf938b09366-goog
* [PATCH v2 9/9] perf vendor events: Update metrics to detect optane memory at runtime
2023-03-23 19:20 [PATCH v2 0/9] Update Intel events and make optane events dynamic Ian Rogers
` (4 preceding siblings ...)
2023-03-23 19:20 ` [PATCH v2 8/9] perf metrics: Add has_optane literal Ian Rogers
@ 2023-03-23 19:20 ` Ian Rogers
5 siblings, 0 replies; 10+ messages in thread
From: Ian Rogers @ 2023-03-23 19:20 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Zhengjun Xing, Kan Liang, linux-perf-users, linux-kernel,
Edward Baker, Dan Williams, perry.taylor, caleb.biggers,
samantha.alt, weilin.wang
Cc: Ian Rogers
By detecting whether optane memory is installed at runtime the number
of events can be reduced if it isn't. These changes come from this PR:
https://github.com/intel/perfmon/pull/63
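As a rough illustration of the gating pattern used in the metric
expressions in this patch ("<expr> if #has_optane > 0 else 0"), here is
a minimal C sketch. The function and parameter names are hypothetical
and are not part of perf; in perf the expression is evaluated by the
metric parser, and the benefit described above is that events
referenced only by the untaken branch do not need to be recorded.

```c
/* Mirrors "<value> if #has_optane > 0 else 0". */
static double gate_on_optane(double has_optane, double value)
{
	return has_optane > 0 ? value : 0.0;
}

/*
 * Mirrors the tma_dram_bound change: the PMM share is subtracted only
 * when the literal is non-zero, otherwise the DRAM term is used as is.
 */
static double dram_bound(double has_optane, double l3_miss_stall_fraction,
			 double pmm_bound)
{
	return has_optane > 0 ? l3_miss_stall_fraction - pmm_bound
			      : l3_miss_stall_fraction;
}

/*
 * Mirrors tma_info_pmm_read_bw: 64 bytes per read-queue insert,
 * reported in GB/sec, or 0 when the literal is zero.
 */
static double pmm_read_bw(double has_optane, double rpq_inserts,
			  double duration_seconds)
{
	return gate_on_optane(has_optane,
			      64 * rpq_inserts / 1e9 / duration_seconds);
}
```

With has_optane set to 0.0 the bandwidth helper returns 0 and
dram_bound() returns its DRAM term unchanged, which matches the "else"
branches of the updated MetricExpr strings.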
Signed-off-by: Ian Rogers <irogers@google.com>
---
.../pmu-events/arch/x86/cascadelakex/clx-metrics.json | 10 +++++-----
.../perf/pmu-events/arch/x86/icelakex/icx-metrics.json | 10 +++++-----
.../arch/x86/sapphirerapids/spr-metrics.json | 10 +++++-----
3 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json b/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
index 4e993a3220e3..903f19ea1696 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
@@ -201,7 +201,7 @@
{
"BriefDescription": "This metric estimates how often the CPU was stalled on accesses to external memory (DRAM) by loads",
"MetricConstraint": "NO_GROUP_EVENTS",
- "MetricExpr": "CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound - tma_pmm_bound",
+ "MetricExpr": "(CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound - tma_pmm_bound if #has_optane > 0 else CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound)",
"MetricGroup": "MemoryBound;TmaL3mem;TopdownL3;tma_L3_group;tma_memory_bound_group",
"MetricName": "tma_dram_bound",
"MetricThreshold": "tma_dram_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2)",
@@ -933,7 +933,7 @@
},
{
"BriefDescription": "Average latency of data read request to external 3D X-Point memory [in nanoseconds]",
- "MetricExpr": "1e9 * (UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS) / imc_0@event\\=0x0@",
+ "MetricExpr": "(1e9 * (UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS) / imc_0@event\\=0x0@ if #has_optane > 0 else 0)",
"MetricGroup": "Mem;MemoryLat;Server;SoC",
"MetricName": "tma_info_mem_pmm_read_latency",
"PublicDescription": "Average latency of data read request to external 3D X-Point memory [in nanoseconds]. Accounts for demand loads and L1/L2 data-read prefetches"
@@ -998,13 +998,13 @@
},
{
"BriefDescription": "Average 3DXP Memory Bandwidth Use for reads [GB / sec]",
- "MetricExpr": "64 * UNC_M_PMM_RPQ_INSERTS / 1e9 / duration_time",
+ "MetricExpr": "(64 * UNC_M_PMM_RPQ_INSERTS / 1e9 / duration_time if #has_optane > 0 else 0)",
"MetricGroup": "Mem;MemoryBW;Server;SoC",
"MetricName": "tma_info_pmm_read_bw"
},
{
"BriefDescription": "Average 3DXP Memory Bandwidth Use for Writes [GB / sec]",
- "MetricExpr": "64 * UNC_M_PMM_WPQ_INSERTS / 1e9 / duration_time",
+ "MetricExpr": "(64 * UNC_M_PMM_WPQ_INSERTS / 1e9 / duration_time if #has_optane > 0 else 0)",
"MetricGroup": "Mem;MemoryBW;Server;SoC",
"MetricName": "tma_info_pmm_write_bw"
},
@@ -1310,7 +1310,7 @@
{
"BriefDescription": "This metric roughly estimates (based on idle latencies) how often the CPU was stalled on accesses to external 3D-Xpoint (Crystal Ridge, a.k.a",
"MetricConstraint": "NO_GROUP_EVENTS",
- "MetricExpr": "((1 - (19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS))) / (19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + (25 * (MEM_LOAD_RETIRED.LOCAL_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 33 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS))))) * (CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound) if 1e6 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM + MEM_LOAD_RETIRED.LOCAL_PMM) > MEM_LOAD_RETIRED.L1_MISS else 0)",
+ "MetricExpr": "(((1 - ((19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS))) / (19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + (25 * (MEM_LOAD_RETIRED.LOCAL_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) if #has_optane > 0 else 0) + 33 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) if #has_optane > 0 else 0))) if #has_optane > 0 else 0)) * (CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound) if 1e6 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM + MEM_LOAD_RETIRED.LOCAL_PMM) > MEM_LOAD_RETIRED.L1_MISS else 0) if #has_optane > 0 else 0)",
"MetricGroup": "MemoryBound;Server;TmaL3mem;TopdownL3;tma_L3_group;tma_memory_bound_group",
"MetricName": "tma_pmm_bound",
"MetricThreshold": "tma_pmm_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2)",
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json b/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json
index 8109088a4df7..e5fba27dfe80 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json
@@ -186,7 +186,7 @@
{
"BriefDescription": "This metric estimates how often the CPU was stalled on accesses to external memory (DRAM) by loads",
"MetricConstraint": "NO_GROUP_EVENTS",
- "MetricExpr": "CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound - tma_pmm_bound",
+ "MetricExpr": "(CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound - tma_pmm_bound if #has_optane > 0 else CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound)",
"MetricGroup": "MemoryBound;TmaL3mem;TopdownL3;tma_L3_group;tma_memory_bound_group",
"MetricName": "tma_dram_bound",
"MetricThreshold": "tma_dram_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2)",
@@ -918,7 +918,7 @@
},
{
"BriefDescription": "Average latency of data read request to external 3D X-Point memory [in nanoseconds]",
- "MetricExpr": "1e9 * (UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_PMM / UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PMM) / cha_0@event\\=0x0@",
+ "MetricExpr": "(1e9 * (UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_PMM / UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PMM) / cha_0@event\\=0x0@ if #has_optane > 0 else 0)",
"MetricGroup": "Mem;MemoryLat;Server;SoC",
"MetricName": "tma_info_mem_pmm_read_latency",
"PublicDescription": "Average latency of data read request to external 3D X-Point memory [in nanoseconds]. Accounts for demand loads and L1/L2 data-read prefetches"
@@ -984,13 +984,13 @@
},
{
"BriefDescription": "Average 3DXP Memory Bandwidth Use for reads [GB / sec]",
- "MetricExpr": "64 * UNC_M_PMM_RPQ_INSERTS / 1e9 / duration_time",
+ "MetricExpr": "(64 * UNC_M_PMM_RPQ_INSERTS / 1e9 / duration_time if #has_optane > 0 else 0)",
"MetricGroup": "Mem;MemoryBW;Server;SoC",
"MetricName": "tma_info_pmm_read_bw"
},
{
"BriefDescription": "Average 3DXP Memory Bandwidth Use for Writes [GB / sec]",
- "MetricExpr": "64 * UNC_M_PMM_WPQ_INSERTS / 1e9 / duration_time",
+ "MetricExpr": "(64 * UNC_M_PMM_WPQ_INSERTS / 1e9 / duration_time if #has_optane > 0 else 0)",
"MetricGroup": "Mem;MemoryBW;Server;SoC",
"MetricName": "tma_info_pmm_write_bw"
},
@@ -1298,7 +1298,7 @@
},
{
"BriefDescription": "This metric roughly estimates (based on idle latencies) how often the CPU was stalled on accesses to external 3D-Xpoint (Crystal Ridge, a.k.a",
- "MetricExpr": "((1 - (19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS))) / (19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + (25 * (MEM_LOAD_RETIRED.LOCAL_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 33 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS))))) * (CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound) if 1e6 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM + MEM_LOAD_RETIRED.LOCAL_PMM) > MEM_LOAD_RETIRED.L1_MISS else 0)",
+ "MetricExpr": "(((1 - ((19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS))) / (19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + (25 * (MEM_LOAD_RETIRED.LOCAL_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) if #has_optane > 0 else 0) + 33 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) if #has_optane > 0 else 0))) if #has_optane > 0 else 0)) * (CYCLE_ACTIVITY.STALLS_L3_MISS / tma_info_clks + (CYCLE_ACTIVITY.STALLS_L1D_MISS - CYCLE_ACTIVITY.STALLS_L2_MISS) / tma_info_clks - tma_l2_bound) if 1e6 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM + MEM_LOAD_RETIRED.LOCAL_PMM) > MEM_LOAD_RETIRED.L1_MISS else 0) if #has_optane > 0 else 0)",
"MetricGroup": "MemoryBound;Server;TmaL3mem;TopdownL3;tma_L3_group;tma_memory_bound_group",
"MetricName": "tma_pmm_bound",
"MetricThreshold": "tma_pmm_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2)",
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
index 149cc4c07fb5..d23fd6921a7f 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
@@ -185,7 +185,7 @@
{
"BriefDescription": "This metric estimates how often the CPU was stalled on accesses to external memory (DRAM) by loads",
"MetricConstraint": "NO_GROUP_EVENTS",
- "MetricExpr": "MEMORY_ACTIVITY.STALLS_L3_MISS / tma_info_clks - tma_pmm_bound",
+ "MetricExpr": "(MEMORY_ACTIVITY.STALLS_L3_MISS / tma_info_clks - tma_pmm_bound if #has_optane > 0 else MEMORY_ACTIVITY.STALLS_L3_MISS / tma_info_clks)",
"MetricGroup": "MemoryBound;TmaL3mem;TopdownL3;tma_L3_group;tma_memory_bound_group",
"MetricName": "tma_dram_bound",
"MetricThreshold": "tma_dram_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2)",
@@ -968,7 +968,7 @@
},
{
"BriefDescription": "Average latency of data read request to external 3D X-Point memory [in nanoseconds]",
- "MetricExpr": "1e9 * (UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_PMM / UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PMM) / uncore_cha_0@event\\=0x1@",
+ "MetricExpr": "(1e9 * (UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_PMM / UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PMM) / uncore_cha_0@event\\=0x1@ if #has_optane > 0 else 0)",
"MetricGroup": "Mem;MemoryLat;Server;SoC",
"MetricName": "tma_info_mem_pmm_read_latency",
"PublicDescription": "Average latency of data read request to external 3D X-Point memory [in nanoseconds]. Accounts for demand loads and L1/L2 data-read prefetches"
@@ -1034,13 +1034,13 @@
},
{
"BriefDescription": "Average 3DXP Memory Bandwidth Use for reads [GB / sec]",
- "MetricExpr": "64 * UNC_M_PMM_RPQ_INSERTS / 1e9 / duration_time",
+ "MetricExpr": "(64 * UNC_M_PMM_RPQ_INSERTS / 1e9 / duration_time if #has_optane > 0 else 0)",
"MetricGroup": "Mem;MemoryBW;Server;SoC",
"MetricName": "tma_info_pmm_read_bw"
},
{
"BriefDescription": "Average 3DXP Memory Bandwidth Use for Writes [GB / sec]",
- "MetricExpr": "64 * UNC_M_PMM_WPQ_INSERTS / 1e9 / duration_time",
+ "MetricExpr": "(64 * UNC_M_PMM_WPQ_INSERTS / 1e9 / duration_time if #has_optane > 0 else 0)",
"MetricGroup": "Mem;MemoryBW;Server;SoC",
"MetricName": "tma_info_pmm_write_bw"
},
@@ -1406,7 +1406,7 @@
},
{
"BriefDescription": "This metric roughly estimates (based on idle latencies) how often the CPU was stalled on accesses to external 3D-Xpoint (Crystal Ridge, a.k.a",
- "MetricExpr": "((1 - (19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS))) / (19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + (25 * (MEM_LOAD_RETIRED.LOCAL_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 33 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS))))) * (MEMORY_ACTIVITY.STALLS_L3_MISS / tma_info_clks) if 1e6 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM + MEM_LOAD_RETIRED.LOCAL_PMM) > MEM_LOAD_RETIRED.L1_MISS else 0)",
+ "MetricExpr": "(((1 - ((19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS))) / (19 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + 10 * (MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) + MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS)) + (25 * (MEM_LOAD_RETIRED.LOCAL_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) if #has_optane > 0 else 0) + 33 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM * (1 + MEM_LOAD_RETIRED.FB_HIT / MEM_LOAD_RETIRED.L1_MISS) if #has_optane > 0 else 0))) if #has_optane > 0 else 0)) * (MEMORY_ACTIVITY.STALLS_L3_MISS / tma_info_clks) if 1e6 * (MEM_LOAD_L3_MISS_RETIRED.REMOTE_PMM + MEM_LOAD_RETIRED.LOCAL_PMM) > MEM_LOAD_RETIRED.L1_MISS else 0) if #has_optane > 0 else 0)",
"MetricGroup": "MemoryBound;Server;TmaL3mem;TopdownL3;tma_L3_group;tma_memory_bound_group",
"MetricName": "tma_pmm_bound",
"MetricThreshold": "tma_pmm_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2)",
--
2.40.0.348.gf938b09366-goog
* Re: [PATCH v2 8/9] perf metrics: Add has_optane literal
2023-03-23 19:20 ` [PATCH v2 8/9] perf metrics: Add has_optane literal Ian Rogers
@ 2023-03-23 20:31 ` Liang, Kan
2023-03-23 20:43 ` Dan Williams
0 siblings, 1 reply; 10+ messages in thread
From: Liang, Kan @ 2023-03-23 20:31 UTC (permalink / raw)
To: Ian Rogers, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Zhengjun Xing, linux-perf-users, linux-kernel, Edward Baker,
Dan Williams, perry.taylor, caleb.biggers, samantha.alt,
weilin.wang
On 2023-03-23 3:20 p.m., Ian Rogers wrote:
> Add literal so that if optane memory isn't installed we can record
> fewer events.
I think we call it pmem (Persistent Memory) everywhere in the Linux
code. Maybe we should use #has_pmem instead?
Thanks,
Kan
> The file detection mechanism was suggested by Dan
> Williams <dan.j.williams@intel.com> in:
> https://lore.kernel.org/linux-perf-users/641bbe1eced26_1b98bb29440@dwillia2-xfh.jf.intel.com.notmuch/
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> tools/perf/util/expr.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
> index d46a1878bc9e..a43cdda0b044 100644
> --- a/tools/perf/util/expr.c
> +++ b/tools/perf/util/expr.c
> @@ -14,6 +14,7 @@
> #include "util/hashmap.h"
> #include "smt.h"
> #include "tsc.h"
> +#include <api/fs/fs.h>
> #include <linux/err.h>
> #include <linux/kernel.h>
> #include <linux/zalloc.h>
> @@ -400,6 +401,20 @@ double arch_get_tsc_freq(void)
> }
> #endif
>
> +static double has_optane(void)
> +{
> + static bool has_optane, cached;
> + const char *sysfs = sysfs__mountpoint();
> + char path[PATH_MAX];
> +
> + if (!cached) {
> + snprintf(path, sizeof(path), "%s/firmware/acpi/tables/NFIT", sysfs);
> + has_optane = access(path, F_OK) == 0;
> + cached = true;
> + }
> + return has_optane ? 1.0 : 0.0;
> +}
> +
> double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx)
> {
> const struct cpu_topology *topology;
> @@ -449,6 +464,10 @@ double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx
> result = perf_pmu__cpu_slots_per_cycle();
> goto out;
> }
> + if (!strcmp("#has_optane", literal)) {
> + result = has_optane();
> + goto out;
> + }
>
> pr_err("Unrecognized literal '%s'", literal);
> out:
* Re: [PATCH v2 8/9] perf metrics: Add has_optane literal
2023-03-23 20:31 ` Liang, Kan
@ 2023-03-23 20:43 ` Dan Williams
2023-03-24 7:15 ` Ian Rogers
0 siblings, 1 reply; 10+ messages in thread
From: Dan Williams @ 2023-03-23 20:43 UTC (permalink / raw)
To: Liang, Kan, Ian Rogers, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Zhengjun Xing, linux-perf-users,
linux-kernel, Edward Baker, Dan Williams, perry.taylor,
caleb.biggers, samantha.alt, weilin.wang
Liang, Kan wrote:
>
>
> On 2023-03-23 3:20 p.m., Ian Rogers wrote:
> > Add literal so that if optane memory isn't installed we can record
> > fewer events.
>
> I think we call it pmem (Persistent Memory) everywhere in the Linux
> code. Maybe we should use #has_pmem instead?
That makes sense especially because has_optane implies more precision
than the check is performing. In general Optane is probably the
widest deployed NVDIMM type, but this check will succeed with battery
backed NVDIMMs and emulated PMEM in VMs which I think is perfectly ok
for making a default event record decision.
* Re: [PATCH v2 8/9] perf metrics: Add has_optane literal
2023-03-23 20:43 ` Dan Williams
@ 2023-03-24 7:15 ` Ian Rogers
0 siblings, 0 replies; 10+ messages in thread
From: Ian Rogers @ 2023-03-24 7:15 UTC (permalink / raw)
To: Dan Williams
Cc: Liang, Kan, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Zhengjun Xing, linux-perf-users, linux-kernel, Edward Baker,
perry.taylor, caleb.biggers, samantha.alt, weilin.wang
On Thu, Mar 23, 2023 at 1:44 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Liang, Kan wrote:
> >
> >
> > On 2023-03-23 3:20 p.m., Ian Rogers wrote:
> > > Add literal so that if optane memory isn't installed we can record
> > > fewer events.
> >
> > I think we call it pmem (Persistent Memory) everywhere in the Linux
> > code. Maybe we should use #has_pmem instead?
>
> That makes sense especially because has_optane implies more precision
> > than the check is performing. In general Optane is probably the
> widest deployed NVDIMM type, but this check will succeed with battery
> backed NVDIMMs and emulated PMEM in VMs which I think is perfectly ok
> for making a default event record decision.
Thanks, I'll change this in v3.
Ian