* [PATCH v2 00/12] Add metric has_event, update intel vendor events
From: Ian Rogers @ 2023-06-23 15:10 UTC
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, Kan Liang, Zhengjun Xing, Kajol Jain,
John Garry, Andrii Nakryiko, Eduard Zingerman, Jing Zhang,
Sohom Datta, linux-kernel, linux-perf-users, Perry Taylor,
Samantha Alt, Caleb Biggers, Weilin Wang, Edward Baker
Add a new has_event function for metrics so that events that can be
disabled by the kernel/firmware don't cause metrics to fail. Use this
function for the Intel transaction metrics, fixing "perf all metrics test"
on systems with TSX disabled. The updated conversion script is posted in:
https://github.com/intel/perfmon/pull/90

Re-generate the Intel vendor events using:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
This adds rocketlake support, uncore and many core events for meteorlake,
and smaller updates for cascadelakex, icelake, icelakex,
sapphirerapids, skylake, skylakex and tigerlake.

v2: Handle failed memory allocation for evlist (John Garry).
Ian Rogers (12):
perf expr: Add has_event function
perf jevents: Support for has_event function
perf vendor metrics intel: Make transaction metrics conditional
perf vendor events intel: Add rocketlake events/metrics
perf vendor events intel: Update meteorlake to 1.03
perf vendor events intel: Update cascadelakex to 1.19
perf vendor events intel: Update icelake to 1.19
perf vendor events intel: Update icelakex to 1.21
perf vendor events intel: Update sapphirerapids to 1.14
perf vendor events intel: Update skylake to 57
perf vendor events intel: Update skylakex to 1.31
perf vendor events intel: Update tigerlake to 1.13
.../arch/x86/alderlake/adl-metrics.json | 8 +-
.../arch/x86/cascadelakex/clx-metrics.json | 8 +-
.../arch/x86/cascadelakex/frontend.json | 43 +-
.../arch/x86/cascadelakex/pipeline.json | 17 +-
.../x86/cascadelakex/uncore-interconnect.json | 2 +-
.../arch/x86/cascadelakex/uncore-memory.json | 2 +-
.../pmu-events/arch/x86/icelake/cache.json | 8 +-
.../pmu-events/arch/x86/icelake/frontend.json | 32 +-
.../arch/x86/icelake/icl-metrics.json | 8 +-
.../pmu-events/arch/x86/icelake/pipeline.json | 6 +-
.../arch/x86/icelakex/frontend.json | 32 +-
.../arch/x86/icelakex/icx-metrics.json | 8 +-
.../arch/x86/icelakex/pipeline.json | 4 +-
.../x86/icelakex/uncore-interconnect.json | 2 +-
tools/perf/pmu-events/arch/x86/mapfile.csv | 17 +-
.../pmu-events/arch/x86/meteorlake/cache.json | 811 +++++++++
.../arch/x86/meteorlake/floating-point.json | 143 ++
.../arch/x86/meteorlake/frontend.json | 410 +++++
.../arch/x86/meteorlake/memory.json | 142 +-
.../pmu-events/arch/x86/meteorlake/other.json | 57 +-
.../arch/x86/meteorlake/pipeline.json | 1211 ++++++++++++-
.../arch/x86/meteorlake/uncore-cache.json | 18 +
.../x86/meteorlake/uncore-interconnect.json | 42 +
.../arch/x86/meteorlake/uncore-memory.json | 126 ++
.../arch/x86/meteorlake/virtual-memory.json | 257 +++
.../pmu-events/arch/x86/rocketlake/cache.json | 894 ++++++++++
.../arch/x86/rocketlake/floating-point.json | 105 ++
.../arch/x86/rocketlake/frontend.json | 377 ++++
.../arch/x86/rocketlake/memory.json | 394 +++++
.../arch/x86/rocketlake/metricgroups.json | 113 ++
.../pmu-events/arch/x86/rocketlake/other.json | 242 +++
.../arch/x86/rocketlake/pipeline.json | 801 +++++++++
.../arch/x86/rocketlake/rkl-metrics.json | 1571 +++++++++++++++++
.../x86/rocketlake/uncore-interconnect.json | 74 +
.../arch/x86/rocketlake/uncore-other.json | 9 +
.../arch/x86/rocketlake/virtual-memory.json | 165 ++
.../arch/x86/sapphirerapids/pipeline.json | 2 +-
.../arch/x86/sapphirerapids/spr-metrics.json | 8 +-
.../arch/x86/sapphirerapids/uncore-cache.json | 308 ++++
.../sapphirerapids/uncore-interconnect.json | 2 +-
.../pmu-events/arch/x86/skylake/frontend.json | 43 +-
.../pmu-events/arch/x86/skylake/pipeline.json | 17 +-
.../arch/x86/skylake/skl-metrics.json | 8 +-
.../arch/x86/skylakex/frontend.json | 43 +-
.../arch/x86/skylakex/pipeline.json | 17 +-
.../arch/x86/skylakex/skx-metrics.json | 8 +-
.../x86/skylakex/uncore-interconnect.json | 2 +-
.../arch/x86/skylakex/uncore-memory.json | 2 +-
.../arch/x86/tigerlake/frontend.json | 32 +-
.../arch/x86/tigerlake/pipeline.json | 6 +-
.../arch/x86/tigerlake/tgl-metrics.json | 8 +-
tools/perf/pmu-events/metric.py | 8 +-
tools/perf/tests/expr.c | 4 +
tools/perf/util/expr.c | 21 +
tools/perf/util/expr.h | 1 +
tools/perf/util/expr.l | 1 +
tools/perf/util/expr.y | 8 +-
57 files changed, 8506 insertions(+), 202 deletions(-)
create mode 100644 tools/perf/pmu-events/arch/x86/meteorlake/floating-point.json
create mode 100644 tools/perf/pmu-events/arch/x86/meteorlake/uncore-cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/meteorlake/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/meteorlake/uncore-memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/cache.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/floating-point.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/frontend.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/memory.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/metricgroups.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/other.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/pipeline.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/rkl-metrics.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/uncore-interconnect.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/uncore-other.json
create mode 100644 tools/perf/pmu-events/arch/x86/rocketlake/virtual-memory.json
--
2.41.0.162.gfafddb0af9-goog
* [PATCH v2 01/12] perf expr: Add has_event function
From: Ian Rogers @ 2023-06-23 15:10 UTC
Some events are dependent on firmware/kernel enablement. Allow such
events to be detected when the metric is parsed so that the metric's
event parsing doesn't fail.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/tests/expr.c | 4 ++++
tools/perf/util/expr.c | 21 +++++++++++++++++++++
tools/perf/util/expr.h | 1 +
tools/perf/util/expr.l | 1 +
tools/perf/util/expr.y | 8 +++++++-
5 files changed, 34 insertions(+), 1 deletion(-)
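For readers who want the decision order of expr__has_event without reading the C, here is a minimal Python model of it. It is only a sketch: `known_ids` stands in for ctx->ids, and `try_parse` is a hypothetical stand-in for parse_event() that returns True on success (the C call returns 0 on success). Passing `try_parse=None` models the failed evlist allocation that v2 handles by returning NAN.

```python
import math
from typing import Callable, Optional, Set


def has_event(known_ids: Set[str],
              compute_ids: bool,
              event: str,
              try_parse: Optional[Callable[[str], bool]]) -> float:
    """Illustrative model of the checks made by expr__has_event."""
    # 1. An event already recorded in the parse context is taken to exist.
    if event in known_ids:
        return 1.0
    # 2. Without compute_ids no probing is attempted.
    if not compute_ids:
        return 0.0
    # 3. A failed scratch-evlist allocation is modelled by try_parse=None.
    if try_parse is None:
        return math.nan
    # 4. Otherwise the answer is whether the event string parses.
    return 1.0 if try_parse(event) else 0.0


def fake_parser(name: str) -> bool:
    """Hypothetical parser: pretends only these two events exist."""
    return name in {"cycles", "instructions"}


if __name__ == "__main__":
    print(has_event({"cycles"}, False, "cycles", None))
    print(has_event(set(), True, "cycles-t", fake_parser))
```

The ordering matters: the lookup in the known ids comes first so that the common case never needs a scratch event list at all.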
diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 3d01eb5e2512..c1c3fcbc2753 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -254,6 +254,10 @@ static int test__expr(struct test_suite *t __maybe_unused, int subtest __maybe_u
TEST_ASSERT_VAL("source count", hashmap__size(ctx->ids) == 1);
TEST_ASSERT_VAL("source count", hashmap__find(ctx->ids, "EVENT1", &val_ptr));
+ /* has_event returns 1 when an event exists. */
+ expr__add_id_val(ctx, strdup("cycles"), 2);
+ ret = test(ctx, "has_event(cycles)", 1);
+
expr__ctx_free(ctx);
return 0;
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index f4e52919324e..4814262e3805 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -8,6 +8,7 @@
#include "cpumap.h"
#include "cputopo.h"
#include "debug.h"
+#include "evlist.h"
#include "expr.h"
#include "expr-bison.h"
#include "expr-flex.h"
@@ -474,3 +475,23 @@ double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx
pr_debug2("literal: %s = %f\n", literal, result);
return result;
}
+
+/* Does the event 'id' parse? Determine via ctx->ids if possible. */
+double expr__has_event(const struct expr_parse_ctx *ctx, bool compute_ids, const char *id)
+{
+ struct evlist *tmp;
+ double ret;
+
+ if (hashmap__find(ctx->ids, id, /*value=*/NULL))
+ return 1.0;
+
+ if (!compute_ids)
+ return 0.0;
+
+ tmp = evlist__new();
+ if (!tmp)
+ return NAN;
+ ret = parse_event(tmp, id) ? 0 : 1;
+ evlist__delete(tmp);
+ return ret;
+}
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index eaa44b24c555..3c1e49b3e35d 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -54,5 +54,6 @@ int expr__find_ids(const char *expr, const char *one,
double expr_id_data__value(const struct expr_id_data *data);
double expr_id_data__source_count(const struct expr_id_data *data);
double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx);
+double expr__has_event(const struct expr_parse_ctx *ctx, bool compute_ids, const char *id);
#endif
diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
index 4fbf353e78e7..dbb117414710 100644
--- a/tools/perf/util/expr.l
+++ b/tools/perf/util/expr.l
@@ -113,6 +113,7 @@ min { return MIN; }
if { return IF; }
else { return ELSE; }
source_count { return SOURCE_COUNT; }
+has_event { return HAS_EVENT; }
{literal} { return literal(yyscanner, sctx); }
{number} { return value(yyscanner); }
{symbol} { return str(yyscanner, ID, sctx->runtime); }
diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
index f04963eb6be0..dd504afd8f36 100644
--- a/tools/perf/util/expr.y
+++ b/tools/perf/util/expr.y
@@ -37,7 +37,7 @@
} ids;
}
-%token ID NUMBER MIN MAX IF ELSE LITERAL D_RATIO SOURCE_COUNT EXPR_ERROR
+%token ID NUMBER MIN MAX IF ELSE LITERAL D_RATIO SOURCE_COUNT HAS_EVENT EXPR_ERROR
%left MIN MAX IF
%left '|'
%left '^'
@@ -199,6 +199,12 @@ expr: NUMBER
}
| ID { $$ = handle_id(ctx, $1, compute_ids, /*source_count=*/false); }
| SOURCE_COUNT '(' ID ')' { $$ = handle_id(ctx, $3, compute_ids, /*source_count=*/true); }
+| HAS_EVENT '(' ID ')'
+{
+ $$.val = expr__has_event(ctx, compute_ids, $3);
+ $$.ids = NULL;
+ free($3);
+}
| expr '|' expr
{
if (is_const($1.val) && is_const($3.val)) {
--
2.41.0.162.gfafddb0af9-goog
* [PATCH v2 02/12] perf jevents: Support for has_event function
From: Ian Rogers @ 2023-06-23 15:10 UTC
Add support for the new has_event function in metrics.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/metric.py | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
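The keyword-list change is easier to see in isolation. The sketch below reproduces only the final unwrapping loop shown in the ParsePerfJson hunk; it assumes an earlier pass has already wrapped every identifier as Event(r"..."), and the helper name `unwrap_keywords` and the input strings are made up for illustration.

```python
import re

OLD_KEYWORDS = ['if', 'else', 'min', 'max', 'd_ratio', 'source_count']
NEW_KEYWORDS = OLD_KEYWORDS + ['has_event']


def unwrap_keywords(py: str, keywords) -> str:
    """Turn Event(r"<keyword>") back into the bare keyword."""
    for kw in keywords:
        py = re.sub(rf'Event\(r"{kw}"\)', kw, py)
    return py


# Hypothetical intermediate string, after identifiers have been wrapped.
wrapped = ('Event(r"EVENT1") Event(r"if") '
           'Event(r"has_event")(Event(r"EVENT1")) Event(r"else") 0')

if __name__ == "__main__":
    # Without the new keyword, has_event is still treated as an event name.
    print(unwrap_keywords(wrapped, OLD_KEYWORDS))
    # With it, has_event becomes a call to the helper added in this patch.
    print(unwrap_keywords(wrapped, NEW_KEYWORDS))
```

With the old list the string still contains Event(r"has_event"), so has_event would be handled as an event rather than as a function.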
diff --git a/tools/perf/pmu-events/metric.py b/tools/perf/pmu-events/metric.py
index af58b74d1644..85a3545f5b6a 100644
--- a/tools/perf/pmu-events/metric.py
+++ b/tools/perf/pmu-events/metric.py
@@ -408,6 +408,12 @@ def source_count(event: Event) -> Function:
return Function('source_count', event)
+def has_event(event: Event) -> Function:
+ # pylint: disable=redefined-builtin
+ # pylint: disable=invalid-name
+ return Function('has_event', event)
+
+
class Metric:
"""An individual metric that will specifiable on the perf command line."""
groups: Set[str]
@@ -539,7 +545,7 @@ def ParsePerfJson(orig: str) -> Expression:
r'Event(r"\1")', py)
py = re.sub(r'#Event\(r"([^"]*)"\)', r'Literal("#\1")', py)
py = re.sub(r'([0-9]+)Event\(r"(e[0-9]+)"\)', r'\1\2', py)
- keywords = ['if', 'else', 'min', 'max', 'd_ratio', 'source_count']
+ keywords = ['if', 'else', 'min', 'max', 'd_ratio', 'source_count', 'has_event']
for kw in keywords:
py = re.sub(rf'Event\(r"{kw}"\)', kw, py)
--
2.41.0.162.gfafddb0af9-goog
* [PATCH v2 03/12] perf vendor metrics intel: Make transaction metrics conditional
From: Ian Rogers @ 2023-06-23 15:10 UTC
Make the transaction metrics conditional on the cycles-t event being
present. This event may not be present when TSX has been disabled.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/arch/x86/alderlake/adl-metrics.json | 8 ++++----
.../pmu-events/arch/x86/cascadelakex/clx-metrics.json | 8 ++++----
tools/perf/pmu-events/arch/x86/icelake/icl-metrics.json | 8 ++++----
tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json | 8 ++++----
.../pmu-events/arch/x86/sapphirerapids/spr-metrics.json | 8 ++++----
tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json | 8 ++++----
tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json | 8 ++++----
tools/perf/pmu-events/arch/x86/tigerlake/tgl-metrics.json | 8 ++++----
8 files changed, 32 insertions(+), 32 deletions(-)
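As a rough illustration of what the rewritten expressions compute, the Python sketch below mirrors two of them. It is a model only: has_event(cycles\-t) is approximated by checking whether a count for "cycles-t" is present in a dictionary, and the count values in the usage lines are invented.

```python
from typing import Mapping


def tsx_transactional_cycles(counts: Mapping[str, float]) -> float:
    """Model of: (cycles-t / cycles if has_event(cycles-t) else 0)."""
    if "cycles-t" not in counts:  # stands in for has_event() being 0
        return 0.0
    return counts["cycles-t"] / counts["cycles"]


def tsx_aborted_cycles(counts: Mapping[str, float]) -> float:
    """Model of: (max(cycles-t - cycles-ct, 0) / cycles if has_event(cycles-t) else 0)."""
    if "cycles-t" not in counts:
        return 0.0
    return max(counts["cycles-t"] - counts["cycles-ct"], 0) / counts["cycles"]


if __name__ == "__main__":
    # TSX disabled: only the plain cycles count is available.
    print(tsx_transactional_cycles({"cycles": 1000.0}))
    # TSX enabled: made-up counts for illustration.
    enabled = {"cycles": 1000.0, "cycles-t": 250.0, "cycles-ct": 200.0}
    print(tsx_transactional_cycles(enabled), tsx_aborted_cycles(enabled))
```

The whole division sits on the "if" side of the conditional, so with TSX disabled the metric evaluates to 0 instead of failing to parse its events.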
diff --git a/tools/perf/pmu-events/arch/x86/alderlake/adl-metrics.json b/tools/perf/pmu-events/arch/x86/alderlake/adl-metrics.json
index 85fb975b6f56..daf9458f0b77 100644
--- a/tools/perf/pmu-events/arch/x86/alderlake/adl-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/alderlake/adl-metrics.json
@@ -92,28 +92,28 @@
},
{
"BriefDescription": "Percentage of cycles in aborted transactions.",
- "MetricExpr": "max(cpu@cycles\\-t@ - cpu@cycles\\-ct@, 0) / cycles",
+ "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_aborted_cycles",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of elisions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@el\\-start@",
+ "MetricExpr": "(cycles\\-t / el\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_elision",
"ScaleUnit": "1cycles / elision"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of transactions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@tx\\-start@",
+ "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_transaction",
"ScaleUnit": "1cycles / transaction"
},
{
"BriefDescription": "Percentage of cycles within a transaction region.",
- "MetricExpr": "cpu@cycles\\-t@ / cycles",
+ "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_transactional_cycles",
"ScaleUnit": "100%"
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json b/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
index 0e2e446ced7a..fbb111e40829 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/clx-metrics.json
@@ -1830,28 +1830,28 @@
},
{
"BriefDescription": "Percentage of cycles in aborted transactions.",
- "MetricExpr": "max(cpu@cycles\\-t@ - cpu@cycles\\-ct@, 0) / cycles",
+ "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_aborted_cycles",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of elisions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@el\\-start@",
+ "MetricExpr": "(cycles\\-t / el\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_elision",
"ScaleUnit": "1cycles / elision"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of transactions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@tx\\-start@",
+ "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_transaction",
"ScaleUnit": "1cycles / transaction"
},
{
"BriefDescription": "Percentage of cycles within a transaction region.",
- "MetricExpr": "cpu@cycles\\-t@ / cycles",
+ "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_transactional_cycles",
"ScaleUnit": "100%"
diff --git a/tools/perf/pmu-events/arch/x86/icelake/icl-metrics.json b/tools/perf/pmu-events/arch/x86/icelake/icl-metrics.json
index cc4edf855064..8fcc05c4e0a1 100644
--- a/tools/perf/pmu-events/arch/x86/icelake/icl-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/icelake/icl-metrics.json
@@ -1516,28 +1516,28 @@
},
{
"BriefDescription": "Percentage of cycles in aborted transactions.",
- "MetricExpr": "max(cpu@cycles\\-t@ - cpu@cycles\\-ct@, 0) / cycles",
+ "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_aborted_cycles",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of elisions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@el\\-start@",
+ "MetricExpr": "(cycles\\-t / el\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_elision",
"ScaleUnit": "1cycles / elision"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of transactions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@tx\\-start@",
+ "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_transaction",
"ScaleUnit": "1cycles / transaction"
},
{
"BriefDescription": "Percentage of cycles within a transaction region.",
- "MetricExpr": "cpu@cycles\\-t@ / cycles",
+ "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_transactional_cycles",
"ScaleUnit": "100%"
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json b/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json
index 6f25b5b7aaf6..9bb7e3f20f7f 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json
@@ -1812,28 +1812,28 @@
},
{
"BriefDescription": "Percentage of cycles in aborted transactions.",
- "MetricExpr": "max(cpu@cycles\\-t@ - cpu@cycles\\-ct@, 0) / cycles",
+ "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_aborted_cycles",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of elisions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@el\\-start@",
+ "MetricExpr": "(cycles\\-t / el\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_elision",
"ScaleUnit": "1cycles / elision"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of transactions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@tx\\-start@",
+ "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_transaction",
"ScaleUnit": "1cycles / transaction"
},
{
"BriefDescription": "Percentage of cycles within a transaction region.",
- "MetricExpr": "cpu@cycles\\-t@ / cycles",
+ "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_transactional_cycles",
"ScaleUnit": "100%"
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
index c732982f70b5..c207c851a9f9 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/spr-metrics.json
@@ -1938,28 +1938,28 @@
},
{
"BriefDescription": "Percentage of cycles in aborted transactions.",
- "MetricExpr": "max(cpu@cycles\\-t@ - cpu@cycles\\-ct@, 0) / cycles",
+ "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_aborted_cycles",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of elisions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@el\\-start@",
+ "MetricExpr": "(cycles\\-t / el\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_elision",
"ScaleUnit": "1cycles / elision"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of transactions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@tx\\-start@",
+ "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_transaction",
"ScaleUnit": "1cycles / transaction"
},
{
"BriefDescription": "Percentage of cycles within a transaction region.",
- "MetricExpr": "cpu@cycles\\-t@ / cycles",
+ "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_transactional_cycles",
"ScaleUnit": "100%"
diff --git a/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json b/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
index 2ed88842b880..94cb38540b5a 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
@@ -1466,28 +1466,28 @@
},
{
"BriefDescription": "Percentage of cycles in aborted transactions.",
- "MetricExpr": "max(cpu@cycles\\-t@ - cpu@cycles\\-ct@, 0) / cycles",
+ "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_aborted_cycles",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of elisions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@el\\-start@",
+ "MetricExpr": "(cycles\\-t / el\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_elision",
"ScaleUnit": "1cycles / elision"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of transactions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@tx\\-start@",
+ "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_transaction",
"ScaleUnit": "1cycles / transaction"
},
{
"BriefDescription": "Percentage of cycles within a transaction region.",
- "MetricExpr": "cpu@cycles\\-t@ / cycles",
+ "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_transactional_cycles",
"ScaleUnit": "100%"
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json b/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
index 507d39efacc8..fa4209809c57 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
@@ -1774,28 +1774,28 @@
},
{
"BriefDescription": "Percentage of cycles in aborted transactions.",
- "MetricExpr": "max(cpu@cycles\\-t@ - cpu@cycles\\-ct@, 0) / cycles",
+ "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_aborted_cycles",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of elisions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@el\\-start@",
+ "MetricExpr": "(cycles\\-t / el\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_elision",
"ScaleUnit": "1cycles / elision"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of transactions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@tx\\-start@",
+ "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_transaction",
"ScaleUnit": "1cycles / transaction"
},
{
"BriefDescription": "Percentage of cycles within a transaction region.",
- "MetricExpr": "cpu@cycles\\-t@ / cycles",
+ "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_transactional_cycles",
"ScaleUnit": "100%"
diff --git a/tools/perf/pmu-events/arch/x86/tigerlake/tgl-metrics.json b/tools/perf/pmu-events/arch/x86/tigerlake/tgl-metrics.json
index 83346911aa63..c7c2d6ab1a93 100644
--- a/tools/perf/pmu-events/arch/x86/tigerlake/tgl-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/tigerlake/tgl-metrics.json
@@ -1530,28 +1530,28 @@
},
{
"BriefDescription": "Percentage of cycles in aborted transactions.",
- "MetricExpr": "max(cpu@cycles\\-t@ - cpu@cycles\\-ct@, 0) / cycles",
+ "MetricExpr": "(max(cycles\\-t - cycles\\-ct, 0) / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_aborted_cycles",
"ScaleUnit": "100%"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of elisions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@el\\-start@",
+ "MetricExpr": "(cycles\\-t / el\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_elision",
"ScaleUnit": "1cycles / elision"
},
{
"BriefDescription": "Number of cycles within a transaction divided by the number of transactions.",
- "MetricExpr": "cpu@cycles\\-t@ / cpu@tx\\-start@",
+ "MetricExpr": "(cycles\\-t / tx\\-start if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_cycles_per_transaction",
"ScaleUnit": "1cycles / transaction"
},
{
"BriefDescription": "Percentage of cycles within a transaction region.",
- "MetricExpr": "cpu@cycles\\-t@ / cycles",
+ "MetricExpr": "(cycles\\-t / cycles if has_event(cycles\\-t) else 0)",
"MetricGroup": "transaction",
"MetricName": "tsx_transactional_cycles",
"ScaleUnit": "100%"
--
2.41.0.162.gfafddb0af9-goog
* [PATCH v2 06/12] perf vendor events intel: Update cascadelakex to 1.19
From: Ian Rogers @ 2023-06-23 15:10 UTC
Updates were released in:
https://github.com/intel/perfmon/commit/e4f83534f23a69e6da55c672c4d929919688c9b6
Adds the events IDQ.DSB_CYCLES_OK, IDQ.DSB_CYCLES_ANY,
ICACHE_TAG.STALLS, DECODE.LCP, LSD.CYCLES_OK. Descriptions are also
updated.
Signed-off-by: Ian Rogers <irogers@google.com>
---
.../arch/x86/cascadelakex/frontend.json | 43 ++++++++++++++++---
.../arch/x86/cascadelakex/pipeline.json | 17 ++++++--
.../x86/cascadelakex/uncore-interconnect.json | 2 +-
.../arch/x86/cascadelakex/uncore-memory.json | 2 +-
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
5 files changed, 54 insertions(+), 12 deletions(-)
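Several of the added events are aliases: they repeat the encoding of an existing event under a new name. A small Python sketch for spotting such pairs is below. It assumes, as a simplification, that EventCode, UMask and CounterMask together identify an encoding (real event files carry further fields), and the embedded JSON is a trimmed copy of entries from this patch rather than a full file.

```python
import json
from collections import defaultdict

# Trimmed entries taken from the cascadelakex frontend.json hunks.
EVENTS_JSON = """
[
  {"EventName": "ICACHE_64B.IFTAG_STALL", "EventCode": "0x83", "UMask": "0x4"},
  {"EventName": "ICACHE_TAG.STALLS", "EventCode": "0x83", "UMask": "0x4"},
  {"EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS", "EventCode": "0x79",
   "UMask": "0x18", "CounterMask": "4"},
  {"EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS", "EventCode": "0x79",
   "UMask": "0x18", "CounterMask": "1"},
  {"EventName": "IDQ.DSB_CYCLES_ANY", "EventCode": "0x79",
   "UMask": "0x18", "CounterMask": "1"},
  {"EventName": "IDQ.DSB_CYCLES_OK", "EventCode": "0x79",
   "UMask": "0x18", "CounterMask": "4"}
]
"""


def alias_groups(events):
    """Group event names that share the same (simplified) encoding."""
    groups = defaultdict(list)
    for ev in events:
        key = (ev["EventCode"], ev["UMask"], ev.get("CounterMask", "0"))
        groups[key].append(ev["EventName"])
    # Keep only encodings used by more than one name.
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


if __name__ == "__main__":
    for names in alias_groups(json.loads(EVENTS_JSON)):
        print(" == ".join(names))
```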
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json b/tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json
index 04f08e4d2402..095904c77001 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/frontend.json
@@ -7,6 +7,14 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to ILD_STALL.LCP]",
+ "EventCode": "0x87",
+ "EventName": "DECODE.LCP",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to ILD_STALL.LCP]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switches",
"EventCode": "0xAB",
@@ -245,27 +253,34 @@
"UMask": "0x2"
},
{
- "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_TAG.STALLS]",
"EventCode": "0x83",
"EventName": "ICACHE_64B.IFTAG_STALL",
"SampleAfterValue": "200003",
"UMask": "0x4"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_64B.IFTAG_STALL]",
+ "EventCode": "0x83",
+ "EventName": "ICACHE_TAG.STALLS",
+ "SampleAfterValue": "200003",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.DSB_CYCLES_OK]",
"CounterMask": "4",
"EventCode": "0x79",
"EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
- "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ.",
+ "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_OK]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop",
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop [This event is alias to IDQ.DSB_CYCLES_ANY]",
"CounterMask": "1",
"EventCode": "0x79",
"EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS",
- "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ.",
+ "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_ANY]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
@@ -296,6 +311,24 @@
"SampleAfterValue": "2000003",
"UMask": "0x8"
},
+ {
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop [This event is alias to IDQ.ALL_DSB_CYCLES_ANY_UOPS]",
+ "CounterMask": "1",
+ "EventCode": "0x79",
+ "EventName": "IDQ.DSB_CYCLES_ANY",
+ "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_ANY_UOPS]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x18"
+ },
+ {
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "CounterMask": "4",
+ "EventCode": "0x79",
+ "EventName": "IDQ.DSB_CYCLES_OK",
+ "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x18"
+ },
{
"BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
"EventCode": "0x79",
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json b/tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json
index 31a1663d57f8..66d686cc933e 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/pipeline.json
@@ -361,10 +361,10 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to DECODE.LCP]",
"EventCode": "0x87",
"EventName": "ILD_STALL.LCP",
- "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk.",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to DECODE.LCP]",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
@@ -488,11 +488,11 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+ "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder. [This event is alias to LSD.CYCLES_OK]",
"CounterMask": "4",
"EventCode": "0xA8",
"EventName": "LSD.CYCLES_4_UOPS",
- "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector).",
+ "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector). [This event is alias to LSD.CYCLES_OK]",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
@@ -505,6 +505,15 @@
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder. [This event is alias to LSD.CYCLES_4_UOPS]",
+ "CounterMask": "4",
+ "EventCode": "0xA8",
+ "EventName": "LSD.CYCLES_OK",
+ "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector). [This event is alias to LSD.CYCLES_4_UOPS]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Number of Uops delivered by the LSD.",
"EventCode": "0xA8",
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-interconnect.json
index 725780fb3990..1a342dff1503 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-interconnect.json
@@ -6606,7 +6606,7 @@
"EventCode": "0x52",
"EventName": "UNC_M3UPI_RxC_HELD.PARALLEL_SUCCESS",
"PerPkg": "1",
- "PublicDescription": "ad and bl messages were actually slotted into the same flit in paralle",
+ "PublicDescription": "ad and bl messages were actually slotted into the same flit in parallel",
"UMask": "0x8",
"Unit": "M3UPI"
},
diff --git a/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-memory.json b/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-memory.json
index f761856d738e..d82d2cca6f0a 100644
--- a/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-memory.json
+++ b/tools/perf/pmu-events/arch/x86/cascadelakex/uncore-memory.json
@@ -2735,7 +2735,7 @@
"EventCode": "0x81",
"EventName": "UNC_M_WPQ_OCCUPANCY",
"PerPkg": "1",
- "PublicDescription": "Counts the number of entries in the Write Pending Queue (WPQ) at each cycle. This can then be used to calculate both the average queue occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations). The WPQ is used to schedule writes out to the memory controller and to track the requests. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC (memory controller). They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon they have 'posted' to the iMC. This is not to be confused with actually performing the write to DRAM. Therefore, the average latency for this queue is actually not useful for deconstruction intermediate write latencies. So, we provide filtering based on if the request has posted or not. By using the 'not posted' filter, we can track how long writes spent in the iMC before completions were sent to the HA. The 'posted' filter, on the other hand, provides information about how much queueing is actually happening in the iMC for writes before they are actually issued to memory. High average occupancies will generally coincide with high write major mode counts. Is there a filter of sorts?",
+ "PublicDescription": "Counts the number of entries in the Write Pending Queue (WPQ) at each cycle. This can then be used to calculate both the average queue occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations). The WPQ is used to schedule writes out to the memory controller and to track the requests. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC (memory controller). They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon they have 'posted' to the iMC. This is not to be confused with actually performing the write to DRAM. Therefore, the average latency for this queue is actually not useful for deconstruction intermediate write latencies. So, we provide filtering based on if the request has posted or not. By using the 'not posted' filter, we can track how long writes spent in the iMC before completions were sent to the HA. The 'posted' filter, on the other hand, provides information about how much queueing is actually happening in the iMC for writes before they are actually issued to memory. High average occupancies will generally coincide with high write major mode counts.",
"Unit": "iMC"
},
{
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index de4832bddf05..eccc7ef98870 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -5,7 +5,7 @@ GenuineIntel-6-(1C|26|27|35|36),v4,bonnell,core
GenuineIntel-6-(3D|47),v28,broadwell,core
GenuineIntel-6-56,v10,broadwellde,core
GenuineIntel-6-4F,v21,broadwellx,core
-GenuineIntel-6-55-[56789ABCDEF],v1.18,cascadelakex,core
+GenuineIntel-6-55-[56789ABCDEF],v1.19,cascadelakex,core
GenuineIntel-6-9[6C],v1.04,elkhartlake,core
GenuineIntel-6-5[CF],v13,goldmont,core
GenuineIntel-6-7A,v1.01,goldmontplus,core
--
2.41.0.162.gfafddb0af9-goog
* [PATCH v2 07/12] perf vendor events intel: Update icelake to 1.19
2023-06-23 15:10 [PATCH v2 00/12] Add metric has_event, update intel vendor events Ian Rogers
` (3 preceding siblings ...)
2023-06-23 15:10 ` [PATCH v2 06/12] perf vendor events intel: Update cascadelakex to 1.19 Ian Rogers
@ 2023-06-23 15:10 ` Ian Rogers
2023-06-23 16:04 ` Vince Weaver
2023-06-23 15:10 ` [PATCH v2 08/12] perf vendor events intel: Update icelakex to 1.21 Ian Rogers
` (5 subsequent siblings)
10 siblings, 1 reply; 15+ messages in thread
From: Ian Rogers @ 2023-06-23 15:10 UTC (permalink / raw)
Updates were released in:
https://github.com/intel/perfmon/commit/f3d841189f8964bc240c86301f4c849845630b5b
Deprecates L2_RQSTS.MISS and L2_RQSTS.REFERENCES and updates event
descriptions. Adds the events ICACHE_DATA.STALLS, ICACHE_TAG.STALLS and
DECODE.LCP, each an alias of an existing event.
Signed-off-by: Ian Rogers <irogers@google.com>
---
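The alias pairs in this update (for example DECODE.LCP and ILD_STALL.LCP) carry identical encoding fields and differ only in name and description. As a rough illustration, and not part of the perfmon conversion script, the Python sketch below groups events by encoding to list such aliases. The helper name `find_aliases` and the choice of key fields are assumptions made for this example; real event files have further fields (edge detect, invert, MSR values and so on) that a complete check would also need to compare. The sample entries are abridged from the icelake hunks in this patch.

```python
import json
from collections import defaultdict

# Abridged copies of entries touched by this patch (descriptions omitted).
EVENTS_JSON = """
[
  {"EventCode": "0x87", "EventName": "DECODE.LCP", "UMask": "0x1"},
  {"EventCode": "0x87", "EventName": "ILD_STALL.LCP", "UMask": "0x1"},
  {"EventCode": "0x80", "EventName": "ICACHE_16B.IFDATA_STALL", "UMask": "0x4"},
  {"EventCode": "0x80", "EventName": "ICACHE_DATA.STALLS", "UMask": "0x4"},
  {"EventCode": "0x24", "EventName": "L2_RQSTS.MISS", "Deprecated": "1", "UMask": "0x3f"}
]
"""


def find_aliases(events):
    """Group event names that share EventCode, UMask and CounterMask.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for ev in events:
        # CounterMask is part of the key because events can share an
        # event code and umask yet differ in counter mask.
        key = (ev.get("EventCode"), ev.get("UMask"), ev.get("CounterMask", "0"))
        groups[key].append(ev["EventName"])
    # Keep only encodings that are used by more than one event name.
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


aliases = find_aliases(json.loads(EVENTS_JSON))
for names in aliases:
    print(" == ".join(names))
```

Keying on the counter mask matters for files such as cascadelakex/frontend.json, where IDQ.ALL_DSB_CYCLES_4_UOPS and IDQ.ALL_DSB_CYCLES_ANY_UOPS share event code 0x79 and umask 0x18 but are not aliases of each other.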
.../pmu-events/arch/x86/icelake/cache.json | 8 ++---
.../pmu-events/arch/x86/icelake/frontend.json | 32 ++++++++++++++++---
.../pmu-events/arch/x86/icelake/pipeline.json | 6 ++--
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
4 files changed, 36 insertions(+), 12 deletions(-)
diff --git a/tools/perf/pmu-events/arch/x86/icelake/cache.json b/tools/perf/pmu-events/arch/x86/icelake/cache.json
index 79b9f02a4b63..d26c4efe35f0 100644
--- a/tools/perf/pmu-events/arch/x86/icelake/cache.json
+++ b/tools/perf/pmu-events/arch/x86/icelake/cache.json
@@ -155,18 +155,18 @@
"UMask": "0x21"
},
{
- "BriefDescription": "All requests that miss L2 cache. This event is not supported on ICL and ICX products, only supported on RKL products.",
+ "BriefDescription": "This event is deprecated.",
+ "Deprecated": "1",
"EventCode": "0x24",
"EventName": "L2_RQSTS.MISS",
- "PublicDescription": "Counts all requests that miss L2 cache. This event is not supported on ICL and ICX products, only supported on RKL products.",
"SampleAfterValue": "200003",
"UMask": "0x3f"
},
{
- "BriefDescription": "All L2 requests. This event is not supported on ICL and ICX products, only supported on RKL products.",
+ "BriefDescription": "This event is deprecated.",
+ "Deprecated": "1",
"EventCode": "0x24",
"EventName": "L2_RQSTS.REFERENCES",
- "PublicDescription": "Counts all L2 requests. This event is not supported on ICL and ICX products, only supported on RKL products.",
"SampleAfterValue": "200003",
"UMask": "0xff"
},
diff --git a/tools/perf/pmu-events/arch/x86/icelake/frontend.json b/tools/perf/pmu-events/arch/x86/icelake/frontend.json
index 3e3d2b002170..2b539a08d2bf 100644
--- a/tools/perf/pmu-events/arch/x86/icelake/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/icelake/frontend.json
@@ -7,6 +7,14 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to ILD_STALL.LCP]",
+ "EventCode": "0x87",
+ "EventName": "DECODE.LCP",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to ILD_STALL.LCP]",
+ "SampleAfterValue": "500009",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Decode Stream Buffer (DSB)-to-MITE transitions count.",
"CounterMask": "1",
@@ -213,10 +221,10 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss.",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss. [This event is alias to ICACHE_DATA.STALLS]",
"EventCode": "0x80",
"EventName": "ICACHE_16B.IFDATA_STALL",
- "PublicDescription": "Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity.",
+ "PublicDescription": "Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity. [This event is alias to ICACHE_DATA.STALLS]",
"SampleAfterValue": "500009",
"UMask": "0x4"
},
@@ -237,10 +245,26 @@
"UMask": "0x2"
},
{
- "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_TAG.STALLS]",
"EventCode": "0x83",
"EventName": "ICACHE_64B.IFTAG_STALL",
- "PublicDescription": "Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss.",
+ "PublicDescription": "Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_TAG.STALLS]",
+ "SampleAfterValue": "200003",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss. [This event is alias to ICACHE_16B.IFDATA_STALL]",
+ "EventCode": "0x80",
+ "EventName": "ICACHE_DATA.STALLS",
+ "PublicDescription": "Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity. [This event is alias to ICACHE_16B.IFDATA_STALL]",
+ "SampleAfterValue": "500009",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_64B.IFTAG_STALL]",
+ "EventCode": "0x83",
+ "EventName": "ICACHE_TAG.STALLS",
+ "PublicDescription": "Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_64B.IFTAG_STALL]",
"SampleAfterValue": "200003",
"UMask": "0x4"
},
diff --git a/tools/perf/pmu-events/arch/x86/icelake/pipeline.json b/tools/perf/pmu-events/arch/x86/icelake/pipeline.json
index 154fee4b60fb..375b78044f14 100644
--- a/tools/perf/pmu-events/arch/x86/icelake/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/icelake/pipeline.json
@@ -318,10 +318,10 @@
"UMask": "0x40"
},
{
- "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to DECODE.LCP]",
"EventCode": "0x87",
"EventName": "ILD_STALL.LCP",
- "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk.",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to DECODE.LCP]",
"SampleAfterValue": "500009",
"UMask": "0x1"
},
@@ -556,7 +556,7 @@
"BriefDescription": "TMA slots wasted due to incorrect speculation by branch mispredictions",
"EventCode": "0xa4",
"EventName": "TOPDOWN.BR_MISPREDICT_SLOTS",
- "PublicDescription": "Number of TMA slots that were wasted due to incorrect speculation by branch mispredictions. This event estimates number of operations that were issued but not retired from the specualtive path as well as the out-of-order engine recovery past a branch misprediction.",
+ "PublicDescription": "Number of TMA slots that were wasted due to incorrect speculation by branch mispredictions. This event estimates number of operations that were issued but not retired from the speculative path as well as the out-of-order engine recovery past a branch misprediction.",
"SampleAfterValue": "10000003",
"UMask": "0x8"
},
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index eccc7ef98870..d63c9df8f65d 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -13,7 +13,7 @@ GenuineIntel-6-B6,v1.00,grandridge,core
GenuineIntel-6-A[DE],v1.01,graniterapids,core
GenuineIntel-6-(3C|45|46),v33,haswell,core
GenuineIntel-6-3F,v27,haswellx,core
-GenuineIntel-6-7[DE],v1.18,icelake,core
+GenuineIntel-6-7[DE],v1.19,icelake,core
GenuineIntel-6-6[AC],v1.20,icelakex,core
GenuineIntel-6-3A,v24,ivybridge,core
GenuineIntel-6-3E,v23,ivytown,core
--
2.41.0.162.gfafddb0af9-goog
* [PATCH v2 08/12] perf vendor events intel: Update icelakex to 1.21
2023-06-23 15:10 [PATCH v2 00/12] Add metric has_event, update intel vendor events Ian Rogers
` (4 preceding siblings ...)
2023-06-23 15:10 ` [PATCH v2 07/12] perf vendor events intel: Update icelake " Ian Rogers
@ 2023-06-23 15:10 ` Ian Rogers
2023-06-23 15:10 ` [PATCH v2 09/12] perf vendor events intel: Update sapphirerapids to 1.14 Ian Rogers
` (4 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2023-06-23 15:10 UTC (permalink / raw)
Updates were released in:
https://github.com/intel/perfmon/commit/78d47cbbae48a0297a507ae4fea234ff37ff9960
Adds the events ICACHE_DATA.STALLS, ICACHE_TAG.STALLS and
DECODE.LCP. Descriptions are also updated.
Signed-off-by: Ian Rogers <irogers@google.com>
---
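Each mapfile.csv row touched by this series pairs a CPUID pattern with an event version and directory. The Python sketch below shows how such a row could be resolved for a given CPU. It is a simplification under stated assumptions: the first column is treated as a regular expression that must match the whole "vendor-family-model" string, and the function name `lookup` is made up for this example. perf's own matching is done in C and also has to deal with stepping, which is not modelled here.

```python
import csv
import io
import re

# Rows copied from mapfile.csv as it stands after this patch.
MAPFILE = """\
GenuineIntel-6-7[DE],v1.19,icelake,core
GenuineIntel-6-6[AC],v1.21,icelakex,core
GenuineIntel-6-3A,v24,ivybridge,core
"""


def lookup(cpuid, mapfile_text=MAPFILE):
    """Return (version, directory) for the first row matching cpuid, else None.

    Hypothetical helper for illustration only.
    """
    for pattern, version, directory, _event_type in csv.reader(io.StringIO(mapfile_text)):
        # Assumption: the pattern must match the entire cpuid string.
        if re.fullmatch(pattern, cpuid):
            return version, directory
    return None


print(lookup("GenuineIntel-6-6A"))
```

With the rows above, model 6A resolves to the icelakex directory at the version this patch sets.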
.../arch/x86/icelakex/frontend.json | 32 ++++++++++++++++---
.../arch/x86/icelakex/pipeline.json | 4 +--
.../x86/icelakex/uncore-interconnect.json | 2 +-
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
4 files changed, 32 insertions(+), 8 deletions(-)
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/frontend.json b/tools/perf/pmu-events/arch/x86/icelakex/frontend.json
index 71498044f1cb..f6edc4222f42 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/frontend.json
@@ -7,6 +7,14 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to ILD_STALL.LCP]",
+ "EventCode": "0x87",
+ "EventName": "DECODE.LCP",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to ILD_STALL.LCP]",
+ "SampleAfterValue": "500009",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Decode Stream Buffer (DSB)-to-MITE transitions count.",
"CounterMask": "1",
@@ -213,10 +221,10 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss.",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss. [This event is alias to ICACHE_DATA.STALLS]",
"EventCode": "0x80",
"EventName": "ICACHE_16B.IFDATA_STALL",
- "PublicDescription": "Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity.",
+ "PublicDescription": "Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity. [This event is alias to ICACHE_DATA.STALLS]",
"SampleAfterValue": "500009",
"UMask": "0x4"
},
@@ -237,10 +245,26 @@
"UMask": "0x2"
},
{
- "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_TAG.STALLS]",
"EventCode": "0x83",
"EventName": "ICACHE_64B.IFTAG_STALL",
- "PublicDescription": "Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss.",
+ "PublicDescription": "Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_TAG.STALLS]",
+ "SampleAfterValue": "200003",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss. [This event is alias to ICACHE_16B.IFDATA_STALL]",
+ "EventCode": "0x80",
+ "EventName": "ICACHE_DATA.STALLS",
+ "PublicDescription": "Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity. [This event is alias to ICACHE_16B.IFDATA_STALL]",
+ "SampleAfterValue": "500009",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_64B.IFTAG_STALL]",
+ "EventCode": "0x83",
+ "EventName": "ICACHE_TAG.STALLS",
+ "PublicDescription": "Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_64B.IFTAG_STALL]",
"SampleAfterValue": "200003",
"UMask": "0x4"
},
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/pipeline.json b/tools/perf/pmu-events/arch/x86/icelakex/pipeline.json
index 442a4c7539dd..176e5ef2a24a 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/pipeline.json
@@ -318,10 +318,10 @@
"UMask": "0x40"
},
{
- "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to DECODE.LCP]",
"EventCode": "0x87",
"EventName": "ILD_STALL.LCP",
- "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk.",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to DECODE.LCP]",
"SampleAfterValue": "500009",
"UMask": "0x1"
},
diff --git a/tools/perf/pmu-events/arch/x86/icelakex/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/icelakex/uncore-interconnect.json
index 8ac5907762e1..f87ea3f66d1b 100644
--- a/tools/perf/pmu-events/arch/x86/icelakex/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/icelakex/uncore-interconnect.json
@@ -9311,7 +9311,7 @@
"EventCode": "0x50",
"EventName": "UNC_M3UPI_RxC_HELD.PARALLEL_SUCCESS",
"PerPkg": "1",
- "PublicDescription": "Message Held : Parallel Success : ad and bl messages were actually slotted into the same flit in paralle",
+ "PublicDescription": "Message Held : Parallel Success : ad and bl messages were actually slotted into the same flit in parallel",
"UMask": "0x8",
"Unit": "M3UPI"
},
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index d63c9df8f65d..98828c3a9cde 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -14,7 +14,7 @@ GenuineIntel-6-A[DE],v1.01,graniterapids,core
GenuineIntel-6-(3C|45|46),v33,haswell,core
GenuineIntel-6-3F,v27,haswellx,core
GenuineIntel-6-7[DE],v1.19,icelake,core
-GenuineIntel-6-6[AC],v1.20,icelakex,core
+GenuineIntel-6-6[AC],v1.21,icelakex,core
GenuineIntel-6-3A,v24,ivybridge,core
GenuineIntel-6-3E,v23,ivytown,core
GenuineIntel-6-2D,v23,jaketown,core
--
2.41.0.162.gfafddb0af9-goog
* [PATCH v2 09/12] perf vendor events intel: Update sapphirerapids to 1.14
2023-06-23 15:10 [PATCH v2 00/12] Add metric has_event, update intel vendor events Ian Rogers
` (5 preceding siblings ...)
2023-06-23 15:10 ` [PATCH v2 08/12] perf vendor events intel: Update icelakex to 1.21 Ian Rogers
@ 2023-06-23 15:10 ` Ian Rogers
2023-06-23 15:10 ` [PATCH v2 10/12] perf vendor events intel: Update skylake to 57 Ian Rogers
` (3 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2023-06-23 15:10 UTC (permalink / raw)
Updates were released in:
https://github.com/intel/perfmon/commit/a84850f1fec633002c35838ed34e51e1f0d6a2dd
Adds a large number of CXL events, including:
UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO_CXL_ACC,
UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_CXL_ACC,
UNC_CHA_TOR_INSERTS.IA_MISS_DRD_CXL_ACC,
UNC_CHA_TOR_INSERTS.IA_MISS_RFO_CXL_ACC,
UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFRFO_CXL_ACC,
UNC_CHA_TOR_INSERTS.IA_MISS_RFO_PREF_CXL_ACC,
UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PREF_CXL_ACC,
UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFDATA_CXL_ACC,
UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_PREF_CXL_ACC,
UNC_CHA_TOR_OCCUPANCY.IA_MISS_LLCPREFDATA_CXL_ACC,
UNC_CHA_TOR_OCCUPANCY.IA_MISS_LLCPREFRFO_CXL_ACC,
UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO_PREF_CXL_ACC,
UNC_CHA_TOR_INSERTS.IA_MISS_CXL_ACC,
UNC_CHA_TOR_INSERTS.IA_HIT_CXL_ACC,
UNC_CHA_TOR_OCCUPANCY.IA_MISS_CXL_ACC,
UNC_CHA_TOR_OCCUPANCY.IA_HIT_CXL_ACC.
Signed-off-by: Ian Rogers <irogers@google.com>
---
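The new CXL events come in pairs whose UMask values are nearly identical. As an aid to reviewing them, the Python sketch below XORs the umasks copied from this patch to show which bit positions differ between an event and its _LOCAL variant, and between the HIT and MISS forms. This is plain arithmetic on the values in the diff; the patch does not describe what the individual bits mean, so no interpretation is attempted. The helper name `differing_bits` is an assumption of this example.

```python
# UMask values copied from the UNC_CHA_TOR_INSERTS entries in this patch.
UMASKS = {
    "IA_HIT_CXL_ACC": 0x10C0018101,
    "IA_HIT_CXL_ACC_LOCAL": 0x10C0008101,
    "IA_MISS_CXL_ACC": 0x10C0018201,
    "IA_MISS_CXL_ACC_LOCAL": 0x10C0008201,
    "IA_MISS_DRD_CXL_ACC": 0x10C8178201,
    "IA_MISS_DRD_CXL_ACC_LOCAL": 0x10C8168201,
}


def differing_bits(a, b):
    """Return the ascending bit positions at which the umasks of a and b differ.

    Hypothetical helper for illustration only.
    """
    diff = UMASKS[a] ^ UMASKS[b]
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


for name in ("IA_HIT_CXL_ACC", "IA_MISS_CXL_ACC", "IA_MISS_DRD_CXL_ACC"):
    print(name, "vs _LOCAL:", differing_bits(name, name + "_LOCAL"))
print("HIT vs MISS:", differing_bits("IA_HIT_CXL_ACC", "IA_MISS_CXL_ACC"))
```

For the three pairs listed, the _LOCAL variant differs from its base event in a single bit, which makes a mistyped umask in a regenerated file easy to spot.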
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
.../arch/x86/sapphirerapids/pipeline.json | 2 +-
.../arch/x86/sapphirerapids/uncore-cache.json | 308 ++++++++++++++++++
.../sapphirerapids/uncore-interconnect.json | 2 +-
4 files changed, 311 insertions(+), 3 deletions(-)
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 98828c3a9cde..f321b2cd83da 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -24,7 +24,7 @@ GenuineIntel-6-1[AEF],v3,nehalemep,core
GenuineIntel-6-2E,v3,nehalemex,core
GenuineIntel-6-A7,v1.01,rocketlake,core
GenuineIntel-6-2A,v19,sandybridge,core
-GenuineIntel-6-(8F|CF),v1.13,sapphirerapids,core
+GenuineIntel-6-(8F|CF),v1.14,sapphirerapids,core
GenuineIntel-6-AF,v1.00,sierraforest,core
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v56,skylake,core
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/pipeline.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/pipeline.json
index 72e9bdfa9f80..6dcf3b763af4 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/pipeline.json
@@ -706,7 +706,7 @@
"BriefDescription": "TMA slots wasted due to incorrect speculation by branch mispredictions",
"EventCode": "0xa4",
"EventName": "TOPDOWN.BR_MISPREDICT_SLOTS",
- "PublicDescription": "Number of TMA slots that were wasted due to incorrect speculation by (any type of) branch mispredictions. This event estimates number of specualtive operations that were issued but not retired as well as the out-of-order engine recovery past a branch misprediction.",
+ "PublicDescription": "Number of TMA slots that were wasted due to incorrect speculation by (any type of) branch mispredictions. This event estimates number of speculative operations that were issued but not retired as well as the out-of-order engine recovery past a branch misprediction.",
"SampleAfterValue": "10000003",
"UMask": "0x8"
},
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-cache.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-cache.json
index b91cebf81f50..3fa660694bc7 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-cache.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-cache.json
@@ -3156,6 +3156,23 @@
"UMask": "0xc88ffd01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "All requests issued from IA cores to CXL accelerator memory regions that hit the LLC.",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c0018101",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_HIT_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_HIT_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c0008101",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts; DRd hits from local IA",
"EventCode": "0x35",
@@ -3371,6 +3388,23 @@
"UMask": "0xc80f7e01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "All requests issued from IA cores to CXL accelerator memory regions that miss the LLC.",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c0018201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_MISS_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c0008201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts for DRd misses from local IA",
"EventCode": "0x35",
@@ -3397,6 +3431,23 @@
"UMask": "0xc837fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "DRds issued from an IA core which miss the L3 and target memory in a CXL type 2 memory expander card.",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c8178201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8168201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts for DRds issued by IA Cores targeting DDR Mem that Missed the LLC",
"EventCode": "0x35",
@@ -3442,6 +3493,15 @@
"UMask": "0xc827fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_OPT_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_OPT_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8268201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts; DRd Opt Pref misses from local IA",
"EventCode": "0x35",
@@ -3451,6 +3511,15 @@
"UMask": "0xc8a7fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_OPT_PREF_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_OPT_PREF_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8a68201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts for DRds issued by iA Cores targeting PMM Mem that Missed the LLC",
"EventCode": "0x35",
@@ -3469,6 +3538,23 @@
"UMask": "0xc897fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "L2 data prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PREF_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c8978201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PREF_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PREF_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8968201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts : DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC",
"EventCode": "0x35",
@@ -3603,6 +3689,23 @@
"UMask": "0xccd7fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "LLC data prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFDATA_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10ccd78201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFDATA_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFDATA_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10ccd68201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts; LLCPrefRFO misses from local IA",
"EventCode": "0x35",
@@ -3612,6 +3715,23 @@
"UMask": "0xccc7fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "L2 RFO prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFRFO_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c8878201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFRFO_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFRFO_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8868201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts : WCiLFs issued by iA Cores targeting DDR that missed the LLC - HOMed locally",
"EventCode": "0x35",
@@ -3701,6 +3821,23 @@
"UMask": "0x10c8038201",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "RFOs issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c8078201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8068201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts RFO misses from local IA",
"EventCode": "0x35",
@@ -3719,6 +3856,23 @@
"UMask": "0xc887fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "LLC RFO prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO_PREF_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10ccc78201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO_PREF_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_INSERTS.IA_MISS_RFO_PREF_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10ccc68201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Inserts; RFO prefetch misses from local IA",
"EventCode": "0x35",
@@ -4427,6 +4581,23 @@
"UMask": "0xc88ffd01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for All requests issued from IA cores to CXL accelerator memory regions that hit the LLC.",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c0018101",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_HIT_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c0008101",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy; DRd hits from local IA",
"EventCode": "0x36",
@@ -4644,6 +4815,23 @@
"UMask": "0xc80f7e01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for All requests issued from IA cores to CXL accelerator memory regions that miss the LLC.",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c0018201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c0008201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy for DRd misses from local IA",
"EventCode": "0x36",
@@ -4672,6 +4860,23 @@
"UMask": "0xc837fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for DRds and equivalent opcodes issued from an IA core which miss the L3 and target memory in a CXL type 2 memory expander card.",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c8178201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8168201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy for DRds issued by iA Cores targeting DDR Mem that Missed the LLC",
"EventCode": "0x36",
@@ -4717,6 +4922,15 @@
"UMask": "0xc827fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_OPT_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_OPT_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8268201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy; DRd Opt Pref misses from local IA",
"EventCode": "0x36",
@@ -4726,6 +4940,15 @@
"UMask": "0xc8a7fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_OPT_PREF_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_OPT_PREF_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8a68201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy for DRds issued by iA Cores targeting PMM Mem that Missed the LLC",
"EventCode": "0x36",
@@ -4744,6 +4967,23 @@
"UMask": "0xc897fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for L2 data prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_PREF_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c8978201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_PREF_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_PREF_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8968201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy : DRd_Prefs issued by iA Cores targeting DDR Mem that Missed the LLC",
"EventCode": "0x36",
@@ -4878,6 +5118,23 @@
"UMask": "0xccd7fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for LLC data prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LLCPREFDATA_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10ccd78201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LLCPREFDATA_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LLCPREFDATA_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10ccd68201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy; LLCPrefRFO misses from local IA",
"EventCode": "0x36",
@@ -4887,6 +5144,23 @@
"UMask": "0xccc7fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for L2 RFO prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LLCPREFRFO_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c8878201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LLCPREFRFO_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_LLCPREFRFO_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8868201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy : WCiLFs issued by iA Cores targeting DDR that missed the LLC - HOMed locally",
"EventCode": "0x36",
@@ -4976,6 +5250,23 @@
"UMask": "0x10c8038201",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for RFOs issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10c8078201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10c8068201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy; RFO misses from local IA",
"EventCode": "0x36",
@@ -4994,6 +5285,23 @@
"UMask": "0xc887fe01",
"Unit": "CHA"
},
+ {
+ "BriefDescription": "TOR Occupancy for LLC RFO prefetches issued from an IA core which miss the L3 and target memory in a CXL type 2 accelerator.",
+ "EventCode": "0x36",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO_PREF_CXL_ACC",
+ "PerPkg": "1",
+ "UMask": "0x10ccc78201",
+ "Unit": "CHA"
+ },
+ {
+ "BriefDescription": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO_PREF_CXL_ACC_LOCAL",
+ "EventCode": "0x35",
+ "EventName": "UNC_CHA_TOR_OCCUPANCY.IA_MISS_RFO_PREF_CXL_ACC_LOCAL",
+ "PerPkg": "1",
+ "PortMask": "0x000",
+ "UMask": "0x10ccc68201",
+ "Unit": "CHA"
+ },
{
"BriefDescription": "TOR Occupancy; RFO prefetch misses from local IA",
"EventCode": "0x36",
diff --git a/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-interconnect.json
index 6800de05c836..09d840c7da4c 100644
--- a/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/sapphirerapids/uncore-interconnect.json
@@ -3326,7 +3326,7 @@
"EventCode": "0x50",
"EventName": "UNC_M3UPI_RxC_HELD.PARALLEL_SUCCESS",
"PerPkg": "1",
- "PublicDescription": "Message Held : Parallel Success : ad and bl messages were actually slotted into the same flit in paralle",
+ "PublicDescription": "Message Held : Parallel Success : ad and bl messages were actually slotted into the same flit in parallel",
"UMask": "0x8",
"Unit": "M3UPI"
},
--
2.41.0.162.gfafddb0af9-goog
* [PATCH v2 10/12] perf vendor events intel: Update skylake to 57
From: Ian Rogers @ 2023-06-23 15:10 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, Kan Liang, Zhengjun Xing, Kajol Jain,
John Garry, Andrii Nakryiko, Eduard Zingerman, Jing Zhang,
Sohom Datta, linux-kernel, linux-perf-users, Perry Taylor,
Samantha Alt, Caleb Biggers, Weilin Wang, Edward Baker
Updates were released in:
https://github.com/intel/perfmon/commit/1c3042c13bbfea05abe1ebb6910ae58b2172e9ef
Adds the events IDQ.DSB_CYCLES_OK, IDQ.DSB_CYCLES_ANY,
ICACHE_TAG.STALLS, DECODE.LCP, LSD.CYCLES_OK. Descriptions are also
updated.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
.../pmu-events/arch/x86/skylake/frontend.json | 43 ++++++++++++++++---
.../pmu-events/arch/x86/skylake/pipeline.json | 17 ++++++--
3 files changed, 52 insertions(+), 10 deletions(-)
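A note for reviewers: each new event in this patch is an alias of an existing one, and the descriptions cross-reference each other with "[This event is alias to ...]". The sketch below is one way to check that every alias pair points back at each other and shares the same EventCode, UMask and CounterMask. It is not part of the perf tree. The helper name alias_problems and the abridged SAMPLE list are assumptions made for illustration; the two sample entries are copied from the pipeline.json hunk in this patch with PublicDescription left out.

```python
import re

# Matches the cross-reference text used in the descriptions of this patch.
ALIAS_RE = re.compile(r"\[This event is alias to ([A-Z0-9_.]+)\]")

# Fields that must agree for two names to select the same hardware event.
ENCODING_KEYS = ("EventCode", "UMask", "CounterMask")


def alias_problems(events):
    """Return a list of human-readable problems found in alias pairs.

    `events` is a list of dicts shaped like the entries in the perf
    pmu-events JSON files (hypothetical helper, not a perf API).
    """
    by_name = {event["EventName"]: event for event in events}
    problems = []
    for name, event in by_name.items():
        match = ALIAS_RE.search(event.get("BriefDescription", ""))
        if match is None:
            continue  # not an alias
        target_name = match.group(1)
        target = by_name.get(target_name)
        if target is None:
            problems.append(f"{name}: alias target {target_name} not found")
            continue
        back = ALIAS_RE.search(target.get("BriefDescription", ""))
        if back is None or back.group(1) != name:
            problems.append(f"{name}: {target_name} does not point back")
        for key in ENCODING_KEYS:
            if event.get(key) != target.get(key):
                problems.append(f"{name}: {key} differs from {target_name}")
    return problems


# Abridged copies of the two LSD entries touched by this patch.
SAMPLE = [
    {
        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder. [This event is alias to LSD.CYCLES_OK]",
        "CounterMask": "4",
        "EventCode": "0xA8",
        "EventName": "LSD.CYCLES_4_UOPS",
        "SampleAfterValue": "2000003",
        "UMask": "0x1",
    },
    {
        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder. [This event is alias to LSD.CYCLES_4_UOPS]",
        "CounterMask": "4",
        "EventCode": "0xA8",
        "EventName": "LSD.CYCLES_OK",
        "SampleAfterValue": "2000003",
        "UMask": "0x1",
    },
]

if __name__ == "__main__":
    print(alias_problems(SAMPLE))
```

When running this against the real files, note that DECODE.LCP is added to frontend.json while its alias ILD_STALL.LCP lives in pipeline.json, so the events of both files would need to be loaded into one list before calling the helper.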
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index f321b2cd83da..5104b93d57ab 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -27,7 +27,7 @@ GenuineIntel-6-2A,v19,sandybridge,core
GenuineIntel-6-(8F|CF),v1.14,sapphirerapids,core
GenuineIntel-6-AF,v1.00,sierraforest,core
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
-GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v56,skylake,core
+GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v57,skylake,core
GenuineIntel-6-55-[01234],v1.30,skylakex,core
GenuineIntel-6-86,v1.21,snowridgex,core
GenuineIntel-6-8[CD],v1.12,tigerlake,core
diff --git a/tools/perf/pmu-events/arch/x86/skylake/frontend.json b/tools/perf/pmu-events/arch/x86/skylake/frontend.json
index 04f08e4d2402..095904c77001 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/frontend.json
@@ -7,6 +7,14 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to ILD_STALL.LCP]",
+ "EventCode": "0x87",
+ "EventName": "DECODE.LCP",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to ILD_STALL.LCP]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switches",
"EventCode": "0xAB",
@@ -245,27 +253,34 @@
"UMask": "0x2"
},
{
- "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_TAG.STALLS]",
"EventCode": "0x83",
"EventName": "ICACHE_64B.IFTAG_STALL",
"SampleAfterValue": "200003",
"UMask": "0x4"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_64B.IFTAG_STALL]",
+ "EventCode": "0x83",
+ "EventName": "ICACHE_TAG.STALLS",
+ "SampleAfterValue": "200003",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.DSB_CYCLES_OK]",
"CounterMask": "4",
"EventCode": "0x79",
"EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
- "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ.",
+ "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_OK]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop",
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop [This event is alias to IDQ.DSB_CYCLES_ANY]",
"CounterMask": "1",
"EventCode": "0x79",
"EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS",
- "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ.",
+ "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_ANY]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
@@ -296,6 +311,24 @@
"SampleAfterValue": "2000003",
"UMask": "0x8"
},
+ {
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop [This event is alias to IDQ.ALL_DSB_CYCLES_ANY_UOPS]",
+ "CounterMask": "1",
+ "EventCode": "0x79",
+ "EventName": "IDQ.DSB_CYCLES_ANY",
+ "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_ANY_UOPS]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x18"
+ },
+ {
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "CounterMask": "4",
+ "EventCode": "0x79",
+ "EventName": "IDQ.DSB_CYCLES_OK",
+ "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x18"
+ },
{
"BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
"EventCode": "0x79",
diff --git a/tools/perf/pmu-events/arch/x86/skylake/pipeline.json b/tools/perf/pmu-events/arch/x86/skylake/pipeline.json
index cc800fb8180a..cd3e737bf4a1 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/pipeline.json
@@ -352,10 +352,10 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to DECODE.LCP]",
"EventCode": "0x87",
"EventName": "ILD_STALL.LCP",
- "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk.",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to DECODE.LCP]",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
@@ -479,11 +479,11 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+ "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder. [This event is alias to LSD.CYCLES_OK]",
"CounterMask": "4",
"EventCode": "0xA8",
"EventName": "LSD.CYCLES_4_UOPS",
- "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector).",
+ "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector). [This event is alias to LSD.CYCLES_OK]",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
@@ -496,6 +496,15 @@
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder. [This event is alias to LSD.CYCLES_4_UOPS]",
+ "CounterMask": "4",
+ "EventCode": "0xA8",
+ "EventName": "LSD.CYCLES_OK",
+ "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector). [This event is alias to LSD.CYCLES_4_UOPS]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Number of Uops delivered by the LSD.",
"EventCode": "0xA8",
--
2.41.0.162.gfafddb0af9-goog
* [PATCH v2 11/12] perf vendor events intel: Update skylakex to 1.31
From: Ian Rogers @ 2023-06-23 15:10 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, Kan Liang, Zhengjun Xing, Kajol Jain,
John Garry, Andrii Nakryiko, Eduard Zingerman, Jing Zhang,
Sohom Datta, linux-kernel, linux-perf-users, Perry Taylor,
Samantha Alt, Caleb Biggers, Weilin Wang, Edward Baker
Updates were released in:
https://github.com/intel/perfmon/commit/cdaa69afe7a48a217b1d89320a27efc6e650cec3
Adds the events IDQ.DSB_CYCLES_OK, IDQ.DSB_CYCLES_ANY,
ICACHE_TAG.STALLS, DECODE.LCP, LSD.CYCLES_OK. Descriptions are also
updated.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
.../arch/x86/skylakex/frontend.json | 43 ++++++++++++++++---
.../arch/x86/skylakex/pipeline.json | 17 ++++++--
.../x86/skylakex/uncore-interconnect.json | 2 +-
.../arch/x86/skylakex/uncore-memory.json | 2 +-
5 files changed, 54 insertions(+), 12 deletions(-)
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 5104b93d57ab..7c6598a9b240 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -28,7 +28,7 @@ GenuineIntel-6-(8F|CF),v1.14,sapphirerapids,core
GenuineIntel-6-AF,v1.00,sierraforest,core
GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v57,skylake,core
-GenuineIntel-6-55-[01234],v1.30,skylakex,core
+GenuineIntel-6-55-[01234],v1.31,skylakex,core
GenuineIntel-6-86,v1.21,snowridgex,core
GenuineIntel-6-8[CD],v1.12,tigerlake,core
GenuineIntel-6-2C,v4,westmereep-dp,core
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/frontend.json b/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
index 04f08e4d2402..095904c77001 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
@@ -7,6 +7,14 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to ILD_STALL.LCP]",
+ "EventCode": "0x87",
+ "EventName": "DECODE.LCP",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to ILD_STALL.LCP]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switches",
"EventCode": "0xAB",
@@ -245,27 +253,34 @@
"UMask": "0x2"
},
{
- "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_TAG.STALLS]",
"EventCode": "0x83",
"EventName": "ICACHE_64B.IFTAG_STALL",
"SampleAfterValue": "200003",
"UMask": "0x4"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_64B.IFTAG_STALL]",
+ "EventCode": "0x83",
+ "EventName": "ICACHE_TAG.STALLS",
+ "SampleAfterValue": "200003",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.DSB_CYCLES_OK]",
"CounterMask": "4",
"EventCode": "0x79",
"EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
- "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ.",
+ "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_OK]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
{
- "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop",
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop [This event is alias to IDQ.DSB_CYCLES_ANY]",
"CounterMask": "1",
"EventCode": "0x79",
"EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS",
- "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ.",
+ "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.DSB_CYCLES_ANY]",
"SampleAfterValue": "2000003",
"UMask": "0x18"
},
@@ -296,6 +311,24 @@
"SampleAfterValue": "2000003",
"UMask": "0x8"
},
+ {
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop [This event is alias to IDQ.ALL_DSB_CYCLES_ANY_UOPS]",
+ "CounterMask": "1",
+ "EventCode": "0x79",
+ "EventName": "IDQ.DSB_CYCLES_ANY",
+ "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_ANY_UOPS]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x18"
+ },
+ {
+ "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "CounterMask": "4",
+ "EventCode": "0x79",
+ "EventName": "IDQ.DSB_CYCLES_OK",
+ "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ. [This event is alias to IDQ.ALL_DSB_CYCLES_4_UOPS]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x18"
+ },
{
"BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
"EventCode": "0x79",
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json b/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
index 31a1663d57f8..66d686cc933e 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
@@ -361,10 +361,10 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to DECODE.LCP]",
"EventCode": "0x87",
"EventName": "ILD_STALL.LCP",
- "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk.",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to DECODE.LCP]",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
@@ -488,11 +488,11 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+ "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder. [This event is alias to LSD.CYCLES_OK]",
"CounterMask": "4",
"EventCode": "0xA8",
"EventName": "LSD.CYCLES_4_UOPS",
- "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector).",
+ "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector). [This event is alias to LSD.CYCLES_OK]",
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
@@ -505,6 +505,15 @@
"SampleAfterValue": "2000003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder. [This event is alias to LSD.CYCLES_4_UOPS]",
+ "CounterMask": "4",
+ "EventCode": "0xA8",
+ "EventName": "LSD.CYCLES_OK",
+ "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector). [This event is alias to LSD.CYCLES_4_UOPS]",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Number of Uops delivered by the LSD.",
"EventCode": "0xA8",
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/uncore-interconnect.json b/tools/perf/pmu-events/arch/x86/skylakex/uncore-interconnect.json
index 26a5a20bf37a..3eece8a728b5 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/uncore-interconnect.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/uncore-interconnect.json
@@ -6504,7 +6504,7 @@
"EventCode": "0x52",
"EventName": "UNC_M3UPI_RxC_HELD.PARALLEL_SUCCESS",
"PerPkg": "1",
- "PublicDescription": "ad and bl messages were actually slotted into the same flit in paralle",
+ "PublicDescription": "ad and bl messages were actually slotted into the same flit in parallel",
"UMask": "0x8",
"Unit": "M3UPI"
},
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json b/tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json
index 6f8ff2262ce7..7a40aa0f1018 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/uncore-memory.json
@@ -1952,7 +1952,7 @@
"EventCode": "0x81",
"EventName": "UNC_M_WPQ_OCCUPANCY",
"PerPkg": "1",
- "PublicDescription": "Counts the number of entries in the Write Pending Queue (WPQ) at each cycle. This can then be used to calculate both the average queue occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations). The WPQ is used to schedule writes out to the memory controller and to track the requests. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC (memory controller). They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon they have 'posted' to the iMC. This is not to be confused with actually performing the write to DRAM. Therefore, the average latency for this queue is actually not useful for deconstruction intermediate write latencies. So, we provide filtering based on if the request has posted or not. By using the 'not posted' filter, we can track how long writes spent in the iMC before completions were sent to the HA. The 'posted' filter, on the other hand, provides information about how much queueing is actually happening in the iMC for writes before they are actually issued to memory. High average occupancies will generally coincide with high write major mode counts. Is there a filter of sorts?",
+ "PublicDescription": "Counts the number of entries in the Write Pending Queue (WPQ) at each cycle. This can then be used to calculate both the average queue occupancy (in conjunction with the number of cycles not empty) and the average latency (in conjunction with the number of allocations). The WPQ is used to schedule writes out to the memory controller and to track the requests. Requests allocate into the WPQ soon after they enter the memory controller, and need credits for an entry in this buffer before being sent from the CHA to the iMC (memory controller). They deallocate after being issued to DRAM. Write requests themselves are able to complete (from the perspective of the rest of the system) as soon they have 'posted' to the iMC. This is not to be confused with actually performing the write to DRAM. Therefore, the average latency for this queue is actually not useful for deconstruction intermediate write latencies. So, we provide filtering based on if the request has posted or not. By using the 'not posted' filter, we can track how long writes spent in the iMC before completions were sent to the HA. The 'posted' filter, on the other hand, provides information about how much queueing is actually happening in the iMC for writes before they are actually issued to memory. High average occupancies will generally coincide with high write major mode counts.",
"Unit": "iMC"
},
{
--
2.41.0.162.gfafddb0af9-goog
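
Editor's note: the alias pairs in this patch (for example LSD.CYCLES_4_UOPS and LSD.CYCLES_OK) share the same EventCode, UMask and CounterMask. The sketch below shows one way such aliases could be spotted in an event list. It is only an illustration: the helper name find_aliases and the third event are hypothetical, and a real encoding comparison would likely need further fields that this sketch ignores.

```python
import json
from collections import defaultdict

# The first two entries carry the encoding shown in the hunk above.
# EXAMPLE.OTHER is a hypothetical event added only to show a non-alias.
EVENTS_JSON = """
[
  {"EventName": "LSD.CYCLES_4_UOPS", "EventCode": "0xA8", "UMask": "0x1", "CounterMask": "4"},
  {"EventName": "LSD.CYCLES_OK", "EventCode": "0xA8", "UMask": "0x1", "CounterMask": "4"},
  {"EventName": "EXAMPLE.OTHER", "EventCode": "0xA8", "UMask": "0x1"}
]
"""


def find_aliases(events):
    """Group event names by encoding and return the groups with more than one name."""
    groups = defaultdict(list)
    for ev in events:
        # Assumption: these three fields are enough to identify an encoding.
        key = (ev.get("EventCode"), ev.get("UMask"), ev.get("CounterMask", "0"))
        groups[key].append(ev["EventName"])
    return [sorted(names) for names in groups.values() if len(names) > 1]


aliases = find_aliases(json.loads(EVENTS_JSON))
print(aliases)
```

Grouping on the encoding rather than on the "[This event is alias to ...]" text avoids depending on the wording of the descriptions.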
* [PATCH v2 12/12] perf vendor events intel: Update tigerlake to 1.13
2023-06-23 15:10 [PATCH v2 00/12] Add metric has_event, update intel vendor events Ian Rogers
` (8 preceding siblings ...)
2023-06-23 15:10 ` [PATCH v2 11/12] perf vendor events intel: Update skylakex to 1.31 Ian Rogers
@ 2023-06-23 15:10 ` Ian Rogers
2023-06-29 21:31 ` [PATCH v2 00/12] Add metric has_event, update intel vendor events Namhyung Kim
10 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2023-06-23 15:10 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, Kan Liang, Zhengjun Xing, Kajol Jain,
John Garry, Andrii Nakryiko, Eduard Zingerman, Jing Zhang,
Sohom Datta, linux-kernel, linux-perf-users, Perry Taylor,
Samantha Alt, Caleb Biggers, Weilin Wang, Edward Baker
Updates were released in:
https://github.com/intel/perfmon/commit/9a3cd5ad68aee46078c663fe0cd9484e3956fd88
Adds the events ICACHE_DATA.STALLS, ICACHE_TAG.STALLS and
DECODE.LCP. Descriptions are also updated.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
.../arch/x86/tigerlake/frontend.json | 32 ++++++++++++++++---
.../arch/x86/tigerlake/pipeline.json | 6 ++--
3 files changed, 32 insertions(+), 8 deletions(-)
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index 7c6598a9b240..6650100830c4 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -30,7 +30,7 @@ GenuineIntel-6-(37|4A|4C|4D|5A),v15,silvermont,core
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v57,skylake,core
GenuineIntel-6-55-[01234],v1.31,skylakex,core
GenuineIntel-6-86,v1.21,snowridgex,core
-GenuineIntel-6-8[CD],v1.12,tigerlake,core
+GenuineIntel-6-8[CD],v1.13,tigerlake,core
GenuineIntel-6-2C,v4,westmereep-dp,core
GenuineIntel-6-25,v3,westmereep-sp,core
GenuineIntel-6-2F,v3,westmereex,core
diff --git a/tools/perf/pmu-events/arch/x86/tigerlake/frontend.json b/tools/perf/pmu-events/arch/x86/tigerlake/frontend.json
index 23b8528590b3..d7b972452c0e 100644
--- a/tools/perf/pmu-events/arch/x86/tigerlake/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/tigerlake/frontend.json
@@ -7,6 +7,14 @@
"SampleAfterValue": "100003",
"UMask": "0x1"
},
+ {
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to ILD_STALL.LCP]",
+ "EventCode": "0x87",
+ "EventName": "DECODE.LCP",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to ILD_STALL.LCP]",
+ "SampleAfterValue": "500009",
+ "UMask": "0x1"
+ },
{
"BriefDescription": "Decode Stream Buffer (DSB)-to-MITE transitions count.",
"CounterMask": "1",
@@ -213,10 +221,10 @@
"UMask": "0x1"
},
{
- "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss.",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss. [This event is alias to ICACHE_DATA.STALLS]",
"EventCode": "0x80",
"EventName": "ICACHE_16B.IFDATA_STALL",
- "PublicDescription": "Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity.",
+ "PublicDescription": "Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity. [This event is alias to ICACHE_DATA.STALLS]",
"SampleAfterValue": "500009",
"UMask": "0x4"
},
@@ -237,10 +245,26 @@
"UMask": "0x2"
},
{
- "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss.",
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_TAG.STALLS]",
"EventCode": "0x83",
"EventName": "ICACHE_64B.IFTAG_STALL",
- "PublicDescription": "Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss.",
+ "PublicDescription": "Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_TAG.STALLS]",
+ "SampleAfterValue": "200003",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache miss. [This event is alias to ICACHE_16B.IFDATA_STALL]",
+ "EventCode": "0x80",
+ "EventName": "ICACHE_DATA.STALLS",
+ "PublicDescription": "Counts cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity. [This event is alias to ICACHE_16B.IFDATA_STALL]",
+ "SampleAfterValue": "500009",
+ "UMask": "0x4"
+ },
+ {
+ "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_64B.IFTAG_STALL]",
+ "EventCode": "0x83",
+ "EventName": "ICACHE_TAG.STALLS",
+ "PublicDescription": "Counts cycles where a code fetch is stalled due to L1 instruction cache tag miss. [This event is alias to ICACHE_64B.IFTAG_STALL]",
"SampleAfterValue": "200003",
"UMask": "0x4"
},
diff --git a/tools/perf/pmu-events/arch/x86/tigerlake/pipeline.json b/tools/perf/pmu-events/arch/x86/tigerlake/pipeline.json
index 020801cbd7e3..541bf1dd1679 100644
--- a/tools/perf/pmu-events/arch/x86/tigerlake/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/tigerlake/pipeline.json
@@ -335,10 +335,10 @@
"UMask": "0x80"
},
{
- "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
+ "BriefDescription": "Stalls caused by changing prefix length of the instruction. [This event is alias to DECODE.LCP]",
"EventCode": "0x87",
"EventName": "ILD_STALL.LCP",
- "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk.",
+ "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk. [This event is alias to DECODE.LCP]",
"SampleAfterValue": "500009",
"UMask": "0x1"
},
@@ -564,7 +564,7 @@
"BriefDescription": "TMA slots wasted due to incorrect speculation by branch mispredictions",
"EventCode": "0xa4",
"EventName": "TOPDOWN.BR_MISPREDICT_SLOTS",
- "PublicDescription": "Number of TMA slots that were wasted due to incorrect speculation by branch mispredictions. This event estimates number of operations that were issued but not retired from the specualtive path as well as the out-of-order engine recovery past a branch misprediction.",
+ "PublicDescription": "Number of TMA slots that were wasted due to incorrect speculation by branch mispredictions. This event estimates number of operations that were issued but not retired from the speculative path as well as the out-of-order engine recovery past a branch misprediction.",
"SampleAfterValue": "10000003",
"UMask": "0x8"
},
--
2.41.0.162.gfafddb0af9-goog
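
Editor's note: each mapfile.csv row in the hunk above pairs a CPUID regular expression with an event version and directory name. The sketch below shows how such a row might be matched against a CPUID string. It is a simplified illustration, not perf's actual matching code: the function name lookup is hypothetical, and stepping suffixes that are not part of the regular expression are not handled.

```python
import csv
import io
import re

# Rows copied from the mapfile.csv hunk above (after the patch is applied).
MAPFILE = """\
GenuineIntel-6-(4E|5E|8E|9E|A5|A6),v57,skylake,core
GenuineIntel-6-55-[01234],v1.31,skylakex,core
GenuineIntel-6-86,v1.21,snowridgex,core
GenuineIntel-6-8[CD],v1.13,tigerlake,core
"""


def lookup(cpuid):
    """Return (directory, version) for the first row whose regex matches cpuid."""
    for row in csv.reader(io.StringIO(MAPFILE)):
        if not row:
            continue  # skip blank lines
        regex, version, name, _kind = row
        # Assumption: the whole CPUID string must match the expression.
        if re.fullmatch(regex, cpuid):
            return name, version
    return None


print(lookup("GenuineIntel-6-8C"))
print(lookup("GenuineIntel-6-55-4"))
```

With the rows above, both 8C and 8D models resolve to the tigerlake directory, which is why only the version column changes in this patch.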
* Re: [PATCH v2 07/12] perf vendor events intel: Update icelake to 1.19
2023-06-23 15:10 ` [PATCH v2 07/12] perf vendor events intel: Update icelake " Ian Rogers
@ 2023-06-23 16:04 ` Vince Weaver
2023-06-23 16:08 ` Ian Rogers
0 siblings, 1 reply; 15+ messages in thread
From: Vince Weaver @ 2023-06-23 16:04 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Adrian Hunter, Kan Liang, Zhengjun Xing, Kajol Jain, John Garry,
Andrii Nakryiko, Eduard Zingerman, Jing Zhang, Sohom Datta,
linux-kernel, linux-perf-users, Perry Taylor, Samantha Alt,
Caleb Biggers, Weilin Wang, Edward Baker
On Fri, 23 Jun 2023, Ian Rogers wrote:
> Updates were released in:
> https://github.com/intel/perfmon/commit/f3d841189f8964bc240c86301f4c849845630b5b
> A number of events are deprecated and event descriptions updated. Adds
> events ICACHE_DATA.STALLS, ICACHE_TAG.STALLS and DECODE.LCP.
Why are the events marked as deprecated rather than just being removed?
Vince Weaver
vincent.weaver@maine.edu
* Re: [PATCH v2 07/12] perf vendor events intel: Update icelake to 1.19
2023-06-23 16:04 ` Vince Weaver
@ 2023-06-23 16:08 ` Ian Rogers
0 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2023-06-23 16:08 UTC (permalink / raw)
To: Vince Weaver
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Adrian Hunter, Kan Liang, Zhengjun Xing, Kajol Jain, John Garry,
Andrii Nakryiko, Eduard Zingerman, Jing Zhang, Sohom Datta,
linux-kernel, linux-perf-users, Perry Taylor, Samantha Alt,
Caleb Biggers, Weilin Wang, Edward Baker
On Fri, Jun 23, 2023 at 9:04 AM Vince Weaver <vincent.weaver@maine.edu> wrote:
>
> On Fri, 23 Jun 2023, Ian Rogers wrote:
>
> > Updates were released in:
> > https://github.com/intel/perfmon/commit/f3d841189f8964bc240c86301f4c849845630b5b
> > A number of events are deprecated and event descriptions updated. Adds
> > events ICACHE_DATA.STALLS, ICACHE_TAG.STALLS and DECODE.LCP.
>
> Why are the events marked as deprecated rather than just being removed?
My guess would be so that people who used the deprecated event name
don't suddenly get failures. The deprecated flag means that the events
no longer show in "perf list" unless --deprecated is specified.
Thanks,
Ian
> Vince Weaver
> vincent.weaver@maine.edu
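
Editor's note: the behaviour described in the reply above, where deprecated events stay usable by name but are hidden from the default listing, can be sketched as a simple filter. This is an illustration only, not perf's implementation: the event names are hypothetical, and the JSON field is assumed to be "Deprecated" with the value "1".

```python
# Hypothetical event entries; only the fields needed for the illustration.
EVENTS = [
    {"EventName": "OLD.EVENT_NAME", "Deprecated": "1"},
    {"EventName": "NEW.EVENT_NAME"},
]


def listed_events(events, show_deprecated=False):
    """Return the names a listing would show, hiding deprecated ones by default."""
    return [
        ev["EventName"]
        for ev in events
        if show_deprecated or ev.get("Deprecated", "0") != "1"
    ]


def can_open(events, name):
    """A deprecated event can still be requested by name."""
    return any(ev["EventName"] == name for ev in events)


print(listed_events(EVENTS))
print(listed_events(EVENTS, show_deprecated=True))
print(can_open(EVENTS, "OLD.EVENT_NAME"))
```

Keeping the entry while hiding it from the listing is what lets existing scripts that use the old name keep working.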
* Re: [PATCH v2 00/12] Add metric has_event, update intel vendor events
2023-06-23 15:10 [PATCH v2 00/12] Add metric has_event, update intel vendor events Ian Rogers
` (9 preceding siblings ...)
2023-06-23 15:10 ` [PATCH v2 12/12] perf vendor events intel: Update tigerlake to 1.13 Ian Rogers
@ 2023-06-29 21:31 ` Namhyung Kim
2023-06-30 21:03 ` Namhyung Kim
10 siblings, 1 reply; 15+ messages in thread
From: Namhyung Kim @ 2023-06-29 21:31 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, Zhengjun Xing, Kajol Jain, John Garry, Andrii Nakryiko,
Eduard Zingerman, Jing Zhang, Sohom Datta, linux-kernel,
linux-perf-users, Perry Taylor, Samantha Alt, Caleb Biggers,
Weilin Wang, Edward Baker
Hi Ian,
On Fri, Jun 23, 2023 at 8:10 AM Ian Rogers <irogers@google.com> wrote:
>
> Add a new has_event function for metrics so that events that can be
> disabled by the kernel/firmware don't cause metrics to fail. Use this
> function for Intel transaction metrics fixing "perf all metrics test"
> on systems with TSX disabled. The update conversion script is posted in:
> https://github.com/intel/perfmon/pull/90
>
> Re-generate Intel vendor events using:
> https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
> Adding rocketlake support, uncore and many core events for meteorlake,
> and smaller updates for cascadelakex, icelake, icelakex,
> sapphirerapids, skylake, skylakex and tigerlake.
>
> v2. Handle failed memory allocation for evlist, John Garry.
>
> Ian Rogers (12):
> perf expr: Add has_event function
> perf jevents: Support for has_event function
> perf vendor metrics intel: Make transaction metrics conditional
> perf vendor events intel: Add rocketlake events/metrics
> perf vendor events intel: Update meteorlake to 1.03
> perf vendor events intel: Update cascadelakex to 1.19
> perf vendor events intel: Update icelake to 1.19
> perf vendor events intel: Update icelakex to 1.21
> perf vendor events intel: Update sapphirerapids to 1.14
> perf vendor events intel: Update skylake to 57
> perf vendor events intel: Update skylakex to 1.31
> perf vendor events intel: Update tigerlake to 1.13
My tigerlake laptop now passes the all metrics test with this.
It used to fail like below:
event syntax error:
'{cpu/cycles-t,metric-id=cpu!3cycles!1t!3/,cpu/tx-start,m..'
\___ unknown term 'cycles-t' for pmu 'cpu'
Tested-by: Namhyung Kim <namhyung@kernel.org>
Thanks,
Namhyung
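
Editor's note: the failure quoted above comes from a metric referencing cycles-t on a machine where that event is unavailable. The sketch below illustrates the idea of guarding a metric with a has_event-style check so it evaluates to 0 instead of failing. It is a toy model under stated assumptions, not perf's expression parser: the metric (a ratio of cycles-t to cycles), the function names and the counter values are all hypothetical.

```python
def has_event(name, supported):
    """Return 1 if the event is available on this system, else 0."""
    return 1 if name in supported else 0


def transaction_cycles_ratio(counts, supported):
    """Hypothetical metric in the spirit of '<expr> if has_event(cycles-t) else 0'."""
    if not has_event("cycles-t", supported):
        # The event is unavailable (for example TSX is disabled), so the
        # metric reports 0 rather than raising an error.
        return 0.0
    return counts["cycles-t"] / counts["cycles"]


# System without the transactional event: the metric degrades to 0.
tsx_off = transaction_cycles_ratio({"cycles": 1000}, supported={"cycles"})

# System with the event: the ratio is computed normally.
tsx_on = transaction_cycles_ratio(
    {"cycles": 1000, "cycles-t": 250}, supported={"cycles", "cycles-t"}
)

print(tsx_off, tsx_on)
```

The point of the guard is that the unavailable event is never looked up, so a missing counter cannot cause an error.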
* Re: [PATCH v2 00/12] Add metric has_event, update intel vendor events
2023-06-29 21:31 ` [PATCH v2 00/12] Add metric has_event, update intel vendor events Namhyung Kim
@ 2023-06-30 21:03 ` Namhyung Kim
0 siblings, 0 replies; 15+ messages in thread
From: Namhyung Kim @ 2023-06-30 21:03 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, Zhengjun Xing, Kajol Jain, John Garry, Andrii Nakryiko,
Eduard Zingerman, Jing Zhang, Sohom Datta, linux-kernel,
linux-perf-users, Perry Taylor, Samantha Alt, Caleb Biggers,
Weilin Wang, Edward Baker
On Thu, Jun 29, 2023 at 2:31 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Fri, Jun 23, 2023 at 8:10 AM Ian Rogers <irogers@google.com> wrote:
> >
> > Add a new has_event function for metrics so that events that can be
> > disabled by the kernel/firmware don't cause metrics to fail. Use this
> > function for Intel transaction metrics fixing "perf all metrics test"
> > on systems with TSX disabled. The update conversion script is posted in:
> > https://github.com/intel/perfmon/pull/90
> >
> > Re-generate Intel vendor events using:
> > https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
> > Adding rocketlake support, uncore and many core events for meteorlake,
> > and smaller updates for cascadelakex, icelake, icelakex,
> > sapphirerapids, skylake, skylakex and tigerlake.
> >
> > v2. Handle failed memory allocation for evlist, John Garry.
> >
> > Ian Rogers (12):
> > perf expr: Add has_event function
> > perf jevents: Support for has_event function
> > perf vendor metrics intel: Make transaction metrics conditional
> > perf vendor events intel: Add rocketlake events/metrics
> > perf vendor events intel: Update meteorlake to 1.03
> > perf vendor events intel: Update cascadelakex to 1.19
> > perf vendor events intel: Update icelake to 1.19
> > perf vendor events intel: Update icelakex to 1.21
> > perf vendor events intel: Update sapphirerapids to 1.14
> > perf vendor events intel: Update skylake to 57
> > perf vendor events intel: Update skylakex to 1.31
> > perf vendor events intel: Update tigerlake to 1.13
>
> My tigerlake laptop now passes the all metrics test with this.
> It used to fail like below:
>
> event syntax error:
> '{cpu/cycles-t,metric-id=cpu!3cycles!1t!3/,cpu/tx-start,m..'
> \___ unknown term 'cycles-t' for pmu 'cpu'
>
> Tested-by: Namhyung Kim <namhyung@kernel.org>
Applied to perf-tools-next, thanks!
Thread overview: 15+ messages
2023-06-23 15:10 [PATCH v2 00/12] Add metric has_event, update intel vendor events Ian Rogers
2023-06-23 15:10 ` [PATCH v2 01/12] perf expr: Add has_event function Ian Rogers
2023-06-23 15:10 ` [PATCH v2 02/12] perf jevents: Support for " Ian Rogers
2023-06-23 15:10 ` [PATCH v2 03/12] perf vendor metrics intel: Make transaction metrics conditional Ian Rogers
2023-06-23 15:10 ` [PATCH v2 06/12] perf vendor events intel: Update cascadelakex to 1.19 Ian Rogers
2023-06-23 15:10 ` [PATCH v2 07/12] perf vendor events intel: Update icelake " Ian Rogers
2023-06-23 16:04 ` Vince Weaver
2023-06-23 16:08 ` Ian Rogers
2023-06-23 15:10 ` [PATCH v2 08/12] perf vendor events intel: Update icelakex to 1.21 Ian Rogers
2023-06-23 15:10 ` [PATCH v2 09/12] perf vendor events intel: Update sapphirerapids to 1.14 Ian Rogers
2023-06-23 15:10 ` [PATCH v2 10/12] perf vendor events intel: Update skylake to 57 Ian Rogers
2023-06-23 15:10 ` [PATCH v2 11/12] perf vendor events intel: Update skylakex to 1.31 Ian Rogers
2023-06-23 15:10 ` [PATCH v2 12/12] perf vendor events intel: Update tigerlake to 1.13 Ian Rogers
2023-06-29 21:31 ` [PATCH v2 00/12] Add metric has_event, update intel vendor events Namhyung Kim
2023-06-30 21:03 ` Namhyung Kim