linux-perf-users.vger.kernel.org archive mirror
* [PATCH v2 0/7] Add core wide metric literal
@ 2022-08-31 17:49 Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 1/7] perf metric: Return early if no CPU PMU table exists Ian Rogers
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Ian Rogers @ 2022-08-31 17:49 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, Kan Liang, Thomas Richter, James Clark, Miaoqian Lin,
	John Garry, Zhengjun Xing, Florian Fischer, linux-perf-users,
	linux-kernel, perry.taylor, caleb.biggers, kshipra.bopardikar,
	ahmad.yasin
  Cc: Stephane Eranian, Ian Rogers

It is possible to optimize metrics when all SMT threads (CPUs) on a
core are measuring events in system wide mode. For example, the TMA
metrics [1] define CORE_CLKS for Sandybridge as:
    
if SMT is disabled:
  CPU_CLK_UNHALTED.THREAD
if SMT is enabled and recording on all SMT threads (for all processes):
  CPU_CLK_UNHALTED.THREAD_ANY / 2
if SMT is enabled and not recording on all SMT threads:
  (CPU_CLK_UNHALTED.THREAD/2)*
  (1+CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE/CPU_CLK_UNHALTED.REF_XCLK )
    
That is, two more events are necessary when not gathering counts on all
SMT threads. To distinguish measuring all SMT threads on a core from
measuring system wide (all CPUs), call the new property core wide.

As this literal requires the user requested CPUs and the system wide
flag to be known, the parsing of metrics is delayed until after command
line option processing. As events are used to compute the evlist maps,
and metrics create events, the data for core wide must come from the
target.

This patch series doesn't correct the Intel metrics to use #core_wide;
that will be done in follow up work. To see the two behaviors
currently, you need an Intel CPU from Sandybridge up to, but not
including, Icelake; then compare the events for
tma_backend_bound_percent and Backend_Bound_SMT, where the former
assumes recording on all SMT threads and the latter assumes not
recording on all SMT threads. The future work will have a single
backend bound metric for both cases, determined using #core_wide.
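
For illustration, the follow up could fold the three CORE_CLKS cases
above into a single metric expression using the new literal. A sketch,
not the actual change:

  (CPU_CLK_UNHALTED.THREAD_ANY / 2 if #core_wide else
   (CPU_CLK_UNHALTED.THREAD / 2) *
   (1 + CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK))
  if #smt_on else CPU_CLK_UNHALTED.THREAD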

[1] https://download.01.org/perfmon/TMA_Metrics.xlsx Note, #EBS_Mode
is false when recording on all SMT threads and all processes, which
corresponds to #core_wide being true in this change.

v2. Add error handling for ENOMEM to more strdup cases, as suggested by
    Arnaldo. Init the shadow stats if running "perf stat report", an
    issue caught by the stat shell test.

Ian Rogers (7):
  perf metric: Return early if no CPU PMU table exists
  perf expr: Move the scanner_ctx into the parse_ctx
  perf smt: Compute SMT from topology
  perf topology: Add core_wide
  perf stat: Delay metric parsing
  perf metrics: Wire up core_wide
  perf test: Add basic core_wide expression test

 tools/perf/builtin-stat.c     |  66 +++++++++++++----
 tools/perf/tests/expr.c       |  37 +++++++---
 tools/perf/util/cputopo.c     |  61 +++++++++++++++
 tools/perf/util/cputopo.h     |   5 ++
 tools/perf/util/expr.c        |  29 +++++---
 tools/perf/util/expr.h        |  14 ++--
 tools/perf/util/expr.l        |   6 +-
 tools/perf/util/metricgroup.c | 135 ++++++++++++++++++++++++----------
 tools/perf/util/metricgroup.h |   4 +-
 tools/perf/util/smt.c         | 109 ++++++---------------------
 tools/perf/util/smt.h         |  12 ++-
 tools/perf/util/stat-shadow.c |  13 +++-
 tools/perf/util/stat.h        |   2 +
 13 files changed, 318 insertions(+), 175 deletions(-)

-- 
2.37.2.672.g94769d06f0-goog



* [PATCH v2 1/7] perf metric: Return early if no CPU PMU table exists
  2022-08-31 17:49 [PATCH v2 0/7] Add core wide metric literal Ian Rogers
@ 2022-08-31 17:49 ` Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 2/7] perf expr: Move the scanner_ctx into the parse_ctx Ian Rogers
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Ian Rogers @ 2022-08-31 17:49 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, Kan Liang, Thomas Richter, James Clark, Miaoqian Lin,
	John Garry, Zhengjun Xing, Florian Fischer, linux-perf-users,
	linux-kernel, perry.taylor, caleb.biggers, kshipra.bopardikar,
	ahmad.yasin
  Cc: Stephane Eranian, Ian Rogers

The previous behavior is to segfault if there is no CPU PMU table and a
metric is sought. To reproduce, compile with NO_JEVENTS=1 then request
a metric, for example, "perf stat -M IPC true".
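
A reproduction sketch, assuming a kernel source tree (the build
invocation is illustrative):

  $ make -C tools/perf NO_JEVENTS=1
  $ ./tools/perf/perf stat -M IPC true    # segfaulted before this fix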

Fixes: 00facc760903 ("perf jevents: Switch build to use jevents.py")
Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/metricgroup.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index ad5cacdecd81..18aae040d61d 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -1655,6 +1655,9 @@ int metricgroup__parse_groups(const struct option *opt,
 	struct evlist *perf_evlist = *(struct evlist **)opt->value;
 	const struct pmu_events_table *table = pmu_events_table__find();
 
+	if (!table)
+		return -EINVAL;
+
 	return parse_groups(perf_evlist, str, metric_no_group,
 			    metric_no_merge, NULL, metric_events, table);
 }
-- 
2.37.2.672.g94769d06f0-goog



* [PATCH v2 2/7] perf expr: Move the scanner_ctx into the parse_ctx
  2022-08-31 17:49 [PATCH v2 0/7] Add core wide metric literal Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 1/7] perf metric: Return early if no CPU PMU table exists Ian Rogers
@ 2022-08-31 17:49 ` Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 3/7] perf smt: Compute SMT from topology Ian Rogers
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Ian Rogers @ 2022-08-31 17:49 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, Kan Liang, Thomas Richter, James Clark, Miaoqian Lin,
	John Garry, Zhengjun Xing, Florian Fischer, linux-perf-users,
	linux-kernel, perry.taylor, caleb.biggers, kshipra.bopardikar,
	ahmad.yasin
  Cc: Stephane Eranian, Ian Rogers

The scanner_ctx and parse_ctx are currently maintained independently,
with values copied from one to the other. This is a burden when
additional scanner context values are necessary, so combine them.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/expr.c       |  2 +-
 tools/perf/util/expr.c        |  7 ++-----
 tools/perf/util/expr.h        | 10 +++++-----
 tools/perf/util/metricgroup.c |  4 ++--
 tools/perf/util/stat-shadow.c |  2 +-
 5 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 2efe9e3a63b8..7ca5e37de560 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -133,7 +133,7 @@ static int test__expr(struct test_suite *t __maybe_unused, int subtest __maybe_u
 						    (void **)&val_ptr));
 
 	expr__ctx_clear(ctx);
-	ctx->runtime = 3;
+	ctx->sctx.runtime = 3;
 	TEST_ASSERT_VAL("find ids",
 			expr__find_ids("EVENT1\\,param\\=?@ + EVENT2\\,param\\=?@",
 					NULL, ctx) == 0);
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index c15a9852fa41..00bde682e743 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -310,7 +310,7 @@ struct expr_parse_ctx *expr__ctx_new(void)
 		free(ctx);
 		return NULL;
 	}
-	ctx->runtime = 0;
+	ctx->sctx.runtime = 0;
 
 	return ctx;
 }
@@ -344,16 +344,13 @@ static int
 __expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr,
 	      bool compute_ids)
 {
-	struct expr_scanner_ctx scanner_ctx = {
-		.runtime = ctx->runtime,
-	};
 	YY_BUFFER_STATE buffer;
 	void *scanner;
 	int ret;
 
 	pr_debug2("parsing metric: %s\n", expr);
 
-	ret = expr_lex_init_extra(&scanner_ctx, &scanner);
+	ret = expr_lex_init_extra(&ctx->sctx, &scanner);
 	if (ret)
 		return ret;
 
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index 0403a92d9dcc..07af3d438eb2 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -10,17 +10,17 @@
 
 struct metric_ref;
 
+struct expr_scanner_ctx {
+	int runtime;
+};
+
 struct expr_parse_ctx {
 	struct hashmap	*ids;
-	int runtime;
+	struct expr_scanner_ctx sctx;
 };
 
 struct expr_id_data;
 
-struct expr_scanner_ctx {
-	int runtime;
-};
-
 struct hashmap *ids__new(void);
 void ids__free(struct hashmap *ids);
 int ids__insert(struct hashmap *ids, const char *id);
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 18aae040d61d..b144c3e35264 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -215,7 +215,7 @@ static struct metric *metric__new(const struct pmu_event *pe,
 	}
 	m->metric_expr = pe->metric_expr;
 	m->metric_unit = pe->unit;
-	m->pctx->runtime = runtime;
+	m->pctx->sctx.runtime = runtime;
 	m->has_constraint = metric_no_group || metricgroup__has_constraint(pe);
 	m->metric_refs = NULL;
 	m->evlist = NULL;
@@ -1626,7 +1626,7 @@ static int parse_groups(struct evlist *perf_evlist, const char *str,
 		}
 		expr->metric_unit = m->metric_unit;
 		expr->metric_events = metric_events;
-		expr->runtime = m->pctx->runtime;
+		expr->runtime = m->pctx->sctx.runtime;
 		list_add(&expr->nd, &me->head);
 	}
 
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 788ce5e46470..815af948abb9 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -911,7 +911,7 @@ static void generic_metric(struct perf_stat_config *config,
 	if (!pctx)
 		return;
 
-	pctx->runtime = runtime;
+	pctx->sctx.runtime = runtime;
 	i = prepare_metric(metric_events, metric_refs, pctx, cpu_map_idx, st);
 	if (i < 0) {
 		expr__ctx_free(pctx);
-- 
2.37.2.672.g94769d06f0-goog



* [PATCH v2 3/7] perf smt: Compute SMT from topology
  2022-08-31 17:49 [PATCH v2 0/7] Add core wide metric literal Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 1/7] perf metric: Return early if no CPU PMU table exists Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 2/7] perf expr: Move the scanner_ctx into the parse_ctx Ian Rogers
@ 2022-08-31 17:49 ` Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 4/7] perf topology: Add core_wide Ian Rogers
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Ian Rogers @ 2022-08-31 17:49 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, Kan Liang, Thomas Richter, James Clark, Miaoqian Lin,
	John Garry, Zhengjun Xing, Florian Fischer, linux-perf-users,
	linux-kernel, perry.taylor, caleb.biggers, kshipra.bopardikar,
	ahmad.yasin
  Cc: Stephane Eranian, Ian Rogers

The topology records sibling threads. Rather than computing SMT by
reading sibling lists in sysfs, reuse the values in the topology. This
only applies when the sysfs file smt/active isn't available.
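
For example, on a hypothetical 2-core, 4-CPU machine with SMT-2 the
topology's core CPU lists might read "0,2" and "1,3"; any ',' or '-' in
a core's list implies more than one sibling thread and hence SMT.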

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/expr.c   | 24 ++++++----
 tools/perf/util/cputopo.c | 15 +++++++
 tools/perf/util/cputopo.h |  2 +
 tools/perf/util/expr.c    |  9 ++--
 tools/perf/util/smt.c     | 95 ++++-----------------------------------
 tools/perf/util/smt.h     |  5 ++-
 6 files changed, 49 insertions(+), 101 deletions(-)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 7ca5e37de560..db736ed49556 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+#include "util/cputopo.h"
 #include "util/debug.h"
 #include "util/expr.h"
 #include "util/header.h"
@@ -154,15 +155,20 @@ static int test__expr(struct test_suite *t __maybe_unused, int subtest __maybe_u
 						    (void **)&val_ptr));
 
 	/* Only EVENT1 or EVENT2 need be measured depending on the value of smt_on. */
-	expr__ctx_clear(ctx);
-	TEST_ASSERT_VAL("find ids",
-			expr__find_ids("EVENT1 if #smt_on else EVENT2",
-				NULL, ctx) == 0);
-	TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 1);
-	TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids,
-						  smt_on() ? "EVENT1" : "EVENT2",
-						  (void **)&val_ptr));
-
+	{
+		struct cpu_topology *topology = cpu_topology__new();
+		bool smton = smt_on(topology);
+
+		cpu_topology__delete(topology);
+		expr__ctx_clear(ctx);
+		TEST_ASSERT_VAL("find ids",
+				expr__find_ids("EVENT1 if #smt_on else EVENT2",
+					NULL, ctx) == 0);
+		TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 1);
+		TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids,
+							  smton ? "EVENT1" : "EVENT2",
+							  (void **)&val_ptr));
+	}
 	/* The expression is a constant 1.0 without needing to evaluate EVENT1. */
 	expr__ctx_clear(ctx);
 	TEST_ASSERT_VAL("find ids",
diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index d275d843c155..511002e52714 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -157,6 +157,21 @@ void cpu_topology__delete(struct cpu_topology *tp)
 	free(tp);
 }
 
+bool cpu_topology__smt_on(const struct cpu_topology *topology)
+{
+	for (u32 i = 0; i < topology->core_cpus_lists; i++) {
+		const char *cpu_list = topology->core_cpus_list[i];
+
+		/*
+		 * If there is a need to separate siblings in a core then SMT is
+		 * enabled.
+		 */
+		if (strchr(cpu_list, ',') || strchr(cpu_list, '-'))
+			return true;
+	}
+	return false;
+}
+
 static bool has_die_topology(void)
 {
 	char filename[MAXPATHLEN];
diff --git a/tools/perf/util/cputopo.h b/tools/perf/util/cputopo.h
index 854e18f9041e..469db775a13c 100644
--- a/tools/perf/util/cputopo.h
+++ b/tools/perf/util/cputopo.h
@@ -58,6 +58,8 @@ struct hybrid_topology {
 
 struct cpu_topology *cpu_topology__new(void);
 void cpu_topology__delete(struct cpu_topology *tp);
+/* Determine from the core list whether SMT was enabled. */
+bool cpu_topology__smt_on(const struct cpu_topology *topology);
 
 struct numa_topology *numa_topology__new(void);
 void numa_topology__delete(struct numa_topology *tp);
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index 00bde682e743..8aa7dafa18b3 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -412,11 +412,6 @@ double expr__get_literal(const char *literal)
 	static struct cpu_topology *topology;
 	double result = NAN;
 
-	if (!strcasecmp("#smt_on", literal)) {
-		result = smt_on() > 0 ? 1.0 : 0.0;
-		goto out;
-	}
-
 	if (!strcmp("#num_cpus", literal)) {
 		result = cpu__max_present_cpu().cpu;
 		goto out;
@@ -440,6 +435,10 @@ double expr__get_literal(const char *literal)
 			goto out;
 		}
 	}
+	if (!strcasecmp("#smt_on", literal)) {
+		result = smt_on(topology) ? 1.0 : 0.0;
+		goto out;
+	}
 	if (!strcmp("#num_packages", literal)) {
 		result = topology->package_cpus_lists;
 		goto out;
diff --git a/tools/perf/util/smt.c b/tools/perf/util/smt.c
index 8fed03283c85..ce90c4ee4138 100644
--- a/tools/perf/util/smt.c
+++ b/tools/perf/util/smt.c
@@ -1,100 +1,23 @@
 // SPDX-License-Identifier: GPL-2.0-only
-#include <stdio.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <linux/bitops.h>
+#include <string.h>
 #include "api/fs/fs.h"
+#include "cputopo.h"
 #include "smt.h"
 
-/**
- * hweight_str - Returns the number of bits set in str. Stops at first non-hex
- *	       or ',' character.
- */
-static int hweight_str(char *str)
-{
-	int result = 0;
-
-	while (*str) {
-		switch (*str++) {
-		case '0':
-		case ',':
-			break;
-		case '1':
-		case '2':
-		case '4':
-		case '8':
-			result++;
-			break;
-		case '3':
-		case '5':
-		case '6':
-		case '9':
-		case 'a':
-		case 'A':
-		case 'c':
-		case 'C':
-			result += 2;
-			break;
-		case '7':
-		case 'b':
-		case 'B':
-		case 'd':
-		case 'D':
-		case 'e':
-		case 'E':
-			result += 3;
-			break;
-		case 'f':
-		case 'F':
-			result += 4;
-			break;
-		default:
-			goto done;
-		}
-	}
-done:
-	return result;
-}
-
-int smt_on(void)
+bool smt_on(const struct cpu_topology *topology)
 {
 	static bool cached;
-	static int cached_result;
-	int cpu;
-	int ncpu;
+	static bool cached_result;
+	int fs_value;
 
 	if (cached)
 		return cached_result;
 
-	if (sysfs__read_int("devices/system/cpu/smt/active", &cached_result) >= 0) {
-		cached = true;
-		return cached_result;
-	}
-
-	cached_result = 0;
-	ncpu = sysconf(_SC_NPROCESSORS_CONF);
-	for (cpu = 0; cpu < ncpu; cpu++) {
-		unsigned long long siblings;
-		char *str;
-		size_t strlen;
-		char fn[256];
+	if (sysfs__read_int("devices/system/cpu/smt/active", &fs_value) >= 0)
+		cached_result = (fs_value == 1);
+	else
+		cached_result = cpu_topology__smt_on(topology);
 
-		snprintf(fn, sizeof fn,
-			"devices/system/cpu/cpu%d/topology/thread_siblings", cpu);
-		if (sysfs__read_str(fn, &str, &strlen) < 0) {
-			snprintf(fn, sizeof fn,
-				"devices/system/cpu/cpu%d/topology/core_cpus", cpu);
-			if (sysfs__read_str(fn, &str, &strlen) < 0)
-				continue;
-		}
-		/* Entry is hex, but does not have 0x, so need custom parser */
-		siblings = hweight_str(str);
-		free(str);
-		if (siblings > 1) {
-			cached_result = 1;
-			break;
-		}
-	}
 	cached = true;
 	return cached_result;
 }
diff --git a/tools/perf/util/smt.h b/tools/perf/util/smt.h
index a98d65808f6a..e26999c6b8d4 100644
--- a/tools/perf/util/smt.h
+++ b/tools/perf/util/smt.h
@@ -2,6 +2,9 @@
 #ifndef __SMT_H
 #define __SMT_H 1
 
-int smt_on(void);
+struct cpu_topology;
+
+/* Returns true if SMT (aka hyperthreading) is enabled. */
+bool smt_on(const struct cpu_topology *topology);
 
 #endif /* __SMT_H */
-- 
2.37.2.672.g94769d06f0-goog



* [PATCH v2 4/7] perf topology: Add core_wide
  2022-08-31 17:49 [PATCH v2 0/7] Add core wide metric literal Ian Rogers
                   ` (2 preceding siblings ...)
  2022-08-31 17:49 ` [PATCH v2 3/7] perf smt: Compute SMT from topology Ian Rogers
@ 2022-08-31 17:49 ` Ian Rogers
  2022-09-02  6:29   ` Namhyung Kim
  2022-08-31 17:49 ` [PATCH v2 5/7] perf stat: Delay metric parsing Ian Rogers
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2022-08-31 17:49 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, Kan Liang, Thomas Richter, James Clark, Miaoqian Lin,
	John Garry, Zhengjun Xing, Florian Fischer, linux-perf-users,
	linux-kernel, perry.taylor, caleb.biggers, kshipra.bopardikar,
	ahmad.yasin
  Cc: Stephane Eranian, Ian Rogers

It is possible to optimize metrics when all SMT threads (CPUs) on a
core are measuring events in system wide mode. For example, the TMA
metrics define CORE_CLKS for Sandybridge as:

if SMT is disabled:
  CPU_CLK_UNHALTED.THREAD
if SMT is enabled and recording on all SMT threads:
  CPU_CLK_UNHALTED.THREAD_ANY / 2
if SMT is enabled and not recording on all SMT threads:
  (CPU_CLK_UNHALTED.THREAD/2)*
  (1+CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE/CPU_CLK_UNHALTED.REF_XCLK )

That is, two more events are necessary when not gathering counts on all
SMT threads. To distinguish measuring all SMT threads on a core from
measuring system wide (all CPUs), call the new property core wide. Add
a core wide test that determines the property from the user requested
CPUs, the topology and whether recording is system wide. System wide is
required as other processes running on an SMT thread will change the
counts.
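
As an illustration, given a hypothetical topology with core sibling
lists "0,2" and "1,3", recording system wide on CPUs "0,2" covers every
SMT thread of one core and so is core wide, while recording on CPUs
"0,1" splits both cores and is not.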

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/cputopo.c | 46 +++++++++++++++++++++++++++++++++++++++
 tools/perf/util/cputopo.h |  3 +++
 tools/perf/util/smt.c     | 14 ++++++++++++
 tools/perf/util/smt.h     |  7 ++++++
 4 files changed, 70 insertions(+)

diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index 511002e52714..1a3ff6449158 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -172,6 +172,52 @@ bool cpu_topology__smt_on(const struct cpu_topology *topology)
 	return false;
 }
 
+bool cpu_topology__core_wide(const struct cpu_topology *topology,
+			     const char *user_requested_cpu_list)
+{
+	struct perf_cpu_map *user_requested_cpus;
+
+	/*
+	 * If user_requested_cpu_list is empty then all CPUs are recorded and so
+	 * core_wide is true.
+	 */
+	if (!user_requested_cpu_list)
+		return true;
+
+	user_requested_cpus = perf_cpu_map__new(user_requested_cpu_list);
+	/* Check that every user requested CPU is the complete set of SMT threads on a core. */
+	for (u32 i = 0; i < topology->core_cpus_lists; i++) {
+		const char *core_cpu_list = topology->core_cpus_list[i];
+		struct perf_cpu_map *core_cpus = perf_cpu_map__new(core_cpu_list);
+		struct perf_cpu cpu;
+		int idx;
+		bool has_first, first = true;
+
+		perf_cpu_map__for_each_cpu(cpu, idx, core_cpus) {
+			if (first) {
+				has_first = perf_cpu_map__has(user_requested_cpus, cpu);
+				first = false;
+			} else {
+				/*
+				 * If the first core CPU is user requested then
+				 * all subsequent CPUs in the core must be user
+				 * requested too. If the first CPU isn't user
+				 * requested then none of the others must be
+				 * too.
+				 */
+				if (perf_cpu_map__has(user_requested_cpus, cpu) != has_first) {
+					perf_cpu_map__put(core_cpus);
+					perf_cpu_map__put(user_requested_cpus);
+					return false;
+				}
+			}
+		}
+		perf_cpu_map__put(core_cpus);
+	}
+	perf_cpu_map__put(user_requested_cpus);
+	return true;
+}
+
 static bool has_die_topology(void)
 {
 	char filename[MAXPATHLEN];
diff --git a/tools/perf/util/cputopo.h b/tools/perf/util/cputopo.h
index 469db775a13c..969e5920a00e 100644
--- a/tools/perf/util/cputopo.h
+++ b/tools/perf/util/cputopo.h
@@ -60,6 +60,9 @@ struct cpu_topology *cpu_topology__new(void);
 void cpu_topology__delete(struct cpu_topology *tp);
 /* Determine from the core list whether SMT was enabled. */
 bool cpu_topology__smt_on(const struct cpu_topology *topology);
+/* Are the sets of SMT siblings all enabled or all disabled in user_requested_cpus. */
+bool cpu_topology__core_wide(const struct cpu_topology *topology,
+			     const char *user_requested_cpu_list);
 
 struct numa_topology *numa_topology__new(void);
 void numa_topology__delete(struct numa_topology *tp);
diff --git a/tools/perf/util/smt.c b/tools/perf/util/smt.c
index ce90c4ee4138..994e9e418227 100644
--- a/tools/perf/util/smt.c
+++ b/tools/perf/util/smt.c
@@ -21,3 +21,17 @@ bool smt_on(const struct cpu_topology *topology)
 	cached = true;
 	return cached_result;
 }
+
+bool core_wide(bool system_wide, const char *user_requested_cpu_list,
+	       const struct cpu_topology *topology)
+{
+	/* If not everything running on a core is being recorded then we can't use core_wide. */
+	if (!system_wide)
+		return false;
+
+	/* Cheap case that SMT is disabled and therefore we're inherently core_wide. */
+	if (!smt_on(topology))
+		return true;
+
+	return cpu_topology__core_wide(topology, user_requested_cpu_list);
+}
diff --git a/tools/perf/util/smt.h b/tools/perf/util/smt.h
index e26999c6b8d4..ae9095f2c38c 100644
--- a/tools/perf/util/smt.h
+++ b/tools/perf/util/smt.h
@@ -7,4 +7,11 @@ struct cpu_topology;
 /* Returns true if SMT (aka hyperthreading) is enabled. */
 bool smt_on(const struct cpu_topology *topology);
 
+/*
+ * Returns true when system wide and all SMT threads for a core are in the
+ * user_requested_cpus map.
+ */
+bool core_wide(bool system_wide, const char *user_requested_cpu_list,
+	       const struct cpu_topology *topology);
+
 #endif /* __SMT_H */
-- 
2.37.2.672.g94769d06f0-goog



* [PATCH v2 5/7] perf stat: Delay metric parsing
  2022-08-31 17:49 [PATCH v2 0/7] Add core wide metric literal Ian Rogers
                   ` (3 preceding siblings ...)
  2022-08-31 17:49 ` [PATCH v2 4/7] perf topology: Add core_wide Ian Rogers
@ 2022-08-31 17:49 ` Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 6/7] perf metrics: Wire up core_wide Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 7/7] perf test: Add basic core_wide expression test Ian Rogers
  6 siblings, 0 replies; 11+ messages in thread
From: Ian Rogers @ 2022-08-31 17:49 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, Kan Liang, Thomas Richter, James Clark, Miaoqian Lin,
	John Garry, Zhengjun Xing, Florian Fischer, linux-perf-users,
	linux-kernel, perry.taylor, caleb.biggers, kshipra.bopardikar,
	ahmad.yasin
  Cc: Stephane Eranian, Ian Rogers

Having metric parsing as part of argument processing causes issues, as
flags like metric-no-group may be specified later. It also denies the
opportunity to optimize the events on SMT systems, where fewer events
may be possible if we know the target is system-wide. Move metric
parsing to after command line option parsing. Because of how stat runs,
this moves the parsing after record/report, which currently fail to
work with metrics anyway.
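
For example, with an illustrative invocation such as:

  $ perf stat -M IPC --metric-no-group true

the metric was previously parsed while -M was processed, before
--metric-no-group had been seen, so the flag had no effect on the
generated events.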

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c     | 52 +++++++++++++++++++++++++----------
 tools/perf/util/metricgroup.c |  3 +-
 tools/perf/util/metricgroup.h |  2 +-
 3 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 54cd29d07ca8..a59c23f4ffd2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -191,6 +191,7 @@ static bool			append_file;
 static bool			interval_count;
 static const char		*output_name;
 static int			output_fd;
+static char			*metrics;
 
 struct perf_stat {
 	bool			 record;
@@ -1148,14 +1149,23 @@ static int enable_metric_only(const struct option *opt __maybe_unused,
 	return 0;
 }
 
-static int parse_metric_groups(const struct option *opt,
+static int append_metric_groups(const struct option *opt __maybe_unused,
 			       const char *str,
 			       int unset __maybe_unused)
 {
-	return metricgroup__parse_groups(opt, str,
-					 stat_config.metric_no_group,
-					 stat_config.metric_no_merge,
-					 &stat_config.metric_events);
+	if (metrics) {
+		char *tmp;
+
+		if (asprintf(&tmp, "%s,%s", metrics, str) < 0)
+			return -ENOMEM;
+		free(metrics);
+		metrics = tmp;
+	} else {
+		metrics = strdup(str);
+		if (!metrics)
+			return -ENOMEM;
+	}
+	return 0;
 }
 
 static int parse_control_option(const struct option *opt,
@@ -1299,7 +1309,7 @@ static struct option stat_options[] = {
 			"measure SMI cost"),
 	OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
 		     "monitor specified metrics or metric groups (separated by ,)",
-		     parse_metric_groups),
+		     append_metric_groups),
 	OPT_BOOLEAN_FLAG(0, "all-kernel", &stat_config.all_kernel,
 			 "Configure all used events to run in kernel space.",
 			 PARSE_OPT_EXCLUSIVE),
@@ -1792,11 +1802,9 @@ static int add_default_attributes(void)
 		 * on an architecture test for such a metric name.
 		 */
 		if (metricgroup__has_metric("transaction")) {
-			struct option opt = { .value = &evsel_list };
-
-			return metricgroup__parse_groups(&opt, "transaction",
+			return metricgroup__parse_groups(evsel_list, "transaction",
 							 stat_config.metric_no_group,
-							stat_config.metric_no_merge,
+							 stat_config.metric_no_merge,
 							 &stat_config.metric_events);
 		}
 
@@ -2182,6 +2190,8 @@ static int __cmd_report(int argc, const char **argv)
 			input_name = "perf.data";
 	}
 
+	perf_stat__init_shadow_stats();
+
 	perf_stat.data.path = input_name;
 	perf_stat.data.mode = PERF_DATA_MODE_READ;
 
@@ -2261,8 +2271,6 @@ int cmd_stat(int argc, const char **argv)
 	argc = parse_options_subcommand(argc, argv, stat_options, stat_subcommands,
 					(const char **) stat_usage,
 					PARSE_OPT_STOP_AT_NON_OPTION);
-	perf_stat__collect_metric_expr(evsel_list);
-	perf_stat__init_shadow_stats();
 
 	if (stat_config.csv_sep) {
 		stat_config.csv_output = true;
@@ -2429,6 +2437,23 @@ int cmd_stat(int argc, const char **argv)
 			target.system_wide = true;
 	}
 
+	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
+		target.per_thread = true;
+
+	/*
+	 * Metric parsing needs to be delayed as metrics may optimize events
+	 * knowing the target is system-wide.
+	 */
+	if (metrics) {
+		metricgroup__parse_groups(evsel_list, metrics,
+					stat_config.metric_no_group,
+					stat_config.metric_no_merge,
+					&stat_config.metric_events);
+		zfree(&metrics);
+	}
+	perf_stat__collect_metric_expr(evsel_list);
+	perf_stat__init_shadow_stats();
+
 	if (add_default_attributes())
 		goto out;
 
@@ -2448,9 +2473,6 @@ int cmd_stat(int argc, const char **argv)
 		}
 	}
 
-	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
-		target.per_thread = true;
-
 	if (evlist__fix_hybrid_cpus(evsel_list, target.cpu_list)) {
 		pr_err("failed to use cpu list %s\n", target.cpu_list);
 		goto out;
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index b144c3e35264..9151346a16ab 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -1646,13 +1646,12 @@ static int parse_groups(struct evlist *perf_evlist, const char *str,
 	return ret;
 }
 
-int metricgroup__parse_groups(const struct option *opt,
+int metricgroup__parse_groups(struct evlist *perf_evlist,
 			      const char *str,
 			      bool metric_no_group,
 			      bool metric_no_merge,
 			      struct rblist *metric_events)
 {
-	struct evlist *perf_evlist = *(struct evlist **)opt->value;
 	const struct pmu_events_table *table = pmu_events_table__find();
 
 	if (!table)
diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
index 016b3b1a289a..af9ceadaec0f 100644
--- a/tools/perf/util/metricgroup.h
+++ b/tools/perf/util/metricgroup.h
@@ -64,7 +64,7 @@ struct metric_expr {
 struct metric_event *metricgroup__lookup(struct rblist *metric_events,
 					 struct evsel *evsel,
 					 bool create);
-int metricgroup__parse_groups(const struct option *opt,
+int metricgroup__parse_groups(struct evlist *perf_evlist,
 			      const char *str,
 			      bool metric_no_group,
 			      bool metric_no_merge,
-- 
2.37.2.672.g94769d06f0-goog



* [PATCH v2 6/7] perf metrics: Wire up core_wide
  2022-08-31 17:49 [PATCH v2 0/7] Add core wide metric literal Ian Rogers
                   ` (4 preceding siblings ...)
  2022-08-31 17:49 ` [PATCH v2 5/7] perf stat: Delay metric parsing Ian Rogers
@ 2022-08-31 17:49 ` Ian Rogers
  2022-08-31 17:49 ` [PATCH v2 7/7] perf test: Add basic core_wide expression test Ian Rogers
  6 siblings, 0 replies; 11+ messages in thread
From: Ian Rogers @ 2022-08-31 17:49 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, Kan Liang, Thomas Richter, James Clark, Miaoqian Lin,
	John Garry, Zhengjun Xing, Florian Fischer, linux-perf-users,
	linux-kernel, perry.taylor, caleb.biggers, kshipra.bopardikar,
	ahmad.yasin
  Cc: Stephane Eranian, Ian Rogers

Pass the state necessary for core_wide into the expression parser. Add
system_wide and user_requested_cpu_list to perf_stat_config to make
them available at display time. The evlist isn't used because
evlist__create_maps, which computes user_requested_cpus, needs the list
of events, and that list is only generated by the metric.
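
With this wired up, a metric expression can branch on the new literal
in the same way as #smt_on, for example (as exercised by the test in
the final patch):

  EVENT1 if #core_wide else EVENT2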

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/builtin-stat.c     |  14 ++++
 tools/perf/util/expr.c        |  13 +++-
 tools/perf/util/expr.h        |   4 +-
 tools/perf/util/expr.l        |   6 +-
 tools/perf/util/metricgroup.c | 125 ++++++++++++++++++++++++----------
 tools/perf/util/metricgroup.h |   2 +
 tools/perf/util/stat-shadow.c |  11 ++-
 tools/perf/util/stat.h        |   2 +
 8 files changed, 134 insertions(+), 43 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a59c23f4ffd2..35185ee1243e 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1805,6 +1805,8 @@ static int add_default_attributes(void)
 			return metricgroup__parse_groups(evsel_list, "transaction",
 							 stat_config.metric_no_group,
 							 stat_config.metric_no_merge,
+							 stat_config.user_requested_cpu_list,
+							 stat_config.system_wide,
 							 &stat_config.metric_events);
 		}
 
@@ -2440,6 +2442,15 @@ int cmd_stat(int argc, const char **argv)
 	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
 		target.per_thread = true;
 
+	stat_config.system_wide = target.system_wide;
+	if (target.cpu_list) {
+		stat_config.user_requested_cpu_list = strdup(target.cpu_list);
+		if (!stat_config.user_requested_cpu_list) {
+			status = -ENOMEM;
+			goto out;
+		}
+	}
+
 	/*
 	 * Metric parsing needs to be delayed as metrics may optimize events
 	 * knowing the target is system-wide.
@@ -2448,6 +2459,8 @@ int cmd_stat(int argc, const char **argv)
 		metricgroup__parse_groups(evsel_list, metrics,
 					stat_config.metric_no_group,
 					stat_config.metric_no_merge,
+					stat_config.user_requested_cpu_list,
+					stat_config.system_wide,
 					&stat_config.metric_events);
 		zfree(&metrics);
 	}
@@ -2638,6 +2651,7 @@ int cmd_stat(int argc, const char **argv)
 		iostat_release(evsel_list);
 
 	zfree(&stat_config.walltime_run);
+	zfree(&stat_config.user_requested_cpu_list);
 
 	if (smi_cost && smi_reset)
 		sysfs__write_int(FREEZE_ON_SMI_PATH, 0);
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index 8aa7dafa18b3..c6827900f8d3 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -310,7 +310,9 @@ struct expr_parse_ctx *expr__ctx_new(void)
 		free(ctx);
 		return NULL;
 	}
+	ctx->sctx.user_requested_cpu_list = NULL;
 	ctx->sctx.runtime = 0;
+	ctx->sctx.system_wide = false;
 
 	return ctx;
 }
@@ -332,6 +334,10 @@ void expr__ctx_free(struct expr_parse_ctx *ctx)
 	struct hashmap_entry *cur;
 	size_t bkt;
 
+	if (!ctx)
+		return;
+
+	free(ctx->sctx.user_requested_cpu_list);
 	hashmap__for_each_entry(ctx->ids, cur, bkt) {
 		free((char *)cur->key);
 		free(cur->value);
@@ -407,7 +413,7 @@ double arch_get_tsc_freq(void)
 }
 #endif
 
-double expr__get_literal(const char *literal)
+double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx)
 {
 	static struct cpu_topology *topology;
 	double result = NAN;
@@ -439,6 +445,11 @@ double expr__get_literal(const char *literal)
 		result = smt_on(topology) ? 1.0 : 0.0;
 		goto out;
 	}
+	if (!strcmp("#core_wide", literal)) {
+		result = core_wide(ctx->system_wide, ctx->user_requested_cpu_list, topology)
+			? 1.0 : 0.0;
+		goto out;
+	}
 	if (!strcmp("#num_packages", literal)) {
 		result = topology->package_cpus_lists;
 		goto out;
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index 07af3d438eb2..d6c1668dc1a0 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -11,7 +11,9 @@
 struct metric_ref;
 
 struct expr_scanner_ctx {
+	char *user_requested_cpu_list;
 	int runtime;
+	bool system_wide;
 };
 
 struct expr_parse_ctx {
@@ -55,6 +57,6 @@ int expr__find_ids(const char *expr, const char *one,
 
 double expr_id_data__value(const struct expr_id_data *data);
 double expr_id_data__source_count(const struct expr_id_data *data);
-double expr__get_literal(const char *literal);
+double expr__get_literal(const char *literal, const struct expr_scanner_ctx *ctx);
 
 #endif
diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
index 4dc8edbfd9ce..0168a9637330 100644
--- a/tools/perf/util/expr.l
+++ b/tools/perf/util/expr.l
@@ -79,11 +79,11 @@ static int str(yyscan_t scanner, int token, int runtime)
 	return token;
 }
 
-static int literal(yyscan_t scanner)
+static int literal(yyscan_t scanner, const struct expr_scanner_ctx *sctx)
 {
 	YYSTYPE *yylval = expr_get_lval(scanner);
 
-	yylval->num = expr__get_literal(expr_get_text(scanner));
+	yylval->num = expr__get_literal(expr_get_text(scanner), sctx);
 	if (isnan(yylval->num))
 		return EXPR_ERROR;
 
@@ -108,7 +108,7 @@ min		{ return MIN; }
 if		{ return IF; }
 else		{ return ELSE; }
 source_count	{ return SOURCE_COUNT; }
-{literal}	{ return literal(yyscanner); }
+{literal}	{ return literal(yyscanner, sctx); }
 {number}	{ return value(yyscanner); }
 {symbol}	{ return str(yyscanner, ID, sctx->runtime); }
 "|"		{ return '|'; }
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 9151346a16ab..b18da1a62a55 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -22,6 +22,7 @@
 #include <linux/list_sort.h>
 #include <linux/string.h>
 #include <linux/zalloc.h>
+#include <perf/cpumap.h>
 #include <subcmd/parse-options.h>
 #include <api/fs/fs.h>
 #include "util.h"
@@ -189,10 +190,24 @@ static bool metricgroup__has_constraint(const struct pmu_event *pe)
 	return false;
 }
 
+static void metric__free(struct metric *m)
+{
+	if (!m)
+		return;
+
+	free(m->metric_refs);
+	expr__ctx_free(m->pctx);
+	free((char *)m->modifier);
+	evlist__delete(m->evlist);
+	free(m);
+}
+
 static struct metric *metric__new(const struct pmu_event *pe,
 				  const char *modifier,
 				  bool metric_no_group,
-				  int runtime)
+				  int runtime,
+				  const char *user_requested_cpu_list,
+				  bool system_wide)
 {
 	struct metric *m;
 
@@ -201,35 +216,34 @@ static struct metric *metric__new(const struct pmu_event *pe,
 		return NULL;
 
 	m->pctx = expr__ctx_new();
-	if (!m->pctx) {
-		free(m);
-		return NULL;
-	}
+	if (!m->pctx)
+		goto out_err;
 
 	m->metric_name = pe->metric_name;
-	m->modifier = modifier ? strdup(modifier) : NULL;
-	if (modifier && !m->modifier) {
-		expr__ctx_free(m->pctx);
-		free(m);
-		return NULL;
+	m->modifier = NULL;
+	if (modifier) {
+		m->modifier = strdup(modifier);
+		if (!m->modifier)
+			goto out_err;
 	}
 	m->metric_expr = pe->metric_expr;
 	m->metric_unit = pe->unit;
+	m->pctx->sctx.user_requested_cpu_list = NULL;
+	if (user_requested_cpu_list) {
+		m->pctx->sctx.user_requested_cpu_list = strdup(user_requested_cpu_list);
+		if (!m->pctx->sctx.user_requested_cpu_list)
+			goto out_err;
+	}
 	m->pctx->sctx.runtime = runtime;
+	m->pctx->sctx.system_wide = system_wide;
 	m->has_constraint = metric_no_group || metricgroup__has_constraint(pe);
 	m->metric_refs = NULL;
 	m->evlist = NULL;
 
 	return m;
-}
-
-static void metric__free(struct metric *m)
-{
-	free(m->metric_refs);
-	expr__ctx_free(m->pctx);
-	free((char *)m->modifier);
-	evlist__delete(m->evlist);
-	free(m);
+out_err:
+	metric__free(m);
+	return NULL;
 }
 
 static bool contains_metric_id(struct evsel **metric_events, int num_events,
@@ -874,6 +888,8 @@ struct metricgroup_add_iter_data {
 	int *ret;
 	bool *has_match;
 	bool metric_no_group;
+	const char *user_requested_cpu_list;
+	bool system_wide;
 	struct metric *root_metric;
 	const struct visited_metric *visited;
 	const struct pmu_events_table *table;
@@ -887,6 +903,8 @@ static int add_metric(struct list_head *metric_list,
 		      const struct pmu_event *pe,
 		      const char *modifier,
 		      bool metric_no_group,
+		      const char *user_requested_cpu_list,
+		      bool system_wide,
 		      struct metric *root_metric,
 		      const struct visited_metric *visited,
 		      const struct pmu_events_table *table);
@@ -899,6 +917,8 @@ static int add_metric(struct list_head *metric_list,
  * @metric_no_group: Should events written to events be grouped "{}" or
  *                   global. Grouping is the default but due to multiplexing the
  *                   user may override.
+ * @user_requested_cpu_list: Command line specified CPUs to record on.
+ * @system_wide: Are events for all processes recorded.
  * @root_metric: Metrics may reference other metrics to form a tree. In this
  *               case the root_metric holds all the IDs and a list of referenced
  *               metrics. When adding a root this argument is NULL.
@@ -910,6 +930,8 @@ static int add_metric(struct list_head *metric_list,
 static int resolve_metric(struct list_head *metric_list,
 			  const char *modifier,
 			  bool metric_no_group,
+			  const char *user_requested_cpu_list,
+			  bool system_wide,
 			  struct metric *root_metric,
 			  const struct visited_metric *visited,
 			  const struct pmu_events_table *table)
@@ -956,7 +978,8 @@ static int resolve_metric(struct list_head *metric_list,
 	 */
 	for (i = 0; i < pending_cnt; i++) {
 		ret = add_metric(metric_list, &pending[i].pe, modifier, metric_no_group,
-				root_metric, visited, table);
+				 user_requested_cpu_list, system_wide, root_metric, visited,
+				 table);
 		if (ret)
 			break;
 	}
@@ -974,6 +997,8 @@ static int resolve_metric(struct list_head *metric_list,
  *                   global. Grouping is the default but due to multiplexing the
  *                   user may override.
  * @runtime: A special argument for the parser only known at runtime.
+ * @user_requested_cpu_list: Command line specified CPUs to record on.
+ * @system_wide: Are events for all processes recorded.
  * @root_metric: Metrics may reference other metrics to form a tree. In this
  *               case the root_metric holds all the IDs and a list of referenced
  *               metrics. When adding a root this argument is NULL.
@@ -987,6 +1012,8 @@ static int __add_metric(struct list_head *metric_list,
 			const char *modifier,
 			bool metric_no_group,
 			int runtime,
+			const char *user_requested_cpu_list,
+			bool system_wide,
 			struct metric *root_metric,
 			const struct visited_metric *visited,
 			const struct pmu_events_table *table)
@@ -1011,7 +1038,8 @@ static int __add_metric(struct list_head *metric_list,
 		 * This metric is the root of a tree and may reference other
 		 * metrics that are added recursively.
 		 */
-		root_metric = metric__new(pe, modifier, metric_no_group, runtime);
+		root_metric = metric__new(pe, modifier, metric_no_group, runtime,
+					  user_requested_cpu_list, system_wide);
 		if (!root_metric)
 			return -ENOMEM;
 
@@ -1060,8 +1088,9 @@ static int __add_metric(struct list_head *metric_list,
 		ret = -EINVAL;
 	} else {
 		/* Resolve referenced metrics. */
-		ret = resolve_metric(metric_list, modifier, metric_no_group, root_metric,
-				     &visited_node, table);
+		ret = resolve_metric(metric_list, modifier, metric_no_group,
+				     user_requested_cpu_list, system_wide,
+				     root_metric, &visited_node, table);
 	}
 
 	if (ret) {
@@ -1109,6 +1138,8 @@ static int add_metric(struct list_head *metric_list,
 		      const struct pmu_event *pe,
 		      const char *modifier,
 		      bool metric_no_group,
+		      const char *user_requested_cpu_list,
+		      bool system_wide,
 		      struct metric *root_metric,
 		      const struct visited_metric *visited,
 		      const struct pmu_events_table *table)
@@ -1119,7 +1150,8 @@ static int add_metric(struct list_head *metric_list,
 
 	if (!strstr(pe->metric_expr, "?")) {
 		ret = __add_metric(metric_list, pe, modifier, metric_no_group, 0,
-				   root_metric, visited, table);
+				   user_requested_cpu_list, system_wide, root_metric,
+				   visited, table);
 	} else {
 		int j, count;
 
@@ -1132,7 +1164,8 @@ static int add_metric(struct list_head *metric_list,
 
 		for (j = 0; j < count && !ret; j++)
 			ret = __add_metric(metric_list, pe, modifier, metric_no_group, j,
-					root_metric, visited, table);
+					   user_requested_cpu_list, system_wide,
+					   root_metric, visited, table);
 	}
 
 	return ret;
@@ -1149,6 +1182,7 @@ static int metricgroup__add_metric_sys_event_iter(const struct pmu_event *pe,
 		return 0;
 
 	ret = add_metric(d->metric_list, pe, d->modifier, d->metric_no_group,
+			 d->user_requested_cpu_list, d->system_wide,
 			 d->root_metric, d->visited, d->table);
 	if (ret)
 		goto out;
@@ -1191,7 +1225,9 @@ struct metricgroup__add_metric_data {
 	struct list_head *list;
 	const char *metric_name;
 	const char *modifier;
+	const char *user_requested_cpu_list;
 	bool metric_no_group;
+	bool system_wide;
 	bool has_match;
 };
 
@@ -1208,8 +1244,8 @@ static int metricgroup__add_metric_callback(const struct pmu_event *pe,
 
 		data->has_match = true;
 		ret = add_metric(data->list, pe, data->modifier, data->metric_no_group,
-				 /*root_metric=*/NULL,
-				 /*visited_metrics=*/NULL, table);
+				 data->user_requested_cpu_list, data->system_wide,
+				 /*root_metric=*/NULL, /*visited_metrics=*/NULL, table);
 	}
 	return ret;
 }
@@ -1223,12 +1259,16 @@ static int metricgroup__add_metric_callback(const struct pmu_event *pe,
  * @metric_no_group: Should events written to events be grouped "{}" or
  *                   global. Grouping is the default but due to multiplexing the
  *                   user may override.
+ * @user_requested_cpu_list: Command line specified CPUs to record on.
+ * @system_wide: Are events for all processes recorded.
  * @metric_list: The list that the metric or metric group are added to.
  * @table: The table that is searched for metrics, most commonly the table for the
  *       architecture perf is running upon.
  */
 static int metricgroup__add_metric(const char *metric_name, const char *modifier,
 				   bool metric_no_group,
+				   const char *user_requested_cpu_list,
+				   bool system_wide,
 				   struct list_head *metric_list,
 				   const struct pmu_events_table *table)
 {
@@ -1242,6 +1282,8 @@ static int metricgroup__add_metric(const char *metric_name, const char *modifier
 			.metric_name = metric_name,
 			.modifier = modifier,
 			.metric_no_group = metric_no_group,
+			.user_requested_cpu_list = user_requested_cpu_list,
+			.system_wide = system_wide,
 			.has_match = false,
 		};
 		/*
@@ -1263,6 +1305,8 @@ static int metricgroup__add_metric(const char *metric_name, const char *modifier
 				.metric_name = metric_name,
 				.modifier = modifier,
 				.metric_no_group = metric_no_group,
+				.user_requested_cpu_list = user_requested_cpu_list,
+				.system_wide = system_wide,
 				.has_match = &has_match,
 				.ret = &ret,
 				.table = table,
@@ -1293,12 +1337,15 @@ static int metricgroup__add_metric(const char *metric_name, const char *modifier
  * @metric_no_group: Should events written to events be grouped "{}" or
  *                   global. Grouping is the default but due to multiplexing the
  *                   user may override.
+ * @user_requested_cpu_list: Command line specified CPUs to record on.
+ * @system_wide: Are events for all processes recorded.
  * @metric_list: The list that metrics are added to.
  * @table: The table that is searched for metrics, most commonly the table for the
  *       architecture perf is running upon.
  */
 static int metricgroup__add_metric_list(const char *list, bool metric_no_group,
-					struct list_head *metric_list,
+					const char *user_requested_cpu_list,
+					bool system_wide, struct list_head *metric_list,
 					const struct pmu_events_table *table)
 {
 	char *list_itr, *list_copy, *metric_name, *modifier;
@@ -1315,8 +1362,8 @@ static int metricgroup__add_metric_list(const char *list, bool metric_no_group,
 			*modifier++ = '\0';
 
 		ret = metricgroup__add_metric(metric_name, modifier,
-					      metric_no_group, metric_list,
-					      table);
+					      metric_no_group, user_requested_cpu_list,
+					      system_wide, metric_list, table);
 		if (ret == -EINVAL)
 			pr_err("Cannot find metric or group `%s'\n", metric_name);
 
@@ -1505,6 +1552,8 @@ static int parse_ids(bool metric_no_merge, struct perf_pmu *fake_pmu,
 static int parse_groups(struct evlist *perf_evlist, const char *str,
 			bool metric_no_group,
 			bool metric_no_merge,
+			const char *user_requested_cpu_list,
+			bool system_wide,
 			struct perf_pmu *fake_pmu,
 			struct rblist *metric_events_list,
 			const struct pmu_events_table *table)
@@ -1518,7 +1567,8 @@ static int parse_groups(struct evlist *perf_evlist, const char *str,
 	if (metric_events_list->nr_entries == 0)
 		metricgroup__rblist_init(metric_events_list);
 	ret = metricgroup__add_metric_list(str, metric_no_group,
-					   &metric_list, table);
+					   user_requested_cpu_list,
+					   system_wide, &metric_list, table);
 	if (ret)
 		goto out;
 
@@ -1650,6 +1700,8 @@ int metricgroup__parse_groups(struct evlist *perf_evlist,
 			      const char *str,
 			      bool metric_no_group,
 			      bool metric_no_merge,
+			      const char *user_requested_cpu_list,
+			      bool system_wide,
 			      struct rblist *metric_events)
 {
 	const struct pmu_events_table *table = pmu_events_table__find();
@@ -1657,8 +1709,9 @@ int metricgroup__parse_groups(struct evlist *perf_evlist,
 	if (!table)
 		return -EINVAL;
 
-	return parse_groups(perf_evlist, str, metric_no_group,
-			    metric_no_merge, NULL, metric_events, table);
+	return parse_groups(perf_evlist, str, metric_no_group, metric_no_merge,
+			    user_requested_cpu_list, system_wide,
+			    /*fake_pmu=*/NULL, metric_events, table);
 }
 
 int metricgroup__parse_groups_test(struct evlist *evlist,
@@ -1668,8 +1721,10 @@ int metricgroup__parse_groups_test(struct evlist *evlist,
 				   bool metric_no_merge,
 				   struct rblist *metric_events)
 {
-	return parse_groups(evlist, str, metric_no_group,
-			    metric_no_merge, &perf_pmu__fake, metric_events, table);
+	return parse_groups(evlist, str, metric_no_group, metric_no_merge,
+			    /*user_requested_cpu_list=*/NULL,
+			    /*system_wide=*/false,
+			    &perf_pmu__fake, metric_events, table);
 }
 
 static int metricgroup__has_metric_callback(const struct pmu_event *pe,
diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
index af9ceadaec0f..732d3a0d3334 100644
--- a/tools/perf/util/metricgroup.h
+++ b/tools/perf/util/metricgroup.h
@@ -68,6 +68,8 @@ int metricgroup__parse_groups(struct evlist *perf_evlist,
 			      const char *str,
 			      bool metric_no_group,
 			      bool metric_no_merge,
+			      const char *user_requested_cpu_list,
+			      bool system_wide,
 			      struct rblist *metric_events);
 int metricgroup__parse_groups_test(struct evlist *evlist,
 				   const struct pmu_events_table *table,
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 815af948abb9..9e1eddeff21b 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -911,7 +911,10 @@ static void generic_metric(struct perf_stat_config *config,
 	if (!pctx)
 		return;
 
+	if (config->user_requested_cpu_list)
+		pctx->sctx.user_requested_cpu_list = strdup(config->user_requested_cpu_list);
 	pctx->sctx.runtime = runtime;
+	pctx->sctx.system_wide = config->system_wide;
 	i = prepare_metric(metric_events, metric_refs, pctx, cpu_map_idx, st);
 	if (i < 0) {
 		expr__ctx_free(pctx);
@@ -1304,7 +1307,8 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 				core_bound * 100.);
 	} else if (evsel->metric_expr) {
 		generic_metric(config, evsel->metric_expr, evsel->metric_events, NULL,
-				evsel->name, evsel->metric_name, NULL, 1, cpu_map_idx, out, st);
+			       evsel->name, evsel->metric_name, NULL, 1,
+			       cpu_map_idx, out, st);
 	} else if (runtime_stat_n(st, STAT_NSECS, cpu_map_idx, &rsd) != 0) {
 		char unit = ' ';
 		char unit_buf[10] = "/sec";
@@ -1329,8 +1333,9 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
 			if (num++ > 0)
 				out->new_line(config, ctxp);
 			generic_metric(config, mexp->metric_expr, mexp->metric_events,
-					mexp->metric_refs, evsel->name, mexp->metric_name,
-					mexp->metric_unit, mexp->runtime, cpu_map_idx, out, st);
+				       mexp->metric_refs, evsel->name, mexp->metric_name,
+				       mexp->metric_unit, mexp->runtime,
+				       cpu_map_idx, out, st);
 		}
 	}
 	if (num == 0)
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 668250022f8c..72713b344b79 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -141,6 +141,8 @@ struct perf_stat_config {
 	bool			 stop_read_counter;
 	bool			 quiet;
 	bool			 iostat_run;
+	char			 *user_requested_cpu_list;
+	bool			 system_wide;
 	FILE			*output;
 	unsigned int		 interval;
 	unsigned int		 timeout;
-- 
2.37.2.672.g94769d06f0-goog



* [PATCH v2 7/7] perf test: Add basic core_wide expression test
  2022-08-31 17:49 [PATCH v2 0/7] Add core wide metric literal Ian Rogers
                   ` (5 preceding siblings ...)
  2022-08-31 17:49 ` [PATCH v2 6/7] perf metrics: Wire up core_wide Ian Rogers
@ 2022-08-31 17:49 ` Ian Rogers
  6 siblings, 0 replies; 11+ messages in thread
From: Ian Rogers @ 2022-08-31 17:49 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Andi Kleen, Kan Liang, Thomas Richter, James Clark, Miaoqian Lin,
	John Garry, Zhengjun Xing, Florian Fischer, linux-perf-users,
	linux-kernel, perry.taylor, caleb.biggers, kshipra.bopardikar,
	ahmad.yasin
  Cc: Stephane Eranian, Ian Rogers

Add a basic test for coverage, similar to the existing #smt_on test.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/expr.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index db736ed49556..8bd719766814 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -158,6 +158,9 @@ static int test__expr(struct test_suite *t __maybe_unused, int subtest __maybe_u
 	{
 		struct cpu_topology *topology = cpu_topology__new();
 		bool smton = smt_on(topology);
+		bool corewide = core_wide(/*system_wide=*/false,
+					  /*user_requested_cpu_list=*/NULL,
+					  topology);
 
 		cpu_topology__delete(topology);
 		expr__ctx_clear(ctx);
@@ -168,6 +171,16 @@ static int test__expr(struct test_suite *t __maybe_unused, int subtest __maybe_u
 		TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids,
 							  smton ? "EVENT1" : "EVENT2",
 							  (void **)&val_ptr));
+
+		expr__ctx_clear(ctx);
+		TEST_ASSERT_VAL("find ids",
+				expr__find_ids("EVENT1 if #core_wide else EVENT2",
+					NULL, ctx) == 0);
+		TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 1);
+		TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids,
+							  corewide ? "EVENT1" : "EVENT2",
+							  (void **)&val_ptr));
+
 	}
 	/* The expression is a constant 1.0 without needing to evaluate EVENT1. */
 	expr__ctx_clear(ctx);
-- 
2.37.2.672.g94769d06f0-goog



* Re: [PATCH v2 4/7] perf topology: Add core_wide
  2022-08-31 17:49 ` [PATCH v2 4/7] perf topology: Add core_wide Ian Rogers
@ 2022-09-02  6:29   ` Namhyung Kim
  2022-09-06 18:56     ` Ian Rogers
  0 siblings, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2022-09-02  6:29 UTC
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Andi Kleen,
	Kan Liang, Thomas Richter, James Clark, Miaoqian Lin, John Garry,
	Zhengjun Xing, Florian Fischer, linux-perf-users, linux-kernel,
	perry.taylor, caleb.biggers, kshipra.bopardikar, ahmad.yasin,
	Stephane Eranian

Hi Ian,

On Wed, Aug 31, 2022 at 10:50 AM Ian Rogers <irogers@google.com> wrote:
>
> It is possible to optimize metrics when all SMT threads (CPUs) on a
> core are measuring events in system wide mode. For example, the TMA
> metrics define CORE_CLKS for Sandybridge as:
>
> if SMT is disabled:
>   CPU_CLK_UNHALTED.THREAD
> if SMT is enabled and recording on all SMT threads:
>   CPU_CLK_UNHALTED.THREAD_ANY / 2
> if SMT is enabled and not recording on all SMT threads:
>   (CPU_CLK_UNHALTED.THREAD/2)*
>   (1+CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE/CPU_CLK_UNHALTED.REF_XCLK )
>
> That is, two more events are necessary when not gathering counts on all
> SMT threads. To distinguish measuring all SMT threads on a core from
> measuring system wide (all CPUs), call the new property core wide. Add
> a core wide test that determines the property from the user requested
> CPUs, the topology and whether recording is system wide. System wide is
> required as other processes running on an SMT thread will change the
> counts.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
[SNIP]
> diff --git a/tools/perf/util/smt.c b/tools/perf/util/smt.c
> index ce90c4ee4138..994e9e418227 100644
> --- a/tools/perf/util/smt.c
> +++ b/tools/perf/util/smt.c
> @@ -21,3 +21,17 @@ bool smt_on(const struct cpu_topology *topology)
>         cached = true;
>         return cached_result;
>  }
> +
> +bool core_wide(bool system_wide, const char *user_requested_cpu_list,
> +              const struct cpu_topology *topology)
> +{
> +       /* If not everything running on a core is being recorded then we can't use core_wide. */
> +       if (!system_wide)
> +               return false;

I'm not sure if it's correct.  Wouldn't it be ok if it runs on all
threads in a core
even if system wide is off?  I thought that's what the below code checks.

In fact I thought the opposite logic like

    if (system_wide)
        return true;

But it seems the code allows a cpu_list in system wide mode.
Then it also needs to check that the user requested cpus are NULL.
(IMHO system_wide should be cleared when it has a cpu list...)

    if (system_wide && !user_requested_cpu_list)
        return true;

And if we have a target pointer, we could add this too.

    if (!target__has_cpu(target))
        return false;
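
Putting those checks together, a minimal sketch of the variant I mean
(hypothetical, not the posted patch; it assumes the target pointer can
be passed down, and core_wide_alt is just an illustrative name):

    #include <stdbool.h>
    #include "smt.h"
    #include "target.h"
    #include "cputopo.h"

    bool core_wide_alt(bool system_wide, const char *user_requested_cpu_list,
                       struct target *target,
                       const struct cpu_topology *topology)
    {
            /* All CPUs, all processes: trivially core wide. */
            if (system_wide && !user_requested_cpu_list)
                    return true;

            /* No CPU list requested, so whole cores aren't guaranteed. */
            if (!target__has_cpu(target))
                    return false;

            /* SMT disabled: any single CPU covers its whole core. */
            if (!smt_on(topology))
                    return true;

            /* Do the requested CPUs cover every SMT thread of their cores? */
            return cpu_topology__core_wide(topology, user_requested_cpu_list);
    }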

Thanks,
Namhyung


> +
> +       /* Cheap case that SMT is disabled and therefore we're inherently core_wide. */
> +       if (!smt_on(topology))
> +               return true;
> +
> +       return cpu_topology__core_wide(topology, user_requested_cpu_list);
> +}
> diff --git a/tools/perf/util/smt.h b/tools/perf/util/smt.h
> index e26999c6b8d4..ae9095f2c38c 100644
> --- a/tools/perf/util/smt.h
> +++ b/tools/perf/util/smt.h
> @@ -7,4 +7,11 @@ struct cpu_topology;
>  /* Returns true if SMT (aka hyperthreading) is enabled. */
>  bool smt_on(const struct cpu_topology *topology);
>
> +/*
> + * Returns true when system wide and all SMT threads for a core are in the
> + * user_requested_cpus map.
> + */
> +bool core_wide(bool system_wide, const char *user_requested_cpu_list,
> +              const struct cpu_topology *topology);
> +
>  #endif /* __SMT_H */
> --
> 2.37.2.672.g94769d06f0-goog
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/7] perf topology: Add core_wide
  2022-09-02  6:29   ` Namhyung Kim
@ 2022-09-06 18:56     ` Ian Rogers
  2022-09-09 18:29       ` Namhyung Kim
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2022-09-06 18:56 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Andi Kleen,
	Kan Liang, Thomas Richter, James Clark, Miaoqian Lin, John Garry,
	Zhengjun Xing, Florian Fischer, linux-perf-users, linux-kernel,
	perry.taylor, caleb.biggers, kshipra.bopardikar, ahmad.yasin,
	Stephane Eranian

On Thu, Sep 1, 2022 at 11:29 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Wed, Aug 31, 2022 at 10:50 AM Ian Rogers <irogers@google.com> wrote:
> >
> > It is possible to optimize metrics when all SMT threads (CPUs) on a
> > core are measuring events in system wide mode. For example, TMA
> > metrics defines CORE_CLKS for Sandybridge as:
> >
> > if SMT is disabled:
> >   CPU_CLK_UNHALTED.THREAD
> > if SMT is enabled and recording on all SMT threads:
> >   CPU_CLK_UNHALTED.THREAD_ANY / 2
> > if SMT is enabled and not recording on all SMT threads:
> >   (CPU_CLK_UNHALTED.THREAD/2)*
> >   (1+CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE/CPU_CLK_UNHALTED.REF_XCLK )
> >
> > That is, two more events are necessary when not gathering counts on all
> > SMT threads. To distinguish all SMT threads on a core vs system wide
> > (all CPUs) call the new property core wide.  Add a core wide test that
> > determines the property from user requested CPUs, the topology and
> > system wide. System wide is required as other processes running on an
> > SMT thread will change the counts.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> [SNIP]
> > diff --git a/tools/perf/util/smt.c b/tools/perf/util/smt.c
> > index ce90c4ee4138..994e9e418227 100644
> > --- a/tools/perf/util/smt.c
> > +++ b/tools/perf/util/smt.c
> > @@ -21,3 +21,17 @@ bool smt_on(const struct cpu_topology *topology)
> >         cached = true;
> >         return cached_result;
> >  }
> > +
> > +bool core_wide(bool system_wide, const char *user_requested_cpu_list,
> > +              const struct cpu_topology *topology)
> > +{
> > +       /* If not everything running on a core is being recorded then we can't use core_wide. */
> > +       if (!system_wide)
> > +               return false;
>
> I'm not sure if it's correct.  Wouldn't it be ok if it runs on all
> threads in a core even if system wide is off?  I thought that's what
> the code below checks.
>
> In fact I thought the opposite logic like
>
>     if (system_wide)
>         return true;
>
> But it seems the code allows a cpu_list in system wide mode.
> Then it also needs to check that the user requested cpus are NULL.
> (IMHO system_wide should be cleared when it has a cpu list...)
>
>     if (system_wide && !user_requested_cpu_list)
>         return true;
>
> And if we have a target pointer, we could add this too.
>
>     if (!target__has_cpu(target))
>         return false;
>
> Thanks,
> Namhyung

Thanks Namhyung,

It may or may not be correct without the system wide flag, so I want
to be conservative and say we need the flag. Let me provide more
detail with an example:

I'm going to assume we want to measure how many slots there were for
an SMT thread. In the "Count Domain" of TMA metrics before Icelake,
slots is a core wide metric, meaning that you are measuring >1
hyperthread. As we are measuring >1 hyperthread and we don't control
the scheduling of the other hyperthread, the only valid modes are
system wide on all CPUs, or all CPUs on a core (aka core wide in this
patch set). However, the slots count domain information also says
"Slots - ICL onwards or EBS_Mode before ICL", meaning the count domain
is valid for an SMT thread. If I look at SkylakeX, we currently have
two metrics:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json?h=perf/core#n167

    {
        "BriefDescription": "Total issue-pipeline slots (per-Physical Core till ICL; per-Logical Processor ICL onward)",
        "MetricExpr": "4 * CPU_CLK_UNHALTED.THREAD",
        "MetricGroup": "TmaL1",
        "MetricName": "SLOTS"
    },
    {
        "BriefDescription": "Total issue-pipeline slots (per-Physical Core till ICL; per-Logical Processor ICL onward)",
        "MetricExpr": "4 * ( ( CPU_CLK_UNHALTED.THREAD / 2 ) * ( 1 + CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK ) )",
        "MetricGroup": "TmaL1_SMT",
        "MetricName": "SLOTS_SMT"
    },

The SLOTS metric is going to compute a valid value if hyperthreading
is disabled; in other metrics you will see
(CPU_CLK_UNHALTED.THREAD_ANY / 2), where THREAD_ANY and THREAD are the
same event encoding. The reason for the divide by 2 is to adjust the
count down so that when we add the other hyperthread's count we get
the core wide value in the aggregation - the perf event on the 2nd
hyperthread isn't necessary, but that's a separate problem. SLOTS_SMT
is funny, so let's dig into it:
The 4 is the pipeline width.
CPU_CLK_UNHALTED.THREAD is defined as "Counts the number of core
cycles while the thread is not in a halt state."
The divide by 2 is because of 2 hyperthreads.
The 1 is because we assume that half the slots are going to be given
to our hyperthread.
The CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK
will give a ratio for how many cycles our SMT thread was the only one
running, and so we got those slots from the other hyperthread.
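
To make that arithmetic concrete, here is a tiny standalone C sketch
with made-up counter values (they are not real measurements):

    #include <stdio.h>

    int main(void)
    {
            /* Hypothetical raw counts observed on one SMT thread. */
            double thread = 1000000.0;           /* CPU_CLK_UNHALTED.THREAD */
            double one_thread_active = 200000.0; /* CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE */
            double ref_xclk = 400000.0;          /* CPU_CLK_UNHALTED.REF_XCLK */

            /*
             * 4 * (THREAD / 2) * (1 + ONE_THREAD_ACTIVE / REF_XCLK):
             * half the core's slots, scaled up by the fraction of
             * cycles where this thread had the whole core to itself.
             */
            double slots_smt = 4 * (thread / 2) * (1 + one_thread_active / ref_xclk);

            printf("SLOTS_SMT = %.0f\n", slots_smt); /* 3000000 with these numbers. */
            return 0;
    }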

With #core_wide we can have a single SLOTS metric of (new lines and //
comments for clarity):

// If we're #core_wide with SMT, and we know the work on both
// hyperthreads is being measured for all processes, use a single
// counter.
(4 * CPU_CLK_UNHALTED.THREAD_ANY / 2
  if #core_wide else
// We're not core wide, so we need thread counters to adjust the slots
// metric based on the time this SMT thread is active.
  4 * ( CPU_CLK_UNHALTED.THREAD / 2 ) * ( 1 +
  CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK )
) if #smt_on
// With no hyperthreading we can just use the core wide counter.
else 4 * CPU_CLK_UNHALTED.THREAD

As this metric is a foundation for many others, there is a great deal
of use for it, and the current pre-Icelake computation is generally
only valid if you happen to have specified a core wide set up on the
command line - which is error prone.
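
An illustrative C mirror of the selection the combined expression
performs may make the three cases clearer. This is a sketch only: it
is not perf code, and the function and parameter names are invented.

    #include <stdbool.h>

    /* Mirror of the three-way SLOTS selection above (illustrative only). */
    static double slots(bool core_wide, bool smt_on, double thread,
                        double thread_any, double one_thread_active,
                        double ref_xclk)
    {
            if (!smt_on)
                    return 4 * thread;         /* SMT off: the thread owns the core. */
            if (core_wide)
                    return 4 * thread_any / 2; /* Both hyperthreads measured. */
            /* Only this hyperthread measured: estimate its share of slots. */
            return 4 * (thread / 2) * (1 + one_thread_active / ref_xclk);
    }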

Thanks,
Ian



> > +
> > +       /* Cheap case that SMT is disabled and therefore we're inherently core_wide. */
> > +       if (!smt_on(topology))
> > +               return true;
> > +
> > +       return cpu_topology__core_wide(topology, user_requested_cpu_list);
> > +}
> > diff --git a/tools/perf/util/smt.h b/tools/perf/util/smt.h
> > index e26999c6b8d4..ae9095f2c38c 100644
> > --- a/tools/perf/util/smt.h
> > +++ b/tools/perf/util/smt.h
> > @@ -7,4 +7,11 @@ struct cpu_topology;
> >  /* Returns true if SMT (aka hyperthreading) is enabled. */
> >  bool smt_on(const struct cpu_topology *topology);
> >
> > +/*
> > + * Returns true when system wide and all SMT threads for a core are in the
> > + * user_requested_cpus map.
> > + */
> > +bool core_wide(bool system_wide, const char *user_requested_cpu_list,
> > +              const struct cpu_topology *topology);
> > +
> >  #endif /* __SMT_H */
> > --
> > 2.37.2.672.g94769d06f0-goog
> >

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 4/7] perf topology: Add core_wide
  2022-09-06 18:56     ` Ian Rogers
@ 2022-09-09 18:29       ` Namhyung Kim
  0 siblings, 0 replies; 11+ messages in thread
From: Namhyung Kim @ 2022-09-09 18:29 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Andi Kleen,
	Kan Liang, Thomas Richter, James Clark, Miaoqian Lin, John Garry,
	Zhengjun Xing, Florian Fischer, linux-perf-users, linux-kernel,
	perry.taylor, caleb.biggers, kshipra.bopardikar, ahmad.yasin,
	Stephane Eranian

On Tue, Sep 6, 2022 at 11:56 AM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Sep 1, 2022 at 11:29 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Wed, Aug 31, 2022 at 10:50 AM Ian Rogers <irogers@google.com> wrote:
> > >
> > > It is possible to optimize metrics when all SMT threads (CPUs) on a
> > > core are measuring events in system wide mode. For example, TMA
> > > metrics defines CORE_CLKS for Sandybridge as:
> > >
> > > if SMT is disabled:
> > >   CPU_CLK_UNHALTED.THREAD
> > > if SMT is enabled and recording on all SMT threads:
> > >   CPU_CLK_UNHALTED.THREAD_ANY / 2
> > > if SMT is enabled and not recording on all SMT threads:
> > >   (CPU_CLK_UNHALTED.THREAD/2)*
> > >   (1+CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE/CPU_CLK_UNHALTED.REF_XCLK )
> > >
> > > That is, two more events are necessary when not gathering counts on all
> > > SMT threads. To distinguish all SMT threads on a core vs system wide
> > > (all CPUs) call the new property core wide.  Add a core wide test that
> > > determines the property from user requested CPUs, the topology and
> > > system wide. System wide is required as other processes running on an
> > > SMT thread will change the counts.
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > [SNIP]
> > > diff --git a/tools/perf/util/smt.c b/tools/perf/util/smt.c
> > > index ce90c4ee4138..994e9e418227 100644
> > > --- a/tools/perf/util/smt.c
> > > +++ b/tools/perf/util/smt.c
> > > @@ -21,3 +21,17 @@ bool smt_on(const struct cpu_topology *topology)
> > >         cached = true;
> > >         return cached_result;
> > >  }
> > > +
> > > +bool core_wide(bool system_wide, const char *user_requested_cpu_list,
> > > +              const struct cpu_topology *topology)
> > > +{
> > > +       /* If not everything running on a core is being recorded then we can't use core_wide. */
> > > +       if (!system_wide)
> > > +               return false;
> >
> > I'm not sure if it's correct.  Wouldn't it be ok if it runs on all
> > threads in a core even if system wide is off?  I thought that's what
> > the code below checks.
> >
> > In fact I thought the opposite logic like
> >
> >     if (system_wide)
> >         return true;
> >
> > But it seems the code allows a cpu_list in system wide mode.
> > Then it also needs to check that the user requested cpus are NULL.
> > (IMHO system_wide should be cleared when it has a cpu list...)
> >
> >     if (system_wide && !user_requested_cpu_list)
> >         return true;
> >
> > And if we have a target pointer, we could add this too.
> >
> >     if (!target__has_cpu(target))
> >         return false;
> >
> > Thanks,
> > Namhyung
>
> Thanks Namhyung,
>
> It may or may not be correct without the system wide flag, so I want
> to be conservative and say we need the flag. Let me provide more
> detail with an example:

Ok.

>
> I'm going to assume we want to measure how many slots there were for
> an SMT thread. In the "Count Domain" of TMA metrics before Icelake,
> slots is a core wide metric, meaning that you are measuring >1
> hyperthread. As we are measuring >1 hyperthread and we don't control
> the scheduling of the other hyperthread, the only valid modes are
> system wide on all CPUs, or all CPUs on a core (aka core wide in this
> patch set). However, the slots count domain information also says
> "Slots - ICL onwards or EBS_Mode before ICL", meaning the count domain
> is valid for an SMT thread. If I look at SkylakeX, we currently have
> two metrics:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json?h=perf/core#n167
>
>     {
>         "BriefDescription": "Total issue-pipeline slots (per-Physical Core till ICL; per-Logical Processor ICL onward)",
>         "MetricExpr": "4 * CPU_CLK_UNHALTED.THREAD",
>         "MetricGroup": "TmaL1",
>         "MetricName": "SLOTS"
>     },
>     {
>         "BriefDescription": "Total issue-pipeline slots (per-Physical Core till ICL; per-Logical Processor ICL onward)",
>         "MetricExpr": "4 * ( ( CPU_CLK_UNHALTED.THREAD / 2 ) * ( 1 + CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK ) )",
>         "MetricGroup": "TmaL1_SMT",
>         "MetricName": "SLOTS_SMT"
>     },
>
> The SLOTS metric is going to compute a valid value if hyperthreading
> is disabled; in other metrics you will see
> (CPU_CLK_UNHALTED.THREAD_ANY / 2), where THREAD_ANY and THREAD are the
> same event encoding. The reason for the divide by 2 is to adjust the
> count down so that when we add the other hyperthread's count we get
> the core wide value in the aggregation - the perf event on the 2nd
> hyperthread isn't necessary, but that's a separate problem. SLOTS_SMT
> is funny, so let's dig into it:
> The 4 is the pipeline width.
> CPU_CLK_UNHALTED.THREAD is defined as "Counts the number of core
> cycles while the thread is not in a halt state."
> The divide by 2 is because of 2 hyperthreads.
> The 1 is because we assume that half the slots are going to be given
> to our hyperthread.
> The CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK
> will give a ratio for how many cycles our SMT thread was the only one
> running, and so we got those slots from the other hyperthread.
>
> With #core_wide we can have a single SLOTS metric of (new lines and //
> comments for clarity):
>
> // If we're #core_wide with SMT, and we know the work on both
> // hyperthreads is being measured for all processes, use a single
> // counter.
> (4 * CPU_CLK_UNHALTED.THREAD_ANY / 2
>   if #core_wide else
> // We're not core wide, so we need thread counters to adjust the slots
> // metric based on the time this SMT thread is active.
>   4 * ( CPU_CLK_UNHALTED.THREAD / 2 ) * ( 1 +
>   CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE / CPU_CLK_UNHALTED.REF_XCLK )
> ) if #smt_on
> // With no hyperthreading we can just use the core wide counter.
> else 4 * CPU_CLK_UNHALTED.THREAD
>
> As this metric is a foundation for many others, there is a great deal
> of use for it, and the current pre-Icelake computation is generally
> only valid if you happen to have specified a core wide set up on the
> command line - which is error prone.

Thanks for the explanation.  I understand why we need core_wide
for accurate metrics.  My concern was just to have a clear
relationship between system-wide mode and user requested cpu list.

But I think that should be a separate thread; let me work on the cpu
map cleanups as we discussed earlier in:

https://lore.kernel.org/r/20220506122601.367589-23-adrian.hunter@intel.com/

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-09-09 18:30 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-08-31 17:49 [PATCH v2 0/7] Add core wide metric literal Ian Rogers
2022-08-31 17:49 ` [PATCH v2 1/7] perf metric: Return early if no CPU PMU table exists Ian Rogers
2022-08-31 17:49 ` [PATCH v2 2/7] perf expr: Move the scanner_ctx into the parse_ctx Ian Rogers
2022-08-31 17:49 ` [PATCH v2 3/7] perf smt: Compute SMT from topology Ian Rogers
2022-08-31 17:49 ` [PATCH v2 4/7] perf topology: Add core_wide Ian Rogers
2022-09-02  6:29   ` Namhyung Kim
2022-09-06 18:56     ` Ian Rogers
2022-09-09 18:29       ` Namhyung Kim
2022-08-31 17:49 ` [PATCH v2 5/7] perf stat: Delay metric parsing Ian Rogers
2022-08-31 17:49 ` [PATCH v2 6/7] perf metrics: Wire up core_wide Ian Rogers
2022-08-31 17:49 ` [PATCH v2 7/7] perf test: Add basic core_wide expression test Ian Rogers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).