* [PATCH 0/7] New function and literals for metrics
@ 2021-11-05 17:09 Ian Rogers
2021-11-05 17:09 ` [PATCH 1/7] perf test: Add expr test for events with hyphens Ian Rogers
` (6 more replies)
0 siblings, 7 replies; 14+ messages in thread
From: Ian Rogers @ 2021-11-05 17:09 UTC (permalink / raw)
To: Andi Kleen, Jiri Olsa, Jin Yao, Namhyung Kim, John Garry,
Kajol Jain, Paul A . Clarke, Arnaldo Carvalho de Melo,
Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Madhavan Srinivasan, Song Liu,
Wan Jiabing, Yury Norov, Changbin Du, linux-perf-users,
linux-kernel
Cc: Ian Rogers
Add a test for hyphenated events that are common with Intel's icelake
topdown events.
Try to tidy the existing cputopo code and make it more consistent with
the sysfs ABI. Refactor the #smt_on literal parsing into a general
literal token that we can use for more literals. This is then used for
num_cpus, num_cores, num_dies and num_packages literals that use the
existing topology code. Finally, a source_count function is added
which is used to determine the number of events contributing to an
aggregate event. The intent for these new literals and function is for
them to be used in upcoming metrics.
Ian Rogers (7):
perf test: Add expr test for events with hyphens
perf cputopo: Update to use pakage_cpus
perf cputopo: Match die_siblings to topology ABI name
perf cputopo: Match thread_siblings to topology ABI name
perf expr: Add literal values starting with #
perf expr: Add metric literals for topology.
perf expr: Add source_count for aggregating events
tools/perf/tests/expr.c | 34 ++++++++++++++-
tools/perf/util/cputopo.c | 78 ++++++++++++++++++-----------------
tools/perf/util/cputopo.h | 33 ++++++++++++---
tools/perf/util/evsel.c | 12 ++++++
tools/perf/util/evsel.h | 1 +
tools/perf/util/expr.c | 65 ++++++++++++++++++++++++++---
tools/perf/util/expr.h | 4 ++
tools/perf/util/expr.l | 16 ++++++-
tools/perf/util/expr.y | 72 ++++++++++++++++++--------------
tools/perf/util/header.c | 20 ++++-----
tools/perf/util/stat-shadow.c | 7 +++-
11 files changed, 250 insertions(+), 92 deletions(-)
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/7] perf test: Add expr test for events with hyphens
2021-11-05 17:09 [PATCH 0/7] New function and literals for metrics Ian Rogers
@ 2021-11-05 17:09 ` Ian Rogers
2021-11-05 17:09 ` [PATCH 2/7] perf cputopo: Update to use pakage_cpus Ian Rogers
` (5 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2021-11-05 17:09 UTC (permalink / raw)
To: Andi Kleen, Jiri Olsa, Jin Yao, Namhyung Kim, John Garry,
Kajol Jain, Paul A . Clarke, Arnaldo Carvalho de Melo,
Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Madhavan Srinivasan, Song Liu,
Wan Jiabing, Yury Norov, Changbin Du, linux-perf-users,
linux-kernel
Cc: Ian Rogers
An example of such an event is topdown-fe-bound.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/tests/expr.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 077783223ce0..9ee2dc91c27b 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -134,6 +134,16 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "EVENT2,param=3@",
(void **)&val_ptr));
+ expr__ctx_clear(ctx);
+ TEST_ASSERT_VAL("find ids",
+ expr__find_ids("dash\\-event1 - dash\\-event2",
+ NULL, ctx) == 0);
+ TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 2);
+ TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event1",
+ (void **)&val_ptr));
+ TEST_ASSERT_VAL("find ids", hashmap__find(ctx->ids, "dash-event2",
+ (void **)&val_ptr));
+
/* Only EVENT1 or EVENT2 need be measured depending on the value of smt_on. */
expr__ctx_clear(ctx);
TEST_ASSERT_VAL("find ids",
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/7] perf cputopo: Update to use pakage_cpus
2021-11-05 17:09 [PATCH 0/7] New function and literals for metrics Ian Rogers
2021-11-05 17:09 ` [PATCH 1/7] perf test: Add expr test for events with hyphens Ian Rogers
@ 2021-11-05 17:09 ` Ian Rogers
2021-11-05 17:09 ` [PATCH 3/7] perf cputopo: Match die_siblings to topology ABI name Ian Rogers
` (4 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2021-11-05 17:09 UTC (permalink / raw)
To: Andi Kleen, Jiri Olsa, Jin Yao, Namhyung Kim, John Garry,
Kajol Jain, Paul A . Clarke, Arnaldo Carvalho de Melo,
Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Madhavan Srinivasan, Song Liu,
Wan Jiabing, Yury Norov, Changbin Du, linux-perf-users,
linux-kernel
Cc: Ian Rogers
core_siblings_list is the deprecated topology name for
package_cpus_list, update the code to try the non-deprecated path first.
Adjust variable names to match topology name.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/cputopo.c | 26 ++++++++++++++++----------
tools/perf/util/cputopo.h | 11 +++++++++--
| 6 +++---
3 files changed, 28 insertions(+), 15 deletions(-)
diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index ec77e2a7b3ca..f72a7de19e02 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -14,7 +14,9 @@
#include "env.h"
#include "pmu-hybrid.h"
-#define CORE_SIB_FMT \
+#define PACKAGE_CPUS_FMT \
+ "%s/devices/system/cpu/cpu%d/topology/package_cpus_list"
+#define PACKAGE_CPUS_FMT_OLD \
"%s/devices/system/cpu/cpu%d/topology/core_siblings_list"
#define DIE_SIB_FMT \
"%s/devices/system/cpu/cpu%d/topology/die_cpus_list"
@@ -39,8 +41,12 @@ static int build_cpu_topology(struct cpu_topology *tp, int cpu)
u32 i = 0;
int ret = -1;
- scnprintf(filename, MAXPATHLEN, CORE_SIB_FMT,
+ scnprintf(filename, MAXPATHLEN, PACKAGE_CPUS_FMT,
sysfs__mountpoint(), cpu);
+ if (access(filename, F_OK) == -1) {
+ scnprintf(filename, MAXPATHLEN, PACKAGE_CPUS_FMT_OLD,
+ sysfs__mountpoint(), cpu);
+ }
fp = fopen(filename, "r");
if (!fp)
goto try_dies;
@@ -54,13 +60,13 @@ static int build_cpu_topology(struct cpu_topology *tp, int cpu)
if (p)
*p = '\0';
- for (i = 0; i < tp->core_sib; i++) {
- if (!strcmp(buf, tp->core_siblings[i]))
+ for (i = 0; i < tp->package_cpus_lists; i++) {
+ if (!strcmp(buf, tp->package_cpus_list[i]))
break;
}
- if (i == tp->core_sib) {
- tp->core_siblings[i] = buf;
- tp->core_sib++;
+ if (i == tp->package_cpus_lists) {
+ tp->package_cpus_list[i] = buf;
+ tp->package_cpus_lists++;
buf = NULL;
len = 0;
}
@@ -139,8 +145,8 @@ void cpu_topology__delete(struct cpu_topology *tp)
if (!tp)
return;
- for (i = 0 ; i < tp->core_sib; i++)
- zfree(&tp->core_siblings[i]);
+ for (i = 0 ; i < tp->package_cpus_lists; i++)
+ zfree(&tp->package_cpus_list[i]);
if (tp->die_sib) {
for (i = 0 ; i < tp->die_sib; i++)
@@ -205,7 +211,7 @@ struct cpu_topology *cpu_topology__new(void)
tp = addr;
addr += sizeof(*tp);
- tp->core_siblings = addr;
+ tp->package_cpus_list = addr;
addr += sz;
if (has_die) {
tp->die_siblings = addr;
diff --git a/tools/perf/util/cputopo.h b/tools/perf/util/cputopo.h
index d9af97177068..4ec9cb192c3d 100644
--- a/tools/perf/util/cputopo.h
+++ b/tools/perf/util/cputopo.h
@@ -5,10 +5,17 @@
#include <linux/types.h>
struct cpu_topology {
- u32 core_sib;
+ /* The number of unique package_cpus_lists below. */
+ u32 package_cpus_lists;
u32 die_sib;
u32 thread_sib;
- char **core_siblings;
+ /*
+ * An array of strings where each string is unique and read from
+ * /sys/devices/system/cpu/cpuX/topology/package_cpus_list. From the ABI
+ * each of these is a human-readable list of CPUs sharing the same
+ * physical_package_id. The format is like 0-3, 8-11, 14,17.
+ */
+ const char **package_cpus_list;
char **die_siblings;
char **thread_siblings;
};
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 1c7414f66655..212512b9feb0 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -582,12 +582,12 @@ static int write_cpu_topology(struct feat_fd *ff,
if (!tp)
return -1;
- ret = do_write(ff, &tp->core_sib, sizeof(tp->core_sib));
+ ret = do_write(ff, &tp->package_cpus_lists, sizeof(tp->package_cpus_lists));
if (ret < 0)
goto done;
- for (i = 0; i < tp->core_sib; i++) {
- ret = do_write_string(ff, tp->core_siblings[i]);
+ for (i = 0; i < tp->package_cpus_lists; i++) {
+ ret = do_write_string(ff, tp->package_cpus_list[i]);
if (ret < 0)
goto done;
}
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/7] perf cputopo: Match die_siblings to topology ABI name
2021-11-05 17:09 [PATCH 0/7] New function and literals for metrics Ian Rogers
2021-11-05 17:09 ` [PATCH 1/7] perf test: Add expr test for events with hyphens Ian Rogers
2021-11-05 17:09 ` [PATCH 2/7] perf cputopo: Update to use pakage_cpus Ian Rogers
@ 2021-11-05 17:09 ` Ian Rogers
2021-11-05 17:09 ` [PATCH 4/7] perf cputopo: Match thread_siblings " Ian Rogers
` (3 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2021-11-05 17:09 UTC (permalink / raw)
To: Andi Kleen, Jiri Olsa, Jin Yao, Namhyung Kim, John Garry,
Kajol Jain, Paul A . Clarke, Arnaldo Carvalho de Melo,
Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Madhavan Srinivasan, Song Liu,
Wan Jiabing, Yury Norov, Changbin Du, linux-perf-users,
linux-kernel
Cc: Ian Rogers
The topology name for die_siblings is die_cpus_list, use this for
consistency and add documentation.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/cputopo.c | 26 ++++++++++++--------------
tools/perf/util/cputopo.h | 11 +++++++++--
| 8 ++++----
3 files changed, 25 insertions(+), 20 deletions(-)
diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index f72a7de19e02..420aeda16fbb 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -18,7 +18,7 @@
"%s/devices/system/cpu/cpu%d/topology/package_cpus_list"
#define PACKAGE_CPUS_FMT_OLD \
"%s/devices/system/cpu/cpu%d/topology/core_siblings_list"
-#define DIE_SIB_FMT \
+#define DIE_CPUS_FMT \
"%s/devices/system/cpu/cpu%d/topology/die_cpus_list"
#define THRD_SIB_FMT \
"%s/devices/system/cpu/cpu%d/topology/thread_siblings_list"
@@ -73,10 +73,10 @@ static int build_cpu_topology(struct cpu_topology *tp, int cpu)
ret = 0;
try_dies:
- if (!tp->die_siblings)
+ if (!tp->die_cpus_list)
goto try_threads;
- scnprintf(filename, MAXPATHLEN, DIE_SIB_FMT,
+ scnprintf(filename, MAXPATHLEN, DIE_CPUS_FMT,
sysfs__mountpoint(), cpu);
fp = fopen(filename, "r");
if (!fp)
@@ -91,13 +91,13 @@ static int build_cpu_topology(struct cpu_topology *tp, int cpu)
if (p)
*p = '\0';
- for (i = 0; i < tp->die_sib; i++) {
- if (!strcmp(buf, tp->die_siblings[i]))
+ for (i = 0; i < tp->die_cpus_lists; i++) {
+ if (!strcmp(buf, tp->die_cpus_list[i]))
break;
}
- if (i == tp->die_sib) {
- tp->die_siblings[i] = buf;
- tp->die_sib++;
+ if (i == tp->die_cpus_lists) {
+ tp->die_cpus_list[i] = buf;
+ tp->die_cpus_lists++;
buf = NULL;
len = 0;
}
@@ -148,10 +148,8 @@ void cpu_topology__delete(struct cpu_topology *tp)
for (i = 0 ; i < tp->package_cpus_lists; i++)
zfree(&tp->package_cpus_list[i]);
- if (tp->die_sib) {
- for (i = 0 ; i < tp->die_sib; i++)
- zfree(&tp->die_siblings[i]);
- }
+ for (i = 0 ; i < tp->die_cpus_lists; i++)
+ zfree(&tp->die_cpus_list[i]);
for (i = 0 ; i < tp->thread_sib; i++)
zfree(&tp->thread_siblings[i]);
@@ -170,7 +168,7 @@ static bool has_die_topology(void)
if (strncmp(uts.machine, "x86_64", 6))
return false;
- scnprintf(filename, MAXPATHLEN, DIE_SIB_FMT,
+ scnprintf(filename, MAXPATHLEN, DIE_CPUS_FMT,
sysfs__mountpoint(), 0);
if (access(filename, F_OK) == -1)
return false;
@@ -214,7 +212,7 @@ struct cpu_topology *cpu_topology__new(void)
tp->package_cpus_list = addr;
addr += sz;
if (has_die) {
- tp->die_siblings = addr;
+ tp->die_cpus_list = addr;
addr += sz;
}
tp->thread_siblings = addr;
diff --git a/tools/perf/util/cputopo.h b/tools/perf/util/cputopo.h
index 4ec9cb192c3d..a3e01c367853 100644
--- a/tools/perf/util/cputopo.h
+++ b/tools/perf/util/cputopo.h
@@ -7,7 +7,8 @@
struct cpu_topology {
/* The number of unique package_cpus_lists below. */
u32 package_cpus_lists;
- u32 die_sib;
+ /* The number of unique die_cpu_lists below. */
+ u32 die_cpus_lists;
u32 thread_sib;
/*
* An array of strings where each string is unique and read from
@@ -16,7 +17,13 @@ struct cpu_topology {
* physical_package_id. The format is like 0-3, 8-11, 14,17.
*/
const char **package_cpus_list;
- char **die_siblings;
+ /*
+ * An array of string where each string is unique and from
+ * /sys/devices/system/cpu/cpuX/topology/die_cpus_list. From the ABI
+ * each of these is a human-readable list of CPUs within the same die.
+ * The format is like 0-3, 8-11, 14,17.
+ */
+ const char **die_cpus_list;
char **thread_siblings;
};
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 212512b9feb0..65ec250f3619 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -616,15 +616,15 @@ static int write_cpu_topology(struct feat_fd *ff,
return ret;
}
- if (!tp->die_sib)
+ if (!tp->die_cpus_lists)
goto done;
- ret = do_write(ff, &tp->die_sib, sizeof(tp->die_sib));
+ ret = do_write(ff, &tp->die_cpus_lists, sizeof(tp->die_cpus_lists));
if (ret < 0)
goto done;
- for (i = 0; i < tp->die_sib; i++) {
- ret = do_write_string(ff, tp->die_siblings[i]);
+ for (i = 0; i < tp->die_cpus_lists; i++) {
+ ret = do_write_string(ff, tp->die_cpus_list[i]);
if (ret < 0)
goto done;
}
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/7] perf cputopo: Match thread_siblings to topology ABI name
2021-11-05 17:09 [PATCH 0/7] New function and literals for metrics Ian Rogers
` (2 preceding siblings ...)
2021-11-05 17:09 ` [PATCH 3/7] perf cputopo: Match die_siblings to topology ABI name Ian Rogers
@ 2021-11-05 17:09 ` Ian Rogers
2021-11-05 17:09 ` [PATCH 5/7] perf expr: Add literal values starting with # Ian Rogers
` (2 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2021-11-05 17:09 UTC (permalink / raw)
To: Andi Kleen, Jiri Olsa, Jin Yao, Namhyung Kim, John Garry,
Kajol Jain, Paul A . Clarke, Arnaldo Carvalho de Melo,
Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Madhavan Srinivasan, Song Liu,
Wan Jiabing, Yury Norov, Changbin Du, linux-perf-users,
linux-kernel
Cc: Ian Rogers
The topology name for thread_siblings is core_cpus_list, use this for
consistency and add documentation.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/cputopo.c | 26 +++++++++++++-------------
tools/perf/util/cputopo.h | 11 +++++++++--
| 6 +++---
3 files changed, 25 insertions(+), 18 deletions(-)
diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index 420aeda16fbb..51b429c86f98 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -20,10 +20,10 @@
"%s/devices/system/cpu/cpu%d/topology/core_siblings_list"
#define DIE_CPUS_FMT \
"%s/devices/system/cpu/cpu%d/topology/die_cpus_list"
-#define THRD_SIB_FMT \
- "%s/devices/system/cpu/cpu%d/topology/thread_siblings_list"
-#define THRD_SIB_FMT_NEW \
+#define CORE_CPUS_FMT \
"%s/devices/system/cpu/cpu%d/topology/core_cpus_list"
+#define CORE_CPUS_FMT_OLD \
+ "%s/devices/system/cpu/cpu%d/topology/thread_siblings_list"
#define NODE_ONLINE_FMT \
"%s/devices/system/node/online"
#define NODE_MEMINFO_FMT \
@@ -104,10 +104,10 @@ static int build_cpu_topology(struct cpu_topology *tp, int cpu)
ret = 0;
try_threads:
- scnprintf(filename, MAXPATHLEN, THRD_SIB_FMT_NEW,
+ scnprintf(filename, MAXPATHLEN, CORE_CPUS_FMT,
sysfs__mountpoint(), cpu);
if (access(filename, F_OK) == -1) {
- scnprintf(filename, MAXPATHLEN, THRD_SIB_FMT,
+ scnprintf(filename, MAXPATHLEN, CORE_CPUS_FMT_OLD,
sysfs__mountpoint(), cpu);
}
fp = fopen(filename, "r");
@@ -121,13 +121,13 @@ static int build_cpu_topology(struct cpu_topology *tp, int cpu)
if (p)
*p = '\0';
- for (i = 0; i < tp->thread_sib; i++) {
- if (!strcmp(buf, tp->thread_siblings[i]))
+ for (i = 0; i < tp->core_cpus_lists; i++) {
+ if (!strcmp(buf, tp->core_cpus_list[i]))
break;
}
- if (i == tp->thread_sib) {
- tp->thread_siblings[i] = buf;
- tp->thread_sib++;
+ if (i == tp->core_cpus_lists) {
+ tp->core_cpus_list[i] = buf;
+ tp->core_cpus_lists++;
buf = NULL;
}
ret = 0;
@@ -151,8 +151,8 @@ void cpu_topology__delete(struct cpu_topology *tp)
for (i = 0 ; i < tp->die_cpus_lists; i++)
zfree(&tp->die_cpus_list[i]);
- for (i = 0 ; i < tp->thread_sib; i++)
- zfree(&tp->thread_siblings[i]);
+ for (i = 0 ; i < tp->core_cpus_lists; i++)
+ zfree(&tp->core_cpus_list[i]);
free(tp);
}
@@ -215,7 +215,7 @@ struct cpu_topology *cpu_topology__new(void)
tp->die_cpus_list = addr;
addr += sz;
}
- tp->thread_siblings = addr;
+ tp->core_cpus_list = addr;
for (i = 0; i < nr; i++) {
if (!cpu_map__has(map, i))
diff --git a/tools/perf/util/cputopo.h b/tools/perf/util/cputopo.h
index a3e01c367853..854e18f9041e 100644
--- a/tools/perf/util/cputopo.h
+++ b/tools/perf/util/cputopo.h
@@ -9,7 +9,8 @@ struct cpu_topology {
u32 package_cpus_lists;
/* The number of unique die_cpu_lists below. */
u32 die_cpus_lists;
- u32 thread_sib;
+ /* The number of unique core_cpu_lists below. */
+ u32 core_cpus_lists;
/*
* An array of strings where each string is unique and read from
* /sys/devices/system/cpu/cpuX/topology/package_cpus_list. From the ABI
@@ -24,7 +25,13 @@ struct cpu_topology {
* The format is like 0-3, 8-11, 14,17.
*/
const char **die_cpus_list;
- char **thread_siblings;
+ /*
+ * An array of string where each string is unique and from
+ * /sys/devices/system/cpu/cpuX/topology/core_cpus_list. From the ABI
+ * each of these is a human-readable list of CPUs within the same
+ * core. The format is like 0-3, 8-11, 14,17.
+ */
+ const char **core_cpus_list;
};
struct numa_topology_node {
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 65ec250f3619..c18c5eaf6e23 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -591,12 +591,12 @@ static int write_cpu_topology(struct feat_fd *ff,
if (ret < 0)
goto done;
}
- ret = do_write(ff, &tp->thread_sib, sizeof(tp->thread_sib));
+ ret = do_write(ff, &tp->core_cpus_lists, sizeof(tp->core_cpus_lists));
if (ret < 0)
goto done;
- for (i = 0; i < tp->thread_sib; i++) {
- ret = do_write_string(ff, tp->thread_siblings[i]);
+ for (i = 0; i < tp->core_cpus_lists; i++) {
+ ret = do_write_string(ff, tp->core_cpus_list[i]);
if (ret < 0)
break;
}
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/7] perf expr: Add literal values starting with #
2021-11-05 17:09 [PATCH 0/7] New function and literals for metrics Ian Rogers
` (3 preceding siblings ...)
2021-11-05 17:09 ` [PATCH 4/7] perf cputopo: Match thread_siblings " Ian Rogers
@ 2021-11-05 17:09 ` Ian Rogers
2021-11-05 17:09 ` [PATCH 6/7] perf expr: Add metric literals for topology Ian Rogers
2021-11-05 17:09 ` [PATCH 7/7] perf expr: Add source_count for aggregating events Ian Rogers
6 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2021-11-05 17:09 UTC (permalink / raw)
To: Andi Kleen, Jiri Olsa, Jin Yao, Namhyung Kim, John Garry,
Kajol Jain, Paul A . Clarke, Arnaldo Carvalho de Melo,
Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Madhavan Srinivasan, Song Liu,
Wan Jiabing, Yury Norov, Changbin Du, linux-perf-users,
linux-kernel
Cc: Ian Rogers
It is useful to have literal values for constants relating to
topologies, SMT, etc. Make the parsing of literals shared code and add a
lookup function. Move #smt_on to this function.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/expr.c | 11 +++++++++++
tools/perf/util/expr.h | 1 +
tools/perf/util/expr.l | 15 ++++++++++++++-
tools/perf/util/expr.y | 9 ++++-----
4 files changed, 30 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index 77c6ad81a923..7464739c2890 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -9,9 +9,11 @@
#include "expr.h"
#include "expr-bison.h"
#include "expr-flex.h"
+#include "smt.h"
#include <linux/kernel.h>
#include <linux/zalloc.h>
#include <ctype.h>
+#include <math.h>
#ifdef PARSER_DEBUG
extern int expr_debug;
@@ -370,3 +372,12 @@ double expr_id_data__value(const struct expr_id_data *data)
assert(data->kind == EXPR_ID_DATA__REF_VALUE);
return data->ref.val;
}
+
+double expr__get_literal(const char *literal)
+{
+ if (!strcmp("#smt_on", literal))
+ return smt_on() > 0 ? 1.0 : 0.0;
+
+ pr_err("Unrecognized literal '%s'", literal);
+ return NAN;
+}
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index cf81f9166dbb..a6ab7f2b23d1 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -55,5 +55,6 @@ int expr__find_ids(const char *expr, const char *one,
struct expr_parse_ctx *ids);
double expr_id_data__value(const struct expr_id_data *data);
+double expr__get_literal(const char *literal);
#endif
diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
index bd20f33418ba..cf6e3c710502 100644
--- a/tools/perf/util/expr.l
+++ b/tools/perf/util/expr.l
@@ -6,6 +6,7 @@
#include <linux/compiler.h>
#include "expr.h"
#include "expr-bison.h"
+#include <math.h>
char *expr_get_text(yyscan_t yyscanner);
YYSTYPE *expr_get_lval(yyscan_t yyscanner);
@@ -77,6 +78,17 @@ static int str(yyscan_t scanner, int token, int runtime)
yylval->str = normalize(yylval->str, runtime);
return token;
}
+
+static int literal(yyscan_t scanner)
+{
+ YYSTYPE *yylval = expr_get_lval(scanner);
+
+ yylval->num = expr__get_literal(expr_get_text(scanner));
+ if (isnan(yylval->num))
+ return EXPR_ERROR;
+
+ return LITERAL;
+}
%}
number ([0-9]+\.?[0-9]*|[0-9]*\.?[0-9]+)
@@ -85,6 +97,7 @@ sch [-,=]
spec \\{sch}
sym [0-9a-zA-Z_\.:@?]+
symbol ({spec}|{sym})+
+literal #[0-9a-zA-Z_\.\-]+
%%
struct expr_scanner_ctx *sctx = expr_get_extra(yyscanner);
@@ -94,7 +107,7 @@ max { return MAX; }
min { return MIN; }
if { return IF; }
else { return ELSE; }
-#smt_on { return SMT_ON; }
+{literal} { return literal(yyscanner); }
{number} { return value(yyscanner); }
{symbol} { return str(yyscanner, ID, sctx->runtime); }
"|" { return '|'; }
diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
index f969dfa525bd..ba6c6dbf30c8 100644
--- a/tools/perf/util/expr.y
+++ b/tools/perf/util/expr.y
@@ -4,7 +4,6 @@
#include <assert.h>
#include <math.h>
#include "util/debug.h"
-#include "smt.h"
#define IN_EXPR_Y 1
#include "expr.h"
%}
@@ -37,7 +36,7 @@
} ids;
}
-%token ID NUMBER MIN MAX IF ELSE SMT_ON D_RATIO EXPR_ERROR
+%token ID NUMBER MIN MAX IF ELSE LITERAL D_RATIO EXPR_ERROR
%left MIN MAX IF
%left '|'
%left '^'
@@ -46,7 +45,7 @@
%left '-' '+'
%left '*' '/' '%'
%left NEG NOT
-%type <num> NUMBER
+%type <num> NUMBER LITERAL
%type <str> ID
%destructor { free ($$); } <str>
%type <ids> expr if_expr
@@ -280,9 +279,9 @@ expr: NUMBER
$$ = union_expr($3, $5);
}
}
-| SMT_ON
+| LITERAL
{
- $$.val = smt_on() > 0 ? 1.0 : 0.0;
+ $$.val = $1;
$$.ids = NULL;
}
;
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 6/7] perf expr: Add metric literals for topology.
2021-11-05 17:09 [PATCH 0/7] New function and literals for metrics Ian Rogers
` (4 preceding siblings ...)
2021-11-05 17:09 ` [PATCH 5/7] perf expr: Add literal values starting with # Ian Rogers
@ 2021-11-05 17:09 ` Ian Rogers
2021-11-10 12:56 ` Jiri Olsa
2021-11-05 17:09 ` [PATCH 7/7] perf expr: Add source_count for aggregating events Ian Rogers
6 siblings, 1 reply; 14+ messages in thread
From: Ian Rogers @ 2021-11-05 17:09 UTC (permalink / raw)
To: Andi Kleen, Jiri Olsa, Jin Yao, Namhyung Kim, John Garry,
Kajol Jain, Paul A . Clarke, Arnaldo Carvalho de Melo,
Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Madhavan Srinivasan, Song Liu,
Wan Jiabing, Yury Norov, Changbin Du, linux-perf-users,
linux-kernel
Cc: Ian Rogers
Allow the number of cpus, cores, dies and packages to be queried by a
metric expression.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/tests/expr.c | 12 +++++++++++-
tools/perf/util/expr.c | 27 +++++++++++++++++++++++++++
2 files changed, 38 insertions(+), 1 deletion(-)
diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 9ee2dc91c27b..0c09ccc76665 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -66,7 +66,7 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
{
struct expr_id_data *val_ptr;
const char *p;
- double val;
+ double val, num_cpus, num_cores, num_dies, num_packages;
int ret;
struct expr_parse_ctx *ctx;
@@ -161,6 +161,16 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
NULL, ctx) == 0);
TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 0);
+ /* Test toplogy constants appear well ordered. */
+ expr__ctx_clear(ctx);
+ TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
+ TEST_ASSERT_VAL("#num_cores", expr__parse(&num_cores, ctx, "#num_cores") == 0);
+ TEST_ASSERT_VAL("#num_cpus >= #num_cores", num_cpus >= num_cores);
+ TEST_ASSERT_VAL("#num_dies", expr__parse(&num_dies, ctx, "#num_dies") == 0);
+ TEST_ASSERT_VAL("#num_cores >= #num_dies", num_cores >= num_dies);
+ TEST_ASSERT_VAL("#num_packages", expr__parse(&num_packages, ctx, "#num_packages") == 0);
+ TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
+
expr__ctx_free(ctx);
return 0;
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index 7464739c2890..15af8b8ef5e7 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -5,6 +5,8 @@
#include <stdlib.h>
#include <string.h>
#include "metricgroup.h"
+#include "cpumap.h"
+#include "cputopo.h"
#include "debug.h"
#include "expr.h"
#include "expr-bison.h"
@@ -375,9 +377,34 @@ double expr_id_data__value(const struct expr_id_data *data)
double expr__get_literal(const char *literal)
{
+ static struct cpu_topology *topology;
+
if (!strcmp("#smt_on", literal))
return smt_on() > 0 ? 1.0 : 0.0;
+ if (!strcmp("#num_cpus", literal))
+ return cpu__max_present_cpu();
+
+ /*
+ * Assume that topology strings are consistent, such as CPUs "0-1"
+ * wouldn't be listed as "0,1", and so after deduplication the number of
+ * these strings gives an indication of the number of packages, dies,
+ * etc.
+ */
+ if (!topology) {
+ topology = cpu_topology__new();
+ if (!topology) {
+ pr_err("Error creating CPU topology");
+ return NAN;
+ }
+ }
+ if (!strcmp("#num_packages", literal))
+ return topology->package_cpus_lists;
+ if (!strcmp("#num_dies", literal))
+ return topology->die_cpus_lists;
+ if (!strcmp("#num_cores", literal))
+ return topology->core_cpus_lists;
+
pr_err("Unrecognized literal '%s'", literal);
return NAN;
}
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 7/7] perf expr: Add source_count for aggregating events
2021-11-05 17:09 [PATCH 0/7] New function and literals for metrics Ian Rogers
` (5 preceding siblings ...)
2021-11-05 17:09 ` [PATCH 6/7] perf expr: Add metric literals for topology Ian Rogers
@ 2021-11-05 17:09 ` Ian Rogers
2021-11-10 12:56 ` Jiri Olsa
6 siblings, 1 reply; 14+ messages in thread
From: Ian Rogers @ 2021-11-05 17:09 UTC (permalink / raw)
To: Andi Kleen, Jiri Olsa, Jin Yao, Namhyung Kim, John Garry,
Kajol Jain, Paul A . Clarke, Arnaldo Carvalho de Melo,
Riccardo Mancini, Kan Liang, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Madhavan Srinivasan, Song Liu,
Wan Jiabing, Yury Norov, Changbin Du, linux-perf-users,
linux-kernel
Cc: Ian Rogers
Events like uncore_imc/cas_count_read/ on Skylake open multiple events
and then aggregate in the metric leader. To determine the average value
per event the number of these events is needed. Add a source_count
function that returns this value by counting the number of events with
the given metric leader. For most events the value is 1 but for
uncore_imc/cas_count_read/ it can yield values like 6.
Add a generic test, but manually tested with a test metric that uses
the function.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/tests/expr.c | 12 +++++++
tools/perf/util/evsel.c | 12 +++++++
tools/perf/util/evsel.h | 1 +
tools/perf/util/expr.c | 27 ++++++++++++---
tools/perf/util/expr.h | 3 ++
tools/perf/util/expr.l | 1 +
tools/perf/util/expr.y | 65 ++++++++++++++++++++---------------
tools/perf/util/stat-shadow.c | 7 +++-
8 files changed, 95 insertions(+), 33 deletions(-)
diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 0c09ccc76665..f14d11ac353e 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -171,6 +171,18 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
TEST_ASSERT_VAL("#num_packages", expr__parse(&num_packages, ctx, "#num_packages") == 0);
TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
+ /*
+ * Source count returns the number of events aggregating in a leader
+ * event including the leader. Check parsing yields an id.
+ */
+ expr__ctx_clear(ctx);
+ TEST_ASSERT_VAL("source count",
+ expr__find_ids("source_count(EVENT1)",
+ NULL, ctx) == 0);
+ TEST_ASSERT_VAL("source count", hashmap__size(ctx->ids) == 1);
+ TEST_ASSERT_VAL("source count", hashmap__find(ctx->ids, "EVENT1",
+ (void **)&val_ptr));
+
expr__ctx_free(ctx);
return 0;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 2cfc2935d1d2..9a2a5087f00a 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -3018,3 +3018,15 @@ void evsel__set_leader(struct evsel *evsel, struct evsel *leader)
{
evsel->core.leader = &leader->core;
}
+
+int evsel__source_count(const struct evsel *evsel)
+{
+ struct evsel *pos;
+ int count = 0;
+
+ evlist__for_each_entry(evsel->evlist, pos) {
+ if (pos->metric_leader == evsel)
+ count++;
+ }
+ return count;
+}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 846c827934de..d733a8458a52 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -484,6 +484,7 @@ struct evsel *evsel__leader(struct evsel *evsel);
bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
bool evsel__is_leader(struct evsel *evsel);
void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
+int evsel__source_count(const struct evsel *evsel);
/*
* Macro to swap the bit-field postition and size.
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index 15af8b8ef5e7..1d532b9fed29 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -23,7 +23,10 @@ extern int expr_debug;
struct expr_id_data {
union {
- double val;
+ struct {
+ double val;
+ int source_count;
+ } val;
struct {
double val;
const char *metric_name;
@@ -140,6 +143,13 @@ int expr__add_id(struct expr_parse_ctx *ctx, const char *id)
/* Caller must make sure id is allocated */
int expr__add_id_val(struct expr_parse_ctx *ctx, const char *id, double val)
+{
+ return expr__add_id_val_source_count(ctx, id, val, /*source_count=*/1);
+}
+
+/* Caller must make sure id is allocated */
+int expr__add_id_val_source_count(struct expr_parse_ctx *ctx, const char *id,
+ double val, int source_count)
{
struct expr_id_data *data_ptr = NULL, *old_data = NULL;
char *old_key = NULL;
@@ -148,7 +158,8 @@ int expr__add_id_val(struct expr_parse_ctx *ctx, const char *id, double val)
data_ptr = malloc(sizeof(*data_ptr));
if (!data_ptr)
return -ENOMEM;
- data_ptr->val = val;
+ data_ptr->val.val = val;
+ data_ptr->val.source_count = source_count;
data_ptr->kind = EXPR_ID_DATA__VALUE;
ret = hashmap__set(ctx->ids, id, data_ptr,
@@ -244,7 +255,7 @@ int expr__resolve_id(struct expr_parse_ctx *ctx, const char *id,
switch (data->kind) {
case EXPR_ID_DATA__VALUE:
- pr_debug2("lookup(%s): val %f\n", id, data->val);
+ pr_debug2("lookup(%s): val %f\n", id, data->val.val);
break;
case EXPR_ID_DATA__REF:
pr_debug2("lookup(%s): ref metric name %s\n", id,
@@ -255,7 +266,7 @@ int expr__resolve_id(struct expr_parse_ctx *ctx, const char *id,
pr_debug("%s failed to count\n", id);
return -1;
}
- pr_debug("processing metric: %s EXIT: %f\n", id, data->val);
+ pr_debug("processing metric: %s EXIT: %f\n", id, data->ref.val);
break;
case EXPR_ID_DATA__REF_VALUE:
pr_debug2("lookup(%s): ref val %f metric name %s\n", id,
@@ -370,11 +381,17 @@ int expr__find_ids(const char *expr, const char *one,
double expr_id_data__value(const struct expr_id_data *data)
{
if (data->kind == EXPR_ID_DATA__VALUE)
- return data->val;
+ return data->val.val;
assert(data->kind == EXPR_ID_DATA__REF_VALUE);
return data->ref.val;
}
+double expr_id_data__source_count(const struct expr_id_data *data)
+{
+ assert(data->kind == EXPR_ID_DATA__VALUE);
+ return data->val.source_count;
+}
+
double expr__get_literal(const char *literal)
{
static struct cpu_topology *topology;
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index a6ab7f2b23d1..bd2116983bbb 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -40,6 +40,8 @@ void expr__ctx_free(struct expr_parse_ctx *ctx);
void expr__del_id(struct expr_parse_ctx *ctx, const char *id);
int expr__add_id(struct expr_parse_ctx *ctx, const char *id);
int expr__add_id_val(struct expr_parse_ctx *ctx, const char *id, double val);
+int expr__add_id_val_source_count(struct expr_parse_ctx *ctx, const char *id,
+ double val, int source_count);
int expr__add_ref(struct expr_parse_ctx *ctx, struct metric_ref *ref);
int expr__get_id(struct expr_parse_ctx *ctx, const char *id,
struct expr_id_data **data);
@@ -55,6 +57,7 @@ int expr__find_ids(const char *expr, const char *one,
struct expr_parse_ctx *ids);
double expr_id_data__value(const struct expr_id_data *data);
+double expr_id_data__source_count(const struct expr_id_data *data);
double expr__get_literal(const char *literal);
#endif
diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
index cf6e3c710502..0a13eb20c814 100644
--- a/tools/perf/util/expr.l
+++ b/tools/perf/util/expr.l
@@ -107,6 +107,7 @@ max { return MAX; }
min { return MIN; }
if { return IF; }
else { return ELSE; }
+source_count { return SOURCE_COUNT; }
{literal} { return literal(yyscanner); }
{number} { return value(yyscanner); }
{symbol} { return str(yyscanner, ID, sctx->runtime); }
diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
index ba6c6dbf30c8..d5d10f21097a 100644
--- a/tools/perf/util/expr.y
+++ b/tools/perf/util/expr.y
@@ -3,6 +3,7 @@
#define YYDEBUG 1
#include <assert.h>
#include <math.h>
+#include <stdlib.h>
#include "util/debug.h"
#define IN_EXPR_Y 1
#include "expr.h"
@@ -36,7 +37,7 @@
} ids;
}
-%token ID NUMBER MIN MAX IF ELSE LITERAL D_RATIO EXPR_ERROR
+%token ID NUMBER MIN MAX IF ELSE LITERAL D_RATIO SOURCE_COUNT EXPR_ERROR
%left MIN MAX IF
%left '|'
%left '^'
@@ -82,6 +83,40 @@ static struct ids union_expr(struct ids ids1, struct ids ids2)
return result;
}
+static struct ids handle_id(struct expr_parse_ctx *ctx, char *id,
+ bool compute_ids, bool source_count)
+{
+ struct ids result;
+ if (!compute_ids) {
+ /*
+ * Compute the event's value from ID. If the ID isn't known then
+ * it isn't used to compute the formula so set to NAN.
+ */
+ struct expr_id_data *data;
+
+ result.val = NAN;
+ if (expr__resolve_id(ctx, id, &data) == 0) {
+ result.val = source_count
+ ? expr_id_data__source_count(data)
+ : expr_id_data__value(data);
+ }
+ result.ids = NULL;
+ free(id);
+ } else {
+ /*
+ * Set the value to BOTTOM to show that any value is possible
+ * when the event is computed. Create a set of just the ID.
+ */
+ result.val = BOTTOM;
+ result.ids = ids__new();
+ if (!result.ids || ids__insert(result.ids, id)) {
+ pr_err("Error creating IDs for '%s'", id);
+ free(id);
+ }
+ }
+ return result;
+}
+
/*
* If we're not computing ids or $1 and $3 are constants, compute the new
* constant value using OP. Its invariant that there are no ids. If computing
@@ -167,32 +202,8 @@ expr: NUMBER
$$.val = $1;
$$.ids = NULL;
}
-| ID
-{
- if (!compute_ids) {
- /*
- * Compute the event's value from ID. If the ID isn't known then
- * it isn't used to compute the formula so set to NAN.
- */
- struct expr_id_data *data;
-
- $$.val = NAN;
- if (expr__resolve_id(ctx, $1, &data) == 0)
- $$.val = expr_id_data__value(data);
-
- $$.ids = NULL;
- free($1);
- } else {
- /*
- * Set the value to BOTTOM to show that any value is possible
- * when the event is computed. Create a set of just the ID.
- */
- $$.val = BOTTOM;
- $$.ids = ids__new();
- if (!$$.ids || ids__insert($$.ids, $1))
- YYABORT;
- }
-}
+| ID { $$ = handle_id(ctx, $1, compute_ids, /*source_count=*/false); }
+| SOURCE_COUNT '(' ID ')' { $$ = handle_id(ctx, $3, compute_ids, /*source_count=*/true); }
| expr '|' expr { BINARY_LONG_OP($$, |, $1, $3); }
| expr '&' expr { BINARY_LONG_OP($$, &, $1, $3); }
| expr '^' expr { BINARY_LONG_OP($$, ^, $1, $3); }
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 69f3cf3b4a44..ce0ce8ac2874 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -828,10 +828,12 @@ static int prepare_metric(struct evsel **metric_events,
struct saved_value *v;
struct stats *stats;
u64 metric_total = 0;
+ int source_count;
if (!strcmp(metric_events[i]->name, "duration_time")) {
stats = &walltime_nsecs_stats;
scale = 1e-9;
+ source_count = 1;
} else {
v = saved_value_lookup(metric_events[i], cpu, false,
STAT_NONE, 0, st,
@@ -840,6 +842,7 @@ static int prepare_metric(struct evsel **metric_events,
break;
stats = &v->stats;
scale = 1.0;
+ source_count = evsel__source_count(metric_events[i]);
if (v->metric_other)
metric_total = v->metric_total;
@@ -848,7 +851,9 @@ static int prepare_metric(struct evsel **metric_events,
if (!n)
return -ENOMEM;
- expr__add_id_val(pctx, n, metric_total ? : avg_stats(stats) * scale);
+ expr__add_id_val_source_count(pctx, n,
+ metric_total ? : avg_stats(stats) * scale,
+ source_count);
}
for (j = 0; metric_refs && metric_refs[j].metric_name; j++) {
--
2.34.0.rc0.344.g81b53c2807-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 7/7] perf expr: Add source_count for aggregating events
2021-11-05 17:09 ` [PATCH 7/7] perf expr: Add source_count for aggregating events Ian Rogers
@ 2021-11-10 12:56 ` Jiri Olsa
0 siblings, 0 replies; 14+ messages in thread
From: Jiri Olsa @ 2021-11-10 12:56 UTC (permalink / raw)
To: Ian Rogers
Cc: Andi Kleen, Jin Yao, Namhyung Kim, John Garry, Kajol Jain,
Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Madhavan Srinivasan, Song Liu, Wan Jiabing,
Yury Norov, Changbin Du, linux-perf-users, linux-kernel
On Fri, Nov 05, 2021 at 10:09:43AM -0700, Ian Rogers wrote:
> Events like uncore_imc/cas_count_read/ on Skylake open multiple events
> and then aggregate in the metric leader. To determine the average value
> per event the number of these events is needed. Add a source_count
> function that returns this value by counting the number of events with
> the given metric leader. For most events the value is 1 but for
> uncore_imc/cas_count_read/ it can yield values like 6.
>
> Add a generic test, but manually tested with a test metric that uses
> the function.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> tools/perf/tests/expr.c | 12 +++++++
> tools/perf/util/evsel.c | 12 +++++++
> tools/perf/util/evsel.h | 1 +
> tools/perf/util/expr.c | 27 ++++++++++++---
> tools/perf/util/expr.h | 3 ++
> tools/perf/util/expr.l | 1 +
> tools/perf/util/expr.y | 65 ++++++++++++++++++++---------------
> tools/perf/util/stat-shadow.c | 7 +++-
> 8 files changed, 95 insertions(+), 33 deletions(-)
>
> diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> index 0c09ccc76665..f14d11ac353e 100644
> --- a/tools/perf/tests/expr.c
> +++ b/tools/perf/tests/expr.c
> @@ -171,6 +171,18 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> TEST_ASSERT_VAL("#num_packages", expr__parse(&num_packages, ctx, "#num_packages") == 0);
> TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
>
> + /*
> + * Source count returns the number of events aggregating in a leader
> + * event including the leader. Check parsing yields an id.
> + */
> + expr__ctx_clear(ctx);
> + TEST_ASSERT_VAL("source count",
> + expr__find_ids("source_count(EVENT1)",
> + NULL, ctx) == 0);
> + TEST_ASSERT_VAL("source count", hashmap__size(ctx->ids) == 1);
> + TEST_ASSERT_VAL("source count", hashmap__find(ctx->ids, "EVENT1",
> + (void **)&val_ptr));
> +
> expr__ctx_free(ctx);
>
> return 0;
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 2cfc2935d1d2..9a2a5087f00a 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -3018,3 +3018,15 @@ void evsel__set_leader(struct evsel *evsel, struct evsel *leader)
> {
> evsel->core.leader = &leader->core;
> }
> +
> +int evsel__source_count(const struct evsel *evsel)
> +{
> + struct evsel *pos;
> + int count = 0;
> +
> + evlist__for_each_entry(evsel->evlist, pos) {
> + if (pos->metric_leader == evsel)
> + count++;
> + }
> + return count;
> +}
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 846c827934de..d733a8458a52 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -484,6 +484,7 @@ struct evsel *evsel__leader(struct evsel *evsel);
> bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
> bool evsel__is_leader(struct evsel *evsel);
> void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
> +int evsel__source_count(const struct evsel *evsel);
>
> /*
> * Macro to swap the bit-field postition and size.
> diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
> index 15af8b8ef5e7..1d532b9fed29 100644
> --- a/tools/perf/util/expr.c
> +++ b/tools/perf/util/expr.c
> @@ -23,7 +23,10 @@ extern int expr_debug;
>
> struct expr_id_data {
> union {
> - double val;
> + struct {
> + double val;
> + int source_count;
> + } val;
> struct {
> double val;
> const char *metric_name;
> @@ -140,6 +143,13 @@ int expr__add_id(struct expr_parse_ctx *ctx, const char *id)
>
> /* Caller must make sure id is allocated */
> int expr__add_id_val(struct expr_parse_ctx *ctx, const char *id, double val)
> +{
> + return expr__add_id_val_source_count(ctx, id, val, /*source_count=*/1);
> +}
> +
> +/* Caller must make sure id is allocated */
> +int expr__add_id_val_source_count(struct expr_parse_ctx *ctx, const char *id,
> + double val, int source_count)
> {
> struct expr_id_data *data_ptr = NULL, *old_data = NULL;
> char *old_key = NULL;
> @@ -148,7 +158,8 @@ int expr__add_id_val(struct expr_parse_ctx *ctx, const char *id, double val)
> data_ptr = malloc(sizeof(*data_ptr));
> if (!data_ptr)
> return -ENOMEM;
> - data_ptr->val = val;
> + data_ptr->val.val = val;
> + data_ptr->val.source_count = source_count;
> data_ptr->kind = EXPR_ID_DATA__VALUE;
>
> ret = hashmap__set(ctx->ids, id, data_ptr,
> @@ -244,7 +255,7 @@ int expr__resolve_id(struct expr_parse_ctx *ctx, const char *id,
>
> switch (data->kind) {
> case EXPR_ID_DATA__VALUE:
> - pr_debug2("lookup(%s): val %f\n", id, data->val);
> + pr_debug2("lookup(%s): val %f\n", id, data->val.val);
> break;
> case EXPR_ID_DATA__REF:
> pr_debug2("lookup(%s): ref metric name %s\n", id,
> @@ -255,7 +266,7 @@ int expr__resolve_id(struct expr_parse_ctx *ctx, const char *id,
> pr_debug("%s failed to count\n", id);
> return -1;
> }
> - pr_debug("processing metric: %s EXIT: %f\n", id, data->val);
> + pr_debug("processing metric: %s EXIT: %f\n", id, data->ref.val);
> break;
> case EXPR_ID_DATA__REF_VALUE:
> pr_debug2("lookup(%s): ref val %f metric name %s\n", id,
> @@ -370,11 +381,17 @@ int expr__find_ids(const char *expr, const char *one,
> double expr_id_data__value(const struct expr_id_data *data)
> {
> if (data->kind == EXPR_ID_DATA__VALUE)
> - return data->val;
> + return data->val.val;
> assert(data->kind == EXPR_ID_DATA__REF_VALUE);
> return data->ref.val;
> }
>
> +double expr_id_data__source_count(const struct expr_id_data *data)
> +{
> + assert(data->kind == EXPR_ID_DATA__VALUE);
> + return data->val.source_count;
> +}
> +
> double expr__get_literal(const char *literal)
> {
> static struct cpu_topology *topology;
> diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
> index a6ab7f2b23d1..bd2116983bbb 100644
> --- a/tools/perf/util/expr.h
> +++ b/tools/perf/util/expr.h
> @@ -40,6 +40,8 @@ void expr__ctx_free(struct expr_parse_ctx *ctx);
> void expr__del_id(struct expr_parse_ctx *ctx, const char *id);
> int expr__add_id(struct expr_parse_ctx *ctx, const char *id);
> int expr__add_id_val(struct expr_parse_ctx *ctx, const char *id, double val);
> +int expr__add_id_val_source_count(struct expr_parse_ctx *ctx, const char *id,
> + double val, int source_count);
> int expr__add_ref(struct expr_parse_ctx *ctx, struct metric_ref *ref);
> int expr__get_id(struct expr_parse_ctx *ctx, const char *id,
> struct expr_id_data **data);
> @@ -55,6 +57,7 @@ int expr__find_ids(const char *expr, const char *one,
> struct expr_parse_ctx *ids);
>
> double expr_id_data__value(const struct expr_id_data *data);
> +double expr_id_data__source_count(const struct expr_id_data *data);
> double expr__get_literal(const char *literal);
>
> #endif
> diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
> index cf6e3c710502..0a13eb20c814 100644
> --- a/tools/perf/util/expr.l
> +++ b/tools/perf/util/expr.l
> @@ -107,6 +107,7 @@ max { return MAX; }
> min { return MIN; }
> if { return IF; }
> else { return ELSE; }
> +source_count { return SOURCE_COUNT; }
> {literal} { return literal(yyscanner); }
> {number} { return value(yyscanner); }
> {symbol} { return str(yyscanner, ID, sctx->runtime); }
> diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
> index ba6c6dbf30c8..d5d10f21097a 100644
> --- a/tools/perf/util/expr.y
> +++ b/tools/perf/util/expr.y
> @@ -3,6 +3,7 @@
> #define YYDEBUG 1
> #include <assert.h>
> #include <math.h>
> +#include <stdlib.h>
> #include "util/debug.h"
> #define IN_EXPR_Y 1
> #include "expr.h"
> @@ -36,7 +37,7 @@
> } ids;
> }
>
> -%token ID NUMBER MIN MAX IF ELSE LITERAL D_RATIO EXPR_ERROR
> +%token ID NUMBER MIN MAX IF ELSE LITERAL D_RATIO SOURCE_COUNT EXPR_ERROR
> %left MIN MAX IF
> %left '|'
> %left '^'
> @@ -82,6 +83,40 @@ static struct ids union_expr(struct ids ids1, struct ids ids2)
> return result;
> }
>
> +static struct ids handle_id(struct expr_parse_ctx *ctx, char *id,
> + bool compute_ids, bool source_count)
> +{
> + struct ids result;
> + if (!compute_ids) {
> + /*
> + * Compute the event's value from ID. If the ID isn't known then
> + * it isn't used to compute the formula so set to NAN.
> + */
> + struct expr_id_data *data;
> +
> + result.val = NAN;
> + if (expr__resolve_id(ctx, id, &data) == 0) {
> + result.val = source_count
> + ? expr_id_data__source_count(data)
> + : expr_id_data__value(data);
> + }
> + result.ids = NULL;
> + free(id);
> + } else {
> + /*
> + * Set the value to BOTTOM to show that any value is possible
> + * when the event is computed. Create a set of just the ID.
> + */
> + result.val = BOTTOM;
> + result.ids = ids__new();
> + if (!result.ids || ids__insert(result.ids, id)) {
> + pr_err("Error creating IDs for '%s'", id);
> + free(id);
> + }
> + }
could you please put the move of handle_id into separate patch?
it''ll be easier to review the actual changes
thanks,
jirka
> + return result;
> +}
> +
> /*
> * If we're not computing ids or $1 and $3 are constants, compute the new
> * constant value using OP. Its invariant that there are no ids. If computing
> @@ -167,32 +202,8 @@ expr: NUMBER
> $$.val = $1;
> $$.ids = NULL;
> }
> -| ID
> -{
> - if (!compute_ids) {
> - /*
> - * Compute the event's value from ID. If the ID isn't known then
> - * it isn't used to compute the formula so set to NAN.
> - */
> - struct expr_id_data *data;
> -
> - $$.val = NAN;
> - if (expr__resolve_id(ctx, $1, &data) == 0)
> - $$.val = expr_id_data__value(data);
> -
> - $$.ids = NULL;
> - free($1);
> - } else {
> - /*
> - * Set the value to BOTTOM to show that any value is possible
> - * when the event is computed. Create a set of just the ID.
> - */
> - $$.val = BOTTOM;
> - $$.ids = ids__new();
> - if (!$$.ids || ids__insert($$.ids, $1))
> - YYABORT;
> - }
> -}
> +| ID { $$ = handle_id(ctx, $1, compute_ids, /*source_count=*/false); }
> +| SOURCE_COUNT '(' ID ')' { $$ = handle_id(ctx, $3, compute_ids, /*source_count=*/true); }
> | expr '|' expr { BINARY_LONG_OP($$, |, $1, $3); }
> | expr '&' expr { BINARY_LONG_OP($$, &, $1, $3); }
> | expr '^' expr { BINARY_LONG_OP($$, ^, $1, $3); }
> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
> index 69f3cf3b4a44..ce0ce8ac2874 100644
> --- a/tools/perf/util/stat-shadow.c
> +++ b/tools/perf/util/stat-shadow.c
> @@ -828,10 +828,12 @@ static int prepare_metric(struct evsel **metric_events,
> struct saved_value *v;
> struct stats *stats;
> u64 metric_total = 0;
> + int source_count;
>
> if (!strcmp(metric_events[i]->name, "duration_time")) {
> stats = &walltime_nsecs_stats;
> scale = 1e-9;
> + source_count = 1;
> } else {
> v = saved_value_lookup(metric_events[i], cpu, false,
> STAT_NONE, 0, st,
> @@ -840,6 +842,7 @@ static int prepare_metric(struct evsel **metric_events,
> break;
> stats = &v->stats;
> scale = 1.0;
> + source_count = evsel__source_count(metric_events[i]);
>
> if (v->metric_other)
> metric_total = v->metric_total;
> @@ -848,7 +851,9 @@ static int prepare_metric(struct evsel **metric_events,
> if (!n)
> return -ENOMEM;
>
> - expr__add_id_val(pctx, n, metric_total ? : avg_stats(stats) * scale);
> + expr__add_id_val_source_count(pctx, n,
> + metric_total ? : avg_stats(stats) * scale,
> + source_count);
> }
>
> for (j = 0; metric_refs && metric_refs[j].metric_name; j++) {
> --
> 2.34.0.rc0.344.g81b53c2807-goog
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 6/7] perf expr: Add metric literals for topology.
2021-11-05 17:09 ` [PATCH 6/7] perf expr: Add metric literals for topology Ian Rogers
@ 2021-11-10 12:56 ` Jiri Olsa
2021-11-10 14:19 ` Ian Rogers
0 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2021-11-10 12:56 UTC (permalink / raw)
To: Ian Rogers
Cc: Andi Kleen, Jin Yao, Namhyung Kim, John Garry, Kajol Jain,
Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Madhavan Srinivasan, Song Liu, Wan Jiabing,
Yury Norov, Changbin Du, linux-perf-users, linux-kernel
On Fri, Nov 05, 2021 at 10:09:42AM -0700, Ian Rogers wrote:
> Allow the number of cpus, cores, dies and packages to be queried by a
> metric expression.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> tools/perf/tests/expr.c | 12 +++++++++++-
> tools/perf/util/expr.c | 27 +++++++++++++++++++++++++++
> 2 files changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> index 9ee2dc91c27b..0c09ccc76665 100644
> --- a/tools/perf/tests/expr.c
> +++ b/tools/perf/tests/expr.c
> @@ -66,7 +66,7 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> {
> struct expr_id_data *val_ptr;
> const char *p;
> - double val;
> + double val, num_cpus, num_cores, num_dies, num_packages;
> int ret;
> struct expr_parse_ctx *ctx;
>
> @@ -161,6 +161,16 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> NULL, ctx) == 0);
> TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 0);
>
> + /* Test toplogy constants appear well ordered. */
> + expr__ctx_clear(ctx);
> + TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
> + TEST_ASSERT_VAL("#num_cores", expr__parse(&num_cores, ctx, "#num_cores") == 0);
> + TEST_ASSERT_VAL("#num_cpus >= #num_cores", num_cpus >= num_cores);
> + TEST_ASSERT_VAL("#num_dies", expr__parse(&num_dies, ctx, "#num_dies") == 0);
> + TEST_ASSERT_VAL("#num_cores >= #num_dies", num_cores >= num_dies);
> + TEST_ASSERT_VAL("#num_packages", expr__parse(&num_packages, ctx, "#num_packages") == 0);
> + TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
> +
> expr__ctx_free(ctx);
>
> return 0;
> diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
> index 7464739c2890..15af8b8ef5e7 100644
> --- a/tools/perf/util/expr.c
> +++ b/tools/perf/util/expr.c
> @@ -5,6 +5,8 @@
> #include <stdlib.h>
> #include <string.h>
> #include "metricgroup.h"
> +#include "cpumap.h"
> +#include "cputopo.h"
> #include "debug.h"
> #include "expr.h"
> #include "expr-bison.h"
> @@ -375,9 +377,34 @@ double expr_id_data__value(const struct expr_id_data *data)
>
> double expr__get_literal(const char *literal)
> {
> + static struct cpu_topology *topology;
> +
> if (!strcmp("#smt_on", literal))
> return smt_on() > 0 ? 1.0 : 0.0;
>
> + if (!strcmp("#num_cpus", literal))
> + return cpu__max_present_cpu();
> +
> + /*
> + * Assume that topology strings are consistent, such as CPUs "0-1"
> + * wouldn't be listed as "0,1", and so after deduplication the number of
> + * these strings gives an indication of the number of packages, dies,
> + * etc.
> + */
> + if (!topology) {
> + topology = cpu_topology__new();
any chance we could propagate expr_scanner_ctx in here and store topology
to it and release it at the end? I think we have several places like this,
so it'd be nice not to make more if it's possible ;-)
thanks,
jirka
> + if (!topology) {
> + pr_err("Error creating CPU topology");
> + return NAN;
> + }
> + }
> + if (!strcmp("#num_packages", literal))
> + return topology->package_cpus_lists;
> + if (!strcmp("#num_dies", literal))
> + return topology->die_cpus_lists;
> + if (!strcmp("#num_cores", literal))
> + return topology->core_cpus_lists;
> +
> pr_err("Unrecognized literal '%s'", literal);
> return NAN;
> }
> --
> 2.34.0.rc0.344.g81b53c2807-goog
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 6/7] perf expr: Add metric literals for topology.
2021-11-10 12:56 ` Jiri Olsa
@ 2021-11-10 14:19 ` Ian Rogers
2021-11-10 14:35 ` Jiri Olsa
0 siblings, 1 reply; 14+ messages in thread
From: Ian Rogers @ 2021-11-10 14:19 UTC (permalink / raw)
To: Jiri Olsa
Cc: Andi Kleen, Jin Yao, Namhyung Kim, John Garry, Kajol Jain,
Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Madhavan Srinivasan, Song Liu, Wan Jiabing,
Yury Norov, Changbin Du, linux-perf-users, linux-kernel
On Wed, Nov 10, 2021 at 4:56 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Fri, Nov 05, 2021 at 10:09:42AM -0700, Ian Rogers wrote:
> > Allow the number of cpus, cores, dies and packages to be queried by a
> > metric expression.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> > tools/perf/tests/expr.c | 12 +++++++++++-
> > tools/perf/util/expr.c | 27 +++++++++++++++++++++++++++
> > 2 files changed, 38 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> > index 9ee2dc91c27b..0c09ccc76665 100644
> > --- a/tools/perf/tests/expr.c
> > +++ b/tools/perf/tests/expr.c
> > @@ -66,7 +66,7 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> > {
> > struct expr_id_data *val_ptr;
> > const char *p;
> > - double val;
> > + double val, num_cpus, num_cores, num_dies, num_packages;
> > int ret;
> > struct expr_parse_ctx *ctx;
> >
> > @@ -161,6 +161,16 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> > NULL, ctx) == 0);
> > TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 0);
> >
> > + /* Test toplogy constants appear well ordered. */
> > + expr__ctx_clear(ctx);
> > + TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
> > + TEST_ASSERT_VAL("#num_cores", expr__parse(&num_cores, ctx, "#num_cores") == 0);
> > + TEST_ASSERT_VAL("#num_cpus >= #num_cores", num_cpus >= num_cores);
> > + TEST_ASSERT_VAL("#num_dies", expr__parse(&num_dies, ctx, "#num_dies") == 0);
> > + TEST_ASSERT_VAL("#num_cores >= #num_dies", num_cores >= num_dies);
> > + TEST_ASSERT_VAL("#num_packages", expr__parse(&num_packages, ctx, "#num_packages") == 0);
> > + TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
> > +
> > expr__ctx_free(ctx);
> >
> > return 0;
> > diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
> > index 7464739c2890..15af8b8ef5e7 100644
> > --- a/tools/perf/util/expr.c
> > +++ b/tools/perf/util/expr.c
> > @@ -5,6 +5,8 @@
> > #include <stdlib.h>
> > #include <string.h>
> > #include "metricgroup.h"
> > +#include "cpumap.h"
> > +#include "cputopo.h"
> > #include "debug.h"
> > #include "expr.h"
> > #include "expr-bison.h"
> > @@ -375,9 +377,34 @@ double expr_id_data__value(const struct expr_id_data *data)
> >
> > double expr__get_literal(const char *literal)
> > {
> > + static struct cpu_topology *topology;
> > +
> > if (!strcmp("#smt_on", literal))
> > return smt_on() > 0 ? 1.0 : 0.0;
> >
> > + if (!strcmp("#num_cpus", literal))
> > + return cpu__max_present_cpu();
> > +
> > + /*
> > + * Assume that topology strings are consistent, such as CPUs "0-1"
> > + * wouldn't be listed as "0,1", and so after deduplication the number of
> > + * these strings gives an indication of the number of packages, dies,
> > + * etc.
> > + */
> > + if (!topology) {
> > + topology = cpu_topology__new();
>
> any chance we could propagate expr_scanner_ctx in here and store topology
> to it and release it at the end? I think we have several places like this,
> so it'd be nice not to make more if it's possible ;-)
The topology here is static and so will only get computed once per
execution rather than once pre expression parse. I was worried about
the cost of recomputing the topology for something like 'perf stat -I
1000 -M ...' in which case the static will do less recomputation.
Thanks,
Ian
> thanks,
> jirka
>
> > + if (!topology) {
> > + pr_err("Error creating CPU topology");
> > + return NAN;
> > + }
> > + }
> > + if (!strcmp("#num_packages", literal))
> > + return topology->package_cpus_lists;
> > + if (!strcmp("#num_dies", literal))
> > + return topology->die_cpus_lists;
> > + if (!strcmp("#num_cores", literal))
> > + return topology->core_cpus_lists;
> > +
> > pr_err("Unrecognized literal '%s'", literal);
> > return NAN;
> > }
> > --
> > 2.34.0.rc0.344.g81b53c2807-goog
> >
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 6/7] perf expr: Add metric literals for topology.
2021-11-10 14:19 ` Ian Rogers
@ 2021-11-10 14:35 ` Jiri Olsa
2021-11-10 17:58 ` Ian Rogers
0 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2021-11-10 14:35 UTC (permalink / raw)
To: Ian Rogers
Cc: Andi Kleen, Jin Yao, Namhyung Kim, John Garry, Kajol Jain,
Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Madhavan Srinivasan, Song Liu, Wan Jiabing,
Yury Norov, Changbin Du, linux-perf-users, linux-kernel
On Wed, Nov 10, 2021 at 06:19:04AM -0800, Ian Rogers wrote:
> On Wed, Nov 10, 2021 at 4:56 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Fri, Nov 05, 2021 at 10:09:42AM -0700, Ian Rogers wrote:
> > > Allow the number of cpus, cores, dies and packages to be queried by a
> > > metric expression.
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > > tools/perf/tests/expr.c | 12 +++++++++++-
> > > tools/perf/util/expr.c | 27 +++++++++++++++++++++++++++
> > > 2 files changed, 38 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> > > index 9ee2dc91c27b..0c09ccc76665 100644
> > > --- a/tools/perf/tests/expr.c
> > > +++ b/tools/perf/tests/expr.c
> > > @@ -66,7 +66,7 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> > > {
> > > struct expr_id_data *val_ptr;
> > > const char *p;
> > > - double val;
> > > + double val, num_cpus, num_cores, num_dies, num_packages;
> > > int ret;
> > > struct expr_parse_ctx *ctx;
> > >
> > > @@ -161,6 +161,16 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> > > NULL, ctx) == 0);
> > > TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 0);
> > >
> > > + /* Test toplogy constants appear well ordered. */
> > > + expr__ctx_clear(ctx);
> > > + TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
> > > + TEST_ASSERT_VAL("#num_cores", expr__parse(&num_cores, ctx, "#num_cores") == 0);
> > > + TEST_ASSERT_VAL("#num_cpus >= #num_cores", num_cpus >= num_cores);
> > > + TEST_ASSERT_VAL("#num_dies", expr__parse(&num_dies, ctx, "#num_dies") == 0);
> > > + TEST_ASSERT_VAL("#num_cores >= #num_dies", num_cores >= num_dies);
> > > + TEST_ASSERT_VAL("#num_packages", expr__parse(&num_packages, ctx, "#num_packages") == 0);
> > > + TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
> > > +
> > > expr__ctx_free(ctx);
> > >
> > > return 0;
> > > diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
> > > index 7464739c2890..15af8b8ef5e7 100644
> > > --- a/tools/perf/util/expr.c
> > > +++ b/tools/perf/util/expr.c
> > > @@ -5,6 +5,8 @@
> > > #include <stdlib.h>
> > > #include <string.h>
> > > #include "metricgroup.h"
> > > +#include "cpumap.h"
> > > +#include "cputopo.h"
> > > #include "debug.h"
> > > #include "expr.h"
> > > #include "expr-bison.h"
> > > @@ -375,9 +377,34 @@ double expr_id_data__value(const struct expr_id_data *data)
> > >
> > > double expr__get_literal(const char *literal)
> > > {
> > > + static struct cpu_topology *topology;
> > > +
> > > if (!strcmp("#smt_on", literal))
> > > return smt_on() > 0 ? 1.0 : 0.0;
> > >
> > > + if (!strcmp("#num_cpus", literal))
> > > + return cpu__max_present_cpu();
> > > +
> > > + /*
> > > + * Assume that topology strings are consistent, such as CPUs "0-1"
> > > + * wouldn't be listed as "0,1", and so after deduplication the number of
> > > + * these strings gives an indication of the number of packages, dies,
> > > + * etc.
> > > + */
> > > + if (!topology) {
> > > + topology = cpu_topology__new();
> >
> > any chance we could propagate expr_scanner_ctx in here and store topology
> > to it and release it at the end? I think we have several places like this,
> > so it'd be nice not to make more if it's possible ;-)
>
> The topology here is static and so will only get computed once per
> execution rather than once pre expression parse. I was worried about
> the cost of recomputing the topology for something like 'perf stat -I
> 1000 -M ...' in which case the static will do less recomputation.
can't we have the topology created/release on the top fo the parsing
and released after all expressions are parsed?
or we could come up with some generic way to handle this kind of release
jirka
>
> Thanks,
> Ian
>
> > thanks,
> > jirka
> >
> > > + if (!topology) {
> > > + pr_err("Error creating CPU topology");
> > > + return NAN;
> > > + }
> > > + }
> > > + if (!strcmp("#num_packages", literal))
> > > + return topology->package_cpus_lists;
> > > + if (!strcmp("#num_dies", literal))
> > > + return topology->die_cpus_lists;
> > > + if (!strcmp("#num_cores", literal))
> > > + return topology->core_cpus_lists;
> > > +
> > > pr_err("Unrecognized literal '%s'", literal);
> > > return NAN;
> > > }
> > > --
> > > 2.34.0.rc0.344.g81b53c2807-goog
> > >
> >
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 6/7] perf expr: Add metric literals for topology.
2021-11-10 14:35 ` Jiri Olsa
@ 2021-11-10 17:58 ` Ian Rogers
2021-11-10 18:29 ` Jiri Olsa
0 siblings, 1 reply; 14+ messages in thread
From: Ian Rogers @ 2021-11-10 17:58 UTC (permalink / raw)
To: Jiri Olsa
Cc: Andi Kleen, Jin Yao, Namhyung Kim, John Garry, Kajol Jain,
Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Madhavan Srinivasan, Song Liu, Wan Jiabing,
Yury Norov, Changbin Du, linux-perf-users, linux-kernel
On Wed, Nov 10, 2021 at 6:35 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, Nov 10, 2021 at 06:19:04AM -0800, Ian Rogers wrote:
> > On Wed, Nov 10, 2021 at 4:56 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Fri, Nov 05, 2021 at 10:09:42AM -0700, Ian Rogers wrote:
> > > > Allow the number of cpus, cores, dies and packages to be queried by a
> > > > metric expression.
> > > >
> > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > ---
> > > > tools/perf/tests/expr.c | 12 +++++++++++-
> > > > tools/perf/util/expr.c | 27 +++++++++++++++++++++++++++
> > > > 2 files changed, 38 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> > > > index 9ee2dc91c27b..0c09ccc76665 100644
> > > > --- a/tools/perf/tests/expr.c
> > > > +++ b/tools/perf/tests/expr.c
> > > > @@ -66,7 +66,7 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> > > > {
> > > > struct expr_id_data *val_ptr;
> > > > const char *p;
> > > > - double val;
> > > > + double val, num_cpus, num_cores, num_dies, num_packages;
> > > > int ret;
> > > > struct expr_parse_ctx *ctx;
> > > >
> > > > @@ -161,6 +161,16 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> > > > NULL, ctx) == 0);
> > > > TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 0);
> > > >
> > > > + /* Test toplogy constants appear well ordered. */
> > > > + expr__ctx_clear(ctx);
> > > > + TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
> > > > + TEST_ASSERT_VAL("#num_cores", expr__parse(&num_cores, ctx, "#num_cores") == 0);
> > > > + TEST_ASSERT_VAL("#num_cpus >= #num_cores", num_cpus >= num_cores);
> > > > + TEST_ASSERT_VAL("#num_dies", expr__parse(&num_dies, ctx, "#num_dies") == 0);
> > > > + TEST_ASSERT_VAL("#num_cores >= #num_dies", num_cores >= num_dies);
> > > > + TEST_ASSERT_VAL("#num_packages", expr__parse(&num_packages, ctx, "#num_packages") == 0);
> > > > + TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
> > > > +
> > > > expr__ctx_free(ctx);
> > > >
> > > > return 0;
> > > > diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
> > > > index 7464739c2890..15af8b8ef5e7 100644
> > > > --- a/tools/perf/util/expr.c
> > > > +++ b/tools/perf/util/expr.c
> > > > @@ -5,6 +5,8 @@
> > > > #include <stdlib.h>
> > > > #include <string.h>
> > > > #include "metricgroup.h"
> > > > +#include "cpumap.h"
> > > > +#include "cputopo.h"
> > > > #include "debug.h"
> > > > #include "expr.h"
> > > > #include "expr-bison.h"
> > > > @@ -375,9 +377,34 @@ double expr_id_data__value(const struct expr_id_data *data)
> > > >
> > > > double expr__get_literal(const char *literal)
> > > > {
> > > > + static struct cpu_topology *topology;
> > > > +
> > > > if (!strcmp("#smt_on", literal))
> > > > return smt_on() > 0 ? 1.0 : 0.0;
> > > >
> > > > + if (!strcmp("#num_cpus", literal))
> > > > + return cpu__max_present_cpu();
> > > > +
> > > > + /*
> > > > + * Assume that topology strings are consistent, such as CPUs "0-1"
> > > > + * wouldn't be listed as "0,1", and so after deduplication the number of
> > > > + * these strings gives an indication of the number of packages, dies,
> > > > + * etc.
> > > > + */
> > > > + if (!topology) {
> > > > + topology = cpu_topology__new();
> > >
> > > any chance we could propagate expr_scanner_ctx in here and store topology
> > > to it and release it at the end? I think we have several places like this,
> > > so it'd be nice not to make more if it's possible ;-)
> >
> > The topology here is static and so will only get computed once per
> > execution rather than once pre expression parse. I was worried about
> > the cost of recomputing the topology for something like 'perf stat -I
> > 1000 -M ...' in which case the static will do less recomputation.
>
> can't we have the topology created/release on the top fo the parsing
> and released after all expressions are parsed?
>
> or we could come up with some generic way to handle this kind of release
Creating the topology at the top of metric parsing would incur a sysfs
parsing cost on metrics regardless of whether they used the topology.
I feel a lazy approach is better to avoid this cost. We could make
this part of the expression context and lazily initialize it there.
It'd be good to keep the expression context around in 'perf stat' in
that case. I quite like this approach as really the topology should be
part of the perf environment and part of the header. It is interesting
to think how can we get metrics to work with 'perf stat record'.
Currently you will get metrics for something like 'perf stat record -e
instructions,cycles -a sleep 2; perf stat report', but none baked in
metrics don't work and we don't even have the metric-id to make this
work (even using '-M IPC' doesn't work). Putting the environment in
the session, making the topology part of it, writing/reading data
based on that seems like worthwhile clean up but beyond the scope of
what I was trying to do here.
In terms of releasing for the sake of memory leaks, leak sanitizer
doesn't treat memory allocated and reachable from a global as a leak.
We can't precompute the topology and so this style of approach is what
I'm used to. Putting the topology into the parsing context and lazily
initializing would be the second best option imo, and I can do that if
this is a blocker.
Thanks,
Ian
> jirka
>
> >
> > Thanks,
> > Ian
> >
> > > thanks,
> > > jirka
> > >
> > > > + if (!topology) {
> > > > + pr_err("Error creating CPU topology");
> > > > + return NAN;
> > > > + }
> > > > + }
> > > > + if (!strcmp("#num_packages", literal))
> > > > + return topology->package_cpus_lists;
> > > > + if (!strcmp("#num_dies", literal))
> > > > + return topology->die_cpus_lists;
> > > > + if (!strcmp("#num_cores", literal))
> > > > + return topology->core_cpus_lists;
> > > > +
> > > > pr_err("Unrecognized literal '%s'", literal);
> > > > return NAN;
> > > > }
> > > > --
> > > > 2.34.0.rc0.344.g81b53c2807-goog
> > > >
> > >
> >
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 6/7] perf expr: Add metric literals for topology.
2021-11-10 17:58 ` Ian Rogers
@ 2021-11-10 18:29 ` Jiri Olsa
0 siblings, 0 replies; 14+ messages in thread
From: Jiri Olsa @ 2021-11-10 18:29 UTC (permalink / raw)
To: Ian Rogers
Cc: Andi Kleen, Jin Yao, Namhyung Kim, John Garry, Kajol Jain,
Paul A . Clarke, Arnaldo Carvalho de Melo, Riccardo Mancini,
Kan Liang, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Madhavan Srinivasan, Song Liu, Wan Jiabing,
Yury Norov, Changbin Du, linux-perf-users, linux-kernel
On Wed, Nov 10, 2021 at 09:58:57AM -0800, Ian Rogers wrote:
> On Wed, Nov 10, 2021 at 6:35 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Wed, Nov 10, 2021 at 06:19:04AM -0800, Ian Rogers wrote:
> > > On Wed, Nov 10, 2021 at 4:56 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > > >
> > > > On Fri, Nov 05, 2021 at 10:09:42AM -0700, Ian Rogers wrote:
> > > > > Allow the number of cpus, cores, dies and packages to be queried by a
> > > > > metric expression.
> > > > >
> > > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > > ---
> > > > > tools/perf/tests/expr.c | 12 +++++++++++-
> > > > > tools/perf/util/expr.c | 27 +++++++++++++++++++++++++++
> > > > > 2 files changed, 38 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
> > > > > index 9ee2dc91c27b..0c09ccc76665 100644
> > > > > --- a/tools/perf/tests/expr.c
> > > > > +++ b/tools/perf/tests/expr.c
> > > > > @@ -66,7 +66,7 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> > > > > {
> > > > > struct expr_id_data *val_ptr;
> > > > > const char *p;
> > > > > - double val;
> > > > > + double val, num_cpus, num_cores, num_dies, num_packages;
> > > > > int ret;
> > > > > struct expr_parse_ctx *ctx;
> > > > >
> > > > > @@ -161,6 +161,16 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
> > > > > NULL, ctx) == 0);
> > > > > TEST_ASSERT_VAL("find ids", hashmap__size(ctx->ids) == 0);
> > > > >
> > > > > + /* Test toplogy constants appear well ordered. */
> > > > > + expr__ctx_clear(ctx);
> > > > > + TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
> > > > > + TEST_ASSERT_VAL("#num_cores", expr__parse(&num_cores, ctx, "#num_cores") == 0);
> > > > > + TEST_ASSERT_VAL("#num_cpus >= #num_cores", num_cpus >= num_cores);
> > > > > + TEST_ASSERT_VAL("#num_dies", expr__parse(&num_dies, ctx, "#num_dies") == 0);
> > > > > + TEST_ASSERT_VAL("#num_cores >= #num_dies", num_cores >= num_dies);
> > > > > + TEST_ASSERT_VAL("#num_packages", expr__parse(&num_packages, ctx, "#num_packages") == 0);
> > > > > + TEST_ASSERT_VAL("#num_dies >= #num_packages", num_dies >= num_packages);
> > > > > +
> > > > > expr__ctx_free(ctx);
> > > > >
> > > > > return 0;
> > > > > diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
> > > > > index 7464739c2890..15af8b8ef5e7 100644
> > > > > --- a/tools/perf/util/expr.c
> > > > > +++ b/tools/perf/util/expr.c
> > > > > @@ -5,6 +5,8 @@
> > > > > #include <stdlib.h>
> > > > > #include <string.h>
> > > > > #include "metricgroup.h"
> > > > > +#include "cpumap.h"
> > > > > +#include "cputopo.h"
> > > > > #include "debug.h"
> > > > > #include "expr.h"
> > > > > #include "expr-bison.h"
> > > > > @@ -375,9 +377,34 @@ double expr_id_data__value(const struct expr_id_data *data)
> > > > >
> > > > > double expr__get_literal(const char *literal)
> > > > > {
> > > > > + static struct cpu_topology *topology;
> > > > > +
> > > > > if (!strcmp("#smt_on", literal))
> > > > > return smt_on() > 0 ? 1.0 : 0.0;
> > > > >
> > > > > + if (!strcmp("#num_cpus", literal))
> > > > > + return cpu__max_present_cpu();
> > > > > +
> > > > > + /*
> > > > > + * Assume that topology strings are consistent, such as CPUs "0-1"
> > > > > + * wouldn't be listed as "0,1", and so after deduplication the number of
> > > > > + * these strings gives an indication of the number of packages, dies,
> > > > > + * etc.
> > > > > + */
> > > > > + if (!topology) {
> > > > > + topology = cpu_topology__new();
> > > >
> > > > any chance we could propagate expr_scanner_ctx in here and store topology
> > > > to it and release it at the end? I think we have several places like this,
> > > > so it'd be nice not to make more if it's possible ;-)
> > >
> > > The topology here is static and so will only get computed once per
> > > execution rather than once pre expression parse. I was worried about
> > > the cost of recomputing the topology for something like 'perf stat -I
> > > 1000 -M ...' in which case the static will do less recomputation.
> >
> > can't we have the topology created/release on the top fo the parsing
> > and released after all expressions are parsed?
> >
> > or we could come up with some generic way to handle this kind of release
>
> Creating the topology at the top of metric parsing would incur a sysfs
> parsing cost on metrics regardless of whether they used the topology.
> I feel a lazy approach is better to avoid this cost. We could make
> this part of the expression context and lazily initialize it there.
> It'd be good to keep the expression context around in 'perf stat' in
> that case. I quite like this approach as really the topology should be
> part of the perf environment and part of the header. It is interesting
> to think how can we get metrics to work with 'perf stat record'.
> Currently you will get metrics for something like 'perf stat record -e
> instructions,cycles -a sleep 2; perf stat report', but none baked in
> metrics don't work and we don't even have the metric-id to make this
> work (even using '-M IPC' doesn't work). Putting the environment in
> the session, making the topology part of it, writing/reading data
> based on that seems like worthwhile clean up but beyond the scope of
> what I was trying to do here.
>
> In terms of releasing for the sake of memory leaks, leak sanitizer
> doesn't treat memory allocated and reachable from a global as a leak.
> We can't precompute the topology and so this style of approach is what
> I'm used to. Putting the topology into the parsing context and lazily
> initializing would be the second best option imo, and I can do that if
> this is a blocker.
ok, let's leave it as it is.. thanks for the details
jirka
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2021-11-10 18:29 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-11-05 17:09 [PATCH 0/7] New function and literals for metrics Ian Rogers
2021-11-05 17:09 ` [PATCH 1/7] perf test: Add expr test for events with hyphens Ian Rogers
2021-11-05 17:09 ` [PATCH 2/7] perf cputopo: Update to use pakage_cpus Ian Rogers
2021-11-05 17:09 ` [PATCH 3/7] perf cputopo: Match die_siblings to topology ABI name Ian Rogers
2021-11-05 17:09 ` [PATCH 4/7] perf cputopo: Match thread_siblings " Ian Rogers
2021-11-05 17:09 ` [PATCH 5/7] perf expr: Add literal values starting with # Ian Rogers
2021-11-05 17:09 ` [PATCH 6/7] perf expr: Add metric literals for topology Ian Rogers
2021-11-10 12:56 ` Jiri Olsa
2021-11-10 14:19 ` Ian Rogers
2021-11-10 14:35 ` Jiri Olsa
2021-11-10 17:58 ` Ian Rogers
2021-11-10 18:29 ` Jiri Olsa
2021-11-05 17:09 ` [PATCH 7/7] perf expr: Add source_count for aggregating events Ian Rogers
2021-11-10 12:56 ` Jiri Olsa
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).