* [PATCH 1/2] perf script: Fix perf script -F +metric
@ 2024-06-25 0:42 Andi Kleen
2024-06-25 0:42 ` [PATCH 2/2] Add a test case for " Andi Kleen
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Andi Kleen @ 2024-06-25 0:42 UTC (permalink / raw)
To: linux-perf-users; +Cc: Andi Kleen
This fixes a regression with perf script -F +metric originally caused by :
commit 37cc8ad77cf81f3ffd226856c367b0e15333a738
Author: Ian Rogers <irogers@google.com>
Date: Sun Feb 19 01:28:46 2023 -0800
perf metric: Directly use counts rather than saved_value
In the perf script environment the evsel wouldn't allocate an aggr
values array, which led to a -1 reference because the metric
evaluation would try to reference NULL - 1 (for aggr_idx)
Give the perf script evsels a single CPU aggr setup. That's
enough because the groups are always contiguous, so no need
to store more than one CPU's worth of values.
Before
% perf record -e '{cycles,instructions}:S' perf bench mem memcpy
% perf script -F +metric
Segmentation fault (core dumped)
After:
% perf record -e '{cycles,instructions}:S' perf bench mem memcpy
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB perf.data (90 samples) ]
% perf script -F +metric
perf-exec 1847557 264658.180789: 3009 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf-exec 1847557 264658.180789: 382 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf-exec 1847557 264658.180789: metric: 0.13 insn per cycle
...
Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather ...")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/builtin-script.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c16224b1fef3..4aab194b4b1a 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -84,7 +84,7 @@ static bool system_wide;
static bool print_flags;
static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
-static struct perf_stat_config stat_config;
+static struct perf_stat_config stat_config = { .aggr_map = &(struct cpu_aggr_map){ .nr = 1 } };
static int max_blocks;
static bool native_arch;
static struct dlfilter *dlfilter;
@@ -2133,12 +2133,14 @@ static void perf_sample__fprint_metric(struct perf_script *script,
if (evsel_script(leader)->gnum++ == 0)
perf_stat__reset_shadow_stats();
val = sample->period * evsel->scale;
+ /* Always use CPU 0 storage because the groups are contiguous. */
+ evsel->stats->aggr[0].counts.val = val;
evsel_script(evsel)->val = val;
if (evsel_script(leader)->gnum == leader->core.nr_members) {
for_each_group_member (ev2, leader) {
perf_stat__print_shadow_stats(&stat_config, ev2,
evsel_script(ev2)->val,
- sample->cpu,
+ 0,
&ctx,
NULL);
}
--
2.45.2
^ permalink raw reply related [flat|nested] 9+ messages in thread* [PATCH 2/2] Add a test case for perf script -F +metric 2024-06-25 0:42 [PATCH 1/2] perf script: Fix perf script -F +metric Andi Kleen @ 2024-06-25 0:42 ` Andi Kleen 2024-06-27 22:54 ` Namhyung Kim 2024-06-26 9:09 ` [PATCH 1/2] perf script: Fix " Artem Savkov 2024-06-27 22:46 ` Namhyung Kim 2 siblings, 1 reply; 9+ messages in thread From: Andi Kleen @ 2024-06-25 0:42 UTC (permalink / raw) To: linux-perf-users; +Cc: Andi Kleen Just a simple test Signed-off-by: Andi Kleen <ak@linux.intel.com> --- tools/perf/tests/shell/script.sh | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/tools/perf/tests/shell/script.sh b/tools/perf/tests/shell/script.sh index c1a603653662..cd9d41734021 100755 --- a/tools/perf/tests/shell/script.sh +++ b/tools/perf/tests/shell/script.sh @@ -1,4 +1,4 @@ -#!/bin/sh +#!/bin/bash # perf script tests # SPDX-License-Identifier: GPL-2.0 @@ -7,6 +7,7 @@ set -e temp_dir=$(mktemp -d /tmp/perf-test-script.XXXXXXXXXX) perfdatafile="${temp_dir}/perf.data" +scriptoutput="${temp_dir}/script" db_test="${temp_dir}/db_test.py" err=0 @@ -88,8 +89,21 @@ test_parallel_perf() echo "parallel-perf test [Success]" } +test_metric() +{ + echo "script metric test" + if ! perf list | grep -q cycles ; then return ; fi + if ! perf list | grep -q instructions ; then return ; fi + perf record -e '{cycles,instructions}' -o "${perfdatafile}" bash -c 'for i in `seq 1000` ; do true ; done' + perf script -i "${perfdatafile}" -F +metric > $scriptoutput + test $(grep -c metric $scriptoutput) -gt 5 + grep metric $scriptoutput | head + echo "script metric test [Success]" +} + test_db test_parallel_perf +test_metric cleanup -- 2.45.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] Add a test case for perf script -F +metric 2024-06-25 0:42 ` [PATCH 2/2] Add a test case for " Andi Kleen @ 2024-06-27 22:54 ` Namhyung Kim 0 siblings, 0 replies; 9+ messages in thread From: Namhyung Kim @ 2024-06-27 22:54 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-perf-users On Mon, Jun 24, 2024 at 05:42:32PM -0700, Andi Kleen wrote: > Just a simple test > > Signed-off-by: Andi Kleen <ak@linux.intel.com> > --- > tools/perf/tests/shell/script.sh | 16 +++++++++++++++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/tools/perf/tests/shell/script.sh b/tools/perf/tests/shell/script.sh > index c1a603653662..cd9d41734021 100755 > --- a/tools/perf/tests/shell/script.sh > +++ b/tools/perf/tests/shell/script.sh > @@ -1,4 +1,4 @@ > -#!/bin/sh > +#!/bin/bash Do we really need this? > # perf script tests > # SPDX-License-Identifier: GPL-2.0 > > @@ -7,6 +7,7 @@ set -e > temp_dir=$(mktemp -d /tmp/perf-test-script.XXXXXXXXXX) > > perfdatafile="${temp_dir}/perf.data" > +scriptoutput="${temp_dir}/script" > db_test="${temp_dir}/db_test.py" > > err=0 > @@ -88,8 +89,21 @@ test_parallel_perf() > echo "parallel-perf test [Success]" > } > > +test_metric() > +{ > + echo "script metric test" > + if ! perf list | grep -q cycles ; then return ; fi > + if ! perf list | grep -q instructions ; then return ; fi > + perf record -e '{cycles,instructions}' -o "${perfdatafile}" bash -c 'for i in `seq 1000` ; do true ; done' You can use built-in workloads like noploop or so instead of running an external command. $ perf record -- perf test -w noploop Thanks, Namhyung > + perf script -i "${perfdatafile}" -F +metric > $scriptoutput > + test $(grep -c metric $scriptoutput) -gt 5 > + grep metric $scriptoutput | head > + echo "script metric test [Success]" > +} > + > test_db > test_parallel_perf > +test_metric > > cleanup > > -- > 2.45.2 > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] perf script: Fix perf script -F +metric 2024-06-25 0:42 [PATCH 1/2] perf script: Fix perf script -F +metric Andi Kleen 2024-06-25 0:42 ` [PATCH 2/2] Add a test case for " Andi Kleen @ 2024-06-26 9:09 ` Artem Savkov 2024-06-27 22:50 ` Namhyung Kim 2024-06-28 19:19 ` Andi Kleen 2024-06-27 22:46 ` Namhyung Kim 2 siblings, 2 replies; 9+ messages in thread From: Artem Savkov @ 2024-06-26 9:09 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-perf-users On Mon, Jun 24, 2024 at 05:42:31PM -0700, Andi Kleen wrote: > This fixes a regression with perf script -F +metric originally caused by : > > commit 37cc8ad77cf81f3ffd226856c367b0e15333a738 > Author: Ian Rogers <irogers@google.com> > Date: Sun Feb 19 01:28:46 2023 -0800 > > perf metric: Directly use counts rather than saved_value > > In the perf script environment the evsel wouldn't allocate an aggr > values array, which led to a -1 reference because the metric > evaluation would try to reference NULL - 1 (for aggr_idx) > > Give the perf script evsels a single CPU aggr setup. That's > enough because the groups are always contiguous, so no need > to store more than one CPU's worth of values. > > Before > > % perf record -e '{cycles,instructions}:S' perf bench mem memcpy > % perf script -F +metric > Segmentation fault (core dumped) > > After: > > % perf record -e '{cycles,instructions}:S' perf bench mem memcpy > ... > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.028 MB perf.data (90 samples) ] > % perf script -F +metric > perf-exec 1847557 264658.180789: 3009 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms]) > perf-exec 1847557 264658.180789: 382 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms]) > perf-exec 1847557 264658.180789: metric: 0.13 insn per cycle > ... > > Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather ...") > Signed-off-by: Andi Kleen <ak@linux.intel.com> This works, however there seems to be yet another issue on perf-tools-next branch that breaks script metrics. 617824a7f0f73 (perf parse-events: Prefer sysfs/JSON hardware events over legacy, 2024-04-15) changes event priority so on my host "instructions" have type PERF_TYPE_RAW instead of PERF_TYPE_HARDWARE. # event : name = instructions, , id = { 190969, 190970, 190971, 190972, 190973, 190974, 190975, 190976 }, type = 4 (cpu), size = 136, config = 0xc0 (instructions), sample_type = IP|TID|TIME|READ|CPU|PERIOD|IDENTIFIER, read_format = ID|GROUP|LOST, sample_id_all = 1, exclude_guest = 1 This results in evsel__stat_type no longer recognizing these as STAT_INSTRUCTIONS. > --- > tools/perf/builtin-script.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c > index c16224b1fef3..4aab194b4b1a 100644 > --- a/tools/perf/builtin-script.c > +++ b/tools/perf/builtin-script.c > @@ -84,7 +84,7 @@ static bool system_wide; > static bool print_flags; > static const char *cpu_list; > static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS); > -static struct perf_stat_config stat_config; > +static struct perf_stat_config stat_config = { .aggr_map = &(struct cpu_aggr_map){ .nr = 1 } }; > static int max_blocks; > static bool native_arch; > static struct dlfilter *dlfilter; > @@ -2133,12 +2133,14 @@ static void perf_sample__fprint_metric(struct perf_script *script, > if (evsel_script(leader)->gnum++ == 0) > perf_stat__reset_shadow_stats(); > val = sample->period * evsel->scale; > + /* Always use CPU 0 storage because the groups are contiguous. */ > + evsel->stats->aggr[0].counts.val = val; > evsel_script(evsel)->val = val; > if (evsel_script(leader)->gnum == leader->core.nr_members) { > for_each_group_member (ev2, leader) { > perf_stat__print_shadow_stats(&stat_config, ev2, > evsel_script(ev2)->val, > - sample->cpu, > + 0, > &ctx, > NULL); > } > -- > 2.45.2 > -- Artem ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] perf script: Fix perf script -F +metric 2024-06-26 9:09 ` [PATCH 1/2] perf script: Fix " Artem Savkov @ 2024-06-27 22:50 ` Namhyung Kim 2024-06-28 19:19 ` Andi Kleen 1 sibling, 0 replies; 9+ messages in thread From: Namhyung Kim @ 2024-06-27 22:50 UTC (permalink / raw) To: Artem Savkov, Ian Rogers; +Cc: Andi Kleen, linux-perf-users On Wed, Jun 26, 2024 at 11:09:24AM +0200, Artem Savkov wrote: > On Mon, Jun 24, 2024 at 05:42:31PM -0700, Andi Kleen wrote: > > This fixes a regression with perf script -F +metric originally caused by : > > > > commit 37cc8ad77cf81f3ffd226856c367b0e15333a738 > > Author: Ian Rogers <irogers@google.com> > > Date: Sun Feb 19 01:28:46 2023 -0800 > > > > perf metric: Directly use counts rather than saved_value > > > > In the perf script environment the evsel wouldn't allocate an aggr > > values array, which led to a -1 reference because the metric > > evaluation would try to reference NULL - 1 (for aggr_idx) > > > > Give the perf script evsels a single CPU aggr setup. That's > > enough because the groups are always contiguous, so no need > > to store more than one CPU's worth of values. > > > > Before > > > > % perf record -e '{cycles,instructions}:S' perf bench mem memcpy > > % perf script -F +metric > > Segmentation fault (core dumped) > > > > After: > > > > % perf record -e '{cycles,instructions}:S' perf bench mem memcpy > > ... > > [ perf record: Woken up 1 times to write data ] > > [ perf record: Captured and wrote 0.028 MB perf.data (90 samples) ] > > % perf script -F +metric > > perf-exec 1847557 264658.180789: 3009 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms]) > > perf-exec 1847557 264658.180789: 382 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms]) > > perf-exec 1847557 264658.180789: metric: 0.13 insn per cycle > > ... > > > > Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather ...") > > Signed-off-by: Andi Kleen <ak@linux.intel.com> > > This works, however there seems to be yet another issue on > perf-tools-next branch that breaks script metrics. > > 617824a7f0f73 (perf parse-events: Prefer sysfs/JSON hardware > events over legacy, 2024-04-15) changes event priority so on > my host "instructions" have type PERF_TYPE_RAW instead of > PERF_TYPE_HARDWARE. > > # event : name = instructions, , id = { 190969, 190970, 190971, 190972, 190973, 190974, 190975, 190976 }, type = 4 (cpu), size = 136, config = 0xc0 (instructions), sample_type = IP|TID|TIME|READ|CPU|PERIOD|IDENTIFIER, read_format = ID|GROUP|LOST, sample_id_all = 1, exclude_guest = 1 > > This results in evsel__stat_type no longer recognizing these as > STAT_INSTRUCTIONS. Hmm.. sounds like a problem. Ian, did you notice this? Thanks, Namhyung > > > --- > > tools/perf/builtin-script.c | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c > > index c16224b1fef3..4aab194b4b1a 100644 > > --- a/tools/perf/builtin-script.c > > +++ b/tools/perf/builtin-script.c > > @@ -84,7 +84,7 @@ static bool system_wide; > > static bool print_flags; > > static const char *cpu_list; > > static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS); > > -static struct perf_stat_config stat_config; > > +static struct perf_stat_config stat_config = { .aggr_map = &(struct cpu_aggr_map){ .nr = 1 } }; > > static int max_blocks; > > static bool native_arch; > > static struct dlfilter *dlfilter; > > @@ -2133,12 +2133,14 @@ static void perf_sample__fprint_metric(struct perf_script *script, > > if (evsel_script(leader)->gnum++ == 0) > > perf_stat__reset_shadow_stats(); > > val = sample->period * evsel->scale; > > + /* Always use CPU 0 storage because the groups are contiguous. */ > > + evsel->stats->aggr[0].counts.val = val; > > evsel_script(evsel)->val = val; > > if (evsel_script(leader)->gnum == leader->core.nr_members) { > > for_each_group_member (ev2, leader) { > > perf_stat__print_shadow_stats(&stat_config, ev2, > > evsel_script(ev2)->val, > > - sample->cpu, > > + 0, > > &ctx, > > NULL); > > } > > -- > > 2.45.2 > > > > -- > Artem > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] perf script: Fix perf script -F +metric 2024-06-26 9:09 ` [PATCH 1/2] perf script: Fix " Artem Savkov 2024-06-27 22:50 ` Namhyung Kim @ 2024-06-28 19:19 ` Andi Kleen 2024-07-01 7:04 ` Artem Savkov 1 sibling, 1 reply; 9+ messages in thread From: Andi Kleen @ 2024-06-28 19:19 UTC (permalink / raw) To: Artem Savkov; +Cc: linux-perf-users > This works, however there seems to be yet another issue on > perf-tools-next branch that breaks script metrics. > > 617824a7f0f73 (perf parse-events: Prefer sysfs/JSON hardware > events over legacy, 2024-04-15) changes event priority so on > my host "instructions" have type PERF_TYPE_RAW instead of > PERF_TYPE_HARDWARE. FWIW I don't see this on my build on Intel, even with that commit. > This results in evsel__stat_type no longer recognizing these as > STAT_INSTRUCTIONS. I guess we could just fix evsel__stat_type to handle that case Does this patch work? perf: Workaround event priority problems When instructions is not of HARDWARE types anymore, but raw, which breaks the shadow stat metrics detection for IPC. Reported-by: Artem Savkov <asavkov@redhat.com> Fixes: 617824a7f0f73 (perf parse-events: Prefer sysfs/JSON hardware Signed-off-by: Andi Kleen <ak@linux.intel.com> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c index 3466aa952442..87a503e47ff0 100644 --- a/tools/perf/util/stat-shadow.c +++ b/tools/perf/util/stat-shadow.c @@ -97,9 +97,11 @@ static enum stat_type evsel__stat_type(const struct evsel *evsel) if (evsel__is_clock(evsel)) return STAT_NSECS; - else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) + else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES) || + !strcmp(evsel__name((struct evsel *)evsel), "cycles")) return STAT_CYCLES; - else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) + else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS) || + !strcmp(evsel__name((struct evsel *)evsel), "instructions")) return STAT_INSTRUCTIONS; else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) return STAT_STALLED_CYCLES_FRONT; ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] perf script: Fix perf script -F +metric 2024-06-28 19:19 ` Andi Kleen @ 2024-07-01 7:04 ` Artem Savkov 2024-07-01 16:54 ` Andi Kleen 0 siblings, 1 reply; 9+ messages in thread From: Artem Savkov @ 2024-07-01 7:04 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-perf-users On Fri, Jun 28, 2024 at 12:19:59PM -0700, Andi Kleen wrote: > > This works, however there seems to be yet another issue on > > perf-tools-next branch that breaks script metrics. > > > > 617824a7f0f73 (perf parse-events: Prefer sysfs/JSON hardware > > events over legacy, 2024-04-15) changes event priority so on > > my host "instructions" have type PERF_TYPE_RAW instead of > > PERF_TYPE_HARDWARE. > > FWIW I don't see this on my build on Intel, even with that commit. > > > This results in evsel__stat_type no longer recognizing these as > > STAT_INSTRUCTIONS. > > I guess we could just fix evsel__stat_type to handle that case > > Does this patch work? It does for this specific example, but there are other events as well. From arch/x86/events/intel/core.c: static u64 intel_perfmon_event_map[PERF_COUNT_HW_MAX] __read_mostly = { [PERF_COUNT_HW_CPU_CYCLES] = 0x003c, [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0, [PERF_COUNT_HW_CACHE_REFERENCES] = 0x4f2e, [PERF_COUNT_HW_CACHE_MISSES] = 0x412e, [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c4, [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c5, [PERF_COUNT_HW_BUS_CYCLES] = 0x013c, [PERF_COUNT_HW_REF_CPU_CYCLES] = 0x0300, /* pseudo-encoding */ }; The list is probably different on other arches. > perf: Workaround event priority problems > > When instructions is not of HARDWARE types anymore, but raw, > which breaks the shadow stat metrics detection for IPC. > > Reported-by: Artem Savkov <asavkov@redhat.com> > Fixes: 617824a7f0f73 (perf parse-events: Prefer sysfs/JSON hardware > Signed-off-by: Andi Kleen <ak@linux.intel.com> > > diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c > index 3466aa952442..87a503e47ff0 100644 > --- a/tools/perf/util/stat-shadow.c > +++ b/tools/perf/util/stat-shadow.c > @@ -97,9 +97,11 @@ static enum stat_type evsel__stat_type(const struct evsel *evsel) > > if (evsel__is_clock(evsel)) > return STAT_NSECS; > - else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) > + else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES) || > + !strcmp(evsel__name((struct evsel *)evsel), "cycles")) It is also probably worth to add a PERF_TYPE_RAW check. > return STAT_CYCLES; > - else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) > + else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS) || > + !strcmp(evsel__name((struct evsel *)evsel), "instructions")) > return STAT_INSTRUCTIONS; > else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) > return STAT_STALLED_CYCLES_FRONT; > -- Artem ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] perf script: Fix perf script -F +metric 2024-07-01 7:04 ` Artem Savkov @ 2024-07-01 16:54 ` Andi Kleen 0 siblings, 0 replies; 9+ messages in thread From: Andi Kleen @ 2024-07-01 16:54 UTC (permalink / raw) To: Artem Savkov; +Cc: linux-perf-users On Mon, Jul 01, 2024 at 09:04:38AM +0200, Artem Savkov wrote: > On Fri, Jun 28, 2024 at 12:19:59PM -0700, Andi Kleen wrote: > > > This works, however there seems to be yet another issue on > > > perf-tools-next branch that breaks script metrics. > > > > > > 617824a7f0f73 (perf parse-events: Prefer sysfs/JSON hardware > > > events over legacy, 2024-04-15) changes event priority so on > > > my host "instructions" have type PERF_TYPE_RAW instead of > > > PERF_TYPE_HARDWARE. > > > > FWIW I don't see this on my build on Intel, even with that commit. > > > > > This results in evsel__stat_type no longer recognizing these as > > > STAT_INSTRUCTIONS. > > > > I guess we could just fix evsel__stat_type to handle that case > > > > Does this patch work? > > It does for this specific example, but there are other events as well. > From arch/x86/events/intel/core.c: > > static u64 intel_perfmon_event_map[PERF_COUNT_HW_MAX] __read_mostly = > { > [PERF_COUNT_HW_CPU_CYCLES] = 0x003c, > [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0, > [PERF_COUNT_HW_CACHE_REFERENCES] = 0x4f2e, > [PERF_COUNT_HW_CACHE_MISSES] = 0x412e, > [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x00c4, > [PERF_COUNT_HW_BRANCH_MISSES] = 0x00c5, > [PERF_COUNT_HW_BUS_CYCLES] = 0x013c, > [PERF_COUNT_HW_REF_CPU_CYCLES] = 0x0300, /* pseudo-encoding */ > }; > > The list is probably different on other arches. Most of them only use the fallback "xxx per second" metric, which should work even for raw I think. But yes some would be needed (like cache and branches) Probably it needs a more generic solution then, it was somewhat of a hack anyways. My proposal would be to revert the original patch. Clearly it was a bad idea. -Andi ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] perf script: Fix perf script -F +metric 2024-06-25 0:42 [PATCH 1/2] perf script: Fix perf script -F +metric Andi Kleen 2024-06-25 0:42 ` [PATCH 2/2] Add a test case for " Andi Kleen 2024-06-26 9:09 ` [PATCH 1/2] perf script: Fix " Artem Savkov @ 2024-06-27 22:46 ` Namhyung Kim 2 siblings, 0 replies; 9+ messages in thread From: Namhyung Kim @ 2024-06-27 22:46 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-perf-users Hi Andi, On Mon, Jun 24, 2024 at 05:42:31PM -0700, Andi Kleen wrote: > This fixes a regression with perf script -F +metric originally caused by : > > commit 37cc8ad77cf81f3ffd226856c367b0e15333a738 > Author: Ian Rogers <irogers@google.com> > Date: Sun Feb 19 01:28:46 2023 -0800 > > perf metric: Directly use counts rather than saved_value > > In the perf script environment the evsel wouldn't allocate an aggr > values array, which led to a -1 reference because the metric > evaluation would try to reference NULL - 1 (for aggr_idx) > > Give the perf script evsels a single CPU aggr setup. That's > enough because the groups are always contiguous, so no need > to store more than one CPU's worth of values. > > Before > > % perf record -e '{cycles,instructions}:S' perf bench mem memcpy > % perf script -F +metric > Segmentation fault (core dumped) > > After: > > % perf record -e '{cycles,instructions}:S' perf bench mem memcpy > ... > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.028 MB perf.data (90 samples) ] > % perf script -F +metric > perf-exec 1847557 264658.180789: 3009 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms]) > perf-exec 1847557 264658.180789: 382 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms]) > perf-exec 1847557 264658.180789: metric: 0.13 insn per cycle > ... Thanks for the fix! > > Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather ...") > Signed-off-by: Andi Kleen <ak@linux.intel.com> > --- > tools/perf/builtin-script.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c > index c16224b1fef3..4aab194b4b1a 100644 > --- a/tools/perf/builtin-script.c > +++ b/tools/perf/builtin-script.c > @@ -84,7 +84,7 @@ static bool system_wide; > static bool print_flags; > static const char *cpu_list; > static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS); > -static struct perf_stat_config stat_config; > +static struct perf_stat_config stat_config = { .aggr_map = &(struct cpu_aggr_map){ .nr = 1 } }; Can you please follow the coding style and properly put the code? Thanks, Namhyung > static int max_blocks; > static bool native_arch; > static struct dlfilter *dlfilter; > @@ -2133,12 +2133,14 @@ static void perf_sample__fprint_metric(struct perf_script *script, > if (evsel_script(leader)->gnum++ == 0) > perf_stat__reset_shadow_stats(); > val = sample->period * evsel->scale; > + /* Always use CPU 0 storage because the groups are contiguous. */ > + evsel->stats->aggr[0].counts.val = val; > evsel_script(evsel)->val = val; > if (evsel_script(leader)->gnum == leader->core.nr_members) { > for_each_group_member (ev2, leader) { > perf_stat__print_shadow_stats(&stat_config, ev2, > evsel_script(ev2)->val, > - sample->cpu, > + 0, > &ctx, > NULL); > } > -- > 2.45.2 > ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-07-01 16:54 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-06-25 0:42 [PATCH 1/2] perf script: Fix perf script -F +metric Andi Kleen 2024-06-25 0:42 ` [PATCH 2/2] Add a test case for " Andi Kleen 2024-06-27 22:54 ` Namhyung Kim 2024-06-26 9:09 ` [PATCH 1/2] perf script: Fix " Artem Savkov 2024-06-27 22:50 ` Namhyung Kim 2024-06-28 19:19 ` Andi Kleen 2024-07-01 7:04 ` Artem Savkov 2024-07-01 16:54 ` Andi Kleen 2024-06-27 22:46 ` Namhyung Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).