* [PATCH v2 0/2] libperf: evlist: Fix --cpu argument on hybrid platform
@ 2024-11-14 16:04 James Clark
2024-11-14 16:04 ` [PATCH v2 1/2] " James Clark
2024-11-14 16:04 ` [PATCH v2 2/2] libperf: evlist: Keep evsel idx contiguous on removal James Clark
0 siblings, 2 replies; 9+ messages in thread
From: James Clark @ 2024-11-14 16:04 UTC (permalink / raw)
To: linux-perf-users, namhyung, irogers, thomas.falcon
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Liang, Kan, linux-kernel
Since [1] this isn't strictly required anymore, because the failures
due to opening extended type events on 'any' CPU are now skipped.
However, it does reduce quite a few spurious warnings from all the skips
so I think we should still include it.
Separated into two parts so that the first is backportable and the
second is a cleanup. Also if the first one is backported then it will
fix stable Perfs that might not get [1].
Applies on [1].
[1]: https://lore.kernel.org/linux-perf-users/20241113011956.402096-1-irogers@google.com/T/#t
James Clark (2):
libperf: evlist: Fix --cpu argument on hybrid platform
libperf: evlist: Keep evsel idx contiguous on removal
tools/lib/perf/evlist.c | 19 ++++++++++++--
tools/lib/perf/tests/test-evlist.c | 41 ++++++++++++++++++++++++++++++
tools/perf/builtin-record.c | 18 +++----------
3 files changed, 61 insertions(+), 17 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
2024-11-14 16:04 [PATCH v2 0/2] libperf: evlist: Fix --cpu argument on hybrid platform James Clark
@ 2024-11-14 16:04 ` James Clark
2024-12-10 13:51 ` Arnaldo Carvalho de Melo
2024-11-14 16:04 ` [PATCH v2 2/2] libperf: evlist: Keep evsel idx contiguous on removal James Clark
1 sibling, 1 reply; 9+ messages in thread
From: James Clark @ 2024-11-14 16:04 UTC (permalink / raw)
To: linux-perf-users, namhyung, irogers, thomas.falcon
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Liang, Kan, linux-kernel
Since the linked fixes: commit, specifying a CPU on hybrid platforms
results in an error because Perf tries to open an extended type event
on "any" CPU which isn't valid. Extended type events can only be opened
on CPUs that match the type.
Before (working):
$ perf record --cpu 1 -- true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
After (not working):
$ perf record -C 1 -- true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
/bin/dmesg | grep -i perf may provide additional information.
(Ignore the warning message, that's expected and not particularly
relevant to this issue).
This is because perf_cpu_map__intersect() of the user specified CPU (1)
and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
CPU map. However for the purposes of opening an event, libperf converts
empty CPU maps into an any CPU (-1) which the kernel rejects.
Fix it by deleting evsels with empty CPU maps in the specific case where
user requested CPU maps are evaluated.
Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
tools/lib/perf/evlist.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index c6d67fc9e57e..83c43dc13313 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -47,6 +47,20 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
*/
perf_cpu_map__put(evsel->cpus);
evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
+
+ /*
+ * Empty cpu lists would eventually get opened as "any" so remove
+ * genuinely empty ones before they're opened in the wrong place.
+ */
+ if (perf_cpu_map__is_empty(evsel->cpus)) {
+ struct perf_evsel *next = perf_evlist__next(evlist, evsel);
+
+ perf_evlist__remove(evlist, evsel);
+ /* Keep idx contiguous */
+ if (next)
+ list_for_each_entry_from(next, &evlist->entries, node)
+ next->idx--;
+ }
} else if (!evsel->own_cpus || evlist->has_user_cpus ||
(!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
/*
@@ -80,11 +94,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
{
- struct perf_evsel *evsel;
+ struct perf_evsel *evsel, *n;
evlist->needs_map_propagation = true;
- perf_evlist__for_each_evsel(evlist, evsel)
+ list_for_each_entry_safe(evsel, n, &evlist->entries, node)
__perf_evlist__propagate_maps(evlist, evsel);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v2 2/2] libperf: evlist: Keep evsel idx contiguous on removal
2024-11-14 16:04 [PATCH v2 0/2] libperf: evlist: Fix --cpu argument on hybrid platform James Clark
2024-11-14 16:04 ` [PATCH v2 1/2] " James Clark
@ 2024-11-14 16:04 ` James Clark
2024-11-14 23:35 ` Ian Rogers
1 sibling, 1 reply; 9+ messages in thread
From: James Clark @ 2024-11-14 16:04 UTC (permalink / raw)
To: linux-perf-users, namhyung, irogers, thomas.falcon
Cc: James Clark, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Liang, Kan, linux-kernel
The only two places where evsels are removed end up fixing up the index.
Move it into libperf so that it's always done.
Signed-off-by: James Clark <james.clark@linaro.org>
---
tools/lib/perf/evlist.c | 17 +++++++------
tools/lib/perf/tests/test-evlist.c | 41 ++++++++++++++++++++++++++++++
tools/perf/builtin-record.c | 18 +++----------
3 files changed, 53 insertions(+), 23 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index 83c43dc13313..ffa1a28fb14f 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -52,15 +52,8 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
* Empty cpu lists would eventually get opened as "any" so remove
* genuinely empty ones before they're opened in the wrong place.
*/
- if (perf_cpu_map__is_empty(evsel->cpus)) {
- struct perf_evsel *next = perf_evlist__next(evlist, evsel);
-
+ if (perf_cpu_map__is_empty(evsel->cpus))
perf_evlist__remove(evlist, evsel);
- /* Keep idx contiguous */
- if (next)
- list_for_each_entry_from(next, &evlist->entries, node)
- next->idx--;
- }
} else if (!evsel->own_cpus || evlist->has_user_cpus ||
(!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
/*
@@ -116,8 +109,16 @@ void perf_evlist__add(struct perf_evlist *evlist,
void perf_evlist__remove(struct perf_evlist *evlist,
struct perf_evsel *evsel)
{
+ struct perf_evsel *next = perf_evlist__next(evlist, evsel);
+
list_del_init(&evsel->node);
evlist->nr_entries -= 1;
+
+ /* Keep idx contiguous */
+ if (!next)
+ return;
+ list_for_each_entry_from(next, &evlist->entries, node)
+ next->idx--;
}
struct perf_evlist *perf_evlist__new(void)
diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c
index 10f70cb41ff1..06153820f408 100644
--- a/tools/lib/perf/tests/test-evlist.c
+++ b/tools/lib/perf/tests/test-evlist.c
@@ -571,6 +571,46 @@ static int test_stat_multiplexing(void)
return 0;
}
+static int test_list_remove(void)
+{
+ struct perf_evlist *evlist;
+ struct perf_evsel *evsel, *evsel0, *evsel1, *evsel2;
+ struct perf_event_attr attr = {};
+ int idx = 0;
+
+ evlist = perf_evlist__new();
+ __T("failed to create evlist", evlist);
+
+ evsel0 = perf_evsel__new(&attr);
+ __T("failed to create evsel", evsel0);
+ evsel1 = perf_evsel__new(&attr);
+ __T("failed to create evsel", evsel1);
+ evsel2 = perf_evsel__new(&attr);
+ __T("failed to create evsel", evsel2);
+
+ perf_evlist__add(evlist, evsel0);
+ perf_evlist__add(evlist, evsel1);
+ perf_evlist__add(evlist, evsel2);
+
+ idx = 0;
+ perf_evlist__for_each_evsel(evlist, evsel)
+ __T("idx isn't contiguous", evsel->idx == idx++);
+
+ /* Test removing middle */
+ perf_evlist__remove(evlist, evsel1);
+ idx = 0;
+ perf_evlist__for_each_evsel(evlist, evsel)
+ __T("idx isn't contiguous", evsel->idx == idx++);
+
+ /* Test removing end */
+ perf_evlist__remove(evlist, evsel2);
+ idx = 0;
+ perf_evlist__for_each_evsel(evlist, evsel)
+ __T("idx isn't contiguous", evsel->idx == idx++);
+
+ return 0;
+}
+
int test_evlist(int argc, char **argv)
{
__T_START;
@@ -583,6 +623,7 @@ int test_evlist(int argc, char **argv)
test_mmap_thread();
test_mmap_cpus();
test_stat_multiplexing();
+ test_list_remove();
__T_END;
return tests_failed == 0 ? 0 : -1;
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 7e99743f7e42..2ebbb0fd0064 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1359,14 +1359,13 @@ static int record__mmap(struct record *rec)
static int record__open(struct record *rec)
{
char msg[BUFSIZ];
- struct evsel *pos;
+ struct evsel *pos, *tmp;
struct evlist *evlist = rec->evlist;
struct perf_session *session = rec->session;
struct record_opts *opts = &rec->opts;
int rc = 0;
- bool skipped = false;
- evlist__for_each_entry(evlist, pos) {
+ evlist__for_each_entry_safe(evlist, tmp, pos) {
try_again:
if (evsel__open(pos, pos->core.cpus, pos->core.threads) < 0) {
if (evsel__fallback(pos, &opts->target, errno, msg, sizeof(msg))) {
@@ -1383,23 +1382,12 @@ static int record__open(struct record *rec)
evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
evsel__name(pos), evsel__pmu_name(pos), msg);
- pos->skippable = true;
- skipped = true;
+ evlist__remove(evlist, pos);
} else {
pos->supported = true;
}
}
- if (skipped) {
- struct evsel *tmp;
- int idx = 0;
-
- evlist__for_each_entry_safe(evlist, tmp, pos) {
- if (pos->skippable)
- evlist__remove(evlist, pos);
- pos->core.idx = idx++;
- }
- }
if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
pr_warning(
"WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
--
2.34.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v2 2/2] libperf: evlist: Keep evsel idx contiguous on removal
2024-11-14 16:04 ` [PATCH v2 2/2] libperf: evlist: Keep evsel idx contiguous on removal James Clark
@ 2024-11-14 23:35 ` Ian Rogers
0 siblings, 0 replies; 9+ messages in thread
From: Ian Rogers @ 2024-11-14 23:35 UTC (permalink / raw)
To: James Clark
Cc: linux-perf-users, namhyung, thomas.falcon, Peter Zijlstra,
Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan,
linux-kernel
On Thu, Nov 14, 2024 at 8:07 AM James Clark <james.clark@linaro.org> wrote:
>
> The only two places where evsels are removed end up fixing up the index.
> Move it into libperf so that it's always done.
>
> Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
but I was also reminded that the evsel idx is just a meaningless int
and we should stop trying to make it special, so I sent out some clean
up:
https://lore.kernel.org/lkml/20241114230713.330701-1-irogers@google.com/
You may get asked by the maintainers to separate the test from the
main logic to make patches with fewer lines of code ¯\_(ツ)_/¯
Thanks,
Ian
> ---
> tools/lib/perf/evlist.c | 17 +++++++------
> tools/lib/perf/tests/test-evlist.c | 41 ++++++++++++++++++++++++++++++
> tools/perf/builtin-record.c | 18 +++----------
> 3 files changed, 53 insertions(+), 23 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index 83c43dc13313..ffa1a28fb14f 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -52,15 +52,8 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> * Empty cpu lists would eventually get opened as "any" so remove
> * genuinely empty ones before they're opened in the wrong place.
> */
> - if (perf_cpu_map__is_empty(evsel->cpus)) {
> - struct perf_evsel *next = perf_evlist__next(evlist, evsel);
> -
> + if (perf_cpu_map__is_empty(evsel->cpus))
> perf_evlist__remove(evlist, evsel);
> - /* Keep idx contiguous */
> - if (next)
> - list_for_each_entry_from(next, &evlist->entries, node)
> - next->idx--;
> - }
> } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
> /*
> @@ -116,8 +109,16 @@ void perf_evlist__add(struct perf_evlist *evlist,
> void perf_evlist__remove(struct perf_evlist *evlist,
> struct perf_evsel *evsel)
> {
> + struct perf_evsel *next = perf_evlist__next(evlist, evsel);
> +
> list_del_init(&evsel->node);
> evlist->nr_entries -= 1;
> +
> + /* Keep idx contiguous */
> + if (!next)
> + return;
> + list_for_each_entry_from(next, &evlist->entries, node)
> + next->idx--;
> }
>
> struct perf_evlist *perf_evlist__new(void)
> diff --git a/tools/lib/perf/tests/test-evlist.c b/tools/lib/perf/tests/test-evlist.c
> index 10f70cb41ff1..06153820f408 100644
> --- a/tools/lib/perf/tests/test-evlist.c
> +++ b/tools/lib/perf/tests/test-evlist.c
> @@ -571,6 +571,46 @@ static int test_stat_multiplexing(void)
> return 0;
> }
>
> +static int test_list_remove(void)
> +{
> + struct perf_evlist *evlist;
> + struct perf_evsel *evsel, *evsel0, *evsel1, *evsel2;
> + struct perf_event_attr attr = {};
> + int idx = 0;
> +
> + evlist = perf_evlist__new();
> + __T("failed to create evlist", evlist);
> +
> + evsel0 = perf_evsel__new(&attr);
> + __T("failed to create evsel", evsel0);
> + evsel1 = perf_evsel__new(&attr);
> + __T("failed to create evsel", evsel1);
> + evsel2 = perf_evsel__new(&attr);
> + __T("failed to create evsel", evsel2);
> +
> + perf_evlist__add(evlist, evsel0);
> + perf_evlist__add(evlist, evsel1);
> + perf_evlist__add(evlist, evsel2);
> +
> + idx = 0;
> + perf_evlist__for_each_evsel(evlist, evsel)
> + __T("idx isn't contiguous", evsel->idx == idx++);
> +
> + /* Test removing middle */
> + perf_evlist__remove(evlist, evsel1);
> + idx = 0;
> + perf_evlist__for_each_evsel(evlist, evsel)
> + __T("idx isn't contiguous", evsel->idx == idx++);
> +
> + /* Test removing end */
> + perf_evlist__remove(evlist, evsel2);
> + idx = 0;
> + perf_evlist__for_each_evsel(evlist, evsel)
> + __T("idx isn't contiguous", evsel->idx == idx++);
> +
> + return 0;
> +}
> +
> int test_evlist(int argc, char **argv)
> {
> __T_START;
> @@ -583,6 +623,7 @@ int test_evlist(int argc, char **argv)
> test_mmap_thread();
> test_mmap_cpus();
> test_stat_multiplexing();
> + test_list_remove();
>
> __T_END;
> return tests_failed == 0 ? 0 : -1;
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 7e99743f7e42..2ebbb0fd0064 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1359,14 +1359,13 @@ static int record__mmap(struct record *rec)
> static int record__open(struct record *rec)
> {
> char msg[BUFSIZ];
> - struct evsel *pos;
> + struct evsel *pos, *tmp;
> struct evlist *evlist = rec->evlist;
> struct perf_session *session = rec->session;
> struct record_opts *opts = &rec->opts;
> int rc = 0;
> - bool skipped = false;
>
> - evlist__for_each_entry(evlist, pos) {
> + evlist__for_each_entry_safe(evlist, tmp, pos) {
> try_again:
> if (evsel__open(pos, pos->core.cpus, pos->core.threads) < 0) {
> if (evsel__fallback(pos, &opts->target, errno, msg, sizeof(msg))) {
> @@ -1383,23 +1382,12 @@ static int record__open(struct record *rec)
> evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
> ui__error("Failure to open event '%s' on PMU '%s' which will be removed.\n%s\n",
> evsel__name(pos), evsel__pmu_name(pos), msg);
> - pos->skippable = true;
> - skipped = true;
> + evlist__remove(evlist, pos);
> } else {
> pos->supported = true;
> }
> }
>
> - if (skipped) {
> - struct evsel *tmp;
> - int idx = 0;
> -
> - evlist__for_each_entry_safe(evlist, tmp, pos) {
> - if (pos->skippable)
> - evlist__remove(evlist, pos);
> - pos->core.idx = idx++;
> - }
> - }
> if (symbol_conf.kptr_restrict && !evlist__exclude_kernel(evlist)) {
> pr_warning(
> "WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
2024-11-14 16:04 ` [PATCH v2 1/2] " James Clark
@ 2024-12-10 13:51 ` Arnaldo Carvalho de Melo
2024-12-10 13:56 ` James Clark
0 siblings, 1 reply; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-12-10 13:51 UTC (permalink / raw)
To: Namhyung Kim, James Clark
Cc: linux-perf-users, irogers, thomas.falcon, Peter Zijlstra,
Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Liang, Kan, linux-kernel
On Thu, Nov 14, 2024 at 04:04:48PM +0000, James Clark wrote:
> Since the linked fixes: commit, specifying a CPU on hybrid platforms
> results in an error because Perf tries to open an extended type event
> on "any" CPU which isn't valid. Extended type events can only be opened
> on CPUs that match the type.
> Before (working):
> $ perf record --cpu 1 -- true
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
> After (not working):
> $ perf record -C 1 -- true
> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
> /bin/dmesg | grep -i perf may provide additional information.
>
> (Ignore the warning message, that's expected and not particularly
> relevant to this issue).
>
> This is because perf_cpu_map__intersect() of the user specified CPU (1)
> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
> CPU map. However for the purposes of opening an event, libperf converts
> empty CPU maps into an any CPU (-1) which the kernel rejects.
>
> Fix it by deleting evsels with empty CPU maps in the specific case where
> user requested CPU maps are evaluated.
Namhyung, I think this should go via perf-tools,
Before, on an hybrid Intel processor:
⬢ [acme@toolbox perf-tools]$ grep -m1 'model name' /proc/cpuinfo
model name : 13th Gen Intel(R) Core(TM) i7-1365U
⬢ [acme@toolbox perf-tools]$
root@x1:~# perf record -C 1 -- true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles/P).
"dmesg | grep -i perf" may provide additional information.
root@x1:~#
After the patch:
root@x1:~# perf record -C 1 -- true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.369 MB perf.data (8 samples) ]
root@x1:~#
root@x1:~# perf evlist -v
cpu_core/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x400000000, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
root@x1:~#
All on CPU 1:
root@x1:~# perf report -D | grep PERF_RECORD_SAMPLE
1 1809541166457609 0x15e410 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1ce418 period: 1 addr: 0
1 1809541166470019 0x15e448 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0206b1 period: 1 addr: 0
1 1809541166475543 0x15e480 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 5 addr: 0
1 1809541166478316 0x15e4b8 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 53 addr: 0
1 1809541166480879 0x15e4f0 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 667 addr: 0
1 1809541166483895 0x15e528 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1bcbcc period: 8537 addr: 0
1 1809541166489116 0x15e560 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a009b3b period: 98429 addr: 0
1 1809541168375508 0x15e620 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffff9b0e2a34 period: 732596 addr: 0
root@x1:~#
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James, the second patch isn't applying to perf-tools/perf-tools.
- Arnaldo
> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
> Reviewed-by: Ian Rogers <irogers@google.com>
> Tested-by: Thomas Falcon <thomas.falcon@intel.com>
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
> tools/lib/perf/evlist.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index c6d67fc9e57e..83c43dc13313 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -47,6 +47,20 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> */
> perf_cpu_map__put(evsel->cpus);
> evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
> +
> + /*
> + * Empty cpu lists would eventually get opened as "any" so remove
> + * genuinely empty ones before they're opened in the wrong place.
> + */
> + if (perf_cpu_map__is_empty(evsel->cpus)) {
> + struct perf_evsel *next = perf_evlist__next(evlist, evsel);
> +
> + perf_evlist__remove(evlist, evsel);
> + /* Keep idx contiguous */
> + if (next)
> + list_for_each_entry_from(next, &evlist->entries, node)
> + next->idx--;
> + }
> } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
> /*
> @@ -80,11 +94,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>
> static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> {
> - struct perf_evsel *evsel;
> + struct perf_evsel *evsel, *n;
>
> evlist->needs_map_propagation = true;
>
> - perf_evlist__for_each_evsel(evlist, evsel)
> + list_for_each_entry_safe(evsel, n, &evlist->entries, node)
> __perf_evlist__propagate_maps(evlist, evsel);
> }
>
> --
> 2.34.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
2024-12-10 13:51 ` Arnaldo Carvalho de Melo
@ 2024-12-10 13:56 ` James Clark
2024-12-10 14:07 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 9+ messages in thread
From: James Clark @ 2024-12-10 13:56 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Namhyung Kim
Cc: linux-perf-users, irogers, thomas.falcon, Peter Zijlstra,
Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Adrian Hunter, Liang, Kan, linux-kernel
On 10/12/2024 1:51 pm, Arnaldo Carvalho de Melo wrote:
> On Thu, Nov 14, 2024 at 04:04:48PM +0000, James Clark wrote:
>> Since the linked fixes: commit, specifying a CPU on hybrid platforms
>> results in an error because Perf tries to open an extended type event
>> on "any" CPU which isn't valid. Extended type events can only be opened
>> on CPUs that match the type.
>
>> Before (working):
>
>> $ perf record --cpu 1 -- true
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
>
>> After (not working):
>
>> $ perf record -C 1 -- true
>> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
>> Error:
>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
>> /bin/dmesg | grep -i perf may provide additional information.
>>
>> (Ignore the warning message, that's expected and not particularly
>> relevant to this issue).
>>
>> This is because perf_cpu_map__intersect() of the user specified CPU (1)
>> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
>> CPU map. However for the purposes of opening an event, libperf converts
>> empty CPU maps into an any CPU (-1) which the kernel rejects.
>>
>> Fix it by deleting evsels with empty CPU maps in the specific case where
>> user requested CPU maps are evaluated.
>
>
> Namhyung, I think this should go via perf-tools,
>
> Before, on an hybrid Intel processor:
>
> ⬢ [acme@toolbox perf-tools]$ grep -m1 'model name' /proc/cpuinfo
> model name : 13th Gen Intel(R) Core(TM) i7-1365U
> ⬢ [acme@toolbox perf-tools]$
>
> root@x1:~# perf record -C 1 -- true
> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles/P).
> "dmesg | grep -i perf" may provide additional information.
> root@x1:~#
>
> After the patch:
>
> root@x1:~# perf record -C 1 -- true
> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 4-11) for event 'cycles:P'
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 1.369 MB perf.data (8 samples) ]
> root@x1:~#
> root@x1:~# perf evlist -v
> cpu_core/cycles/P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x400000000, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
> dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
> root@x1:~#
>
> All on CPU 1:
>
> root@x1:~# perf report -D | grep PERF_RECORD_SAMPLE
> 1 1809541166457609 0x15e410 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1ce418 period: 1 addr: 0
> 1 1809541166470019 0x15e448 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0206b1 period: 1 addr: 0
> 1 1809541166475543 0x15e480 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 5 addr: 0
> 1 1809541166478316 0x15e4b8 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 53 addr: 0
> 1 1809541166480879 0x15e4f0 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a0a9158 period: 667 addr: 0
> 1 1809541166483895 0x15e528 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9b1bcbcc period: 8537 addr: 0
> 1 1809541166489116 0x15e560 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 719180/719180: 0xffffffff9a009b3b period: 98429 addr: 0
> 1 1809541168375508 0x15e620 [0x38]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffff9b0e2a34 period: 732596 addr: 0
> root@x1:~#
>
> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> James, the second patch isn't applying to perf-tools/perf-tools.
>
> - Arnaldo
>
The second one applies on
https://lore.kernel.org/linux-perf-users/20241113011956.402096-1-irogers@google.com/T/#m2a3587fb83e6ab2d970bae25982ae9d6c8d9e5cd
because that also does an evlist__remove() which gets fixed up. But the
first one is ok to go in on its own.
>> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
>> Reviewed-by: Ian Rogers <irogers@google.com>
>> Tested-by: Thomas Falcon <thomas.falcon@intel.com>
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>> tools/lib/perf/evlist.c | 18 ++++++++++++++++--
>> 1 file changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
>> index c6d67fc9e57e..83c43dc13313 100644
>> --- a/tools/lib/perf/evlist.c
>> +++ b/tools/lib/perf/evlist.c
>> @@ -47,6 +47,20 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>> */
>> perf_cpu_map__put(evsel->cpus);
>> evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
>> +
>> + /*
>> + * Empty cpu lists would eventually get opened as "any" so remove
>> + * genuinely empty ones before they're opened in the wrong place.
>> + */
>> + if (perf_cpu_map__is_empty(evsel->cpus)) {
>> + struct perf_evsel *next = perf_evlist__next(evlist, evsel);
>> +
>> + perf_evlist__remove(evlist, evsel);
>> + /* Keep idx contiguous */
>> + if (next)
>> + list_for_each_entry_from(next, &evlist->entries, node)
>> + next->idx--;
>> + }
>> } else if (!evsel->own_cpus || evlist->has_user_cpus ||
>> (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
>> /*
>> @@ -80,11 +94,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>>
>> static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>> {
>> - struct perf_evsel *evsel;
>> + struct perf_evsel *evsel, *n;
>>
>> evlist->needs_map_propagation = true;
>>
>> - perf_evlist__for_each_evsel(evlist, evsel)
>> + list_for_each_entry_safe(evsel, n, &evlist->entries, node)
>> __perf_evlist__propagate_maps(evlist, evsel);
>> }
>>
>> --
>> 2.34.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
2024-12-10 13:56 ` James Clark
@ 2024-12-10 14:07 ` Arnaldo Carvalho de Melo
2024-12-10 18:10 ` Namhyung Kim
0 siblings, 1 reply; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-12-10 14:07 UTC (permalink / raw)
To: James Clark
Cc: Namhyung Kim, linux-perf-users, Ian Rogers, thomas.falcon,
Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Liang, Kan, linux-kernel
On Tue, Dec 10, 2024 at 01:56:21PM +0000, James Clark wrote:
> On 10/12/2024 1:51 pm, Arnaldo Carvalho de Melo wrote:
> > James, the second patch isn't applying to perf-tools/perf-tools.
> The second one applies on
> https://lore.kernel.org/linux-perf-users/20241113011956.402096-1-irogers@google.com/T/#m2a3587fb83e6ab2d970bae25982ae9d6c8d9e5cd
> because that also does an evlist__remove() which gets fixed up.
Right, I have to test that series on the ARM machines I have access to,
but there is a question from a tester that is waiting for a reply, I'll
see if I can reproduce that problem as well.
> But the first one is ok to go in on its own.
Agreed.
- Arnaldo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
2024-12-10 14:07 ` Arnaldo Carvalho de Melo
@ 2024-12-10 18:10 ` Namhyung Kim
2024-12-12 7:10 ` Namhyung Kim
0 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2024-12-10 18:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: James Clark, linux-perf-users, Ian Rogers, thomas.falcon,
Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Liang, Kan, linux-kernel
On Tue, Dec 10, 2024 at 11:07:13AM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Dec 10, 2024 at 01:56:21PM +0000, James Clark wrote:
> > On 10/12/2024 1:51 pm, Arnaldo Carvalho de Melo wrote:
> > > James, the second patch isn't applying to perf-tools/perf-tools.
>
> > The second one applies on
> > https://lore.kernel.org/linux-perf-users/20241113011956.402096-1-irogers@google.com/T/#m2a3587fb83e6ab2d970bae25982ae9d6c8d9e5cd
> > because that also does an evlist__remove() which gets fixed up.
>
> Right, I have to test that series on the ARM machines I have access to,
> but there is a question from a tester that is waiting for a reply, I'll
> see if I can reproduce that problem as well.
>
> > But the first one is ok to go in on its own.
>
> Agreed.
Ok, I'll pick this one to perf-tools and leave the patch 2 go into
perf-tools-next.
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v2 1/2] libperf: evlist: Fix --cpu argument on hybrid platform
2024-12-10 18:10 ` Namhyung Kim
@ 2024-12-12 7:10 ` Namhyung Kim
0 siblings, 0 replies; 9+ messages in thread
From: Namhyung Kim @ 2024-12-12 7:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: James Clark, linux-perf-users, Ian Rogers, thomas.falcon,
Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Adrian Hunter, Liang, Kan, linux-kernel
On Tue, Dec 10, 2024 at 10:10:21AM -0800, Namhyung Kim wrote:
> On Tue, Dec 10, 2024 at 11:07:13AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Dec 10, 2024 at 01:56:21PM +0000, James Clark wrote:
> > > On 10/12/2024 1:51 pm, Arnaldo Carvalho de Melo wrote:
> > > > James, the second patch isn't applying to perf-tools/perf-tools.
> >
> > > The second one applies on
> > > https://lore.kernel.org/linux-perf-users/20241113011956.402096-1-irogers@google.com/T/#m2a3587fb83e6ab2d970bae25982ae9d6c8d9e5cd
> > > because that also does an evlist__remove() which gets fixed up.
> >
> > Right, I have to test that series on the ARM machines I have access to,
> > but there is a question from a tester that is waiting for a reply, I'll
> > see if I can reproduce that problem as well.
> >
> > > But the first one is ok to go in on its own.
> >
> > Agreed.
>
> Ok, I'll pick this one to perf-tools and leave the patch 2 go into
> perf-tools-next.
Applied patch 1 to perf-tools, thanks!
Best regards,
Namhyung
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-12-12 7:10 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-14 16:04 [PATCH v2 0/2] libperf: evlist: Fix --cpu argument on hybrid platform James Clark
2024-11-14 16:04 ` [PATCH v2 1/2] " James Clark
2024-12-10 13:51 ` Arnaldo Carvalho de Melo
2024-12-10 13:56 ` James Clark
2024-12-10 14:07 ` Arnaldo Carvalho de Melo
2024-12-10 18:10 ` Namhyung Kim
2024-12-12 7:10 ` Namhyung Kim
2024-11-14 16:04 ` [PATCH v2 2/2] libperf: evlist: Keep evsel idx contiguous on removal James Clark
2024-11-14 23:35 ` Ian Rogers
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).