* [RFC PATCH 0/1] libperf: evlist: Fix --cpu argument on hybrid platform
@ 2024-10-15 14:54 James Clark
2024-10-15 14:54 ` [RFC PATCH 1/1] " James Clark
0 siblings, 1 reply; 10+ messages in thread
From: James Clark @ 2024-10-15 14:54 UTC (permalink / raw)
To: linux-perf-users, acme, namhyung, irogers
Cc: James Clark, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan,
linux-kernel
I noticed that this isn't working on Arm or Intel, and I've attached a
hack that I don't really think is the right fix but it does work. It's
not strictly a regression from the linked fixes: commit, as before that
events wouldn't be opened on multiple PMUs properly. But after it the
error becomes more fatal.
Posting the RFC because whatever the fix should be isn't straightforward
and probably needs discussion. I think similar things have been
discussed before about the empty cpumap issue, but not related to the
--cpu option.
I thought of a few possible fixes:
* Don't open "empty" CPU maps in Perf. This would make Perf and libperf
diverge. Also not sure if libperf is supposed to be backwards
compatible? But this would be a breaking change because someone might
be using empty == all deliberately.
* Prune the evlist for empty CPU maps in every tool in Perf. This isn't
great because users of libperf also have to do this and is a lot of
duplication in Perf.
* Use cpumap->nr == 0 for empty instead of cpumap == NULL
One issue with the attached fix is that you can only delete the core
evsel from within libperf and not Perf's part of it. Also maybe it's a
bit weird for the propagate function to start deleting things.
For reference the empty to any code is here:
static int __evsel__prepare_open(...)
if (cpus == NULL) {
if (empty_cpu_map == NULL) {
empty_cpu_map = perf_cpu_map__new_any_cpu();
if (empty_cpu_map == NULL)
return -ENOMEM;
}
cpus = empty_cpu_map;
}
James Clark (1):
libperf: evlist: Fix --cpu argument on hybrid platform
tools/lib/perf/evlist.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform
2024-10-15 14:54 [RFC PATCH 0/1] libperf: evlist: Fix --cpu argument on hybrid platform James Clark
@ 2024-10-15 14:54 ` James Clark
2024-10-15 15:14 ` Ian Rogers
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: James Clark @ 2024-10-15 14:54 UTC (permalink / raw)
To: linux-perf-users, acme, namhyung, irogers
Cc: James Clark, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan,
linux-kernel
Since the linked fixes: commit, specifying a CPU on hybrid platforms
results in an error because Perf tries to open an extended type event
on "any" CPU which isn't valid. Extended type events can only be opened
on CPUs that match the type.
Before (working):
$ perf record --cpu 1 -- true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
After (not working):
$ perf record -C 1 -- true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
/bin/dmesg | grep -i perf may provide additional information.
(Ignore the warning message, that's expected and not particularly
relevant to this issue).
This is because perf_cpu_map__intersect() of the user specified CPU (1)
and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
CPU map. However for the purposes of opening an event, libperf converts
empty CPU maps into an any CPU (-1) which the kernel rejects.
Fix it by deleting evsels with empty CPU maps in the specific case where
user requested CPU maps are evaluated.
Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
Signed-off-by: James Clark <james.clark@linaro.org>
---
tools/lib/perf/evlist.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index c6d67fc9e57e..8fae9a157a91 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -47,6 +47,13 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
*/
perf_cpu_map__put(evsel->cpus);
evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
+
+ /*
+ * Empty cpu lists would eventually get opened as "any" so remove
+ * genuinely empty ones before they're opened in the wrong place.
+ */
+ if (perf_cpu_map__is_empty(evsel->cpus))
+ perf_evlist__remove(evlist, evsel);
} else if (!evsel->own_cpus || evlist->has_user_cpus ||
(!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
/*
@@ -80,11 +87,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
{
- struct perf_evsel *evsel;
+ struct perf_evsel *evsel, *n;
evlist->needs_map_propagation = true;
- perf_evlist__for_each_evsel(evlist, evsel)
+ list_for_each_entry_safe(evsel, n, &evlist->entries, node)
__perf_evlist__propagate_maps(evlist, evsel);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform
2024-10-15 14:54 ` [RFC PATCH 1/1] " James Clark
@ 2024-10-15 15:14 ` Ian Rogers
2024-10-16 8:29 ` James Clark
2024-10-15 16:53 ` Falcon, Thomas
2024-10-17 23:02 ` Namhyung Kim
2 siblings, 1 reply; 10+ messages in thread
From: Ian Rogers @ 2024-10-15 15:14 UTC (permalink / raw)
To: James Clark
Cc: linux-perf-users, acme, namhyung, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Liang, Kan, linux-kernel
On Tue, Oct 15, 2024 at 7:54 AM James Clark <james.clark@linaro.org> wrote:
>
> Since the linked fixes: commit, specifying a CPU on hybrid platforms
> results in an error because Perf tries to open an extended type event
> on "any" CPU which isn't valid. Extended type events can only be opened
> on CPUs that match the type.
>
> Before (working):
>
> $ perf record --cpu 1 -- true
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
>
> After (not working):
>
> $ perf record -C 1 -- true
> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
> /bin/dmesg | grep -i perf may provide additional information.
>
> (Ignore the warning message, that's expected and not particularly
> relevant to this issue).
>
> This is because perf_cpu_map__intersect() of the user specified CPU (1)
> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
> CPU map. However for the purposes of opening an event, libperf converts
> empty CPU maps into an any CPU (-1) which the kernel rejects.
Ugh. The cpumap API tries its best to confuse NULL == empty but empty
can give you dummy, dummy is also known as 'any' or -1, 'any' sounds a
lot like 'all' but sometimes 'all' is only online CPUs. I tried to
tidy up the naming a while ago, but there is still a mess.
> Fix it by deleting evsels with empty CPU maps in the specific case where
> user requested CPU maps are evaluated.
If we delete evsels than the indices can be broken for certain things.
I'm guessing asan testing is clean but the large number of side data
structures that are indexed by things in another data structure makes
the whole code base brittle and I am nervous around this change.
> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
> Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
> ---
> tools/lib/perf/evlist.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index c6d67fc9e57e..8fae9a157a91 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -47,6 +47,13 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> */
> perf_cpu_map__put(evsel->cpus);
> evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
> +
> + /*
> + * Empty cpu lists would eventually get opened as "any" so remove
> + * genuinely empty ones before they're opened in the wrong place.
> + */
> + if (perf_cpu_map__is_empty(evsel->cpus))
> + perf_evlist__remove(evlist, evsel);
> } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
> /*
> @@ -80,11 +87,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>
> static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> {
> - struct perf_evsel *evsel;
> + struct perf_evsel *evsel, *n;
>
> evlist->needs_map_propagation = true;
>
> - perf_evlist__for_each_evsel(evlist, evsel)
> + list_for_each_entry_safe(evsel, n, &evlist->entries, node)
> __perf_evlist__propagate_maps(evlist, evsel);
> }
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform
2024-10-15 14:54 ` [RFC PATCH 1/1] " James Clark
2024-10-15 15:14 ` Ian Rogers
@ 2024-10-15 16:53 ` Falcon, Thomas
2024-10-17 23:02 ` Namhyung Kim
2 siblings, 0 replies; 10+ messages in thread
From: Falcon, Thomas @ 2024-10-15 16:53 UTC (permalink / raw)
To: namhyung@kernel.org, acme@kernel.org, james.clark@linaro.org,
linux-perf-users@vger.kernel.org, irogers@google.com
Cc: alexander.shishkin@linux.intel.com, peterz@infradead.org,
mark.rutland@arm.com, mingo@redhat.com,
linux-kernel@vger.kernel.org, Hunter, Adrian,
kan.liang@linux.intel.com, jolsa@kernel.org
On Tue, 2024-10-15 at 15:54 +0100, James Clark wrote:
> Since the linked fixes: commit, specifying a CPU on hybrid platforms
> results in an error because Perf tries to open an extended type event
> on "any" CPU which isn't valid. Extended type events can only be
> opened
> on CPUs that match the type.
>
> Before (working):
>
> $ perf record --cpu 1 -- true
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
>
> After (not working):
>
> $ perf record -C 1 -- true
> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom'
> (CPUs 16-27) for event 'cycles:P'
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid
> argument) for event (cpu_atom/cycles:P/).
> /bin/dmesg | grep -i perf may provide additional information.
>
> (Ignore the warning message, that's expected and not particularly
> relevant to this issue).
>
> This is because perf_cpu_map__intersect() of the user specified CPU
> (1)
> and one of the PMU's CPUs (16-27) correctly results in an empty
> (NULL)
> CPU map. However for the purposes of opening an event, libperf
> converts
> empty CPU maps into an any CPU (-1) which the kernel rejects.
>
> Fix it by deleting evsels with empty CPU maps in the specific case
> where
> user requested CPU maps are evaluated.
>
> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric"
> events")
> Signed-off-by: James Clark <james.clark@linaro.org>
Works for me on an i9-12900.
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Thanks
> ---
> tools/lib/perf/evlist.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index c6d67fc9e57e..8fae9a157a91 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -47,6 +47,13 @@ static void __perf_evlist__propagate_maps(struct
> perf_evlist *evlist,
> */
> perf_cpu_map__put(evsel->cpus);
> evsel->cpus = perf_cpu_map__intersect(evlist-
> >user_requested_cpus, evsel->own_cpus);
> +
> + /*
> + * Empty cpu lists would eventually get opened as
> "any" so remove
> + * genuinely empty ones before they're opened in the
> wrong place.
> + */
> + if (perf_cpu_map__is_empty(evsel->cpus))
> + perf_evlist__remove(evlist, evsel);
> } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> (!evsel->requires_cpu &&
> perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
> /*
> @@ -80,11 +87,11 @@ static void __perf_evlist__propagate_maps(struct
> perf_evlist *evlist,
>
> static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> {
> - struct perf_evsel *evsel;
> + struct perf_evsel *evsel, *n;
>
> evlist->needs_map_propagation = true;
>
> - perf_evlist__for_each_evsel(evlist, evsel)
> + list_for_each_entry_safe(evsel, n, &evlist->entries, node)
> __perf_evlist__propagate_maps(evlist, evsel);
> }
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform
2024-10-15 15:14 ` Ian Rogers
@ 2024-10-16 8:29 ` James Clark
2024-10-16 15:01 ` Ian Rogers
0 siblings, 1 reply; 10+ messages in thread
From: James Clark @ 2024-10-16 8:29 UTC (permalink / raw)
To: Ian Rogers
Cc: linux-perf-users, acme, namhyung, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Liang, Kan, linux-kernel
On 15/10/2024 4:14 pm, Ian Rogers wrote:
> On Tue, Oct 15, 2024 at 7:54 AM James Clark <james.clark@linaro.org> wrote:
>>
>> Since the linked fixes: commit, specifying a CPU on hybrid platforms
>> results in an error because Perf tries to open an extended type event
>> on "any" CPU which isn't valid. Extended type events can only be opened
>> on CPUs that match the type.
>>
>> Before (working):
>>
>> $ perf record --cpu 1 -- true
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
>>
>> After (not working):
>>
>> $ perf record -C 1 -- true
>> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
>> Error:
>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
>> /bin/dmesg | grep -i perf may provide additional information.
>>
>> (Ignore the warning message, that's expected and not particularly
>> relevant to this issue).
>>
>> This is because perf_cpu_map__intersect() of the user specified CPU (1)
>> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
>> CPU map. However for the purposes of opening an event, libperf converts
>> empty CPU maps into an any CPU (-1) which the kernel rejects.
>
> Ugh. The cpumap API tries its best to confuse NULL == empty but empty
> can give you dummy, dummy is also known as 'any' or -1, 'any' sounds a
> lot like 'all' but sometimes 'all' is only online CPUs. I tried to
> tidy up the naming a while ago, but there is still a mess.
>
I don't know if you think this is a good opportunity for me to have a go
at finishing separating those? Or is it a dead end?
>> Fix it by deleting evsels with empty CPU maps in the specific case where
>> user requested CPU maps are evaluated.
>
> If we delete evsels than the indices can be broken for certain things.
> I'm guessing asan testing is clean but the large number of side data
> structures that are indexed by things in another data structure makes
> the whole code base brittle and I am nervous around this change.
>
>> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
>> Signed-off-by: James Clark <james.clark@linaro.org>
>
> Reviewed-by: Ian Rogers <irogers@google.com>
>
> Thanks,
> Ian
>
Ok if we're not completely opposed to doing it this way I will dig a bit
more and double check everything is working.
>> ---
>> tools/lib/perf/evlist.c | 11 +++++++++--
>> 1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
>> index c6d67fc9e57e..8fae9a157a91 100644
>> --- a/tools/lib/perf/evlist.c
>> +++ b/tools/lib/perf/evlist.c
>> @@ -47,6 +47,13 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>> */
>> perf_cpu_map__put(evsel->cpus);
>> evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
>> +
>> + /*
>> + * Empty cpu lists would eventually get opened as "any" so remove
>> + * genuinely empty ones before they're opened in the wrong place.
>> + */
>> + if (perf_cpu_map__is_empty(evsel->cpus))
>> + perf_evlist__remove(evlist, evsel);
>> } else if (!evsel->own_cpus || evlist->has_user_cpus ||
>> (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
>> /*
>> @@ -80,11 +87,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>>
>> static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
>> {
>> - struct perf_evsel *evsel;
>> + struct perf_evsel *evsel, *n;
>>
>> evlist->needs_map_propagation = true;
>>
>> - perf_evlist__for_each_evsel(evlist, evsel)
>> + list_for_each_entry_safe(evsel, n, &evlist->entries, node)
>> __perf_evlist__propagate_maps(evlist, evsel);
>> }
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform
2024-10-16 8:29 ` James Clark
@ 2024-10-16 15:01 ` Ian Rogers
2024-10-17 23:10 ` Namhyung Kim
0 siblings, 1 reply; 10+ messages in thread
From: Ian Rogers @ 2024-10-16 15:01 UTC (permalink / raw)
To: James Clark
Cc: linux-perf-users, acme, namhyung, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Liang, Kan, linux-kernel
On Wed, Oct 16, 2024 at 1:29 AM James Clark <james.clark@linaro.org> wrote:
>
> On 15/10/2024 4:14 pm, Ian Rogers wrote:
> > On Tue, Oct 15, 2024 at 7:54 AM James Clark <james.clark@linaro.org> wrote:
> >>
> >> Since the linked fixes: commit, specifying a CPU on hybrid platforms
> >> results in an error because Perf tries to open an extended type event
> >> on "any" CPU which isn't valid. Extended type events can only be opened
> >> on CPUs that match the type.
> >>
> >> Before (working):
> >>
> >> $ perf record --cpu 1 -- true
> >> [ perf record: Woken up 1 times to write data ]
> >> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
> >>
> >> After (not working):
> >>
> >> $ perf record -C 1 -- true
> >> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
> >> Error:
> >> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
> >> /bin/dmesg | grep -i perf may provide additional information.
> >>
> >> (Ignore the warning message, that's expected and not particularly
> >> relevant to this issue).
> >>
> >> This is because perf_cpu_map__intersect() of the user specified CPU (1)
> >> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
> >> CPU map. However for the purposes of opening an event, libperf converts
> >> empty CPU maps into an any CPU (-1) which the kernel rejects.
> >
> > Ugh. The cpumap API tries its best to confuse NULL == empty but empty
> > can give you dummy, dummy is also known as 'any' or -1, 'any' sounds a
> > lot like 'all' but sometimes 'all' is only online CPUs. I tried to
> > tidy up the naming a while ago, but there is still a mess.
> >
>
> I don't know if you think this is a good opportunity for me to have a go
> at finishing separating those? Or is it a dead end?
So cpumap (and threadmap) underpin a lot of things, we also used to
routinely confuse CPU numbers with cpumap indices that are used to
densely index xyarrays with file descriptors, etc. My thought was that
we may end up doing a proper Rust libperf that can be under a more
permissive license like libbpf - currently libperf is a source of GPL
infection. The rewrite would be a good time to clear these things up.
I believe someone at RedHat has looked at doing a Rust libperf.
> >> Fix it by deleting evsels with empty CPU maps in the specific case where
> >> user requested CPU maps are evaluated.
> >
> > If we delete evsels than the indices can be broken for certain things.
> > I'm guessing asan testing is clean but the large number of side data
> > structures that are indexed by things in another data structure makes
> > the whole code base brittle and I am nervous around this change.
> >
> >> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
> >> Signed-off-by: James Clark <james.clark@linaro.org>
> >
> > Reviewed-by: Ian Rogers <irogers@google.com>
> >
> > Thanks,
> > Ian
> >
>
> Ok if we're not completely opposed to doing it this way I will dig a bit
> more and double check everything is working.
I think it is okay to do it this way (hence the reviewed-by tag) as
propagate maps should happen before the xyarrays are set up, it'd be
nice if these things were checked at runtime, or by the compiler...
Thanks,
Ian
> >> ---
> >> tools/lib/perf/evlist.c | 11 +++++++++--
> >> 1 file changed, 9 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> >> index c6d67fc9e57e..8fae9a157a91 100644
> >> --- a/tools/lib/perf/evlist.c
> >> +++ b/tools/lib/perf/evlist.c
> >> @@ -47,6 +47,13 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> >> */
> >> perf_cpu_map__put(evsel->cpus);
> >> evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
> >> +
> >> + /*
> >> + * Empty cpu lists would eventually get opened as "any" so remove
> >> + * genuinely empty ones before they're opened in the wrong place.
> >> + */
> >> + if (perf_cpu_map__is_empty(evsel->cpus))
> >> + perf_evlist__remove(evlist, evsel);
> >> } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> >> (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
> >> /*
> >> @@ -80,11 +87,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> >>
> >> static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> >> {
> >> - struct perf_evsel *evsel;
> >> + struct perf_evsel *evsel, *n;
> >>
> >> evlist->needs_map_propagation = true;
> >>
> >> - perf_evlist__for_each_evsel(evlist, evsel)
> >> + list_for_each_entry_safe(evsel, n, &evlist->entries, node)
> >> __perf_evlist__propagate_maps(evlist, evsel);
> >> }
> >>
> >> --
> >> 2.34.1
> >>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform
2024-10-15 14:54 ` [RFC PATCH 1/1] " James Clark
2024-10-15 15:14 ` Ian Rogers
2024-10-15 16:53 ` Falcon, Thomas
@ 2024-10-17 23:02 ` Namhyung Kim
2024-10-18 10:18 ` James Clark
2 siblings, 1 reply; 10+ messages in thread
From: Namhyung Kim @ 2024-10-17 23:02 UTC (permalink / raw)
To: James Clark
Cc: linux-perf-users, acme, irogers, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Liang, Kan, linux-kernel
On Tue, Oct 15, 2024 at 03:54:15PM +0100, James Clark wrote:
> Since the linked fixes: commit, specifying a CPU on hybrid platforms
> results in an error because Perf tries to open an extended type event
> on "any" CPU which isn't valid. Extended type events can only be opened
> on CPUs that match the type.
>
> Before (working):
>
> $ perf record --cpu 1 -- true
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
>
> After (not working):
>
> $ perf record -C 1 -- true
> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
> /bin/dmesg | grep -i perf may provide additional information.
>
> (Ignore the warning message, that's expected and not particularly
> relevant to this issue).
>
> This is because perf_cpu_map__intersect() of the user specified CPU (1)
> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
> CPU map. However for the purposes of opening an event, libperf converts
> empty CPU maps into an any CPU (-1) which the kernel rejects.
>
> Fix it by deleting evsels with empty CPU maps in the specific case where
> user requested CPU maps are evaluated.
I think there's a discussion about making default events skippable and
hide them in the output unless all of them fail.
Thanks,
Namhyung
>
> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
> tools/lib/perf/evlist.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index c6d67fc9e57e..8fae9a157a91 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -47,6 +47,13 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> */
> perf_cpu_map__put(evsel->cpus);
> evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
> +
> + /*
> + * Empty cpu lists would eventually get opened as "any" so remove
> + * genuinely empty ones before they're opened in the wrong place.
> + */
> + if (perf_cpu_map__is_empty(evsel->cpus))
> + perf_evlist__remove(evlist, evsel);
> } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
> /*
> @@ -80,11 +87,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
>
> static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> {
> - struct perf_evsel *evsel;
> + struct perf_evsel *evsel, *n;
>
> evlist->needs_map_propagation = true;
>
> - perf_evlist__for_each_evsel(evlist, evsel)
> + list_for_each_entry_safe(evsel, n, &evlist->entries, node)
> __perf_evlist__propagate_maps(evlist, evsel);
> }
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform
2024-10-16 15:01 ` Ian Rogers
@ 2024-10-17 23:10 ` Namhyung Kim
2024-10-18 10:19 ` James Clark
0 siblings, 1 reply; 10+ messages in thread
From: Namhyung Kim @ 2024-10-17 23:10 UTC (permalink / raw)
To: Ian Rogers
Cc: James Clark, linux-perf-users, acme, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Liang, Kan, linux-kernel
On Wed, Oct 16, 2024 at 08:01:21AM -0700, Ian Rogers wrote:
> On Wed, Oct 16, 2024 at 1:29 AM James Clark <james.clark@linaro.org> wrote:
> >
> > On 15/10/2024 4:14 pm, Ian Rogers wrote:
> > > On Tue, Oct 15, 2024 at 7:54 AM James Clark <james.clark@linaro.org> wrote:
> > >>
> > >> Since the linked fixes: commit, specifying a CPU on hybrid platforms
> > >> results in an error because Perf tries to open an extended type event
> > >> on "any" CPU which isn't valid. Extended type events can only be opened
> > >> on CPUs that match the type.
> > >>
> > >> Before (working):
> > >>
> > >> $ perf record --cpu 1 -- true
> > >> [ perf record: Woken up 1 times to write data ]
> > >> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
> > >>
> > >> After (not working):
> > >>
> > >> $ perf record -C 1 -- true
> > >> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
> > >> Error:
> > >> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
> > >> /bin/dmesg | grep -i perf may provide additional information.
> > >>
> > >> (Ignore the warning message, that's expected and not particularly
> > >> relevant to this issue).
> > >>
> > >> This is because perf_cpu_map__intersect() of the user specified CPU (1)
> > >> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
> > >> CPU map. However for the purposes of opening an event, libperf converts
> > >> empty CPU maps into an any CPU (-1) which the kernel rejects.
> > >
> > > Ugh. The cpumap API tries its best to confuse NULL == empty but empty
> > > can give you dummy, dummy is also known as 'any' or -1, 'any' sounds a
> > > lot like 'all' but sometimes 'all' is only online CPUs. I tried to
> > > tidy up the naming a while ago, but there is still a mess.
> > >
> >
> > I don't know if you think this is a good opportunity for me to have a go
> > at finishing separating those? Or is it a dead end?
>
> So cpumap (and threadmap) underpin a lot of things, we also used to
> routinely confuse CPU numbers with cpumap indices that are used to
> densely index xyarrays with file descriptors, etc. My thought was that
> we may end up doing a proper Rust libperf that can be under a more
> permissive license like libbpf - currently libperf is a source of GPL
> infection. The rewrite would be a good time to clear these things up.
> I believe someone at RedHat has looked at doing a Rust libperf.
I really want to rewrite CPU/thread map related code but didn't have
time to work on it. :(
It'd be great if we can rewrite it in Rust! But current libperf API is
pretty bad and it's not clearly separated from the tools code. For
example, accessing internals like evsel->core.xxx should be changed
first.
>
> > >> Fix it by deleting evsels with empty CPU maps in the specific case where
> > >> user requested CPU maps are evaluated.
> > >
> > > If we delete evsels than the indices can be broken for certain things.
> > > I'm guessing asan testing is clean but the large number of side data
> > > structures that are indexed by things in another data structure makes
> > > the whole code base brittle and I am nervous around this change.
> > >
> > >> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
> > >> Signed-off-by: James Clark <james.clark@linaro.org>
> > >
> > > Reviewed-by: Ian Rogers <irogers@google.com>
> > >
> > > Thanks,
> > > Ian
> > >
> >
> > Ok if we're not completely opposed to doing it this way I will dig a bit
> > more and double check everything is working.
>
> I think it is okay to do it this way (hence the reviewed-by tag) as
> propagate maps should happen before the xyarrays are set up, it'd be
> nice if these things were checked at runtime, or by the compiler...
Right, evsel index is used some places probably we need to update it
too.
Thanks,
Namhyung
>
> > >> ---
> > >> tools/lib/perf/evlist.c | 11 +++++++++--
> > >> 1 file changed, 9 insertions(+), 2 deletions(-)
> > >>
> > >> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> > >> index c6d67fc9e57e..8fae9a157a91 100644
> > >> --- a/tools/lib/perf/evlist.c
> > >> +++ b/tools/lib/perf/evlist.c
> > >> @@ -47,6 +47,13 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> > >> */
> > >> perf_cpu_map__put(evsel->cpus);
> > >> evsel->cpus = perf_cpu_map__intersect(evlist->user_requested_cpus, evsel->own_cpus);
> > >> +
> > >> + /*
> > >> + * Empty cpu lists would eventually get opened as "any" so remove
> > >> + * genuinely empty ones before they're opened in the wrong place.
> > >> + */
> > >> + if (perf_cpu_map__is_empty(evsel->cpus))
> > >> + perf_evlist__remove(evlist, evsel);
> > >> } else if (!evsel->own_cpus || evlist->has_user_cpus ||
> > >> (!evsel->requires_cpu && perf_cpu_map__has_any_cpu(evlist->user_requested_cpus))) {
> > >> /*
> > >> @@ -80,11 +87,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist,
> > >>
> > >> static void perf_evlist__propagate_maps(struct perf_evlist *evlist)
> > >> {
> > >> - struct perf_evsel *evsel;
> > >> + struct perf_evsel *evsel, *n;
> > >>
> > >> evlist->needs_map_propagation = true;
> > >>
> > >> - perf_evlist__for_each_evsel(evlist, evsel)
> > >> + list_for_each_entry_safe(evsel, n, &evlist->entries, node)
> > >> __perf_evlist__propagate_maps(evlist, evsel);
> > >> }
> > >>
> > >> --
> > >> 2.34.1
> > >>
> >
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform
2024-10-17 23:02 ` Namhyung Kim
@ 2024-10-18 10:18 ` James Clark
0 siblings, 0 replies; 10+ messages in thread
From: James Clark @ 2024-10-18 10:18 UTC (permalink / raw)
To: Namhyung Kim, Ian Rogers
Cc: linux-perf-users, acme, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan,
linux-kernel
On 18/10/2024 00:02, Namhyung Kim wrote:
> On Tue, Oct 15, 2024 at 03:54:15PM +0100, James Clark wrote:
>> Since the linked fixes: commit, specifying a CPU on hybrid platforms
>> results in an error because Perf tries to open an extended type event
>> on "any" CPU which isn't valid. Extended type events can only be opened
>> on CPUs that match the type.
>>
>> Before (working):
>>
>> $ perf record --cpu 1 -- true
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
>>
>> After (not working):
>>
>> $ perf record -C 1 -- true
>> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
>> Error:
>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
>> /bin/dmesg | grep -i perf may provide additional information.
>>
>> (Ignore the warning message, that's expected and not particularly
>> relevant to this issue).
>>
>> This is because perf_cpu_map__intersect() of the user specified CPU (1)
>> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
>> CPU map. However for the purposes of opening an event, libperf converts
>> empty CPU maps into an any CPU (-1) which the kernel rejects.
>>
>> Fix it by deleting evsels with empty CPU maps in the specific case where
>> user requested CPU maps are evaluated.
>
> I think there's a discussion about making default events skippable and
> hide them in the output unless all of them fail.
>
> Thanks,
> Namhyung
>
In this case it's still broken if you do -e cycles, not just for default
events. So I'm not sure if that would help.
James
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC PATCH 1/1] libperf: evlist: Fix --cpu argument on hybrid platform
2024-10-17 23:10 ` Namhyung Kim
@ 2024-10-18 10:19 ` James Clark
0 siblings, 0 replies; 10+ messages in thread
From: James Clark @ 2024-10-18 10:19 UTC (permalink / raw)
To: Namhyung Kim, Ian Rogers
Cc: linux-perf-users, acme, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan,
linux-kernel
On 18/10/2024 00:10, Namhyung Kim wrote:
> On Wed, Oct 16, 2024 at 08:01:21AM -0700, Ian Rogers wrote:
>> On Wed, Oct 16, 2024 at 1:29 AM James Clark <james.clark@linaro.org> wrote:
>>>
>>> On 15/10/2024 4:14 pm, Ian Rogers wrote:
>>>> On Tue, Oct 15, 2024 at 7:54 AM James Clark <james.clark@linaro.org> wrote:
>>>>>
>>>>> Since the linked fixes: commit, specifying a CPU on hybrid platforms
>>>>> results in an error because Perf tries to open an extended type event
>>>>> on "any" CPU which isn't valid. Extended type events can only be opened
>>>>> on CPUs that match the type.
>>>>>
>>>>> Before (working):
>>>>>
>>>>> $ perf record --cpu 1 -- true
>>>>> [ perf record: Woken up 1 times to write data ]
>>>>> [ perf record: Captured and wrote 2.385 MB perf.data (7 samples) ]
>>>>>
>>>>> After (not working):
>>>>>
>>>>> $ perf record -C 1 -- true
>>>>> WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cycles:P'
>>>>> Error:
>>>>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_atom/cycles:P/).
>>>>> /bin/dmesg | grep -i perf may provide additional information.
>>>>>
>>>>> (Ignore the warning message, that's expected and not particularly
>>>>> relevant to this issue).
>>>>>
>>>>> This is because perf_cpu_map__intersect() of the user specified CPU (1)
>>>>> and one of the PMU's CPUs (16-27) correctly results in an empty (NULL)
>>>>> CPU map. However for the purposes of opening an event, libperf converts
>>>>> empty CPU maps into an any CPU (-1) which the kernel rejects.
>>>>
>>>> Ugh. The cpumap API tries its best to confuse NULL == empty but empty
>>>> can give you dummy, dummy is also known as 'any' or -1, 'any' sounds a
>>>> lot like 'all' but sometimes 'all' is only online CPUs. I tried to
>>>> tidy up the naming a while ago, but there is still a mess.
>>>>
>>>
>>> I don't know if you think this is a good opportunity for me to have a go
>>> at finishing separating those? Or is it a dead end?
>>
>> So cpumap (and threadmap) underpin a lot of things, we also used to
>> routinely confuse CPU numbers with cpumap indices that are used to
>> densely index xyarrays with file descriptors, etc. My thought was that
>> we may end up doing a proper Rust libperf that can be under a more
>> permissive license like libbpf - currently libperf is a source of GPL
>> infection. The rewrite would be a good time to clear these things up.
>> I believe someone at RedHat has looked at doing a Rust libperf.
>
> I really want to rewrite CPU/thread map related code but didn't have
> time to work on it. :(
>
> It'd be great if we can rewrite it in Rust! But current libperf API is
> pretty bad and it's not clearly separated from the tools code. For
> example, accessing internals like evsel->core.xxx should be changed
> first.
>
>>
>>>>> Fix it by deleting evsels with empty CPU maps in the specific case where
>>>>> user requested CPU maps are evaluated.
>>>>
>>>> If we delete evsels than the indices can be broken for certain things.
>>>> I'm guessing asan testing is clean but the large number of side data
>>>> structures that are indexed by things in another data structure makes
>>>> the whole code base brittle and I am nervous around this change.
>>>>
>>>>> Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
>>>>> Signed-off-by: James Clark <james.clark@linaro.org>
>>>>
>>>> Reviewed-by: Ian Rogers <irogers@google.com>
>>>>
>>>> Thanks,
>>>> Ian
>>>>
>>>
>>> Ok if we're not completely opposed to doing it this way I will dig a bit
>>> more and double check everything is working.
>>
>> I think it is okay to do it this way (hence the reviewed-by tag) as
>> propagate maps should happen before the xyarrays are set up, it'd be
>> nice if these things were checked at runtime, or by the compiler...
>
> Right, evsel index is used some places probably we need to update it
> too.
>
> Thanks,
> Namhyung
>
Ok I'll check for that.
James
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2024-10-18 10:19 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-15 14:54 [RFC PATCH 0/1] libperf: evlist: Fix --cpu argument on hybrid platform James Clark
2024-10-15 14:54 ` [RFC PATCH 1/1] " James Clark
2024-10-15 15:14 ` Ian Rogers
2024-10-16 8:29 ` James Clark
2024-10-16 15:01 ` Ian Rogers
2024-10-17 23:10 ` Namhyung Kim
2024-10-18 10:19 ` James Clark
2024-10-15 16:53 ` Falcon, Thomas
2024-10-17 23:02 ` Namhyung Kim
2024-10-18 10:18 ` James Clark
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).