* [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus
@ 2023-12-12 19:38 kan.liang
2023-12-12 20:37 ` Ian Rogers
0 siblings, 1 reply; 9+ messages in thread
From: kan.liang @ 2023-12-12 19:38 UTC (permalink / raw)
To: acme, irogers
Cc: namhyung, mark.rutland, maz, marcan, linux-kernel,
linux-perf-users, Kan Liang
From: Kan Liang <kan.liang@linux.intel.com>
perf top errors out on a hybrid machine
$perf top
Error:
The cycles:P event is not supported.
The perf top expects that the "cycles" is collected on all CPUs in the
system. But for hybrid there is no single "cycles" event which can cover
all CPUs. Perf has to split it into two cycles events, e.g.,
cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask.
If a event is opened on the unsupported CPU. The open fails. That's the
reason of the above error out.
Perf should only open the cycles event on the corresponding CPU. The
commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting
core PMU maps") intersect the requested CPU map with the CPU map of the
PMU. Use the evsel's cpus to replace user_requested_cpus.
The evlist's threads are also propagated to the evsel's threads in
__perf_evlist__propagate_maps(). For a system-wide event, perf appends
a dummy event and assign it to the evsel's threads. For a per-thread
event, the evlist's thread_map is assigned to the evsel's threads. The
same as the other tools, e.g., perf record, using the evsel's threads
when opening an event.
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
Changes since V1:
- Update the description
- Add Reviewed-by from Ian
tools/perf/builtin-top.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index ea8c7eca5eee..cce9350177e2 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top)
evlist__for_each_entry(evlist, counter) {
try_again:
- if (evsel__open(counter, top->evlist->core.user_requested_cpus,
- top->evlist->core.threads) < 0) {
+ if (evsel__open(counter, counter->core.cpus,
+ counter->core.threads) < 0) {
/*
* Specially handle overwrite fall back.
--
2.35.1
^ permalink raw reply related [flat|nested] 9+ messages in thread* Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus 2023-12-12 19:38 [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus kan.liang @ 2023-12-12 20:37 ` Ian Rogers 2023-12-12 21:25 ` Liang, Kan 0 siblings, 1 reply; 9+ messages in thread From: Ian Rogers @ 2023-12-12 20:37 UTC (permalink / raw) To: kan.liang Cc: acme, namhyung, mark.rutland, maz, marcan, linux-kernel, linux-perf-users On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: > > From: Kan Liang <kan.liang@linux.intel.com> > > perf top errors out on a hybrid machine > $perf top > > Error: > The cycles:P event is not supported. > > The perf top expects that the "cycles" is collected on all CPUs in the > system. But for hybrid there is no single "cycles" event which can cover > all CPUs. Perf has to split it into two cycles events, e.g., > cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. > If a event is opened on the unsupported CPU. The open fails. That's the > reason of the above error out. > > Perf should only open the cycles event on the corresponding CPU. The > commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting > core PMU maps") intersect the requested CPU map with the CPU map of the > PMU. Use the evsel's cpus to replace user_requested_cpus. > > The evlist's threads are also propagated to the evsel's threads in > __perf_evlist__propagate_maps(). For a system-wide event, perf appends > a dummy event and assign it to the evsel's threads. For a per-thread > event, the evlist's thread_map is assigned to the evsel's threads. The > same as the other tools, e.g., perf record, using the evsel's threads > when opening an event. > > Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> > Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ > Reviewed-by: Ian Rogers <irogers@google.com> > Signed-off-by: Kan Liang <kan.liang@linux.intel.com> > --- > > Changes since V1: > - Update the description > - Add Reviewed-by from Ian Thanks Kan, quick question. Does "perf top" on hybrid ask the user to select between the cycles event on cpu_atom and cpu_core? I'm wondering if there is some kind of missing "hybrid-merge" functionality like we have for perf stat. Thanks, Ian > tools/perf/builtin-top.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c > index ea8c7eca5eee..cce9350177e2 100644 > --- a/tools/perf/builtin-top.c > +++ b/tools/perf/builtin-top.c > @@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top) > > evlist__for_each_entry(evlist, counter) { > try_again: > - if (evsel__open(counter, top->evlist->core.user_requested_cpus, > - top->evlist->core.threads) < 0) { > + if (evsel__open(counter, counter->core.cpus, > + counter->core.threads) < 0) { > > /* > * Specially handle overwrite fall back. > -- > 2.35.1 > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus 2023-12-12 20:37 ` Ian Rogers @ 2023-12-12 21:25 ` Liang, Kan 2023-12-12 22:12 ` Ian Rogers 0 siblings, 1 reply; 9+ messages in thread From: Liang, Kan @ 2023-12-12 21:25 UTC (permalink / raw) To: Ian Rogers Cc: acme, namhyung, mark.rutland, maz, marcan, linux-kernel, linux-perf-users On 2023-12-12 3:37 p.m., Ian Rogers wrote: > On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: >> >> From: Kan Liang <kan.liang@linux.intel.com> >> >> perf top errors out on a hybrid machine >> $perf top >> >> Error: >> The cycles:P event is not supported. >> >> The perf top expects that the "cycles" is collected on all CPUs in the >> system. But for hybrid there is no single "cycles" event which can cover >> all CPUs. Perf has to split it into two cycles events, e.g., >> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. >> If a event is opened on the unsupported CPU. The open fails. That's the >> reason of the above error out. >> >> Perf should only open the cycles event on the corresponding CPU. The >> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting >> core PMU maps") intersect the requested CPU map with the CPU map of the >> PMU. Use the evsel's cpus to replace user_requested_cpus. >> >> The evlist's threads are also propagated to the evsel's threads in >> __perf_evlist__propagate_maps(). For a system-wide event, perf appends >> a dummy event and assign it to the evsel's threads. For a per-thread >> event, the evlist's thread_map is assigned to the evsel's threads. The >> same as the other tools, e.g., perf record, using the evsel's threads >> when opening an event. >> >> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> >> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ >> Reviewed-by: Ian Rogers <irogers@google.com> >> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> >> --- >> >> Changes since V1: >> - Update the description >> - Add Reviewed-by from Ian > > Thanks Kan, quick question. Does "perf top" on hybrid ask the user to > select between the cycles event on cpu_atom and cpu_core? Yes, but the event doesn't include the PMU information. We probably need a follow up patch to append the PMU name. Available samples 385 cycles:P 903 cycles:P Thanks, Kan > I'm > wondering if there is some kind of missing "hybrid-merge" > functionality like we have for perf stat. > > Thanks, > Ian > >> tools/perf/builtin-top.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c >> index ea8c7eca5eee..cce9350177e2 100644 >> --- a/tools/perf/builtin-top.c >> +++ b/tools/perf/builtin-top.c >> @@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top) >> >> evlist__for_each_entry(evlist, counter) { >> try_again: >> - if (evsel__open(counter, top->evlist->core.user_requested_cpus, >> - top->evlist->core.threads) < 0) { >> + if (evsel__open(counter, counter->core.cpus, >> + counter->core.threads) < 0) { >> >> /* >> * Specially handle overwrite fall back. >> -- >> 2.35.1 >> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus 2023-12-12 21:25 ` Liang, Kan @ 2023-12-12 22:12 ` Ian Rogers 2023-12-13 1:06 ` Namhyung Kim 0 siblings, 1 reply; 9+ messages in thread From: Ian Rogers @ 2023-12-12 22:12 UTC (permalink / raw) To: Liang, Kan Cc: acme, namhyung, mark.rutland, maz, marcan, linux-kernel, linux-perf-users On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: > > > > On 2023-12-12 3:37 p.m., Ian Rogers wrote: > > On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: > >> > >> From: Kan Liang <kan.liang@linux.intel.com> > >> > >> perf top errors out on a hybrid machine > >> $perf top > >> > >> Error: > >> The cycles:P event is not supported. > >> > >> The perf top expects that the "cycles" is collected on all CPUs in the > >> system. But for hybrid there is no single "cycles" event which can cover > >> all CPUs. Perf has to split it into two cycles events, e.g., > >> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. > >> If a event is opened on the unsupported CPU. The open fails. That's the > >> reason of the above error out. > >> > >> Perf should only open the cycles event on the corresponding CPU. The > >> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting > >> core PMU maps") intersect the requested CPU map with the CPU map of the > >> PMU. Use the evsel's cpus to replace user_requested_cpus. > >> > >> The evlist's threads are also propagated to the evsel's threads in > >> __perf_evlist__propagate_maps(). For a system-wide event, perf appends > >> a dummy event and assign it to the evsel's threads. For a per-thread > >> event, the evlist's thread_map is assigned to the evsel's threads. The > >> same as the other tools, e.g., perf record, using the evsel's threads > >> when opening an event. > >> > >> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> > >> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ > >> Reviewed-by: Ian Rogers <irogers@google.com> > >> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> > >> --- > >> > >> Changes since V1: > >> - Update the description > >> - Add Reviewed-by from Ian > > > > Thanks Kan, quick question. Does "perf top" on hybrid ask the user to > > select between the cycles event on cpu_atom and cpu_core? > > Yes, but the event doesn't include the PMU information. > We probably need a follow up patch to append the PMU name. > > Available samples > 385 cycles:P > > 903 cycles:P Thanks and agreed, it isn't possible to tell which is which PMU/CPU type at the moment. I tried the patch with perf top --stdio, there wasn't a choice of event and I can't tell what counter is being displayed. When I quit I also see: ``` exiting. corrupted double-linked list Aborted (core dumped) ``` but I wasn't able to repro this on a debuggable binary/system. If my memory serves there was a patch where perf top was showing >1 event. It would be nice here to do some kind of hybrid merging rather than having to view each PMU's top separately. Thanks, Ian > Thanks, > Kan > > > I'm > > wondering if there is some kind of missing "hybrid-merge" > > functionality like we have for perf stat. > > > > Thanks, > > Ian > > > >> tools/perf/builtin-top.c | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c > >> index ea8c7eca5eee..cce9350177e2 100644 > >> --- a/tools/perf/builtin-top.c > >> +++ b/tools/perf/builtin-top.c > >> @@ -1027,8 +1027,8 @@ static int perf_top__start_counters(struct perf_top *top) > >> > >> evlist__for_each_entry(evlist, counter) { > >> try_again: > >> - if (evsel__open(counter, top->evlist->core.user_requested_cpus, > >> - top->evlist->core.threads) < 0) { > >> + if (evsel__open(counter, counter->core.cpus, > >> + counter->core.threads) < 0) { > >> > >> /* > >> * Specially handle overwrite fall back. > >> -- > >> 2.35.1 > >> ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus 2023-12-12 22:12 ` Ian Rogers @ 2023-12-13 1:06 ` Namhyung Kim 2023-12-13 15:45 ` Liang, Kan 0 siblings, 1 reply; 9+ messages in thread From: Namhyung Kim @ 2023-12-13 1:06 UTC (permalink / raw) To: Ian Rogers Cc: Liang, Kan, acme, mark.rutland, maz, marcan, linux-kernel, linux-perf-users On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: > > On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: > > > > > > > > On 2023-12-12 3:37 p.m., Ian Rogers wrote: > > > On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: > > >> > > >> From: Kan Liang <kan.liang@linux.intel.com> > > >> > > >> perf top errors out on a hybrid machine > > >> $perf top > > >> > > >> Error: > > >> The cycles:P event is not supported. > > >> > > >> The perf top expects that the "cycles" is collected on all CPUs in the > > >> system. But for hybrid there is no single "cycles" event which can cover > > >> all CPUs. Perf has to split it into two cycles events, e.g., > > >> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. > > >> If a event is opened on the unsupported CPU. The open fails. That's the > > >> reason of the above error out. > > >> > > >> Perf should only open the cycles event on the corresponding CPU. The > > >> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting > > >> core PMU maps") intersect the requested CPU map with the CPU map of the > > >> PMU. Use the evsel's cpus to replace user_requested_cpus. > > >> > > >> The evlist's threads are also propagated to the evsel's threads in > > >> __perf_evlist__propagate_maps(). For a system-wide event, perf appends > > >> a dummy event and assign it to the evsel's threads. For a per-thread > > >> event, the evlist's thread_map is assigned to the evsel's threads. The > > >> same as the other tools, e.g., perf record, using the evsel's threads > > >> when opening an event. > > >> > > >> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> > > >> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ > > >> Reviewed-by: Ian Rogers <irogers@google.com> > > >> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> > > >> --- > > >> > > >> Changes since V1: > > >> - Update the description > > >> - Add Reviewed-by from Ian > > > > > > Thanks Kan, quick question. Does "perf top" on hybrid ask the user to > > > select between the cycles event on cpu_atom and cpu_core? > > > > Yes, but the event doesn't include the PMU information. > > We probably need a follow up patch to append the PMU name. > > > > Available samples > > 385 cycles:P > > > > 903 cycles:P > > Thanks and agreed, it isn't possible to tell which is which PMU/CPU > type at the moment. I tried the patch with perf top --stdio, there > wasn't a choice of event and I can't tell what counter is being > displayed. When I quit I also see: > ``` > exiting. > corrupted double-linked list > Aborted (core dumped) > ``` > but I wasn't able to repro this on a debuggable binary/system. > > If my memory serves there was a patch where perf top was showing >1 > event. It would be nice here to do some kind of hybrid merging rather > than having to view each PMU's top separately. Using event groups, but I noticed you removed the --group option. Maybe perf top can just use `{ ... }` notation for explicit grouping, but it might be implicit like in the hybrid case. Thanks, Namhyung ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus 2023-12-13 1:06 ` Namhyung Kim @ 2023-12-13 15:45 ` Liang, Kan 2023-12-13 17:42 ` Ian Rogers 0 siblings, 1 reply; 9+ messages in thread From: Liang, Kan @ 2023-12-13 15:45 UTC (permalink / raw) To: Namhyung Kim, Ian Rogers Cc: acme, mark.rutland, maz, marcan, linux-kernel, linux-perf-users On 2023-12-12 8:06 p.m., Namhyung Kim wrote: > On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: >> >> On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: >>> >>> >>> >>> On 2023-12-12 3:37 p.m., Ian Rogers wrote: >>>> On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: >>>>> >>>>> From: Kan Liang <kan.liang@linux.intel.com> >>>>> >>>>> perf top errors out on a hybrid machine >>>>> $perf top >>>>> >>>>> Error: >>>>> The cycles:P event is not supported. >>>>> >>>>> The perf top expects that the "cycles" is collected on all CPUs in the >>>>> system. But for hybrid there is no single "cycles" event which can cover >>>>> all CPUs. Perf has to split it into two cycles events, e.g., >>>>> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. >>>>> If a event is opened on the unsupported CPU. The open fails. That's the >>>>> reason of the above error out. >>>>> >>>>> Perf should only open the cycles event on the corresponding CPU. The >>>>> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting >>>>> core PMU maps") intersect the requested CPU map with the CPU map of the >>>>> PMU. Use the evsel's cpus to replace user_requested_cpus. >>>>> >>>>> The evlist's threads are also propagated to the evsel's threads in >>>>> __perf_evlist__propagate_maps(). For a system-wide event, perf appends >>>>> a dummy event and assign it to the evsel's threads. For a per-thread >>>>> event, the evlist's thread_map is assigned to the evsel's threads. The >>>>> same as the other tools, e.g., perf record, using the evsel's threads >>>>> when opening an event. >>>>> >>>>> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> >>>>> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ >>>>> Reviewed-by: Ian Rogers <irogers@google.com> >>>>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> >>>>> --- >>>>> >>>>> Changes since V1: >>>>> - Update the description >>>>> - Add Reviewed-by from Ian >>>> >>>> Thanks Kan, quick question. Does "perf top" on hybrid ask the user to >>>> select between the cycles event on cpu_atom and cpu_core? >>> >>> Yes, but the event doesn't include the PMU information. >>> We probably need a follow up patch to append the PMU name. >>> >>> Available samples >>> 385 cycles:P >>> >>> 903 cycles:P >> >> Thanks and agreed, it isn't possible to tell which is which PMU/CPU >> type at the moment. I tried the patch with perf top --stdio, there >> wasn't a choice of event The perf top --stdio uses a dedicated display function, see perf_top__header_snprintf() in util/top.c It only shows one event at a time. "E" is used to switch the event. >> and I can't tell what counter is being >> displayed. For the hybrid case, I think we may display both PMU name and event name. For example, Available samples 656 cycles:P cpu_atom 701 cycles:P cpu_core diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c index f4812b226818..afc7a1d54fe4 100644 --- a/tools/perf/ui/browsers/hists.c +++ b/tools/perf/ui/browsers/hists.c @@ -3433,8 +3433,10 @@ static void perf_evsel_menu__write(struct ui_browser *browser, } nr_events = convert_unit(nr_events, &unit); - printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s", nr_events, - unit, unit == ' ' ? "" : " ", ev_name); + printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s %s", nr_events, + unit, unit == ' ' ? "" : " ", ev_name, + evsel->pmu ? evsel->pmu_name : ""); + ui_browser__printf(browser, "%s", bf); nr_events = evsel->evlist->stats.nr_events[PERF_RECORD_LOST]; >> When I quit I also see: >> ``` >> exiting. >> corrupted double-linked list >> Aborted (core dumped) >> ``` >> but I wasn't able to repro this on a debuggable binary/system. I haven't see the issue yet. >> >> If my memory serves there was a patch where perf top was showing >1 >> event. It would be nice here to do some kind of hybrid merging rather >> than having to view each PMU's top separately. The current perf top doesn't merge when there are >1 event. sudo ./perf top -e "cycles,instructions" Available samples 2K cycles 2K instructions For now, I think we may just append a PMU name to distinguish the hybrid case. We may implement the merging feature which impacts both hybrid and non-hybrid cases later separately. > > Using event groups, but I noticed you removed the --group option. > Maybe perf top can just use `{ ... }` notation for explicit grouping, > but it might be implicit like in the hybrid case. > Yes, if the events are from different PMUs, the perf tool will implicitly de-group the hybrid events. "{ ... }" may not help here. Thanks, Kan ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus 2023-12-13 15:45 ` Liang, Kan @ 2023-12-13 17:42 ` Ian Rogers 2023-12-13 21:11 ` Liang, Kan 2023-12-13 21:54 ` Namhyung Kim 0 siblings, 2 replies; 9+ messages in thread From: Ian Rogers @ 2023-12-13 17:42 UTC (permalink / raw) To: Liang, Kan Cc: Namhyung Kim, acme, mark.rutland, maz, marcan, linux-kernel, linux-perf-users On Wed, Dec 13, 2023 at 7:45 AM Liang, Kan <kan.liang@linux.intel.com> wrote: > > > > On 2023-12-12 8:06 p.m., Namhyung Kim wrote: > > On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: > >> > >> On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: > >>> > >>> > >>> > >>> On 2023-12-12 3:37 p.m., Ian Rogers wrote: > >>>> On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: > >>>>> > >>>>> From: Kan Liang <kan.liang@linux.intel.com> > >>>>> > >>>>> perf top errors out on a hybrid machine > >>>>> $perf top > >>>>> > >>>>> Error: > >>>>> The cycles:P event is not supported. > >>>>> > >>>>> The perf top expects that the "cycles" is collected on all CPUs in the > >>>>> system. But for hybrid there is no single "cycles" event which can cover > >>>>> all CPUs. Perf has to split it into two cycles events, e.g., > >>>>> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. > >>>>> If a event is opened on the unsupported CPU. The open fails. That's the > >>>>> reason of the above error out. > >>>>> > >>>>> Perf should only open the cycles event on the corresponding CPU. The > >>>>> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting > >>>>> core PMU maps") intersect the requested CPU map with the CPU map of the > >>>>> PMU. Use the evsel's cpus to replace user_requested_cpus. > >>>>> > >>>>> The evlist's threads are also propagated to the evsel's threads in > >>>>> __perf_evlist__propagate_maps(). For a system-wide event, perf appends > >>>>> a dummy event and assign it to the evsel's threads. For a per-thread > >>>>> event, the evlist's thread_map is assigned to the evsel's threads. The > >>>>> same as the other tools, e.g., perf record, using the evsel's threads > >>>>> when opening an event. > >>>>> > >>>>> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> > >>>>> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ > >>>>> Reviewed-by: Ian Rogers <irogers@google.com> > >>>>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> > >>>>> --- > >>>>> > >>>>> Changes since V1: > >>>>> - Update the description > >>>>> - Add Reviewed-by from Ian > >>>> > >>>> Thanks Kan, quick question. Does "perf top" on hybrid ask the user to > >>>> select between the cycles event on cpu_atom and cpu_core? > >>> > >>> Yes, but the event doesn't include the PMU information. > >>> We probably need a follow up patch to append the PMU name. > >>> > >>> Available samples > >>> 385 cycles:P > >>> > >>> 903 cycles:P > >> > >> Thanks and agreed, it isn't possible to tell which is which PMU/CPU > >> type at the moment. I tried the patch with perf top --stdio, there > >> wasn't a choice of event > > The perf top --stdio uses a dedicated display function, see > perf_top__header_snprintf() in util/top.c > > It only shows one event at a time. "E" is used to switch the event. > > >> and I can't tell what counter is being > >> displayed. > > For the hybrid case, I think we may display both PMU name and event > name. For example, > > Available samples > 656 cycles:P cpu_atom > > 701 cycles:P cpu_core I think there would be more uniformity if we could do: cpu/cycles/P I'm just reminded of the stat output where sometimes you can get things like: cpu/cycles/ or sometimes: cycles [cpu] It seems the slash style approach is the most uniform given it looks like what is written on the command line. Perhaps we need some kind of helper function to do this as reformatting the modifier is a pain. > diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c > index f4812b226818..afc7a1d54fe4 100644 > --- a/tools/perf/ui/browsers/hists.c > +++ b/tools/perf/ui/browsers/hists.c > @@ -3433,8 +3433,10 @@ static void perf_evsel_menu__write(struct > ui_browser *browser, > } > > nr_events = convert_unit(nr_events, &unit); > - printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s", nr_events, > - unit, unit == ' ' ? "" : " ", ev_name); > + printed = scnprintf(bf, sizeof(bf), "%lu%c%s%s %s", nr_events, > + unit, unit == ' ' ? "" : " ", ev_name, > + evsel->pmu ? evsel->pmu_name : ""); > + > ui_browser__printf(browser, "%s", bf); > > nr_events = evsel->evlist->stats.nr_events[PERF_RECORD_LOST]; > > > >> When I quit I also see: > >> ``` > >> exiting. > >> corrupted double-linked list > >> Aborted (core dumped) > >> ``` > >> but I wasn't able to repro this on a debuggable binary/system. > > I haven't see the issue yet. > > >> > >> If my memory serves there was a patch where perf top was showing >1 > >> event. It would be nice here to do some kind of hybrid merging rather > >> than having to view each PMU's top separately. > > The current perf top doesn't merge when there are >1 event. > sudo ./perf top -e "cycles,instructions" > > Available samples > 2K cycles > > 2K instructions > > For now, I think we may just append a PMU name to distinguish the hybrid > case. > > We may implement the merging feature which impacts both hybrid and > non-hybrid cases later separately. > > > > > Using event groups, but I noticed you removed the --group option. > > Maybe perf top can just use `{ ... }` notation for explicit grouping, > > but it might be implicit like in the hybrid case. > > > > Yes, if the events are from different PMUs, the perf tool will > implicitly de-group the hybrid events. "{ ... }" may not help here. I think grouping may have been the situation where I saw >1 event displayed but I agree with Kan, we implicitly de-group events on different PMUs so it won't help here. Thanks, Ian > Thanks, > Kan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus 2023-12-13 17:42 ` Ian Rogers @ 2023-12-13 21:11 ` Liang, Kan 2023-12-13 21:54 ` Namhyung Kim 1 sibling, 0 replies; 9+ messages in thread From: Liang, Kan @ 2023-12-13 21:11 UTC (permalink / raw) To: Ian Rogers Cc: Namhyung Kim, acme, mark.rutland, maz, marcan, linux-kernel, linux-perf-users On 2023-12-13 12:42 p.m., Ian Rogers wrote: > On Wed, Dec 13, 2023 at 7:45 AM Liang, Kan <kan.liang@linux.intel.com> wrote: >> >> >> On 2023-12-12 8:06 p.m., Namhyung Kim wrote: >>> On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: >>>> On Tue, Dec 12, 2023 at 1:25 PM Liang, Kan <kan.liang@linux.intel.com> wrote: >>>>> >>>>> >>>>> On 2023-12-12 3:37 p.m., Ian Rogers wrote: >>>>>> On Tue, Dec 12, 2023 at 11:39 AM <kan.liang@linux.intel.com> wrote: >>>>>>> From: Kan Liang <kan.liang@linux.intel.com> >>>>>>> >>>>>>> perf top errors out on a hybrid machine >>>>>>> $perf top >>>>>>> >>>>>>> Error: >>>>>>> The cycles:P event is not supported. >>>>>>> >>>>>>> The perf top expects that the "cycles" is collected on all CPUs in the >>>>>>> system. But for hybrid there is no single "cycles" event which can cover >>>>>>> all CPUs. Perf has to split it into two cycles events, e.g., >>>>>>> cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask. >>>>>>> If a event is opened on the unsupported CPU. The open fails. That's the >>>>>>> reason of the above error out. >>>>>>> >>>>>>> Perf should only open the cycles event on the corresponding CPU. The >>>>>>> commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting >>>>>>> core PMU maps") intersect the requested CPU map with the CPU map of the >>>>>>> PMU. Use the evsel's cpus to replace user_requested_cpus. >>>>>>> >>>>>>> The evlist's threads are also propagated to the evsel's threads in >>>>>>> __perf_evlist__propagate_maps(). For a system-wide event, perf appends >>>>>>> a dummy event and assign it to the evsel's threads. For a per-thread >>>>>>> event, the evlist's thread_map is assigned to the evsel's threads. The >>>>>>> same as the other tools, e.g., perf record, using the evsel's threads >>>>>>> when opening an event. >>>>>>> >>>>>>> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> >>>>>>> Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/ >>>>>>> Reviewed-by: Ian Rogers <irogers@google.com> >>>>>>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> >>>>>>> --- >>>>>>> >>>>>>> Changes since V1: >>>>>>> - Update the description >>>>>>> - Add Reviewed-by from Ian >>>>>> Thanks Kan, quick question. Does "perf top" on hybrid ask the user to >>>>>> select between the cycles event on cpu_atom and cpu_core? >>>>> Yes, but the event doesn't include the PMU information. >>>>> We probably need a follow up patch to append the PMU name. >>>>> >>>>> Available samples >>>>> 385 cycles:P >>>>> >>>>> 903 cycles:P >>>> Thanks and agreed, it isn't possible to tell which is which PMU/CPU >>>> type at the moment. I tried the patch with perf top --stdio, there >>>> wasn't a choice of event >> The perf top --stdio uses a dedicated display function, see >> perf_top__header_snprintf() in util/top.c >> >> It only shows one event at a time. "E" is used to switch the event. >> >>>> and I can't tell what counter is being >>>> displayed. >> For the hybrid case, I think we may display both PMU name and event >> name. For example, >> >> Available samples >> 656 cycles:P cpu_atom >> >> 701 cycles:P cpu_core > I think there would be more uniformity if we could do: > cpu/cycles/P > I'm just reminded of the stat output where sometimes you can get things like: > cpu/cycles/ > or sometimes: > cycles [cpu] > It seems the slash style approach is the most uniform given it looks > like what is written on the command line. Perhaps we need some kind of > helper function to do this as reformatting the modifier is a pain. Actually, we already have a helper in the perf record, record__uniquify_name(). I can move it to the util/record.c and let's perf top invoke it before the open as well. Then we can get this in the perf top. Available samples 148 cpu_atom/cycles:P/ 1K cpu_core/cycles:P/ It should be good enough to distinguish the events. I will send a patch to support it. Thanks, Kan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus 2023-12-13 17:42 ` Ian Rogers 2023-12-13 21:11 ` Liang, Kan @ 2023-12-13 21:54 ` Namhyung Kim 1 sibling, 0 replies; 9+ messages in thread From: Namhyung Kim @ 2023-12-13 21:54 UTC (permalink / raw) To: Ian Rogers Cc: Liang, Kan, acme, mark.rutland, maz, marcan, linux-kernel, linux-perf-users On Wed, Dec 13, 2023 at 9:42 AM Ian Rogers <irogers@google.com> wrote: > > On Wed, Dec 13, 2023 at 7:45 AM Liang, Kan <kan.liang@linux.intel.com> wrote: > > > > > > > > On 2023-12-12 8:06 p.m., Namhyung Kim wrote: > > > On Tue, Dec 12, 2023 at 2:12 PM Ian Rogers <irogers@google.com> wrote: > > >> If my memory serves there was a patch where perf top was showing >1 > > >> event. It would be nice here to do some kind of hybrid merging rather > > >> than having to view each PMU's top separately. > > > > The current perf top doesn't merge when there are >1 event. > > sudo ./perf top -e "cycles,instructions" > > > > Available samples > > 2K cycles > > > > 2K instructions > > > > For now, I think we may just append a PMU name to distinguish the hybrid > > case. > > > > We may implement the merging feature which impacts both hybrid and > > non-hybrid cases later separately. > > > > > > > > Using event groups, but I noticed you removed the --group option. > > > Maybe perf top can just use `{ ... }` notation for explicit grouping, > > > but it might be implicit like in the hybrid case. > > > > > > > Yes, if the events are from different PMUs, the perf tool will > > implicitly de-group the hybrid events. "{ ... }" may not help here. > > I think grouping may have been the situation where I saw >1 event > displayed but I agree with Kan, we implicitly de-group events on > different PMUs so it won't help here. The --group option was implemented in perf report which means it doesn't matter if events are in different PMUs. It's just to display the results together. Thanks, Namhyung ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-12-13 21:54 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-12-12 19:38 [PATCH V2] perf top: Use evsel's cpus to replace user_requested_cpus kan.liang 2023-12-12 20:37 ` Ian Rogers 2023-12-12 21:25 ` Liang, Kan 2023-12-12 22:12 ` Ian Rogers 2023-12-13 1:06 ` Namhyung Kim 2023-12-13 15:45 ` Liang, Kan 2023-12-13 17:42 ` Ian Rogers 2023-12-13 21:11 ` Liang, Kan 2023-12-13 21:54 ` Namhyung Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox