* [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) @ 2022-10-03 20:46 Namhyung Kim 2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim ` (4 more replies) 0 siblings, 5 replies; 28+ messages in thread From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan Hello, The system-wide evsel has a cpu map for all (online) cpus regardless of user requested cpus. But the cpu map handling code has some special case for it and I think we can cleanup the code by making sure that such a evsel has a proper cpu/thread maps from the beginning. This patches should not cause any change in the behavior. Changes from v2: * build evlist->core.all_cpus from the beginning (Adrian) Changes from v1: * use evlist->core.needs_map_propagation field * add Reviewed-by from Adrian You can get the code from 'perf/cpumap-update-v3' branch in git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git Thanks, Namhyung Namhyung Kim (5): libperf: Populate system-wide evsel maps libperf: Propagate maps only if necessary perf tools: Get rid of evlist__add_on_all_cpus() perf tools: Add evlist__add_sched_switch() perf tools: Remove special handling of system-wide evsel tools/lib/perf/evlist.c | 26 +++++++------- tools/lib/perf/evsel.c | 3 -- tools/lib/perf/include/internal/evlist.h | 1 + tools/perf/arch/x86/util/intel-pt.c | 15 +++----- tools/perf/builtin-script.c | 3 -- tools/perf/tests/switch-tracking.c | 15 +++----- tools/perf/util/evlist.c | 46 ++++++++++-------------- tools/perf/util/evlist.h | 1 + tools/perf/util/evsel.c | 12 ++----- tools/perf/util/stat.c | 3 -- 10 files changed, 46 insertions(+), 79 deletions(-) base-commit: 62e64c9d2fd12839c02f1b3e8b873e7cb34e8720 -- 2.38.0.rc1.362.ged0d419d3c-goog ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 1/5] libperf: Populate system-wide evsel maps 2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim @ 2022-10-03 20:46 ` Namhyung Kim 2022-10-06 18:43 ` Ian Rogers 2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim ` (3 subsequent siblings) 4 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan Setting proper cpu and thread maps for system wide evsels regardless of user requested cpu in __perf_evlist__propagate_maps(). Those evsels need to be active on all cpus always. Do it in the libperf so that we can guarantee it has proper maps. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/lib/perf/evlist.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index 6b1bafe267a4..187129652ab6 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist, * We already have cpus for evsel (via PMU sysfs) so * keep it, if there's no target cpu list defined. */ - if (!evsel->own_cpus || - (!evsel->system_wide && evlist->has_user_cpus) || - (!evsel->system_wide && - !evsel->requires_cpu && - perf_cpu_map__empty(evlist->user_requested_cpus))) { + if (evsel->system_wide) { + perf_cpu_map__put(evsel->cpus); + evsel->cpus = perf_cpu_map__new(NULL); + } else if (!evsel->own_cpus || evlist->has_user_cpus || + (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) { perf_cpu_map__put(evsel->cpus); evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus); } else if (evsel->cpus != evsel->own_cpus) { @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist, evsel->cpus = perf_cpu_map__get(evsel->own_cpus); } - if (!evsel->system_wide) { + if (evsel->system_wide) { + perf_thread_map__put(evsel->threads); + evsel->threads = perf_thread_map__new_dummy(); + } else { perf_thread_map__put(evsel->threads); evsel->threads = perf_thread_map__get(evlist->threads); } -- 2.38.0.rc1.362.ged0d419d3c-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] libperf: Populate system-wide evsel maps 2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim @ 2022-10-06 18:43 ` Ian Rogers 2022-10-06 23:13 ` Namhyung Kim 0 siblings, 1 reply; 28+ messages in thread From: Ian Rogers @ 2022-10-06 18:43 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > Setting proper cpu and thread maps for system wide evsels regardless of > user requested cpu in __perf_evlist__propagate_maps(). Those evsels > need to be active on all cpus always. Do it in the libperf so that we > can guarantee it has proper maps. > > Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > --- > tools/lib/perf/evlist.c | 15 +++++++++------ > 1 file changed, 9 insertions(+), 6 deletions(-) > > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c > index 6b1bafe267a4..187129652ab6 100644 > --- a/tools/lib/perf/evlist.c > +++ b/tools/lib/perf/evlist.c > @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist, > * We already have cpus for evsel (via PMU sysfs) so > * keep it, if there's no target cpu list defined. > */ This comment is for the 'else if' case and so is a little out of place before the system wide test. > - if (!evsel->own_cpus || > - (!evsel->system_wide && evlist->has_user_cpus) || > - (!evsel->system_wide && > - !evsel->requires_cpu && > - perf_cpu_map__empty(evlist->user_requested_cpus))) { > + if (evsel->system_wide) { > + perf_cpu_map__put(evsel->cpus); > + evsel->cpus = perf_cpu_map__new(NULL); Looking at perf_cpu_map__new, will this mean that in system wide mode every event/evsel will now read /sys/devices/system/cpu/online here? We may want to cache the cpumap to avoid this. Thanks, Ian > + } else if (!evsel->own_cpus || evlist->has_user_cpus || > + (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) { > perf_cpu_map__put(evsel->cpus); > evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus); > } else if (evsel->cpus != evsel->own_cpus) { > @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist, > evsel->cpus = perf_cpu_map__get(evsel->own_cpus); > } > > - if (!evsel->system_wide) { > + if (evsel->system_wide) { > + perf_thread_map__put(evsel->threads); > + evsel->threads = perf_thread_map__new_dummy(); > + } else { > perf_thread_map__put(evsel->threads); > evsel->threads = perf_thread_map__get(evlist->threads); > } > -- > 2.38.0.rc1.362.ged0d419d3c-goog > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/5] libperf: Populate system-wide evsel maps 2022-10-06 18:43 ` Ian Rogers @ 2022-10-06 23:13 ` Namhyung Kim 0 siblings, 0 replies; 28+ messages in thread From: Namhyung Kim @ 2022-10-06 23:13 UTC (permalink / raw) To: Ian Rogers Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan Hi Ian, On Thu, Oct 6, 2022 at 11:44 AM Ian Rogers <irogers@google.com> wrote: > > On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > > > Setting proper cpu and thread maps for system wide evsels regardless of > > user requested cpu in __perf_evlist__propagate_maps(). Those evsels > > need to be active on all cpus always. Do it in the libperf so that we > > can guarantee it has proper maps. > > > > Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > > --- > > tools/lib/perf/evlist.c | 15 +++++++++------ > > 1 file changed, 9 insertions(+), 6 deletions(-) > > > > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c > > index 6b1bafe267a4..187129652ab6 100644 > > --- a/tools/lib/perf/evlist.c > > +++ b/tools/lib/perf/evlist.c > > @@ -40,11 +40,11 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist, > > * We already have cpus for evsel (via PMU sysfs) so > > * keep it, if there's no target cpu list defined. > > */ > > This comment is for the 'else if' case and so is a little out of place > before the system wide test. Right, I'll add a comment for system-wide evsels too. > > > - if (!evsel->own_cpus || > > - (!evsel->system_wide && evlist->has_user_cpus) || > > - (!evsel->system_wide && > > - !evsel->requires_cpu && > > - perf_cpu_map__empty(evlist->user_requested_cpus))) { > > + if (evsel->system_wide) { > > + perf_cpu_map__put(evsel->cpus); > > + evsel->cpus = perf_cpu_map__new(NULL); > > Looking at perf_cpu_map__new, will this mean that in system wide mode > every event/evsel will now read /sys/devices/system/cpu/online here? > We may want to cache the cpumap to avoid this. Yeah, it's redundant. I thought it's ok since the system-wide evsels are not common. We can consider caching when it becomes a problem. Thanks, Namhyung > > > + } else if (!evsel->own_cpus || evlist->has_user_cpus || > > + (!evsel->requires_cpu && perf_cpu_map__empty(evlist->user_requested_cpus))) { > > perf_cpu_map__put(evsel->cpus); > > evsel->cpus = perf_cpu_map__get(evlist->user_requested_cpus); > > } else if (evsel->cpus != evsel->own_cpus) { > > @@ -52,7 +52,10 @@ static void __perf_evlist__propagate_maps(struct perf_evlist *evlist, > > evsel->cpus = perf_cpu_map__get(evsel->own_cpus); > > } > > > > - if (!evsel->system_wide) { > > + if (evsel->system_wide) { > > + perf_thread_map__put(evsel->threads); > > + evsel->threads = perf_thread_map__new_dummy(); > > + } else { > > perf_thread_map__put(evsel->threads); > > evsel->threads = perf_thread_map__get(evlist->threads); > > } > > -- > > 2.38.0.rc1.362.ged0d419d3c-goog > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 2/5] libperf: Propagate maps only if necessary 2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim 2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim @ 2022-10-03 20:46 ` Namhyung Kim 2022-10-04 12:14 ` Arnaldo Carvalho de Melo 2022-10-06 18:52 ` Ian Rogers 2022-10-03 20:46 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim ` (2 subsequent siblings) 4 siblings, 2 replies; 28+ messages in thread From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan The current code propagate evsel's cpu map settings to evlist when it's added to an evlist. But the evlist->all_cpus and each evsel's cpus will be updated in perf_evlist__set_maps() later. No need to do it before evlist's cpus are set actually. In fact it discards this intermediate all_cpus maps at the beginning of perf_evlist__set_maps(). Let's not do this. It's only needed when an evsel is added after the evlist cpu/thread maps are set. Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/lib/perf/evlist.c | 11 ++++------- tools/lib/perf/include/internal/evlist.h | 1 + 2 files changed, 5 insertions(+), 7 deletions(-) diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index 187129652ab6..8ce92070086c 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) { struct perf_evsel *evsel; - /* Recomputing all_cpus, so start with a blank slate. */ - perf_cpu_map__put(evlist->all_cpus); - evlist->all_cpus = NULL; + evlist->needs_map_propagation = true; perf_evlist__for_each_evsel(evlist, evsel) __perf_evlist__propagate_maps(evlist, evsel); @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist, evsel->idx = evlist->nr_entries; list_add_tail(&evsel->node, &evlist->entries); evlist->nr_entries += 1; - __perf_evlist__propagate_maps(evlist, evsel); + + if (evlist->needs_map_propagation) + __perf_evlist__propagate_maps(evlist, evsel); } void perf_evlist__remove(struct perf_evlist *evlist, @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist, evlist->threads = perf_thread_map__get(threads); } - if (!evlist->all_cpus && cpus) - evlist->all_cpus = perf_cpu_map__get(cpus); - perf_evlist__propagate_maps(evlist); } diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h index 6f89aec3e608..850f07070036 100644 --- a/tools/lib/perf/include/internal/evlist.h +++ b/tools/lib/perf/include/internal/evlist.h @@ -19,6 +19,7 @@ struct perf_evlist { int nr_entries; int nr_groups; bool has_user_cpus; + bool needs_map_propagation; /** * The cpus passed from the command line or all online CPUs by * default. -- 2.38.0.rc1.362.ged0d419d3c-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim @ 2022-10-04 12:14 ` Arnaldo Carvalho de Melo 2022-10-04 13:55 ` Adrian Hunter 2022-10-06 18:52 ` Ian Rogers 1 sibling, 1 reply; 28+ messages in thread From: Arnaldo Carvalho de Melo @ 2022-10-04 12:14 UTC (permalink / raw) To: Namhyung Kim Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan Em Mon, Oct 03, 2022 at 01:46:44PM -0700, Namhyung Kim escreveu: > The current code propagate evsel's cpu map settings to evlist when it's > added to an evlist. But the evlist->all_cpus and each evsel's cpus will > be updated in perf_evlist__set_maps() later. No need to do it before > evlist's cpus are set actually. > > In fact it discards this intermediate all_cpus maps at the beginning > of perf_evlist__set_maps(). Let's not do this. It's only needed when > an evsel is added after the evlist cpu/thread maps are set. > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > --- > tools/lib/perf/evlist.c | 11 ++++------- > tools/lib/perf/include/internal/evlist.h | 1 + > 2 files changed, 5 insertions(+), 7 deletions(-) > > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c > index 187129652ab6..8ce92070086c 100644 > --- a/tools/lib/perf/evlist.c > +++ b/tools/lib/perf/evlist.c > @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) > { > struct perf_evsel *evsel; > > - /* Recomputing all_cpus, so start with a blank slate. */ > - perf_cpu_map__put(evlist->all_cpus); > - evlist->all_cpus = NULL; > + evlist->needs_map_propagation = true; > > perf_evlist__for_each_evsel(evlist, evsel) > __perf_evlist__propagate_maps(evlist, evsel); > @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist, > evsel->idx = evlist->nr_entries; > list_add_tail(&evsel->node, &evlist->entries); > evlist->nr_entries += 1; > - __perf_evlist__propagate_maps(evlist, evsel); > + > + if (evlist->needs_map_propagation) > + __perf_evlist__propagate_maps(evlist, evsel); > } > > void perf_evlist__remove(struct perf_evlist *evlist, > @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist, > evlist->threads = perf_thread_map__get(threads); > } > > - if (!evlist->all_cpus && cpus) > - evlist->all_cpus = perf_cpu_map__get(cpus); > - > perf_evlist__propagate_maps(evlist); IIRC Adrian suggested this extra thing, but he provided the Reviewed-by for the previous patch, where this isn't present, Adrian, can you please confirm this and if this is the case provide your Reviewed-by for this new version? Thanks, - Arnaldo > } > > diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h > index 6f89aec3e608..850f07070036 100644 > --- a/tools/lib/perf/include/internal/evlist.h > +++ b/tools/lib/perf/include/internal/evlist.h > @@ -19,6 +19,7 @@ struct perf_evlist { > int nr_entries; > int nr_groups; > bool has_user_cpus; > + bool needs_map_propagation; > /** > * The cpus passed from the command line or all online CPUs by > * default. > -- > 2.38.0.rc1.362.ged0d419d3c-goog -- - Arnaldo ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-10-04 12:14 ` Arnaldo Carvalho de Melo @ 2022-10-04 13:55 ` Adrian Hunter 0 siblings, 0 replies; 28+ messages in thread From: Adrian Hunter @ 2022-10-04 13:55 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Namhyung Kim Cc: Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan On 4/10/22 15:14, Arnaldo Carvalho de Melo wrote: > Em Mon, Oct 03, 2022 at 01:46:44PM -0700, Namhyung Kim escreveu: >> The current code propagate evsel's cpu map settings to evlist when it's >> added to an evlist. But the evlist->all_cpus and each evsel's cpus will >> be updated in perf_evlist__set_maps() later. No need to do it before >> evlist's cpus are set actually. >> >> In fact it discards this intermediate all_cpus maps at the beginning >> of perf_evlist__set_maps(). Let's not do this. It's only needed when >> an evsel is added after the evlist cpu/thread maps are set. >> >> Signed-off-by: Namhyung Kim <namhyung@kernel.org> >> --- >> tools/lib/perf/evlist.c | 11 ++++------- >> tools/lib/perf/include/internal/evlist.h | 1 + >> 2 files changed, 5 insertions(+), 7 deletions(-) >> >> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c >> index 187129652ab6..8ce92070086c 100644 >> --- a/tools/lib/perf/evlist.c >> +++ b/tools/lib/perf/evlist.c >> @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) >> { >> struct perf_evsel *evsel; >> >> - /* Recomputing all_cpus, so start with a blank slate. */ >> - perf_cpu_map__put(evlist->all_cpus); >> - evlist->all_cpus = NULL; >> + evlist->needs_map_propagation = true; >> >> perf_evlist__for_each_evsel(evlist, evsel) >> __perf_evlist__propagate_maps(evlist, evsel); >> @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist, >> evsel->idx = evlist->nr_entries; >> list_add_tail(&evsel->node, &evlist->entries); >> evlist->nr_entries += 1; >> - __perf_evlist__propagate_maps(evlist, evsel); >> + >> + if (evlist->needs_map_propagation) >> + __perf_evlist__propagate_maps(evlist, evsel); >> } >> >> void perf_evlist__remove(struct perf_evlist *evlist, >> @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist, >> evlist->threads = perf_thread_map__get(threads); >> } >> >> - if (!evlist->all_cpus && cpus) >> - evlist->all_cpus = perf_cpu_map__get(cpus); >> - >> perf_evlist__propagate_maps(evlist); > > IIRC Adrian suggested this extra thing, but he provided the Reviewed-by > for the previous patch, where this isn't present, Adrian, can you please > confirm this and if this is the case provide your Reviewed-by for this > new version? Oops sorry, I meant this one: Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> > > Thanks, > > - Arnaldo > >> } >> >> diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h >> index 6f89aec3e608..850f07070036 100644 >> --- a/tools/lib/perf/include/internal/evlist.h >> +++ b/tools/lib/perf/include/internal/evlist.h >> @@ -19,6 +19,7 @@ struct perf_evlist { >> int nr_entries; >> int nr_groups; >> bool has_user_cpus; >> + bool needs_map_propagation; >> /** >> * The cpus passed from the command line or all online CPUs by >> * default. >> -- >> 2.38.0.rc1.362.ged0d419d3c-goog > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim 2022-10-04 12:14 ` Arnaldo Carvalho de Melo @ 2022-10-06 18:52 ` Ian Rogers 2022-10-06 23:21 ` Namhyung Kim 1 sibling, 1 reply; 28+ messages in thread From: Ian Rogers @ 2022-10-06 18:52 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > The current code propagate evsel's cpu map settings to evlist when it's > added to an evlist. But the evlist->all_cpus and each evsel's cpus will > be updated in perf_evlist__set_maps() later. No need to do it before > evlist's cpus are set actually. > > In fact it discards this intermediate all_cpus maps at the beginning > of perf_evlist__set_maps(). Let's not do this. It's only needed when > an evsel is added after the evlist cpu/thread maps are set. > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > --- > tools/lib/perf/evlist.c | 11 ++++------- > tools/lib/perf/include/internal/evlist.h | 1 + > 2 files changed, 5 insertions(+), 7 deletions(-) > > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c > index 187129652ab6..8ce92070086c 100644 > --- a/tools/lib/perf/evlist.c > +++ b/tools/lib/perf/evlist.c > @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) > { > struct perf_evsel *evsel; > > - /* Recomputing all_cpus, so start with a blank slate. */ > - perf_cpu_map__put(evlist->all_cpus); > - evlist->all_cpus = NULL; > + evlist->needs_map_propagation = true; Might be nice to also clear this in perf_evlist__init. > > perf_evlist__for_each_evsel(evlist, evsel) > __perf_evlist__propagate_maps(evlist, evsel); > @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist, > evsel->idx = evlist->nr_entries; > list_add_tail(&evsel->node, &evlist->entries); > evlist->nr_entries += 1; > - __perf_evlist__propagate_maps(evlist, evsel); > + > + if (evlist->needs_map_propagation) > + __perf_evlist__propagate_maps(evlist, evsel); I think a comment here would be useful. Something like: Adding events won't set the CPU maps in the evlist until set_maps/propogate_maps is called. Catch the case that an evsel is added after this and propagate the map. Thanks, Ian > } > > void perf_evlist__remove(struct perf_evlist *evlist, > @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist, > evlist->threads = perf_thread_map__get(threads); > } > > - if (!evlist->all_cpus && cpus) > - evlist->all_cpus = perf_cpu_map__get(cpus); > - > perf_evlist__propagate_maps(evlist); > } > > diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h > index 6f89aec3e608..850f07070036 100644 > --- a/tools/lib/perf/include/internal/evlist.h > +++ b/tools/lib/perf/include/internal/evlist.h > @@ -19,6 +19,7 @@ struct perf_evlist { > int nr_entries; > int nr_groups; > bool has_user_cpus; > + bool needs_map_propagation; > /** > * The cpus passed from the command line or all online CPUs by > * default. > -- > 2.38.0.rc1.362.ged0d419d3c-goog > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-10-06 18:52 ` Ian Rogers @ 2022-10-06 23:21 ` Namhyung Kim 0 siblings, 0 replies; 28+ messages in thread From: Namhyung Kim @ 2022-10-06 23:21 UTC (permalink / raw) To: Ian Rogers Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan On Thu, Oct 6, 2022 at 11:52 AM Ian Rogers <irogers@google.com> wrote: > > On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > > > The current code propagate evsel's cpu map settings to evlist when it's > > added to an evlist. But the evlist->all_cpus and each evsel's cpus will > > be updated in perf_evlist__set_maps() later. No need to do it before > > evlist's cpus are set actually. > > > > In fact it discards this intermediate all_cpus maps at the beginning > > of perf_evlist__set_maps(). Let's not do this. It's only needed when > > an evsel is added after the evlist cpu/thread maps are set. > > > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > > --- > > tools/lib/perf/evlist.c | 11 ++++------- > > tools/lib/perf/include/internal/evlist.h | 1 + > > 2 files changed, 5 insertions(+), 7 deletions(-) > > > > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c > > index 187129652ab6..8ce92070086c 100644 > > --- a/tools/lib/perf/evlist.c > > +++ b/tools/lib/perf/evlist.c > > @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) > > { > > struct perf_evsel *evsel; > > > > - /* Recomputing all_cpus, so start with a blank slate. */ > > - perf_cpu_map__put(evlist->all_cpus); > > - evlist->all_cpus = NULL; > > + evlist->needs_map_propagation = true; > > Might be nice to also clear this in perf_evlist__init. It's zero-initialized so I skipped it. I couldn't find places where it resets the existing evlist. But now I think we should not call perf_evlist__set_maps() in evlist__init() from perf tools. > > > > > perf_evlist__for_each_evsel(evlist, evsel) > > __perf_evlist__propagate_maps(evlist, evsel); > > @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist, > > evsel->idx = evlist->nr_entries; > > list_add_tail(&evsel->node, &evlist->entries); > > evlist->nr_entries += 1; > > - __perf_evlist__propagate_maps(evlist, evsel); > > + > > + if (evlist->needs_map_propagation) > > + __perf_evlist__propagate_maps(evlist, evsel); > > I think a comment here would be useful. Something like: > Adding events won't set the CPU maps in the evlist until > set_maps/propogate_maps is called. Catch the case that an evsel is > added after this and propagate the map. Looks good, will add! Thanks, Namhyung > > } > > > > void perf_evlist__remove(struct perf_evlist *evlist, > > @@ -177,9 +177,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist, > > evlist->threads = perf_thread_map__get(threads); > > } > > > > - if (!evlist->all_cpus && cpus) > > - evlist->all_cpus = perf_cpu_map__get(cpus); > > - > > perf_evlist__propagate_maps(evlist); > > } > > > > diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h > > index 6f89aec3e608..850f07070036 100644 > > --- a/tools/lib/perf/include/internal/evlist.h > > +++ b/tools/lib/perf/include/internal/evlist.h > > @@ -19,6 +19,7 @@ struct perf_evlist { > > int nr_entries; > > int nr_groups; > > bool has_user_cpus; > > + bool needs_map_propagation; > > /** > > * The cpus passed from the command line or all online CPUs by > > * default. > > -- > > 2.38.0.rc1.362.ged0d419d3c-goog > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() 2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim 2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim 2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim @ 2022-10-03 20:46 ` Namhyung Kim 2022-10-03 20:46 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim 2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim 4 siblings, 0 replies; 28+ messages in thread From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan The cpu and thread maps are properly handled in libperf now. No need to do it in the perf tools anymore. Let's remove the logic. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/util/evlist.c | 29 ++--------------------------- 1 file changed, 2 insertions(+), 27 deletions(-) diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index fcfe5bcc0bcf..dcf57b271ff1 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -268,28 +268,6 @@ int evlist__add_dummy(struct evlist *evlist) return 0; } -static void evlist__add_on_all_cpus(struct evlist *evlist, struct evsel *evsel) -{ - evsel->core.system_wide = true; - - /* - * All CPUs. - * - * Note perf_event_open() does not accept CPUs that are not online, so - * in fact this CPU list will include only all online CPUs. - */ - perf_cpu_map__put(evsel->core.own_cpus); - evsel->core.own_cpus = perf_cpu_map__new(NULL); - perf_cpu_map__put(evsel->core.cpus); - evsel->core.cpus = perf_cpu_map__get(evsel->core.own_cpus); - - /* No threads */ - perf_thread_map__put(evsel->core.threads); - evsel->core.threads = perf_thread_map__new_dummy(); - - evlist__add(evlist, evsel); -} - struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide) { struct evsel *evsel = evlist__dummy_event(evlist); @@ -302,14 +280,11 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide) evsel->core.attr.exclude_hv = 1; evsel->core.attr.freq = 0; evsel->core.attr.sample_period = 1; + evsel->core.system_wide = system_wide; evsel->no_aux_samples = true; evsel->name = strdup("dummy:u"); - if (system_wide) - evlist__add_on_all_cpus(evlist, evsel); - else - evlist__add(evlist, evsel); - + evlist__add(evlist, evsel); return evsel; } -- 2.38.0.rc1.362.ged0d419d3c-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 4/5] perf tools: Add evlist__add_sched_switch() 2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim ` (2 preceding siblings ...) 2022-10-03 20:46 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim @ 2022-10-03 20:46 ` Namhyung Kim 2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim 4 siblings, 0 replies; 28+ messages in thread From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan Add a help to create a system-wide sched_switch event. One merit is that it sets the system-wide bit before adding it to evlist so that the libperf can handle the cpu and thread maps correctly. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/arch/x86/util/intel-pt.c | 15 +++++---------- tools/perf/tests/switch-tracking.c | 15 +++++---------- tools/perf/util/evlist.c | 17 +++++++++++++++++ tools/perf/util/evlist.h | 1 + 4 files changed, 28 insertions(+), 20 deletions(-) diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c index 13933020a79e..793b35f2221a 100644 --- a/tools/perf/arch/x86/util/intel-pt.c +++ b/tools/perf/arch/x86/util/intel-pt.c @@ -11,6 +11,7 @@ #include <linux/bitops.h> #include <linux/log2.h> #include <linux/zalloc.h> +#include <linux/err.h> #include <cpuid.h> #include "../../../util/session.h" @@ -426,20 +427,14 @@ static int intel_pt_track_switches(struct evlist *evlist) if (!evlist__can_select_event(evlist, sched_switch)) return -EPERM; - err = parse_event(evlist, sched_switch); - if (err) { - pr_debug2("%s: failed to parse %s, error %d\n", + evsel = evlist__add_sched_switch(evlist, true); + if (IS_ERR(evsel)) { + err = PTR_ERR(evsel); + pr_debug2("%s: failed to create %s, error = %d\n", __func__, sched_switch, err); return err; } - evsel = evlist__last(evlist); - - evsel__set_sample_bit(evsel, CPU); - evsel__set_sample_bit(evsel, TIME); - - evsel->core.system_wide = true; - evsel->no_aux_samples = true; evsel->immediate = true; return 0; diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c index 2d46af9ef935..87f565c7f650 100644 --- a/tools/perf/tests/switch-tracking.c +++ b/tools/perf/tests/switch-tracking.c @@ -6,6 +6,7 @@ #include <time.h> #include <stdlib.h> #include <linux/zalloc.h> +#include <linux/err.h> #include <perf/cpumap.h> #include <perf/evlist.h> #include <perf/mmap.h> @@ -398,19 +399,13 @@ static int test__switch_tracking(struct test_suite *test __maybe_unused, int sub goto out; } - err = parse_event(evlist, sched_switch); - if (err) { - pr_debug("Failed to parse event %s\n", sched_switch); + switch_evsel = evlist__add_sched_switch(evlist, true); + if (IS_ERR(switch_evsel)) { + err = PTR_ERR(switch_evsel); + pr_debug("Failed to create event %s\n", sched_switch); goto out_err; } - switch_evsel = evlist__last(evlist); - - evsel__set_sample_bit(switch_evsel, CPU); - evsel__set_sample_bit(switch_evsel, TIME); - - switch_evsel->core.system_wide = true; - switch_evsel->no_aux_samples = true; switch_evsel->immediate = true; /* Test moving an event to the front */ diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index dcf57b271ff1..6612b00949e7 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -288,6 +288,23 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide) return evsel; } +struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide) +{ + struct evsel *evsel = evsel__newtp_idx("sched", "sched_switch", 0); + + if (IS_ERR(evsel)) + return evsel; + + evsel__set_sample_bit(evsel, CPU); + evsel__set_sample_bit(evsel, TIME); + + evsel->core.system_wide = system_wide; + evsel->no_aux_samples = true; + + evlist__add(evlist, evsel); + return evsel; +}; + int evlist__add_attrs(struct evlist *evlist, struct perf_event_attr *attrs, size_t nr_attrs) { struct evsel *evsel, *n; diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h index 9d967fe3953a..16734c6756b3 100644 --- a/tools/perf/util/evlist.h +++ b/tools/perf/util/evlist.h @@ -127,6 +127,7 @@ static inline struct evsel *evlist__add_dummy_on_all_cpus(struct evlist *evlist) { return evlist__add_aux_dummy(evlist, true); } +struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide); int evlist__add_sb_event(struct evlist *evlist, struct perf_event_attr *attr, evsel__sb_cb_t cb, void *data); -- 2.38.0.rc1.362.ged0d419d3c-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH 5/5] perf tools: Remove special handling of system-wide evsel 2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim ` (3 preceding siblings ...) 2022-10-03 20:46 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim @ 2022-10-03 20:46 ` Namhyung Kim 2022-10-06 18:57 ` Ian Rogers 4 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-10-03 20:46 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan For system-wide evsels, the thread map should be dummy - i.e. it has a single entry of -1. But the code guarantees such a thread map, so no need to handle it specially. No functional change intended. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/lib/perf/evsel.c | 3 --- tools/perf/builtin-script.c | 3 --- tools/perf/util/evsel.c | 12 ++---------- tools/perf/util/stat.c | 3 --- 4 files changed, 2 insertions(+), 19 deletions(-) diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c index 8ce5bbd09666..8b51b008a81f 100644 --- a/tools/lib/perf/evsel.c +++ b/tools/lib/perf/evsel.c @@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads) if (ncpus == 0 || nthreads == 0) return 0; - if (evsel->system_wide) - nthreads = 1; - evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id)); if (evsel->sample_id == NULL) return -ENOMEM; diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index 886f53cfa257..7fa467ed91dc 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp) struct perf_cpu cpu; static int header_printed; - if (counter->core.system_wide) - nthreads = 1; - if (!header_printed) { printf("%3s %8s %15s %15s %15s %15s %s\n", "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT"); diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 5776bfa70f11..e319bb17d10d 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map; static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus, struct perf_thread_map *threads) { - int nthreads; + int nthreads = perf_thread_map__nr(threads); if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) || (perf_missing_features.aux_output && evsel->core.attr.aux_output)) @@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus, threads = empty_thread_map; } - if (evsel->core.system_wide) - nthreads = 1; - else - nthreads = threads->nr; - if (evsel->core.fd == NULL && perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0) return -ENOMEM; @@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, if (threads == NULL) threads = empty_thread_map; - if (evsel->core.system_wide) - nthreads = 1; - else - nthreads = threads->nr; + nthreads = perf_thread_map__nr(threads); if (evsel->cgrp) pid = evsel->cgrp->fd; diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c index ce5e9e372fc4..cef943377ad7 100644 --- a/tools/perf/util/stat.c +++ b/tools/perf/util/stat.c @@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config, int ncpus = evsel__nr_cpus(counter); int idx, thread; - if (counter->core.system_wide) - nthreads = 1; - for (thread = 0; thread < nthreads; thread++) { for (idx = 0; idx < ncpus; idx++) { if (process_counter_values(config, counter, idx, thread, -- 2.38.0.rc1.362.ged0d419d3c-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 5/5] perf tools: Remove special handling of system-wide evsel 2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim @ 2022-10-06 18:57 ` Ian Rogers 0 siblings, 0 replies; 28+ messages in thread From: Ian Rogers @ 2022-10-06 18:57 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan On Mon, Oct 3, 2022 at 1:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > For system-wide evsels, the thread map should be dummy - i.e. it has a > single entry of -1. But the code guarantees such a thread map, so no > need to handle it specially. > > No functional change intended. > > Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > --- > tools/lib/perf/evsel.c | 3 --- > tools/perf/builtin-script.c | 3 --- > tools/perf/util/evsel.c | 12 ++---------- > tools/perf/util/stat.c | 3 --- > 4 files changed, 2 insertions(+), 19 deletions(-) > > diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c > index 8ce5bbd09666..8b51b008a81f 100644 > --- a/tools/lib/perf/evsel.c > +++ b/tools/lib/perf/evsel.c > @@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads) > if (ncpus == 0 || nthreads == 0) > return 0; > > - if (evsel->system_wide) > - nthreads = 1; > - > evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id)); > if (evsel->sample_id == NULL) > return -ENOMEM; > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c > index 886f53cfa257..7fa467ed91dc 100644 > --- a/tools/perf/builtin-script.c > +++ b/tools/perf/builtin-script.c > @@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp) > struct perf_cpu cpu; > static int header_printed; > > - if (counter->core.system_wide) > - nthreads = 1; > - > if (!header_printed) { > printf("%3s %8s %15s %15s %15s %15s %s\n", > "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT"); > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c > index 5776bfa70f11..e319bb17d10d 100644 > --- a/tools/perf/util/evsel.c > +++ b/tools/perf/util/evsel.c > @@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map; > static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus, > struct perf_thread_map *threads) > { > - int nthreads; > + int nthreads = perf_thread_map__nr(threads); A general nit for something not introduced by this change. nr/nthreads here is really the number of perf_event_open file descriptors needed for the thread map. I wonder if a variable name of num_thread_fds would make the code more intention revealing. Thanks, Ian > > if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) || > (perf_missing_features.aux_output && evsel->core.attr.aux_output)) > @@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus, > threads = empty_thread_map; > } > > - if (evsel->core.system_wide) > - nthreads = 1; > - else > - nthreads = threads->nr; > - > if (evsel->core.fd == NULL && > perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0) > return -ENOMEM; > @@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, > if (threads == NULL) > threads = empty_thread_map; > > - if (evsel->core.system_wide) > - nthreads = 1; > - else > - nthreads = threads->nr; > + nthreads = perf_thread_map__nr(threads); > > if (evsel->cgrp) > pid = evsel->cgrp->fd; > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c > index ce5e9e372fc4..cef943377ad7 100644 > --- a/tools/perf/util/stat.c > +++ b/tools/perf/util/stat.c > @@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config, > int ncpus = evsel__nr_cpus(counter); > int idx, thread; > > - if (counter->core.system_wide) > - nthreads = 1; > - > for (thread = 0; thread < nthreads; thread++) { > for (idx = 0; idx < ncpus; idx++) { > if (process_counter_values(config, counter, idx, thread, > -- > 2.38.0.rc1.362.ged0d419d3c-goog > ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v2) @ 2022-09-30 17:27 Namhyung Kim 2022-09-30 17:27 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim 0 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-09-30 17:27 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan Hello, The system-wide evsel has a cpu map for all (online) cpus regardless of user requested cpus. But the cpu map handling code has some special case for it and I think we can cleanup the code by making sure that such a evsel has a proper cpu/thread maps from the beginning. This patches should not cause any change in the behavior. Changes from v1: * use evlist->core.needs_map_propagation field * add Reviewed-by from Adrian You can get the code from 'perf/cpumap-update-v2' branch in git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git Thanks, Namhyung Namhyung Kim (5): libperf: Populate system-wide evsel maps libperf: Propagate maps only if necessary perf tools: Get rid of evlist__add_on_all_cpus() perf tools: Add evlist__add_sched_switch() perf tools: Remove special handling of system-wide evsel tools/lib/perf/evlist.c | 23 ++++++------ tools/lib/perf/evsel.c | 3 -- tools/lib/perf/include/internal/evlist.h | 1 + tools/perf/arch/x86/util/intel-pt.c | 15 +++----- tools/perf/builtin-script.c | 3 -- tools/perf/tests/switch-tracking.c | 15 +++----- tools/perf/util/evlist.c | 46 ++++++++++-------------- tools/perf/util/evlist.h | 1 + tools/perf/util/evsel.c | 12 ++----- tools/perf/util/stat.c | 3 -- 10 files changed, 46 insertions(+), 76 deletions(-) base-commit: 62e64c9d2fd12839c02f1b3e8b873e7cb34e8720 -- 2.38.0.rc1.362.ged0d419d3c-goog ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-30 17:27 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v2) Namhyung Kim @ 2022-09-30 17:27 ` Namhyung Kim 2022-10-03 5:36 ` Adrian Hunter 2022-10-04 5:20 ` Adrian Hunter 0 siblings, 2 replies; 28+ messages in thread From: Namhyung Kim @ 2022-09-30 17:27 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan The current code propagate evsel's cpu map settings to evlist when it's added to an evlist. But the evlist->all_cpus and each evsel's cpus will be updated in perf_evlist__set_maps() later. No need to do it before evlist's cpus are set actually. In fact it discards this intermediate all_cpus maps at the beginning of perf_evlist__set_maps(). Let's not do this. It's only needed when an evsel is added after the evlist cpu/thread maps are set. Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/lib/perf/evlist.c | 8 ++++---- tools/lib/perf/include/internal/evlist.h | 1 + 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index 187129652ab6..c5f65b89d77a 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) { struct perf_evsel *evsel; - /* Recomputing all_cpus, so start with a blank slate. */ - perf_cpu_map__put(evlist->all_cpus); - evlist->all_cpus = NULL; + evlist->needs_map_propagation = true; perf_evlist__for_each_evsel(evlist, evsel) __perf_evlist__propagate_maps(evlist, evsel); @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist, evsel->idx = evlist->nr_entries; list_add_tail(&evsel->node, &evlist->entries); evlist->nr_entries += 1; - __perf_evlist__propagate_maps(evlist, evsel); + + if (evlist->needs_map_propagation) + __perf_evlist__propagate_maps(evlist, evsel); } void perf_evlist__remove(struct perf_evlist *evlist, diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h index 6f89aec3e608..850f07070036 100644 --- a/tools/lib/perf/include/internal/evlist.h +++ b/tools/lib/perf/include/internal/evlist.h @@ -19,6 +19,7 @@ struct perf_evlist { int nr_entries; int nr_groups; bool has_user_cpus; + bool needs_map_propagation; /** * The cpus passed from the command line or all online CPUs by * default. -- 2.38.0.rc1.362.ged0d419d3c-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-30 17:27 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim @ 2022-10-03 5:36 ` Adrian Hunter 2022-10-04 5:20 ` Adrian Hunter 1 sibling, 0 replies; 28+ messages in thread From: Adrian Hunter @ 2022-10-03 5:36 UTC (permalink / raw) To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan On 30/09/22 20:27, Namhyung Kim wrote: > The current code propagate evsel's cpu map settings to evlist when it's > added to an evlist. But the evlist->all_cpus and each evsel's cpus will > be updated in perf_evlist__set_maps() later. No need to do it before > evlist's cpus are set actually. > > In fact it discards this intermediate all_cpus maps at the beginning > of perf_evlist__set_maps(). Let's not do this. It's only needed when > an evsel is added after the evlist cpu/thread maps are set. > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > --- > tools/lib/perf/evlist.c | 8 ++++---- > tools/lib/perf/include/internal/evlist.h | 1 + > 2 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c > index 187129652ab6..c5f65b89d77a 100644 > --- a/tools/lib/perf/evlist.c > +++ b/tools/lib/perf/evlist.c > @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) > { > struct perf_evsel *evsel; > > - /* Recomputing all_cpus, so start with a blank slate. */ > - perf_cpu_map__put(evlist->all_cpus); > - evlist->all_cpus = NULL; If you remove this, then you also need to remove the setting in perf_evlist__set_maps, so that we still start with a blank slate. i.e. diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index 8ec5b9f344e0..4aeffcbc11d1 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -174,9 +174,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist, evlist->threads = perf_thread_map__get(threads); } - if (!evlist->all_cpus && cpus) - evlist->all_cpus = perf_cpu_map__get(cpus); - perf_evlist__propagate_maps(evlist); } > + evlist->needs_map_propagation = true; > > perf_evlist__for_each_evsel(evlist, evsel) > __perf_evlist__propagate_maps(evlist, evsel); > @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist, > evsel->idx = evlist->nr_entries; > list_add_tail(&evsel->node, &evlist->entries); > evlist->nr_entries += 1; > - __perf_evlist__propagate_maps(evlist, evsel); > + > + if (evlist->needs_map_propagation) > + __perf_evlist__propagate_maps(evlist, evsel); > } > > void perf_evlist__remove(struct perf_evlist *evlist, > diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h > index 6f89aec3e608..850f07070036 100644 > --- a/tools/lib/perf/include/internal/evlist.h > +++ b/tools/lib/perf/include/internal/evlist.h > @@ -19,6 +19,7 @@ struct perf_evlist { > int nr_entries; > int nr_groups; > bool has_user_cpus; > + bool needs_map_propagation; > /** > * The cpus passed from the command line or all online CPUs by > * default. ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-30 17:27 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim 2022-10-03 5:36 ` Adrian Hunter @ 2022-10-04 5:20 ` Adrian Hunter 1 sibling, 0 replies; 28+ messages in thread From: Adrian Hunter @ 2022-10-04 5:20 UTC (permalink / raw) To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan On 30/09/22 20:27, Namhyung Kim wrote: > The current code propagate evsel's cpu map settings to evlist when it's > added to an evlist. But the evlist->all_cpus and each evsel's cpus will > be updated in perf_evlist__set_maps() later. No need to do it before > evlist's cpus are set actually. > > In fact it discards this intermediate all_cpus maps at the beginning > of perf_evlist__set_maps(). Let's not do this. It's only needed when > an evsel is added after the evlist cpu/thread maps are set. > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> > --- > tools/lib/perf/evlist.c | 8 ++++---- > tools/lib/perf/include/internal/evlist.h | 1 + > 2 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c > index 187129652ab6..c5f65b89d77a 100644 > --- a/tools/lib/perf/evlist.c > +++ b/tools/lib/perf/evlist.c > @@ -67,9 +67,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) > { > struct perf_evsel *evsel; > > - /* Recomputing all_cpus, so start with a blank slate. */ > - perf_cpu_map__put(evlist->all_cpus); > - evlist->all_cpus = NULL; > + evlist->needs_map_propagation = true; > > perf_evlist__for_each_evsel(evlist, evsel) > __perf_evlist__propagate_maps(evlist, evsel); > @@ -81,7 +79,9 @@ void perf_evlist__add(struct perf_evlist *evlist, > evsel->idx = evlist->nr_entries; > list_add_tail(&evsel->node, &evlist->entries); > evlist->nr_entries += 1; > - __perf_evlist__propagate_maps(evlist, evsel); > + > + if (evlist->needs_map_propagation) > + __perf_evlist__propagate_maps(evlist, evsel); > } > > void perf_evlist__remove(struct perf_evlist *evlist, > diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h > index 6f89aec3e608..850f07070036 100644 > --- a/tools/lib/perf/include/internal/evlist.h > +++ b/tools/lib/perf/include/internal/evlist.h > @@ -19,6 +19,7 @@ struct perf_evlist { > int nr_entries; > int nr_groups; > bool has_user_cpus; > + bool needs_map_propagation; > /** > * The cpus passed from the command line or all online CPUs by > * default. ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) @ 2022-09-24 16:57 Namhyung Kim 2022-09-24 16:57 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim 0 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan Hello, The system-wide evsel has a cpu map for all (online) cpus regardless of user requested cpus. But the cpu map handling code has some special case for it and I think we can cleanup the code by making sure that such a evsel has a proper cpu/thread maps from the beginning. This patches should not cause any change in the behavior. You can get the code from 'perf/cpumap-update-v1' branch in git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git Thanks, Namhyung Namhyung Kim (5): libperf: Populate system-wide evsel maps libperf: Propagate maps only if necessary perf tools: Get rid of evlist__add_on_all_cpus() perf tools: Add evlist__add_sched_switch() perf tools: Remove special handling of system-wide evsel tools/lib/perf/evlist.c | 23 ++++++++------- tools/lib/perf/evsel.c | 3 -- tools/perf/arch/x86/util/intel-pt.c | 15 ++++------ tools/perf/builtin-script.c | 3 -- tools/perf/tests/switch-tracking.c | 15 ++++------ tools/perf/util/evlist.c | 46 ++++++++++++----------------- tools/perf/util/evlist.h | 1 + tools/perf/util/evsel.c | 12 ++------ tools/perf/util/stat.c | 3 -- 9 files changed, 44 insertions(+), 77 deletions(-) -- 2.37.3.998.g577e59143f-goog ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim @ 2022-09-24 16:57 ` Namhyung Kim 2022-09-27 7:06 ` Adrian Hunter 0 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-09-24 16:57 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, Adrian Hunter, linux-perf-users, Kan Liang, Leo Yan The current code propagate evsel's cpu map settings to evlist when it's added to an evlist. But the evlist->all_cpus and each evsel's cpus will be updated in perf_evlist__set_maps() later. No need to do it before evlist's cpus are set actually. Actually we discarded this intermediate all_cpus maps at the beginning of perf_evlist__set_maps(). Let's not do this. It's only needed when an evsel is added after the evlist cpu maps are set. Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/lib/perf/evlist.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index 187129652ab6..cc070c3a134d 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -67,10 +67,6 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) { struct perf_evsel *evsel; - /* Recomputing all_cpus, so start with a blank slate. */ - perf_cpu_map__put(evlist->all_cpus); - evlist->all_cpus = NULL; - perf_evlist__for_each_evsel(evlist, evsel) __perf_evlist__propagate_maps(evlist, evsel); } @@ -81,7 +77,9 @@ void perf_evlist__add(struct perf_evlist *evlist, evsel->idx = evlist->nr_entries; list_add_tail(&evsel->node, &evlist->entries); evlist->nr_entries += 1; - __perf_evlist__propagate_maps(evlist, evsel); + + if (evlist->all_cpus) + __perf_evlist__propagate_maps(evlist, evsel); } void perf_evlist__remove(struct perf_evlist *evlist, -- 2.37.3.998.g577e59143f-goog ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-24 16:57 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim @ 2022-09-27 7:06 ` Adrian Hunter 2022-09-27 17:28 ` Namhyung Kim 0 siblings, 1 reply; 28+ messages in thread From: Adrian Hunter @ 2022-09-27 7:06 UTC (permalink / raw) To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa Cc: Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan On 24/09/22 19:57, Namhyung Kim wrote: > The current code propagate evsel's cpu map settings to evlist when it's > added to an evlist. But the evlist->all_cpus and each evsel's cpus will > be updated in perf_evlist__set_maps() later. No need to do it before > evlist's cpus are set actually. > > Actually we discarded this intermediate all_cpus maps at the beginning > of perf_evlist__set_maps(). Let's not do this. It's only needed when > an evsel is added after the evlist cpu maps are set. That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles with evsel->core.cpus and evsel->core.own_cpus after the evsel has been added to the evlist. It can also remove an evsel from the evlist. There might be other cases like that, but that was just one that stuck out. > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > --- > tools/lib/perf/evlist.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c > index 187129652ab6..cc070c3a134d 100644 > --- a/tools/lib/perf/evlist.c > +++ b/tools/lib/perf/evlist.c > @@ -67,10 +67,6 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) > { > struct perf_evsel *evsel; > > - /* Recomputing all_cpus, so start with a blank slate. */ > - perf_cpu_map__put(evlist->all_cpus); > - evlist->all_cpus = NULL; > - > perf_evlist__for_each_evsel(evlist, evsel) > __perf_evlist__propagate_maps(evlist, evsel); > } > @@ -81,7 +77,9 @@ void perf_evlist__add(struct perf_evlist *evlist, > evsel->idx = evlist->nr_entries; > list_add_tail(&evsel->node, &evlist->entries); > evlist->nr_entries += 1; > - __perf_evlist__propagate_maps(evlist, evsel); > + > + if (evlist->all_cpus) > + __perf_evlist__propagate_maps(evlist, evsel); > } > > void perf_evlist__remove(struct perf_evlist *evlist, ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-27 7:06 ` Adrian Hunter @ 2022-09-27 17:28 ` Namhyung Kim 2022-09-28 7:53 ` Adrian Hunter 0 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-09-27 17:28 UTC (permalink / raw) To: Adrian Hunter Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan Hi Adrian, On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 24/09/22 19:57, Namhyung Kim wrote: > > The current code propagate evsel's cpu map settings to evlist when it's > > added to an evlist. But the evlist->all_cpus and each evsel's cpus will > > be updated in perf_evlist__set_maps() later. No need to do it before > > evlist's cpus are set actually. > > > > Actually we discarded this intermediate all_cpus maps at the beginning > > of perf_evlist__set_maps(). Let's not do this. It's only needed when > > an evsel is added after the evlist cpu maps are set. > > That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > added to the evlist. It can also remove an evsel from the evlist. Thanks for your review. I think it's fine to change evsel cpus or to remove an evsel from evlist before calling evlist__create_maps(). The function will take care of setting evlist's all_cpus from the evsels in the evlist. So previous changes in evsel/cpus wouldn't be any special. After this point, adding a new evsel needs to update evlist all cpus by propagating cpu maps. So I think hybrid cpus should be fine. Did I miss something? > > There might be other cases like that, but that was just one that stuck > out. Thanks for sharing your concern. Please let me know if you could come up with another. Namhyung ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-27 17:28 ` Namhyung Kim @ 2022-09-28 7:53 ` Adrian Hunter 2022-09-28 23:46 ` Namhyung Kim 0 siblings, 1 reply; 28+ messages in thread From: Adrian Hunter @ 2022-09-28 7:53 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan On 27/09/22 20:28, Namhyung Kim wrote: > Hi Adrian, > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >> >> On 24/09/22 19:57, Namhyung Kim wrote: >>> The current code propagate evsel's cpu map settings to evlist when it's >>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will >>> be updated in perf_evlist__set_maps() later. No need to do it before >>> evlist's cpus are set actually. >>> >>> Actually we discarded this intermediate all_cpus maps at the beginning >>> of perf_evlist__set_maps(). Let's not do this. It's only needed when >>> an evsel is added after the evlist cpu maps are set. >> >> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been >> added to the evlist. It can also remove an evsel from the evlist. > > Thanks for your review. I think it's fine to change evsel cpus or to remove > an evsel from evlist before calling evlist__create_maps(). The function > will take care of setting evlist's all_cpus from the evsels in the evlist. > So previous changes in evsel/cpus wouldn't be any special. > > After this point, adding a new evsel needs to update evlist all cpus by > propagating cpu maps. So I think hybrid cpus should be fine. > Did I miss something? I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the cpus from the target->cpu_list (using perf record -C) , since after this patch all_cpus always starts with the target->cpu_list instead of an empty list. But then, in the hybrid case, it puts a dummy event that uses the target cpu list anyway, so the result is the same. I don't know if there are any cases where all_cpus would actually need to exclude some of the cpus from target->cpu_list. > >> >> There might be other cases like that, but that was just one that stuck >> out. > > Thanks for sharing your concern. Please let me know if you could > come up with another. > > Namhyung ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-28 7:53 ` Adrian Hunter @ 2022-09-28 23:46 ` Namhyung Kim 2022-09-29 2:07 ` Ian Rogers 0 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-09-28 23:46 UTC (permalink / raw) To: Adrian Hunter Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, Ian Rogers, linux-perf-users, Kan Liang, Leo Yan On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 27/09/22 20:28, Namhyung Kim wrote: > > Hi Adrian, > > > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > >> > >> On 24/09/22 19:57, Namhyung Kim wrote: > >>> The current code propagate evsel's cpu map settings to evlist when it's > >>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > >>> be updated in perf_evlist__set_maps() later. No need to do it before > >>> evlist's cpus are set actually. > >>> > >>> Actually we discarded this intermediate all_cpus maps at the beginning > >>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > >>> an evsel is added after the evlist cpu maps are set. > >> > >> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > >> added to the evlist. It can also remove an evsel from the evlist. > > > > Thanks for your review. I think it's fine to change evsel cpus or to remove > > an evsel from evlist before calling evlist__create_maps(). The function > > will take care of setting evlist's all_cpus from the evsels in the evlist. > > So previous changes in evsel/cpus wouldn't be any special. > > > > After this point, adding a new evsel needs to update evlist all cpus by > > propagating cpu maps. So I think hybrid cpus should be fine. > > Did I miss something? > > I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > cpus from the target->cpu_list (using perf record -C) , since after this > patch all_cpus always starts with the target->cpu_list instead of an empty > list. But then, in the hybrid case, it puts a dummy event that uses the > target cpu list anyway, so the result is the same. > > I don't know if there are any cases where all_cpus would actually need to > exclude some of the cpus from target->cpu_list. I'm not aware of other cases to reduce cpu list. I think it'd be fine if it has a cpu in the evlist->all_cpus even if it's not used. The evsel should have a correct list anyway and we mostly use the evsel cpus to do the real work. Thanks, Namhyung ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-28 23:46 ` Namhyung Kim @ 2022-09-29 2:07 ` Ian Rogers 2022-09-29 5:09 ` Namhyung Kim 0 siblings, 1 reply; 28+ messages in thread From: Ian Rogers @ 2022-09-29 2:07 UTC (permalink / raw) To: Namhyung Kim Cc: Adrian Hunter, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > > > On 27/09/22 20:28, Namhyung Kim wrote: > > > Hi Adrian, > > > > > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > >> > > >> On 24/09/22 19:57, Namhyung Kim wrote: > > >>> The current code propagate evsel's cpu map settings to evlist when it's > > >>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > > >>> be updated in perf_evlist__set_maps() later. No need to do it before > > >>> evlist's cpus are set actually. > > >>> > > >>> Actually we discarded this intermediate all_cpus maps at the beginning > > >>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > > >>> an evsel is added after the evlist cpu maps are set. > > >> > > >> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > > >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > > >> added to the evlist. It can also remove an evsel from the evlist. > > > > > > Thanks for your review. I think it's fine to change evsel cpus or to remove > > > an evsel from evlist before calling evlist__create_maps(). The function > > > will take care of setting evlist's all_cpus from the evsels in the evlist. > > > So previous changes in evsel/cpus wouldn't be any special. > > > > > > After this point, adding a new evsel needs to update evlist all cpus by > > > propagating cpu maps. So I think hybrid cpus should be fine. > > > Did I miss something? > > > > I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > > cpus from the target->cpu_list (using perf record -C) , since after this > > patch all_cpus always starts with the target->cpu_list instead of an empty > > list. But then, in the hybrid case, it puts a dummy event that uses the > > target cpu list anyway, so the result is the same. > > > > I don't know if there are any cases where all_cpus would actually need to > > exclude some of the cpus from target->cpu_list. > > I'm not aware of other cases to reduce cpu list. I think it'd be fine > if it has a cpu in the evlist->all_cpus even if it's not used. The evsel > should have a correct list anyway and we mostly use the evsel cpus > to do the real work. > > Thanks, > Namhyung The affinity changes made it so that we use all_cpus probably more often than the evsel CPU maps for real work. The reason being we want to avoid IPIs so we do all the work on 1 CPU and then move to the next CPU in evlist all_cpus. evsel CPU maps are used to make sure the indices are kept accurate - for example, if an uncore event is measured with a CPU event: https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 Thanks, Ian ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-29 2:07 ` Ian Rogers @ 2022-09-29 5:09 ` Namhyung Kim 2022-09-29 5:18 ` Adrian Hunter 0 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-09-29 5:09 UTC (permalink / raw) To: Ian Rogers Cc: Adrian Hunter, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: > > On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > > > On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > > > > > On 27/09/22 20:28, Namhyung Kim wrote: > > > > Hi Adrian, > > > > > > > > On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > > >> > > > >> On 24/09/22 19:57, Namhyung Kim wrote: > > > >>> The current code propagate evsel's cpu map settings to evlist when it's > > > >>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > > > >>> be updated in perf_evlist__set_maps() later. No need to do it before > > > >>> evlist's cpus are set actually. > > > >>> > > > >>> Actually we discarded this intermediate all_cpus maps at the beginning > > > >>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > > > >>> an evsel is added after the evlist cpu maps are set. > > > >> > > > >> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > > > >> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > > > >> added to the evlist. It can also remove an evsel from the evlist. > > > > > > > > Thanks for your review. I think it's fine to change evsel cpus or to remove > > > > an evsel from evlist before calling evlist__create_maps(). The function > > > > will take care of setting evlist's all_cpus from the evsels in the evlist. > > > > So previous changes in evsel/cpus wouldn't be any special. > > > > > > > > After this point, adding a new evsel needs to update evlist all cpus by > > > > propagating cpu maps. So I think hybrid cpus should be fine. > > > > Did I miss something? > > > > > > I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > > > cpus from the target->cpu_list (using perf record -C) , since after this > > > patch all_cpus always starts with the target->cpu_list instead of an empty > > > list. But then, in the hybrid case, it puts a dummy event that uses the > > > target cpu list anyway, so the result is the same. > > > > > > I don't know if there are any cases where all_cpus would actually need to > > > exclude some of the cpus from target->cpu_list. > > > > I'm not aware of other cases to reduce cpu list. I think it'd be fine > > if it has a cpu in the evlist->all_cpus even if it's not used. The evsel > > should have a correct list anyway and we mostly use the evsel cpus > > to do the real work. > > > > Thanks, > > Namhyung > > The affinity changes made it so that we use all_cpus probably more > often than the evsel CPU maps for real work. The reason being we want > to avoid IPIs so we do all the work on 1 CPU and then move to the next > CPU in evlist all_cpus. evsel CPU maps are used to make sure the > indices are kept accurate - for example, if an uncore event is > measured with a CPU event: > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 Right, I meant it'd check the evsel cpus eventually even if it iterates on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a CPU if it's not in the evsel cpus. Thanks, Namhyung ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-29 5:09 ` Namhyung Kim @ 2022-09-29 5:18 ` Adrian Hunter 2022-09-29 20:42 ` Namhyung Kim 0 siblings, 1 reply; 28+ messages in thread From: Adrian Hunter @ 2022-09-29 5:18 UTC (permalink / raw) To: Namhyung Kim, Ian Rogers Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan On 29/09/22 08:09, Namhyung Kim wrote: > On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: >> >> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: >>> >>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >>>> >>>> On 27/09/22 20:28, Namhyung Kim wrote: >>>>> Hi Adrian, >>>>> >>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >>>>>> >>>>>> On 24/09/22 19:57, Namhyung Kim wrote: >>>>>>> The current code propagate evsel's cpu map settings to evlist when it's >>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will >>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before >>>>>>> evlist's cpus are set actually. >>>>>>> >>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning >>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when >>>>>>> an evsel is added after the evlist cpu maps are set. >>>>>> >>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles >>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been >>>>>> added to the evlist. It can also remove an evsel from the evlist. >>>>> >>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove >>>>> an evsel from evlist before calling evlist__create_maps(). The function >>>>> will take care of setting evlist's all_cpus from the evsels in the evlist. >>>>> So previous changes in evsel/cpus wouldn't be any special. >>>>> >>>>> After this point, adding a new evsel needs to update evlist all cpus by >>>>> propagating cpu maps. So I think hybrid cpus should be fine. >>>>> Did I miss something? >>>> >>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the >>>> cpus from the target->cpu_list (using perf record -C) , since after this >>>> patch all_cpus always starts with the target->cpu_list instead of an empty >>>> list. But then, in the hybrid case, it puts a dummy event that uses the >>>> target cpu list anyway, so the result is the same. >>>> >>>> I don't know if there are any cases where all_cpus would actually need to >>>> exclude some of the cpus from target->cpu_list. >>> >>> I'm not aware of other cases to reduce cpu list. I think it'd be fine >>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel >>> should have a correct list anyway and we mostly use the evsel cpus >>> to do the real work. >>> >>> Thanks, >>> Namhyung >> >> The affinity changes made it so that we use all_cpus probably more >> often than the evsel CPU maps for real work. The reason being we want >> to avoid IPIs so we do all the work on 1 CPU and then move to the next >> CPU in evlist all_cpus. evsel CPU maps are used to make sure the >> indices are kept accurate - for example, if an uncore event is >> measured with a CPU event: >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 > > Right, I meant it'd check the evsel cpus eventually even if it iterates > on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a > CPU if it's not in the evsel cpus. > > Thanks, > Namhyung Perhaps an alternative is to be explicit about deferring map propagation e.g. diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index 19eaea99aa4f..5ce19e62397d 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -70,6 +70,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) /* Recomputing all_cpus, so start with a blank slate. */ perf_cpu_map__put(evlist->all_cpus); evlist->all_cpus = NULL; + evlist->defer_map_propagation = false; perf_evlist__for_each_evsel(evlist, evsel) __perf_evlist__propagate_maps(evlist, evsel); @@ -81,7 +82,8 @@ void perf_evlist__add(struct perf_evlist *evlist, evsel->idx = evlist->nr_entries; list_add_tail(&evsel->node, &evlist->entries); evlist->nr_entries += 1; - __perf_evlist__propagate_maps(evlist, evsel); + if (!evlist->defer_map_propagation) + __perf_evlist__propagate_maps(evlist, evsel); } void perf_evlist__remove(struct perf_evlist *evlist, @@ -177,9 +179,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist, evlist->threads = perf_thread_map__get(threads); } - if (!evlist->all_cpus && cpus) - evlist->all_cpus = perf_cpu_map__get(cpus); - perf_evlist__propagate_maps(evlist); } diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h index 6f89aec3e608..dbe0b763f597 100644 --- a/tools/lib/perf/include/internal/evlist.h +++ b/tools/lib/perf/include/internal/evlist.h @@ -19,6 +19,7 @@ struct perf_evlist { int nr_entries; int nr_groups; bool has_user_cpus; + bool defer_map_propagation; /** * The cpus passed from the command line or all online CPUs by * default. diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 52d254b1530c..1c2523d66a14 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -3975,6 +3975,7 @@ int cmd_record(int argc, const char **argv) rec->evlist = evlist__new(); if (rec->evlist == NULL) return -ENOMEM; + rec->evlist->core.defer_map_propagation = true; err = perf_config(perf_record_config, rec); if (err) ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-29 5:18 ` Adrian Hunter @ 2022-09-29 20:42 ` Namhyung Kim 2022-09-30 12:49 ` Adrian Hunter 0 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-09-29 20:42 UTC (permalink / raw) To: Adrian Hunter Cc: Ian Rogers, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 29/09/22 08:09, Namhyung Kim wrote: > > On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: > >> > >> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > >>> > >>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > >>>> > >>>> On 27/09/22 20:28, Namhyung Kim wrote: > >>>>> Hi Adrian, > >>>>> > >>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > >>>>>> > >>>>>> On 24/09/22 19:57, Namhyung Kim wrote: > >>>>>>> The current code propagate evsel's cpu map settings to evlist when it's > >>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > >>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before > >>>>>>> evlist's cpus are set actually. > >>>>>>> > >>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning > >>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > >>>>>>> an evsel is added after the evlist cpu maps are set. > >>>>>> > >>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > >>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > >>>>>> added to the evlist. It can also remove an evsel from the evlist. > >>>>> > >>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove > >>>>> an evsel from evlist before calling evlist__create_maps(). The function > >>>>> will take care of setting evlist's all_cpus from the evsels in the evlist. > >>>>> So previous changes in evsel/cpus wouldn't be any special. > >>>>> > >>>>> After this point, adding a new evsel needs to update evlist all cpus by > >>>>> propagating cpu maps. So I think hybrid cpus should be fine. > >>>>> Did I miss something? > >>>> > >>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > >>>> cpus from the target->cpu_list (using perf record -C) , since after this > >>>> patch all_cpus always starts with the target->cpu_list instead of an empty > >>>> list. But then, in the hybrid case, it puts a dummy event that uses the > >>>> target cpu list anyway, so the result is the same. > >>>> > >>>> I don't know if there are any cases where all_cpus would actually need to > >>>> exclude some of the cpus from target->cpu_list. > >>> > >>> I'm not aware of other cases to reduce cpu list. I think it'd be fine > >>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel > >>> should have a correct list anyway and we mostly use the evsel cpus > >>> to do the real work. > >>> > >>> Thanks, > >>> Namhyung > >> > >> The affinity changes made it so that we use all_cpus probably more > >> often than the evsel CPU maps for real work. The reason being we want > >> to avoid IPIs so we do all the work on 1 CPU and then move to the next > >> CPU in evlist all_cpus. evsel CPU maps are used to make sure the > >> indices are kept accurate - for example, if an uncore event is > >> measured with a CPU event: > >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 > >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 > > > > Right, I meant it'd check the evsel cpus eventually even if it iterates > > on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a > > CPU if it's not in the evsel cpus. > > > > Thanks, > > Namhyung > > Perhaps an alternative is to be explicit about deferring map > propagation e.g. Thanks for your patch. Yeah, we can use this. But I still think it'd be better doing it unconditionally since any propagation before perf_evlist__set_maps will be discarded anyway. With this change, other than perf record will collect all cpus before _set_maps and then discard it. It seems like a waste, no? Or else, we can have allow_map_propagation initialized to false and set it to true in perf_evlist__set_maps(). Thanks, Namhyung > > diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c > index 19eaea99aa4f..5ce19e62397d 100644 > --- a/tools/lib/perf/evlist.c > +++ b/tools/lib/perf/evlist.c > @@ -70,6 +70,7 @@ static void perf_evlist__propagate_maps(struct perf_evlist *evlist) > /* Recomputing all_cpus, so start with a blank slate. */ > perf_cpu_map__put(evlist->all_cpus); > evlist->all_cpus = NULL; > + evlist->defer_map_propagation = false; > > perf_evlist__for_each_evsel(evlist, evsel) > __perf_evlist__propagate_maps(evlist, evsel); > @@ -81,7 +82,8 @@ void perf_evlist__add(struct perf_evlist *evlist, > evsel->idx = evlist->nr_entries; > list_add_tail(&evsel->node, &evlist->entries); > evlist->nr_entries += 1; > - __perf_evlist__propagate_maps(evlist, evsel); > + if (!evlist->defer_map_propagation) > + __perf_evlist__propagate_maps(evlist, evsel); > } > > void perf_evlist__remove(struct perf_evlist *evlist, > @@ -177,9 +179,6 @@ void perf_evlist__set_maps(struct perf_evlist *evlist, > evlist->threads = perf_thread_map__get(threads); > } > > - if (!evlist->all_cpus && cpus) > - evlist->all_cpus = perf_cpu_map__get(cpus); > - > perf_evlist__propagate_maps(evlist); > } > > diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h > index 6f89aec3e608..dbe0b763f597 100644 > --- a/tools/lib/perf/include/internal/evlist.h > +++ b/tools/lib/perf/include/internal/evlist.h > @@ -19,6 +19,7 @@ struct perf_evlist { > int nr_entries; > int nr_groups; > bool has_user_cpus; > + bool defer_map_propagation; > /** > * The cpus passed from the command line or all online CPUs by > * default. > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c > index 52d254b1530c..1c2523d66a14 100644 > --- a/tools/perf/builtin-record.c > +++ b/tools/perf/builtin-record.c > @@ -3975,6 +3975,7 @@ int cmd_record(int argc, const char **argv) > rec->evlist = evlist__new(); > if (rec->evlist == NULL) > return -ENOMEM; > + rec->evlist->core.defer_map_propagation = true; > > err = perf_config(perf_record_config, rec); > if (err) > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-29 20:42 ` Namhyung Kim @ 2022-09-30 12:49 ` Adrian Hunter 2022-09-30 16:44 ` Namhyung Kim 0 siblings, 1 reply; 28+ messages in thread From: Adrian Hunter @ 2022-09-30 12:49 UTC (permalink / raw) To: Namhyung Kim Cc: Ian Rogers, Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan On 29/09/22 23:42, Namhyung Kim wrote: > On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote: >> >> On 29/09/22 08:09, Namhyung Kim wrote: >>> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: >>>> >>>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: >>>>> >>>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >>>>>> >>>>>> On 27/09/22 20:28, Namhyung Kim wrote: >>>>>>> Hi Adrian, >>>>>>> >>>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: >>>>>>>> >>>>>>>> On 24/09/22 19:57, Namhyung Kim wrote: >>>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's >>>>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will >>>>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before >>>>>>>>> evlist's cpus are set actually. >>>>>>>>> >>>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning >>>>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when >>>>>>>>> an evsel is added after the evlist cpu maps are set. >>>>>>>> >>>>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles >>>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been >>>>>>>> added to the evlist. It can also remove an evsel from the evlist. >>>>>>> >>>>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove >>>>>>> an evsel from evlist before calling evlist__create_maps(). The function >>>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist. >>>>>>> So previous changes in evsel/cpus wouldn't be any special. >>>>>>> >>>>>>> After this point, adding a new evsel needs to update evlist all cpus by >>>>>>> propagating cpu maps. So I think hybrid cpus should be fine. >>>>>>> Did I miss something? >>>>>> >>>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the >>>>>> cpus from the target->cpu_list (using perf record -C) , since after this >>>>>> patch all_cpus always starts with the target->cpu_list instead of an empty >>>>>> list. But then, in the hybrid case, it puts a dummy event that uses the >>>>>> target cpu list anyway, so the result is the same. >>>>>> >>>>>> I don't know if there are any cases where all_cpus would actually need to >>>>>> exclude some of the cpus from target->cpu_list. >>>>> >>>>> I'm not aware of other cases to reduce cpu list. I think it'd be fine >>>>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel >>>>> should have a correct list anyway and we mostly use the evsel cpus >>>>> to do the real work. >>>>> >>>>> Thanks, >>>>> Namhyung >>>> >>>> The affinity changes made it so that we use all_cpus probably more >>>> often than the evsel CPU maps for real work. The reason being we want >>>> to avoid IPIs so we do all the work on 1 CPU and then move to the next >>>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the >>>> indices are kept accurate - for example, if an uncore event is >>>> measured with a CPU event: >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 >>> >>> Right, I meant it'd check the evsel cpus eventually even if it iterates >>> on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a >>> CPU if it's not in the evsel cpus. >>> >>> Thanks, >>> Namhyung >> >> Perhaps an alternative is to be explicit about deferring map >> propagation e.g. > > Thanks for your patch. Yeah, we can use this. > > But I still think it'd be better doing it unconditionally > since any propagation before perf_evlist__set_maps > will be discarded anyway. With this change, other > than perf record will collect all cpus before _set_maps > and then discard it. It seems like a waste, no? > > Or else, we can have allow_map_propagation initialized > to false and set it to true in perf_evlist__set_maps(). > That sounds fine. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-30 12:49 ` Adrian Hunter @ 2022-09-30 16:44 ` Namhyung Kim 2022-09-30 16:56 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 28+ messages in thread From: Namhyung Kim @ 2022-09-30 16:44 UTC (permalink / raw) To: Adrian Hunter, Arnaldo Carvalho de Melo Cc: Ian Rogers, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan On Fri, Sep 30, 2022 at 5:50 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > On 29/09/22 23:42, Namhyung Kim wrote: > > On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote: > >> > >> On 29/09/22 08:09, Namhyung Kim wrote: > >>> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: > >>>> > >>>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > >>>>> > >>>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > >>>>>> > >>>>>> On 27/09/22 20:28, Namhyung Kim wrote: > >>>>>>> Hi Adrian, > >>>>>>> > >>>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > >>>>>>>> > >>>>>>>> On 24/09/22 19:57, Namhyung Kim wrote: > >>>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's > >>>>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > >>>>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before > >>>>>>>>> evlist's cpus are set actually. > >>>>>>>>> > >>>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning > >>>>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > >>>>>>>>> an evsel is added after the evlist cpu maps are set. > >>>>>>>> > >>>>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > >>>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > >>>>>>>> added to the evlist. It can also remove an evsel from the evlist. > >>>>>>> > >>>>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove > >>>>>>> an evsel from evlist before calling evlist__create_maps(). The function > >>>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist. > >>>>>>> So previous changes in evsel/cpus wouldn't be any special. > >>>>>>> > >>>>>>> After this point, adding a new evsel needs to update evlist all cpus by > >>>>>>> propagating cpu maps. So I think hybrid cpus should be fine. > >>>>>>> Did I miss something? > >>>>>> > >>>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > >>>>>> cpus from the target->cpu_list (using perf record -C) , since after this > >>>>>> patch all_cpus always starts with the target->cpu_list instead of an empty > >>>>>> list. But then, in the hybrid case, it puts a dummy event that uses the > >>>>>> target cpu list anyway, so the result is the same. > >>>>>> > >>>>>> I don't know if there are any cases where all_cpus would actually need to > >>>>>> exclude some of the cpus from target->cpu_list. > >>>>> > >>>>> I'm not aware of other cases to reduce cpu list. I think it'd be fine > >>>>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel > >>>>> should have a correct list anyway and we mostly use the evsel cpus > >>>>> to do the real work. > >>>>> > >>>>> Thanks, > >>>>> Namhyung > >>>> > >>>> The affinity changes made it so that we use all_cpus probably more > >>>> often than the evsel CPU maps for real work. The reason being we want > >>>> to avoid IPIs so we do all the work on 1 CPU and then move to the next > >>>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the > >>>> indices are kept accurate - for example, if an uncore event is > >>>> measured with a CPU event: > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 > >>> > >>> Right, I meant it'd check the evsel cpus eventually even if it iterates > >>> on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a > >>> CPU if it's not in the evsel cpus. > >>> > >>> Thanks, > >>> Namhyung > >> > >> Perhaps an alternative is to be explicit about deferring map > >> propagation e.g. > > > > Thanks for your patch. Yeah, we can use this. > > > > But I still think it'd be better doing it unconditionally > > since any propagation before perf_evlist__set_maps > > will be discarded anyway. With this change, other > > than perf record will collect all cpus before _set_maps > > and then discard it. It seems like a waste, no? > > > > Or else, we can have allow_map_propagation initialized > > to false and set it to true in perf_evlist__set_maps(). > > > > That sounds fine. Thanks! Arnaldo, how do you want to handle it? I can send v2 for the whole series, but I think you already applied it. Then do you want me to send this change on top? Namhyung ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/5] libperf: Propagate maps only if necessary 2022-09-30 16:44 ` Namhyung Kim @ 2022-09-30 16:56 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 28+ messages in thread From: Arnaldo Carvalho de Melo @ 2022-09-30 16:56 UTC (permalink / raw) To: Namhyung Kim Cc: Adrian Hunter, Ian Rogers, Jiri Olsa, Ingo Molnar, Peter Zijlstra, LKML, linux-perf-users, Kan Liang, Leo Yan Em Fri, Sep 30, 2022 at 09:44:49AM -0700, Namhyung Kim escreveu: > On Fri, Sep 30, 2022 at 5:50 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > > > On 29/09/22 23:42, Namhyung Kim wrote: > > > On Wed, Sep 28, 2022 at 10:19 PM Adrian Hunter <adrian.hunter@intel.com> wrote: > > >> > > >> On 29/09/22 08:09, Namhyung Kim wrote: > > >>> On Wed, Sep 28, 2022 at 7:08 PM Ian Rogers <irogers@google.com> wrote: > > >>>> > > >>>> On Wed, Sep 28, 2022 at 4:46 PM Namhyung Kim <namhyung@kernel.org> wrote: > > >>>>> > > >>>>> On Wed, Sep 28, 2022 at 12:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > >>>>>> > > >>>>>> On 27/09/22 20:28, Namhyung Kim wrote: > > >>>>>>> Hi Adrian, > > >>>>>>> > > >>>>>>> On Tue, Sep 27, 2022 at 12:06 AM Adrian Hunter <adrian.hunter@intel.com> wrote: > > >>>>>>>> > > >>>>>>>> On 24/09/22 19:57, Namhyung Kim wrote: > > >>>>>>>>> The current code propagate evsel's cpu map settings to evlist when it's > > >>>>>>>>> added to an evlist. But the evlist->all_cpus and each evsel's cpus will > > >>>>>>>>> be updated in perf_evlist__set_maps() later. No need to do it before > > >>>>>>>>> evlist's cpus are set actually. > > >>>>>>>>> > > >>>>>>>>> Actually we discarded this intermediate all_cpus maps at the beginning > > >>>>>>>>> of perf_evlist__set_maps(). Let's not do this. It's only needed when > > >>>>>>>>> an evsel is added after the evlist cpu maps are set. > > >>>>>>>> > > >>>>>>>> That might not be true. Consider evlist__fix_hybrid_cpus() which fiddles > > >>>>>>>> with evsel->core.cpus and evsel->core.own_cpus after the evsel has been > > >>>>>>>> added to the evlist. It can also remove an evsel from the evlist. > > >>>>>>> > > >>>>>>> Thanks for your review. I think it's fine to change evsel cpus or to remove > > >>>>>>> an evsel from evlist before calling evlist__create_maps(). The function > > >>>>>>> will take care of setting evlist's all_cpus from the evsels in the evlist. > > >>>>>>> So previous changes in evsel/cpus wouldn't be any special. > > >>>>>>> > > >>>>>>> After this point, adding a new evsel needs to update evlist all cpus by > > >>>>>>> propagating cpu maps. So I think hybrid cpus should be fine. > > >>>>>>> Did I miss something? > > >>>>>> > > >>>>>> I wondered how it might play out if evlist__fix_hybrid_cpus() reduced the > > >>>>>> cpus from the target->cpu_list (using perf record -C) , since after this > > >>>>>> patch all_cpus always starts with the target->cpu_list instead of an empty > > >>>>>> list. But then, in the hybrid case, it puts a dummy event that uses the > > >>>>>> target cpu list anyway, so the result is the same. > > >>>>>> > > >>>>>> I don't know if there are any cases where all_cpus would actually need to > > >>>>>> exclude some of the cpus from target->cpu_list. > > >>>>> > > >>>>> I'm not aware of other cases to reduce cpu list. I think it'd be fine > > >>>>> if it has a cpu in the evlist->all_cpus even if it's not used. The evsel > > >>>>> should have a correct list anyway and we mostly use the evsel cpus > > >>>>> to do the real work. > > >>>>> > > >>>>> Thanks, > > >>>>> Namhyung > > >>>> > > >>>> The affinity changes made it so that we use all_cpus probably more > > >>>> often than the evsel CPU maps for real work. The reason being we want > > >>>> to avoid IPIs so we do all the work on 1 CPU and then move to the next > > >>>> CPU in evlist all_cpus. evsel CPU maps are used to make sure the > > >>>> indices are kept accurate - for example, if an uncore event is > > >>>> measured with a CPU event: > > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.h?h=perf/core#n366 > > >>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evlist.c?h=perf/core#n404 > > >>> > > >>> Right, I meant it'd check the evsel cpus eventually even if it iterates > > >>> on the evlist all_cpus. The evlist_cpu_iterator__next() will skip a > > >>> CPU if it's not in the evsel cpus. > > >>> > > >>> Thanks, > > >>> Namhyung > > >> > > >> Perhaps an alternative is to be explicit about deferring map > > >> propagation e.g. > > > > > > Thanks for your patch. Yeah, we can use this. > > > > > > But I still think it'd be better doing it unconditionally > > > since any propagation before perf_evlist__set_maps > > > will be discarded anyway. With this change, other > > > than perf record will collect all cpus before _set_maps > > > and then discard it. It seems like a waste, no? > > > > > > Or else, we can have allow_map_propagation initialized > > > to false and set it to true in perf_evlist__set_maps(). > > > > > > > That sounds fine. > > Thanks! > > Arnaldo, how do you want to handle it? I can send v2 for the > whole series, but I think you already applied it. Then do you > want me to send this change on top? Send v2 for the whole series, I haven't yet published it so I can replace. - Arnaldo ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2022-10-06 23:22 UTC | newest] Thread overview: 28+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-10-03 20:46 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v3) Namhyung Kim 2022-10-03 20:46 ` [PATCH 1/5] libperf: Populate system-wide evsel maps Namhyung Kim 2022-10-06 18:43 ` Ian Rogers 2022-10-06 23:13 ` Namhyung Kim 2022-10-03 20:46 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim 2022-10-04 12:14 ` Arnaldo Carvalho de Melo 2022-10-04 13:55 ` Adrian Hunter 2022-10-06 18:52 ` Ian Rogers 2022-10-06 23:21 ` Namhyung Kim 2022-10-03 20:46 ` [PATCH 3/5] perf tools: Get rid of evlist__add_on_all_cpus() Namhyung Kim 2022-10-03 20:46 ` [PATCH 4/5] perf tools: Add evlist__add_sched_switch() Namhyung Kim 2022-10-03 20:46 ` [PATCH 5/5] perf tools: Remove special handling of system-wide evsel Namhyung Kim 2022-10-06 18:57 ` Ian Rogers -- strict thread matches above, loose matches on Subject: below -- 2022-09-30 17:27 [PATCH 0/5] perf tools: Clean up cpu map handling for system-wide evsel (v2) Namhyung Kim 2022-09-30 17:27 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim 2022-10-03 5:36 ` Adrian Hunter 2022-10-04 5:20 ` Adrian Hunter 2022-09-24 16:57 [PATCH 0/5] perf tools: Clean up cpu map handling for a system-wide evsel (v1) Namhyung Kim 2022-09-24 16:57 ` [PATCH 2/5] libperf: Propagate maps only if necessary Namhyung Kim 2022-09-27 7:06 ` Adrian Hunter 2022-09-27 17:28 ` Namhyung Kim 2022-09-28 7:53 ` Adrian Hunter 2022-09-28 23:46 ` Namhyung Kim 2022-09-29 2:07 ` Ian Rogers 2022-09-29 5:09 ` Namhyung Kim 2022-09-29 5:18 ` Adrian Hunter 2022-09-29 20:42 ` Namhyung Kim 2022-09-30 12:49 ` Adrian Hunter 2022-09-30 16:44 ` Namhyung Kim 2022-09-30 16:56 ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).