Re: [PATCH] perf tools topdown: Fix incorrect topdown slots regroup

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>
To: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Dapeng Mi <dapeng1.mi@intel.com>,
	Xudong Hao <xudong.hao@intel.com>
Subject: Re: [PATCH] perf tools topdown: Fix incorrect topdown slots regroup
Date: Mon, 25 Aug 2025 08:54:16 +0800	[thread overview]
Message-ID: <03e30fdd-0624-4fa0-a187-cd870e5d3694@linux.intel.com> (raw)
In-Reply-To: <CAP-5=fW6N+eE0KTyV7F4wm=KBwk46QbXjwwG9POtZxEDhbRqRQ@mail.gmail.com>


On 8/23/2025 1:35 AM, Ian Rogers wrote:
> On Fri, Aug 22, 2025 at 1:23 AM Dapeng Mi <dapeng1.mi@linux.intel.com> wrote:
>> When running the command "perf stat -e "slots,slots" -C0 sleep 1", we
>> see below error.
>>
>> perf stat -e "slots,slots" -C0 sleep 1
>> WARNING: events were regrouped to match PMUs
>>  Performance counter stats for 'CPU(s) 0':
>>      <not counted>      slots
>>    <not supported>      slots
>>
>>      1.002265769 seconds time elapsed
>>
>> The topdown slots events are not correctly counted. The root cause is
>> that perf tools incorrectly regroup these 2 slots events into a group.
>> If there are only topdown slots events but no topdown metrics events,
>> the regroup should not be done since topdown slots event can only be
>> programed on fixed counter 3 and multiple slots events can only be
>> multiplexed to run on fixed counter 3, but grouping them blocks
>> multiplexing.
>>
>> So avoid to regroup topdown slots events if there is no topdown metrics
>> events.
>>
>> With this change, above command can be run successfully.
>>
>> perf stat -e "slots,slots" -C0 sleep 1
>>  Performance counter stats for 'CPU(s) 0':
>>        103,973,791      slots
>>        106,488,170      slots
>>
>>        1.003517284 seconds time elapsed
>>
>> Besides, run perf stats/record test on SPR and PTL, both passed.
>>
>> Reported-by: Xudong Hao <xudong.hao@intel.com>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> I don't think we should do this and if we were to do it we shouldn't
> do it in the common code. The perf metrics requiring a slots event is
> a massive mess that never seems to end. What should we do with:
> ```
> $ perf stat -e "topdown-fe-bound,topdown-fe-bound" true
>
> Performance counter stats for 'true':
>
>     <not counted>      slots
>     <not counted>      topdown-fe-bound
>   <not supported>      topdown-fe-bound
>
>       0.000960472 seconds time elapsed
>
>       0.001060000 seconds user
>       0.000000000 seconds sys
>
>
> Some events weren't counted. Try disabling the NMI watchdog:
>        echo 0 > /proc/sys/kernel/nmi_watchdog
>        perf stat ...
>        echo 1 > /proc/sys/kernel/nmi_watchdog
> ```
>
> We've injected the slots event but the repeated topdown-fe-bound
> causes the group to fail in a similar way. Why is repeating slots a
> case we care about more?
> Were we to say, okay slots is more special than the other perf metric
> events then I'd prefer arch_evsel__must_be_in_group to return false
> for the slots event when there are no other perf metric events in the
> evlist. But then what do you do if the slots event is in a different
> group like:
> ```
> $ perf stat -e "slots,{slots,topdown-fe-bound}" true
> ```
> It is pretty easy to teach the code to deduplicate events, but then
> again, what about the group behavior?
> It is not clear to me we can ever clean up all the mess that the perf
> metric events on p-cores create and I'm in favor of having the code be
> no more complex than it needs to be, so I'm not sure we should be
> solving this problem.

Ian, thanks for reviewing. Yeah, I agree what you said "topdown events are
a massive mess", we could never solve all issues. But it's annoying for
users that it reports "not counted" instead of an error. Is it possible
that we take a step back and don't try to resolve this issue and just
report an error (maybe plus a message) to warn users that multiple same
topdown events are not allowed? Thanks. 


>
> Thanks,
> Ian
>
>> ---
>>  tools/perf/arch/x86/util/topdown.h |  2 --
>>  tools/perf/util/evsel.c            | 10 ++++++++++
>>  tools/perf/util/evsel.h            |  2 ++
>>  tools/perf/util/parse-events.c     | 11 +++++++++++
>>  4 files changed, 23 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/arch/x86/util/topdown.h b/tools/perf/arch/x86/util/topdown.h
>> index 69035565e649..6a917b2066f7 100644
>> --- a/tools/perf/arch/x86/util/topdown.h
>> +++ b/tools/perf/arch/x86/util/topdown.h
>> @@ -8,8 +8,6 @@ struct evsel;
>>  struct list_head;
>>
>>  bool topdown_sys_has_perf_metrics(void);
>> -bool arch_is_topdown_slots(const struct evsel *evsel);
>> -bool arch_is_topdown_metrics(const struct evsel *evsel);
>>  int topdown_insert_slots_event(struct list_head *list, int idx, struct evsel *metric_event);
>>
>>  #endif
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index d264c143b592..6aaae1ac026e 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -3965,6 +3965,16 @@ int evsel__source_count(const struct evsel *evsel)
>>         return count;
>>  }
>>
>> +bool __weak arch_is_topdown_slots(const struct evsel *evsel __maybe_unused)
>> +{
>> +       return false;
>> +}
>> +
>> +bool __weak arch_is_topdown_metrics(const struct evsel *evsel __maybe_unused)
>> +{
>> +       return false;
>> +}
>> +
>>  bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused)
>>  {
>>         return false;
>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>> index 5797a02e5d6a..33f8aab675a9 100644
>> --- a/tools/perf/util/evsel.h
>> +++ b/tools/perf/util/evsel.h
>> @@ -556,6 +556,8 @@ void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
>>  int evsel__source_count(const struct evsel *evsel);
>>  void evsel__remove_from_group(struct evsel *evsel, struct evsel *leader);
>>
>> +bool arch_is_topdown_slots(const struct evsel *evsel);
>> +bool arch_is_topdown_metrics(const struct evsel *evsel);
>>  bool arch_evsel__must_be_in_group(const struct evsel *evsel);
>>
>>  bool evsel__set_needs_uniquify(struct evsel *counter, const struct perf_stat_config *config);
>> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
>> index 8282ddf68b98..bd09fc47ea90 100644
>> --- a/tools/perf/util/parse-events.c
>> +++ b/tools/perf/util/parse-events.c
>> @@ -2127,6 +2127,8 @@ static int parse_events__sort_events_and_fix_groups(struct list_head *list)
>>         int ret;
>>         struct evsel *force_grouped_leader = NULL;
>>         bool last_event_was_forced_leader = false;
>> +       bool has_slots = false;
>> +       bool has_metrics = false;
>>
>>         /* On x86 topdown metrics events require a slots event. */
>>         ret = arch_evlist__add_required_events(list);
>> @@ -2147,6 +2149,11 @@ static int parse_events__sort_events_and_fix_groups(struct list_head *list)
>>                 if (pos == pos_leader)
>>                         orig_num_leaders++;
>>
>> +               if (!has_slots)
>> +                       has_slots = arch_is_topdown_slots(pos);
>> +               if (!has_metrics)
>> +                       has_metrics = arch_is_topdown_metrics(pos);
>> +
>>                 /*
>>                  * Ensure indexes are sequential, in particular for multiple
>>                  * event lists being merged. The indexes are used to detect when
>> @@ -2163,6 +2170,10 @@ static int parse_events__sort_events_and_fix_groups(struct list_head *list)
>>                         force_grouped_idx = pos_leader->core.idx;
>>         }
>>
>> +       /* Don't regroup if there are only topdown slots events. */
>> +       if (force_grouped_idx != -1 && has_slots && !has_metrics)
>> +               force_grouped_idx = -1;
>> +
>>         /* Sort events. */
>>         list_sort(&force_grouped_idx, list, evlist__cmp);
>>
>>
>> base-commit: 6235ce77749f45cac27f630337e2fdf04e8a6c73
>> --
>> 2.34.1
>>

next prev parent reply	other threads:[~2025-08-25  0:54 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-22  8:22 [PATCH] perf tools topdown: Fix incorrect topdown slots regroup Dapeng Mi
2025-08-22 17:35 ` Ian Rogers
2025-08-25  0:54   ` Mi, Dapeng [this message]
2025-08-25 15:29     ` Ian Rogers
2025-08-25 15:54       ` Ian Rogers
2025-08-25 21:13         ` Ian Rogers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=03e30fdd-0624-4fa0-a187-cd870e5d3694@linux.intel.com \
    --to=dapeng1.mi@linux.intel.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=dapeng1.mi@intel.com \
    --cc=irogers@google.com \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=xudong.hao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).