From: Namhyung Kim <namhyung@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: "Mi, Dapeng" <dapeng1.mi@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	James Clark <james.clark@linaro.org>, Xu Yang <xu.yang_2@nxp.com>,
	Chun-Tse Shao <ctshao@google.com>,
	Thomas Richter <tmricht@linux.ibm.com>,
	Sumanth Korikkar <sumanthk@linux.ibm.com>,
	Collin Funk <collin.funk1@gmail.com>,
	Thomas Falcon <thomas.falcon@intel.com>,
	Howard Chu <howardchu95@gmail.com>,
	Levi Yun <yeoreum.yun@arm.com>,
	Yang Li <yang.lee@linux.alibaba.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Andi Kleen <ak@linux.intel.com>,
	Weilin Wang <weilin.wang@intel.com>
Subject: Re: [PATCH v3 01/18] perf metricgroup: Add care to picking the evsel for displaying a metric
Date: Tue, 11 Nov 2025 11:05:59 -0800
Message-ID: <aROJF9GjJUv-w5Wg@google.com>
In-Reply-To: <CAP-5=fWWxbRS4D1GsPvSgr32cfCGaT68qV0Q6-FLQ90R-bhH3w@mail.gmail.com>

On Tue, Nov 11, 2025 at 09:20:30AM -0800, Ian Rogers wrote:
> On Tue, Nov 11, 2025 at 12:15 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
> >
> >
> > On 11/11/2025 12:04 PM, Ian Rogers wrote:
> > > Rather than using the first evsel in the matched events, try to find
> > > the least shared non-tool evsel. The aim is to pick the first evsel
> > > that typifies the metric within the list of metrics.
> > >
> > > This addresses an issue where Default metric group metrics may lose
> > > their counter value due to how the stat displaying hides counters for
> > > default event/metric output.
> > >
> > > For a metricgroup like TopdownL1 on an Intel Alderlake, before the
> > > change there are 4 events with metrics:
> > > ```
> > > $ perf stat -M topdownL1 -a sleep 1
> > >
> > >  Performance counter stats for 'system wide':
> > >
> > >      7,782,334,296      cpu_core/TOPDOWN.SLOTS/          #     10.4 %  tma_bad_speculation
> > >                                                   #     19.7 %  tma_frontend_bound
> > >      2,668,927,977      cpu_core/topdown-retiring/       #     35.7 %  tma_backend_bound
> > >                                                   #     34.1 %  tma_retiring
> > >        803,623,987      cpu_core/topdown-bad-spec/
> > >        167,514,386      cpu_core/topdown-heavy-ops/
> > >      1,555,265,776      cpu_core/topdown-fe-bound/
> > >      2,792,733,013      cpu_core/topdown-be-bound/
> > >        279,769,310      cpu_atom/TOPDOWN_RETIRING.ALL/   #     12.2 %  tma_retiring
> > >                                                   #     15.1 %  tma_bad_speculation
> > >        457,917,232      cpu_atom/CPU_CLK_UNHALTED.CORE/  #     38.4 %  tma_backend_bound
> > >                                                   #     34.2 %  tma_frontend_bound
> > >        783,519,226      cpu_atom/TOPDOWN_FE_BOUND.ALL/
> > >         10,790,192      cpu_core/INT_MISC.UOP_DROPPING/
> > >        879,845,633      cpu_atom/TOPDOWN_BE_BOUND.ALL/
> > > ```
> > >
> > > After the change there are 6 events with metrics:
> > > ```
> > > $ perf stat -M topdownL1 -a sleep 1
> > >
> > >  Performance counter stats for 'system wide':
> > >
> > >      2,377,551,258      cpu_core/TOPDOWN.SLOTS/          #      7.9 %  tma_bad_speculation
> > >                                                   #     36.4 %  tma_frontend_bound
> > >        480,791,142      cpu_core/topdown-retiring/       #     35.5 %  tma_backend_bound
> > >        186,323,991      cpu_core/topdown-bad-spec/
> > >         65,070,590      cpu_core/topdown-heavy-ops/      #     20.1 %  tma_retiring
> > >        871,733,444      cpu_core/topdown-fe-bound/
> > >        848,286,598      cpu_core/topdown-be-bound/
> > >        260,936,456      cpu_atom/TOPDOWN_RETIRING.ALL/   #     12.4 %  tma_retiring
> > >                                                   #     17.6 %  tma_bad_speculation
> > >        419,576,513      cpu_atom/CPU_CLK_UNHALTED.CORE/
> > >        797,132,597      cpu_atom/TOPDOWN_FE_BOUND.ALL/   #     38.0 %  tma_frontend_bound
> > >          3,055,447      cpu_core/INT_MISC.UOP_DROPPING/
> > >        671,014,164      cpu_atom/TOPDOWN_BE_BOUND.ALL/   #     32.0 %  tma_backend_bound
> > > ```
> >
> > It looks like the output of the cpu_core and cpu_atom events is mixed
> > together, e.g. "cpu_core/INT_MISC.UOP_DROPPING/". Could we re-sort the
> > events and separate the cpu_core and cpu_atom event output? It would make
> > the output more readable. Thanks.
> 
> So the metrics are tagged so as not to group the events:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/arch/x86/alderlake/adl-metrics.json?h=perf-tools-next#n117
> Running with each metric causes the output to be:
> ```
> $ perf stat -M tma_bad_speculation,tma_backend_bound,tma_frontend_bound,tma_retiring -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>      1,615,145,897      cpu_core/TOPDOWN.SLOTS/          #      8.1 %  tma_bad_speculation
>                                                   #     42.5 %  tma_frontend_bound       (49.89%)
>        243,037,087      cpu_core/topdown-retiring/       #     34.5 %  tma_backend_bound        (49.89%)
>        129,341,306      cpu_core/topdown-bad-spec/                                              (49.89%)
>          2,679,894      cpu_core/INT_MISC.UOP_DROPPING/                                         (49.89%)
>        696,940,348      cpu_core/topdown-fe-bound/                                              (49.89%)
>        563,319,011      cpu_core/topdown-be-bound/                                              (49.89%)
>      1,795,034,847      cpu_core/slots/                                                         (50.11%)
>        262,140,961      cpu_core/topdown-retiring/                                              (50.11%)
>         44,589,349      cpu_core/topdown-heavy-ops/      #     14.4 %  tma_retiring             (50.11%)
>        160,987,341      cpu_core/topdown-bad-spec/                                              (50.11%)
>        778,250,364      cpu_core/topdown-fe-bound/                                              (50.11%)
>        622,499,674      cpu_core/topdown-be-bound/                                              (50.11%)
>         90,849,750      cpu_atom/TOPDOWN_RETIRING.ALL/   #      8.1 %  tma_retiring
>                                                   #     17.2 %  tma_bad_speculation
>        223,878,243      cpu_atom/CPU_CLK_UNHALTED.CORE/
>        423,068,733      cpu_atom/TOPDOWN_FE_BOUND.ALL/   #     37.8 %  tma_frontend_bound
>        413,413,499      cpu_atom/TOPDOWN_BE_BOUND.ALL/   #     36.9 %  tma_backend_bound
> ```
> so you can see that it is the effect of not grouping the events that
> leads to the cpu_core and cpu_atom split.
> 
> The code that does sorting/fixing/adding of events, primarily to fix
> topdown, is parse_events__sort_events_and_fix_groups:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.c?h=perf-tools-next#n2030
> but I've tried to make that code respect the incoming evsel list order
> because if a user specifies an order then they generally expect it to
> be respected (unless invalid or because of topdown events). For
> --metric-only the event order doesn't really matter.
> 
> Anyway, I think trying to fix this is out of scope for this patch
> series, although I agree with you about the readability. The behavior
> here matches old behavior such as:
> ```
> $ perf --version
> perf version 6.16.12
> $ perf stat -M TopdownL1 -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>     11,086,754,658      cpu_core/TOPDOWN.SLOTS/          #     27.1 %  tma_backend_bound
>                                                   #      7.5 %  tma_bad_speculation
>                                                   #     36.5 %  tma_frontend_bound
>                                                   #     28.9 %  tma_retiring
>      3,219,475,010      cpu_core/topdown-retiring/
>        820,655,931      cpu_core/topdown-bad-spec/
>        418,883,912      cpu_core/topdown-heavy-ops/
>      4,082,884,459      cpu_core/topdown-fe-bound/
>      3,012,532,414      cpu_core/topdown-be-bound/
>      1,030,171,196      cpu_atom/TOPDOWN_RETIRING.ALL/   #     17.4 %  tma_retiring
>                                                   #     16.5 %  tma_bad_speculation
>      1,185,093,601      cpu_atom/CPU_CLK_UNHALTED.CORE/  #     29.8 %  tma_backend_bound
>                                                   #     36.4 %  tma_frontend_bound
>      2,154,914,153      cpu_atom/TOPDOWN_FE_BOUND.ALL/
>         14,988,684      cpu_core/INT_MISC.UOP_DROPPING/
>      1,763,486,868      cpu_atom/TOPDOWN_BE_BOUND.ALL/
> 
>        1.004103365 seconds time elapsed
> ```
> i.e. the mixing of cpu_core and cpu_atom events isn't a regression
> introduced here. There isn't a simple fix for the ordering, as we
> don't want to mess up the non-metric cases. If you think it can be
> done differently, I'm happy to make a change.

Agreed, and it should be handled in a separate patch (series).  Let's fix
problems one at a time.

Thanks,
Namhyung
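
For readers following the thread, here is a rough sketch of the "least
shared non-tool evsel" idea from the patch description. It is not the
perf implementation: the Event class, pick_display_event() and the
per-metric event lists are all made up for illustration, and only the
event names are borrowed from the cpu_atom output quoted above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    is_tool: bool = False  # tool events (e.g. duration_time) are skipped


def pick_display_event(metric_events, all_metrics):
    """Pick the first non-tool event used by the fewest metrics."""
    # Count how many metrics reference each event (once per metric).
    usage = Counter(
        name
        for events in all_metrics.values()
        for name in {ev.name for ev in events}
    )
    candidates = [ev for ev in metric_events if not ev.is_tool]
    if not candidates:
        return metric_events[0]  # nothing better: fall back to the first event
    # min() returns the first minimal element, so list order breaks ties.
    return min(candidates, key=lambda ev: usage[ev.name])


# Hypothetical event lists; the real metric formulas are not shown in this thread.
CLK = Event("cpu_atom/CPU_CLK_UNHALTED.CORE/")
RET = Event("cpu_atom/TOPDOWN_RETIRING.ALL/")
FE = Event("cpu_atom/TOPDOWN_FE_BOUND.ALL/")
BE = Event("cpu_atom/TOPDOWN_BE_BOUND.ALL/")
TIME = Event("duration_time", is_tool=True)

metrics = {
    "tma_retiring": [CLK, RET],
    "tma_frontend_bound": [TIME, CLK, FE],
    "tma_backend_bound": [CLK, BE],
}

for name, events in metrics.items():
    print(name, "->", pick_display_event(events, metrics).name)
```

With these made-up lists, always taking the first event would attach all
three metrics to CPU_CLK_UNHALTED.CORE (or to the tool event), whereas the
least-shared choice puts each metric next to its own event.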


