linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	James Clark <james.clark@linaro.org>, Xu Yang <xu.yang_2@nxp.com>,
	Chun-Tse Shao <ctshao@google.com>,
	Thomas Richter <tmricht@linux.ibm.com>,
	Sumanth Korikkar <sumanthk@linux.ibm.com>,
	Collin Funk <collin.funk1@gmail.com>,
	Thomas Falcon <thomas.falcon@intel.com>,
	Howard Chu <howardchu95@gmail.com>,
	Dapeng Mi <dapeng1.mi@linux.intel.com>,
	Levi Yun <yeoreum.yun@arm.com>,
	Yang Li <yang.lee@linux.alibaba.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
Subject: Re: [PATCH v1 05/22] perf metricgroup: Add care to picking the evsel for displaying a metric
Date: Wed, 5 Nov 2025 22:03:18 -0800	[thread overview]
Message-ID: <aQw6Jje7bSpywGqq@google.com> (raw)
In-Reply-To: <CAP-5=fXmT-1io1k1O5d2QMken9aJmGTiuDEz2f-7k8-n=GCWtA@mail.gmail.com>

On Mon, Nov 03, 2025 at 09:28:44PM -0800, Ian Rogers wrote:
> On Mon, Nov 3, 2025 at 8:52 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Fri, Oct 24, 2025 at 10:58:40AM -0700, Ian Rogers wrote:
> > > Rather than using the first evsel in the matched events, try to find
> > > the least shared non-tool evsel. The aim is to pick the first evsel
> > > that typifies the metric within the list of metrics.
> > >
> > > This addresses an issue where Default metric group metrics may lose
> > > their counter value due to how the stat displaying hides counters for
> > > default event/metric output.
> >
> > Do you have a command line example to show impact of this change?
> 
> You can just run a Topdown metricgroup on Intel to see differences,

Ok, before this change.

  $ perf stat -M topdownL1 true
  
   Performance counter stats for 'true':
  
           7,754,275      TOPDOWN.SLOTS                    #     37.1 %  tma_backend_bound      
                                                    #     38.7 %  tma_frontend_bound     
                                                    #      8.8 %  tma_bad_speculation    
                                                    #     15.3 %  tma_retiring           
           1,185,947      topdown-retiring                                                      
           3,010,483      topdown-fe-bound                                                      
           2,828,029      topdown-be-bound                                                      
             729,814      topdown-bad-spec                                                      
               9,987      INT_MISC.CLEARS_COUNT                                                 
             221,405      IDQ.MS_UOPS                                                           
               6,352      INT_MISC.UOP_DROPPING                                                 
           1,212,644      UOPS_RETIRED.SLOTS                                                    
             119,895      UOPS_DECODED.DEC0                                                     
              60,975      cpu/UOPS_DECODED.DEC0,cmask=1/                                        
           1,639,442      UOPS_ISSUED.ANY                                                       
             820,982      IDQ.MITE_UOPS                                                         
  
         0.001172956 seconds time elapsed
  
         0.001269000 seconds user
         0.000000000 seconds sys


And with this change, it does look better.

  $ perf stat -M topdownL1 true
  
   Performance counter stats for 'true':
  
           7,977,430      TOPDOWN.SLOTS                                                         
           1,188,793      topdown-retiring                                                      
           3,159,687      topdown-fe-bound                                                      
           2,940,699      topdown-be-bound                                                      
             688,248      topdown-bad-spec                                                      
               9,749      INT_MISC.CLEARS_COUNT            #     37.5 %  tma_backend_bound      
                                                    #      8.1 %  tma_bad_speculation    
             219,145      IDQ.MS_UOPS                      #     14.9 %  tma_retiring           
               6,188      INT_MISC.UOP_DROPPING            #     39.5 %  tma_frontend_bound     
           1,205,712      UOPS_RETIRED.SLOTS                                                    
             117,505      UOPS_DECODED.DEC0                                                     
              59,891      cpu/UOPS_DECODED.DEC0,cmask=1/                                        
           1,625,232      UOPS_ISSUED.ANY                                                       
             805,560      IDQ.MITE_UOPS                                                         
  
         0.001629344 seconds time elapsed
  
         0.001672000 seconds user
         0.000000000 seconds sys

> but they are minor. The main impact is on the Default legacy metrics
> as those have a counter then a metric, but without this change you get
> everything grouped on the cpu-clock event and the formatting gets broken.

Do you mean with other changes in this series?  I don't see any
differences in the output just after this patch..

Before:

  $ perf stat -a true
  
   Performance counter stats for 'system wide':
  
          19,078,719      cpu-clock                        #    7.256 CPUs utilized             
                  94      context-switches                 #    4.927 K/sec                     
                  14      cpu-migrations                   #  733.802 /sec                      
                  61      page-faults                      #    3.197 K/sec                     
          43,304,957      instructions                     #    1.10  insn per cycle            
          39,281,107      cycles                           #    2.059 GHz                       
           5,012,071      branches                         #  262.705 M/sec                     
             128,358      branch-misses                    #    2.56% of all branches           
   #     24.4 %  tma_retiring           
   #     33.7 %  tma_backend_bound      
                                                    #      5.9 %  tma_bad_speculation    
   #     36.0 %  tma_frontend_bound     
  
         0.002629534 seconds time elapsed
  
After:
  
  $ perf stat -a true
  
   Performance counter stats for 'system wide':
  
           6,201,661      cpu-clock                        #    3.692 CPUs utilized             
                  24      context-switches                 #    3.870 K/sec                     
                   7      cpu-migrations                   #    1.129 K/sec                     
                  60      page-faults                      #    9.675 K/sec                     
          11,458,681      instructions                     #    1.07  insn per cycle            
          10,704,978      cycles                           #    1.726 GHz                       
           2,457,704      branches                         #  396.298 M/sec                     
              54,553      branch-misses                    #    2.22% of all branches           
   #     21.4 %  tma_retiring           
   #     36.1 %  tma_backend_bound      
                                                    #     10.2 %  tma_bad_speculation    
   #     32.3 %  tma_frontend_bound     
  
         0.001679679 seconds time elapsed
  
Thanks,
Namhyung


> As --metric-only is popular when looking at a group of events
> and the Default legacy metrics are added in subsequent changes it
> didn't seem right to include the output (it either shows broken things
> keeping to be somewhat broken or output from later patches).
> 
> Thanks,
> Ian
> 
> > Thanks,
> > Namhyung
> >
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > >  tools/perf/util/metricgroup.c | 48 ++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 47 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> > > index 48936e517803..76092ee26761 100644
> > > --- a/tools/perf/util/metricgroup.c
> > > +++ b/tools/perf/util/metricgroup.c
> > > @@ -1323,6 +1323,51 @@ static int parse_ids(bool metric_no_merge, bool fake_pmu,
> > >       return ret;
> > >  }
> > >
> > > +/* How many times will a given evsel be used in a set of metrics? */
> > > +static int count_uses(struct list_head *metric_list, struct evsel *evsel)
> > > +{
> > > +     const char *metric_id = evsel__metric_id(evsel);
> > > +     struct metric *m;
> > > +     int uses = 0;
> > > +
> > > +     list_for_each_entry(m, metric_list, nd) {
> > > +             if (hashmap__find(m->pctx->ids, metric_id, NULL))
> > > +                     uses++;
> > > +     }
> > > +     return uses;
> > > +}
> > > +
> > > +/*
> > > + * Select the evsel that stat-display will use to trigger shadow/metric
> > > + * printing. Pick the least shared non-tool evsel, encouraging metrics to be
> > > + * with a hardware counter that is specific to them.
> > > + */
> > > +static struct evsel *pick_display_evsel(struct list_head *metric_list,
> > > +                                     struct evsel **metric_events)
> > > +{
> > > +     struct evsel *selected = metric_events[0];
> > > +     size_t selected_uses;
> > > +     bool selected_is_tool;
> > > +
> > > +     if (!selected)
> > > +             return NULL;
> > > +
> > > +     selected_uses = count_uses(metric_list, selected);
> > > +     selected_is_tool = evsel__is_tool(selected);
> > > +     for (int i = 1; metric_events[i]; i++) {
> > > +             struct evsel *candidate = metric_events[i];
> > > +             size_t candidate_uses = count_uses(metric_list, candidate);
> > > +
> > > +             if ((selected_is_tool && !evsel__is_tool(candidate)) ||
> > > +                 (candidate_uses < selected_uses)) {
> > > +                     selected = candidate;
> > > +                     selected_uses = candidate_uses;
> > > +                     selected_is_tool = evsel__is_tool(selected);
> > > +             }
> > > +     }
> > > +     return selected;
> > > +}
> > > +
> > >  static int parse_groups(struct evlist *perf_evlist,
> > >                       const char *pmu, const char *str,
> > >                       bool metric_no_group,
> > > @@ -1430,7 +1475,8 @@ static int parse_groups(struct evlist *perf_evlist,
> > >                       goto out;
> > >               }
> > >
> > > -             me = metricgroup__lookup(&perf_evlist->metric_events, metric_events[0],
> > > +             me = metricgroup__lookup(&perf_evlist->metric_events,
> > > +                                      pick_display_evsel(&metric_list, metric_events),
> > >                                        /*create=*/true);
> > >
> > >               expr = malloc(sizeof(struct metric_expr));
> > > --
> > > 2.51.1.821.gb6fe4d2222-goog
> > >

  reply	other threads:[~2025-11-06  6:03 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-24 17:58 [PATCH v1 00/22] Switch the default perf stat metrics to json Ian Rogers
2025-10-24 17:58 ` [PATCH v1 01/22] perf evsel: Remove unused metric_events variable Ian Rogers
2025-10-24 17:58 ` [PATCH v1 02/22] perf metricgroup: Update comment on location of metric_event list Ian Rogers
2025-10-24 17:58 ` [PATCH v1 03/22] perf metricgroup: Missed free on error path Ian Rogers
2025-10-24 17:58 ` [PATCH v1 04/22] perf metricgroup: When copy metrics copy default information Ian Rogers
2025-10-24 17:58 ` [PATCH v1 05/22] perf metricgroup: Add care to picking the evsel for displaying a metric Ian Rogers
2025-11-04  4:52   ` Namhyung Kim
2025-11-04  5:28     ` Ian Rogers
2025-11-06  6:03       ` Namhyung Kim [this message]
2025-11-06  6:42         ` Ian Rogers
2025-10-24 17:58 ` [PATCH v1 06/22] perf jevents: Make all tables static Ian Rogers
2025-10-24 17:58 ` [PATCH v1 07/22] perf expr: Add #target_cpu literal Ian Rogers
2025-11-04  4:56   ` Namhyung Kim
2025-11-06 18:43     ` Ian Rogers
2025-10-24 17:58 ` [PATCH v1 08/22] perf jevents: Add set of common metrics based on default ones Ian Rogers
2025-11-06  6:22   ` Namhyung Kim
2025-11-06 18:05     ` Ian Rogers
2025-10-24 17:58 ` [PATCH v1 09/22] perf jevents: Add metric DefaultShowEvents Ian Rogers
2025-10-24 17:58 ` [PATCH v1 10/22] perf stat: Add detail -d,-dd,-ddd metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 11/22] perf script: Change metric format to use json metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 12/22] perf stat: Remove hard coded shadow metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 13/22] perf stat: Fix default metricgroup display on hybrid Ian Rogers
2025-10-24 17:58 ` [PATCH v1 14/22] perf stat: Sort default events/metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 15/22] perf stat: Remove "unit" workarounds for metric-only Ian Rogers
2025-10-24 17:58 ` [PATCH v1 16/22] perf test stat+json: Improve metric-only testing Ian Rogers
2025-10-24 17:58 ` [PATCH v1 17/22] perf test stat: Ignore failures in Default[234] metricgroups Ian Rogers
2025-10-24 17:58 ` [PATCH v1 18/22] perf test stat: Update std_output testing metric expectations Ian Rogers
2025-10-24 17:58 ` [PATCH v1 19/22] perf test metrics: Update all metrics for possibly failing default metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 20/22] perf test stat: Update shadow test to use metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 21/22] perf test stat: Update test expectations and events Ian Rogers
2025-10-24 17:58 ` [PATCH v1 22/22] perf test stat csv: " Ian Rogers
2025-10-30 20:51 ` [PATCH v1 00/22] Switch the default perf stat metrics to json Ian Rogers
2025-11-03 17:05   ` Ian Rogers
2025-11-04  4:47 ` Namhyung Kim
2025-11-04  5:09   ` Ian Rogers
2025-11-06  5:29     ` Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aQw6Jje7bSpywGqq@google.com \
    --to=namhyung@kernel.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=collin.funk1@gmail.com \
    --cc=ctshao@google.com \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=howardchu95@gmail.com \
    --cc=irogers@google.com \
    --cc=james.clark@linaro.org \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=sumanthk@linux.ibm.com \
    --cc=thomas.falcon@intel.com \
    --cc=tmricht@linux.ibm.com \
    --cc=xu.yang_2@nxp.com \
    --cc=yang.lee@linux.alibaba.com \
    --cc=yeoreum.yun@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).