All of lore.kernel.org
 help / color / mirror / Atom feed
From: Namhyung Kim <namhyung@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	James Clark <james.clark@linaro.org>, Xu Yang <xu.yang_2@nxp.com>,
	Chun-Tse Shao <ctshao@google.com>,
	Thomas Richter <tmricht@linux.ibm.com>,
	Sumanth Korikkar <sumanthk@linux.ibm.com>,
	Collin Funk <collin.funk1@gmail.com>,
	Thomas Falcon <thomas.falcon@intel.com>,
	Howard Chu <howardchu95@gmail.com>,
	Dapeng Mi <dapeng1.mi@linux.intel.com>,
	Levi Yun <yeoreum.yun@arm.com>,
	Yang Li <yang.lee@linux.alibaba.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
Subject: Re: [PATCH v1 05/22] perf metricgroup: Add care to picking the evsel for displaying a metric
Date: Wed, 5 Nov 2025 22:03:18 -0800	[thread overview]
Message-ID: <aQw6Jje7bSpywGqq@google.com> (raw)
In-Reply-To: <CAP-5=fXmT-1io1k1O5d2QMken9aJmGTiuDEz2f-7k8-n=GCWtA@mail.gmail.com>

On Mon, Nov 03, 2025 at 09:28:44PM -0800, Ian Rogers wrote:
> On Mon, Nov 3, 2025 at 8:52 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Fri, Oct 24, 2025 at 10:58:40AM -0700, Ian Rogers wrote:
> > > Rather than using the first evsel in the matched events, try to find
> > > the least shared non-tool evsel. The aim is to pick the first evsel
> > > that typifies the metric within the list of metrics.
> > >
> > > This addresses an issue where Default metric group metrics may lose
> > > their counter value due to how the stat displaying hides counters for
> > > default event/metric output.
> >
> > Do you have a command line example to show impact of this change?
> 
> You can just run a Topdown metricgroup on Intel to see differences,

Ok, before this change.

  $ perf stat -M topdownL1 true
  
   Performance counter stats for 'true':
  
           7,754,275      TOPDOWN.SLOTS                    #     37.1 %  tma_backend_bound      
                                                    #     38.7 %  tma_frontend_bound     
                                                    #      8.8 %  tma_bad_speculation    
                                                    #     15.3 %  tma_retiring           
           1,185,947      topdown-retiring                                                      
           3,010,483      topdown-fe-bound                                                      
           2,828,029      topdown-be-bound                                                      
             729,814      topdown-bad-spec                                                      
               9,987      INT_MISC.CLEARS_COUNT                                                 
             221,405      IDQ.MS_UOPS                                                           
               6,352      INT_MISC.UOP_DROPPING                                                 
           1,212,644      UOPS_RETIRED.SLOTS                                                    
             119,895      UOPS_DECODED.DEC0                                                     
              60,975      cpu/UOPS_DECODED.DEC0,cmask=1/                                        
           1,639,442      UOPS_ISSUED.ANY                                                       
             820,982      IDQ.MITE_UOPS                                                         
  
         0.001172956 seconds time elapsed
  
         0.001269000 seconds user
         0.000000000 seconds sys


And with this change, it does look better.

  $ perf stat -M topdownL1 true
  
   Performance counter stats for 'true':
  
           7,977,430      TOPDOWN.SLOTS                                                         
           1,188,793      topdown-retiring                                                      
           3,159,687      topdown-fe-bound                                                      
           2,940,699      topdown-be-bound                                                      
             688,248      topdown-bad-spec                                                      
               9,749      INT_MISC.CLEARS_COUNT            #     37.5 %  tma_backend_bound      
                                                    #      8.1 %  tma_bad_speculation    
             219,145      IDQ.MS_UOPS                      #     14.9 %  tma_retiring           
               6,188      INT_MISC.UOP_DROPPING            #     39.5 %  tma_frontend_bound     
           1,205,712      UOPS_RETIRED.SLOTS                                                    
             117,505      UOPS_DECODED.DEC0                                                     
              59,891      cpu/UOPS_DECODED.DEC0,cmask=1/                                        
           1,625,232      UOPS_ISSUED.ANY                                                       
             805,560      IDQ.MITE_UOPS                                                         
  
         0.001629344 seconds time elapsed
  
         0.001672000 seconds user
         0.000000000 seconds sys

> but they are minor. The main impact is on the Default legacy metrics
> as those have a counter then a metric, but without this change you get
> everything grouped on the cpu-clock event and the formatting gets broken.

Do you mean with other changes in this series?  I don't see any
differences in the output just after this patch..

Before:

  $ perf stat -a true
  
   Performance counter stats for 'system wide':
  
          19,078,719      cpu-clock                        #    7.256 CPUs utilized             
                  94      context-switches                 #    4.927 K/sec                     
                  14      cpu-migrations                   #  733.802 /sec                      
                  61      page-faults                      #    3.197 K/sec                     
          43,304,957      instructions                     #    1.10  insn per cycle            
          39,281,107      cycles                           #    2.059 GHz                       
           5,012,071      branches                         #  262.705 M/sec                     
             128,358      branch-misses                    #    2.56% of all branches           
   #     24.4 %  tma_retiring           
   #     33.7 %  tma_backend_bound      
                                                    #      5.9 %  tma_bad_speculation    
   #     36.0 %  tma_frontend_bound     
  
         0.002629534 seconds time elapsed
  
After:
  
  $ perf stat -a true
  
   Performance counter stats for 'system wide':
  
           6,201,661      cpu-clock                        #    3.692 CPUs utilized             
                  24      context-switches                 #    3.870 K/sec                     
                   7      cpu-migrations                   #    1.129 K/sec                     
                  60      page-faults                      #    9.675 K/sec                     
          11,458,681      instructions                     #    1.07  insn per cycle            
          10,704,978      cycles                           #    1.726 GHz                       
           2,457,704      branches                         #  396.298 M/sec                     
              54,553      branch-misses                    #    2.22% of all branches           
   #     21.4 %  tma_retiring           
   #     36.1 %  tma_backend_bound      
                                                    #     10.2 %  tma_bad_speculation    
   #     32.3 %  tma_frontend_bound     
  
         0.001679679 seconds time elapsed
  
Thanks,
Namhyung


> As --metric-only is popular when looking at a group of events
> and the Default legacy metrics are added in subsequent changes it
> didn't seem right to include the output (it either shows broken things
> keeping to be somewhat broken or output from later patches).
> 
> Thanks,
> Ian
> 
> > Thanks,
> > Namhyung
> >
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > >  tools/perf/util/metricgroup.c | 48 ++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 47 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> > > index 48936e517803..76092ee26761 100644
> > > --- a/tools/perf/util/metricgroup.c
> > > +++ b/tools/perf/util/metricgroup.c
> > > @@ -1323,6 +1323,51 @@ static int parse_ids(bool metric_no_merge, bool fake_pmu,
> > >       return ret;
> > >  }
> > >
> > > +/* How many times will a given evsel be used in a set of metrics? */
> > > +static int count_uses(struct list_head *metric_list, struct evsel *evsel)
> > > +{
> > > +     const char *metric_id = evsel__metric_id(evsel);
> > > +     struct metric *m;
> > > +     int uses = 0;
> > > +
> > > +     list_for_each_entry(m, metric_list, nd) {
> > > +             if (hashmap__find(m->pctx->ids, metric_id, NULL))
> > > +                     uses++;
> > > +     }
> > > +     return uses;
> > > +}
> > > +
> > > +/*
> > > + * Select the evsel that stat-display will use to trigger shadow/metric
> > > + * printing. Pick the least shared non-tool evsel, encouraging metrics to be
> > > + * with a hardware counter that is specific to them.
> > > + */
> > > +static struct evsel *pick_display_evsel(struct list_head *metric_list,
> > > +                                     struct evsel **metric_events)
> > > +{
> > > +     struct evsel *selected = metric_events[0];
> > > +     size_t selected_uses;
> > > +     bool selected_is_tool;
> > > +
> > > +     if (!selected)
> > > +             return NULL;
> > > +
> > > +     selected_uses = count_uses(metric_list, selected);
> > > +     selected_is_tool = evsel__is_tool(selected);
> > > +     for (int i = 1; metric_events[i]; i++) {
> > > +             struct evsel *candidate = metric_events[i];
> > > +             size_t candidate_uses = count_uses(metric_list, candidate);
> > > +
> > > +             if ((selected_is_tool && !evsel__is_tool(candidate)) ||
> > > +                 (candidate_uses < selected_uses)) {
> > > +                     selected = candidate;
> > > +                     selected_uses = candidate_uses;
> > > +                     selected_is_tool = evsel__is_tool(selected);
> > > +             }
> > > +     }
> > > +     return selected;
> > > +}
> > > +
> > >  static int parse_groups(struct evlist *perf_evlist,
> > >                       const char *pmu, const char *str,
> > >                       bool metric_no_group,
> > > @@ -1430,7 +1475,8 @@ static int parse_groups(struct evlist *perf_evlist,
> > >                       goto out;
> > >               }
> > >
> > > -             me = metricgroup__lookup(&perf_evlist->metric_events, metric_events[0],
> > > +             me = metricgroup__lookup(&perf_evlist->metric_events,
> > > +                                      pick_display_evsel(&metric_list, metric_events),
> > >                                        /*create=*/true);
> > >
> > >               expr = malloc(sizeof(struct metric_expr));
> > > --
> > > 2.51.1.821.gb6fe4d2222-goog
> > >

  reply	other threads:[~2025-11-06  6:03 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-24 17:58 [PATCH v1 00/22] Switch the default perf stat metrics to json Ian Rogers
2025-10-24 17:58 ` [PATCH v1 01/22] perf evsel: Remove unused metric_events variable Ian Rogers
2025-10-24 17:58 ` [PATCH v1 02/22] perf metricgroup: Update comment on location of metric_event list Ian Rogers
2025-10-24 17:58 ` [PATCH v1 03/22] perf metricgroup: Missed free on error path Ian Rogers
2025-10-24 17:58 ` [PATCH v1 04/22] perf metricgroup: When copy metrics copy default information Ian Rogers
2025-10-24 17:58 ` [PATCH v1 05/22] perf metricgroup: Add care to picking the evsel for displaying a metric Ian Rogers
2025-11-04  4:52   ` Namhyung Kim
2025-11-04  5:28     ` Ian Rogers
2025-11-06  6:03       ` Namhyung Kim [this message]
2025-11-06  6:42         ` Ian Rogers
2025-10-24 17:58 ` [PATCH v1 06/22] perf jevents: Make all tables static Ian Rogers
2025-10-24 17:58 ` [PATCH v1 07/22] perf expr: Add #target_cpu literal Ian Rogers
2025-11-04  4:56   ` Namhyung Kim
2025-11-06 18:43     ` Ian Rogers
2025-10-24 17:58 ` [PATCH v1 08/22] perf jevents: Add set of common metrics based on default ones Ian Rogers
2025-11-06  6:22   ` Namhyung Kim
2025-11-06 18:05     ` Ian Rogers
2025-10-24 17:58 ` [PATCH v1 09/22] perf jevents: Add metric DefaultShowEvents Ian Rogers
2025-10-24 17:58 ` [PATCH v1 10/22] perf stat: Add detail -d,-dd,-ddd metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 11/22] perf script: Change metric format to use json metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 12/22] perf stat: Remove hard coded shadow metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 13/22] perf stat: Fix default metricgroup display on hybrid Ian Rogers
2025-10-24 17:58 ` [PATCH v1 14/22] perf stat: Sort default events/metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 15/22] perf stat: Remove "unit" workarounds for metric-only Ian Rogers
2025-10-24 17:58 ` [PATCH v1 16/22] perf test stat+json: Improve metric-only testing Ian Rogers
2025-10-24 17:58 ` [PATCH v1 17/22] perf test stat: Ignore failures in Default[234] metricgroups Ian Rogers
2025-10-24 17:58 ` [PATCH v1 18/22] perf test stat: Update std_output testing metric expectations Ian Rogers
2025-10-24 17:58 ` [PATCH v1 19/22] perf test metrics: Update all metrics for possibly failing default metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 20/22] perf test stat: Update shadow test to use metrics Ian Rogers
2025-10-24 17:58 ` [PATCH v1 21/22] perf test stat: Update test expectations and events Ian Rogers
2025-10-24 17:58 ` [PATCH v1 22/22] perf test stat csv: " Ian Rogers
2025-10-30 20:51 ` [PATCH v1 00/22] Switch the default perf stat metrics to json Ian Rogers
2025-11-03 17:05   ` Ian Rogers
2025-11-04  4:47 ` Namhyung Kim
2025-11-04  5:09   ` Ian Rogers
2025-11-06  5:29     ` Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aQw6Jje7bSpywGqq@google.com \
    --to=namhyung@kernel.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=collin.funk1@gmail.com \
    --cc=ctshao@google.com \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=howardchu95@gmail.com \
    --cc=irogers@google.com \
    --cc=james.clark@linaro.org \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=sumanthk@linux.ibm.com \
    --cc=thomas.falcon@intel.com \
    --cc=tmricht@linux.ibm.com \
    --cc=xu.yang_2@nxp.com \
    --cc=yang.lee@linux.alibaba.com \
    --cc=yeoreum.yun@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.