linux-perf-users.vger.kernel.org archive mirror
From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Ian Rogers <irogers@google.com>
Cc: acme@kernel.org, mingo@redhat.com, peterz@infradead.org,
	namhyung@kernel.org, jolsa@kernel.org, adrian.hunter@intel.com,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	ak@linux.intel.com, eranian@google.com, ahmad.yasin@intel.com
Subject: Re: [PATCH 2/8] perf evsel: Fix the annotation for hardware events on hybrid
Date: Tue, 13 Jun 2023 16:06:59 -0400
Message-ID: <7487eff9-5769-1701-ea1b-45dd5ab67c85@linux.intel.com>
In-Reply-To: <CAP-5=fVz1zgwdJVs1V7putUdp9wf-QKWH1Ky-heLoHWgnJu6dg@mail.gmail.com>



On 2023-06-13 3:35 p.m., Ian Rogers wrote:
> On Wed, Jun 7, 2023 at 9:27 AM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> The annotation for hardware events is wrong on hybrid. For example,
>>
>>  # ./perf stat -a sleep 1
>>
>>  Performance counter stats for 'system wide':
>>
>>          32,148.85 msec cpu-clock                        #   32.000 CPUs utilized
>>                374      context-switches                 #   11.633 /sec
>>                 33      cpu-migrations                   #    1.026 /sec
>>                295      page-faults                      #    9.176 /sec
>>         18,979,960      cpu_core/cycles/                 #  590.378 K/sec
>>        261,230,783      cpu_atom/cycles/                 #    8.126 M/sec                       (54.21%)
>>         17,019,732      cpu_core/instructions/           #  529.404 K/sec
>>         38,020,470      cpu_atom/instructions/           #    1.183 M/sec                       (63.36%)
>>          3,296,743      cpu_core/branches/               #  102.546 K/sec
>>          6,692,338      cpu_atom/branches/               #  208.167 K/sec                       (63.40%)
>>             96,421      cpu_core/branch-misses/          #    2.999 K/sec
>>          1,016,336      cpu_atom/branch-misses/          #   31.613 K/sec                       (63.38%)
>>
>> The hardware events have extended type on hybrid, but the evsel__match()
>> doesn't take it into account.
>>
>> Add a mask to filter the extended type on hybrid when checking the config.
>>
>> With the patch,
>>
>>  # ./perf stat -a sleep 1
>>
>>  Performance counter stats for 'system wide':
>>
>>          32,139.90 msec cpu-clock                        #   32.003 CPUs utilized
>>                343      context-switches                 #   10.672 /sec
>>                 32      cpu-migrations                   #    0.996 /sec
>>                 73      page-faults                      #    2.271 /sec
>>         13,712,841      cpu_core/cycles/                 #    0.000 GHz
>>        258,301,691      cpu_atom/cycles/                 #    0.008 GHz                         (54.20%)
>>         12,428,163      cpu_core/instructions/           #    0.91  insn per cycle
>>         37,786,557      cpu_atom/instructions/           #    2.76  insn per cycle              (63.35%)
>>          2,418,826      cpu_core/branches/               #   75.259 K/sec
>>          6,965,962      cpu_atom/branches/               #  216.739 K/sec                       (63.38%)
>>             72,150      cpu_core/branch-misses/          #    2.98% of all branches
>>          1,032,746      cpu_atom/branch-misses/          #   42.70% of all branches             (63.35%)
>>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>>  tools/perf/util/evsel.h       | 12 ++++++-----
>>  tools/perf/util/stat-shadow.c | 39 +++++++++++++++++++----------------
>>  2 files changed, 28 insertions(+), 23 deletions(-)
>>
>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>> index b365b449c6ea..36a32e4ca168 100644
>> --- a/tools/perf/util/evsel.h
>> +++ b/tools/perf/util/evsel.h
>> @@ -350,9 +350,11 @@ u64 format_field__intval(struct tep_format_field *field, struct perf_sample *sam
>>
>>  struct tep_format_field *evsel__field(struct evsel *evsel, const char *name);
>>
>> -#define evsel__match(evsel, t, c)              \
>> +#define EVSEL_EVENT_MASK                       (~0ULL)
>> +
>> +#define evsel__match(evsel, t, c, m)                   \
>>         (evsel->core.attr.type == PERF_TYPE_##t &&      \
>> -        evsel->core.attr.config == PERF_COUNT_##c)
>> +        (evsel->core.attr.config & m) == PERF_COUNT_##c)
> 
> The EVSEL_EVENT_MASK here isn't very intention revealing, perhaps we
> can remove it and do something like:
> 
> static inline bool __evsel__match(const struct evsel *evsel, u32 type, u64 config)
> {
>   if ((type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE) &&
>       perf_pmus__supports_extended_type())
>      return (evsel->core.attr.config & PERF_HW_EVENT_MASK) == config;
> 
>   return evsel->core.attr.config == config;
> }
> #define evsel__match(evsel, t, c) \
>   __evsel__match(evsel, PERF_TYPE_##t, PERF_COUNT_##c)

Yes, the above code looks better. I will apply it in V2.

Thanks,
Kan
> 
> Thanks,
> Ian
> 
>>
>>  static inline bool evsel__match2(struct evsel *e1, struct evsel *e2)
>>  {
>> @@ -438,13 +440,13 @@ bool evsel__is_function_event(struct evsel *evsel);
>>
>>  static inline bool evsel__is_bpf_output(struct evsel *evsel)
>>  {
>> -       return evsel__match(evsel, SOFTWARE, SW_BPF_OUTPUT);
>> +       return evsel__match(evsel, SOFTWARE, SW_BPF_OUTPUT, EVSEL_EVENT_MASK);
>>  }
>>
>>  static inline bool evsel__is_clock(const struct evsel *evsel)
>>  {
>> -       return evsel__match(evsel, SOFTWARE, SW_CPU_CLOCK) ||
>> -              evsel__match(evsel, SOFTWARE, SW_TASK_CLOCK);
>> +       return evsel__match(evsel, SOFTWARE, SW_CPU_CLOCK, EVSEL_EVENT_MASK) ||
>> +              evsel__match(evsel, SOFTWARE, SW_TASK_CLOCK, EVSEL_EVENT_MASK);
>>  }
>>
>>  bool evsel__fallback(struct evsel *evsel, int err, char *msg, size_t msgsize);
>> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
>> index 1566a206ba42..074f38b57e2d 100644
>> --- a/tools/perf/util/stat-shadow.c
>> +++ b/tools/perf/util/stat-shadow.c
>> @@ -6,6 +6,7 @@
>>  #include "color.h"
>>  #include "debug.h"
>>  #include "pmu.h"
>> +#include "pmus.h"
>>  #include "rblist.h"
>>  #include "evlist.h"
>>  #include "expr.h"
>> @@ -78,6 +79,8 @@ void perf_stat__reset_shadow_stats(void)
>>
>>  static enum stat_type evsel__stat_type(const struct evsel *evsel)
>>  {
>> +       u64 mask = perf_pmus__supports_extended_type() ? PERF_HW_EVENT_MASK : EVSEL_EVENT_MASK;
>> +
>>         /* Fake perf_hw_cache_op_id values for use with evsel__match. */
>>         u64 PERF_COUNT_hw_cache_l1d_miss = PERF_COUNT_HW_CACHE_L1D |
>>                 ((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>> @@ -97,41 +100,41 @@ static enum stat_type evsel__stat_type(const struct evsel *evsel)
>>
>>         if (evsel__is_clock(evsel))
>>                 return STAT_NSECS;
>> -       else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES))
>> +       else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES, mask))
>>                 return STAT_CYCLES;
>> -       else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS))
>> +       else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS, mask))
>>                 return STAT_INSTRUCTIONS;
>> -       else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
>> +       else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND, mask))
>>                 return STAT_STALLED_CYCLES_FRONT;
>> -       else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND))
>> +       else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND, mask))
>>                 return STAT_STALLED_CYCLES_BACK;
>> -       else if (evsel__match(evsel, HARDWARE, HW_BRANCH_INSTRUCTIONS))
>> +       else if (evsel__match(evsel, HARDWARE, HW_BRANCH_INSTRUCTIONS, mask))
>>                 return STAT_BRANCHES;
>> -       else if (evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES))
>> +       else if (evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES, mask))
>>                 return STAT_BRANCH_MISS;
>> -       else if (evsel__match(evsel, HARDWARE, HW_CACHE_REFERENCES))
>> +       else if (evsel__match(evsel, HARDWARE, HW_CACHE_REFERENCES, mask))
>>                 return STAT_CACHE_REFS;
>> -       else if (evsel__match(evsel, HARDWARE, HW_CACHE_MISSES))
>> +       else if (evsel__match(evsel, HARDWARE, HW_CACHE_MISSES, mask))
>>                 return STAT_CACHE_MISSES;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_L1D))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_L1D, mask))
>>                 return STAT_L1_DCACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_L1I))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_L1I, mask))
>>                 return STAT_L1_ICACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_LL))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_LL, mask))
>>                 return STAT_LL_CACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_DTLB))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_DTLB, mask))
>>                 return STAT_DTLB_CACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_ITLB))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_ITLB, mask))
>>                 return STAT_ITLB_CACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_l1d_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_l1d_miss, mask))
>>                 return STAT_L1D_MISS;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_l1i_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_l1i_miss, mask))
>>                 return STAT_L1I_MISS;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_ll_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_ll_miss, mask))
>>                 return STAT_LL_MISS;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_dtlb_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_dtlb_miss, mask))
>>                 return STAT_DTLB_MISS;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_itlb_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_itlb_miss, mask))
>>                 return STAT_ITLB_MISS;
>>         return STAT_NONE;
>>  }
>> --
>> 2.35.1
>>


Thread overview: 31+ messages
2023-06-07 16:26 [PATCH 0/8] New metricgroup output in perf stat default mode kan.liang
2023-06-07 16:26 ` [PATCH 1/8] perf metric: Fix no group check kan.liang
2023-06-13 19:22   ` Ian Rogers
2023-06-07 16:26 ` [PATCH 2/8] perf evsel: Fix the annotation for hardware events on hybrid kan.liang
2023-06-13 19:35   ` Ian Rogers
2023-06-13 20:06     ` Liang, Kan [this message]
2023-06-13 21:18       ` Arnaldo Carvalho de Melo
2023-06-13 23:57         ` Liang, Kan
2023-06-07 16:26 ` [PATCH 3/8] perf metric: JSON flag to default metric group kan.liang
2023-06-13 19:44   ` Ian Rogers
2023-06-13 20:10     ` Liang, Kan
2023-06-13 20:28       ` Ian Rogers
2023-06-13 20:59         ` Liang, Kan
2023-06-13 21:28           ` Ian Rogers
2023-06-14  0:02             ` Liang, Kan
2023-06-07 16:26 ` [PATCH 4/8] perf vendor events arm64: Add default tags into topdown L1 metrics kan.liang
2023-06-13 19:45   ` Ian Rogers
2023-06-13 20:31     ` Arnaldo Carvalho de Melo
2023-06-14 14:30   ` John Garry
2023-06-16  3:17     ` Liang, Kan
2023-06-07 16:26 ` [PATCH 5/8] perf stat,jevents: Introduce Default tags for the default mode kan.liang
2023-06-13 19:59   ` Ian Rogers
2023-06-13 20:11     ` Liang, Kan
2023-06-07 16:26 ` [PATCH 6/8] perf stat,metrics: New metricgroup output " kan.liang
2023-06-13 20:16   ` Ian Rogers
2023-06-13 20:50     ` Liang, Kan
2023-06-07 16:26 ` [PATCH 7/8] pert tests: Support metricgroup perf stat JSON output kan.liang
2023-06-13 20:17   ` Ian Rogers
2023-06-13 20:30     ` Arnaldo Carvalho de Melo
2023-06-07 16:27 ` [PATCH 8/8] perf test: Add test case for the standard perf stat output kan.liang
2023-06-13 20:21   ` Ian Rogers
