From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Ian Rogers <irogers@google.com>
Cc: acme@kernel.org, mingo@redhat.com, peterz@infradead.org,
	namhyung@kernel.org, jolsa@kernel.org, adrian.hunter@intel.com,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	ak@linux.intel.com, eranian@google.com, ahmad.yasin@intel.com
Subject: Re: [PATCH 2/8] perf evsel: Fix the annotation for hardware events on hybrid
Date: Tue, 13 Jun 2023 16:06:59 -0400
Message-ID: <7487eff9-5769-1701-ea1b-45dd5ab67c85@linux.intel.com>
In-Reply-To: <CAP-5=fVz1zgwdJVs1V7putUdp9wf-QKWH1Ky-heLoHWgnJu6dg@mail.gmail.com>



On 2023-06-13 3:35 p.m., Ian Rogers wrote:
> On Wed, Jun 7, 2023 at 9:27 AM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> The annotation for hardware events is wrong on hybrid. For example,
>>
>>  # ./perf stat -a sleep 1
>>
>>  Performance counter stats for 'system wide':
>>
>>          32,148.85 msec cpu-clock                        #   32.000 CPUs utilized
>>                374      context-switches                 #   11.633 /sec
>>                 33      cpu-migrations                   #    1.026 /sec
>>                295      page-faults                      #    9.176 /sec
>>         18,979,960      cpu_core/cycles/                 #  590.378 K/sec
>>        261,230,783      cpu_atom/cycles/                 #    8.126 M/sec                       (54.21%)
>>         17,019,732      cpu_core/instructions/           #  529.404 K/sec
>>         38,020,470      cpu_atom/instructions/           #    1.183 M/sec                       (63.36%)
>>          3,296,743      cpu_core/branches/               #  102.546 K/sec
>>          6,692,338      cpu_atom/branches/               #  208.167 K/sec                       (63.40%)
>>             96,421      cpu_core/branch-misses/          #    2.999 K/sec
>>          1,016,336      cpu_atom/branch-misses/          #   31.613 K/sec                       (63.38%)
>>
>> On hybrid, the config of a hardware event carries an extended type, but
>> evsel__match() doesn't take it into account.
>>
>> Add a mask to filter the extended type on hybrid when checking the config.
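The mismatch described above can be reproduced in isolation. The following is a minimal sketch, assuming the extended-type layout from include/uapi/linux/perf_event.h (PMU type in the upper 32 bits of attr.config, generic event id in the lower 32 bits). The helper names and the PMU type value 8 used in the usage note are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies mirroring include/uapi/linux/perf_event.h. */
#define PERF_PMU_TYPE_SHIFT		32
#define PERF_HW_EVENT_MASK		0xffffffffULL
#define PERF_COUNT_HW_CPU_CYCLES	0
#define PERF_COUNT_HW_INSTRUCTIONS	1

/* Hypothetical helper: build an extended-type config for one PMU. */
static inline uint64_t hybrid_hw_config(uint32_t pmu_type, uint64_t hw_event_id)
{
	return ((uint64_t)pmu_type << PERF_PMU_TYPE_SHIFT) | hw_event_id;
}

/* Comparison as done by the old macro: the whole config must be equal. */
static inline bool config_matches_exact(uint64_t config, uint64_t hw_event_id)
{
	return config == hw_event_id;
}

/* Comparison with the extended type filtered out first. */
static inline bool config_matches_masked(uint64_t config, uint64_t hw_event_id)
{
	return (config & PERF_HW_EVENT_MASK) == hw_event_id;
}
```

With a config built as hybrid_hw_config(8, PERF_COUNT_HW_CPU_CYCLES), config_matches_exact() returns false while config_matches_masked() returns true, which is why the unpatched output falls back to the generic per-second rate.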
>>
>> With the patch,
>>
>>  # ./perf stat -a sleep 1
>>
>>  Performance counter stats for 'system wide':
>>
>>          32,139.90 msec cpu-clock                        #   32.003 CPUs utilized
>>                343      context-switches                 #   10.672 /sec
>>                 32      cpu-migrations                   #    0.996 /sec
>>                 73      page-faults                      #    2.271 /sec
>>         13,712,841      cpu_core/cycles/                 #    0.000 GHz
>>        258,301,691      cpu_atom/cycles/                 #    0.008 GHz                         (54.20%)
>>         12,428,163      cpu_core/instructions/           #    0.91  insn per cycle
>>         37,786,557      cpu_atom/instructions/           #    2.76  insn per cycle              (63.35%)
>>          2,418,826      cpu_core/branches/               #   75.259 K/sec
>>          6,965,962      cpu_atom/branches/               #  216.739 K/sec                       (63.38%)
>>             72,150      cpu_core/branch-misses/          #    2.98% of all branches
>>          1,032,746      cpu_atom/branch-misses/          #   42.70% of all branches             (63.35%)
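As a sanity check on the patched output, the cpu_core annotations can be recomputed from the raw counts. This is a rough sketch that assumes the annotations are plain ratios of the counters shown above; the function names are hypothetical and are not perf's own.

```c
#include <stdint.h>

/* Instructions retired per cycle; 0.0 when no cycles were counted. */
static double insn_per_cycle(uint64_t instructions, uint64_t cycles)
{
	return cycles ? (double)instructions / (double)cycles : 0.0;
}

/* Branch misses as a percentage of all branches. */
static double branch_miss_percent(uint64_t misses, uint64_t branches)
{
	return branches ? 100.0 * (double)misses / (double)branches : 0.0;
}

/* Cycles per nanosecond of cpu-clock time, i.e. GHz. */
static double cycles_ghz(uint64_t cycles, double cpu_clock_msec)
{
	return cpu_clock_msec > 0.0 ? (double)cycles / (cpu_clock_msec * 1e6) : 0.0;
}
```

Using the cpu_core counts above, insn_per_cycle(12428163, 13712841) is about 0.906, branch_miss_percent(72150, 2418826) is about 2.98, and cycles_ghz(13712841, 32139.90) is about 0.0004, consistent with the printed 0.91, 2.98% and 0.000 GHz.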
>>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>>  tools/perf/util/evsel.h       | 12 ++++++-----
>>  tools/perf/util/stat-shadow.c | 39 +++++++++++++++++++----------------
>>  2 files changed, 28 insertions(+), 23 deletions(-)
>>
>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>> index b365b449c6ea..36a32e4ca168 100644
>> --- a/tools/perf/util/evsel.h
>> +++ b/tools/perf/util/evsel.h
>> @@ -350,9 +350,11 @@ u64 format_field__intval(struct tep_format_field *field, struct perf_sample *sam
>>
>>  struct tep_format_field *evsel__field(struct evsel *evsel, const char *name);
>>
>> -#define evsel__match(evsel, t, c)              \
>> +#define EVSEL_EVENT_MASK                       (~0ULL)
>> +
>> +#define evsel__match(evsel, t, c, m)                   \
>>         (evsel->core.attr.type == PERF_TYPE_##t &&      \
>> -        evsel->core.attr.config == PERF_COUNT_##c)
>> +        (evsel->core.attr.config & m) == PERF_COUNT_##c)
> 
> The EVSEL_EVENT_MASK here isn't very intention-revealing; perhaps we
> can remove it and do something like:
> 
> static inline bool __evsel__match(const struct evsel *evsel, u32 type,
>                                   u64 config)
> {
>   /* Keep the type check that the original macro performed. */
>   if (evsel->core.attr.type != type)
>      return false;
>
>   if ((type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE) &&
>       perf_pmus__supports_extended_type())
>      return (evsel->core.attr.config & PERF_HW_EVENT_MASK) == config;
>
>   return evsel->core.attr.config == config;
> }
> #define evsel__match(evsel, t, c) \
>         __evsel__match(evsel, PERF_TYPE_##t, PERF_COUNT_##c)

Yes, the above code looks better. I will apply it in V2.
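
For reference, here is a self-contained sketch of how such a helper could behave outside the perf tree. It is not the V2 code: struct evsel is reduced to a stand-in with only the two fields used, and perf_pmus__supports_extended_type() is stubbed with a flag (extended_type_supported is a hypothetical name). The sketch keeps the type comparison from the original macro.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the relevant uapi values. */
enum { PERF_TYPE_HARDWARE = 0, PERF_TYPE_SOFTWARE = 1, PERF_TYPE_HW_CACHE = 3 };
enum { PERF_COUNT_HW_CPU_CYCLES = 0, PERF_COUNT_HW_INSTRUCTIONS = 1 };
enum { PERF_COUNT_SW_CPU_CLOCK = 0 };
#define PERF_HW_EVENT_MASK 0xffffffffULL

/* Stand-ins for perf's types; only the fields used here are modeled. */
struct attr_stub { uint32_t type; uint64_t config; };
struct evsel { struct { struct attr_stub attr; } core; };

/* Stub: the real function queries the PMUs; here it is a plain flag. */
static bool extended_type_supported = true;
static bool perf_pmus__supports_extended_type(void)
{
	return extended_type_supported;
}

static inline bool __evsel__match(const struct evsel *evsel, uint32_t type,
				  uint64_t config)
{
	if (evsel->core.attr.type != type)
		return false;

	/* Only HARDWARE and HW_CACHE configs carry an extended type. */
	if ((type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE) &&
	    perf_pmus__supports_extended_type())
		return (evsel->core.attr.config & PERF_HW_EVENT_MASK) == config;

	return evsel->core.attr.config == config;
}

#define evsel__match(evsel, t, c) \
	__evsel__match(evsel, PERF_TYPE_##t, PERF_COUNT_##c)
```

An evsel with type PERF_TYPE_HARDWARE and config (8ULL << 32) | PERF_COUNT_HW_INSTRUCTIONS then matches evsel__match(&e, HARDWARE, HW_INSTRUCTIONS) while the flag is set, and callers such as evsel__is_clock() keep the three-argument form without passing a mask.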

Thanks,
Kan
> 
> Thanks,
> Ian
> 
>>
>>  static inline bool evsel__match2(struct evsel *e1, struct evsel *e2)
>>  {
>> @@ -438,13 +440,13 @@ bool evsel__is_function_event(struct evsel *evsel);
>>
>>  static inline bool evsel__is_bpf_output(struct evsel *evsel)
>>  {
>> -       return evsel__match(evsel, SOFTWARE, SW_BPF_OUTPUT);
>> +       return evsel__match(evsel, SOFTWARE, SW_BPF_OUTPUT, EVSEL_EVENT_MASK);
>>  }
>>
>>  static inline bool evsel__is_clock(const struct evsel *evsel)
>>  {
>> -       return evsel__match(evsel, SOFTWARE, SW_CPU_CLOCK) ||
>> -              evsel__match(evsel, SOFTWARE, SW_TASK_CLOCK);
>> +       return evsel__match(evsel, SOFTWARE, SW_CPU_CLOCK, EVSEL_EVENT_MASK) ||
>> +              evsel__match(evsel, SOFTWARE, SW_TASK_CLOCK, EVSEL_EVENT_MASK);
>>  }
>>
>>  bool evsel__fallback(struct evsel *evsel, int err, char *msg, size_t msgsize);
>> diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
>> index 1566a206ba42..074f38b57e2d 100644
>> --- a/tools/perf/util/stat-shadow.c
>> +++ b/tools/perf/util/stat-shadow.c
>> @@ -6,6 +6,7 @@
>>  #include "color.h"
>>  #include "debug.h"
>>  #include "pmu.h"
>> +#include "pmus.h"
>>  #include "rblist.h"
>>  #include "evlist.h"
>>  #include "expr.h"
>> @@ -78,6 +79,8 @@ void perf_stat__reset_shadow_stats(void)
>>
>>  static enum stat_type evsel__stat_type(const struct evsel *evsel)
>>  {
>> +       u64 mask = perf_pmus__supports_extended_type() ? PERF_HW_EVENT_MASK : EVSEL_EVENT_MASK;
>> +
>>         /* Fake perf_hw_cache_op_id values for use with evsel__match. */
>>         u64 PERF_COUNT_hw_cache_l1d_miss = PERF_COUNT_HW_CACHE_L1D |
>>                 ((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
>> @@ -97,41 +100,41 @@ static enum stat_type evsel__stat_type(const struct evsel *evsel)
>>
>>         if (evsel__is_clock(evsel))
>>                 return STAT_NSECS;
>> -       else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES))
>> +       else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES, mask))
>>                 return STAT_CYCLES;
>> -       else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS))
>> +       else if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS, mask))
>>                 return STAT_INSTRUCTIONS;
>> -       else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
>> +       else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND, mask))
>>                 return STAT_STALLED_CYCLES_FRONT;
>> -       else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND))
>> +       else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND, mask))
>>                 return STAT_STALLED_CYCLES_BACK;
>> -       else if (evsel__match(evsel, HARDWARE, HW_BRANCH_INSTRUCTIONS))
>> +       else if (evsel__match(evsel, HARDWARE, HW_BRANCH_INSTRUCTIONS, mask))
>>                 return STAT_BRANCHES;
>> -       else if (evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES))
>> +       else if (evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES, mask))
>>                 return STAT_BRANCH_MISS;
>> -       else if (evsel__match(evsel, HARDWARE, HW_CACHE_REFERENCES))
>> +       else if (evsel__match(evsel, HARDWARE, HW_CACHE_REFERENCES, mask))
>>                 return STAT_CACHE_REFS;
>> -       else if (evsel__match(evsel, HARDWARE, HW_CACHE_MISSES))
>> +       else if (evsel__match(evsel, HARDWARE, HW_CACHE_MISSES, mask))
>>                 return STAT_CACHE_MISSES;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_L1D))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_L1D, mask))
>>                 return STAT_L1_DCACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_L1I))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_L1I, mask))
>>                 return STAT_L1_ICACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_LL))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_LL, mask))
>>                 return STAT_LL_CACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_DTLB))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_DTLB, mask))
>>                 return STAT_DTLB_CACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_ITLB))
>> +       else if (evsel__match(evsel, HW_CACHE, HW_CACHE_ITLB, mask))
>>                 return STAT_ITLB_CACHE;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_l1d_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_l1d_miss, mask))
>>                 return STAT_L1D_MISS;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_l1i_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_l1i_miss, mask))
>>                 return STAT_L1I_MISS;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_ll_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_ll_miss, mask))
>>                 return STAT_LL_MISS;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_dtlb_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_dtlb_miss, mask))
>>                 return STAT_DTLB_MISS;
>> -       else if (evsel__match(evsel, HW_CACHE, hw_cache_itlb_miss))
>> +       else if (evsel__match(evsel, HW_CACHE, hw_cache_itlb_miss, mask))
>>                 return STAT_ITLB_MISS;
>>         return STAT_NONE;
>>  }
>> --
>> 2.35.1
>>
