Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models

From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Ian Rogers <irogers@google.com>
Cc: Perry Taylor <perry.taylor@intel.com>,
	Samantha Alt <samantha.alt@intel.com>,
	Caleb Biggers <caleb.biggers@intel.com>,
	Weilin Wang <weilin.wang@intel.com>,
	Edward Baker <edward.baker@intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	John Garry <john.g.garry@oracle.com>,
	Jing Zhang <renyu.zj@linux.alibaba.com>,
	Thomas Richter <tmricht@linux.ibm.com>,
	James Clark <james.clark@arm.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH v1 04/20] perf jevents: Add tsx metric group for Intel models
Date: Fri, 1 Mar 2024 12:26:55 -0500
Message-ID: <f9248ff6-6138-46b6-9cb7-a40442882195@linux.intel.com>
In-Reply-To: <CAP-5=fVBPT9itsyruLeChu=90xnvuxT7PSBtBkWi5LiDNAm2iw@mail.gmail.com>



On 2024-03-01 11:37 a.m., Ian Rogers wrote:
> On Fri, Mar 1, 2024 at 6:52 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>>
>>
>> On 2024-02-29 8:01 p.m., Ian Rogers wrote:
>>> On Thu, Feb 29, 2024 at 1:15 PM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2024-02-28 7:17 p.m., Ian Rogers wrote:
>>>>> Allow duplicated metric to be dropped from json files.
>>>>>
>>>>> Signed-off-by: Ian Rogers <irogers@google.com>
>>>>> ---
>>>>>  tools/perf/pmu-events/intel_metrics.py | 51 ++++++++++++++++++++++++++
>>>>>  1 file changed, 51 insertions(+)
>>>>>
>>>>> diff --git a/tools/perf/pmu-events/intel_metrics.py b/tools/perf/pmu-events/intel_metrics.py
>>>>> index 20c25d142f24..1096accea2aa 100755
>>>>> --- a/tools/perf/pmu-events/intel_metrics.py
>>>>> +++ b/tools/perf/pmu-events/intel_metrics.py
>>>>> @@ -7,6 +7,7 @@ import argparse
>>>>>  import json
>>>>>  import math
>>>>>  import os
>>>>> +from typing import Optional
>>>>>
>>>>>  parser = argparse.ArgumentParser(description="Intel perf json generator")
>>>>>  parser.add_argument("-metricgroups", help="Generate metricgroups data", action='store_true')
>>>>> @@ -77,10 +78,60 @@ def Smi() -> MetricGroup:
>>>>>      ])
>>>>>
>>>>>
>>>>> +def Tsx() -> Optional[MetricGroup]:
>>>>> +    if args.model not in [
>>>>> +        'alderlake',
>>>>> +        'cascadelakex',
>>>>> +        'icelake',
>>>>> +        'icelakex',
>>>>> +        'rocketlake',
>>>>> +        'sapphirerapids',
>>>>> +        'skylake',
>>>>> +        'skylakex',
>>>>> +        'tigerlake',
>>>>> +    ]:
>>>>
>>>> Can we get rid of the model list? Otherwise, we have to keep updating
>>>> the list.
>>>
>>> Do we expect the list to update? :-)
>>
>> Yes, at least for the meteorlake and graniterapids. They should be the
>> same as alderlake and sapphirerapids. I'm not sure about the future
>> platforms.
>>
>> Maybe we can have an "if args.model in list" check here that lists all the
>> non-hybrid models which don't support TSX. I think that list should not
>> change any time soon.
>>
>>> The issue is the events are in
>>> sysfs and not the json. If we added the tsx events to json then this
>>> list wouldn't be necessary, but it also would mean the events would be
>>> present in "perf list" even when TSX is disabled.
>>
>> I think there may be an alternative way, to check the RTM events, e.g.,
>> RTM_RETIRED.START event. We only need to generate the metrics for the
>> platform which supports the RTM_RETIRED.START event.
>>
>>
>>>
>>>>> +        return None
>>>>> +
>>>>> +    pmu = "cpu_core" if args.model == "alderlake" else "cpu"
>>>>
>>>> Is it possible to change the check to the existence of the "cpu" PMU
>>>> here? has_pmu("cpu") ? "cpu" : "cpu_core"
>>>
>>> The "Unit" on "cpu" events in json is always just blank. On hybrid it is
>>> either "cpu_core" or "cpu_atom", so I can make this something like:
>>>
>>> pmu = "cpu_core" if metrics.HasPmu("cpu_core") else "cpu"
>>>
>>> which would be a build time test.
>>
>> Yes, I think using the "Unit" is good enough.
>>
>>>
>>>
>>>>> +    cycles = Event('cycles')
>>>>> +    cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>>>>> +    transaction_start = Event(f'{pmu}/tx\-start/')
>>>>> +    cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>>>>> +    metrics = [
>>>>> +        Metric('tsx_transactional_cycles',
>>>>> +                      'Percentage of cycles within a transaction region.',
>>>>> +                      Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>>>>> +                      '100%'),
>>>>> +        Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
>>>>> +                      Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>>>>> +                                    has_event(cycles_in_tx),
>>>>> +                                    0),
>>>>> +                      '100%'),
>>>>> +        Metric('tsx_cycles_per_transaction',
>>>>> +                      'Number of cycles within a transaction divided by the number of transactions.',
>>>>> +                      Select(cycles_in_tx / transaction_start,
>>>>> +                                    has_event(cycles_in_tx),
>>>>> +                                    0),
>>>>> +                      "cycles / transaction"),
>>>>> +    ]
>>>>> +    if args.model != 'sapphirerapids':
>>>>
>>>> Add the "tsx_cycles_per_elision" metric only if
>>>> has_event(f'{pmu}/el\-start/')?
>>>
>>> It's a sysfs event, so this wouldn't work :-(
>>
>> The below is the definition of el-start in the kernel.
>> EVENT_ATTR_STR(el-start,        el_start,       "event=0xc8,umask=0x1");
>>
>> The corresponding event in the event list should be HLE_RETIRED.START
>>       "EventCode": "0xC8",
>>       "UMask": "0x01",
>>       "EventName": "HLE_RETIRED.START",
>>
>> I think we may check the HLE_RETIRED.START instead. If the
>> HLE_RETIRED.START doesn't exist, I don't see a reason why the
>> tsx_cycles_per_elision should be supported.
>>
>> Again, in the virtualization world, it's possible that the
>> HLE_RETIRED.START exists in the event list but el_start isn't available
>> in the sysfs. I think it has to be specially handled in the test as well.
> 
> So we keep the has_event test on the sysfs event to handle the
> virtualization and disabled case. We use  HLE_RETIRED.START to detect
> whether the model supports TSX.

Yes. I think the JSON event always reflects the latest status of an event.
If an event is deprecated someday, I don't think there is a reason to
keep any metrics that include the event. So we should use the JSON event
to check whether to generate a metric.

The sysfs event tells whether the current kernel supports the event. It
should be used to check whether a metric should be used/enabled.
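
As a rough sketch of that split (the helper name and its argument are
hypothetical and not part of the patch or of the jevents metric library;
json_events stands in for the event names found in one model's JSON
files), the generator could consult the JSON event list at build time and
leave the sysfs check to has_event() at run time:

```python
from typing import Optional, Set


def elision_metric_expr(json_events: Set[str]) -> Optional[str]:
    """Return a metric expression, or None when no metric should be generated.

    Hypothetical helper for illustration only; json_events is assumed to
    hold the event names from one model's JSON event files.
    """
    # Build time: if the model's JSON has no HLE_RETIRED.START, the metric
    # is not generated at all.
    if "HLE_RETIRED.START" not in json_events:
        return None
    # Run time: has_event() on the sysfs event lets the metric evaluate to 0
    # when the running kernel does not expose el-start (e.g. TSX disabled or
    # a guest). The single backslashes here become "\\-" once written as JSON.
    return r"(cycles\-t / el\-start if has_event(el\-start) else 0)"
```

With this shape, a model whose JSON only carries RTM_RETIRED.START gets no
elision metric, while a model with HLE_RETIRED.START gets the metric plus
the runtime guard.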

> Should the event be the sysfs or json
> version? i.e.
> 
>         "MetricExpr": "(cycles\\-t / el\\-start if has_event(el\\-start) else 0)",
> 
> or
> 
>         "MetricExpr": "(cycles\\-t / HLE_RETIRED.START if has_event(el\\-start) else 0)",
> 
> I think I favor the former for some consistency with the has_event.
>

Agree, the former looks good to me too.


> Using HLE_RETIRED.START means the set of TSX models goes from:
>         'alderlake',
>         'cascadelakex',
>         'icelake',
>         'icelakex',
>         'rocketlake',
>         'sapphirerapids',
>         'skylake',
>         'skylakex',
>         'tigerlake',
> 
> To:
>    broadwell
>    broadwellde
>    broadwellx
>    cascadelakex
>    haswell
>    haswellx
>    icelake
>    rocketlake
>    skylake
>    skylakex
> 
> Using RTM_RETIRED.START it goes to:
>    broadwell
>    broadwellde
>    broadwellx
>    cascadelakex
>    emeraldrapids
>    graniterapids
>    haswell
>    haswellx
>    icelake
>    icelakex
>    rocketlake
>    sapphirerapids
>    skylake
>    skylakex
>    tigerlake
> 
> So I'm not sure it is working equivalently to what we have today,
> which may be good or bad. Here is what I think the code should look
> like:

Yes, there should be some changes. But I think the changes should be good.

For icelakex, HLE_RETIRED.START has been deprecated. I don't see a
reason why perf should keep the tsx_cycles_per_elision metric.

For alderlake, TSX is deprecated. perf should drop the related
metrics as well.
https://edc.intel.com/content/www/us/en/design/ipla/software-development-platforms/client/platforms/alder-lake-desktop/12th-generation-intel-core-processors-datasheet-volume-1-of-2/001/deprecated-technologies/
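
To make the change in coverage concrete, here is a small sketch that diffs
the three model lists quoted above. The variable names are illustrative;
the lists themselves are copied from the quoted message.

```python
# Models hard-coded in the v1 patch.
current = {
    "alderlake", "cascadelakex", "icelake", "icelakex", "rocketlake",
    "sapphirerapids", "skylake", "skylakex", "tigerlake",
}
# Models whose JSON contains HLE_RETIRED.START.
with_hle = {
    "broadwell", "broadwellde", "broadwellx", "cascadelakex", "haswell",
    "haswellx", "icelake", "rocketlake", "skylake", "skylakex",
}
# Models whose JSON contains RTM_RETIRED.START.
with_rtm = {
    "broadwell", "broadwellde", "broadwellx", "cascadelakex",
    "emeraldrapids", "graniterapids", "haswell", "haswellx", "icelake",
    "icelakex", "rocketlake", "sapphirerapids", "skylake", "skylakex",
    "tigerlake",
}

# Models that lose the TSX metric group when keyed on RTM_RETIRED.START.
dropped = sorted(current - with_rtm)
# Models that newly gain the TSX metric group.
added = sorted(with_rtm - current)
# Models that keep the group but get no tsx_cycles_per_elision metric.
no_elision = sorted(with_rtm - with_hle)

print("dropped:", dropped)
print("added:", added)
print("no elision metric:", no_elision)
```

Only alderlake drops out, the older Haswell/Broadwell models and
emeraldrapids/graniterapids are picked up, and icelakex is among the
models that no longer get tsx_cycles_per_elision.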

> 
> def Tsx() -> Optional[MetricGroup]:
>   pmu = "cpu_core" if CheckPmu("cpu_core") else "cpu"
>   cycles = Event('cycles')
>   cycles_in_tx = Event(f'{pmu}/cycles\-t/')
>   cycles_in_tx_cp = Event(f'{pmu}/cycles\-ct/')
>   try:
>     # Test if the tsx event is present in the json, prefer the
>     # sysfs version so that we can detect its presence at runtime.
>     transaction_start = Event("RTM_RETIRED.START")
>     transaction_start = Event(f'{pmu}/tx\-start/')
>   except:
>     return None
> 
>   elision_start = None
>   try:
>     # Elision start isn't supported by all models, but we'll not
>     # generate the tsx_cycles_per_elision metric in that
>     # case. Again, prefer the sysfs encoding of the event.
>     elision_start = Event("HLE_RETIRED.START")
>     elision_start = Event(f'{pmu}/el\-start/')
>   except:
>     pass
> 
>   return MetricGroup('transaction', [
>       Metric('tsx_transactional_cycles',
>              'Percentage of cycles within a transaction region.',
>              Select(cycles_in_tx / cycles, has_event(cycles_in_tx), 0),
>              '100%'),
>       Metric('tsx_aborted_cycles', 'Percentage of cycles in aborted transactions.',
>              Select(max(cycles_in_tx - cycles_in_tx_cp, 0) / cycles,
>                     has_event(cycles_in_tx),
>                     0),
>              '100%'),
>       Metric('tsx_cycles_per_transaction',
>              'Number of cycles within a transaction divided by the number of transactions.',
>              Select(cycles_in_tx / transaction_start,
>                     has_event(cycles_in_tx),
>                     0),
>              "cycles / transaction"),
>       Metric('tsx_cycles_per_elision',
>              'Number of cycles within a transaction divided by the number of elisions.',
>              Select(cycles_in_tx / elision_start,
>                     has_event(elision_start),
>                     0),
>              "cycles / elision") if elision_start else None,
>   ], description="Breakdown of transactional memory statistics")
> 
> Wdyt?

Looks good to me.

Thanks,
Kan
> 
> Thanks,
> Ian
> 
>> Thanks,
>> Kan
>>
>>>
>>> Thanks,
>>> Ian
>>>
>>>> Thanks,
>>>> Kan
>>>>
>>>>> +        elision_start = Event(f'{pmu}/el\-start/')
>>>>> +        metrics += [
>>>>> +            Metric('tsx_cycles_per_elision',
>>>>> +                          'Number of cycles within a transaction divided by the number of elisions.',
>>>>> +                          Select(cycles_in_tx / elision_start,
>>>>> +                                        has_event(elision_start),
>>>>> +                                        0),
>>>>> +                          "cycles / elision"),
>>>>> +        ]
>>>>> +    return MetricGroup('transaction', metrics)
>>>>> +
>>>>> +
>>>>>  all_metrics = MetricGroup("", [
>>>>>      Idle(),
>>>>>      Rapl(),
>>>>>      Smi(),
>>>>> +    Tsx(),
>>>>>  ])
>>>>>
>>>>>  if args.metricgroups:
>>>
