From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Ian Rogers <irogers@google.com>
Cc: "Thomas Falcon" <thomas.falcon@intel.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Arnaldo Carvalho de Melo" <acme@kernel.org>,
"Namhyung Kim" <namhyung@kernel.org>,
"Mark Rutland" <mark.rutland@arm.com>,
"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
"Jiri Olsa" <jolsa@kernel.org>,
"Adrian Hunter" <adrian.hunter@intel.com>,
"Andreas Färber" <afaerber@suse.de>,
"Manivannan Sadhasivam" <manivannan.sadhasivam@linaro.org>,
"Weilin Wang" <weilin.wang@intel.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
"Perry Taylor" <perry.taylor@intel.com>,
"Samantha Alt" <samantha.alt@intel.com>,
"Caleb Biggers" <caleb.biggers@intel.com>,
"Edward Baker" <edward.baker@intel.com>,
"Michael Petlan" <mpetlan@redhat.com>
Subject: Re: [PATCH v5 11/24] perf vendor events: Update/add Graniterapids events/metrics
Date: Thu, 6 Feb 2025 13:53:41 -0500
Message-ID: <145ce38d-67c5-47e5-9625-0ae9e9831fd9@linux.intel.com>
In-Reply-To: <CAP-5=fWQj01O3WmGLoAf6O_uEeMHpOUqVWvHi3nW_kGj4VtZWg@mail.gmail.com>
On 2025-02-06 12:36 p.m., Ian Rogers wrote:
> On Thu, Feb 6, 2025 at 9:11 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>>
>>
>> On 2025-02-06 11:40 a.m., Ian Rogers wrote:
>>> On Thu, Feb 6, 2025 at 6:32 AM Liang, Kan <kan.liang@linux.intel.com>
>> wrote:
>>>>
>>>> On 2025-02-05 4:33 p.m., Ian Rogers wrote:
>>>>> On Wed, Feb 5, 2025 at 1:10 PM Liang, Kan <kan.liang@linux.intel.com>
>> wrote:
>>>>>>
>>>>>> On 2025-02-05 3:23 p.m., Ian Rogers wrote:
>>>>>>> On Wed, Feb 5, 2025 at 11:11 AM Liang, Kan <
>> kan.liang@linux.intel.com> wrote:
>>>>>>>>
>>>>>>>> On 2025-02-05 12:31 p.m., Ian Rogers wrote:
>>>>>>>>> + {
>>>>>>>>> + "BriefDescription": "This category represents fraction of
>> slots utilized by useful work i.e. issued uops that eventually get retired",
>>>>>>>>> + "MetricExpr": "topdown\\-retiring / (topdown\\-fe\\-bound
>> + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 0 *
>> slots",
>>>>>>>>> + "MetricGroup": "BvUW;TmaL1;TopdownL1;tma_L1_group",
>>>>>>>>> + "MetricName": "tma_retiring",
>>>>>>>>> + "MetricThreshold": "tma_retiring > 0.7 |
>> tma_heavy_operations > 0.1",
>>>>>>>>> + "MetricgroupNoGroup": "TopdownL1",
>>>>>>>>> + "PublicDescription": "This category represents fraction
>> of slots utilized by useful work i.e. issued uops that eventually get
>> retired. Ideally; all pipeline slots would be attributed to the Retiring
>> category. Retiring of 100% would indicate the maximum Pipeline_Width
>> throughput was achieved. Maximizing Retiring typically increases the
>> Instructions-per-cycle (see IPC metric). Note that a high Retiring value
>> does not necessary mean there is no room for more performance. For
>> example; Heavy-operations or Microcode Assists are categorized under
>> Retiring. They often indicate suboptimal performance and can often be
>> optimized or avoided. Sample with: UOPS_RETIRED.SLOTS",
>>>>>>>>> + "ScaleUnit": "100%"
>>>>>>>>> + },
>>>>>>>>
>>>>>>>> The "Default" tag is missed for GNR as well.
>>>>>>>> It seems the new CPUIDs are not added in the script?
>>>>>>>
>>>>>>> Spotted it, we need to manually say which architectures with
>> TopdownL1
>>>>>>> should be in Default because it was insisted upon that pre-Icelake
>>>>>>> CPUs with TopdownL1 not have TopdownL1 in Default. As you know, my
>>>>>>> preference would be to always put TopdownL1 metrics into Default.
>>>>>>>
>>>>>>
>>>>>> For the future platforms, there should be always at least TopdownL1
>>>>>> support. Intel even adds extra fixed counters for the TopdownL1
>> events.
>>>>>>
>>>>>> Maybe the script should be changed to only mark the old pre-Icelake as
>>>>>> no TopdownL1 Default. For the other platforms, always add TopdownL1 as
>>>>>> Default. It would avoid manually adding it for every new platforms.
>>>>>
>>>>> That's fair. What about TopdownL2 that is currently only in the
>>>>> Default set for SPR?
>>>>>
>>>>
>>>> Yes, the TopdownL2 is a bit tricky, which requires much more events.
>>>> Could you please set it just for SPR/EMR/GNR for now?
>>>>
>>>> I will ask around internally and make a long-term solution for the
>>>> TopdownL2.
>>>
>>> Thanks Kan, I've updated the script the existing way for now. Thomas
>>> saw another issue with TSC which is also fixed. I'm trying to
>>> understand what happened with it before sending out v6:
>>>
>> https://lore.kernel.org/lkml/4f42946ffdf474fbf8aeaa142c25a25ebe739b78.camel@intel.com/
>>> """
>>> There are all some errors like this,
>>>
>>> Testing tma_cisc
>>> Metric contains missing events
>>> Cannot resolve IDs for tma_cisc: cpu_atom@TOPDOWN_FE_BOUND.CISC@ / (5
>>> * cpu_atom@CPU_CLK_UNHALTED.CORE@)
>>> """
>>> But checking the json I wasn't able to spot a model with the metric
>>> and without these json events. Knowing the model would make my life
>>> easier :-)
>>>
>>
>> The problem should be caused by the fundamental Topdown metrics, e.g.,
>> tma_frontend_bound, since the MetricThreshold of the tma_cisc requires
>> the Topdown metrics.
>>
>> $ ./perf stat -M tma_frontend_bound
>> Cannot resolve IDs for tma_frontend_bound:
>> cpu_atom@TOPDOWN_FE_BOUND.ALL@ / (8 * cpu_atom@CPU_CLK_UNHALTED.CORE@)
>>
>>
>> The metric itself is correct.
>>
>> + "BriefDescription": "Counts the number of issue slots that were
>> not consumed by the backend due to frontend stalls.",
>> + "MetricExpr": "cpu_atom@TOPDOWN_FE_BOUND.ALL@ / (8 *
>> cpu_atom@CPU_CLK_UNHALTED.CORE@)",
>> + "MetricGroup": "TopdownL1;tma_L1_group",
>> + "MetricName": "tma_frontend_bound",
>> + "MetricThreshold": "(tma_frontend_bound >0.20)",
>> + "MetricgroupNoGroup": "TopdownL1",
>> + "ScaleUnit": "100%",
>> + "Unit": "cpu_atom"
>> + },
>>
>> However, when I dump the debug information,
>> ./perf stat -M tma_frontend_bound -vvv
>>
>> I got below debug information. I have no idea where the slot is from.
>> It seems the perf code mess up the p-core metrics with the e-core
>> metrics. But why only slot?
>> It seems a bug of perf tool.
>>
>> found event cpu_atom@CPU_CLK_UNHALTED.CORE@
>> found event cpu_atom@TOPDOWN_FE_BOUND.ALL@
>> found event slots
>> Parsing metric events
>>
>> '{cpu_atom/CPU_CLK_UNHALTED.CORE,metric-id=cpu_atom!3CPU_CLK_UNHALTED.CORE!3/,cpu_atom/TOPDOWN_FE_BOUND.ALL,metric-id=cpu_atom!3TOPDOWN_FE_BOUND.ALL!3/,slots/metric-id=slots/}:W'
It's because perf adds "slots" as a tool event for the e-core Topdown
metrics. There is no "slots" event on the e-core.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/metricgroup.c#n1481

I will check why the "slots" event is added as a tool event for the
e-core. That doesn't make sense.
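
For illustration only, a minimal stand-alone sketch of the kind of
PMU-aware check I mean. The helper names (pmu_has_slots,
maybe_add_slots) are made up for this example and are not the actual
metricgroup.c code; the point is only that the implicit "slots" event
should be requested for a PMU that actually has it (cpu / cpu_core),
not for cpu_atom:

/*
 * Hypothetical sketch, not the perf tool code: gate the implicit
 * "slots" event on the PMU the metric targets.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Assumed helper for this sketch: treat cpu_core (and plain "cpu")
 * as exposing a "slots" event, and cpu_atom as not.
 */
static bool pmu_has_slots(const char *pmu_name)
{
	return pmu_name == NULL ||
	       !strcmp(pmu_name, "cpu") ||
	       !strcmp(pmu_name, "cpu_core");
}

/* Decide whether to inject "slots" for a Topdown metric on this PMU. */
static void maybe_add_slots(const char *metric_pmu, const char *metric_name)
{
	if (pmu_has_slots(metric_pmu))
		printf("%s: adding implicit \"slots\" event\n", metric_name);
	else
		printf("%s: %s has no \"slots\" event, skipping\n",
		       metric_name, metric_pmu);
}

int main(void)
{
	maybe_add_slots("cpu_core", "tma_frontend_bound");
	maybe_add_slots("cpu_atom", "tma_frontend_bound");
	return 0;
}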
Thanks,
Kan
>>
>
> Some more clues for me but still no model name :-)
> If this were in the metric json I'd expect the issue to be here:
> https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py#L1626
> but it appears the PMU in perf is somehow injecting events - I wasn't aware
> this happened but I don't see every change, my memory is also fallible. I'd
> expect the injection if it's happening to be in:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/arch/x86/util/topdown.c?h=perf-tools-next
> or:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/metricgroup.c?h=perf-tools-next
> and I'm not seeing it. Could you help me to debug as I have no way to
> reproduce? Perhaps set a watch point on the number of entries in the evlist.
>
> Thanks,
> Ian
>
>
>
>>
>>
>> Thanks,
>> Kan
>>
>