linux-perf-users.vger.kernel.org archive mirror
From: Jirka Hladky <jhladky@redhat.com>
To: Sandipan Das <sandipan.das@amd.com>
Cc: linux-perf-users@vger.kernel.org, ravi.bangoria@amd.com,
	ananth.narayan@amd.com
Subject: Re: AMD -missing perf stat metricgroup "pipeline"
Date: Tue, 22 Nov 2022 18:41:29 +0100
Message-ID: <CAE4VaGDk3os+P6-+XojA8=cmBiFJj1UrU1cQNXL-qPji6X07ew@mail.gmail.com>
In-Reply-To: <0ecb09b3-5a72-6bac-0236-8807bbacf702@amd.com>

(forgot to send the message in plain text mode - resending)

Hi Sandipan,

> For determining if a workload is backend-bound, the recommended
> method on Zen 4 is to use the pipeline utilization metrics. We are
> in the process of providing similar metrics and metric groups through
> the perf JSON event files for Zen 4 and they will be out very soon.


This is great news - I'm looking forward to having it released! :-)

> The PPR for Genoa processors is available here:
> https://www.amd.com/system/files/TechDocs/55901_0.25.zip

Thanks for sharing it!

I can confirm that my workload is heavily backend bound: about 86% of
the dispatch slots went unused because of backend stalls. See [1].
That is exactly what I was looking for. It will be awesome once this
is easily accessible via the pipeline metricgroup.

Thanks a lot!
Jirka

[1]
perf stat -e r100431EA0,r1004360A0,r4300C1,r430076  ./harmonic_series 0 1e9
Time elapsed: 2.44143 s
AVX512:
Sum 23.3799
Difference Sum - Formula -1.91847e-13
Time elapsed: 2.44154 s

Performance counter stats for './harmonic_series 0 1e9':

   46,731,488,713      r100431EA0
           22,145      r1004360A0
    5,015,015,425      r4300C1
    9,021,144,392      r430076

      2.442290274 seconds time elapsed

      2.437987000 seconds user
      0.000000000 seconds sys

Total Dispatch Slots: Up to 6 instructions can be dispatched in one
cycle, so Total Dispatch Slots = 6 * Event[430076]

Retiring: Fraction of dispatch slots used by ops that retired:
Event[4300C1] / Total Dispatch Slots
(counts in billions) 5.0/(9.0*6)*100 = 9%

Backend Bound: Fraction of dispatch slots that remained unused
because of backend stalls: Event[100431EA0] / Total Dispatch Slots
(counts in billions) 46.7/(9.0*6)*100 = 86%
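
The same arithmetic with the exact counts, as a minimal Python sketch.
The counts are copied from the perf stat output above; the names
`slot_fraction` and `DISPATCH_WIDTH` are only illustrative, and the
factor 6 is the dispatch width quoted above for Zen 4.

```python
# Counts copied from the perf stat output above.
counts = {
    "r100431EA0": 46_731_488_713,  # numerator of the Backend Bound formula
    "r1004360A0": 22_145,          # collected above, not used in these two formulas
    "r4300C1": 5_015_015_425,      # numerator of the Retiring formula
    "r430076": 9_021_144_392,      # multiplied by 6 to get Total Dispatch Slots
}

DISPATCH_WIDTH = 6  # up to 6 instructions dispatched per cycle (from the text above)


def slot_fraction(numerator: int, cycle_event: int, width: int = DISPATCH_WIDTH) -> float:
    """Return numerator / (width * cycle_event), a share of Total Dispatch Slots."""
    return numerator / (width * cycle_event)


retiring = slot_fraction(counts["r4300C1"], counts["r430076"])
backend_bound = slot_fraction(counts["r100431EA0"], counts["r430076"])

print(f"Retiring:      {retiring:.1%}")
print(f"Backend Bound: {backend_bound:.1%}")
```

With the exact counts this gives roughly 9.3% retiring and 86.3%
backend bound, matching the rounded figures above.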


On Tue, Nov 22, 2022 at 5:48 AM Sandipan Das <sandipan.das@amd.com> wrote:
>
> Hi,
>
> On 11/21/2022 7:33 PM, Jirka Hladky wrote:
> >
> > I'm testing AVX-512 packed double performance on the AMD Zen4
> > platform, and I need help identifying the backend-bound workloads. On
> > Intel systems, I use the metricgroup pipeline:
> >
> > perf stat -M pipeline binary
> >
> > which gives me exactly what I need.
> >
> > Are there plans to add a similar metric group for AMD systems?
> >
>
> For determining if a workload is backend-bound, the recommended
> method on Zen 4 is to use the pipeline utilization metrics. We are
> in the process of providing similar metrics and metric groups through
> the perf JSON event files for Zen 4 and they will be out very soon.
>
> The Processor Programming Reference (PPR) for Zen 4 based parts
> has a table titled "Guidance for Pipeline Utilization Statistics"
> which has the formulae for different Level 1 and 2 pipeline
> utilization metrics.
>
> The PPR for Genoa processors is available here:
> https://www.amd.com/system/files/TechDocs/55901_0.25.zip
>
> In this specific document, the table is on page 235 under section
> 2.1.15 titled "Performance Monitor Counters".
>
> It may not be convenient to find out if a workload is backend-bound
> without the use of a metric but one can still do it by programming
> the raw events that make up that metric.
>
> E.g. the formula for determining backend boundedness is:
> Event[100431EA0] / (6 * Event[430076])
>
> Running perf with the raw events gives the counts, which can then be
> used to calculate the metric.
>
> E.g.
>
> $ perf stat -e r100431EA0,r430076 ./test
>
>  Performance counter stats for './test':
>
>            750,372      r100431EA0:u
>      7,500,728,022      r430076:u
>
>        2.894204814 seconds time elapsed
>
>        2.894060000 seconds user
>        0.000000000 seconds sys
>
> The backend boundedness is then 750372 / (6 * 7500728022)
> which is roughly 0.001667%.
>
> - Sandipan
>
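
The manual step in the quoted message (read the raw counts, then apply
the formula) could be scripted roughly as follows. This is only a
sketch: the regular expression is an assumption based on the sample
output quoted above, not a documented perf output format, and
`parse_raw_counts` is a hypothetical helper name.

```python
import re

# Sample copied from the quoted perf stat output.
SAMPLE = """\
 Performance counter stats for './test':

           750,372      r100431EA0:u
     7,500,728,022      r430076:u

       2.894204814 seconds time elapsed
"""

# Assumed line shape: <count with commas> <raw event name> [:modifier]
COUNT_LINE = re.compile(r"^\s*([\d,]+)\s+(r[0-9A-Fa-f]+)(?::\w+)?\s*$")


def parse_raw_counts(text: str) -> dict[str, int]:
    """Map raw event names (modifier stripped) to integer counts."""
    counts = {}
    for line in text.splitlines():
        match = COUNT_LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1).replace(",", ""))
    return counts


counts = parse_raw_counts(SAMPLE)
# Backend boundedness: Event[100431EA0] / (6 * Event[430076])
backend_bound = counts["r100431EA0"] / (6 * counts["r430076"])
print(f"Backend bound: {backend_bound:.6%}")
```

For the sample above this prints a value of about 0.001667%, the same
figure as in the quoted calculation.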


-- 
-Jirka

