From: Andi Kleen <ak@linux.intel.com>
To: Ian Rogers <irogers@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
linux-perf-users <linux-perf-users@vger.kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v10 1/4] Create source symlink in perf object dir
Date: Tue, 3 Sep 2024 16:25:17 -0700
Message-ID: <Ztea3dUZ-XSG2gfB@tassilo>
In-Reply-To: <CAP-5=fUZwoDrGaEh7Us1aDM+W3aj1zb3D5VEH39qDfCjQGvePQ@mail.gmail.com>
On Mon, Aug 26, 2024 at 04:53:01PM -0700, Ian Rogers wrote:
> On Mon, Aug 26, 2024, 4:34 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >
> > On Mon, Aug 26, 2024 at 08:27:43AM -0700, Ian Rogers wrote:
> > > On Mon, Aug 26, 2024 at 7:32 AM Arnaldo Carvalho de Melo
> > > <acme@kernel.org> wrote:
> > > >
> > > > On Sun, Aug 25, 2024 at 09:58:23AM -0700, Andi Kleen wrote:
> > > > > Arnaldo,
> > > >
> > > > > can you please apply the patchkit? This fixes a regression.
> > > >
> > > > First one was applied, was letting the others to be out there for a
> > > > while, I thought there were concerns about it, but I see Namhyung's Ack,
> > > > so applied.
> > >
> > > Can we not apply this? See comments on the thread. Basically we're
> >
> > And what about the reported segfault?
>
> It is better addressed by:
> https://lore.kernel.org/lkml/20240720074552.1915993-1-irogers@google.com/
I finally got around to testing this other patch.
The point of the feature is to get a metric for every individual
sampling interval, the most fine-grained unit available, as explained in the
original commit message:
perf script: Allow computing 'perf stat' style metrics
Add support for computing 'perf stat' style metrics in 'perf script'.
When using leader sampling we can get metrics ____for each sampling period___
by computing formulas over the values of the different group members.
This allows things like fine grained IPC tracking through sampling, much
more fine grained than with 'perf stat'.
The metric is still averaged over the sampling period, it is not just
for the sampling point.
...
Note the "for each sampling period" which is the key aspect.
With my version I get:
perf record -e '{cycles,instructions}:S' -a tcall
perf script -F +metric
perf 2061404 [000] 6395040.804752: 2687 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: metric: 0.15 insn per cycle
perf 2061404 [001] 6395040.804879: 2411 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [001] 6395040.804879: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [001] 6395040.804879: metric: 0.16 insn per cycle
perf 2061404 [002] 6395040.805000: 2245 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [002] 6395040.805000: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [002] 6395040.805000: metric: 0.18 insn per cycle
perf 2061404 [003] 6395040.805122: 2442 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [003] 6395040.805122: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [003] 6395040.805122: metric: 0.16 insn per cycle
perf 2061404 [004] 6395040.805241: 2208 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [004] 6395040.805241: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [004] 6395040.805241: metric: 0.18 insn per cycle
perf 2061404 [005] 6395040.805359: 2199 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [005] 6395040.805359: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [005] 6395040.805359: metric: 0.18 insn per cycle
perf 2061404 [006] 6395040.805479: 2269 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [006] 6395040.805479: 382 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [006] 6395040.805479: metric: 0.17 insn per cycle
perf 2061404 [007] 6395040.805596: 2215 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [007] 6395040.805596: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [007] 6395040.805596: metric: 0.18 insn per cycle
perf 2061404 [008] 6395040.805715: 2258 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [008] 6395040.805715: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [008] 6395040.805715: metric: 0.18 insn per cycle
perf 2061404 [009] 6395040.805835: 2293 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [009] 6395040.805835: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
You see there is one metric for every sampling period.
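As a sanity check, the per-period numbers above can be reproduced by hand: with leader sampling both group members are read at the same timestamp, so the metric for a period is just instructions divided by cycles for that sample. A minimal Python sketch, not the perf implementation; the function name is made up and the values are copied from a few of the samples above:

```python
# Illustrative only: recompute "insn per cycle" for each sampling period
# from (cycles, instructions) pairs taken from the output above.

def ipc_per_period(samples):
    """samples: list of (cpu, cycles, instructions), one tuple per sampling period."""
    return [(cpu, round(insns / cycles, 2)) for cpu, cycles, insns in samples]

# Values copied from the samples on CPUs 0, 1, 2 and 6 above.
samples = [
    (0, 2687, 396),
    (1, 2411, 396),
    (2, 2245, 396),
    (6, 2269, 382),
]

for cpu, ipc in ipc_per_period(samples):
    print(f"[{cpu:03d}] metric: {ipc:.2f} insn per cycle")
```

Each printed value matches the metric line that follows the corresponding cycles/instructions pair in the output.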
But Ian's version generates this:
perf 2061404 [000] 6395040.804752: 2687 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: metric: 0.15 insn per cycle
perf 2061404 [000] 6395040.804752: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: metric: 0.07 insn per cycle
This is the only metric for "perf"
perf 2061404 [001] 6395040.804879: 2411 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [001] 6395040.804879: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [002] 6395040.805000: 2245 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [002] 6395040.805000: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [003] 6395040.805122: 2442 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [003] 6395040.805122: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [004] 6395040.805241: 2208 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [004] 6395040.805241: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [005] 6395040.805359: 2199 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [005] 6395040.805359: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [006] 6395040.805479: 2269 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [006] 6395040.805479: 382 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [007] 6395040.805596: 2215 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [007] 6395040.805596: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [008] 6395040.805715: 2258 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [008] 6395040.805715: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [009] 6395040.805835: 2293 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [009] 6395040.805835: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [010] 6395040.806013: 2159 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [010] 6395040.806013: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [011] 6395040.806121: 3058 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
.... <lots more samples, but no more metrics for "perf">
There are some metrics for other processes, but I can't tell what logic it follows here
(as in, which intervals actually get aggregated).
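One way to check this mechanically is to group the 'perf script -F +metric' output by comm, CPU and timestamp and list the sampling periods that have no metric line. A rough Python sketch under stated assumptions: the regex assumes the line layout shown above (a comm without spaces, then pid, [cpu], timestamp), and the helper name is hypothetical:

```python
import re
from collections import OrderedDict

# Assumed layout, as in the output above: "comm pid [cpu] timestamp: rest"
LINE_RE = re.compile(r"^\s*(\S+)\s+(\d+)\s+\[(\d+)\]\s+(\d+\.\d+):\s+(.*)$")

def periods_without_metric(lines):
    """Return (comm, cpu, timestamp) keys for sampling periods with no 'metric:' line."""
    seen = OrderedDict()  # key -> True once a metric line was seen for that period
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that does not look like a sample line
        comm, _pid, cpu, ts, rest = m.groups()
        key = (comm, int(cpu), ts)
        seen[key] = seen.get(key, False) or rest.startswith("metric:")
    return [key for key, has_metric in seen.items() if not has_metric]

# First lines of the second output above.
output = """\
perf 2061404 [000] 6395040.804752: 2687 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: metric: 0.15 insn per cycle
perf 2061404 [000] 6395040.804752: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: metric: 0.07 insn per cycle
perf 2061404 [001] 6395040.804879: 2411 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [001] 6395040.804879: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
""".splitlines()

for key in periods_without_metric(output):
    print("no metric for", key)
```

On these lines it reports only the period on CPU 1, which is the first one in that output without a metric.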
So yes, his implementation may be cleaner, but it simply doesn't solve the problem;
it implements something else.
-Andi
Thread overview:
2024-08-13 21:36 [PATCH v10 1/4] Create source symlink in perf object dir Andi Kleen
2024-08-13 21:36 ` [PATCH v10 2/4] perf test: Support external tests for separate objdir Andi Kleen
2024-08-13 21:36 ` [PATCH v10 3/4] perf script: Fix perf script -F +metric Andi Kleen
2024-08-13 21:36 ` [PATCH v10 4/4] Add a test case for " Andi Kleen
2024-08-25 16:58 ` [PATCH v10 1/4] Create source symlink in perf object dir Andi Kleen
2024-08-26 14:32 ` Arnaldo Carvalho de Melo
2024-08-26 15:27 ` Ian Rogers
2024-08-26 23:34 ` Arnaldo Carvalho de Melo
2024-08-26 23:53 ` Ian Rogers
2024-09-03 23:25 ` Andi Kleen [this message]