From: Namhyung Kim <namhyung@kernel.org>
To: Michael Petlan <mpetlan@redhat.com>
Cc: vmolnaro@redhat.com, linux-perf-users@vger.kernel.org,
acme@kernel.org, acme@redhat.com
Subject: Re: [PATCH] perf test stat_bpf_counter.sh: Remove comparison of separate runs
Date: Sat, 15 Jun 2024 20:56:14 -0700
Message-ID: <Zm5iXuQFE6TmztkC@google.com>
In-Reply-To: <alpine.LRH.2.20.2406131407370.4040@Diego>
Hello,
On Thu, Jun 13, 2024 at 04:20:58PM +0200, Michael Petlan wrote:
> On Tue, 4 Jun 2024, Namhyung Kim wrote:
> > On Tue, Jun 04, 2024 at 05:31:11PM +0200, vmolnaro@redhat.com wrote:
> > > From: Veronika Molnarova <vmolnaro@redhat.com>
> > >
> > > The test has been failing for some time when two separate runs of
> > > perf benchmarks are recorded and the counts of the samples are compared,
> > > while once the recording was done with option --bpf-counters and once
> > > without it. It is expected that the count of the samples should be within
> > > a certain range: at first the difference had to be within 10%,
> > > which was later raised to 20%. However, the test case keeps failing
> > > on certain architectures as recording the same benchmark can provide
> > > completely different sample counts based on the current load of the
> > > system.
> > >
> > > Sampling two separate runs on intel-eaglestream-spr-13 of "perf stat
> > > --no-big-num -e cycles -- perf bench sched messaging -g 1 -l 100 -t":
> > >
> > > Performance counter stats for 'perf bench sched messaging -g 1 -l 100 -t':
> > >
> > > 396782898 cycles
> > >
> > > 0.010051983 seconds time elapsed
> > >
> > > 0.008664000 seconds user
> > > 0.097058000 seconds sys
> > >
> > > Performance counter stats for 'perf bench sched messaging -g 1 -l 100 -t':
> > >
> > > 1431133032 cycles
> > >
> > > 0.021803714 seconds time elapsed
> > >
> > > 0.023377000 seconds user
> > > 0.349918000 seconds sys
> > >
> > > That is, the count ranges from roughly 400 million to 1400 million cycles.
> > >
> > > From the testing point of view, it does not make sense to compare two
> > > separate runs against each other when the conditions may change
> > > significantly. Remove the comparison of two separate runs and check only
> > > whether perf stat works as expected with the --bpf-counters option. Compare
> > > the sample counts only when the samples are recorded simultaneously,
> > > ensuring the same conditions.
> >
> > Hmm.. but having a test which checks if the output is sane can be
> > useful. If it's a problem of dynamic changes in cpu cycles, maybe
> > we can use 'instructions' event instead (probably with :u) to get
> > more stable values?
>
> Hello.
>
> As far as I understand it, nowadays, the test checks two things:
>
> test_bpf_counters()
> record $workload twice (with and without --bpf-counters)
> check that there are numeric results
> compare the results
>
> test_bpf_modifier()
> record $workload once with and without modifier (which should be what
> --bpf-counters switch does to the events, right?)
> check that there are numeric results
> compare the results
>
> The problem here is not only the "dynamic changes in cpu-cycles", it
> is rather in the testcase design itself. A testcase that compares two
> metrics should get rid of all possible variable effects that influence it.
>
> The second function actually compares the values correctly, since they
> are measured against the same identical workload.
>
> So, in my opinion, a better test design would be:
>
> (1) check that record without --bpf-counters works as a reference run
Do you mean using `perf stat record`? Or saving the output of `perf stat`
in a text file?
> (2) check that record with --bpf-counters works too
> (if not, we may compare to (1) to find out if whole `record` is broken
> or just the --bpf-counters option)
Ditto.
> (3) possibly run `perf evlist` to check that --bpf-counters has added the
> '/b' modifier
The `perf evlist` would work only for `perf stat record`. Then I think
it's a completely new test case.
> (4) check with versus without "b", such as current test_bpf_modifier()
> function does
>
> I like what Veronika suggests; it is basically the above, except for (3).
>
> ...
>
> In case we want to preserve two separate runs in test_bpf_counters() _and_
> also check the numbers, then we should:
> - use some more predictable workload:
> - in an ideal case a statically linked simple binary
> - in less-than-ideal case `perf test -w something`
> - use instructions instead of cycles
> However, I don't like that idea very much, because of the design principles
> mentioned above.
I understand it's not ideal. Maybe it's fine not to check the numbers
here, because we already do that in test_bpf_modifier(). But it'd be nice
to have that check. Let's try the instructions event first and change it
later if it doesn't go well.
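
To make that concrete, here is a rough sketch of what an instructions-based
check could look like. It is not the code in stat_bpf_counter.sh: the helper
names extract_count and within_pct are made up for illustration, the 20%
tolerance is just the value mentioned in the patch description, and the
commented usage assumes that instructions:u works together with
--bpf-counters on the test machine.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not the actual test code.

# Print the count of event $1 from `perf stat --no-big-num` output on stdin.
# Relies on the count being the first field and the event name the second.
extract_count()
{
	awk -v ev="$1" '$2 == ev { print $1; exit }'
}

# Succeed if count $2 is within $3 percent of count $1 (integer arithmetic).
within_pct()
{
	base=$1; other=$2; pct=$3
	margin=$((base * pct / 100))
	[ "$other" -ge $((base - margin)) ] && [ "$other" -le $((base + margin)) ]
}

# Intended use (needs perf, a BPF-capable kernel and enough privileges;
# $workload stands for whatever benchmark the test runs):
#   base=$(perf stat --no-big-num -e instructions:u -- $workload 2>&1 |
#          extract_count instructions:u)
#   bpf=$(perf stat --no-big-num --bpf-counters -e instructions:u -- $workload 2>&1 |
#          extract_count instructions:u)
#   within_pct "$base" "$bpf" 20 || echo "counts differ by more than 20%"
```

With the two cycle counts quoted above (396782898 and 1431133032),
within_pct fails at 20%, which is exactly the flakiness being discussed.
Whether instructions:u is stable enough across two separate runs is what
we would have to find out.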
Thanks,
Namhyung