From: Puranjay Mohan <puranjay@kernel.org>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>,
Alexei Starovoitov <ast@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, Daniel Borkmann <daniel@iogearbox.net>,
"David S. Miller" <davem@davemloft.net>,
"Eduard Zingerman" <eddyz87@gmail.com>,
Eric Dumazet <edumazet@google.com>, Hao Luo <haoluo@google.com>,
Helge Deller <deller@gmx.de>, Jakub Kicinski <kuba@kernel.org>,
"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
Jiri Olsa <jolsa@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
linux-kernel@vger.kernel.org, linux-parisc@vger.kernel.org,
linux-riscv@lists.infradead.org,
Martin KaFai Lau <martin.lau@linux.dev>,
Mykola Lysenko <mykolal@fb.com>,
netdev@vger.kernel.org, Palmer Dabbelt <palmer@dabbelt.com>,
Paolo Abeni <pabeni@redhat.com>,
Paul Walmsley <paul.walmsley@sifive.com>,
Shuah Khan <shuah@kernel.org>, Song Liu <song@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>,
Yonghong Song <yonghong.song@linux.dev>
Subject: Re: [PATCH bpf-next 4/5] selftests/bpf: Add benchmark for bpf_csum_diff() helper
Date: Wed, 23 Oct 2024 15:37:07 +0000
Message-ID: <mb61pttd2bzks.fsf@kernel.org>
In-Reply-To: <mb61p8qugc955.fsf@kernel.org>
Puranjay Mohan <puranjay@kernel.org> writes:
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
>> On Tue, Oct 22, 2024 at 3:21 AM Puranjay Mohan <puranjay@kernel.org> wrote:
>>>
>>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>>
>>> > On Mon, Oct 21, 2024 at 5:22 AM Puranjay Mohan <puranjay@kernel.org> wrote:
>>> >>
>>> >> Add a microbenchmark for bpf_csum_diff() helper. This benchmark works by
>>> >> filling a 4KB buffer with random data and calculating the internet
>>> >> checksum on different parts of this buffer using bpf_csum_diff().
>>> >>
>>> >> Example run using ./benchs/run_bench_csum_diff.sh on x86_64:
>>> >>
>>> >> [bpf]$ ./benchs/run_bench_csum_diff.sh
>>> >> 4 2.296 ± 0.066M/s (drops 0.000 ± 0.000M/s)
>>> >> 8 2.320 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>>> >> 16 2.315 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>>> >> 20 2.318 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>>> >> 32 2.308 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>>> >> 40 2.300 ± 0.029M/s (drops 0.000 ± 0.000M/s)
>>> >> 64 2.286 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>>> >> 128 2.250 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>>> >> 256 2.173 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>>> >> 512 2.023 ± 0.055M/s (drops 0.000 ± 0.000M/s)
>>> >
>>> > you are not benchmarking bpf_csum_diff(), you are benchmarking how
>>> > often you can call bpf_prog_test_run(). Add some batching on the BPF
>>> > side, these numbers tell you that there is no difference between
>>> > calculating checksum for 4 bytes and for 512, that didn't seem strange
>>> > to you?
>>>
>>> This didn't seem strange to me because if you see the tables I added to
>>> the cover letter, there is a clear improvement after optimizing the
>>> helper and arm64 even shows a linear drop going from 4 bytes to 512
>>> bytes, even after the optimization.
>>>
>>
>> Regardless of optimization, it's strange that throughput barely
>> differs when you vary the amount of work by more than 100x. This
>> wouldn't be strange if this checksum calculation was some sort of
>> cryptographic hash, where it's intentional to have the same timing
>> regardless of amount of work, or something along those lines. But I
>> don't think that's the case here.
>>
>> But as it is right now, this benchmark is benchmarking
>> bpf_prog_test_run(), as I mentioned, which seems to be bottlenecking
>> at about 2mln/s throughput for your machine. bpf_csum_diff()'s
>> overhead is trivial compared to bpf_prog_test_run() overhead and
>> syscall/context switch overhead.
>>
>> We shouldn't add the benchmark that doesn't benchmark the right thing.
>> So just add a bpf_for(i, 0, 100) loop doing bpf_csum_diff(), and then
>> do atomic increment *after* the loop (to minimize atomics overhead).
>
> Thanks, now I understand what you meant. Will add the bpf_for() in the
> next version.
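
For reference, the batching pattern suggested above (loop over the
helper, then one atomic increment after the loop) can be sketched in
plain user-space C. This is only an illustration under assumptions:
inet_sum(), run_batch(), hits and BATCH are hypothetical names, the
checksum is a simple RFC 1071 style sum rather than the kernel's
implementation, and a real BPF program would use bpf_for() and
bpf_csum_diff() instead.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH 100 /* mirrors the bpf_for(i, 0, 100) suggested in the thread */

/* Hypothetical hit counter, standing in for the benchmark's counter. */
static atomic_ulong hits;

/* Ones'-complement sum over big-endian 16-bit words, folded to 16 bits.
 * Illustrative only; this is not the kernel's csum code. */
static uint16_t inet_sum(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len) /* odd trailing byte is padded with a zero byte */
		sum += (uint32_t)buf[i] << 8;
	while (sum >> 16) /* fold carries back into the low 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* One "triggered" run: BATCH checksum calls, then a single atomic add,
 * so the cost of the atomic is paid once per batch, not once per call.
 * Note: an optimizing compiler may collapse this loop because inet_sum()
 * has no side effects; a real benchmark has to prevent that. */
static uint16_t run_batch(const uint8_t *buf, size_t len)
{
	uint16_t last = 0;
	int i;

	for (i = 0; i < BATCH; i++)
		last = inet_sum(buf, len);
	atomic_fetch_add_explicit(&hits, BATCH, memory_order_relaxed);
	return last;
}
```

Calling run_batch(buf, len) once then accounts for BATCH checksum
calls, so the reported rate is per helper call rather than per
triggered run.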

I have decided to drop this patch: even after adding bpf_for(), the
throughput difference between 4B and 512B is small, so benchmarking
bpf_csum_diff() with this trigger-based framework is not useful.

v2 will not include this patch, but the cover letter will still carry
the tables showing the difference before and after the optimization.

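
One rough way to reason about this is a simple cost model, sketched
below. It assumes each triggered run pays a fixed overhead once and
then makes a batch of helper calls; calls_per_sec() and all of its
parameters are hypothetical and nothing here is measured data.

```c
/* Helper calls per second under a simple model (an assumption, not a
 * measurement): every triggered run costs fixed_ns once, plus
 * per_call_ns for each of the `batch` helper calls it makes. */
static double calls_per_sec(double fixed_ns, double per_call_ns,
			    unsigned int batch)
{
	return batch * 1e9 / (fixed_ns + batch * per_call_ns);
}
```

Under this model the buffer size only shows up in the result once
batch * per_call_ns is comparable to fixed_ns. If the rate still
barely moves between 4B and 512B, either the fixed part still
dominates or the per-call cost is nearly flat across those sizes.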
Thanks,
Puranjay
Thread overview: 24+ messages
2024-10-21 12:21 [PATCH bpf-next 0/5] Optimize bpf_csum_diff() and homogenize for all archs Puranjay Mohan
2024-10-21 12:21 ` [PATCH bpf-next 1/5] net: checksum: move from32to16() to generic header Puranjay Mohan
2024-10-21 13:41 ` Daniel Borkmann
2024-10-22 9:49 ` Toke Høiland-Jørgensen
2024-10-22 13:50 ` kernel test robot
2024-10-21 12:21 ` [PATCH bpf-next 2/5] bpf: bpf_csum_diff: optimize and homogenize for all archs Puranjay Mohan
2024-10-21 13:42 ` Daniel Borkmann
2024-10-22 9:54 ` Toke Høiland-Jørgensen
2024-10-22 18:09 ` kernel test robot
2024-10-21 12:21 ` [PATCH bpf-next 3/5] selftests/bpf: don't mask result of bpf_csum_diff() in test_verifier Puranjay Mohan
2024-10-21 13:01 ` Helge Deller
2024-10-21 13:14 ` Puranjay Mohan
2024-10-21 14:04 ` Helge Deller
2024-10-21 13:42 ` Daniel Borkmann
2024-10-22 9:55 ` Toke Høiland-Jørgensen
2024-10-21 12:21 ` [PATCH bpf-next 4/5] selftests/bpf: Add benchmark for bpf_csum_diff() helper Puranjay Mohan
2024-10-21 13:43 ` Daniel Borkmann
2024-10-21 23:28 ` Andrii Nakryiko
2024-10-22 10:21 ` Puranjay Mohan
2024-10-22 17:47 ` Andrii Nakryiko
2024-10-22 17:58 ` Puranjay Mohan
2024-10-23 15:37 ` Puranjay Mohan [this message]
2024-10-21 12:21 ` [PATCH bpf-next 5/5] selftests/bpf: Add a selftest for bpf_csum_diff() Puranjay Mohan
2024-10-21 13:44 ` Daniel Borkmann