From: Puranjay Mohan <puranjay@kernel.org>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>,
Alexei Starovoitov <ast@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, Daniel Borkmann <daniel@iogearbox.net>,
"David S. Miller" <davem@davemloft.net>,
Eduard Zingerman <eddyz87@gmail.com>,
Eric Dumazet <edumazet@google.com>, Hao Luo <haoluo@google.com>,
Helge Deller <deller@gmx.de>, Jakub Kicinski <kuba@kernel.org>,
"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
Jiri Olsa <jolsa@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
linux-kernel@vger.kernel.org, linux-parisc@vger.kernel.org,
linux-riscv@lists.infradead.org,
Martin KaFai Lau <martin.lau@linux.dev>,
Mykola Lysenko <mykolal@fb.com>,
netdev@vger.kernel.org, Palmer Dabbelt <palmer@dabbelt.com>,
Paolo Abeni <pabeni@redhat.com>,
Paul Walmsley <paul.walmsley@sifive.com>,
Shuah Khan <shuah@kernel.org>, Song Liu <song@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>,
Yonghong Song <yonghong.song@linux.dev>
Subject: Re: [PATCH bpf-next 4/5] selftests/bpf: Add benchmark for bpf_csum_diff() helper
Date: Tue, 22 Oct 2024 17:58:14 +0000
Message-ID: <mb61p8qugc955.fsf@kernel.org>
In-Reply-To: <CAEf4BzZ-gfBqez-QJCSRVOPnvz-inaiVdNGOFRCdc2KQbnmeZQ@mail.gmail.com>
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> On Tue, Oct 22, 2024 at 3:21 AM Puranjay Mohan <puranjay@kernel.org> wrote:
>>
>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>
>> > On Mon, Oct 21, 2024 at 5:22 AM Puranjay Mohan <puranjay@kernel.org> wrote:
>> >>
>> >> Add a microbenchmark for bpf_csum_diff() helper. This benchmark works by
>> >> filling a 4KB buffer with random data and calculating the internet
>> >> checksum on different parts of this buffer using bpf_csum_diff().
>> >>
>> >> Example run using ./benchs/run_bench_csum_diff.sh on x86_64:
>> >>
>> >> [bpf]$ ./benchs/run_bench_csum_diff.sh
>> >> 4 2.296 ± 0.066M/s (drops 0.000 ± 0.000M/s)
>> >> 8 2.320 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>> >> 16 2.315 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>> >> 20 2.318 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>> >> 32 2.308 ± 0.003M/s (drops 0.000 ± 0.000M/s)
>> >> 40 2.300 ± 0.029M/s (drops 0.000 ± 0.000M/s)
>> >> 64 2.286 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>> >> 128 2.250 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>> >> 256 2.173 ± 0.001M/s (drops 0.000 ± 0.000M/s)
>> >> 512 2.023 ± 0.055M/s (drops 0.000 ± 0.000M/s)
>> >
>> > you are not benchmarking bpf_csum_diff(), you are benchmarking how
>> > often you can call bpf_prog_test_run(). Add some batching on the BPF
>> > side, these numbers tell you that there is no difference between
>> > calculating checksum for 4 bytes and for 512, that didn't seem strange
>> > to you?
>>
>> This didn't seem strange to me because if you see the tables I added to
>> the cover letter, there is a clear improvement after optimizing the
>> helper and arm64 even shows a linear drop going from 4 bytes to 512
>> bytes, even after the optimization.
>>
>
> Regardless of optimization, it's strange that throughput barely
> differs when you vary the amount of work by more than 100x. This
> wouldn't be strange if this checksum calculation was some sort of
> cryptographic hash, where it's intentional to have the same timing
> regardless of amount of work, or something along those lines. But I
> don't think that's the case here.
>
> But as it is right now, this benchmark is benchmarking
> bpf_prog_test_run(), as I mentioned, which seems to be bottlenecking
> at about 2mln/s throughput for your machine. bpf_csum_diff()'s
> overhead is trivial compared to bpf_prog_test_run() overhead and
> syscall/context switch overhead.
>
> We shouldn't add the benchmark that doesn't benchmark the right thing.
> So just add a bpf_for(i, 0, 100) loop doing bpf_csum_diff(), and then
> do atomic increment *after* the loop (to minimize atomics overhead).
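To put rough numbers on the point quoted above: the 4-byte and 512-byte rows of the table work out to roughly 436 ns and 494 ns per run, so 508 extra bytes of checksumming add only about 59 ns on top of a fixed cost of over 400 ns. The sketch below just redoes that arithmetic from the table; the function names are made up for illustration and nothing here comes from the patch itself.

```c
#include <stdio.h>

/* Convert a throughput in millions of runs per second (the "M/s"
 * column of the table) into average nanoseconds per run. */
double ns_per_run(double mruns_per_sec)
{
	return 1000.0 / mruns_per_sec;
}

/* Average extra nanoseconds per additional byte checksummed, derived
 * from two (buffer size, throughput) rows of the table. */
double extra_ns_per_byte(double small_bytes, double small_mruns,
			 double large_bytes, double large_mruns)
{
	return (ns_per_run(large_mruns) - ns_per_run(small_mruns)) /
	       (large_bytes - small_bytes);
}

/* Print the estimate for the 4-byte and 512-byte rows quoted above. */
void print_overhead_estimate(void)
{
	printf("4 bytes:   %.1f ns per run\n", ns_per_run(2.296));
	printf("512 bytes: %.1f ns per run\n", ns_per_run(2.023));
	printf("extra:     %.3f ns per byte\n",
	       extra_ns_per_byte(4, 2.296, 512, 2.023));
}
```

If the helper dominated the run time, the per-run cost would be expected to grow much more steeply with buffer size than this.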
Thanks, now I understand what you meant. I will add the bpf_for() loop
in the next version.
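As a rough illustration of the batching idea, here is a plain userspace C analogue, not the BPF program that will go into the next version. It does the checksum work 100 times per run and bumps the shared counter once after the loop. The names csum16(), run_batch(), hits and BATCH_ITERS are hypothetical, and csum16() is a simplified stand-in for the helper, not the kernel's bpf_csum_diff().

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_ITERS 100	/* mirrors the suggested bpf_for(i, 0, 100) */

static atomic_long hits;	/* hypothetical count of completed checksums */

/* Simplified 16-bit one's-complement checksum over big-endian words;
 * a stand-in for the helper under test. */
uint16_t csum16(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;	/* pad odd tail with zero */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);	/* fold carries back in */
	return (uint16_t)~sum;
}

/* One "run": repeat the work BATCH_ITERS times, then update the shared
 * counter once so the atomic cost is paid per batch, not per call. */
uint16_t run_batch(const uint8_t *buf, size_t len)
{
	uint16_t last = 0;
	int i;

	for (i = 0; i < BATCH_ITERS; i++)
		last = csum16(buf, len);

	atomic_fetch_add(&hits, BATCH_ITERS);
	return last;
}

/* Read the counter, as a benchmark harness would when reporting. */
long hits_so_far(void)
{
	return atomic_load(&hits);
}
```

With this shape, the fixed cost of entering a run is spread over 100 checksum calls, so the reported rate should track the checksum cost more closely than the cost of starting a run.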
Thanks,
Puranjay