From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kernel Team <kernel-team@fb.com>,
Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next v2 2/2] [no_merge] selftests/bpf: Benchmark runtime performance with private stack
Date: Mon, 22 Jul 2024 09:33:40 -0700
Message-ID: <c8c63a07-7eab-41e8-bb9f-05a42f86220f@linux.dev>
In-Reply-To: <CAADnVQ+C--rr_C=dCqwGhZux4JQSHJvAazgem1L8OGx7CC6+nw@mail.gmail.com>
On 7/19/24 6:08 PM, Alexei Starovoitov wrote:
> On Thu, Jul 18, 2024 at 1:52 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>>
>> The following are the jited progs with private stack:
>>
>> subprog:
>> 0: f3 0f 1e fa endbr64
>> 4: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
>> 9: 66 90 xchg ax,ax
>> b: 55 push rbp
>> c: 48 89 e5 mov rbp,rsp
>> f: f3 0f 1e fa endbr64
>> 13: 49 b9 70 a6 c1 08 7e movabs r9,0x607e08c1a670
>> 1a: 60 00 00
>> 1d: 65 4c 03 0c 25 00 1a add r9,QWORD PTR gs:0x21a00
>> 24: 02 00
>> 26: 31 c0 xor eax,eax
>> 28: c9 leave
>> 29: c3 ret
> Thanks for doing the benchmarking.
> It's clear now that worst case overhead is ~5%.
> Could you do one more benchmark such that the 'main prog'
> below stays as-is with setup of r9 and push/pop r9,
> but in the subprog above there is no 'movabs r9 + add r9' ?
> To simulate the case when a big function with a large stack
> triggers private-stack use, but it calls a subprog without
> a private stack.
> I think we should see a different overhead.
> Obviously subprog won't have these two extra insns that set up r9
> which would lead to something like ~4% slowdown vs 5%,
> but I feel the overhead of pure push/pop r9 around calls
> will be lower as well, because r9 is not written into inside subprog.
> The CPU HW should be able to execute such push/pop faster.
> I'm curious what it is.
Sure. Let me do an experiment with this.
>
>> main prog:
>> 0: f3 0f 1e fa endbr64
>> 4: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
>> 9: 66 90 xchg ax,ax
>> b: 55 push rbp
>> c: 48 89 e5 mov rbp,rsp
>> f: f3 0f 1e fa endbr64
>> 13: 49 b9 88 a6 c1 08 7e movabs r9,0x607e08c1a688
>> 1a: 60 00 00
>> 1d: 65 4c 03 0c 25 00 1a add r9,QWORD PTR gs:0x21a00
>> 24: 02 00
>> 26: 48 bf 00 d0 5b 00 00 movabs rdi,0xffffc900005bd000
>> 2d: c9 ff ff
>> 30: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> 34: 48 83 c6 01 add rsi,0x1
>> 38: 48 89 77 00 mov QWORD PTR [rdi+0x0],rsi
>> 3c: 41 51 push r9
>> 3e: e8 46 23 51 e1 call 0xffffffffe1512389
>> 43: 41 59 pop r9
>> 45: 41 51 push r9
>> 47: e8 3d 23 51 e1 call 0xffffffffe1512389
>> 4c: 41 59 pop r9
>> 4e: 41 51 push r9
>> 50: e8 34 23 51 e1 call 0xffffffffe1512389
>> 55: 41 59 pop r9
>> 57: 31 c0 xor eax,eax
>> 59: c9 leave
>> 5a: c3 ret
>>
> Also pls share 'perf annotate' of JIT-ed asm.
> I wonder where the hotspots are in the code.
Okay, will do.
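For anyone reading the quoted listings by hand, the following is a small sketch (not part of the patch or the selftests) that decodes the raw bytes shown above. It reassembles the movabs immediates that objdump splits across two lines and resolves the three `call rel32` targets in the main prog. The helper names `call_target` and `movabs_imm64` are made up for illustration, and the offsets are taken as printed in the listing, i.e. relative to the start of the main prog.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF

def call_target(offset: int, raw: bytes) -> int:
    """Resolve an x86-64 `call rel32` (opcode e8) located at `offset`."""
    assert len(raw) == 5 and raw[0] == 0xE8
    (rel,) = struct.unpack("<i", raw[1:])  # signed little-endian displacement
    # The displacement is relative to the address of the next instruction.
    return (offset + len(raw) + rel) & MASK64

def movabs_imm64(raw: bytes) -> int:
    """Extract the imm64 from a 10-byte `movabs reg, imm64` (REX + opcode + 8 bytes)."""
    assert len(raw) == 10
    (imm,) = struct.unpack("<Q", raw[2:])  # unsigned little-endian immediate
    return imm

# Bytes copied from the main prog listing (offsets 0x3e, 0x47, 0x50).
CALLS = {
    0x3E: "e8 46 23 51 e1",
    0x47: "e8 3d 23 51 e1",
    0x50: "e8 34 23 51 e1",
}
targets = {hex(call_target(off, bytes.fromhex(raw))) for off, raw in CALLS.items()}

# movabs r9 bytes, joined across the two objdump lines (offsets 0x13 and 0x1a).
subprog_r9 = movabs_imm64(bytes.fromhex("49 b9 70 a6 c1 08 7e 60 00 00"))
main_r9 = movabs_imm64(bytes.fromhex("49 b9 88 a6 c1 08 7e 60 00 00"))

print(targets)
print(hex(subprog_r9), hex(main_r9), hex(main_r9 - subprog_r9))
```

Run against the bytes above, all three calls resolve to the same target that the listing prints (0xffffffffe1512389), and the two r9 immediates come out as 0x607e08c1a670 for the subprog and 0x607e08c1a688 for the main prog, which differ by 0x18.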