From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kernel Team <kernel-team@fb.com>,
Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next v2 2/2] [no_merge] selftests/bpf: Benchmark runtime performance with private stack
Date: Mon, 22 Jul 2024 09:33:40 -0700 [thread overview]
Message-ID: <c8c63a07-7eab-41e8-bb9f-05a42f86220f@linux.dev> (raw)
In-Reply-To: <CAADnVQ+C--rr_C=dCqwGhZux4JQSHJvAazgem1L8OGx7CC6+nw@mail.gmail.com>
On 7/19/24 6:08 PM, Alexei Starovoitov wrote:
> On Thu, Jul 18, 2024 at 1:52 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>>
>> The following are the jited progs with private stack:
>>
>> subprog:
>> 0: f3 0f 1e fa endbr64
>> 4: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
>> 9: 66 90 xchg ax,ax
>> b: 55 push rbp
>> c: 48 89 e5 mov rbp,rsp
>> f: f3 0f 1e fa endbr64
>> 13: 49 b9 70 a6 c1 08 7e movabs r9,0x607e08c1a670
>> 1a: 60 00 00
>> 1d: 65 4c 03 0c 25 00 1a add r9,QWORD PTR gs:0x21a00
>> 24: 02 00
>> 26: 31 c0 xor eax,eax
>> 28: c9 leave
>> 29: c3 ret
> Thanks for doing the benchmarking.
> It's clear now that worst case overhead is ~5%.
> Could you do one more benchmark such that the 'main prog'
> below stays as-is with setup of r9 and push/pop r9,
> but in the subprog above there is no 'movabs r9 + add r9' ?
> To simulate the case when a big function with a large stack
> triggers private-stack use, but it calls a subprog without
> a private stack.
> I think we should see a different overhead.
> Obviously subprog won't have these two extra insns that setup r9
> which would lead to something like ~4% slowdown vs 5%,
> but I feel the overhead of pure push/pop r9 around calls
> will be lower as well, because r9 is not written into inside subprog.
> The CPU HW should be able to execute such push/pop faster.
> I'm curious what it is.
Sure. Let me do an experiment with this.
>
>> main prog:
>> 0: f3 0f 1e fa endbr64
>> 4: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
>> 9: 66 90 xchg ax,ax
>> b: 55 push rbp
>> c: 48 89 e5 mov rbp,rsp
>> f: f3 0f 1e fa endbr64
>> 13: 49 b9 88 a6 c1 08 7e movabs r9,0x607e08c1a688
>> 1a: 60 00 00
>> 1d: 65 4c 03 0c 25 00 1a add r9,QWORD PTR gs:0x21a00
>> 24: 02 00
>> 26: 48 bf 00 d0 5b 00 00 movabs rdi,0xffffc900005bd000
>> 2d: c9 ff ff
>> 30: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> 34: 48 83 c6 01 add rsi,0x1
>> 38: 48 89 77 00 mov QWORD PTR [rdi+0x0],rsi
>> 3c: 41 51 push r9
>> 3e: e8 46 23 51 e1 call 0xffffffffe1512389
>> 43: 41 59 pop r9
>> 45: 41 51 push r9
>> 47: e8 3d 23 51 e1 call 0xffffffffe1512389
>> 4c: 41 59 pop r9
>> 4e: 41 51 push r9
>> 50: e8 34 23 51 e1 call 0xffffffffe1512389
>> 55: 41 59 pop r9
>> 57: 31 c0 xor eax,eax
>> 59: c9 leave
>> 5a: c3 ret
>>
> Also pls share 'perf annotate' of JIT-ed asm.
> I wonder where the hotspots are in the code.
Okay, will do.
next prev parent reply other threads:[~2024-07-22 16:33 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-18 20:51 [PATCH bpf-next v2 1/2] bpf: Support private stack for bpf progs Yonghong Song
2024-07-18 20:52 ` [PATCH bpf-next v2 2/2] [no_merge] selftests/bpf: Benchmark runtime performance with private stack Yonghong Song
2024-07-18 21:44 ` Yonghong Song
2024-07-18 21:59 ` Kumar Kartikeya Dwivedi
2024-07-19 3:01 ` Yonghong Song
2024-07-19 0:36 ` Alexei Starovoitov
2024-07-19 2:21 ` Yonghong Song
2024-07-20 0:14 ` bot+bpf-ci
2024-07-20 1:08 ` Alexei Starovoitov
2024-07-22 16:33 ` Yonghong Song [this message]
2024-07-20 3:28 ` [PATCH bpf-next v2 1/2] bpf: Support private stack for bpf progs Andrii Nakryiko
2024-07-22 16:43 ` Yonghong Song
2024-07-24 5:08 ` Yonghong Song
2024-07-24 16:54 ` Alexei Starovoitov
2024-07-24 17:56 ` Yonghong Song
2024-07-22 20:57 ` Andrii Nakryiko
2024-07-23 1:05 ` Alexei Starovoitov
2024-07-23 3:26 ` Andrii Nakryiko
2024-07-24 3:17 ` Alexei Starovoitov
2024-07-24 4:06 ` Andrii Nakryiko
2024-07-24 4:46 ` Yonghong Song
2024-07-24 4:32 ` Yonghong Song
2024-07-23 5:30 ` Yonghong Song
2024-07-23 7:02 ` Yonghong Song
2024-07-22 3:33 ` Eduard Zingerman
2024-07-22 16:54 ` Yonghong Song
2024-07-22 17:53 ` Eduard Zingerman
2024-07-22 17:51 ` Alexei Starovoitov
2024-07-22 18:22 ` Eduard Zingerman
2024-07-22 20:08 ` Alexei Starovoitov
2024-07-24 21:28 ` Yonghong Song
2024-07-25 4:55 ` Alexei Starovoitov
2024-07-25 17:20 ` Eduard Zingerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c8c63a07-7eab-41e8-bb9f-05a42f86220f@linux.dev \
--to=yonghong.song@linux.dev \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=kernel-team@fb.com \
--cc=martin.lau@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.