From: Puranjay Mohan <puranjay@kernel.org>
To: Eduard Zingerman <eddyz87@gmail.com>,
bpf@vger.kernel.org, ast@kernel.org
Cc: andrii@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
kernel-team@fb.com, yonghong.song@linux.dev,
jose.marchesi@oracle.com, Eduard Zingerman <eddyz87@gmail.com>
Subject: Re: [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls
Date: Mon, 08 Jul 2024 11:44:30 +0000
Message-ID: <mb61psewk3y75.fsf@kernel.org>
In-Reply-To: <20240704102402.1644916-1-eddyz87@gmail.com>
Eduard Zingerman <eddyz87@gmail.com> writes:
> This RFC seeks to allow using no_caller_saved_registers gcc/clang
> attribute with some BPF helper functions (and kfuncs in the future).
>
> As documented in [1], this attribute means that a function scratches
> only some of the caller-saved registers defined by the ABI.
> For BPF the set of such registers could be defined as follows:
> - R0 is scratched only if function is non-void;
> - R1-R5 are scratched only if corresponding parameter type is defined
> in the function prototype.
>
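A rough sketch of the rule quoted above, for illustration only. The struct
and function names below are hypothetical and are not taken from the patch
set or from the kernel; the sketch only encodes "R0 is scratched if the
function is non-void, R1-R5 are scratched if the matching parameter exists".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a helper's prototype (not a kernel type). */
struct helper_sig {
	bool returns_void;      /* true if the helper has no return value */
	unsigned int nr_params; /* number of declared parameters, 0..5 */
};

/* Returns a bitmask where bit i set means register Ri is scratched. */
static uint32_t nocsr_clobber_mask(const struct helper_sig *sig)
{
	uint32_t mask = 0;
	unsigned int i;

	/* BPF calls pass at most five arguments, in R1..R5. */
	assert(sig->nr_params <= 5);

	if (!sig->returns_void)
		mask |= 1u << 0;        /* R0 carries the return value */
	for (i = 1; i <= sig->nr_params; i++)
		mask |= 1u << i;        /* R1..Rn carry the arguments */
	return mask;
}
```

Under this model a helper shaped like bpf_get_smp_processor_id() (non-void,
no parameters) yields a mask with only bit 0 set, so R1-R5 stay live across
the call.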
> The goal of the RFC is to implement no_caller_saved_registers
> (nocsr for short) in a backwards compatible manner:
> - for kernels that support the feature, gain some performance boost
> from better register allocation;
> - for kernels that don't support the feature, allow program execution
> with minor performance losses.
>
> To achieve this, use a scheme suggested by Alexei Starovoitov:
> - for nocsr calls, clang allocates registers as if the relevant r0-r5
> registers are not scratched by the call;
> - as a post-processing step, clang visits each nocsr call and adds
> a spill/fill pair for every live r0-r5;
> - stack offsets used for spills/fills are allocated as the minimal
> stack offsets in the whole function and are not used for any other
> purposes;
> - when the kernel loads a program, it looks for such patterns
> (a nocsr function call surrounded by spills/fills) and checks whether
> the spill/fill stack offsets are used exclusively in nocsr patterns;
> - if so, and if the current JIT inlines the call to the nocsr function
> (e.g. a helper call), the kernel removes the unnecessary spill/fill pairs;
> - when an old kernel loads a program, the presence of spill/fill pairs
> keeps the BPF program valid, albeit slightly less efficient.
>
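The pattern check described in the list above could be sketched roughly as
follows. This is a simplified model, not the verifier code from the series:
the instruction encoding, the enum, and both function names are hypothetical,
the sketch assumes that spills and fills mirror each other around the call,
and it handles a single call only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction model (not struct bpf_insn). */
enum op { OP_OTHER, OP_SPILL, OP_FILL, OP_CALL };

struct insn {
	enum op op;
	int reg; /* register spilled or filled */
	int off; /* stack offset relative to r10 */
};

/* Count the spill/fill pairs wrapped symmetrically around the call. */
static int nocsr_pairs_around(const struct insn *prog, int len, int call_idx)
{
	int k = 0;

	assert(prog != NULL && len >= 0);
	if (call_idx < 0 || call_idx >= len || prog[call_idx].op != OP_CALL)
		return 0;
	while (call_idx - (k + 1) >= 0 && call_idx + (k + 1) < len) {
		const struct insn *st = &prog[call_idx - (k + 1)];
		const struct insn *ld = &prog[call_idx + (k + 1)];

		/* A pair must spill and fill the same register at the same slot. */
		if (st->op != OP_SPILL || ld->op != OP_FILL ||
		    st->reg != ld->reg || st->off != ld->off)
			break;
		k++;
	}
	return k;
}

/* True if no spill/fill outside the pattern touches the pattern's slots. */
static bool nocsr_slots_exclusive(const struct insn *prog, int len,
				  int call_idx, int pairs)
{
	int lo = call_idx - pairs, hi = call_idx + pairs, i, j;

	for (i = 0; i < len; i++) {
		if (i >= lo && i <= hi)
			continue; /* inside the pattern itself */
		if (prog[i].op != OP_SPILL && prog[i].op != OP_FILL)
			continue;
		for (j = lo; j < call_idx; j++)
			if (prog[i].off == prog[j].off)
				return false;
	}
	return true;
}
```

In this model the spill/fill pairs would be candidates for removal only when
nocsr_pairs_around() finds at least one pair and nocsr_slots_exclusive()
holds; otherwise the program is left untouched and stays valid as written.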
> Corresponding clang/llvm changes are available in [2].
>
> The patch-set uses bpf_get_smp_processor_id() function as a canary,
> making it the first helper with nocsr attribute.
>
> For example, consider the following program:
>
> #define __no_csr __attribute__((no_caller_saved_registers))
> #define SEC(name) __attribute__((section(name), used))
> #define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)
>
> typedef unsigned int __u32;
>
> static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
> static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;
>
> SEC("raw_tp")
> int test(void *ctx)
> {
>     __u32 task = bpf_get_smp_processor_id();
>     bpf_printk("ctx=%p, smp=%d", ctx, task);
>     return 0;
> }
>
> char _license[] SEC("license") = "GPL";
>
> Compiled (using [2]) as follows:
>
> $ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
> $ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
> ...
> 3rd parameter for printk call             removable spill/fill pair
>   .--- 0:  r3 = r1                                    |
> ; |        __u32 task = bpf_get_smp_processor_id();   |
>   |    1:  *(u64 *)(r10 - 0x8) = r3        <----------|
>   |    2:  call 0x8                                   |
>   |    3:  r3 = *(u64 *)(r10 - 0x8)        <----------'
> ; |        bpf_printk("ctx=%p, smp=%d", ctx, task);
>   |    4:  r1 = 0x0 ll
>   |    6:  r2 = 0xf
>   |    7:  r4 = r0
>   '--> 8:  call 0x6
> ;          return 0;
>        9:  r0 = 0x0
>       10:  exit
>
> Here is how the program looks after verifier processing:
>
> # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
> # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
> int test(void * ctx):
> ; int test(void *ctx)
> 0: (bf) r3 = r1 <--------- 3rd printk parameter
> ; __u32 task = bpf_get_smp_processor_id();
> 1: (b4) w0 = 197132 <--------- inlined helper call,
> 2: (bf) r0 = r0 <--------- spill/fill pair removed
> 3: (61) r0 = *(u32 *)(r0 +0) <---------
> ; bpf_printk("ctx=%p, smp=%d", ctx, task);
> 4: (18) r1 = map[id:13][0]+0
> 6: (b7) r2 = 15
> 7: (bf) r4 = r0
> 8: (85) call bpf_trace_printk#-125920
> ; return 0;
> 9: (b7) r0 = 0
> 10: (95) exit
>
> [1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
> [2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers
>
> Change list:
> - v1 -> v2:
> - assume that functions inlined by either jit or verifier
> conform to no_caller_saved_registers contract (Andrii, Puranjay);
> - allow nocsr rewrite for bpf_get_smp_processor_id()
> on arm64 and riscv64 architectures (Puranjay);
> - __arch_{x86_64,arm64,riscv64} macro for test_loader;
> - moved remove_nocsr_spills_fills() inside do_misc_fixups() (Andrii);
> - moved nocsr pattern detection from check_cfg() to a separate pass
> (Andrii);
> - various stylistic/correctness changes according to Andrii's
> comments.
>
> Revisions:
> - v1 https://lore.kernel.org/bpf/20240629094733.3863850-1-eddyz87@gmail.com/
>
> Eduard Zingerman (9):
> bpf: add a get_helper_proto() utility function
> bpf: no_caller_saved_registers attribute for helper calls
> bpf, x86, riscv, arm: no_caller_saved_registers for
> bpf_get_smp_processor_id()
I ran the selftest on riscv64 under QEMU:
root@rv-tester:~/bpf# uname -a
Linux rv-tester 6.10.0-rc2 #27 SMP Mon Jul 8 09:58:20 UTC 2024 riscv64 riscv64 riscv64 GNU/Linux
root@rv-tester:~/bpf# ./test_progs -a verifier_nocsr/canary_arm64_riscv64
#496/2 verifier_nocsr/canary_arm64_riscv64:OK
#496 verifier_nocsr:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Tested-by: Puranjay Mohan <puranjay@kernel.org> #riscv64
Thanks,
Puranjay