From: Puranjay Mohan <puranjay@kernel.org>
To: Eduard Zingerman <eddyz87@gmail.com>,
bpf@vger.kernel.org, ast@kernel.org
Cc: andrii@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
kernel-team@fb.com, yonghong.song@linux.dev,
jose.marchesi@oracle.com, Eduard Zingerman <eddyz87@gmail.com>
Subject: Re: [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls
Date: Mon, 08 Jul 2024 11:44:30 +0000
Message-ID: <mb61psewk3y75.fsf@kernel.org>
In-Reply-To: <20240704102402.1644916-1-eddyz87@gmail.com>
Eduard Zingerman <eddyz87@gmail.com> writes:
> This RFC seeks to allow using no_caller_saved_registers gcc/clang
> attribute with some BPF helper functions (and kfuncs in the future).
>
> As documented in [1], this attribute means that function scratches
> only some of the caller saved registers defined by ABI.
> For BPF the set of such registers could be defined as follows:
> - R0 is scratched only if function is non-void;
> - R1-R5 are scratched only if corresponding parameter type is defined
> in the function prototype.
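
To make the rule above concrete, here is a minimal sketch of how the scratched set could be derived from a prototype. The function name and the bitmask encoding (bit i stands for Ri) are assumptions made for illustration; they are not taken from the patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch set: returns a bitmask in
 * which bit i is set when register Ri is scratched by a nocsr call.
 */
static uint32_t nocsr_scratched_mask(bool returns_value, unsigned int nr_params)
{
	uint32_t mask = 0;

	assert(nr_params <= 5);		/* BPF passes arguments in R1-R5 only */
	if (returns_value)
		mask |= 1u << 0;	/* R0 carries the return value */
	for (unsigned int i = 1; i <= nr_params; i++)
		mask |= 1u << i;	/* Ri carries the i-th parameter */
	return mask;
}
```

Under this reading, bpf_get_smp_processor_id() (non-void, no parameters) would scratch only R0, so R1-R5 could stay live across the call.
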
>
> The goal of the RFC is to implement no_caller_saved_registers
> (nocsr for short) in a backwards compatible manner:
> - for kernels that support the feature, gain some performance boost
> from better register allocation;
> - for kernels that don't support the feature, allow programs execution
> with minor performance losses.
>
> To achieve this, use a scheme suggested by Alexei Starovoitov:
> - for nocsr calls clang allocates registers as-if relevant r0-r5
> registers are not scratched by the call;
> - as a post-processing step, clang visits each nocsr call and adds
> spill/fill for every live r0-r5;
> - stack offsets used for spills/fills are allocated as minimal
> stack offsets in whole function and are not used for any other
> purposes;
> - when kernel loads a program, it looks for such patterns
> (nocsr function surrounded by spills/fills) and checks if
> spill/fill stack offsets are used exclusively in nocsr patterns;
> - if so, and if current JIT inlines the call to the nocsr function
> (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
> - when old kernel loads a program, presence of spill/fill pairs
> keeps BPF program valid, albeit slightly less efficient.
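
The pattern check described above could be sketched roughly as follows. This is a simplified model, not the verifier code from patch 2: struct insn, the insn_kind enum and nocsr_pairs_around() are hypothetical names, and the sketch assumes that the k-th instruction before the call is a spill whose register and stack offset match the fill k instructions after the call. It does not check that the stack offsets are used exclusively in nocsr patterns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction model. */
enum insn_kind { INSN_OTHER, INSN_SPILL, INSN_FILL, INSN_CALL };

struct insn {
	enum insn_kind kind;
	int reg;	/* register being spilled or filled */
	int off;	/* stack offset relative to r10 */
};

/*
 * Count the spill/fill pairs that symmetrically surround the call at
 * call_idx: prog[call_idx - k] must spill the same register to the
 * same offset that prog[call_idx + k] fills from.
 */
static size_t nocsr_pairs_around(const struct insn *prog, size_t len,
				 size_t call_idx)
{
	size_t pairs = 0;

	assert(prog != NULL);
	if (call_idx >= len || prog[call_idx].kind != INSN_CALL)
		return 0;

	for (size_t k = 1; k <= call_idx && call_idx + k < len; k++) {
		const struct insn *spill = &prog[call_idx - k];
		const struct insn *fill = &prog[call_idx + k];

		if (spill->kind != INSN_SPILL || fill->kind != INSN_FILL ||
		    spill->reg != fill->reg || spill->off != fill->off)
			break;	/* pattern ends at the first mismatch */
		pairs++;
	}
	return pairs;
}
```

In this model, a kernel that supports the feature would be free to drop the counted pairs, while an older kernel simply executes them as ordinary stores and loads.
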
>
> Corresponding clang/llvm changes are available in [2].
>
> The patch-set uses bpf_get_smp_processor_id() function as a canary,
> making it the first helper with nocsr attribute.
>
> For example, consider the following program:
>
> #define __no_csr __attribute__((no_caller_saved_registers))
> #define SEC(name) __attribute__((section(name), used))
> #define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)
>
> typedef unsigned int __u32;
>
> static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
> static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;
>
> SEC("raw_tp")
> int test(void *ctx)
> {
> __u32 task = bpf_get_smp_processor_id();
> bpf_printk("ctx=%p, smp=%d", ctx, task);
> return 0;
> }
>
> char _license[] SEC("license") = "GPL";
>
> Compiled (using [2]) as follows:
>
> $ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
> $ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
> ...
> 3rd parameter for printk call removable spill/fill pair
> .--- 0: r3 = r1 |
> ; | __u32 task = bpf_get_smp_processor_id(); |
> | 1: *(u64 *)(r10 - 0x8) = r3 <----------|
> | 2: call 0x8 |
> | 3: r3 = *(u64 *)(r10 - 0x8) <----------'
> ; | bpf_printk("ctx=%p, smp=%d", ctx, task);
> | 4: r1 = 0x0 ll
> | 6: r2 = 0xf
> | 7: r4 = r0
> '--> 8: call 0x6
> ; return 0;
> 9: r0 = 0x0
> 10: exit
>
> Here is how the program looks after verifier processing:
>
> # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
> # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
> int test(void * ctx):
> ; int test(void *ctx)
> 0: (bf) r3 = r1 <--------- 3rd printk parameter
> ; __u32 task = bpf_get_smp_processor_id();
> 1: (b4) w0 = 197132 <--------- inlined helper call,
> 2: (bf) r0 = r0 <--------- spill/fill pair removed
> 3: (61) r0 = *(u32 *)(r0 +0) <---------
> ; bpf_printk("ctx=%p, smp=%d", ctx, task);
> 4: (18) r1 = map[id:13][0]+0
> 6: (b7) r2 = 15
> 7: (bf) r4 = r0
> 8: (85) call bpf_trace_printk#-125920
> ; return 0;
> 9: (b7) r0 = 0
> 10: (95) exit
>
> [1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
> [2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers
>
> Change list:
> - v1 -> v2:
> - assume that functions inlined by either jit or verifier
> conform to no_caller_saved_registers contract (Andrii, Puranjay);
> - allow nocsr rewrite for bpf_get_smp_processor_id()
> on arm64 and riscv64 architectures (Puranjay);
> - __arch_{x86_64,arm64,riscv64} macro for test_loader;
> - moved remove_nocsr_spills_fills() inside do_misc_fixups() (Andrii);
> - moved nocsr pattern detection from check_cfg() to a separate pass
> (Andrii);
> - various stylistic/correctness changes according to Andrii's
> comments.
>
> Revisions:
> - v1 https://lore.kernel.org/bpf/20240629094733.3863850-1-eddyz87@gmail.com/
>
> Eduard Zingerman (9):
> bpf: add a get_helper_proto() utility function
> bpf: no_caller_saved_registers attribute for helper calls
> bpf, x86, riscv, arm: no_caller_saved_registers for
> bpf_get_smp_processor_id()
I ran the selftest on riscv64 under QEMU:
root@rv-tester:~/bpf# uname -a
Linux rv-tester 6.10.0-rc2 #27 SMP Mon Jul 8 09:58:20 UTC 2024 riscv64 riscv64 riscv64 GNU/Linux
root@rv-tester:~/bpf# ./test_progs -a verifier_nocsr/canary_arm64_riscv64
#496/2 verifier_nocsr/canary_arm64_riscv64:OK
#496 verifier_nocsr:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Tested-by: Puranjay Mohan <puranjay@kernel.org> #riscv64
Thanks,
Puranjay