From: Eduard Zingerman <eddyz87@gmail.com>
To: bpf@vger.kernel.org, ast@kernel.org
Cc: andrii@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
kernel-team@fb.com, yonghong.song@linux.dev, puranjay@kernel.org,
jose.marchesi@oracle.com, Eduard Zingerman <eddyz87@gmail.com>
Subject: [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls
Date: Thu, 4 Jul 2024 03:23:52 -0700 [thread overview]
Message-ID: <20240704102402.1644916-1-eddyz87@gmail.com> (raw)
This RFC seeks to allow using no_caller_saved_registers gcc/clang
attribute with some BPF helper functions (and kfuncs in the future).
As documented in [1], this attribute means that function scratches
only some of the caller saved registers defined by ABI.
For BPF the set of such registers could be defined as follows:
- R0 is scratched only if function is non-void;
- R1-R5 are scratched only if corresponding parameter type is defined
in the function prototype.
The goal of the RFC is to implement no_caller_saved_registers
(nocsr for short) in a backwards compatible manner:
- for kernels that support the feature, gain some performance boost
from better register allocation;
- for kernels that don't support the feature, allow programs execution
with minor performance losses.
To achieve this, use a scheme suggested by Alexei Starovoitov:
- for nocsr calls clang allocates registers as-if relevant r0-r5
registers are not scratched by the call;
- as a post-processing step, clang visits each nocsr call and adds
spill/fill for every live r0-r5;
- stack offsets used for spills/fills are allocated as minimal
stack offsets in whole function and are not used for any other
purposes;
- when kernel loads a program, it looks for such patterns
(nocsr function surrounded by spills/fills) and checks if
spill/fill stack offsets are used exclusively in nocsr patterns;
- if so, and if current JIT inlines the call to the nocsr function
(e.g. a helper call), kernel removes unnecessary spill/fill pairs;
- when old kernel loads a program, presence of spill/fill pairs
keeps BPF program valid, albeit slightly less efficient.
Corresponding clang/llvm changes are available in [2].
The patch-set uses bpf_get_smp_processor_id() function as a canary,
making it the first helper with nocsr attribute.
For example, consider the following program:
#define __no_csr __attribute__((no_caller_saved_registers))
#define SEC(name) __attribute__((section(name), used))
#define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)
typedef unsigned int __u32;
static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;
SEC("raw_tp")
int test(void *ctx)
{
__u32 task = bpf_get_smp_processor_id();
bpf_printk("ctx=%p, smp=%d", ctx, task);
return 0;
}
char _license[] SEC("license") = "GPL";
Compiled (using [2]) as follows:
$ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
$ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
...
3rd parameter for printk call removable spill/fill pair
.--- 0: r3 = r1 |
; | __u32 task = bpf_get_smp_processor_id(); |
| 1: *(u64 *)(r10 - 0x8) = r3 <----------|
| 2: call 0x8 |
| 3: r3 = *(u64 *)(r10 - 0x8) <----------'
; | bpf_printk("ctx=%p, smp=%d", ctx, task);
| 4: r1 = 0x0 ll
| 6: r2 = 0xf
| 7: r4 = r0
'--> 8: call 0x6
; return 0;
9: r0 = 0x0
10: exit
Here is how the program looks after verifier processing:
# bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
# bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
int test(void * ctx):
; int test(void *ctx)
0: (bf) r3 = r1 <--------- 3rd printk parameter
; __u32 task = bpf_get_smp_processor_id();
1: (b4) w0 = 197132 <--------- inlined helper call,
2: (bf) r0 = r0 <--------- spill/fill pair removed
3: (61) r0 = *(u32 *)(r0 +0) <---------
; bpf_printk("ctx=%p, smp=%d", ctx, task);
4: (18) r1 = map[id:13][0]+0
6: (b7) r2 = 15
7: (bf) r4 = r0
8: (85) call bpf_trace_printk#-125920
; return 0;
9: (b7) r0 = 0
10: (95) exit
[1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
[2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers
Change list:
- v1 -> v2:
- assume that functions inlined by either jit or verifier
conform to no_caller_saved_registers contract (Andrii, Puranjay);
- allow nocsr rewrite for bpf_get_smp_processor_id()
on arm64 and riscv64 architectures (Puranjay);
- __arch_{x86_64,arm64,riscv64} macro for test_loader;
- moved remove_nocsr_spills_fills() inside do_misc_fixups() (Andrii);
- moved nocsr pattern detection from check_cfg() to a separate pass
(Andrii);
- various stylistic/correctness changes according to Andrii's
comments.
Revisions:
- v1 https://lore.kernel.org/bpf/20240629094733.3863850-1-eddyz87@gmail.com/
Eduard Zingerman (9):
bpf: add a get_helper_proto() utility function
bpf: no_caller_saved_registers attribute for helper calls
bpf, x86, riscv, arm: no_caller_saved_registers for
bpf_get_smp_processor_id()
selftests/bpf: extract utility function for BPF disassembly
selftests/bpf: no need to track next_match_pos in struct test_loader
selftests/bpf: extract test_loader->expect_msgs as a data structure
selftests/bpf: allow checking xlated programs in verifier_* tests
selftests/bpf: __arch_* macro to limit test cases to specific archs
selftests/bpf: test no_caller_saved_registers spill/fill removal
include/linux/bpf.h | 6 +
include/linux/bpf_verifier.h | 14 +
kernel/bpf/helpers.c | 1 +
kernel/bpf/verifier.c | 339 +++++++++++-
tools/testing/selftests/bpf/Makefile | 1 +
tools/testing/selftests/bpf/disasm_helpers.c | 51 ++
tools/testing/selftests/bpf/disasm_helpers.h | 12 +
.../selftests/bpf/prog_tests/ctx_rewrite.c | 74 +--
.../selftests/bpf/prog_tests/verifier.c | 2 +
tools/testing/selftests/bpf/progs/bpf_misc.h | 13 +
.../selftests/bpf/progs/verifier_nocsr.c | 521 ++++++++++++++++++
tools/testing/selftests/bpf/test_loader.c | 217 ++++++--
tools/testing/selftests/bpf/test_progs.h | 1 -
tools/testing/selftests/bpf/testing_helpers.c | 1 +
14 files changed, 1124 insertions(+), 129 deletions(-)
create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h
create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c
--
2.45.2
next reply other threads:[~2024-07-04 10:24 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-04 10:23 Eduard Zingerman [this message]
2024-07-04 10:23 ` [RFC bpf-next v2 1/9] bpf: add a get_helper_proto() utility function Eduard Zingerman
2024-07-09 23:42 ` Andrii Nakryiko
2024-07-10 0:26 ` Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
2024-07-09 23:42 ` Andrii Nakryiko
2024-07-10 3:00 ` Eduard Zingerman
2024-07-10 6:01 ` Andrii Nakryiko
2024-07-10 7:57 ` Eduard Zingerman
2024-07-10 15:36 ` Andrii Nakryiko
2024-07-10 16:15 ` Eduard Zingerman
2024-07-10 17:50 ` Andrii Nakryiko
2024-07-10 18:40 ` Eduard Zingerman
2024-07-10 18:49 ` Andrii Nakryiko
2024-07-10 19:03 ` Eduard Zingerman
2024-07-10 19:16 ` Andrii Nakryiko
2024-07-10 19:07 ` Alexei Starovoitov
2024-07-10 19:17 ` Andrii Nakryiko
2024-07-10 19:01 ` Alexei Starovoitov
2024-07-10 9:46 ` Eduard Zingerman
2024-07-10 15:23 ` Andrii Nakryiko
2024-07-10 1:09 ` Alexei Starovoitov
2024-07-10 3:06 ` Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 3/9] bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 4/9] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
2024-07-09 23:46 ` Andrii Nakryiko
2024-07-04 10:23 ` [RFC bpf-next v2 5/9] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 6/9] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 7/9] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
2024-07-04 10:24 ` [RFC bpf-next v2 8/9] selftests/bpf: __arch_* macro to limit test cases to specific archs Eduard Zingerman
2024-07-09 23:50 ` Andrii Nakryiko
2024-07-04 10:24 ` [RFC bpf-next v2 9/9] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
2024-07-08 11:44 ` [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Puranjay Mohan
2024-07-08 17:29 ` Eduard Zingerman
2024-07-10 1:18 ` Alexei Starovoitov
2024-07-10 3:35 ` Eduard Zingerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240704102402.1644916-1-eddyz87@gmail.com \
--to=eddyz87@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=jose.marchesi@oracle.com \
--cc=kernel-team@fb.com \
--cc=martin.lau@linux.dev \
--cc=puranjay@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.