public inbox for bpf@vger.kernel.org
* [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls
@ 2024-06-29  9:47 Eduard Zingerman
  2024-06-29  9:47 ` [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function Eduard Zingerman
                   ` (8 more replies)
  0 siblings, 9 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

This RFC seeks to allow using the no_caller_saved_registers gcc/clang
attribute with some BPF helper functions (and kfuncs in the future).

As documented in [1], this attribute means that a function scratches
only some of the caller-saved registers defined by the ABI.
For BPF, the set of such registers could be defined as follows:
- R0 is scratched only if the function is non-void;
- R1-R5 are scratched only if the corresponding parameter type is
  defined in the function prototype.
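
The rule above can be sketched as a small mask computation. This is an
illustration only: struct helper_proto and nocsr_clobber_mask() are
hypothetical names invented for this sketch and are not the kernel's
struct bpf_func_proto or any function from the patch-set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_HELPER_ARGS 5

/* Hypothetical, simplified description of a helper prototype. */
struct helper_proto {
	bool returns_value;	/* false for void helpers */
	unsigned int nargs;	/* number of declared parameters, 0..5 */
};

/* Bit N of the result is set when register rN is scratched by the call. */
static uint32_t nocsr_clobber_mask(const struct helper_proto *p)
{
	uint32_t mask = 0;
	unsigned int i;

	assert(p->nargs <= MAX_HELPER_ARGS);
	if (p->returns_value)
		mask |= 1u << 0;	/* R0 carries the return value */
	for (i = 1; i <= p->nargs; i++)
		mask |= 1u << i;	/* R1..Rn carry the arguments */
	return mask;
}
```

Under this model a helper such as bpf_get_smp_processor_id(), which
returns a value and takes no parameters, scratches only R0, so R1-R5
can stay live across the call.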

The goal of the RFC is to implement no_caller_saved_registers
(nocsr for short) in a backwards-compatible manner:
- for kernels that support the feature, gain some performance boost
  from better register allocation;
- for kernels that don't support the feature, allow programs to
  execute with minor performance losses.

To achieve this, use a scheme suggested by Alexei Starovoitov:
- for nocsr calls, clang allocates registers as if the relevant r0-r5
  registers are not scratched by the call;
- as a post-processing step, clang visits each nocsr call and adds
  a spill/fill pair for every live r0-r5 register;
- stack offsets used for the spills/fills are allocated as the lowest
  stack offsets in the whole function and are not used for any other
  purpose;
- when the kernel loads a program, it looks for such patterns
  (a nocsr function call surrounded by spills/fills) and checks
  whether the spill/fill stack offsets are used exclusively in nocsr
  patterns;
- if so, and if the current JIT inlines the call to the nocsr function
  (e.g. a helper call), the kernel removes the unnecessary spill/fill
  pairs;
- when an old kernel loads the program, the presence of the spill/fill
  pairs keeps the BPF program valid, albeit slightly less efficient.
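
The kernel-side part of the scheme can be sketched on a toy instruction
model. This is a minimal illustration, not the verifier code from the
patch-set: struct insn, nocsr_pairs() and strip_nocsr_pairs() are
hypothetical names, and the sketch assumes the check that the stack
offsets are used only in nocsr patterns has already passed. It also
ignores the jump-offset adjustments that real instruction removal needs.

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_OTHER, OP_SPILL, OP_FILL, OP_CALL_NOCSR };

/* Hypothetical, simplified instruction; not struct bpf_insn. */
struct insn {
	enum op op;
	int reg;	/* register spilled or filled, unused otherwise */
	int off;	/* stack offset relative to r10, unused otherwise */
};

/*
 * Count spill/fill pairs wrapped symmetrically around the nocsr call at
 * index 'call': the k-th instruction before the call must spill the same
 * register to the same offset that the k-th instruction after it fills.
 */
static size_t nocsr_pairs(const struct insn *insns, size_t n, size_t call)
{
	size_t k = 0;

	assert(call < n && insns[call].op == OP_CALL_NOCSR);
	while (k + 1 <= call && call + k + 1 < n) {
		const struct insn *st = &insns[call - k - 1];
		const struct insn *ld = &insns[call + k + 1];

		if (st->op != OP_SPILL || ld->op != OP_FILL ||
		    st->reg != ld->reg || st->off != ld->off)
			break;
		k++;
	}
	return k;
}

/* Drop the matched pairs in place and return the new program length. */
static size_t strip_nocsr_pairs(struct insn *insns, size_t n, size_t call)
{
	size_t k = nocsr_pairs(insns, n, call);
	size_t src, dst = call - k;

	/* Keep the call; skip k spills before it and k fills after it. */
	insns[dst++] = insns[call];
	for (src = call + k + 1; src < n; src++)
		insns[dst++] = insns[src];
	return dst;
}
```

A kernel without such a pass simply leaves the pairs in place, which is
what keeps the same object file loadable on older kernels.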

Corresponding clang/llvm changes are available in [2].

The patch-set uses the bpf_get_smp_processor_id() function as a canary,
making it the first helper with the nocsr attribute.

For example, consider the following program:

  #define __no_csr __attribute__((no_caller_saved_registers))
  #define SEC(name) __attribute__((section(name), used))
  #define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)

  typedef unsigned int __u32;

  static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
  static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;

  SEC("raw_tp")
  int test(void *ctx)
  {
          __u32 task = bpf_get_smp_processor_id();
          bpf_printk("ctx=%p, smp=%d", ctx, task);
          return 0;
  }

  char _license[] SEC("license") = "GPL";

Compiled (using [2]) as follows:

  $ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
  $ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
    ...
  3rd parameter for printk call     removable spill/fill pair
  .--- 0:       r3 = r1                             |
; |       __u32 task = bpf_get_smp_processor_id();  |
  |    1:       *(u64 *)(r10 - 0x8) = r3 <----------|
  |    2:       call 0x8                            |
  |    3:       r3 = *(u64 *)(r10 - 0x8) <----------'
; |     bpf_printk("ctx=%p, smp=%d", ctx, task);
  |    4:       r1 = 0x0 ll
  |    6:       r2 = 0xf
  |    7:       r4 = r0
  '--> 8:       call 0x6
;       return 0;
       9:       r0 = 0x0
      10:       exit

Here is how the program looks after verifier processing:

  # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
  # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
  int test(void * ctx):
  ; int test(void *ctx)
     0: (bf) r3 = r1               <--------- 3rd printk parameter
  ; __u32 task = bpf_get_smp_processor_id();
     1: (b4) w0 = 197132           <--------- inlined helper call,
     2: (bf) r0 = r0               <--------- spill/fill pair removed
     3: (61) r0 = *(u32 *)(r0 +0)  <---------
  ; bpf_printk("ctx=%p, smp=%d", ctx, task);
     4: (18) r1 = map[id:13][0]+0
     6: (b7) r2 = 15
     7: (bf) r4 = r0
     8: (85) call bpf_trace_printk#-125920
  ; return 0;
     9: (b7) r0 = 0
    10: (95) exit

[1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
[2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers

Eduard Zingerman (8):
  bpf: add a get_helper_proto() utility function
  bpf: no_caller_saved_registers attribute for helper calls
  bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  selftests/bpf: extract utility function for BPF disassembly
  selftests/bpf: no need to track next_match_pos in struct test_loader
  selftests/bpf: extract test_loader->expect_msgs as a data structure
  selftests/bpf: allow checking xlated programs in verifier_* tests
  selftests/bpf: test no_caller_saved_registers spill/fill removal

 include/linux/bpf.h                           |   6 +
 include/linux/bpf_verifier.h                  |   9 +
 kernel/bpf/helpers.c                          |   1 +
 kernel/bpf/verifier.c                         | 346 +++++++++++++-
 tools/testing/selftests/bpf/Makefile          |   1 +
 tools/testing/selftests/bpf/disasm_helpers.c  |  50 ++
 tools/testing/selftests/bpf/disasm_helpers.h  |  12 +
 .../selftests/bpf/prog_tests/ctx_rewrite.c    |  71 +--
 .../selftests/bpf/prog_tests/verifier.c       |   7 +
 tools/testing/selftests/bpf/progs/bpf_misc.h  |   6 +
 .../selftests/bpf/progs/verifier_nocsr.c      | 437 ++++++++++++++++++
 tools/testing/selftests/bpf/test_loader.c     | 170 +++++--
 tools/testing/selftests/bpf/test_progs.h      |   1 -
 tools/testing/selftests/bpf/testing_helpers.c |   1 +
 14 files changed, 986 insertions(+), 132 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c

-- 
2.45.2



end of thread, other threads:[~2024-07-04 17:39 UTC | newest]

Thread overview: 47+ messages
2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:07     ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
2024-07-01 19:01   ` Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:38     ` Eduard Zingerman
2024-07-02 21:09       ` Andrii Nakryiko
2024-07-02 21:19         ` Eduard Zingerman
2024-07-02 21:22           ` Andrii Nakryiko
2024-07-03 11:57   ` Puranjay Mohan
2024-07-03 16:13     ` Eduard Zingerman
2024-07-04 10:55       ` Puranjay Mohan
2024-06-29  9:47 ` [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:43     ` Eduard Zingerman
2024-07-02 21:11       ` Andrii Nakryiko
2024-07-02 21:25         ` Eduard Zingerman
2024-07-03 11:27         ` Puranjay Mohan
2024-07-03 23:14           ` Eduard Zingerman
2024-07-04 11:19             ` Puranjay Mohan
2024-07-04 16:39               ` Eduard Zingerman
2024-07-04 17:00           ` Eduard Zingerman
2024-07-04 17:24             ` Puranjay Mohan
2024-07-04 17:39               ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:59     ` Eduard Zingerman
2024-07-02 21:16       ` Andrii Nakryiko
2024-07-02 21:23         ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 21:05     ` Eduard Zingerman
2024-07-02 21:18       ` Andrii Nakryiko
2024-06-29  9:47 ` [RFC bpf-next v1 6/8] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
2024-07-02  0:42   ` Andrii Nakryiko
2024-07-02 21:06     ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
2024-07-02  0:42   ` Andrii Nakryiko
2024-07-02 21:07     ` Eduard Zingerman
2024-07-02 21:19       ` Andrii Nakryiko
2024-06-29  9:47 ` [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
2024-07-02  0:42   ` Andrii Nakryiko
2024-07-02 21:12     ` Eduard Zingerman
2024-07-02 21:20       ` Andrii Nakryiko
2024-07-02  0:41 ` [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Andrii Nakryiko
