All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls
@ 2024-06-29  9:47 Eduard Zingerman
  2024-06-29  9:47 ` [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function Eduard Zingerman
                   ` (8 more replies)
  0 siblings, 9 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

This RFC seeks to allow using no_caller_saved_registers gcc/clang
attribute with some BPF helper functions (and kfuncs in the future).

As documented in [1], this attribute means that function scratches
only some of the caller saved registers defined by ABI.
For BPF the set of such registers could be defined as follows:
- R0 is scratched only if function is non-void;
- R1-R5 are scratched only if corresponding parameter type is defined
  in the function prototype.

The goal of the RFC is to implement no_caller_saved_registers
(nocsr for short) in a backwards compatible manner:
- for kernels that support the feature, gain some performance boost
  from better register allocation;
- for kernels that don't support the feature, allow programs execution
  with minor performance losses.

To achieve this, use a scheme suggested by Alexei Starovoitov:
- for nocsr calls clang allocates registers as-if relevant r0-r5
  registers are not scratched by the call;
- as a post-processing step, clang visits each nocsr call and adds
  spill/fill for every live r0-r5;
- stack offsets used for spills/fills are allocated as minimal
  stack offsets in whole function and are not used for any other
  purposes;
- when kernel loads a program, it looks for such patterns
  (nocsr function surrounded by spills/fills) and checks if
  spill/fill stack offsets are used exclusively in nocsr patterns;
- if so, and if current JIT inlines the call to the nocsr function
  (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
- when old kernel loads a program, presence of spill/fill pairs
  keeps BPF program valid, albeit slightly less efficient.

Corresponding clang/llvm changes are available in [2].

The patch-set uses bpf_get_smp_processor_id() function as a canary,
making it the first helper with nocsr attribute.

For example, consider the following program:

  #define __no_csr __attribute__((no_caller_saved_registers))
  #define SEC(name) __attribute__((section(name), used))
  #define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)

  typedef unsigned int __u32;

  static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
  static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;

  SEC("raw_tp")
  int test(void *ctx)
  {
          __u32 task = bpf_get_smp_processor_id();
  	bpf_printk("ctx=%p, smp=%d", ctx, task);
  	return 0;
  }

  char _license[] SEC("license") = "GPL";

Compiled (using [2]) as follows:

  $ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
  $ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
    ...
  3rd parameter for printk call     removable spill/fill pair
  .--- 0:       r3 = r1                             |
; |       __u32 task = bpf_get_smp_processor_id();  |
  |    1:       *(u64 *)(r10 - 0x8) = r3 <----------|
  |    2:       call 0x8                            |
  |    3:       r3 = *(u64 *)(r10 - 0x8) <----------'
; |     bpf_printk("ctx=%p, smp=%d", ctx, task);
  |    4:       r1 = 0x0 ll
  |    6:       r2 = 0xf
  |    7:       r4 = r0
  '--> 8:       call 0x6
;       return 0;
       9:       r0 = 0x0
      10:       exit

Here is how the program looks after verifier processing:

  # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
  # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
  int test(void * ctx):
  ; int test(void *ctx)
     0: (bf) r3 = r1               <--------- 3rd printk parameter
  ; __u32 task = bpf_get_smp_processor_id();
     1: (b4) w0 = 197132           <--------- inlined helper call,
     2: (bf) r0 = r0               <--------- spill/fill pair removed
     3: (61) r0 = *(u32 *)(r0 +0)  <---------
  ; bpf_printk("ctx=%p, smp=%d", ctx, task);
     4: (18) r1 = map[id:13][0]+0
     6: (b7) r2 = 15
     7: (bf) r4 = r0
     8: (85) call bpf_trace_printk#-125920
  ; return 0;
     9: (b7) r0 = 0
    10: (95) exit

[1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
[2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers

Eduard Zingerman (8):
  bpf: add a get_helper_proto() utility function
  bpf: no_caller_saved_registers attribute for helper calls
  bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  selftests/bpf: extract utility function for BPF disassembly
  selftests/bpf: no need to track next_match_pos in struct test_loader
  selftests/bpf: extract test_loader->expect_msgs as a data structure
  selftests/bpf: allow checking xlated programs in verifier_* tests
  selftests/bpf: test no_caller_saved_registers spill/fill removal

 include/linux/bpf.h                           |   6 +
 include/linux/bpf_verifier.h                  |   9 +
 kernel/bpf/helpers.c                          |   1 +
 kernel/bpf/verifier.c                         | 346 +++++++++++++-
 tools/testing/selftests/bpf/Makefile          |   1 +
 tools/testing/selftests/bpf/disasm_helpers.c  |  50 ++
 tools/testing/selftests/bpf/disasm_helpers.h  |  12 +
 .../selftests/bpf/prog_tests/ctx_rewrite.c    |  71 +--
 .../selftests/bpf/prog_tests/verifier.c       |   7 +
 tools/testing/selftests/bpf/progs/bpf_misc.h  |   6 +
 .../selftests/bpf/progs/verifier_nocsr.c      | 437 ++++++++++++++++++
 tools/testing/selftests/bpf/test_loader.c     | 170 +++++--
 tools/testing/selftests/bpf/test_progs.h      |   1 -
 tools/testing/selftests/bpf/testing_helpers.c |   1 +
 14 files changed, 986 insertions(+), 132 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c

-- 
2.45.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2024-07-04 17:39 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:07     ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
2024-07-01 19:01   ` Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:38     ` Eduard Zingerman
2024-07-02 21:09       ` Andrii Nakryiko
2024-07-02 21:19         ` Eduard Zingerman
2024-07-02 21:22           ` Andrii Nakryiko
2024-07-03 11:57   ` Puranjay Mohan
2024-07-03 16:13     ` Eduard Zingerman
2024-07-04 10:55       ` Puranjay Mohan
2024-06-29  9:47 ` [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:43     ` Eduard Zingerman
2024-07-02 21:11       ` Andrii Nakryiko
2024-07-02 21:25         ` Eduard Zingerman
2024-07-03 11:27         ` Puranjay Mohan
2024-07-03 23:14           ` Eduard Zingerman
2024-07-04 11:19             ` Puranjay Mohan
2024-07-04 16:39               ` Eduard Zingerman
2024-07-04 17:00           ` Eduard Zingerman
2024-07-04 17:24             ` Puranjay Mohan
2024-07-04 17:39               ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:59     ` Eduard Zingerman
2024-07-02 21:16       ` Andrii Nakryiko
2024-07-02 21:23         ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 21:05     ` Eduard Zingerman
2024-07-02 21:18       ` Andrii Nakryiko
2024-06-29  9:47 ` [RFC bpf-next v1 6/8] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
2024-07-02  0:42   ` Andrii Nakryiko
2024-07-02 21:06     ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
2024-07-02  0:42   ` Andrii Nakryiko
2024-07-02 21:07     ` Eduard Zingerman
2024-07-02 21:19       ` Andrii Nakryiko
2024-06-29  9:47 ` [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
2024-07-02  0:42   ` Andrii Nakryiko
2024-07-02 21:12     ` Eduard Zingerman
2024-07-02 21:20       ` Andrii Nakryiko
2024-07-02  0:41 ` [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Andrii Nakryiko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.