[RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls

BPF List
 help / color / mirror / Atom feed

* [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls
@ 2024-07-04 10:23 Eduard Zingerman
  2024-07-04 10:23 ` [RFC bpf-next v2 1/9] bpf: add a get_helper_proto() utility function Eduard Zingerman
                   ` (10 more replies)
  0 siblings, 11 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:23 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman

This RFC seeks to allow using no_caller_saved_registers gcc/clang
attribute with some BPF helper functions (and kfuncs in the future).

As documented in [1], this attribute means that function scratches
only some of the caller saved registers defined by ABI.
For BPF the set of such registers could be defined as follows:
- R0 is scratched only if function is non-void;
- R1-R5 are scratched only if corresponding parameter type is defined
  in the function prototype.

The goal of the RFC is to implement no_caller_saved_registers
(nocsr for short) in a backwards compatible manner:
- for kernels that support the feature, gain some performance boost
  from better register allocation;
- for kernels that don't support the feature, allow programs execution
  with minor performance losses.

To achieve this, use a scheme suggested by Alexei Starovoitov:
- for nocsr calls clang allocates registers as-if relevant r0-r5
  registers are not scratched by the call;
- as a post-processing step, clang visits each nocsr call and adds
  spill/fill for every live r0-r5;
- stack offsets used for spills/fills are allocated as minimal
  stack offsets in whole function and are not used for any other
  purposes;
- when kernel loads a program, it looks for such patterns
  (nocsr function surrounded by spills/fills) and checks if
  spill/fill stack offsets are used exclusively in nocsr patterns;
- if so, and if current JIT inlines the call to the nocsr function
  (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
- when old kernel loads a program, presence of spill/fill pairs
  keeps BPF program valid, albeit slightly less efficient.

Corresponding clang/llvm changes are available in [2].

The patch-set uses bpf_get_smp_processor_id() function as a canary,
making it the first helper with nocsr attribute.

For example, consider the following program:

  #define __no_csr __attribute__((no_caller_saved_registers))
  #define SEC(name) __attribute__((section(name), used))
  #define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)

  typedef unsigned int __u32;

  static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
  static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;

  SEC("raw_tp")
  int test(void *ctx)
  {
          __u32 task = bpf_get_smp_processor_id();
  	bpf_printk("ctx=%p, smp=%d", ctx, task);
  	return 0;
  }

  char _license[] SEC("license") = "GPL";

Compiled (using [2]) as follows:

  $ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
  $ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
    ...
  3rd parameter for printk call     removable spill/fill pair
  .--- 0:       r3 = r1                             |
; |       __u32 task = bpf_get_smp_processor_id();  |
  |    1:       *(u64 *)(r10 - 0x8) = r3 <----------|
  |    2:       call 0x8                            |
  |    3:       r3 = *(u64 *)(r10 - 0x8) <----------'
; |     bpf_printk("ctx=%p, smp=%d", ctx, task);
  |    4:       r1 = 0x0 ll
  |    6:       r2 = 0xf
  |    7:       r4 = r0
  '--> 8:       call 0x6
;       return 0;
       9:       r0 = 0x0
      10:       exit

Here is how the program looks after verifier processing:

  # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
  # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
  int test(void * ctx):
  ; int test(void *ctx)
     0: (bf) r3 = r1               <--------- 3rd printk parameter
  ; __u32 task = bpf_get_smp_processor_id();
     1: (b4) w0 = 197132           <--------- inlined helper call,
     2: (bf) r0 = r0               <--------- spill/fill pair removed
     3: (61) r0 = *(u32 *)(r0 +0)  <---------
  ; bpf_printk("ctx=%p, smp=%d", ctx, task);
     4: (18) r1 = map[id:13][0]+0
     6: (b7) r2 = 15
     7: (bf) r4 = r0
     8: (85) call bpf_trace_printk#-125920
  ; return 0;
     9: (b7) r0 = 0
    10: (95) exit

[1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
[2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers

Change list:
- v1 -> v2:
  - assume that functions inlined by either jit or verifier
    conform to no_caller_saved_registers contract (Andrii, Puranjay);
  - allow nocsr rewrite for bpf_get_smp_processor_id()
    on arm64 and riscv64 architectures (Puranjay);
  - __arch_{x86_64,arm64,riscv64} macro for test_loader;
  - moved remove_nocsr_spills_fills() inside do_misc_fixups() (Andrii);
  - moved nocsr pattern detection from check_cfg() to a separate pass
    (Andrii);
  - various stylistic/correctness changes according to Andrii's
    comments.

Revisions:
- v1 https://lore.kernel.org/bpf/20240629094733.3863850-1-eddyz87@gmail.com/

Eduard Zingerman (9):
  bpf: add a get_helper_proto() utility function
  bpf: no_caller_saved_registers attribute for helper calls
  bpf, x86, riscv, arm: no_caller_saved_registers for
    bpf_get_smp_processor_id()
  selftests/bpf: extract utility function for BPF disassembly
  selftests/bpf: no need to track next_match_pos in struct test_loader
  selftests/bpf: extract test_loader->expect_msgs as a data structure
  selftests/bpf: allow checking xlated programs in verifier_* tests
  selftests/bpf: __arch_* macro to limit test cases to specific archs
  selftests/bpf: test no_caller_saved_registers spill/fill removal

 include/linux/bpf.h                           |   6 +
 include/linux/bpf_verifier.h                  |  14 +
 kernel/bpf/helpers.c                          |   1 +
 kernel/bpf/verifier.c                         | 339 +++++++++++-
 tools/testing/selftests/bpf/Makefile          |   1 +
 tools/testing/selftests/bpf/disasm_helpers.c  |  51 ++
 tools/testing/selftests/bpf/disasm_helpers.h  |  12 +
 .../selftests/bpf/prog_tests/ctx_rewrite.c    |  74 +--
 .../selftests/bpf/prog_tests/verifier.c       |   2 +
 tools/testing/selftests/bpf/progs/bpf_misc.h  |  13 +
 .../selftests/bpf/progs/verifier_nocsr.c      | 521 ++++++++++++++++++
 tools/testing/selftests/bpf/test_loader.c     | 217 ++++++--
 tools/testing/selftests/bpf/test_progs.h      |   1 -
 tools/testing/selftests/bpf/testing_helpers.c |   1 +
 14 files changed, 1124 insertions(+), 129 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c

-- 
2.45.2


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFC bpf-next v2 1/9] bpf: add a get_helper_proto() utility function
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
@ 2024-07-04 10:23 ` Eduard Zingerman
  2024-07-09 23:42   ` Andrii Nakryiko
  2024-07-04 10:23 ` [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:23 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman

Extract the part of check_helper_call() as a utility function allowing
to query 'struct bpf_func_proto' for a specific helper function id.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d3927d819465..4869f1fb0a42 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10261,6 +10261,24 @@ static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno
 				 state->callback_subprogno == subprogno);
 }
 
+static int get_helper_proto(struct bpf_verifier_env *env, int func_id,
+			    const struct bpf_func_proto **ptr)
+{
+	const struct bpf_func_proto *result = NULL;
+
+	if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID)
+		return -ERANGE;
+
+	if (env->ops->get_func_proto)
+		result = env->ops->get_func_proto(func_id, env->prog);
+
+	if (!result)
+		return -EINVAL;
+
+	*ptr = result;
+	return 0;
+}
+
 static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx_p)
 {
@@ -10277,18 +10295,16 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 
 	/* find function prototype */
 	func_id = insn->imm;
-	if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID) {
-		verbose(env, "invalid func %s#%d\n", func_id_name(func_id),
-			func_id);
+	err = get_helper_proto(env, insn->imm, &fn);
+	if (err == -ERANGE) {
+		verbose(env, "invalid func %s#%d\n", func_id_name(func_id), func_id);
 		return -EINVAL;
 	}
 
-	if (env->ops->get_func_proto)
-		fn = env->ops->get_func_proto(func_id, env->prog);
-	if (!fn) {
+	if (err) {
 		verbose(env, "program of this type cannot use helper %s#%d\n",
 			func_id_name(func_id), func_id);
-		return -EINVAL;
+		return err;
 	}
 
 	/* eBPF programs must be GPL compatible to use GPL-ed functions */
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
  2024-07-04 10:23 ` [RFC bpf-next v2 1/9] bpf: add a get_helper_proto() utility function Eduard Zingerman
@ 2024-07-04 10:23 ` Eduard Zingerman
  2024-07-09 23:42   ` Andrii Nakryiko
  2024-07-10  1:09   ` Alexei Starovoitov
  2024-07-04 10:23 ` [RFC bpf-next v2 3/9] bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:23 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman, Alexei Starovoitov

GCC and LLVM define a no_caller_saved_registers function attribute.
This attribute means that function scratches only some of
the caller saved registers defined by ABI.
For BPF the set of such registers could be defined as follows:
- R0 is scratched only if function is non-void;
- R1-R5 are scratched only if corresponding parameter type is defined
  in the function prototype.

This commit introduces flag bpf_func_prot->allow_nocsr.
If this flag is set for some helper function, verifier assumes that
it follows no_caller_saved_registers calling convention.

The contract between kernel and clang allows to simultaneously use
such functions and maintain backwards compatibility with old
kernels that don't understand no_caller_saved_registers calls
(nocsr for short):

- clang generates a simple pattern for nocsr calls, e.g.:

    r1 = 1;
    r2 = 2;
    *(u64 *)(r10 - 8)  = r1;
    *(u64 *)(r10 - 16) = r2;
    call %[to_be_inlined]
    r2 = *(u64 *)(r10 - 16);
    r1 = *(u64 *)(r10 - 8);
    r0 = r1;
    r0 += r2;
    exit;

- kernel removes unnecessary spills and fills, if called function is
  inlined by verifier or current JIT (with assumption that patch
  inserted by verifier or JIT honors nocsr contract, e.g. does not
  scratch r3-r5 for the example above), e.g. the code above would be
  transformed to:

    r1 = 1;
    r2 = 2;
    call %[to_be_inlined]
    r0 = r1;
    r0 += r2;
    exit;

Technically, the transformation is split into the following phases:
- function mark_nocsr_pattern_patterns(), called from bpf_check()
  searches and marks potential patterns in instruction auxiliary data;
- upon stack read or write access,
  function check_nocsr_stack_contract() is used to verify if
  stack offsets, presumably reserved for nocsr patterns, are used
  only from those patterns;
- function do_misc_fixups(), called from bpf_check(),
  applies the rewrite for valid patterns.

See comment in mark_nocsr_pattern_for_call() for more details.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf.h          |   6 +
 include/linux/bpf_verifier.h |  14 ++
 kernel/bpf/verifier.c        | 300 ++++++++++++++++++++++++++++++++++-
 3 files changed, 314 insertions(+), 6 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 960780ef04e1..391e19c5cd68 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -807,6 +807,12 @@ struct bpf_func_proto {
 	bool gpl_only;
 	bool pkt_access;
 	bool might_sleep;
+	/* set to true if helper follows contract for gcc/llvm
+	 * attribute no_caller_saved_registers:
+	 * - void functions do not scratch r0
+	 * - functions taking N arguments scratch only registers r1-rN
+	 */
+	bool allow_nocsr;
 	enum bpf_return_type ret_type;
 	union {
 		struct {
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 2b54e25d2364..735ae0901b3d 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -585,6 +585,15 @@ struct bpf_insn_aux_data {
 	 * accepts callback function as a parameter.
 	 */
 	bool calls_callback;
+	/* true if STX or LDX instruction is a part of a spill/fill
+	 * pattern for a no_caller_saved_registers call.
+	 */
+	u8 nocsr_pattern:1;
+	/* for CALL instructions, a number of spill/fill pairs in the
+	 * no_caller_saved_registers pattern.
+	 */
+	u8 nocsr_spills_num:3;
+
 };
 
 #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
@@ -641,6 +650,11 @@ struct bpf_subprog_info {
 	u32 linfo_idx; /* The idx to the main_prog->aux->linfo */
 	u16 stack_depth; /* max. stack depth used by this function */
 	u16 stack_extra;
+	/* stack depth after which slots reserved for
+	 * no_caller_saved_registers spills/fills start,
+	 * value <= nocsr_stack_off belongs to the spill/fill area.
+	 */
+	s16 nocsr_stack_off;
 	bool has_tail_call: 1;
 	bool tail_call_reachable: 1;
 	bool has_ld_abs: 1;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4869f1fb0a42..d16a249b59ad 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2471,16 +2471,37 @@ static int cmp_subprogs(const void *a, const void *b)
 	       ((struct bpf_subprog_info *)b)->start;
 }
 
-static int find_subprog(struct bpf_verifier_env *env, int off)
+/* Find subprogram that contains instruction at 'off' */
+static int find_containing_subprog(struct bpf_verifier_env *env, int off)
 {
-	struct bpf_subprog_info *p;
+	struct bpf_subprog_info *vals = env->subprog_info;
+	int l, r, m;
 
-	p = bsearch(&off, env->subprog_info, env->subprog_cnt,
-		    sizeof(env->subprog_info[0]), cmp_subprogs);
-	if (!p)
+	if (off >= env->prog->len || off < 0 || env->subprog_cnt == 0)
 		return -ENOENT;
-	return p - env->subprog_info;
 
+	l = 0;
+	m = 0;
+	r = env->subprog_cnt - 1;
+	while (l < r) {
+		m = l + (r - l + 1) / 2;
+		if (vals[m].start <= off)
+			l = m;
+		else
+			r = m - 1;
+	}
+	return l;
+}
+
+/* Find subprogram that starts exactly at 'off' */
+static int find_subprog(struct bpf_verifier_env *env, int off)
+{
+	int idx;
+
+	idx = find_containing_subprog(env, off);
+	if (idx < 0 || env->subprog_info[idx].start != off)
+		return -ENOENT;
+	return idx;
 }
 
 static int add_subprog(struct bpf_verifier_env *env, int off)
@@ -4501,6 +4522,23 @@ static int get_reg_width(struct bpf_reg_state *reg)
 	return fls64(reg->umax_value);
 }
 
+/* See comment for mark_nocsr_pattern_for_call() */
+static void check_nocsr_stack_contract(struct bpf_verifier_env *env, struct bpf_func_state *state,
+				       int insn_idx, int off)
+{
+	struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
+	struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
+
+	if (subprog->nocsr_stack_off <= off || aux->nocsr_pattern)
+		return;
+	/* access to the region [max_stack_depth .. nocsr_stack_off]
+	 * from something that is not a part of the nocsr pattern,
+	 * disable nocsr rewrites for current subprogram by setting
+	 * nocsr_stack_off to a value smaller than any possible offset.
+	 */
+	subprog->nocsr_stack_off = S16_MIN;
+}
+
 /* check_stack_{read,write}_fixed_off functions track spill/fill of registers,
  * stack boundary and alignment are checked in check_mem_access()
  */
@@ -4549,6 +4587,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
 	if (err)
 		return err;
 
+	check_nocsr_stack_contract(env, state, insn_idx, off);
 	mark_stack_slot_scratched(env, spi);
 	if (reg && !(off % BPF_REG_SIZE) && reg->type == SCALAR_VALUE && env->bpf_capable) {
 		bool reg_value_fits;
@@ -4682,6 +4721,7 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
 			return err;
 	}
 
+	check_nocsr_stack_contract(env, state, insn_idx, min_off);
 	/* Variable offset writes destroy any spilled pointers in range. */
 	for (i = min_off; i < max_off; i++) {
 		u8 new_type, *stype;
@@ -4820,6 +4860,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
 	reg = &reg_state->stack[spi].spilled_ptr;
 
 	mark_stack_slot_scratched(env, spi);
+	check_nocsr_stack_contract(env, state, env->insn_idx, off);
 
 	if (is_spilled_reg(&reg_state->stack[spi])) {
 		u8 spill_size = 1;
@@ -4980,6 +5021,7 @@ static int check_stack_read_var_off(struct bpf_verifier_env *env,
 	min_off = reg->smin_value + off;
 	max_off = reg->smax_value + off;
 	mark_reg_stack_read(env, ptr_state, min_off, max_off + size, dst_regno);
+	check_nocsr_stack_contract(env, ptr_state, env->insn_idx, min_off);
 	return 0;
 }
 
@@ -15951,6 +15993,206 @@ static int visit_func_call_insn(int t, struct bpf_insn *insns,
 	return ret;
 }
 
+/* Bitmask with 1s for all caller saved registers */
+#define ALL_CALLER_SAVED_REGS ((1u << CALLER_SAVED_REGS) - 1)
+
+/* Return a bitmask specifying which caller saved registers are
+ * modified by a call to a helper.
+ * (Either as a return value or as scratch registers).
+ *
+ * For normal helpers registers R0-R5 are scratched.
+ * For helpers marked as no_csr:
+ * - scratch R0 if function is non-void;
+ * - scratch R1-R5 if corresponding parameter type is set
+ *   in the function prototype.
+ */
+static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
+{
+	u8 mask;
+	int i;
+
+	if (!fn->allow_nocsr)
+		return ALL_CALLER_SAVED_REGS;
+
+	mask = 0;
+	if (fn->ret_type != RET_VOID)
+		mask |= BIT(BPF_REG_0);
+	for (i = 0; i < ARRAY_SIZE(fn->arg_type); ++i)
+		if (fn->arg_type[i] != ARG_DONTCARE)
+			mask |= BIT(BPF_REG_1 + i);
+	return mask;
+}
+
+/* True if do_misc_fixups() replaces calls to helper number 'imm',
+ * replacement patch is presumed to follow no_caller_saved_registers contract
+ * (see mark_nocsr_pattern_for_call() below).
+ */
+static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
+{
+	return false;
+}
+
+/* If 'insn' is a call that follows no_caller_saved_registers contract
+ * and called function is inlined by current jit or verifier,
+ * return a mask with 1s corresponding to registers that are scratched
+ * by this call (depends on return type and number of return parameters).
+ * Otherwise return ALL_CALLER_SAVED_REGS mask.
+ */
+static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
+{
+	const struct bpf_func_proto *fn;
+
+	if (bpf_helper_call(insn) &&
+	    (verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)) &&
+	    get_helper_proto(env, insn->imm, &fn) == 0 &&
+	    fn->allow_nocsr)
+		return ~get_helper_reg_mask(fn);
+
+	return ALL_CALLER_SAVED_REGS;
+}
+
+/* GCC and LLVM define a no_caller_saved_registers function attribute.
+ * This attribute means that function scratches only some of
+ * the caller saved registers defined by ABI.
+ * For BPF the set of such registers could be defined as follows:
+ * - R0 is scratched only if function is non-void;
+ * - R1-R5 are scratched only if corresponding parameter type is defined
+ *   in the function prototype.
+ *
+ * The contract between kernel and clang allows to simultaneously use
+ * such functions and maintain backwards compatibility with old
+ * kernels that don't understand no_caller_saved_registers calls
+ * (nocsr for short):
+ *
+ * - for nocsr calls clang allocates registers as-if relevant r0-r5
+ *   registers are not scratched by the call;
+ *
+ * - as a post-processing step, clang visits each nocsr call and adds
+ *   spill/fill for every live r0-r5;
+ *
+ * - stack offsets used for the spill/fill are allocated as minimal
+ *   stack offsets in whole function and are not used for any other
+ *   purposes;
+ *
+ * - when kernel loads a program, it looks for such patterns
+ *   (nocsr function surrounded by spills/fills) and checks if
+ *   spill/fill stack offsets are used exclusively in nocsr patterns;
+ *
+ * - if so, and if verifier or current JIT inlines the call to the
+ *   nocsr function (e.g. a helper call), kernel removes unnecessary
+ *   spill/fill pairs;
+ *
+ * - when old kernel loads a program, presence of spill/fill pairs
+ *   keeps BPF program valid, albeit slightly less efficient.
+ *
+ * For example:
+ *
+ *   r1 = 1;
+ *   r2 = 2;
+ *   *(u64 *)(r10 - 8)  = r1;            r1 = 1;
+ *   *(u64 *)(r10 - 16) = r2;            r2 = 2;
+ *   call %[to_be_inlined]         -->   call %[to_be_inlined]
+ *   r2 = *(u64 *)(r10 - 16);            r0 = r1;
+ *   r1 = *(u64 *)(r10 - 8);             r0 += r2;
+ *   r0 = r1;                            exit;
+ *   r0 += r2;
+ *   exit;
+ *
+ * The purpose of mark_nocsr_pattern_for_call is to:
+ * - look for such patterns;
+ * - mark spill and fill instructions in env->insn_aux_data[*].nocsr_pattern;
+ * - mark set env->insn_aux_data[*].nocsr_spills_num for call instruction;
+ * - update env->subprog_info[*]->nocsr_stack_off to find an offset
+ *   at which nocsr spill/fill stack slots start.
+ *
+ * The .nocsr_pattern and .nocsr_stack_off are used by
+ * check_nocsr_stack_contract() to check if every stack access to
+ * nocsr spill/fill stack slot originates from spill/fill
+ * instructions, members of nocsr patterns.
+ *
+ * If such condition holds true for a subprogram, nocsr patterns could
+ * be rewritten by do_misc_fixups().
+ * Otherwise nocsr patterns are not changed in the subprogram
+ * (code, presumably, generated by an older clang version).
+ *
+ * For example, it is *not* safe to remove spill/fill below:
+ *
+ *   r1 = 1;
+ *   *(u64 *)(r10 - 8)  = r1;            r1 = 1;
+ *   call %[to_be_inlined]         -->   call %[to_be_inlined]
+ *   r1 = *(u64 *)(r10 - 8);             r0 = *(u64 *)(r10 - 8);  <---- wrong !!!
+ *   r0 = *(u64 *)(r10 - 8);             r0 += r1;
+ *   r0 += r1;                           exit;
+ *   exit;
+ */
+static void mark_nocsr_pattern_for_call(struct bpf_verifier_env *env, int t)
+{
+	struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
+	struct bpf_subprog_info *subprog;
+	u32 csr_mask = call_csr_mask(env, &insns[t]);
+	u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;
+	int s, i;
+	s16 off;
+
+	if (csr_mask == ALL_CALLER_SAVED_REGS)
+		return;
+
+	for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
+		if (t - i < 0 || t + i >= env->prog->len)
+			break;
+		stx = &insns[t - i];
+		ldx = &insns[t + i];
+		if (off == 0) {
+			off = stx->off;
+			if (off % BPF_REG_SIZE != 0)
+				break;
+		}
+		if (/* *(u64 *)(r10 - off) = r[0-5]? */
+		    stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
+		    stx->dst_reg != BPF_REG_10 ||
+		    /* r[0-5] = *(u64 *)(r10 - off)? */
+		    ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
+		    ldx->src_reg != BPF_REG_10 ||
+		    /* check spill/fill for the same reg and offset */
+		    stx->src_reg != ldx->dst_reg ||
+		    stx->off != ldx->off ||
+		    stx->off != off ||
+		    /* this should be a previously unseen register */
+		    BIT(stx->src_reg) & reg_mask)
+			break;
+		reg_mask |= BIT(stx->src_reg);
+		env->insn_aux_data[t - i].nocsr_pattern = 1;
+		env->insn_aux_data[t + i].nocsr_pattern = 1;
+	}
+	if (i == 1)
+		return;
+	env->insn_aux_data[t].nocsr_spills_num = i - 1;
+	s = find_containing_subprog(env, t);
+	/* can't happen */
+	if (WARN_ON_ONCE(s < 0))
+		return;
+	subprog = &env->subprog_info[s];
+	subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
+}
+
+/* Update the following fields when appropriate:
+ * - env->insn_aux_data[*].nocsr_pattern
+ * - env->insn_aux_data[*].spills_num and
+ * - env->subprog_info[*].nocsr_stack_off
+ * See mark_nocsr_pattern_for_call().
+ */
+static int mark_nocsr_patterns(struct bpf_verifier_env *env)
+{
+	struct bpf_insn *insn = env->prog->insnsi;
+	int i, insn_cnt = env->prog->len;
+
+	for (i = 0; i < insn_cnt; i++, insn++)
+		/* might be extended to handle kfuncs as well */
+		if (bpf_helper_call(insn))
+			mark_nocsr_pattern_for_call(env, i);
+	return 0;
+}
+
 /* Visits the instruction at index t and returns one of the following:
  *  < 0 - an error occurred
  *  DONE_EXPLORING - the instruction was fully explored
@@ -20119,6 +20361,48 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			goto next_insn;
 		if (insn->src_reg == BPF_PSEUDO_CALL)
 			goto next_insn;
+		/* Remove unnecessary spill/fill pairs, members of nocsr pattern */
+		if (env->insn_aux_data[i + delta].nocsr_spills_num > 0) {
+			u32 j, spills_num = env->insn_aux_data[i + delta].nocsr_spills_num;
+			int err;
+
+			/* don't apply this on a second visit */
+			env->insn_aux_data[i + delta].nocsr_spills_num = 0;
+
+			/* check if spill/fill stack access is in expected offset range */
+			for (j = 1; j <= spills_num; ++j) {
+				if ((insn - j)->off >= subprogs[cur_subprog].nocsr_stack_off ||
+				    (insn + j)->off >= subprogs[cur_subprog].nocsr_stack_off) {
+					/* do a second visit of this instruction,
+					 * so that verifier can inline it
+					 */
+					i -= 1;
+					insn -= 1;
+					goto next_insn;
+				}
+			}
+
+			/* apply the rewrite:
+			 *   *(u64 *)(r10 - X) = rY ; num-times
+			 *   call()                               -> call()
+			 *   rY = *(u64 *)(r10 - X) ; num-times
+			 */
+			err = verifier_remove_insns(env, i + delta - spills_num, spills_num);
+			if (err)
+				return err;
+			err = verifier_remove_insns(env, i + delta - spills_num + 1, spills_num);
+			if (err)
+				return err;
+
+			i += spills_num - 1;
+			/*   ^            ^   do a second visit of this instruction,
+			 *   |            '-- so that verifier can inline it
+			 *   '--------------- jump over deleted fills
+			 */
+			delta -= 2 * spills_num;
+			insn = env->prog->insnsi + i + delta;
+			goto next_insn;
+		}
 		if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
 			ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
 			if (ret)
@@ -21704,6 +21988,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (ret < 0)
 		goto skip_full_check;
 
+	ret = mark_nocsr_patterns(env);
+	if (ret < 0)
+		goto skip_full_check;
+
 	ret = do_check_main(env);
 	ret = ret ?: do_check_subprogs(env);
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC bpf-next v2 3/9] bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
  2024-07-04 10:23 ` [RFC bpf-next v2 1/9] bpf: add a get_helper_proto() utility function Eduard Zingerman
  2024-07-04 10:23 ` [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
@ 2024-07-04 10:23 ` Eduard Zingerman
  2024-07-04 10:23 ` [RFC bpf-next v2 4/9] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:23 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman

The function bpf_get_smp_processor_id() is processed in a different
way, depending on the arch:
- on x86 verifier replaces call to bpf_get_smp_processor_id() with a
  sequence of instructions that modify only r0;
- on riscv64 jit replaces call to bpf_get_smp_processor_id() with a
  sequence of instructions that modify only r0;
- on arm64 jit replaces call to bpf_get_smp_processor_id() with a
  sequence of instructions that modify only r0 and tmp registers.

These rewrites satisfy attribute no_caller_saved_registers contract.
Allow rewrite of no_caller_saved_registers patterns for
bpf_get_smp_processor_id() in order to use this function as a canary
for no_caller_saved_registers tests.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/helpers.c  |  1 +
 kernel/bpf/verifier.c | 11 +++++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 229396172026..26863b162a88 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -158,6 +158,7 @@ const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
 	.func		= bpf_get_smp_processor_id,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
+	.allow_nocsr	= true,
 };
 
 BPF_CALL_0(bpf_get_numa_node_id)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d16a249b59ad..99115c552e3b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16029,7 +16029,14 @@ static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
  */
 static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
 {
-	return false;
+	switch (imm) {
+#ifdef CONFIG_X86_64
+	case BPF_FUNC_get_smp_processor_id:
+		return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
+#endif
+	default:
+		return false;
+	}
 }
 
 /* If 'insn' is a call that follows no_caller_saved_registers contract
@@ -20703,7 +20710,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 #ifdef CONFIG_X86_64
 		/* Implement bpf_get_smp_processor_id() inline. */
 		if (insn->imm == BPF_FUNC_get_smp_processor_id &&
-		    prog->jit_requested && bpf_jit_supports_percpu_insn()) {
+		    verifier_inlines_helper_call(env, insn->imm)) {
 			/* BPF_FUNC_get_smp_processor_id inlining is an
 			 * optimization, so if pcpu_hot.cpu_number is ever
 			 * changed in some incompatible and hard to support
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC bpf-next v2 4/9] selftests/bpf: extract utility function for BPF disassembly
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (2 preceding siblings ...)
  2024-07-04 10:23 ` [RFC bpf-next v2 3/9] bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
@ 2024-07-04 10:23 ` Eduard Zingerman
  2024-07-09 23:46   ` Andrii Nakryiko
  2024-07-04 10:23 ` [RFC bpf-next v2 5/9] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:23 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman

struct bpf_insn *disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz);

  Disassembles instruction 'insn' to a text buffer 'buf'.
  Removes insn->code hex prefix added by kernel disassembly routine.
  Returns a pointer to the next instruction
  (increments insn by either 1 or 2).

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/Makefile          |  1 +
 tools/testing/selftests/bpf/disasm_helpers.c  | 51 +++++++++++++
 tools/testing/selftests/bpf/disasm_helpers.h  | 12 +++
 .../selftests/bpf/prog_tests/ctx_rewrite.c    | 74 +++----------------
 tools/testing/selftests/bpf/testing_helpers.c |  1 +
 5 files changed, 75 insertions(+), 64 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index e0b3887b3d2d..5eb7b5eb89d2 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -636,6 +636,7 @@ TRUNNER_EXTRA_SOURCES := test_progs.c		\
 			 test_loader.c		\
 			 xsk.c			\
 			 disasm.c		\
+			 disasm_helpers.c	\
 			 json_writer.c 		\
 			 flow_dissector_load.h	\
 			 ip_check_defrag_frags.h
diff --git a/tools/testing/selftests/bpf/disasm_helpers.c b/tools/testing/selftests/bpf/disasm_helpers.c
new file mode 100644
index 000000000000..96b1f2ffe438
--- /dev/null
+++ b/tools/testing/selftests/bpf/disasm_helpers.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+#include <bpf/bpf.h>
+#include "disasm.h"
+
+struct print_insn_context {
+	char *buf;
+	size_t sz;
+};
+
+static void print_insn_cb(void *private_data, const char *fmt, ...)
+{
+	struct print_insn_context *ctx = private_data;
+	va_list args;
+
+	va_start(args, fmt);
+	vsnprintf(ctx->buf, ctx->sz, fmt, args);
+	va_end(args);
+}
+
+struct bpf_insn *disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz)
+{
+	struct print_insn_context ctx = {
+		.buf = buf,
+		.sz = buf_sz,
+	};
+	struct bpf_insn_cbs cbs = {
+		.cb_print	= print_insn_cb,
+		.private_data	= &ctx,
+	};
+	char *tmp, *pfx_end, *sfx_start;
+	bool double_insn;
+	int len;
+
+	print_bpf_insn(&cbs, insn, true);
+	/* We share code with kernel BPF disassembler, it adds '(FF) ' prefix
+	 * for each instruction (FF stands for instruction `code` byte).
+	 * Remove the prefix inplace, and also simplify call instructions.
+	 * E.g.: "(85) call foo#10" -> "call foo".
+	 * Also remove newline in the end (the 'max(strlen(buf) - 1, 0)' thing).
+	 */
+	pfx_end = buf + 5;
+	sfx_start = buf + max((int)strlen(buf) - 1, 0);
+	if (strncmp(pfx_end, "call ", 5) == 0 && (tmp = strrchr(buf, '#')))
+		sfx_start = tmp;
+	len = sfx_start - pfx_end;
+	memmove(buf, pfx_end, len);
+	buf[len] = 0;
+	double_insn = insn->code == (BPF_LD | BPF_IMM | BPF_DW);
+	return insn + (double_insn ? 2 : 1);
+}
diff --git a/tools/testing/selftests/bpf/disasm_helpers.h b/tools/testing/selftests/bpf/disasm_helpers.h
new file mode 100644
index 000000000000..7b26cab70099
--- /dev/null
+++ b/tools/testing/selftests/bpf/disasm_helpers.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+
+#ifndef __DISASM_HELPERS_H
+#define __DISASM_HELPERS_H
+
+#include <stdlib.h>
+
+struct bpf_insn;
+
+struct bpf_insn *disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz);
+
+#endif /* __DISASM_HELPERS_H */
diff --git a/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c b/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c
index 08b6391f2f56..dd75ccb03770 100644
--- a/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c
+++ b/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c
@@ -10,7 +10,8 @@
 #include "bpf/btf.h"
 #include "bpf_util.h"
 #include "linux/filter.h"
-#include "disasm.h"
+#include "linux/kernel.h"
+#include "disasm_helpers.h"
 
 #define MAX_PROG_TEXT_SZ (32 * 1024)
 
@@ -628,63 +629,6 @@ static bool match_pattern(struct btf *btf, char *pattern, char *text, char *reg_
 	return false;
 }
 
-static void print_insn(void *private_data, const char *fmt, ...)
-{
-	va_list args;
-
-	va_start(args, fmt);
-	vfprintf((FILE *)private_data, fmt, args);
-	va_end(args);
-}
-
-/* Disassemble instructions to a stream */
-static void print_xlated(FILE *out, struct bpf_insn *insn, __u32 len)
-{
-	const struct bpf_insn_cbs cbs = {
-		.cb_print	= print_insn,
-		.cb_call	= NULL,
-		.cb_imm		= NULL,
-		.private_data	= out,
-	};
-	bool double_insn = false;
-	int i;
-
-	for (i = 0; i < len; i++) {
-		if (double_insn) {
-			double_insn = false;
-			continue;
-		}
-
-		double_insn = insn[i].code == (BPF_LD | BPF_IMM | BPF_DW);
-		print_bpf_insn(&cbs, insn + i, true);
-	}
-}
-
-/* We share code with kernel BPF disassembler, it adds '(FF) ' prefix
- * for each instruction (FF stands for instruction `code` byte).
- * This function removes the prefix inplace for each line in `str`.
- */
-static void remove_insn_prefix(char *str, int size)
-{
-	const int prefix_size = 5;
-
-	int write_pos = 0, read_pos = prefix_size;
-	int len = strlen(str);
-	char c;
-
-	size = min(size, len);
-
-	while (read_pos < size) {
-		c = str[read_pos++];
-		if (c == 0)
-			break;
-		str[write_pos++] = c;
-		if (c == '\n')
-			read_pos += prefix_size;
-	}
-	str[write_pos] = 0;
-}
-
 struct prog_info {
 	char *prog_kind;
 	enum bpf_prog_type prog_type;
@@ -699,9 +643,10 @@ static void match_program(struct btf *btf,
 			  char *reg_map[][2],
 			  bool skip_first_insn)
 {
-	struct bpf_insn *buf = NULL;
+	struct bpf_insn *buf = NULL, *insn, *insn_end;
 	int err = 0, prog_fd = 0;
 	FILE *prog_out = NULL;
+	char insn_buf[64];
 	char *text = NULL;
 	__u32 cnt = 0;
 
@@ -739,12 +684,13 @@ static void match_program(struct btf *btf,
 		PRINT_FAIL("Can't open memory stream\n");
 		goto out;
 	}
-	if (skip_first_insn)
-		print_xlated(prog_out, buf + 1, cnt - 1);
-	else
-		print_xlated(prog_out, buf, cnt);
+	insn_end = buf + cnt;
+	insn = buf + (skip_first_insn ? 1 : 0);
+	while (insn < insn_end) {
+		insn = disasm_insn(insn, insn_buf, sizeof(insn_buf));
+		fprintf(prog_out, "%s\n", insn_buf);
+	}
 	fclose(prog_out);
-	remove_insn_prefix(text, MAX_PROG_TEXT_SZ);
 
 	ASSERT_TRUE(match_pattern(btf, pattern, text, reg_map),
 		    pinfo->prog_kind);
diff --git a/tools/testing/selftests/bpf/testing_helpers.c b/tools/testing/selftests/bpf/testing_helpers.c
index d5379a0e6da8..ac7c66f4fc7b 100644
--- a/tools/testing/selftests/bpf/testing_helpers.c
+++ b/tools/testing/selftests/bpf/testing_helpers.c
@@ -7,6 +7,7 @@
 #include <errno.h>
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
+#include "disasm.h"
 #include "test_progs.h"
 #include "testing_helpers.h"
 #include <linux/membarrier.h>
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC bpf-next v2 5/9] selftests/bpf: no need to track next_match_pos in struct test_loader
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (3 preceding siblings ...)
  2024-07-04 10:23 ` [RFC bpf-next v2 4/9] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
@ 2024-07-04 10:23 ` Eduard Zingerman
  2024-07-04 10:23 ` [RFC bpf-next v2 6/9] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:23 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman

The call stack for validate_case() function looks as follows:
- test_loader__run_subtests()
  - process_subtest()
    - run_subtest()
      - prepare_case(), which does 'tester->next_match_pos = 0';
      - validate_case(), which increments tester->next_match_pos.

Hence, each subtest is run with next_match_pos freshly set to zero.
Meaning that there is no need to persist this variable in the
struct test_loader, use local variable instead.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/test_loader.c | 19 ++++++++-----------
 tools/testing/selftests/bpf/test_progs.h  |  1 -
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index f14e10b0de96..47508cf66e89 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -434,7 +434,6 @@ static void prepare_case(struct test_loader *tester,
 	bpf_program__set_flags(prog, prog_flags | spec->prog_flags);
 
 	tester->log_buf[0] = '\0';
-	tester->next_match_pos = 0;
 }
 
 static void emit_verifier_log(const char *log_buf, bool force)
@@ -450,25 +449,23 @@ static void validate_case(struct test_loader *tester,
 			  struct bpf_program *prog,
 			  int load_err)
 {
-	int i, j, err;
-	char *match;
 	regmatch_t reg_match[1];
+	const char *log = tester->log_buf;
+	int i, j, err;
 
 	for (i = 0; i < subspec->expect_msg_cnt; i++) {
 		struct expect_msg *msg = &subspec->expect_msgs[i];
+		const char *match = NULL;
 
 		if (msg->substr) {
-			match = strstr(tester->log_buf + tester->next_match_pos, msg->substr);
+			match = strstr(log, msg->substr);
 			if (match)
-				tester->next_match_pos = match - tester->log_buf + strlen(msg->substr);
+				log += strlen(msg->substr);
 		} else {
-			err = regexec(&msg->regex,
-				      tester->log_buf + tester->next_match_pos, 1, reg_match, 0);
+			err = regexec(&msg->regex, log, 1, reg_match, 0);
 			if (err == 0) {
-				match = tester->log_buf + tester->next_match_pos + reg_match[0].rm_so;
-				tester->next_match_pos += reg_match[0].rm_eo;
-			} else {
-				match = NULL;
+				match = log + reg_match[0].rm_so;
+				log += reg_match[0].rm_eo;
 			}
 		}
 
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index 0ba5a20b19ba..8e997de596db 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -438,7 +438,6 @@ typedef int (*pre_execution_cb)(struct bpf_object *obj);
 struct test_loader {
 	char *log_buf;
 	size_t log_buf_sz;
-	size_t next_match_pos;
 	pre_execution_cb pre_execution_cb;
 
 	struct bpf_object *obj;
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC bpf-next v2 6/9] selftests/bpf: extract test_loader->expect_msgs as a data structure
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (4 preceding siblings ...)
  2024-07-04 10:23 ` [RFC bpf-next v2 5/9] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
@ 2024-07-04 10:23 ` Eduard Zingerman
  2024-07-04 10:23 ` [RFC bpf-next v2 7/9] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:23 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman

Non-functional change: use a separate data structure to represented
expected messages in test_loader.
This would allow to use the same functionality for expected set of
disassembled instructions in the follow-up commit.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/test_loader.c | 81 ++++++++++++-----------
 1 file changed, 41 insertions(+), 40 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index 47508cf66e89..3f84903558dd 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -55,11 +55,15 @@ struct expect_msg {
 	regex_t regex;
 };
 
+struct expected_msgs {
+	struct expect_msg *patterns;
+	size_t cnt;
+};
+
 struct test_subspec {
 	char *name;
 	bool expect_failure;
-	struct expect_msg *expect_msgs;
-	size_t expect_msg_cnt;
+	struct expected_msgs expect_msgs;
 	int retval;
 	bool execute;
 };
@@ -96,44 +100,45 @@ void test_loader_fini(struct test_loader *tester)
 	free(tester->log_buf);
 }
 
-static void free_test_spec(struct test_spec *spec)
+static void free_msgs(struct expected_msgs *msgs)
 {
 	int i;
 
+	for (i = 0; i < msgs->cnt; i++)
+		if (msgs->patterns[i].regex_str)
+			regfree(&msgs->patterns[i].regex);
+	free(msgs->patterns);
+	msgs->patterns = NULL;
+	msgs->cnt = 0;
+}
+
+static void free_test_spec(struct test_spec *spec)
+{
 	/* Deallocate expect_msgs arrays. */
-	for (i = 0; i < spec->priv.expect_msg_cnt; i++)
-		if (spec->priv.expect_msgs[i].regex_str)
-			regfree(&spec->priv.expect_msgs[i].regex);
-	for (i = 0; i < spec->unpriv.expect_msg_cnt; i++)
-		if (spec->unpriv.expect_msgs[i].regex_str)
-			regfree(&spec->unpriv.expect_msgs[i].regex);
+	free_msgs(&spec->priv.expect_msgs);
+	free_msgs(&spec->unpriv.expect_msgs);
 
 	free(spec->priv.name);
 	free(spec->unpriv.name);
-	free(spec->priv.expect_msgs);
-	free(spec->unpriv.expect_msgs);
-
 	spec->priv.name = NULL;
 	spec->unpriv.name = NULL;
-	spec->priv.expect_msgs = NULL;
-	spec->unpriv.expect_msgs = NULL;
 }
 
-static int push_msg(const char *substr, const char *regex_str, struct test_subspec *subspec)
+static int push_msg(const char *substr, const char *regex_str, struct expected_msgs *msgs)
 {
 	void *tmp;
 	int regcomp_res;
 	char error_msg[100];
 	struct expect_msg *msg;
 
-	tmp = realloc(subspec->expect_msgs,
-		      (1 + subspec->expect_msg_cnt) * sizeof(struct expect_msg));
+	tmp = realloc(msgs->patterns,
+		      (1 + msgs->cnt) * sizeof(struct expect_msg));
 	if (!tmp) {
 		ASSERT_FAIL("failed to realloc memory for messages\n");
 		return -ENOMEM;
 	}
-	subspec->expect_msgs = tmp;
-	msg = &subspec->expect_msgs[subspec->expect_msg_cnt];
+	msgs->patterns = tmp;
+	msg = &msgs->patterns[msgs->cnt];
 
 	if (substr) {
 		msg->substr = substr;
@@ -150,7 +155,7 @@ static int push_msg(const char *substr, const char *regex_str, struct test_subsp
 		}
 	}
 
-	subspec->expect_msg_cnt += 1;
+	msgs->cnt += 1;
 	return 0;
 }
 
@@ -272,25 +277,25 @@ static int parse_test_spec(struct test_loader *tester,
 			spec->mode_mask |= UNPRIV;
 		} else if (str_has_pfx(s, TEST_TAG_EXPECT_MSG_PFX)) {
 			msg = s + sizeof(TEST_TAG_EXPECT_MSG_PFX) - 1;
-			err = push_msg(msg, NULL, &spec->priv);
+			err = push_msg(msg, NULL, &spec->priv.expect_msgs);
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= PRIV;
 		} else if (str_has_pfx(s, TEST_TAG_EXPECT_MSG_PFX_UNPRIV)) {
 			msg = s + sizeof(TEST_TAG_EXPECT_MSG_PFX_UNPRIV) - 1;
-			err = push_msg(msg, NULL, &spec->unpriv);
+			err = push_msg(msg, NULL, &spec->unpriv.expect_msgs);
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= UNPRIV;
 		} else if (str_has_pfx(s, TEST_TAG_EXPECT_REGEX_PFX)) {
 			msg = s + sizeof(TEST_TAG_EXPECT_REGEX_PFX) - 1;
-			err = push_msg(NULL, msg, &spec->priv);
+			err = push_msg(NULL, msg, &spec->priv.expect_msgs);
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= PRIV;
 		} else if (str_has_pfx(s, TEST_TAG_EXPECT_REGEX_PFX_UNPRIV)) {
 			msg = s + sizeof(TEST_TAG_EXPECT_REGEX_PFX_UNPRIV) - 1;
-			err = push_msg(NULL, msg, &spec->unpriv);
+			err = push_msg(NULL, msg, &spec->unpriv.expect_msgs);
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= UNPRIV;
@@ -387,11 +392,12 @@ static int parse_test_spec(struct test_loader *tester,
 			spec->unpriv.execute = spec->priv.execute;
 		}
 
-		if (!spec->unpriv.expect_msgs) {
-			for (i = 0; i < spec->priv.expect_msg_cnt; i++) {
-				struct expect_msg *msg = &spec->priv.expect_msgs[i];
+		if (spec->unpriv.expect_msgs.cnt == 0) {
+			for (i = 0; i < spec->priv.expect_msgs.cnt; i++) {
+				struct expect_msg *msg = &spec->priv.expect_msgs.patterns[i];
 
-				err = push_msg(msg->substr, msg->regex_str, &spec->unpriv);
+				err = push_msg(msg->substr, msg->regex_str,
+					       &spec->unpriv.expect_msgs);
 				if (err)
 					goto cleanup;
 			}
@@ -443,18 +449,14 @@ static void emit_verifier_log(const char *log_buf, bool force)
 	fprintf(stdout, "VERIFIER LOG:\n=============\n%s=============\n", log_buf);
 }
 
-static void validate_case(struct test_loader *tester,
-			  struct test_subspec *subspec,
-			  struct bpf_object *obj,
-			  struct bpf_program *prog,
-			  int load_err)
+static void validate_msgs(char *log_buf, struct expected_msgs *msgs)
 {
 	regmatch_t reg_match[1];
-	const char *log = tester->log_buf;
+	const char *log = log_buf;
 	int i, j, err;
 
-	for (i = 0; i < subspec->expect_msg_cnt; i++) {
-		struct expect_msg *msg = &subspec->expect_msgs[i];
+	for (i = 0; i < msgs->cnt; i++) {
+		struct expect_msg *msg = &msgs->patterns[i];
 		const char *match = NULL;
 
 		if (msg->substr) {
@@ -471,9 +473,9 @@ static void validate_case(struct test_loader *tester,
 
 		if (!ASSERT_OK_PTR(match, "expect_msg")) {
 			if (env.verbosity == VERBOSE_NONE)
-				emit_verifier_log(tester->log_buf, true /*force*/);
+				emit_verifier_log(log_buf, true /*force*/);
 			for (j = 0; j <= i; j++) {
-				msg = &subspec->expect_msgs[j];
+				msg = &msgs->patterns[j];
 				fprintf(stderr, "%s %s: '%s'\n",
 					j < i ? "MATCHED " : "EXPECTED",
 					msg->substr ? "SUBSTR" : " REGEX",
@@ -692,9 +694,8 @@ void run_subtest(struct test_loader *tester,
 			goto tobj_cleanup;
 		}
 	}
-
 	emit_verifier_log(tester->log_buf, false /*force*/);
-	validate_case(tester, subspec, tobj, tprog, err);
+	validate_msgs(tester->log_buf, &subspec->expect_msgs);
 
 	if (should_do_test_run(spec, subspec)) {
 		/* For some reason test_verifier executes programs
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC bpf-next v2 7/9] selftests/bpf: allow checking xlated programs in verifier_* tests
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (5 preceding siblings ...)
  2024-07-04 10:23 ` [RFC bpf-next v2 6/9] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
@ 2024-07-04 10:23 ` Eduard Zingerman
  2024-07-04 10:24 ` [RFC bpf-next v2 8/9] selftests/bpf: __arch_* macro to limit test cases to specific archs Eduard Zingerman
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:23 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman

Add a macro __xlated("...") for use with test_loader tests.

When such annotations are present for the test case:
- bpf_prog_get_info_by_fd() is used to get BPF program after all
  rewrites are applied by verifier.
- the program is disassembled and patterns specified in __xlated are
  searched for in the disassembly text.

__xlated matching follows the same mechanics as __msg:
each subsequent pattern is matched from the point where
previous pattern ended.

This allows to write tests like below, where the goal is to verify the
behavior of one of the of the transformations applied by verifier:

    SEC("raw_tp")
    __xlated("1: w0 = ")
    __xlated("2: r0 = &(void __percpu *)(r0)")
    __xlated("3: r0 = *(u32 *)(r0 +0)")
    __xlated("4: exit")
    __success __naked void simple(void)
    {
            asm volatile (
            "call %[bpf_get_smp_processor_id];"
            "exit;"
            :
            : __imm(bpf_get_smp_processor_id)
            : __clobber_all);
    }

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/progs/bpf_misc.h |  5 ++
 tools/testing/selftests/bpf/test_loader.c    | 82 +++++++++++++++++++-
 2 files changed, 84 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index 81097a3f15eb..a70939c7bc26 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -26,6 +26,9 @@
  *
  * __regex           Same as __msg, but using a regular expression.
  * __regex_unpriv    Same as __msg_unpriv but using a regular expression.
+ * __xlated          Expect a line in a disassembly log after verifier applies rewrites.
+ *                   Multiple __xlated attributes could be specified.
+ * __xlated_unpriv   Same as __xlated but for unprivileged mode.
  *
  * __success         Expect program load success in privileged mode.
  * __success_unpriv  Expect program load success in unprivileged mode.
@@ -63,11 +66,13 @@
  */
 #define __msg(msg)		__attribute__((btf_decl_tag("comment:test_expect_msg=" msg)))
 #define __regex(regex)		__attribute__((btf_decl_tag("comment:test_expect_regex=" regex)))
+#define __xlated(msg)		__attribute__((btf_decl_tag("comment:test_expect_xlated=" msg)))
 #define __failure		__attribute__((btf_decl_tag("comment:test_expect_failure")))
 #define __success		__attribute__((btf_decl_tag("comment:test_expect_success")))
 #define __description(desc)	__attribute__((btf_decl_tag("comment:test_description=" desc)))
 #define __msg_unpriv(msg)	__attribute__((btf_decl_tag("comment:test_expect_msg_unpriv=" msg)))
 #define __regex_unpriv(regex)	__attribute__((btf_decl_tag("comment:test_expect_regex_unpriv=" regex)))
+#define __xlated_unpriv(msg)	__attribute__((btf_decl_tag("comment:test_expect_xlated_unpriv=" msg)))
 #define __failure_unpriv	__attribute__((btf_decl_tag("comment:test_expect_failure_unpriv")))
 #define __success_unpriv	__attribute__((btf_decl_tag("comment:test_expect_success_unpriv")))
 #define __log_level(lvl)	__attribute__((btf_decl_tag("comment:test_log_level="#lvl)))
diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index 3f84903558dd..b44b6a2fc82c 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -7,6 +7,7 @@
 #include <bpf/btf.h>
 
 #include "autoconf_helper.h"
+#include "disasm_helpers.h"
 #include "unpriv_helpers.h"
 #include "cap_helpers.h"
 
@@ -19,10 +20,12 @@
 #define TEST_TAG_EXPECT_SUCCESS "comment:test_expect_success"
 #define TEST_TAG_EXPECT_MSG_PFX "comment:test_expect_msg="
 #define TEST_TAG_EXPECT_REGEX_PFX "comment:test_expect_regex="
+#define TEST_TAG_EXPECT_XLATED_PFX "comment:test_expect_xlated="
 #define TEST_TAG_EXPECT_FAILURE_UNPRIV "comment:test_expect_failure_unpriv"
 #define TEST_TAG_EXPECT_SUCCESS_UNPRIV "comment:test_expect_success_unpriv"
 #define TEST_TAG_EXPECT_MSG_PFX_UNPRIV "comment:test_expect_msg_unpriv="
 #define TEST_TAG_EXPECT_REGEX_PFX_UNPRIV "comment:test_expect_regex_unpriv="
+#define TEST_TAG_EXPECT_XLATED_PFX_UNPRIV "comment:test_expect_xlated_unpriv="
 #define TEST_TAG_LOG_LEVEL_PFX "comment:test_log_level="
 #define TEST_TAG_PROG_FLAGS_PFX "comment:test_prog_flags="
 #define TEST_TAG_DESCRIPTION_PFX "comment:test_description="
@@ -64,6 +67,7 @@ struct test_subspec {
 	char *name;
 	bool expect_failure;
 	struct expected_msgs expect_msgs;
+	struct expected_msgs expect_xlated;
 	int retval;
 	bool execute;
 };
@@ -117,6 +121,8 @@ static void free_test_spec(struct test_spec *spec)
 	/* Deallocate expect_msgs arrays. */
 	free_msgs(&spec->priv.expect_msgs);
 	free_msgs(&spec->unpriv.expect_msgs);
+	free_msgs(&spec->priv.expect_xlated);
+	free_msgs(&spec->unpriv.expect_xlated);
 
 	free(spec->priv.name);
 	free(spec->unpriv.name);
@@ -299,6 +305,18 @@ static int parse_test_spec(struct test_loader *tester,
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= UNPRIV;
+		} else if (str_has_pfx(s, TEST_TAG_EXPECT_XLATED_PFX)) {
+			msg = s + sizeof(TEST_TAG_EXPECT_XLATED_PFX) - 1;
+			err = push_msg(msg, NULL, &spec->priv.expect_xlated);
+			if (err)
+				goto cleanup;
+			spec->mode_mask |= PRIV;
+		} else if (str_has_pfx(s, TEST_TAG_EXPECT_XLATED_PFX_UNPRIV)) {
+			msg = s + sizeof(TEST_TAG_EXPECT_XLATED_PFX_UNPRIV) - 1;
+			err = push_msg(msg, NULL, &spec->unpriv.expect_xlated);
+			if (err)
+				goto cleanup;
+			spec->mode_mask |= UNPRIV;
 		} else if (str_has_pfx(s, TEST_TAG_RETVAL_PFX)) {
 			val = s + sizeof(TEST_TAG_RETVAL_PFX) - 1;
 			err = parse_retval(val, &spec->priv.retval, "__retval");
@@ -402,6 +420,16 @@ static int parse_test_spec(struct test_loader *tester,
 					goto cleanup;
 			}
 		}
+		if (spec->unpriv.expect_xlated.cnt == 0) {
+			for (i = 0; i < spec->priv.expect_xlated.cnt; i++) {
+				struct expect_msg *msg = &spec->priv.expect_xlated.patterns[i];
+
+				err = push_msg(msg->substr, msg->regex_str,
+					       &spec->unpriv.expect_xlated);
+				if (err)
+					goto cleanup;
+			}
+		}
 	}
 
 	spec->valid = true;
@@ -449,7 +477,15 @@ static void emit_verifier_log(const char *log_buf, bool force)
 	fprintf(stdout, "VERIFIER LOG:\n=============\n%s=============\n", log_buf);
 }
 
-static void validate_msgs(char *log_buf, struct expected_msgs *msgs)
+static void emit_xlated(const char *xlated, bool force)
+{
+	if (!force && env.verbosity == VERBOSE_NONE)
+		return;
+	fprintf(stdout, "XLATED:\n=============\n%s=============\n", xlated);
+}
+
+static void validate_msgs(char *log_buf, struct expected_msgs *msgs,
+			  void (*emit_fn)(const char *buf, bool force))
 {
 	regmatch_t reg_match[1];
 	const char *log = log_buf;
@@ -473,7 +509,7 @@ static void validate_msgs(char *log_buf, struct expected_msgs *msgs)
 
 		if (!ASSERT_OK_PTR(match, "expect_msg")) {
 			if (env.verbosity == VERBOSE_NONE)
-				emit_verifier_log(log_buf, true /*force*/);
+				emit_fn(log_buf, true /*force*/);
 			for (j = 0; j <= i; j++) {
 				msg = &msgs->patterns[j];
 				fprintf(stderr, "%s %s: '%s'\n",
@@ -610,6 +646,37 @@ static bool should_do_test_run(struct test_spec *spec, struct test_subspec *subs
 	return true;
 }
 
+/* Get a disassembly of BPF program after verifier applies all rewrites */
+static int get_xlated_program_text(int prog_fd, char *text, size_t text_sz)
+{
+	struct bpf_insn *insn_start = NULL, *insn, *insn_end;
+	__u32 insns_cnt = 0, i;
+	char buf[64];
+	FILE *out = NULL;
+	int err;
+
+	err = get_xlated_program(prog_fd, &insn_start, &insns_cnt);
+	if (!ASSERT_OK(err, "get_xlated_program"))
+		goto out;
+	out = fmemopen(text, text_sz, "w");
+	if (!ASSERT_OK_PTR(out, "open_memstream"))
+		goto out;
+	insn_end = insn_start + insns_cnt;
+	insn = insn_start;
+	while (insn < insn_end) {
+		i = insn - insn_start;
+		insn = disasm_insn(insn, buf, sizeof(buf));
+		fprintf(out, "%d: %s\n", i, buf);
+	}
+	fflush(out);
+
+out:
+	free(insn_start);
+	if (out)
+		fclose(out);
+	return err;
+}
+
 /* this function is forced noinline and has short generic name to look better
  * in test_progs output (in case of a failure)
  */
@@ -695,7 +762,16 @@ void run_subtest(struct test_loader *tester,
 		}
 	}
 	emit_verifier_log(tester->log_buf, false /*force*/);
-	validate_msgs(tester->log_buf, &subspec->expect_msgs);
+	validate_msgs(tester->log_buf, &subspec->expect_msgs, emit_verifier_log);
+
+	if (subspec->expect_xlated.cnt) {
+		err = get_xlated_program_text(bpf_program__fd(tprog),
+					      tester->log_buf, tester->log_buf_sz);
+		if (err)
+			goto tobj_cleanup;
+		emit_xlated(tester->log_buf, false /*force*/);
+		validate_msgs(tester->log_buf, &subspec->expect_xlated, emit_xlated);
+	}
 
 	if (should_do_test_run(spec, subspec)) {
 		/* For some reason test_verifier executes programs
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC bpf-next v2 8/9] selftests/bpf: __arch_* macro to limit test cases to specific archs
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (6 preceding siblings ...)
  2024-07-04 10:23 ` [RFC bpf-next v2 7/9] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
@ 2024-07-04 10:24 ` Eduard Zingerman
  2024-07-09 23:50   ` Andrii Nakryiko
  2024-07-04 10:24 ` [RFC bpf-next v2 9/9] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:24 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman

Add annotations __arch_x86_64, __arch_arm64, __arch_riscv64
to specify on which architecture the test case should be tested.
Several __arch_* annotations could be specified at once.
When test case is not run on current arch it is marked as skipped.

For example, the following would be tested only on arm64 and riscv64:

  SEC("raw_tp")
  __arch_arm64
  __arch_riscv64
  __xlated("1: *(u64 *)(r10 - 16) = r1")
  __xlated("2: call")
  __xlated("3: r1 = *(u64 *)(r10 - 16);")
  __success
  __naked void canary_arm64_riscv64(void)
  {
  	asm volatile (
  	"r1 = 1;"
  	"*(u64 *)(r10 - 16) = r1;"
  	"call %[bpf_get_smp_processor_id];"
  	"r1 = *(u64 *)(r10 - 16);"
  	"exit;"
  	:
  	: __imm(bpf_get_smp_processor_id)
  	: __clobber_all);
  }

On x86 it would be skipped:

  #467/2   verifier_nocsr/canary_arm64_riscv64:SKIP

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/progs/bpf_misc.h |  8 ++++
 tools/testing/selftests/bpf/test_loader.c    | 43 ++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index a70939c7bc26..a225cd87897c 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -63,6 +63,10 @@
  * __auxiliary         Annotated program is not a separate test, but used as auxiliary
  *                     for some other test cases and should always be loaded.
  * __auxiliary_unpriv  Same, but load program in unprivileged mode.
+ *
+ * __arch_*          Specify on which architecture the test case should be tested.
+ *                   Several __arch_* annotations could be specified at once.
+ *                   When test case is not run on current arch it is marked as skipped.
  */
 #define __msg(msg)		__attribute__((btf_decl_tag("comment:test_expect_msg=" msg)))
 #define __regex(regex)		__attribute__((btf_decl_tag("comment:test_expect_regex=" regex)))
@@ -82,6 +86,10 @@
 #define __auxiliary		__attribute__((btf_decl_tag("comment:test_auxiliary")))
 #define __auxiliary_unpriv	__attribute__((btf_decl_tag("comment:test_auxiliary_unpriv")))
 #define __btf_path(path)	__attribute__((btf_decl_tag("comment:test_btf_path=" path)))
+#define __arch(arch)		__attribute__((btf_decl_tag("comment:test_arch=" arch)))
+#define __arch_x86_64		__arch("X86_64")
+#define __arch_arm64		__arch("ARM64")
+#define __arch_riscv64		__arch("RISCV64")
 
 /* Convenience macro for use with 'asm volatile' blocks */
 #define __naked __attribute__((naked))
diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index b44b6a2fc82c..97befd720541 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -34,6 +34,7 @@
 #define TEST_TAG_AUXILIARY "comment:test_auxiliary"
 #define TEST_TAG_AUXILIARY_UNPRIV "comment:test_auxiliary_unpriv"
 #define TEST_BTF_PATH "comment:test_btf_path="
+#define TEST_TAG_ARCH "comment:test_arch="
 
 /* Warning: duplicated in bpf_misc.h */
 #define POINTER_VALUE	0xcafe4all
@@ -80,6 +81,7 @@ struct test_spec {
 	int log_level;
 	int prog_flags;
 	int mode_mask;
+	int arch_mask;
 	bool auxiliary;
 	bool valid;
 };
@@ -213,6 +215,12 @@ static void update_flags(int *flags, int flag, bool clear)
 		*flags |= flag;
 }
 
+enum arch {
+	ARCH_X86_64	= 0x1,
+	ARCH_ARM64	= 0x2,
+	ARCH_RISCV64	= 0x4,
+};
+
 /* Uses btf_decl_tag attributes to describe the expected test
  * behavior, see bpf_misc.h for detailed description of each attribute
  * and attribute combinations.
@@ -226,6 +234,7 @@ static int parse_test_spec(struct test_loader *tester,
 	bool has_unpriv_result = false;
 	bool has_unpriv_retval = false;
 	int func_id, i, err = 0;
+	u32 arch_mask = 0;
 	struct btf *btf;
 
 	memset(spec, 0, sizeof(*spec));
@@ -364,11 +373,26 @@ static int parse_test_spec(struct test_loader *tester,
 					goto cleanup;
 				update_flags(&spec->prog_flags, flags, clear);
 			}
+		} else if (str_has_pfx(s, TEST_TAG_ARCH)) {
+			val = s + sizeof(TEST_TAG_ARCH) - 1;
+			if (strcmp(val, "X86_64") == 0) {
+				arch_mask |= ARCH_X86_64;
+			} else if (strcmp(val, "ARM64") == 0) {
+				arch_mask |= ARCH_ARM64;
+			} else if (strcmp(val, "RISCV64") == 0) {
+				arch_mask |= ARCH_RISCV64;
+			} else {
+				PRINT_FAIL("bad arch spec: '%s'", val);
+				err = -EINVAL;
+				goto cleanup;
+			}
 		} else if (str_has_pfx(s, TEST_BTF_PATH)) {
 			spec->btf_custom_path = s + sizeof(TEST_BTF_PATH) - 1;
 		}
 	}
 
+	spec->arch_mask = arch_mask;
+
 	if (spec->mode_mask == 0)
 		spec->mode_mask = PRIV;
 
@@ -677,6 +701,20 @@ static int get_xlated_program_text(int prog_fd, char *text, size_t text_sz)
 	return err;
 }
 
+static bool run_on_current_arch(int arch_mask)
+{
+	if (arch_mask == 0)
+		return true;
+#if defined(__x86_64__)
+	return !!(arch_mask & ARCH_X86_64);
+#elif defined(__aarch64__)
+	return !!(arch_mask & ARCH_ARM64);
+#elif defined(__riscv) && __riscv_xlen == 64
+	return !!(arch_mask & ARCH_RISCV64);
+#endif
+	return false;
+}
+
 /* this function is forced noinline and has short generic name to look better
  * in test_progs output (in case of a failure)
  */
@@ -701,6 +739,11 @@ void run_subtest(struct test_loader *tester,
 	if (!test__start_subtest(subspec->name))
 		return;
 
+	if (!run_on_current_arch(spec->arch_mask)) {
+		test__skip();
+		return;
+	}
+
 	if (unpriv) {
 		if (!can_execute_unpriv(tester, spec)) {
 			test__skip();
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC bpf-next v2 9/9] selftests/bpf: test no_caller_saved_registers spill/fill removal
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (7 preceding siblings ...)
  2024-07-04 10:24 ` [RFC bpf-next v2 8/9] selftests/bpf: __arch_* macro to limit test cases to specific archs Eduard Zingerman
@ 2024-07-04 10:24 ` Eduard Zingerman
  2024-07-08 11:44 ` [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Puranjay Mohan
  2024-07-10  1:18 ` Alexei Starovoitov
  10 siblings, 0 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-04 10:24 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, puranjay,
	jose.marchesi, Eduard Zingerman

Tests for no_caller_saved_registers processing logic
(see verifier.c:match_and_mark_nocsr_pattern()):
- a canary positive test case;
- a canary test case for arm64 and riscv64;
- various tests with broken patterns;
- tests with read/write fixed/varying stack access that violate nocsr
  stack access contract;
- tests with multiple subprograms.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/prog_tests/verifier.c       |   2 +
 .../selftests/bpf/progs/verifier_nocsr.c      | 521 ++++++++++++++++++
 2 files changed, 523 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c

diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index 6816ff064516..8ca306c28e62 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -53,6 +53,7 @@
 #include "verifier_movsx.skel.h"
 #include "verifier_netfilter_ctx.skel.h"
 #include "verifier_netfilter_retcode.skel.h"
+#include "verifier_nocsr.skel.h"
 #include "verifier_precision.skel.h"
 #include "verifier_prevent_map_lookup.skel.h"
 #include "verifier_raw_stack.skel.h"
@@ -171,6 +172,7 @@ void test_verifier_meta_access(void)          { RUN(verifier_meta_access); }
 void test_verifier_movsx(void)                 { RUN(verifier_movsx); }
 void test_verifier_netfilter_ctx(void)        { RUN(verifier_netfilter_ctx); }
 void test_verifier_netfilter_retcode(void)    { RUN(verifier_netfilter_retcode); }
+void test_verifier_nocsr(void)                { RUN(verifier_nocsr); }
 void test_verifier_precision(void)            { RUN(verifier_precision); }
 void test_verifier_prevent_map_lookup(void)   { RUN(verifier_prevent_map_lookup); }
 void test_verifier_raw_stack(void)            { RUN(verifier_raw_stack); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_nocsr.c b/tools/testing/selftests/bpf/progs/verifier_nocsr.c
new file mode 100644
index 000000000000..4e767d768f1c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_nocsr.c
@@ -0,0 +1,521 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("4: r5 = 5")
+__xlated("5: w0 = ")
+__xlated("6: r0 = &(void __percpu *)(r0)")
+__xlated("7: r0 = *(u32 *)(r0 +0)")
+__xlated("8: exit")
+__success
+__naked void simple(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"r2 = 2;"
+	"r3 = 3;"
+	"r4 = 4;"
+	"r5 = 5;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"*(u64 *)(r10 - 24) = r2;"
+	"*(u64 *)(r10 - 32) = r3;"
+	"*(u64 *)(r10 - 40) = r4;"
+	"*(u64 *)(r10 - 48) = r5;"
+	"call %[bpf_get_smp_processor_id];"
+	"r5 = *(u64 *)(r10 - 48);"
+	"r4 = *(u64 *)(r10 - 40);"
+	"r3 = *(u64 *)(r10 - 32);"
+	"r2 = *(u64 *)(r10 - 24);"
+	"r1 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+/* The logic for detecting and verifying nocsr pattern is the same for
+ * any arch, however x86 differs from arm64 or riscv64 in a way
+ * bpf_get_smp_processor_id is rewritten:
+ * - on x86 it is done by verifier
+ * - on arm64 and riscv64 it is done by jit
+ *
+ * Which leads to different xlated patterns for different archs:
+ * - on x86 the call is expanded as 3 instructions
+ * - on arm64 and riscv64 the call remains as is
+ *   (but spills/fills are still removed)
+ *
+ * It is really desirable to check instruction indexes in the xlated
+ * patterns, so add this canary test to check that function rewrite by
+ * jit is correctly processed by nocsr logic, keep the rest of the
+ * tests as x86.
+ */
+SEC("raw_tp")
+__arch_arm64
+__arch_riscv64
+__xlated("0: r1 = 1")
+__xlated("1: call bpf_get_smp_processor_id")
+__xlated("2: exit")
+__success
+__naked void canary_arm64_riscv64(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: r0 = &(void __percpu *)(r0)")
+__xlated("3: exit")
+__success
+__naked void canary_zero_spills(void)
+{
+	asm volatile (
+	"call %[bpf_get_smp_processor_id];"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: *(u64 *)(r10 -16) = r1")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r2 = *(u64 *)(r10 -16)")
+__success
+__naked void wrong_reg_in_pattern1(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: *(u64 *)(r10 -16) = r6")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r6 = *(u64 *)(r10 -16)")
+__success
+__naked void wrong_reg_in_pattern2(void)
+{
+	asm volatile (
+	"r6 = 1;"
+	"*(u64 *)(r10 - 16) = r6;"
+	"call %[bpf_get_smp_processor_id];"
+	"r6 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: *(u64 *)(r10 -16) = r0")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r0 = *(u64 *)(r10 -16)")
+__success
+__naked void wrong_reg_in_pattern3(void)
+{
+	asm volatile (
+	"r0 = 1;"
+	"*(u64 *)(r10 - 16) = r0;"
+	"call %[bpf_get_smp_processor_id];"
+	"r0 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("2: *(u64 *)(r2 -16) = r1")
+__xlated("4: r0 = &(void __percpu *)(r0)")
+__xlated("6: r1 = *(u64 *)(r10 -16)")
+__success
+__naked void wrong_base_in_pattern(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"r2 = r10;"
+	"*(u64 *)(r2 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: *(u64 *)(r10 -16) = r1")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r2 = 1")
+__success
+__naked void wrong_insn_in_pattern(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = 1;"
+	"r1 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("2: *(u64 *)(r10 -16) = r1")
+__xlated("4: r0 = &(void __percpu *)(r0)")
+__xlated("6: r1 = *(u64 *)(r10 -8)")
+__success
+__naked void wrong_off_in_pattern1(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: *(u32 *)(r10 -4) = r1")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r1 = *(u32 *)(r10 -4)")
+__success
+__naked void wrong_off_in_pattern2(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u32 *)(r10 - 4) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u32 *)(r10 - 4);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: *(u32 *)(r10 -16) = r1")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r1 = *(u32 *)(r10 -16)")
+__success
+__naked void wrong_size_in_pattern(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u32 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u32 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("2: *(u32 *)(r10 -8) = r1")
+__xlated("4: r0 = &(void __percpu *)(r0)")
+__xlated("6: r1 = *(u32 *)(r10 -8)")
+__success
+__naked void partial_pattern(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"r2 = 2;"
+	"*(u32 *)(r10 - 8) = r1;"
+	"*(u64 *)(r10 - 16) = r2;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = *(u64 *)(r10 - 16);"
+	"r1 = *(u32 *)(r10 - 8);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("0: r1 = 1")
+__xlated("1: r2 = 2")
+/* not patched, spills for -8, -16 not removed */
+__xlated("2: *(u64 *)(r10 -8) = r1")
+__xlated("3: *(u64 *)(r10 -16) = r2")
+__xlated("5: r0 = &(void __percpu *)(r0)")
+__xlated("7: r2 = *(u64 *)(r10 -16)")
+__xlated("8: r1 = *(u64 *)(r10 -8)")
+/* patched, spills for -16, -24 removed */
+__xlated("10: r0 = &(void __percpu *)(r0)")
+__xlated("12: exit")
+__success
+__naked void min_stack_offset(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"r2 = 2;"
+	/* this call won't be patched */
+	"*(u64 *)(r10 - 8) = r1;"
+	"*(u64 *)(r10 - 16) = r2;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = *(u64 *)(r10 - 16);"
+	"r1 = *(u64 *)(r10 - 8);"
+	/* this call would be patched */
+	"*(u64 *)(r10 - 16) = r1;"
+	"*(u64 *)(r10 - 24) = r2;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = *(u64 *)(r10 - 24);"
+	"r1 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: *(u64 *)(r10 -8) = r1")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r1 = *(u64 *)(r10 -8)")
+__success
+__naked void bad_fixed_read(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"r1 = r10;"
+	"r1 += -8;"
+	"r1 = *(u64 *)(r1 - 0);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: *(u64 *)(r10 -8) = r1")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r1 = *(u64 *)(r10 -8)")
+__success
+__naked void bad_fixed_write(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"r1 = r10;"
+	"r1 += -8;"
+	"*(u64 *)(r1 - 0) = r1;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("6: *(u64 *)(r10 -16) = r1")
+__xlated("8: r0 = &(void __percpu *)(r0)")
+__xlated("10: r1 = *(u64 *)(r10 -16)")
+__success
+__naked void bad_varying_read(void)
+{
+	asm volatile (
+	"r6 = *(u64 *)(r1 + 0);" /* random scalar value */
+	"r6 &= 0x7;"		 /* r6 range [0..7] */
+	"r6 += 0x2;"		 /* r6 range [2..9] */
+	"r7 = 0;"
+	"r7 -= r6;"		 /* r7 range [-9..-2] */
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 16);"
+	"r1 = r10;"
+	"r1 += r7;"
+	"r1 = *(u8 *)(r1 - 0);" /* touches slot [-16..-9] where spills are stored */
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("6: *(u64 *)(r10 -16) = r1")
+__xlated("8: r0 = &(void __percpu *)(r0)")
+__xlated("10: r1 = *(u64 *)(r10 -16)")
+__success
+__naked void bad_varying_write(void)
+{
+	asm volatile (
+	"r6 = *(u64 *)(r1 + 0);" /* random scalar value */
+	"r6 &= 0x7;"		 /* r6 range [0..7] */
+	"r6 += 0x2;"		 /* r6 range [2..9] */
+	"r7 = 0;"
+	"r7 -= r6;"		 /* r7 range [-9..-2] */
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 16);"
+	"r1 = r10;"
+	"r1 += r7;"
+	"*(u8 *)(r1 - 0) = r7;" /* touches slot [-16..-9] where spills are stored */
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+__xlated("1: *(u64 *)(r10 -8) = r1")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r1 = *(u64 *)(r10 -8)")
+__success
+__naked void bad_write_in_subprog(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"r1 = r10;"
+	"r1 += -8;"
+	"call bad_write_in_subprog_aux;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+__used
+__naked static void bad_write_in_subprog_aux(void)
+{
+	asm volatile (
+	"r0 = 1;"
+	"*(u64 *)(r1 - 0) = r0;"	/* invalidates nocsr contract for caller: */
+	"exit;"				/* caller stack at -8 used outside of the pattern */
+	::: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+/* main, not patched */
+__xlated("1: *(u64 *)(r10 -8) = r1")
+__xlated("3: r0 = &(void __percpu *)(r0)")
+__xlated("5: r1 = *(u64 *)(r10 -8)")
+__xlated("9: call pc+1")
+__xlated("10: exit")
+/* subprogram, patched */
+__xlated("11: r1 = 1")
+__xlated("13: r0 = &(void __percpu *)(r0)")
+__xlated("15: exit")
+__success
+__naked void invalidate_one_subprog(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"r1 = r10;"
+	"r1 += -8;"
+	"r1 = *(u64 *)(r1 - 0);"
+	"call invalidate_one_subprog_aux;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+__used
+__naked static void invalidate_one_subprog_aux(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__arch_x86_64
+/* main */
+__xlated("0: r1 = 1")
+__xlated("2: r0 = &(void __percpu *)(r0)")
+__xlated("4: call pc+1")
+__xlated("5: exit")
+/* subprogram */
+__xlated("6: r1 = 1")
+__xlated("8: r0 = &(void __percpu *)(r0)")
+__xlated("10: *(u64 *)(r10 -16) = r1")
+__xlated("11: exit")
+__success
+__naked void subprogs_use_independent_offsets(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 16);"
+	"call subprogs_use_independent_offsets_aux;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+__used
+__naked static void subprogs_use_independent_offsets_aux(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 24) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 24);"
+	"*(u64 *)(r10 - 16) = r1;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (8 preceding siblings ...)
  2024-07-04 10:24 ` [RFC bpf-next v2 9/9] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
@ 2024-07-08 11:44 ` Puranjay Mohan
  2024-07-08 17:29   ` Eduard Zingerman
  2024-07-10  1:18 ` Alexei Starovoitov
  10 siblings, 1 reply; 36+ messages in thread
From: Puranjay Mohan @ 2024-07-08 11:44 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

[-- Attachment #1: Type: text/plain, Size: 5661 bytes --]

Eduard Zingerman <eddyz87@gmail.com> writes:

> This RFC seeks to allow using no_caller_saved_registers gcc/clang
> attribute with some BPF helper functions (and kfuncs in the future).
>
> As documented in [1], this attribute means that function scratches
> only some of the caller saved registers defined by ABI.
> For BPF the set of such registers could be defined as follows:
> - R0 is scratched only if function is non-void;
> - R1-R5 are scratched only if corresponding parameter type is defined
>   in the function prototype.
>
> The goal of the RFC is to implement no_caller_saved_registers
> (nocsr for short) in a backwards compatible manner:
> - for kernels that support the feature, gain some performance boost
>   from better register allocation;
> - for kernels that don't support the feature, allow programs execution
>   with minor performance losses.
>
> To achieve this, use a scheme suggested by Alexei Starovoitov:
> - for nocsr calls clang allocates registers as-if relevant r0-r5
>   registers are not scratched by the call;
> - as a post-processing step, clang visits each nocsr call and adds
>   spill/fill for every live r0-r5;
> - stack offsets used for spills/fills are allocated as minimal
>   stack offsets in whole function and are not used for any other
>   purposes;
> - when kernel loads a program, it looks for such patterns
>   (nocsr function surrounded by spills/fills) and checks if
>   spill/fill stack offsets are used exclusively in nocsr patterns;
> - if so, and if current JIT inlines the call to the nocsr function
>   (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
> - when old kernel loads a program, presence of spill/fill pairs
>   keeps BPF program valid, albeit slightly less efficient.
>
> Corresponding clang/llvm changes are available in [2].
>
> The patch-set uses bpf_get_smp_processor_id() function as a canary,
> making it the first helper with nocsr attribute.
>
> For example, consider the following program:
>
>   #define __no_csr __attribute__((no_caller_saved_registers))
>   #define SEC(name) __attribute__((section(name), used))
>   #define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)
>
>   typedef unsigned int __u32;
>
>   static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
>   static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;
>
>   SEC("raw_tp")
>   int test(void *ctx)
>   {
>           __u32 task = bpf_get_smp_processor_id();
>   	bpf_printk("ctx=%p, smp=%d", ctx, task);
>   	return 0;
>   }
>
>   char _license[] SEC("license") = "GPL";
>
> Compiled (using [2]) as follows:
>
>   $ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
>   $ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
>     ...
>   3rd parameter for printk call     removable spill/fill pair
>   .--- 0:       r3 = r1                             |
> ; |       __u32 task = bpf_get_smp_processor_id();  |
>   |    1:       *(u64 *)(r10 - 0x8) = r3 <----------|
>   |    2:       call 0x8                            |
>   |    3:       r3 = *(u64 *)(r10 - 0x8) <----------'
> ; |     bpf_printk("ctx=%p, smp=%d", ctx, task);
>   |    4:       r1 = 0x0 ll
>   |    6:       r2 = 0xf
>   |    7:       r4 = r0
>   '--> 8:       call 0x6
> ;       return 0;
>        9:       r0 = 0x0
>       10:       exit
>
> Here is how the program looks after verifier processing:
>
>   # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
>   # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
>   int test(void * ctx):
>   ; int test(void *ctx)
>      0: (bf) r3 = r1               <--------- 3rd printk parameter
>   ; __u32 task = bpf_get_smp_processor_id();
>      1: (b4) w0 = 197132           <--------- inlined helper call,
>      2: (bf) r0 = r0               <--------- spill/fill pair removed
>      3: (61) r0 = *(u32 *)(r0 +0)  <---------
>   ; bpf_printk("ctx=%p, smp=%d", ctx, task);
>      4: (18) r1 = map[id:13][0]+0
>      6: (b7) r2 = 15
>      7: (bf) r4 = r0
>      8: (85) call bpf_trace_printk#-125920
>   ; return 0;
>      9: (b7) r0 = 0
>     10: (95) exit
>
> [1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
> [2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers
>
> Change list:
> - v1 -> v2:
>   - assume that functions inlined by either jit or verifier
>     conform to no_caller_saved_registers contract (Andrii, Puranjay);
>   - allow nocsr rewrite for bpf_get_smp_processor_id()
>     on arm64 and riscv64 architectures (Puranjay);
>   - __arch_{x86_64,arm64,riscv64} macro for test_loader;
>   - moved remove_nocsr_spills_fills() inside do_misc_fixups() (Andrii);
>   - moved nocsr pattern detection from check_cfg() to a separate pass
>     (Andrii);
>   - various stylistic/correctness changes according to Andrii's
>     comments.
>
> Revisions:
> - v1 https://lore.kernel.org/bpf/20240629094733.3863850-1-eddyz87@gmail.com/
>
> Eduard Zingerman (9):
>   bpf: add a get_helper_proto() utility function
>   bpf: no_caller_saved_registers attribute for helper calls
>   bpf, x86, riscv, arm: no_caller_saved_registers for
>     bpf_get_smp_processor_id()

Ran the selftest on riscv-64 on qemu:

    root@rv-tester:~/bpf# uname -a
    Linux rv-tester 6.10.0-rc2 #27 SMP Mon Jul  8 09:58:20 UTC 2024 riscv64 riscv64 riscv64 GNU/Linux
    root@rv-tester:~/bpf# ./test_progs -a verifier_nocsr/canary_arm64_riscv64
    #496/2   verifier_nocsr/canary_arm64_riscv64:OK
    #496     verifier_nocsr:OK
    Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Tested-by: Puranjay Mohan <puranjay@kernel.org> #riscv64

Thanks,
Puranjay

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 255 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls
  2024-07-08 11:44 ` [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Puranjay Mohan
@ 2024-07-08 17:29   ` Eduard Zingerman
  0 siblings, 0 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-08 17:29 UTC (permalink / raw)
  To: Puranjay Mohan, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Mon, 2024-07-08 at 11:44 +0000, Puranjay Mohan wrote:

[...]

> Ran the selftest on riscv-64 on qemu:
> 
>     root@rv-tester:~/bpf# uname -a
>     Linux rv-tester 6.10.0-rc2 #27 SMP Mon Jul  8 09:58:20 UTC 2024 riscv64 riscv64 riscv64 GNU/Linux
>     root@rv-tester:~/bpf# ./test_progs -a verifier_nocsr/canary_arm64_riscv64
>     #496/2   verifier_nocsr/canary_arm64_riscv64:OK
>     #496     verifier_nocsr:OK
>     Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
> 
> Tested-by: Puranjay Mohan <puranjay@kernel.org> #riscv64

Great, thank you for testing!

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-04 10:23 ` [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
@ 2024-07-09 23:42   ` Andrii Nakryiko
  2024-07-10  3:00     ` Eduard Zingerman
  2024-07-10  9:46     ` Eduard Zingerman
  2024-07-10  1:09   ` Alexei Starovoitov
  1 sibling, 2 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-09 23:42 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Thu, Jul 4, 2024 at 3:24 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> GCC and LLVM define a no_caller_saved_registers function attribute.
> This attribute means that function scratches only some of
> the caller saved registers defined by ABI.
> For BPF the set of such registers could be defined as follows:
> - R0 is scratched only if function is non-void;
> - R1-R5 are scratched only if corresponding parameter type is defined
>   in the function prototype.
>
> This commit introduces flag bpf_func_prot->allow_nocsr.
> If this flag is set for some helper function, verifier assumes that
> it follows no_caller_saved_registers calling convention.
>
> The contract between kernel and clang allows to simultaneously use
> such functions and maintain backwards compatibility with old
> kernels that don't understand no_caller_saved_registers calls
> (nocsr for short):
>
> - clang generates a simple pattern for nocsr calls, e.g.:
>
>     r1 = 1;
>     r2 = 2;
>     *(u64 *)(r10 - 8)  = r1;
>     *(u64 *)(r10 - 16) = r2;
>     call %[to_be_inlined]
>     r2 = *(u64 *)(r10 - 16);
>     r1 = *(u64 *)(r10 - 8);
>     r0 = r1;
>     r0 += r2;
>     exit;
>
> - kernel removes unnecessary spills and fills, if called function is
>   inlined by verifier or current JIT (with assumption that patch
>   inserted by verifier or JIT honors nocsr contract, e.g. does not
>   scratch r3-r5 for the example above), e.g. the code above would be
>   transformed to:
>
>     r1 = 1;
>     r2 = 2;
>     call %[to_be_inlined]
>     r0 = r1;
>     r0 += r2;
>     exit;
>
> Technically, the transformation is split into the following phases:
> - function mark_nocsr_pattern_patterns(), called from bpf_check()
>   searches and marks potential patterns in instruction auxiliary data;
> - upon stack read or write access,
>   function check_nocsr_stack_contract() is used to verify if
>   stack offsets, presumably reserved for nocsr patterns, are used
>   only from those patterns;
> - function do_misc_fixups(), called from bpf_check(),
>   applies the rewrite for valid patterns.
>
> See comment in mark_nocsr_pattern_for_call() for more details.
>
> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  include/linux/bpf.h          |   6 +
>  include/linux/bpf_verifier.h |  14 ++
>  kernel/bpf/verifier.c        | 300 ++++++++++++++++++++++++++++++++++-
>  3 files changed, 314 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 960780ef04e1..391e19c5cd68 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -807,6 +807,12 @@ struct bpf_func_proto {
>         bool gpl_only;
>         bool pkt_access;
>         bool might_sleep;
> +       /* set to true if helper follows contract for gcc/llvm
> +        * attribute no_caller_saved_registers:
> +        * - void functions do not scratch r0
> +        * - functions taking N arguments scratch only registers r1-rN
> +        */
> +       bool allow_nocsr;
>         enum bpf_return_type ret_type;
>         union {
>                 struct {
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 2b54e25d2364..735ae0901b3d 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -585,6 +585,15 @@ struct bpf_insn_aux_data {
>          * accepts callback function as a parameter.
>          */
>         bool calls_callback;
> +       /* true if STX or LDX instruction is a part of a spill/fill
> +        * pattern for a no_caller_saved_registers call.
> +        */
> +       u8 nocsr_pattern:1;
> +       /* for CALL instructions, a number of spill/fill pairs in the
> +        * no_caller_saved_registers pattern.
> +        */
> +       u8 nocsr_spills_num:3;

despite bitfields this will extend bpf_insn_aux_data by 8 bytes. there
are 2 bytes of padding after alu_state, let's put this there.

And let's not add bitfields unless absolutely necessary (this can be
always done later).

> +
>  };
>
>  #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
> @@ -641,6 +650,11 @@ struct bpf_subprog_info {
>         u32 linfo_idx; /* The idx to the main_prog->aux->linfo */
>         u16 stack_depth; /* max. stack depth used by this function */
>         u16 stack_extra;
> +       /* stack depth after which slots reserved for
> +        * no_caller_saved_registers spills/fills start,
> +        * value <= nocsr_stack_off belongs to the spill/fill area.

are you sure about <= (not <), it seems like nocsr_stack_off is
exclusive right bound for nocsr stack region (it would be good to call
this out explicitly here)

> +        */
> +       s16 nocsr_stack_off;
>         bool has_tail_call: 1;
>         bool tail_call_reachable: 1;
>         bool has_ld_abs: 1;
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 4869f1fb0a42..d16a249b59ad 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -2471,16 +2471,37 @@ static int cmp_subprogs(const void *a, const void *b)
>                ((struct bpf_subprog_info *)b)->start;
>  }
>
> -static int find_subprog(struct bpf_verifier_env *env, int off)
> +/* Find subprogram that contains instruction at 'off' */
> +static int find_containing_subprog(struct bpf_verifier_env *env, int off)
>  {
> -       struct bpf_subprog_info *p;
> +       struct bpf_subprog_info *vals = env->subprog_info;
> +       int l, r, m;
>
> -       p = bsearch(&off, env->subprog_info, env->subprog_cnt,
> -                   sizeof(env->subprog_info[0]), cmp_subprogs);
> -       if (!p)
> +       if (off >= env->prog->len || off < 0 || env->subprog_cnt == 0)
>                 return -ENOENT;
> -       return p - env->subprog_info;
>
> +       l = 0;
> +       m = 0;

no need to initialize m

> +       r = env->subprog_cnt - 1;
> +       while (l < r) {
> +               m = l + (r - l + 1) / 2;
> +               if (vals[m].start <= off)
> +                       l = m;
> +               else
> +                       r = m - 1;
> +       }
> +       return l;
> +}

I love it, looks great :)

> +
> +/* Find subprogram that starts exactly at 'off' */
> +static int find_subprog(struct bpf_verifier_env *env, int off)
> +{
> +       int idx;
> +
> +       idx = find_containing_subprog(env, off);
> +       if (idx < 0 || env->subprog_info[idx].start != off)
> +               return -ENOENT;
> +       return idx;
>  }
>
>  static int add_subprog(struct bpf_verifier_env *env, int off)
> @@ -4501,6 +4522,23 @@ static int get_reg_width(struct bpf_reg_state *reg)
>         return fls64(reg->umax_value);
>  }
>
> +/* See comment for mark_nocsr_pattern_for_call() */
> +static void check_nocsr_stack_contract(struct bpf_verifier_env *env, struct bpf_func_state *state,
> +                                      int insn_idx, int off)
> +{
> +       struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
> +       struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
> +
> +       if (subprog->nocsr_stack_off <= off || aux->nocsr_pattern)
> +               return;

can helper call instruction go through this check? E.g., if we do
bpf_probe_read_kernel() into stack slot, where do we check that that
slot is not overlapping with nocsr spill/fill region?

> +       /* access to the region [max_stack_depth .. nocsr_stack_off]

ok, here nocsr_stack_off should be exclusive, no?

> +        * from something that is not a part of the nocsr pattern,
> +        * disable nocsr rewrites for current subprogram by setting
> +        * nocsr_stack_off to a value smaller than any possible offset.
> +        */
> +       subprog->nocsr_stack_off = S16_MIN;
> +}
> +
>  /* check_stack_{read,write}_fixed_off functions track spill/fill of registers,
>   * stack boundary and alignment are checked in check_mem_access()
>   */
> @@ -4549,6 +4587,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
>         if (err)
>                 return err;
>
> +       check_nocsr_stack_contract(env, state, insn_idx, off);
>         mark_stack_slot_scratched(env, spi);
>         if (reg && !(off % BPF_REG_SIZE) && reg->type == SCALAR_VALUE && env->bpf_capable) {
>                 bool reg_value_fits;
> @@ -4682,6 +4721,7 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
>                         return err;
>         }
>
> +       check_nocsr_stack_contract(env, state, insn_idx, min_off);
>         /* Variable offset writes destroy any spilled pointers in range. */
>         for (i = min_off; i < max_off; i++) {
>                 u8 new_type, *stype;
> @@ -4820,6 +4860,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
>         reg = &reg_state->stack[spi].spilled_ptr;
>
>         mark_stack_slot_scratched(env, spi);
> +       check_nocsr_stack_contract(env, state, env->insn_idx, off);
>
>         if (is_spilled_reg(&reg_state->stack[spi])) {
>                 u8 spill_size = 1;
> @@ -4980,6 +5021,7 @@ static int check_stack_read_var_off(struct bpf_verifier_env *env,
>         min_off = reg->smin_value + off;
>         max_off = reg->smax_value + off;
>         mark_reg_stack_read(env, ptr_state, min_off, max_off + size, dst_regno);
> +       check_nocsr_stack_contract(env, ptr_state, env->insn_idx, min_off);
>         return 0;
>  }
>
> @@ -15951,6 +15993,206 @@ static int visit_func_call_insn(int t, struct bpf_insn *insns,
>         return ret;
>  }
>
> +/* Bitmask with 1s for all caller saved registers */
> +#define ALL_CALLER_SAVED_REGS ((1u << CALLER_SAVED_REGS) - 1)
> +
> +/* Return a bitmask specifying which caller saved registers are
> + * modified by a call to a helper.
> + * (Either as a return value or as scratch registers).
> + *
> + * For normal helpers registers R0-R5 are scratched.
> + * For helpers marked as no_csr:
> + * - scratch R0 if function is non-void;
> + * - scratch R1-R5 if corresponding parameter type is set
> + *   in the function prototype.
> + */
> +static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)

suggestion: to make this less confusing, here we are returning a mask
of registers that are clobbered by the helper, is that right? so
get_helper_clobber_mask() maybe?

> +{
> +       u8 mask;
> +       int i;
> +
> +       if (!fn->allow_nocsr)
> +               return ALL_CALLER_SAVED_REGS;
> +
> +       mask = 0;
> +       if (fn->ret_type != RET_VOID)
> +               mask |= BIT(BPF_REG_0);
> +       for (i = 0; i < ARRAY_SIZE(fn->arg_type); ++i)
> +               if (fn->arg_type[i] != ARG_DONTCARE)
> +                       mask |= BIT(BPF_REG_1 + i);
> +       return mask;
> +}
> +
> +/* True if do_misc_fixups() replaces calls to helper number 'imm',
> + * replacement patch is presumed to follow no_caller_saved_registers contract
> + * (see mark_nocsr_pattern_for_call() below).
> + */
> +static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> +{
> +       return false;
> +}
> +
> +/* If 'insn' is a call that follows no_caller_saved_registers contract
> + * and called function is inlined by current jit or verifier,
> + * return a mask with 1s corresponding to registers that are scratched
> + * by this call (depends on return type and number of return parameters).

return parameters? was it supposed to be "function parameters/arguments"?

> + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> + */
> +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)

you use u8 for get_helper_reg_mask() and u32 here, why not keep them consistent?

similar to the naming nit above, I think we should be a bit more
explicit with what "mask" actually means. Is this also clobber mask?

> +{
> +       const struct bpf_func_proto *fn;
> +
> +       if (bpf_helper_call(insn) &&
> +           (verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)) &&
> +           get_helper_proto(env, insn->imm, &fn) == 0 &&
> +           fn->allow_nocsr)
> +               return ~get_helper_reg_mask(fn);

hm... I'm a bit confused why we do a negation here? aren't we working
with clobbering mask... I'll keep reading for now.

> +
> +       return ALL_CALLER_SAVED_REGS;
> +}
> +
> +/* GCC and LLVM define a no_caller_saved_registers function attribute.
> + * This attribute means that function scratches only some of
> + * the caller saved registers defined by ABI.
> + * For BPF the set of such registers could be defined as follows:
> + * - R0 is scratched only if function is non-void;
> + * - R1-R5 are scratched only if corresponding parameter type is defined
> + *   in the function prototype.
> + *
> + * The contract between kernel and clang allows to simultaneously use
> + * such functions and maintain backwards compatibility with old
> + * kernels that don't understand no_caller_saved_registers calls
> + * (nocsr for short):
> + *
> + * - for nocsr calls clang allocates registers as-if relevant r0-r5
> + *   registers are not scratched by the call;
> + *
> + * - as a post-processing step, clang visits each nocsr call and adds
> + *   spill/fill for every live r0-r5;
> + *
> + * - stack offsets used for the spill/fill are allocated as minimal
> + *   stack offsets in whole function and are not used for any other
> + *   purposes;
> + *
> + * - when kernel loads a program, it looks for such patterns
> + *   (nocsr function surrounded by spills/fills) and checks if
> + *   spill/fill stack offsets are used exclusively in nocsr patterns;
> + *
> + * - if so, and if verifier or current JIT inlines the call to the
> + *   nocsr function (e.g. a helper call), kernel removes unnecessary
> + *   spill/fill pairs;
> + *
> + * - when old kernel loads a program, presence of spill/fill pairs
> + *   keeps BPF program valid, albeit slightly less efficient.
> + *
> + * For example:
> + *
> + *   r1 = 1;
> + *   r2 = 2;
> + *   *(u64 *)(r10 - 8)  = r1;            r1 = 1;
> + *   *(u64 *)(r10 - 16) = r2;            r2 = 2;
> + *   call %[to_be_inlined]         -->   call %[to_be_inlined]
> + *   r2 = *(u64 *)(r10 - 16);            r0 = r1;
> + *   r1 = *(u64 *)(r10 - 8);             r0 += r2;
> + *   r0 = r1;                            exit;
> + *   r0 += r2;
> + *   exit;
> + *
> + * The purpose of mark_nocsr_pattern_for_call is to:
> + * - look for such patterns;
> + * - mark spill and fill instructions in env->insn_aux_data[*].nocsr_pattern;
> + * - mark set env->insn_aux_data[*].nocsr_spills_num for call instruction;
> + * - update env->subprog_info[*]->nocsr_stack_off to find an offset
> + *   at which nocsr spill/fill stack slots start.
> + *
> + * The .nocsr_pattern and .nocsr_stack_off are used by
> + * check_nocsr_stack_contract() to check if every stack access to
> + * nocsr spill/fill stack slot originates from spill/fill
> + * instructions, members of nocsr patterns.
> + *
> + * If such condition holds true for a subprogram, nocsr patterns could
> + * be rewritten by do_misc_fixups().
> + * Otherwise nocsr patterns are not changed in the subprogram
> + * (code, presumably, generated by an older clang version).
> + *
> + * For example, it is *not* safe to remove spill/fill below:
> + *
> + *   r1 = 1;
> + *   *(u64 *)(r10 - 8)  = r1;            r1 = 1;
> + *   call %[to_be_inlined]         -->   call %[to_be_inlined]
> + *   r1 = *(u64 *)(r10 - 8);             r0 = *(u64 *)(r10 - 8);  <---- wrong !!!
> + *   r0 = *(u64 *)(r10 - 8);             r0 += r1;
> + *   r0 += r1;                           exit;
> + *   exit;
> + */
> +static void mark_nocsr_pattern_for_call(struct bpf_verifier_env *env, int t)

t is insn_idx, let's not carry over old crufty check_cfg naming

> +{
> +       struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
> +       struct bpf_subprog_info *subprog;
> +       u32 csr_mask = call_csr_mask(env, &insns[t]);
> +       u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;

tbh, I'm lost with all these bitmask and their inversions...
call_csr_mask()'s result is basically always used inverted, so why not
return inverted mask in the first place?

> +       int s, i;
> +       s16 off;
> +
> +       if (csr_mask == ALL_CALLER_SAVED_REGS)
> +               return;
> +
> +       for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
> +               if (t - i < 0 || t + i >= env->prog->len)
> +                       break;
> +               stx = &insns[t - i];
> +               ldx = &insns[t + i];
> +               if (off == 0) {
> +                       off = stx->off;
> +                       if (off % BPF_REG_SIZE != 0)
> +                               break;

kind of ugly that we assume stx before we actually checked that it's
STX?... maybe split humongous if below into instruction checking
(with code and src_reg) and then off checking separately?

> +               }
> +               if (/* *(u64 *)(r10 - off) = r[0-5]? */
> +                   stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> +                   stx->dst_reg != BPF_REG_10 ||
> +                   /* r[0-5] = *(u64 *)(r10 - off)? */
> +                   ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
> +                   ldx->src_reg != BPF_REG_10 ||
> +                   /* check spill/fill for the same reg and offset */
> +                   stx->src_reg != ldx->dst_reg ||
> +                   stx->off != ldx->off ||
> +                   stx->off != off ||
> +                   /* this should be a previously unseen register */
> +                   BIT(stx->src_reg) & reg_mask)

() around & operation?

> +                       break;
> +               reg_mask |= BIT(stx->src_reg);
> +               env->insn_aux_data[t - i].nocsr_pattern = 1;
> +               env->insn_aux_data[t + i].nocsr_pattern = 1;
> +       }
> +       if (i == 1)
> +               return;
> +       env->insn_aux_data[t].nocsr_spills_num = i - 1;
> +       s = find_containing_subprog(env, t);
> +       /* can't happen */

then don't check ;) we leave the state partially set for CSR but not
quite. We either should error out completely or just assume
correctness of find_containing_subprog, IMO

> +       if (WARN_ON_ONCE(s < 0))
> +               return;
> +       subprog = &env->subprog_info[s];
> +       subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);

should this be max()? offsets are negative, right? so if nocsr uses -8
and -16 as in the example, entire [-16, 0) region is nocsr region

> +}
> +
> +/* Update the following fields when appropriate:
> + * - env->insn_aux_data[*].nocsr_pattern
> + * - env->insn_aux_data[*].spills_num and
> + * - env->subprog_info[*].nocsr_stack_off
> + * See mark_nocsr_pattern_for_call().
> + */
> +static int mark_nocsr_patterns(struct bpf_verifier_env *env)
> +{
> +       struct bpf_insn *insn = env->prog->insnsi;
> +       int i, insn_cnt = env->prog->len;
> +
> +       for (i = 0; i < insn_cnt; i++, insn++)
> +               /* might be extended to handle kfuncs as well */
> +               if (bpf_helper_call(insn))
> +                       mark_nocsr_pattern_for_call(env, i);
> +       return 0;
> +}
> +
>  /* Visits the instruction at index t and returns one of the following:
>   *  < 0 - an error occurred
>   *  DONE_EXPLORING - the instruction was fully explored
> @@ -20119,6 +20361,48 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>                         goto next_insn;
>                 if (insn->src_reg == BPF_PSEUDO_CALL)
>                         goto next_insn;
> +               /* Remove unnecessary spill/fill pairs, members of nocsr pattern */
> +               if (env->insn_aux_data[i + delta].nocsr_spills_num > 0) {
> +                       u32 j, spills_num = env->insn_aux_data[i + delta].nocsr_spills_num;
> +                       int err;
> +
> +                       /* don't apply this on a second visit */
> +                       env->insn_aux_data[i + delta].nocsr_spills_num = 0;
> +
> +                       /* check if spill/fill stack access is in expected offset range */
> +                       for (j = 1; j <= spills_num; ++j) {
> +                               if ((insn - j)->off >= subprogs[cur_subprog].nocsr_stack_off ||
> +                                   (insn + j)->off >= subprogs[cur_subprog].nocsr_stack_off) {
> +                                       /* do a second visit of this instruction,
> +                                        * so that verifier can inline it
> +                                        */
> +                                       i -= 1;
> +                                       insn -= 1;
> +                                       goto next_insn;
> +                               }
> +                       }

I don't get this loop, can you elaborate? Why are we double-checking
anything here, didn't we do this already?

> +
> +                       /* apply the rewrite:
> +                        *   *(u64 *)(r10 - X) = rY ; num-times
> +                        *   call()                               -> call()
> +                        *   rY = *(u64 *)(r10 - X) ; num-times
> +                        */
> +                       err = verifier_remove_insns(env, i + delta - spills_num, spills_num);
> +                       if (err)
> +                               return err;
> +                       err = verifier_remove_insns(env, i + delta - spills_num + 1, spills_num);
> +                       if (err)
> +                               return err;

why not a single bpf_patch_insn_data()?

> +
> +                       i += spills_num - 1;
> +                       /*   ^            ^   do a second visit of this instruction,
> +                        *   |            '-- so that verifier can inline it
> +                        *   '--------------- jump over deleted fills
> +                        */
> +                       delta -= 2 * spills_num;
> +                       insn = env->prog->insnsi + i + delta;
> +                       goto next_insn;

why not adjust the state and just fall through, what goto next_insn
does that we can't (and next instruction is misleading, so I'd rather
fix up and move forward)

> +               }
>                 if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
>                         ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
>                         if (ret)
> @@ -21704,6 +21988,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>         if (ret < 0)
>                 goto skip_full_check;
>
> +       ret = mark_nocsr_patterns(env);
> +       if (ret < 0)
> +               goto skip_full_check;
> +
>         ret = do_check_main(env);
>         ret = ret ?: do_check_subprogs(env);
>
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 1/9] bpf: add a get_helper_proto() utility function
  2024-07-04 10:23 ` [RFC bpf-next v2 1/9] bpf: add a get_helper_proto() utility function Eduard Zingerman
@ 2024-07-09 23:42   ` Andrii Nakryiko
  2024-07-10  0:26     ` Eduard Zingerman
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-09 23:42 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi

On Thu, Jul 4, 2024 at 3:24 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Extract the part of check_helper_call() as a utility function allowing
> to query 'struct bpf_func_proto' for a specific helper function id.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  kernel/bpf/verifier.c | 30 +++++++++++++++++++++++-------
>  1 file changed, 23 insertions(+), 7 deletions(-)
>

I'd write it differently (see below), but it's correct, so:

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index d3927d819465..4869f1fb0a42 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -10261,6 +10261,24 @@ static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno
>                                  state->callback_subprogno == subprogno);
>  }
>
> +static int get_helper_proto(struct bpf_verifier_env *env, int func_id,
> +                           const struct bpf_func_proto **ptr)
> +{
> +       const struct bpf_func_proto *result = NULL;
> +
> +       if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID)
> +               return -ERANGE;
> +
> +       if (env->ops->get_func_proto)
> +               result = env->ops->get_func_proto(func_id, env->prog);
> +
> +       if (!result)
> +               return -EINVAL;

result is a bit unnecessary. We could do either

*ptr = NULL;
if (env->ops->get_func_proto)
    *ptr = env->ops->get_func_proto(func_id, env->prog);
return *ptr ? 0 : -EINVAL;


or just

if (!env->ops->get_func_proto)
    return -EINVAL;

*ptr = env->ops->get_func_proto(func_id, env->prog);

return *ptr ? 0 : -EINVAL;


> +
> +       *ptr = result;
> +       return 0;
> +}
> +
>  static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>                              int *insn_idx_p)
>  {
> @@ -10277,18 +10295,16 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>
>         /* find function prototype */
>         func_id = insn->imm;
> -       if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID) {
> -               verbose(env, "invalid func %s#%d\n", func_id_name(func_id),
> -                       func_id);
> +       err = get_helper_proto(env, insn->imm, &fn);
> +       if (err == -ERANGE) {
> +               verbose(env, "invalid func %s#%d\n", func_id_name(func_id), func_id);
>                 return -EINVAL;
>         }
>
> -       if (env->ops->get_func_proto)
> -               fn = env->ops->get_func_proto(func_id, env->prog);
> -       if (!fn) {
> +       if (err) {

nit: this is one block of error handling, I'd keep it "connected":

if (err == -ERANGE) {
} else if (err) {

}


>                 verbose(env, "program of this type cannot use helper %s#%d\n",
>                         func_id_name(func_id), func_id);
> -               return -EINVAL;
> +               return err;
>         }
>
>         /* eBPF programs must be GPL compatible to use GPL-ed functions */
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 4/9] selftests/bpf: extract utility function for BPF disassembly
  2024-07-04 10:23 ` [RFC bpf-next v2 4/9] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
@ 2024-07-09 23:46   ` Andrii Nakryiko
  0 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-09 23:46 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi

On Thu, Jul 4, 2024 at 3:24 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> struct bpf_insn *disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz);
>
>   Disassembles instruction 'insn' to a text buffer 'buf'.
>   Removes insn->code hex prefix added by kernel disassembly routine.
>   Returns a pointer to the next instruction
>   (increments insn by either 1 or 2).
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  tools/testing/selftests/bpf/Makefile          |  1 +
>  tools/testing/selftests/bpf/disasm_helpers.c  | 51 +++++++++++++
>  tools/testing/selftests/bpf/disasm_helpers.h  | 12 +++
>  .../selftests/bpf/prog_tests/ctx_rewrite.c    | 74 +++----------------
>  tools/testing/selftests/bpf/testing_helpers.c |  1 +
>  5 files changed, 75 insertions(+), 64 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
>  create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h
>

LGTM.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 8/9] selftests/bpf: __arch_* macro to limit test cases to specific archs
  2024-07-04 10:24 ` [RFC bpf-next v2 8/9] selftests/bpf: __arch_* macro to limit test cases to specific archs Eduard Zingerman
@ 2024-07-09 23:50   ` Andrii Nakryiko
  0 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-09 23:50 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi

On Thu, Jul 4, 2024 at 3:24 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Add annotations __arch_x86_64, __arch_arm64, __arch_riscv64
> to specify on which architecture the test case should be tested.
> Several __arch_* annotations could be specified at once.
> When test case is not run on current arch it is marked as skipped.
>
> For example, the following would be tested only on arm64 and riscv64:
>
>   SEC("raw_tp")
>   __arch_arm64
>   __arch_riscv64
>   __xlated("1: *(u64 *)(r10 - 16) = r1")
>   __xlated("2: call")
>   __xlated("3: r1 = *(u64 *)(r10 - 16);")
>   __success
>   __naked void canary_arm64_riscv64(void)
>   {
>         asm volatile (
>         "r1 = 1;"
>         "*(u64 *)(r10 - 16) = r1;"
>         "call %[bpf_get_smp_processor_id];"
>         "r1 = *(u64 *)(r10 - 16);"
>         "exit;"
>         :
>         : __imm(bpf_get_smp_processor_id)
>         : __clobber_all);
>   }
>
> On x86 it would be skipped:
>
>   #467/2   verifier_nocsr/canary_arm64_riscv64:SKIP
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  tools/testing/selftests/bpf/progs/bpf_misc.h |  8 ++++
>  tools/testing/selftests/bpf/test_loader.c    | 43 ++++++++++++++++++++
>  2 files changed, 51 insertions(+)
>

LGTM, just being pedantic below, because why not? ;)

Acked-by: Andrii Nakryiko <andrii@kernel.org>

[...]

> +       spec->arch_mask = arch_mask;
> +
>         if (spec->mode_mask == 0)
>                 spec->mode_mask = PRIV;
>
> @@ -677,6 +701,20 @@ static int get_xlated_program_text(int prog_fd, char *text, size_t text_sz)
>         return err;
>  }
>
> +static bool run_on_current_arch(int arch_mask)
> +{
> +       if (arch_mask == 0)
> +               return true;
> +#if defined(__x86_64__)
> +       return !!(arch_mask & ARCH_X86_64);

nit: !! is needed if you assign the result to integer and you want
either 0 or 1 (and not whatever the bit mask value is). In this case
it's well defined that a non-zero value will be implicitly converted
to true for function result, so just `return arch_mask & ARCH_X86_64;`
is totally fine and cleaner.

> +#elif defined(__aarch64__)
> +       return !!(arch_mask & ARCH_ARM64);
> +#elif defined(__riscv) && __riscv_xlen == 64
> +       return !!(arch_mask & ARCH_RISCV64);
> +#endif
> +       return false;
> +}
> +
>  /* this function is forced noinline and has short generic name to look better
>   * in test_progs output (in case of a failure)
>   */
> @@ -701,6 +739,11 @@ void run_subtest(struct test_loader *tester,
>         if (!test__start_subtest(subspec->name))
>                 return;
>
> +       if (!run_on_current_arch(spec->arch_mask)) {
> +               test__skip();
> +               return;
> +       }
> +
>         if (unpriv) {
>                 if (!can_execute_unpriv(tester, spec)) {
>                         test__skip();
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 1/9] bpf: add a get_helper_proto() utility function
  2024-07-09 23:42   ` Andrii Nakryiko
@ 2024-07-10  0:26     ` Eduard Zingerman
  0 siblings, 0 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-10  0:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi

On Tue, 2024-07-09 at 16:42 -0700, Andrii Nakryiko wrote:

[...]

> > @@ -10261,6 +10261,24 @@ static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno
> >                                  state->callback_subprogno == subprogno);
> >  }
> > 
> > +static int get_helper_proto(struct bpf_verifier_env *env, int func_id,
> > +                           const struct bpf_func_proto **ptr)
> > +{
> > +       const struct bpf_func_proto *result = NULL;
> > +
> > +       if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID)
> > +               return -ERANGE;
> > +
> > +       if (env->ops->get_func_proto)
> > +               result = env->ops->get_func_proto(func_id, env->prog);
> > +
> > +       if (!result)
> > +               return -EINVAL;
> 
> result is a bit unnecessary. We could do either
> 
> *ptr = NULL;
> if (env->ops->get_func_proto)
>     *ptr = env->ops->get_func_proto(func_id, env->prog);
> return *ptr ? 0 : -EINVAL;
> 
> 
> or just
> 
> if (!env->ops->get_func_proto)
>     return -EINVAL;
> 
> *ptr = env->ops->get_func_proto(func_id, env->prog);
> 
> return *ptr ? 0 : -EINVAL;

Ok, will change in v3.

[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-04 10:23 ` [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
  2024-07-09 23:42   ` Andrii Nakryiko
@ 2024-07-10  1:09   ` Alexei Starovoitov
  2024-07-10  3:06     ` Eduard Zingerman
  1 sibling, 1 reply; 36+ messages in thread
From: Alexei Starovoitov @ 2024-07-10  1:09 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song, Puranjay Mohan,
	Jose E. Marchesi

On Thu, Jul 4, 2024 at 3:24 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> +                       for (j = 1; j <= spills_num; ++j) {
> +                               if ((insn - j)->off >= subprogs[cur_subprog].nocsr_stack_off ||
> +                                   (insn + j)->off >= subprogs[cur_subprog].nocsr_stack_off) {
> +                                       /* do a second visit of this instruction,
> +                                        * so that verifier can inline it
> +                                        */
> +                                       i -= 1;
> +                                       insn -= 1;
> +                                       goto next_insn;
> +                               }
> +                       }
> +
> +                       /* apply the rewrite:
> +                        *   *(u64 *)(r10 - X) = rY ; num-times
> +                        *   call()                               -> call()
> +                        *   rY = *(u64 *)(r10 - X) ; num-times
> +                        */
> +                       err = verifier_remove_insns(env, i + delta - spills_num, spills_num);
> +                       if (err)
> +                               return err;
> +                       err = verifier_remove_insns(env, i + delta - spills_num + 1, spills_num);
> +                       if (err)
> +                               return err;
> +
> +                       i += spills_num - 1;
> +                       /*   ^            ^   do a second visit of this instruction,
> +                        *   |            '-- so that verifier can inline it
> +                        *   '--------------- jump over deleted fills
> +                        */
> +                       delta -= 2 * spills_num;
> +                       insn = env->prog->insnsi + i + delta;
> +                       goto next_insn;
> +               }

somewhere after spill/fill removal subprog->stack_depth
needs to be adjust to nocsr_stack_off,
otherwise extra stack space is wasted.
I couldn't find this logic in the patch.

Once the adjustment logic is done, pls add a selftest with
nocsr and may_goto, since may_goto processing is in the same
do_misc_fixups() loop and it needs to grow the stack while
spill/fill removal will shrink the stack.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls
  2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (9 preceding siblings ...)
  2024-07-08 11:44 ` [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Puranjay Mohan
@ 2024-07-10  1:18 ` Alexei Starovoitov
  2024-07-10  3:35   ` Eduard Zingerman
  10 siblings, 1 reply; 36+ messages in thread
From: Alexei Starovoitov @ 2024-07-10  1:18 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song, Puranjay Mohan,
	Jose E. Marchesi

On Thu, Jul 4, 2024 at 3:24 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> - stack offsets used for spills/fills are allocated as minimal
>   stack offsets in whole function and are not used for any other
>   purposes;

"minimal stack offset" reads odd to me.
I noticed the same naming convention is used in llvm diff.
imo it's odd there as well.
Maybe say:
llvm grows the stack that in bpf architecture always grows down and
picks the lowest stack offset not used by local variables
and spill/fill.

> Here is how the program looks after verifier processing:
>
>   # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
>   # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
>   int test(void * ctx):
>   ; int test(void *ctx)
>      0: (bf) r3 = r1               <--------- 3rd printk parameter
>   ; __u32 task = bpf_get_smp_processor_id();
>      1: (b4) w0 = 197132           <--------- inlined helper call,
>      2: (bf) r0 = r0               <--------- spill/fill pair removed

Are you using old bpftool or something?
That should have been:
r0 = &(void __percpu *)(r0)
?

>      3: (61) r0 = *(u32 *)(r0 +0)  <---------
>   ; bpf_printk("ctx=%p, smp=%d", ctx, task);
>      4: (18) r1 = map[id:13][0]+0
>      6: (b7) r2 = 15
>      7: (bf) r4 = r0
>      8: (85) call bpf_trace_printk#-125920
>   ; return 0;
>      9: (b7) r0 = 0
>     10: (95) exit

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-09 23:42   ` Andrii Nakryiko
@ 2024-07-10  3:00     ` Eduard Zingerman
  2024-07-10  6:01       ` Andrii Nakryiko
  2024-07-10  9:46     ` Eduard Zingerman
  1 sibling, 1 reply; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-10  3:00 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Tue, 2024-07-09 at 16:42 -0700, Andrii Nakryiko wrote:

[...]

> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 2b54e25d2364..735ae0901b3d 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -585,6 +585,15 @@ struct bpf_insn_aux_data {
> >          * accepts callback function as a parameter.
> >          */
> >         bool calls_callback;
> > +       /* true if STX or LDX instruction is a part of a spill/fill
> > +        * pattern for a no_caller_saved_registers call.
> > +        */
> > +       u8 nocsr_pattern:1;
> > +       /* for CALL instructions, a number of spill/fill pairs in the
> > +        * no_caller_saved_registers pattern.
> > +        */
> > +       u8 nocsr_spills_num:3;
> 
> despite bitfields this will extend bpf_insn_aux_data by 8 bytes. there
> are 2 bytes of padding after alu_state, let's put this there.
> 
> And let's not add bitfields unless absolutely necessary (this can be
> always done later).

Will remove the bitfields and move the fields.

> 
> > +
> >  };
> > 
> >  #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
> > @@ -641,6 +650,11 @@ struct bpf_subprog_info {
> >         u32 linfo_idx; /* The idx to the main_prog->aux->linfo */
> >         u16 stack_depth; /* max. stack depth used by this function */
> >         u16 stack_extra;
> > +       /* stack depth after which slots reserved for
> > +        * no_caller_saved_registers spills/fills start,
> > +        * value <= nocsr_stack_off belongs to the spill/fill area.
> 
> are you sure about <= (not <), it seems like nocsr_stack_off is
> exclusive right bound for nocsr stack region (it would be good to call
> this out explicitly here)

Right, it should be '<', my bad, will update the comment.

> 
> > +        */
> > +       s16 nocsr_stack_off;
> >         bool has_tail_call: 1;
> >         bool tail_call_reachable: 1;
> >         bool has_ld_abs: 1;
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 4869f1fb0a42..d16a249b59ad 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -2471,16 +2471,37 @@ static int cmp_subprogs(const void *a, const void *b)
> >                ((struct bpf_subprog_info *)b)->start;
> >  }
> > 
> > -static int find_subprog(struct bpf_verifier_env *env, int off)
> > +/* Find subprogram that contains instruction at 'off' */
> > +static int find_containing_subprog(struct bpf_verifier_env *env, int off)
> >  {
> > -       struct bpf_subprog_info *p;
> > +       struct bpf_subprog_info *vals = env->subprog_info;
> > +       int l, r, m;
> > 
> > -       p = bsearch(&off, env->subprog_info, env->subprog_cnt,
> > -                   sizeof(env->subprog_info[0]), cmp_subprogs);
> > -       if (!p)
> > +       if (off >= env->prog->len || off < 0 || env->subprog_cnt == 0)
> >                 return -ENOENT;
> > -       return p - env->subprog_info;
> > 
> > +       l = 0;
> > +       m = 0;
> 
> no need to initialize m

Ok

> 
> > +       r = env->subprog_cnt - 1;
> > +       while (l < r) {
> > +               m = l + (r - l + 1) / 2;
> > +               if (vals[m].start <= off)
> > +                       l = m;
> > +               else
> > +                       r = m - 1;
> > +       }
> > +       return l;
> > +}
> 
> I love it, looks great :)
>

Agree

[...]

> > @@ -4501,6 +4522,23 @@ static int get_reg_width(struct bpf_reg_state *reg)
> >         return fls64(reg->umax_value);
> >  }
> > 
> > +/* See comment for mark_nocsr_pattern_for_call() */
> > +static void check_nocsr_stack_contract(struct bpf_verifier_env *env, struct bpf_func_state *state,
> > +                                      int insn_idx, int off)
> > +{
> > +       struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
> > +       struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
> > +
> > +       if (subprog->nocsr_stack_off <= off || aux->nocsr_pattern)
> > +               return;
> 
> can helper call instruction go through this check? E.g., if we do
> bpf_probe_read_kernel() into stack slot, where do we check that that
> slot is not overlapping with nocsr spill/fill region?

In check_helper_call() we do check_mem_access() that eventually calls
one of the check_stack_{read,write}_{fixed,varying}_off().
The .access_size should be set for bpf_probe_read_kernel()
because it's argument base type is ARG_PTR_TO_MEM.
I will add a test case to double-check this.

[...]

> > @@ -15951,6 +15993,206 @@ static int visit_func_call_insn(int t, struct bpf_insn *insns,
> >         return ret;
> >  }
> > 
> > +/* Bitmask with 1s for all caller saved registers */
> > +#define ALL_CALLER_SAVED_REGS ((1u << CALLER_SAVED_REGS) - 1)
> > +
> > +/* Return a bitmask specifying which caller saved registers are
> > + * modified by a call to a helper.
> > + * (Either as a return value or as scratch registers).
> > + *
> > + * For normal helpers registers R0-R5 are scratched.
> > + * For helpers marked as no_csr:
> > + * - scratch R0 if function is non-void;
> > + * - scratch R1-R5 if corresponding parameter type is set
> > + *   in the function prototype.
> > + */
> > +static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
> 
> suggestion: to make this less confusing, here we are returning a mask
> of registers that are clobbered by the helper, is that right? so
> get_helper_clobber_mask() maybe?

get_helper_clobber_mask() is a good name, will change.

[...]

> > +/* If 'insn' is a call that follows no_caller_saved_registers contract
> > + * and called function is inlined by current jit or verifier,
> > + * return a mask with 1s corresponding to registers that are scratched
> > + * by this call (depends on return type and number of return parameters).
> 
> return parameters? was it supposed to be "function parameters/arguments"?

My bad.

> 
> > + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> > + */
> > +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> 
> you use u8 for get_helper_reg_mask() and u32 here, why not keep them consistent?

Ok

> similar to the naming nit above, I think we should be a bit more
> explicit with what "mask" actually means. Is this also clobber mask?

I mean, there is a comment right above the function.
This function returns a mask of caller saved registers (csr).
I'll make the name more explicit.

> 
> > +{
> > +       const struct bpf_func_proto *fn;
> > +
> > +       if (bpf_helper_call(insn) &&
> > +           (verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)) &&
> > +           get_helper_proto(env, insn->imm, &fn) == 0 &&
> > +           fn->allow_nocsr)
> > +               return ~get_helper_reg_mask(fn);
> 
> hm... I'm a bit confused why we do a negation here? aren't we working
> with clobbering mask... I'll keep reading for now.

Please read the comment before the function.

> 
> > +
> > +       return ALL_CALLER_SAVED_REGS;
> > +}

[...]

> > +static void mark_nocsr_pattern_for_call(struct bpf_verifier_env *env, int t)
> 
> t is insn_idx, let's not carry over old crufty check_cfg naming

Ok

> 
> > +{
> > +       struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
> > +       struct bpf_subprog_info *subprog;
> > +       u32 csr_mask = call_csr_mask(env, &insns[t]);
> > +       u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;
> 
> tbh, I'm lost with all these bitmask and their inversions...
> call_csr_mask()'s result is basically always used inverted, so why not
> return inverted mask in the first place?

The mask is initialized as a set of all registers preserved by this call.
Those that are not in mask need a spill/fill pair.
I'll toss things around to make this more clear.
(naming, comments, maybe move the '| ~ALL_CALLER_SAVED_REGS' to the call_csr_mask()).

> 
> > +       int s, i;
> > +       s16 off;
> > +
> > +       if (csr_mask == ALL_CALLER_SAVED_REGS)
> > +               return;
> > +
> > +       for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
> > +               if (t - i < 0 || t + i >= env->prog->len)
> > +                       break;
> > +               stx = &insns[t - i];
> > +               ldx = &insns[t + i];
> > +               if (off == 0) {
> > +                       off = stx->off;
> > +                       if (off % BPF_REG_SIZE != 0)
> > +                               break;
> 
> kind of ugly that we assume stx before we actually checked that it's
> STX?... maybe split humongous if below into instruction checking
> (with code and src_reg) and then off checking separately?

Don't see anything ugly about this, tbh.
Can split the 'if' statement, if you think it's hard to read.

> 
> > +               }
> > +               if (/* *(u64 *)(r10 - off) = r[0-5]? */
> > +                   stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> > +                   stx->dst_reg != BPF_REG_10 ||
> > +                   /* r[0-5] = *(u64 *)(r10 - off)? */
> > +                   ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
> > +                   ldx->src_reg != BPF_REG_10 ||
> > +                   /* check spill/fill for the same reg and offset */
> > +                   stx->src_reg != ldx->dst_reg ||
> > +                   stx->off != ldx->off ||
> > +                   stx->off != off ||
> > +                   /* this should be a previously unseen register */
> > +                   BIT(stx->src_reg) & reg_mask)
> 
> () around & operation?

No need, & has higher priority over ||.
You can check the AST using
https://tree-sitter.github.io/tree-sitter/playground .

> 
> > +                       break;
> > +               reg_mask |= BIT(stx->src_reg);
> > +               env->insn_aux_data[t - i].nocsr_pattern = 1;
> > +               env->insn_aux_data[t + i].nocsr_pattern = 1;
> > +       }
> > +       if (i == 1)
> > +               return;
> > +       env->insn_aux_data[t].nocsr_spills_num = i - 1;
> > +       s = find_containing_subprog(env, t);
> > +       /* can't happen */
> 
> then don't check ;) we leave the state partially set for CSR but not
> quite. We either should error out completely or just assume
> correctness of find_containing_subprog, IMO

Ok

> 
> > +       if (WARN_ON_ONCE(s < 0))
> > +               return;
> > +       subprog = &env->subprog_info[s];
> > +       subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
> 
> should this be max()? offsets are negative, right? so if nocsr uses -8
> and -16 as in the example, entire [-16, 0) region is nocsr region

This should be min exactly because stack offsets are negative.
For the example above the 'off' is initialized as -16 and then
is incremented by +8 giving final value of -8.
And I need to select the minimal value used between several patterns.

> 
> > +}

[...]

> > @@ -20119,6 +20361,48 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> >                         goto next_insn;
> >                 if (insn->src_reg == BPF_PSEUDO_CALL)
> >                         goto next_insn;
> > +               /* Remove unnecessary spill/fill pairs, members of nocsr pattern */
> > +               if (env->insn_aux_data[i + delta].nocsr_spills_num > 0) {
> > +                       u32 j, spills_num = env->insn_aux_data[i + delta].nocsr_spills_num;
> > +                       int err;
> > +
> > +                       /* don't apply this on a second visit */
> > +                       env->insn_aux_data[i + delta].nocsr_spills_num = 0;
> > +
> > +                       /* check if spill/fill stack access is in expected offset range */
> > +                       for (j = 1; j <= spills_num; ++j) {
> > +                               if ((insn - j)->off >= subprogs[cur_subprog].nocsr_stack_off ||
> > +                                   (insn + j)->off >= subprogs[cur_subprog].nocsr_stack_off) {
> > +                                       /* do a second visit of this instruction,
> > +                                        * so that verifier can inline it
> > +                                        */
> > +                                       i -= 1;
> > +                                       insn -= 1;
> > +                                       goto next_insn;
> > +                               }
> > +                       }
> 
> I don't get this loop, can you elaborate? Why are we double-checking
> anything here, didn't we do this already?

We established probable patterns and probable minimal offset.
Over the course of program verification we might have invalidated the
.nocsr_stack_off for a particular subprogram => hence a need for this check.

> 
> > +
> > +                       /* apply the rewrite:
> > +                        *   *(u64 *)(r10 - X) = rY ; num-times
> > +                        *   call()                               -> call()
> > +                        *   rY = *(u64 *)(r10 - X) ; num-times
> > +                        */
> > +                       err = verifier_remove_insns(env, i + delta - spills_num, spills_num);
> > +                       if (err)
> > +                               return err;
> > +                       err = verifier_remove_insns(env, i + delta - spills_num + 1, spills_num);
> > +                       if (err)
> > +                               return err;
> 
> why not a single bpf_patch_insn_data()?

bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
Here I need to replace many instructions with a single instruction.
I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.

On the other hand, the do_jit() for x86 removes NOPs (BPF_JA +0),
so I can probably replace spills/fills with NOPs here instead of
calling bpf_patch_insn_data() or bpf_remove_insns().

> > +
> > +                       i += spills_num - 1;
> > +                       /*   ^            ^   do a second visit of this instruction,
> > +                        *   |            '-- so that verifier can inline it
> > +                        *   '--------------- jump over deleted fills
> > +                        */
> > +                       delta -= 2 * spills_num;
> > +                       insn = env->prog->insnsi + i + delta;
> > +                       goto next_insn;
> 
> why not adjust the state and just fall through, what goto next_insn
> does that we can't (and next instruction is misleading, so I'd rather
> fix up and move forward)

I don't like this. The fall-through makes control flow more convoluted.
To understand what would happen next:
- with goto next_insn we just start over;
- with fall-through we need to think about position of this particular
  'if' statement within the loop.

> 
> > +               }
> >                 if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
> >                         ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
> >                         if (ret)

[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10  1:09   ` Alexei Starovoitov
@ 2024-07-10  3:06     ` Eduard Zingerman
  0 siblings, 0 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-10  3:06 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song, Puranjay Mohan,
	Jose E. Marchesi

On Tue, 2024-07-09 at 18:09 -0700, Alexei Starovoitov wrote:

[...]


> somewhere after spill/fill removal subprog->stack_depth
> needs to be adjust to nocsr_stack_off,
> otherwise extra stack space is wasted.
> I couldn't find this logic in the patch.

Such logic is not present.

> Once the adjustment logic is done, pls add a selftest with
> nocsr and may_goto, since may_goto processing is in the same
> do_misc_fixups() loop and it needs to grow the stack while
> spill/fill removal will shrink the stack.

It might be necessary to move the spill/fill removal to a separate
pass exactly because of this dependency.

I'll think about the implementation, but it might be tricky.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls
  2024-07-10  1:18 ` Alexei Starovoitov
@ 2024-07-10  3:35   ` Eduard Zingerman
  0 siblings, 0 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-10  3:35 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song, Puranjay Mohan,
	Jose E. Marchesi

On Tue, 2024-07-09 at 18:18 -0700, Alexei Starovoitov wrote:
> On Thu, Jul 4, 2024 at 3:24 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > - stack offsets used for spills/fills are allocated as minimal
> >   stack offsets in whole function and are not used for any other
> >   purposes;
> 
> "minimal stack offset" reads odd to me.
> I noticed the same naming convention is used in llvm diff.
> imo it's odd there as well.
> Maybe say:
> llvm grows the stack that in bpf architecture always grows down and
> picks the lowest stack offset not used by local variables
> and spill/fill.

Will replace "minimal" with lowest here and in LLVM diff.

> > Here is how the program looks after verifier processing:
> > 
> >   # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
> >   # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
> >   int test(void * ctx):
> >   ; int test(void *ctx)
> >      0: (bf) r3 = r1               <--------- 3rd printk parameter
> >   ; __u32 task = bpf_get_smp_processor_id();
> >      1: (b4) w0 = 197132           <--------- inlined helper call,
> >      2: (bf) r0 = r0               <--------- spill/fill pair removed
> 
> Are you using old bpftool or something?
> That should have been:
> r0 = &(void __percpu *)(r0)
> ?

Yes, I was using distro-provided bpftool.
Re-running with kernel version of the tool shows the __percpu thing.

> 
> >      3: (61) r0 = *(u32 *)(r0 +0)  <---------

[...]


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10  3:00     ` Eduard Zingerman
@ 2024-07-10  6:01       ` Andrii Nakryiko
  2024-07-10  7:57         ` Eduard Zingerman
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-10  6:01 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Tue, Jul 9, 2024 at 8:00 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2024-07-09 at 16:42 -0700, Andrii Nakryiko wrote:
>
> [...]
>

[...]

> >
> > > +
> > >  };
> > >
> > >  #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
> > > @@ -641,6 +650,11 @@ struct bpf_subprog_info {
> > >         u32 linfo_idx; /* The idx to the main_prog->aux->linfo */
> > >         u16 stack_depth; /* max. stack depth used by this function */
> > >         u16 stack_extra;
> > > +       /* stack depth after which slots reserved for
> > > +        * no_caller_saved_registers spills/fills start,
> > > +        * value <= nocsr_stack_off belongs to the spill/fill area.
> >
> > are you sure about <= (not <), it seems like nocsr_stack_off is
> > exclusive right bound for nocsr stack region (it would be good to call
> > this out explicitly here)
>
> Right, it should be '<', my bad, will update the comment.

See, I do read comments ;)

[...]

>
> > > @@ -4501,6 +4522,23 @@ static int get_reg_width(struct bpf_reg_state *reg)
> > >         return fls64(reg->umax_value);
> > >  }
> > >
> > > +/* See comment for mark_nocsr_pattern_for_call() */
> > > +static void check_nocsr_stack_contract(struct bpf_verifier_env *env, struct bpf_func_state *state,
> > > +                                      int insn_idx, int off)
> > > +{
> > > +       struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
> > > +       struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
> > > +
> > > +       if (subprog->nocsr_stack_off <= off || aux->nocsr_pattern)
> > > +               return;
> >
> > can helper call instruction go through this check? E.g., if we do
> > bpf_probe_read_kernel() into stack slot, where do we check that that
> > slot is not overlapping with nocsr spill/fill region?
>
> In check_helper_call() we do check_mem_access() that eventually calls
> one of the check_stack_{read,write}_{fixed,varying}_off().
> The .access_size should be set for bpf_probe_read_kernel()
> because it's argument base type is ARG_PTR_TO_MEM.
> I will add a test case to double-check this.

Ok. Also as I was reading this I didn't yet realize that
aux->nocsr_pattern is not set for call instruction itself, so I was
worried that we might accidentally skip the check. But we don't set
nocsr_pattern for call, so we should be good (but the test never
hurts)

>
> [...]
>
> > > @@ -15951,6 +15993,206 @@ static int visit_func_call_insn(int t, struct bpf_insn *insns,
> > >         return ret;
> > >  }
> > >
> > > +/* Bitmask with 1s for all caller saved registers */
> > > +#define ALL_CALLER_SAVED_REGS ((1u << CALLER_SAVED_REGS) - 1)
> > > +
> > > +/* Return a bitmask specifying which caller saved registers are
> > > + * modified by a call to a helper.
> > > + * (Either as a return value or as scratch registers).
> > > + *
> > > + * For normal helpers registers R0-R5 are scratched.
> > > + * For helpers marked as no_csr:
> > > + * - scratch R0 if function is non-void;
> > > + * - scratch R1-R5 if corresponding parameter type is set
> > > + *   in the function prototype.
> > > + */
> > > +static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
> >
> > suggestion: to make this less confusing, here we are returning a mask
> > of registers that are clobbered by the helper, is that right? so
> > get_helper_clobber_mask() maybe?
>
> get_helper_clobber_mask() is a good name, will change.
>
> [...]
>
> > > +/* If 'insn' is a call that follows no_caller_saved_registers contract
> > > + * and called function is inlined by current jit or verifier,
> > > + * return a mask with 1s corresponding to registers that are scratched
> > > + * by this call (depends on return type and number of return parameters).
> >
> > return parameters? was it supposed to be "function parameters/arguments"?
>
> My bad.
>
> >
> > > + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> > > + */
> > > +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> >
> > you use u8 for get_helper_reg_mask() and u32 here, why not keep them consistent?
>
> Ok
>
> > similar to the naming nit above, I think we should be a bit more
> > explicit with what "mask" actually means. Is this also clobber mask?
>
> I mean, there is a comment right above the function.
> This function returns a mask of caller saved registers (csr).
> I'll make the name more explicit.
>
> >
> > > +{
> > > +       const struct bpf_func_proto *fn;
> > > +
> > > +       if (bpf_helper_call(insn) &&
> > > +           (verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)) &&
> > > +           get_helper_proto(env, insn->imm, &fn) == 0 &&
> > > +           fn->allow_nocsr)
> > > +               return ~get_helper_reg_mask(fn);
> >
> > hm... I'm a bit confused why we do a negation here? aren't we working
> > with clobbering mask... I'll keep reading for now.
>
> Please read the comment before the function.
>

Believe it or not, but I read it like 10 times and that didn't help me much.

+/* If 'insn' is a call that follows no_caller_saved_registers contract
+ * and called function is inlined by current jit or verifier,
+ * return a mask with 1s corresponding to registers that are scratched
+ * by this call (depends on return type and number of return parameters).
+ * Otherwise return ALL_CALLER_SAVED_REGS mask.

"registers that are scratched by this call" would be what
get_helper_reg_mask() returns for the function (at least that's my
reading of the above), and yet you return inverted mask. It doesn't
really help that we return ALL_CALLER_SAVED_REGS as "nope, it's not
nocsr call".

Maybe it would be a bit easier to follow if call_csr_mask() returned
bool and mask as an out parameter in case it's nocsr call. So instead
of special casing ALL_CALLER_SAVED_REGS there would be a nice "not a
nocsr, never mind" and early exit.

Anyways, perhaps I'm just being very dense here, I just found this
particular piece extremely hard to follow intuitively.

> >
> > > +
> > > +       return ALL_CALLER_SAVED_REGS;
> > > +}
>
> [...]
>
> > > +static void mark_nocsr_pattern_for_call(struct bpf_verifier_env *env, int t)
> >
> > t is insn_idx, let's not carry over old crufty check_cfg naming
>
> Ok
>
> >
> > > +{
> > > +       struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
> > > +       struct bpf_subprog_info *subprog;
> > > +       u32 csr_mask = call_csr_mask(env, &insns[t]);
> > > +       u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;
> >
> > tbh, I'm lost with all these bitmask and their inversions...
> > call_csr_mask()'s result is basically always used inverted, so why not
> > return inverted mask in the first place?
>
> The mask is initialized as a set of all registers preserved by this call.

ok, maybe it's just a mix of "no csr" and "csr" that confuses me. How
about we call call_csr_mask as get_helper_preserved_mask() or
something like that to "match" get_helper_clobber_mask()?

> Those that are not in mask need a spill/fill pair.
> I'll toss things around to make this more clear.
> (naming, comments, maybe move the '| ~ALL_CALLER_SAVED_REGS' to the call_csr_mask()).
>
> >
> > > +       int s, i;
> > > +       s16 off;
> > > +
> > > +       if (csr_mask == ALL_CALLER_SAVED_REGS)
> > > +               return;
> > > +
> > > +       for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
> > > +               if (t - i < 0 || t + i >= env->prog->len)
> > > +                       break;
> > > +               stx = &insns[t - i];
> > > +               ldx = &insns[t + i];
> > > +               if (off == 0) {
> > > +                       off = stx->off;
> > > +                       if (off % BPF_REG_SIZE != 0)
> > > +                               break;
> >
> > kind of ugly that we assume stx before we actually checked that it's
> > STX?... maybe split humongous if below into instruction checking
> > (with code and src_reg) and then off checking separately?
>
> Don't see anything ugly about this, tbh.

you are using stx->off and do `% BPF_REG_SIZE` check on it while that
stx->off field might be meaningless for the instruction (because we
are not yet sure it's STX instruction), that's a bit ugly, IMO

> Can split the 'if' statement, if you think it's hard to read.

it's not about readability for me

>
> >
> > > +               }
> > > +               if (/* *(u64 *)(r10 - off) = r[0-5]? */
> > > +                   stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> > > +                   stx->dst_reg != BPF_REG_10 ||
> > > +                   /* r[0-5] = *(u64 *)(r10 - off)? */
> > > +                   ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
> > > +                   ldx->src_reg != BPF_REG_10 ||
> > > +                   /* check spill/fill for the same reg and offset */
> > > +                   stx->src_reg != ldx->dst_reg ||
> > > +                   stx->off != ldx->off ||
> > > +                   stx->off != off ||
> > > +                   /* this should be a previously unseen register */
> > > +                   BIT(stx->src_reg) & reg_mask)
> >
> > () around & operation?
>
> No need, & has higher priority over ||.
> You can check the AST using
> https://tree-sitter.github.io/tree-sitter/playground .

Oh, I have no doubt you checked that it works correctly. It's just not
a really good habit to rely on C's obscure operation ordering rules
beyond A && B || C (and even then it is arguable). I think the rule of
thumb to not mix bitwise and logic operators without explicit
parenthesis makes sense.

But never mind, I already feel weird complaining about !strcmp(),
every real kernel engineer should memorize operator precedence by
heart.

>
> >
> > > +                       break;
> > > +               reg_mask |= BIT(stx->src_reg);
> > > +               env->insn_aux_data[t - i].nocsr_pattern = 1;
> > > +               env->insn_aux_data[t + i].nocsr_pattern = 1;
> > > +       }
> > > +       if (i == 1)
> > > +               return;
> > > +       env->insn_aux_data[t].nocsr_spills_num = i - 1;
> > > +       s = find_containing_subprog(env, t);
> > > +       /* can't happen */
> >
> > then don't check ;) we leave the state partially set for CSR but not
> > quite. We either should error out completely or just assume
> > correctness of find_containing_subprog, IMO
>
> Ok
>
> >
> > > +       if (WARN_ON_ONCE(s < 0))
> > > +               return;
> > > +       subprog = &env->subprog_info[s];
> > > +       subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
> >
> > should this be max()? offsets are negative, right? so if nocsr uses -8
> > and -16 as in the example, entire [-16, 0) region is nocsr region
>
> This should be min exactly because stack offsets are negative.
> For the example above the 'off' is initialized as -16 and then
> is incremented by +8 giving final value of -8.
> And I need to select the minimal value used between several patterns.

so let's say I have two nocsr calls in the same subprog

    *(u64 *)(r10 - 8)  = r1;
    *(u64 *)(r10 - 16) = r2;
    call %[to_be_inlined]
    r2 = *(u64 *)(r10 - 16);
    r1 = *(u64 *)(r10 - 8);


and then a bit later


    *(u64 *)(r10 - 16)  = r1;
    *(u64 *)(r10 - 24) = r2;
    call %[to_be_inlined]
    r2 = *(u64 *)(r10 - 24);
    r1 = *(u64 *)(r10 - 16);


For the first chunk nocsr range is [-16, 0). For the second it's [-24,
-8), right?

Should `*(u64 *)(r10 - 8) = 123` somewhere in that subprog (not for
nocsr calls) invalidate this whole nocsr thing? With min you'll
basically have [-24, -8) as nocsr-reserved region, but shouldn't it be
whole [-24, 0)?

Clang shouldn't generate such inconsistent offset, right? But it's
legal, no? And if not, then all the calls should use the same upper
stack boundary and we shouldn't be doing min/max, but rather checking
that they are all consistent.

Or what am I missing again?

>
> >
> > > +}
>
> [...]
>
> > > @@ -20119,6 +20361,48 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> > >                         goto next_insn;
> > >                 if (insn->src_reg == BPF_PSEUDO_CALL)
> > >                         goto next_insn;
> > > +               /* Remove unnecessary spill/fill pairs, members of nocsr pattern */
> > > +               if (env->insn_aux_data[i + delta].nocsr_spills_num > 0) {
> > > +                       u32 j, spills_num = env->insn_aux_data[i + delta].nocsr_spills_num;
> > > +                       int err;
> > > +
> > > +                       /* don't apply this on a second visit */
> > > +                       env->insn_aux_data[i + delta].nocsr_spills_num = 0;
> > > +
> > > +                       /* check if spill/fill stack access is in expected offset range */
> > > +                       for (j = 1; j <= spills_num; ++j) {
> > > +                               if ((insn - j)->off >= subprogs[cur_subprog].nocsr_stack_off ||
> > > +                                   (insn + j)->off >= subprogs[cur_subprog].nocsr_stack_off) {
> > > +                                       /* do a second visit of this instruction,
> > > +                                        * so that verifier can inline it
> > > +                                        */
> > > +                                       i -= 1;
> > > +                                       insn -= 1;
> > > +                                       goto next_insn;
> > > +                               }
> > > +                       }
> >
> > I don't get this loop, can you elaborate? Why are we double-checking
> > anything here, didn't we do this already?
>
> We established probable patterns and probable minimal offset.
> Over the course of program verification we might have invalidated the
> .nocsr_stack_off for a particular subprogram => hence a need for this check.

Ah, right, we can invalidate it if we detect misuse, makes sense. Missed that.

>
> >
> > > +
> > > +                       /* apply the rewrite:
> > > +                        *   *(u64 *)(r10 - X) = rY ; num-times
> > > +                        *   call()                               -> call()
> > > +                        *   rY = *(u64 *)(r10 - X) ; num-times
> > > +                        */
> > > +                       err = verifier_remove_insns(env, i + delta - spills_num, spills_num);
> > > +                       if (err)
> > > +                               return err;
> > > +                       err = verifier_remove_insns(env, i + delta - spills_num + 1, spills_num);
> > > +                       if (err)
> > > +                               return err;
> >
> > why not a single bpf_patch_insn_data()?
>
> bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> Here I need to replace many instructions with a single instruction.
> I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.

ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I
thought that it does range for range replacement of instructions.
Never mind then (it's a bit sad that we don't have a proper flexible
and powerful patching primitive, though, but oh well).

>
> On the other hand, the do_jit() for x86 removes NOPs (BPF_JA +0),
> so I can probably replace spills/fills with NOPs here instead of
> calling bpf_patch_insn_data() or bpf_remove_insns().

NOPing feels uglier, I think what you have is fine.

>
> > > +
> > > +                       i += spills_num - 1;
> > > +                       /*   ^            ^   do a second visit of this instruction,
> > > +                        *   |            '-- so that verifier can inline it
> > > +                        *   '--------------- jump over deleted fills
> > > +                        */
> > > +                       delta -= 2 * spills_num;
> > > +                       insn = env->prog->insnsi + i + delta;
> > > +                       goto next_insn;
> >
> > why not adjust the state and just fall through, what goto next_insn
> > does that we can't (and next instruction is misleading, so I'd rather
> > fix up and move forward)
>
> I don't like this. The fall-through makes control flow more convoluted.
> To understand what would happen next:
> - with goto next_insn we just start over;
> - with fall-through we need to think about position of this particular
>   'if' statement within the loop.
>

Alright, it's fine. Though I don't see anything wrong with knowing
that we patch up nocsr pattern before we do inlining of calls. And
yes, order is important, so what? I always assumed that the order of
operations in do_misc_fixups() is not arbitrary.

> >
> > > +               }
> > >                 if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
> > >                         ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
> > >                         if (ret)
>
> [...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10  6:01       ` Andrii Nakryiko
@ 2024-07-10  7:57         ` Eduard Zingerman
  2024-07-10 15:36           ` Andrii Nakryiko
  0 siblings, 1 reply; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-10  7:57 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Tue, 2024-07-09 at 23:01 -0700, Andrii Nakryiko wrote:

[...]

> > Right, it should be '<', my bad, will update the comment.
> 
> See, I do read comments ;)

Yes, sorry about that remark.

[...]

> > > > +/* See comment for mark_nocsr_pattern_for_call() */
> > > > +static void check_nocsr_stack_contract(struct bpf_verifier_env *env, struct bpf_func_state *state,
> > > > +                                      int insn_idx, int off)
> > > > +{
> > > > +       struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
> > > > +       struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
> > > > +
> > > > +       if (subprog->nocsr_stack_off <= off || aux->nocsr_pattern)
> > > > +               return;
> > > 
> > > can helper call instruction go through this check? E.g., if we do
> > > bpf_probe_read_kernel() into stack slot, where do we check that that
> > > slot is not overlapping with nocsr spill/fill region?
> > 
> > In check_helper_call() we do check_mem_access() that eventually calls
> > one of the check_stack_{read,write}_{fixed,varying}_off().
> > The .access_size should be set for bpf_probe_read_kernel()
> > because it's argument base type is ARG_PTR_TO_MEM.
> > I will add a test case to double-check this.
> 
> Ok. Also as I was reading this I didn't yet realize that
> aux->nocsr_pattern is not set for call instruction itself, so I was
> worried that we might accidentally skip the check. But we don't set
> nocsr_pattern for call, so we should be good (but the test never
> hurts)

Well yes, there is a flag check, but call to bpf_probe_read_kernel()
touching candidate nocsr stack address should disable nocsr rewrites
for current subprogram.

[...]

> > > > +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> > > 
> > > you use u8 for get_helper_reg_mask() and u32 here, why not keep them consistent?
> > 
> > Ok
> > 
> > > similar to the naming nit above, I think we should be a bit more
> > > explicit with what "mask" actually means. Is this also clobber mask?
> > 
> > I mean, there is a comment right above the function.
> > This function returns a mask of caller saved registers (csr).
> > I'll make the name more explicit.
> > 
> > > 
> > > > +{
> > > > +       const struct bpf_func_proto *fn;
> > > > +
> > > > +       if (bpf_helper_call(insn) &&
> > > > +           (verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)) &&
> > > > +           get_helper_proto(env, insn->imm, &fn) == 0 &&
> > > > +           fn->allow_nocsr)
> > > > +               return ~get_helper_reg_mask(fn);
> > > 
> > > hm... I'm a bit confused why we do a negation here? aren't we working
> > > with clobbering mask... I'll keep reading for now.
> > 
> > Please read the comment before the function.
> > 
> 
> Believe it or not, but I read it like 10 times and that didn't help me much.
> 
> +/* If 'insn' is a call that follows no_caller_saved_registers contract
> + * and called function is inlined by current jit or verifier,
> + * return a mask with 1s corresponding to registers that are scratched
> + * by this call (depends on return type and number of return parameters).
> + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> 
> "registers that are scratched by this call" would be what
> get_helper_reg_mask() returns for the function (at least that's my
> reading of the above), and yet you return inverted mask. It doesn't
> really help that we return ALL_CALLER_SAVED_REGS as "nope, it's not
> nocsr call".

I see, I messed up the comment...

> Maybe it would be a bit easier to follow if call_csr_mask() returned
> bool and mask as an out parameter in case it's nocsr call. So instead
> of special casing ALL_CALLER_SAVED_REGS there would be a nice "not a
> nocsr, never mind" and early exit.

Out parameter seems to be an overkill. I'll change names a little bit
to make it easier to follow. If for v3 you'd still think that out
parameter is better I'll switch then.

> Anyways, perhaps I'm just being very dense here, I just found this
> particular piece extremely hard to follow intuitively.
> 
> > > 
> > > > +
> > > > +       return ALL_CALLER_SAVED_REGS;
> > > > +}
> > 
> > [...]
> > 
> > > > +static void mark_nocsr_pattern_for_call(struct bpf_verifier_env *env, int t)
> > > > +{
> > > > +       struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
> > > > +       struct bpf_subprog_info *subprog;
> > > > +       u32 csr_mask = call_csr_mask(env, &insns[t]);
> > > > +       u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;
> > > 
> > > tbh, I'm lost with all these bitmask and their inversions...
> > > call_csr_mask()'s result is basically always used inverted, so why not
> > > return inverted mask in the first place?
> > 
> > The mask is initialized as a set of all registers preserved by this call.
> 
> ok, maybe it's just a mix of "no csr" and "csr" that confuses me. How
> about we call call_csr_mask as get_helper_preserved_mask() or
> something like that to "match" get_helper_clobber_mask()?

Yes, these two names are good, thank you.
I'd still call it get_call_preserved_mask() as kfuncs might be added
at some point.

> > Those that are not in mask need a spill/fill pair.
> > I'll toss things around to make this more clear.
> > (naming, comments, maybe move the '| ~ALL_CALLER_SAVED_REGS' to the call_csr_mask()).
> > 
> > > 
> > > > +       int s, i;
> > > > +       s16 off;
> > > > +
> > > > +       if (csr_mask == ALL_CALLER_SAVED_REGS)
> > > > +               return;
> > > > +
> > > > +       for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
> > > > +               if (t - i < 0 || t + i >= env->prog->len)
> > > > +                       break;
> > > > +               stx = &insns[t - i];
> > > > +               ldx = &insns[t + i];
> > > > +               if (off == 0) {
> > > > +                       off = stx->off;
> > > > +                       if (off % BPF_REG_SIZE != 0)
> > > > +                               break;
> > > 
> > > kind of ugly that we assume stx before we actually checked that it's
> > > STX?... maybe split humongous if below into instruction checking
> > > (with code and src_reg) and then off checking separately?
> > 
> > Don't see anything ugly about this, tbh.
> 
> you are using stx->off and do `% BPF_REG_SIZE` check on it while that
> stx->off field might be meaningless for the instruction (because we
> are not yet sure it's STX instruction), that's a bit ugly, IMO

I can rewrite it like below:

		if (stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
		    ldx->code != (BPF_LDX | BPF_MEM | BPF_DW))
			break;
		off = off != 0 ?: stx->off; // or use full 'if' block from the old version
		if (stx->dst_reg != BPF_REG_10 ||
		    ldx->src_reg != BPF_REG_10 ||
		    /* check spill/fill for the same reg and offset */
		    stx->src_reg != ldx->dst_reg ||
		    stx->off != ldx->off ||
		    stx->off != off ||
		    (off % BPF_REG_SIZE) != 0 ||
		    /* this should be a previously unseen register */
		    (BIT(stx->src_reg) & reg_mask))
			break;

But I'm not sure this adds any actual clarity.

> 
> > Can split the 'if' statement, if you think it's hard to read.
> 
> it's not about readability for me
> 
> > 
> > > 
> > > > +               }
> > > > +               if (/* *(u64 *)(r10 - off) = r[0-5]? */
> > > > +                   stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> > > > +                   stx->dst_reg != BPF_REG_10 ||
> > > > +                   /* r[0-5] = *(u64 *)(r10 - off)? */
> > > > +                   ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
> > > > +                   ldx->src_reg != BPF_REG_10 ||
> > > > +                   /* check spill/fill for the same reg and offset */
> > > > +                   stx->src_reg != ldx->dst_reg ||
> > > > +                   stx->off != ldx->off ||
> > > > +                   stx->off != off ||
> > > > +                   /* this should be a previously unseen register */
> > > > +                   BIT(stx->src_reg) & reg_mask)
> > > 
> > > () around & operation?
> > 
> > No need, & has higher priority over ||.
> > You can check the AST using
> > https://tree-sitter.github.io/tree-sitter/playground .
> 
> Oh, I have no doubt you checked that it works correctly. It's just not
> a really good habit to rely on C's obscure operation ordering rules
> beyond A && B || C (and even then it is arguable). I think the rule of
> thumb to not mix bitwise and logic operators without explicit
> parenthesis makes sense.
> 
> But never mind, I already feel weird complaining about !strcmp(),
> every real kernel engineer should memorize operator precedence by
> heart.

I assumed you implied incorrect evaluation order.
Will add parenthesis.

> > > > +                       break;
> > > > +               reg_mask |= BIT(stx->src_reg);
> > > > +               env->insn_aux_data[t - i].nocsr_pattern = 1;
> > > > +               env->insn_aux_data[t + i].nocsr_pattern = 1;
> > > > +       }
> > > > +       if (i == 1)
> > > > +               return;
> > > > +       env->insn_aux_data[t].nocsr_spills_num = i - 1;
> > > > +       s = find_containing_subprog(env, t);
> > > > +       /* can't happen */
[...]
> > > > +       if (WARN_ON_ONCE(s < 0))
> > > > +               return;
> > > > +       subprog = &env->subprog_info[s];
> > > > +       subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
> > > 
> > > should this be max()? offsets are negative, right? so if nocsr uses -8
> > > and -16 as in the example, entire [-16, 0) region is nocsr region
> > 
> > This should be min exactly because stack offsets are negative.
> > For the example above the 'off' is initialized as -16 and then
> > is incremented by +8 giving final value of -8.
> > And I need to select the minimal value used between several patterns.
> 
> so let's say I have two nocsr calls in the same subprog
> 
>     *(u64 *)(r10 - 8)  = r1;
>     *(u64 *)(r10 - 16) = r2;
>     call %[to_be_inlined]
>     r2 = *(u64 *)(r10 - 16);
>     r1 = *(u64 *)(r10 - 8);
> 
> 
> and then a bit later
> 
> 
>     *(u64 *)(r10 - 16)  = r1;
>     *(u64 *)(r10 - 24) = r2;
>     call %[to_be_inlined]
>     r2 = *(u64 *)(r10 - 24);
>     r1 = *(u64 *)(r10 - 16);
> 
> 
> For the first chunk nocsr range is [-16, 0). For the second it's [-24,
> -8), right?
> Should `*(u64 *)(r10 - 8) = 123` somewhere in that subprog (not for
> nocsr calls) invalidate this whole nocsr thing? With min you'll
> basically have [-24, -8) as nocsr-reserved region, but shouldn't it be
> whole [-24, 0)?
> 
> Clang shouldn't generate such inconsistent offset, right? But it's
> legal, no? And if not, then all the calls should use the same upper
> stack boundary and we shouldn't be doing min/max, but rather checking
> that they are all consistent.
> 
> Or what am I missing again?

You don't miss anything.

With the current algorithm first call will not be rewritten because of
the offset check in do_misc_fixups().
The second call would be rewritten and this is safe to do, because
verifier knows that range [-24,-8) is only used by nocsr patterns.
What you suggest is slightly more pessimistic, compared to current
algorithm.

[...]

> > > > +
> > > > +                       /* apply the rewrite:
> > > > +                        *   *(u64 *)(r10 - X) = rY ; num-times
> > > > +                        *   call()                               -> call()
> > > > +                        *   rY = *(u64 *)(r10 - X) ; num-times
> > > > +                        */
> > > > +                       err = verifier_remove_insns(env, i + delta - spills_num, spills_num);
> > > > +                       if (err)
> > > > +                               return err;
> > > > +                       err = verifier_remove_insns(env, i + delta - spills_num + 1, spills_num);
> > > > +                       if (err)
> > > > +                               return err;
> > > 
> > > why not a single bpf_patch_insn_data()?
> > 
> > bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> > Here I need to replace many instructions with a single instruction.
> > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.
> 
> ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I
> thought that it does range for range replacement of instructions.
> Never mind then (it's a bit sad that we don't have a proper flexible
> and powerful patching primitive, though, but oh well).

That is true.
I'll think about it once other issues with this patch-set are resolved.

[...]

> > > > +
> > > > +                       i += spills_num - 1;
> > > > +                       /*   ^            ^   do a second visit of this instruction,
> > > > +                        *   |            '-- so that verifier can inline it
> > > > +                        *   '--------------- jump over deleted fills
> > > > +                        */
> > > > +                       delta -= 2 * spills_num;
> > > > +                       insn = env->prog->insnsi + i + delta;
> > > > +                       goto next_insn;
> > > 
> > > why not adjust the state and just fall through, what goto next_insn
> > > does that we can't (and next instruction is misleading, so I'd rather
> > > fix up and move forward)
> > 
> > I don't like this. The fall-through makes control flow more convoluted.
> > To understand what would happen next:
> > - with goto next_insn we just start over;
> > - with fall-through we need to think about position of this particular
> >   'if' statement within the loop.
> > 
> 
> Alright, it's fine. Though I don't see anything wrong with knowing
> that we patch up nocsr pattern before we do inlining of calls. And
> yes, order is important, so what? I always assumed that the order of
> operations in do_misc_fixups() is not arbitrary.

On a second thought maybe fall-through is not that bad.

[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-09 23:42   ` Andrii Nakryiko
  2024-07-10  3:00     ` Eduard Zingerman
@ 2024-07-10  9:46     ` Eduard Zingerman
  2024-07-10 15:23       ` Andrii Nakryiko
  1 sibling, 1 reply; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-10  9:46 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Tue, 2024-07-09 at 16:42 -0700, Andrii Nakryiko wrote:

[...]

> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 2b54e25d2364..735ae0901b3d 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -585,6 +585,15 @@ struct bpf_insn_aux_data {
> >          * accepts callback function as a parameter.
> >          */
> >         bool calls_callback;
> > +       /* true if STX or LDX instruction is a part of a spill/fill
> > +        * pattern for a no_caller_saved_registers call.
> > +        */
> > +       u8 nocsr_pattern:1;
> > +       /* for CALL instructions, a number of spill/fill pairs in the
> > +        * no_caller_saved_registers pattern.
> > +        */
> > +       u8 nocsr_spills_num:3;
> 
> despite bitfields this will extend bpf_insn_aux_data by 8 bytes. there
> are 2 bytes of padding after alu_state, let's put this there.
> 
> And let's not add bitfields unless absolutely necessary (this can be
> always done later).

Unfortunately the bitfields are still necessary, here is pahole output
after moving fields and removing bitfields:

struct bpf_insn_aux_data {
	...
	u8                         alu_state;            /*    62     1 */
	u8                         nocsr_pattern;        /*    63     1 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	u8                         nocsr_spills_num;     /*    64     1 */

	/* XXX 3 bytes hole, try to pack */

	unsigned int               orig_idx;             /*    68     4 */
	...

	/* size: 80, cachelines: 2, members: 20 */
	/* sum members: 73, holes: 1, sum holes: 3 */
	/* padding: 4 */
	/* last cacheline: 16 bytes */
};

While with bitfields:

        /* size: 72, cachelines: 2, members: 20 */

[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10  9:46     ` Eduard Zingerman
@ 2024-07-10 15:23       ` Andrii Nakryiko
  0 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-10 15:23 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Wed, Jul 10, 2024 at 2:46 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2024-07-09 at 16:42 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index 2b54e25d2364..735ae0901b3d 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> > > @@ -585,6 +585,15 @@ struct bpf_insn_aux_data {
> > >          * accepts callback function as a parameter.
> > >          */
> > >         bool calls_callback;
> > > +       /* true if STX or LDX instruction is a part of a spill/fill
> > > +        * pattern for a no_caller_saved_registers call.
> > > +        */
> > > +       u8 nocsr_pattern:1;
> > > +       /* for CALL instructions, a number of spill/fill pairs in the
> > > +        * no_caller_saved_registers pattern.
> > > +        */
> > > +       u8 nocsr_spills_num:3;
> >
> > despite bitfields this will extend bpf_insn_aux_data by 8 bytes. there
> > are 2 bytes of padding after alu_state, let's put this there.
> >
> > And let's not add bitfields unless absolutely necessary (this can be
> > always done later).
>
> Unfortunately the bitfields are still necessary, here is pahole output
> after moving fields and removing bitfields:

yep, if it's needed it's needed. My slightly older kernel has
alu_state at offset 61, so seems like we already added something
meanwhile.

>
> struct bpf_insn_aux_data {
>         ...
>         u8                         alu_state;            /*    62     1 */
>         u8                         nocsr_pattern;        /*    63     1 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         u8                         nocsr_spills_num;     /*    64     1 */
>
>         /* XXX 3 bytes hole, try to pack */
>
>         unsigned int               orig_idx;             /*    68     4 */
>         ...
>
>         /* size: 80, cachelines: 2, members: 20 */
>         /* sum members: 73, holes: 1, sum holes: 3 */
>         /* padding: 4 */
>         /* last cacheline: 16 bytes */
> };
>
> While with bitfields:
>
>         /* size: 72, cachelines: 2, members: 20 */
>
> [...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10  7:57         ` Eduard Zingerman
@ 2024-07-10 15:36           ` Andrii Nakryiko
  2024-07-10 16:15             ` Eduard Zingerman
  2024-07-10 19:01             ` Alexei Starovoitov
  0 siblings, 2 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-10 15:36 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Wed, Jul 10, 2024 at 12:57 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2024-07-09 at 23:01 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > > Right, it should be '<', my bad, will update the comment.
> >
> > See, I do read comments ;)
>
> Yes, sorry about that remark.
>

no worries :)

> [...]
>
> > > > > +/* See comment for mark_nocsr_pattern_for_call() */
> > > > > +static void check_nocsr_stack_contract(struct bpf_verifier_env *env, struct bpf_func_state *state,
> > > > > +                                      int insn_idx, int off)
> > > > > +{
> > > > > +       struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
> > > > > +       struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
> > > > > +
> > > > > +       if (subprog->nocsr_stack_off <= off || aux->nocsr_pattern)
> > > > > +               return;
> > > >
> > > > can helper call instruction go through this check? E.g., if we do
> > > > bpf_probe_read_kernel() into stack slot, where do we check that that
> > > > slot is not overlapping with nocsr spill/fill region?
> > >
> > > In check_helper_call() we do check_mem_access() that eventually calls
> > > one of the check_stack_{read,write}_{fixed,varying}_off().
> > > The .access_size should be set for bpf_probe_read_kernel()
> > > because it's argument base type is ARG_PTR_TO_MEM.
> > > I will add a test case to double-check this.
> >
> > Ok. Also as I was reading this I didn't yet realize that
> > aux->nocsr_pattern is not set for call instruction itself, so I was
> > worried that we might accidentally skip the check. But we don't set
> > nocsr_pattern for call, so we should be good (but the test never
> > hurts)
>
> Well yes, there is a flag check, but call to bpf_probe_read_kernel()
> touching candidate nocsr stack address should disable nocsr rewrites
> for current subprogram.

cool, let's have a test confirming correctness

>
> [...]
>
> > > > > +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> > > >
> > > > you use u8 for get_helper_reg_mask() and u32 here, why not keep them consistent?
> > >
> > > Ok
> > >
> > > > similar to the naming nit above, I think we should be a bit more
> > > > explicit with what "mask" actually means. Is this also clobber mask?
> > >
> > > I mean, there is a comment right above the function.
> > > This function returns a mask of caller saved registers (csr).
> > > I'll make the name more explicit.
> > >
> > > >
> > > > > +{
> > > > > +       const struct bpf_func_proto *fn;
> > > > > +
> > > > > +       if (bpf_helper_call(insn) &&
> > > > > +           (verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)) &&
> > > > > +           get_helper_proto(env, insn->imm, &fn) == 0 &&
> > > > > +           fn->allow_nocsr)
> > > > > +               return ~get_helper_reg_mask(fn);
> > > >
> > > > hm... I'm a bit confused why we do a negation here? aren't we working
> > > > with clobbering mask... I'll keep reading for now.
> > >
> > > Please read the comment before the function.
> > >
> >
> > Believe it or not, but I read it like 10 times and that didn't help me much.
> >
> > +/* If 'insn' is a call that follows no_caller_saved_registers contract
> > + * and called function is inlined by current jit or verifier,
> > + * return a mask with 1s corresponding to registers that are scratched
> > + * by this call (depends on return type and number of return parameters).
> > + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> >
> > "registers that are scratched by this call" would be what
> > get_helper_reg_mask() returns for the function (at least that's my
> > reading of the above), and yet you return inverted mask. It doesn't
> > really help that we return ALL_CALLER_SAVED_REGS as "nope, it's not
> > nocsr call".
>
> I see, I messed up the comment...
>
> > Maybe it would be a bit easier to follow if call_csr_mask() returned
> > bool and mask as an out parameter in case it's nocsr call. So instead
> > of special casing ALL_CALLER_SAVED_REGS there would be a nice "not a
> > nocsr, never mind" and early exit.
>
> Out parameter seems to be an overkill. I'll change names a little bit
> to make it easier to follow. If for v3 you'd still think that out
> parameter is better I'll switch then.

ok

>
> > Anyways, perhaps I'm just being very dense here, I just found this
> > particular piece extremely hard to follow intuitively.
> >
> > > >
> > > > > +
> > > > > +       return ALL_CALLER_SAVED_REGS;
> > > > > +}
> > >
> > > [...]
> > >
> > > > > +static void mark_nocsr_pattern_for_call(struct bpf_verifier_env *env, int t)
> > > > > +{
> > > > > +       struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
> > > > > +       struct bpf_subprog_info *subprog;
> > > > > +       u32 csr_mask = call_csr_mask(env, &insns[t]);
> > > > > +       u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;
> > > >
> > > > tbh, I'm lost with all these bitmask and their inversions...
> > > > call_csr_mask()'s result is basically always used inverted, so why not
> > > > return inverted mask in the first place?
> > >
> > > The mask is initialized as a set of all registers preserved by this call.
> >
> > ok, maybe it's just a mix of "no csr" and "csr" that confuses me. How
> > about we call call_csr_mask as get_helper_preserved_mask() or
> > something like that to "match" get_helper_clobber_mask()?
>
> Yes, these two names are good, thank you.
> I'd still call it get_call_preserved_mask() as kfuncs might be added
> at some point.

sure, makes sense

>
> > > Those that are not in mask need a spill/fill pair.
> > > I'll toss things around to make this more clear.
> > > (naming, comments, maybe move the '| ~ALL_CALLER_SAVED_REGS' to the call_csr_mask()).
> > >
> > > >
> > > > > +       int s, i;
> > > > > +       s16 off;
> > > > > +
> > > > > +       if (csr_mask == ALL_CALLER_SAVED_REGS)
> > > > > +               return;
> > > > > +
> > > > > +       for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
> > > > > +               if (t - i < 0 || t + i >= env->prog->len)
> > > > > +                       break;
> > > > > +               stx = &insns[t - i];
> > > > > +               ldx = &insns[t + i];
> > > > > +               if (off == 0) {
> > > > > +                       off = stx->off;
> > > > > +                       if (off % BPF_REG_SIZE != 0)
> > > > > +                               break;
> > > >
> > > > kind of ugly that we assume stx before we actually checked that it's
> > > > STX?... maybe split humongous if below into instruction checking
> > > > (with code and src_reg) and then off checking separately?
> > >
> > > Don't see anything ugly about this, tbh.
> >
> > you are using stx->off and do `% BPF_REG_SIZE` check on it while that
> > stx->off field might be meaningless for the instruction (because we
> > are not yet sure it's STX instruction), that's a bit ugly, IMO
>
> I can rewrite it like below:
>
>                 if (stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
>                     ldx->code != (BPF_LDX | BPF_MEM | BPF_DW))

I'd add stx->dst_reg != BPF_REG_10 and ldx->src_reg != BPF_REG_10
checks here and preserve original comments with instruction assembly
form.

(think about this as establishing that we are looking at the correct
shapes of instructions)

>                         break;
>                 off = off != 0 ?: stx->off; // or use full 'if' block from the old version
>                 if (stx->dst_reg != BPF_REG_10 ||
>                     ldx->src_reg != BPF_REG_10 ||
>                     /* check spill/fill for the same reg and offset */
>                     stx->src_reg != ldx->dst_reg ||
>                     stx->off != ldx->off ||
>                     stx->off != off ||
>                     (off % BPF_REG_SIZE) != 0 ||
>                     /* this should be a previously unseen register */
>                     (BIT(stx->src_reg) & reg_mask))

and here we are checking all the correctness and additional imposed
semantical invariants knowing full well that we are dealing with
correct STX/LDX shapes

>                         break;
>
> But I'm not sure this adds any actual clarity.

I think it makes sense.

>
> >
> > > Can split the 'if' statement, if you think it's hard to read.
> >
> > it's not about readability for me
> >
> > >
> > > >
> > > > > +               }
> > > > > +               if (/* *(u64 *)(r10 - off) = r[0-5]? */
> > > > > +                   stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> > > > > +                   stx->dst_reg != BPF_REG_10 ||
> > > > > +                   /* r[0-5] = *(u64 *)(r10 - off)? */
> > > > > +                   ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
> > > > > +                   ldx->src_reg != BPF_REG_10 ||
> > > > > +                   /* check spill/fill for the same reg and offset */
> > > > > +                   stx->src_reg != ldx->dst_reg ||
> > > > > +                   stx->off != ldx->off ||
> > > > > +                   stx->off != off ||
> > > > > +                   /* this should be a previously unseen register */
> > > > > +                   BIT(stx->src_reg) & reg_mask)
> > > >
> > > > () around & operation?
> > >
> > > No need, & has higher priority over ||.
> > > You can check the AST using
> > > https://tree-sitter.github.io/tree-sitter/playground .
> >
> > Oh, I have no doubt you checked that it works correctly. It's just not
> > a really good habit to rely on C's obscure operation ordering rules
> > beyond A && B || C (and even then it is arguable). I think the rule of
> > thumb to not mix bitwise and logic operators without explicit
> > parenthesis makes sense.
> >
> > But never mind, I already feel weird complaining about !strcmp(),
> > every real kernel engineer should memorize operator precedence by
> > heart.
>
> I assumed you implied incorrect evaluation order.
> Will add parenthesis.
>

thanks

> > > > > +                       break;
> > > > > +               reg_mask |= BIT(stx->src_reg);
> > > > > +               env->insn_aux_data[t - i].nocsr_pattern = 1;
> > > > > +               env->insn_aux_data[t + i].nocsr_pattern = 1;
> > > > > +       }
> > > > > +       if (i == 1)
> > > > > +               return;
> > > > > +       env->insn_aux_data[t].nocsr_spills_num = i - 1;
> > > > > +       s = find_containing_subprog(env, t);
> > > > > +       /* can't happen */
> [...]
> > > > > +       if (WARN_ON_ONCE(s < 0))
> > > > > +               return;
> > > > > +       subprog = &env->subprog_info[s];
> > > > > +       subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
> > > >
> > > > should this be max()? offsets are negative, right? so if nocsr uses -8
> > > > and -16 as in the example, entire [-16, 0) region is nocsr region
> > >
> > > This should be min exactly because stack offsets are negative.
> > > For the example above the 'off' is initialized as -16 and then
> > > is incremented by +8 giving final value of -8.
> > > And I need to select the minimal value used between several patterns.
> >
> > so let's say I have two nocsr calls in the same subprog
> >
> >     *(u64 *)(r10 - 8)  = r1;
> >     *(u64 *)(r10 - 16) = r2;
> >     call %[to_be_inlined]
> >     r2 = *(u64 *)(r10 - 16);
> >     r1 = *(u64 *)(r10 - 8);
> >
> >
> > and then a bit later
> >
> >
> >     *(u64 *)(r10 - 16)  = r1;
> >     *(u64 *)(r10 - 24) = r2;
> >     call %[to_be_inlined]
> >     r2 = *(u64 *)(r10 - 24);
> >     r1 = *(u64 *)(r10 - 16);
> >
> >
> > For the first chunk nocsr range is [-16, 0). For the second it's [-24,
> > -8), right?
> > Should `*(u64 *)(r10 - 8) = 123` somewhere in that subprog (not for
> > nocsr calls) invalidate this whole nocsr thing? With min you'll
> > basically have [-24, -8) as nocsr-reserved region, but shouldn't it be
> > whole [-24, 0)?
> >
> > Clang shouldn't generate such inconsistent offset, right? But it's
> > legal, no? And if not, then all the calls should use the same upper
> > stack boundary and we shouldn't be doing min/max, but rather checking
> > that they are all consistent.
> >
> > Or what am I missing again?
>
> You don't miss anything.
>
> With the current algorithm first call will not be rewritten because of
> the offset check in do_misc_fixups().
> The second call would be rewritten and this is safe to do, because
> verifier knows that range [-24,-8) is only used by nocsr patterns.
> What you suggest is slightly more pessimistic, compared to current
> algorithm.

Ok, I see, this wasn't obvious that you want this behavior. I actually
find it surprising (and so at least let's call this possibility out).

But, tbh, I'd make this stricter. I'd dictate that within a subprogram
all no_csr patterns *should* end with the same stack offset and I
would check for that (so what I mentioned before that instead of min
or max we need assignment and check for equality if we already
assigned it once).

Also, instead of doing that extra nocsr offset check in
do_misc_fixups(), why don't we just reset all no_csr patterns within a
subprogram *if we find a single violation*. Compiler shouldn't ever
emit such code, right? So let's be strict and fallback to not
recognizing nocsr.

And then we won't need that extra check in do_misc_fixups() because we
eagerly unset no_csr flag and will never hit that piece of logic in
patching.


I'd go even further and say that on any nocsr invariant violation, we
just go and reset all nocsr pattern flags across entire BPF program
(all subprogs including). In check_nocsr_stack_contract() I mean. Just
have a loop over all instructions and set that flag to false.

Why not? What realistic application would need to violate nocsr in
some subprogs but not in others?

KISS or it's too simplistic for some reason?

>
> [...]
>
> > > > > +
> > > > > +                       /* apply the rewrite:
> > > > > +                        *   *(u64 *)(r10 - X) = rY ; num-times
> > > > > +                        *   call()                               -> call()
> > > > > +                        *   rY = *(u64 *)(r10 - X) ; num-times
> > > > > +                        */
> > > > > +                       err = verifier_remove_insns(env, i + delta - spills_num, spills_num);
> > > > > +                       if (err)
> > > > > +                               return err;
> > > > > +                       err = verifier_remove_insns(env, i + delta - spills_num + 1, spills_num);
> > > > > +                       if (err)
> > > > > +                               return err;
> > > >
> > > > why not a single bpf_patch_insn_data()?
> > >
> > > bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> > > Here I need to replace many instructions with a single instruction.
> > > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.
> >
> > ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I
> > thought that it does range for range replacement of instructions.
> > Never mind then (it's a bit sad that we don't have a proper flexible
> > and powerful patching primitive, though, but oh well).
>
> That is true.
> I'll think about it once other issues with this patch-set are resolved.

yeah, it's completely unrelated, but having a bpf_patch_insns that
takes input instruction range to be replaced and replacing it with
another set of instructions would cover all existing use cases and
some more. We wouldn't need verifier_remove_insns(), because that's
just replacing existing instructions with empty new set of
instructions. We would now have "insert instructions" primitive if we
specify empty input range of instructions. If we add a flag whether to
adjust jump offsets and make it configurable, we'd need the thing that
Alexei needed for may_goto without any extra hacks.

Furthermore, I find it wrong and silly that we keep having manual
delta+insn adjustments in every single piece of patching logic. I
think this should be done by bpf_patch_insns(). We need to have a
small "insn_patcher" struct that we use during instruction patching,
and a small instruction iterator on top that would keep returning next
instruction to process (and its index, probably). This will allow us
to optimize patching and instead of doing O(N) we can have something
faster and smarter (if we hide direct insn array accesses during
patching).

But this is all a completely separate story from all of this.

>
> [...]
>
> > > > > +
> > > > > +                       i += spills_num - 1;
> > > > > +                       /*   ^            ^   do a second visit of this instruction,
> > > > > +                        *   |            '-- so that verifier can inline it
> > > > > +                        *   '--------------- jump over deleted fills
> > > > > +                        */
> > > > > +                       delta -= 2 * spills_num;
> > > > > +                       insn = env->prog->insnsi + i + delta;
> > > > > +                       goto next_insn;
> > > >
> > > > why not adjust the state and just fall through, what goto next_insn
> > > > does that we can't (and next instruction is misleading, so I'd rather
> > > > fix up and move forward)
> > >
> > > I don't like this. The fall-through makes control flow more convoluted.
> > > To understand what would happen next:
> > > - with goto next_insn we just start over;
> > > - with fall-through we need to think about position of this particular
> > >   'if' statement within the loop.
> > >
> >
> > Alright, it's fine. Though I don't see anything wrong with knowing
> > that we patch up nocsr pattern before we do inlining of calls. And
> > yes, order is important, so what? I always assumed that the order of
> > operations in do_misc_fixups() is not arbitrary.
>
> On a second thought maybe fall-through is not that bad.
>
> [...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10 15:36           ` Andrii Nakryiko
@ 2024-07-10 16:15             ` Eduard Zingerman
  2024-07-10 17:50               ` Andrii Nakryiko
  2024-07-10 19:01             ` Alexei Starovoitov
  1 sibling, 1 reply; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-10 16:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Wed, 2024-07-10 at 08:36 -0700, Andrii Nakryiko wrote:

[...]

> > I can rewrite it like below:
> > 
> >                 if (stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> >                     ldx->code != (BPF_LDX | BPF_MEM | BPF_DW))
> 
> I'd add stx->dst_reg != BPF_REG_10 and ldx->src_reg != BPF_REG_10
> checks here and preserve original comments with instruction assembly
> form.
> 
> (think about this as establishing that we are looking at the correct
> shapes of instructions)
> 
> >                         break;
> >                 off = off != 0 ?: stx->off; // or use full 'if' block from the old version
> >                 if (stx->dst_reg != BPF_REG_10 ||
> >                     ldx->src_reg != BPF_REG_10 ||
> >                     /* check spill/fill for the same reg and offset */
> >                     stx->src_reg != ldx->dst_reg ||
> >                     stx->off != ldx->off ||
> >                     stx->off != off ||
> >                     (off % BPF_REG_SIZE) != 0 ||
> >                     /* this should be a previously unseen register */
> >                     (BIT(stx->src_reg) & reg_mask))
> 
> and here we are checking all the correctness and additional imposed
> semantical invariants knowing full well that we are dealing with
> correct STX/LDX shapes
> 
> >                         break;
> > 
> > But I'm not sure this adds any actual clarity.
> 
> I think it makes sense.

Ok, will change.

[...]

> Ok, I see, this wasn't obvious that you want this behavior. I actually
> find it surprising (and so at least let's call this possibility out).
> 
> But, tbh, I'd make this stricter. I'd dictate that within a subprogram
> all no_csr patterns *should* end with the same stack offset and I
> would check for that (so what I mentioned before that instead of min
> or max we need assignment and check for equality if we already
> assigned it once).

This makes sense, but does not cover a theoretical corner case:
- suppose there are two nocsr functions A and B on the kernel side,
  but only was A marked as nocsr during program compilation;
- the spill/fill "bracket" was generated for a call to function B by
  the compiler (because this is just a valid codegen).

So I wanted to keep the nocsr check a little bit more permissive.

> Also, instead of doing that extra nocsr offset check in
> do_misc_fixups(), why don't we just reset all no_csr patterns within a
> subprogram *if we find a single violation*. Compiler shouldn't ever
> emit such code, right? So let's be strict and fallback to not
> recognizing nocsr.
> 
> And then we won't need that extra check in do_misc_fixups() because we
> eagerly unset no_csr flag and will never hit that piece of logic in
> patching.

I can do that, but the detector pass would have to be two pass:
- on the first pass, find the nocsr_stack_off, add candidate insn marks;
- on the second pass, remove marks from insns with wrong stack access offset.

Otherwise I can't discern true patterns from false positives in
situation described above.

> I'd go even further and say that on any nocsr invariant violation, we
> just go and reset all nocsr pattern flags across entire BPF program
> (all subprogs including). In check_nocsr_stack_contract() I mean. Just
> have a loop over all instructions and set that flag to false.
> 
> Why not? What realistic application would need to violate nocsr in
> some subprogs but not in others?
> 
> KISS or it's too simplistic for some reason?

I can invalidate nocsr per-program.
If so, I would use an "allow_nocsr" flag in program aux to avoid
additional passes over instructions.

[...]

> > > > bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> > > > Here I need to replace many instructions with a single instruction.
> > > > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.
> > > 
> > > ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I
> > > thought that it does range for range replacement of instructions.
> > > Never mind then (it's a bit sad that we don't have a proper flexible
> > > and powerful patching primitive, though, but oh well).
> > 
> > That is true.
> > I'll think about it once other issues with this patch-set are resolved.
> 
> yeah, it's completely unrelated, but having a bpf_patch_insns that
> takes input instruction range to be replaced and replacing it with
> another set of instructions would cover all existing use cases and
> some more. We wouldn't need verifier_remove_insns(), because that's
> just replacing existing instructions with empty new set of
> instructions. We would now have "insert instructions" primitive if we
> specify empty input range of instructions. If we add a flag whether to
> adjust jump offsets and make it configurable, we'd need the thing that
> Alexei needed for may_goto without any extra hacks.
> 
> Furthermore, I find it wrong and silly that we keep having manual
> delta+insn adjustments in every single piece of patching logic. I
> think this should be done by bpf_patch_insns(). We need to have a
> small "insn_patcher" struct that we use during instruction patching,
> and a small instruction iterator on top that would keep returning next
> instruction to process (and its index, probably). This will allow us
> to optimize patching and instead of doing O(N) we can have something
> faster and smarter (if we hide direct insn array accesses during
> patching).
> 
> But this is all a completely separate story from all of this.

I actually suggested something along these lines very long time ago:
https://lore.kernel.org/bpf/5c297e5009bcd4f0becf59ccd97cfd82bb3a49ec.camel@gmail.com/

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10 16:15             ` Eduard Zingerman
@ 2024-07-10 17:50               ` Andrii Nakryiko
  2024-07-10 18:40                 ` Eduard Zingerman
  0 siblings, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-10 17:50 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Wed, Jul 10, 2024 at 9:15 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2024-07-10 at 08:36 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > > I can rewrite it like below:
> > >
> > >                 if (stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> > >                     ldx->code != (BPF_LDX | BPF_MEM | BPF_DW))
> >
> > I'd add stx->dst_reg != BPF_REG_10 and ldx->src_reg != BPF_REG_10
> > checks here and preserve original comments with instruction assembly
> > form.
> >
> > (think about this as establishing that we are looking at the correct
> > shapes of instructions)
> >
> > >                         break;
> > >                 off = off != 0 ?: stx->off; // or use full 'if' block from the old version
> > >                 if (stx->dst_reg != BPF_REG_10 ||
> > >                     ldx->src_reg != BPF_REG_10 ||
> > >                     /* check spill/fill for the same reg and offset */
> > >                     stx->src_reg != ldx->dst_reg ||
> > >                     stx->off != ldx->off ||
> > >                     stx->off != off ||
> > >                     (off % BPF_REG_SIZE) != 0 ||
> > >                     /* this should be a previously unseen register */
> > >                     (BIT(stx->src_reg) & reg_mask))
> >
> > and here we are checking all the correctness and additional imposed
> > semantical invariants knowing full well that we are dealing with
> > correct STX/LDX shapes
> >
> > >                         break;
> > >
> > > But I'm not sure this adds any actual clarity.
> >
> > I think it makes sense.
>
> Ok, will change.
>
> [...]
>
> > Ok, I see, this wasn't obvious that you want this behavior. I actually
> > find it surprising (and so at least let's call this possibility out).
> >
> > But, tbh, I'd make this stricter. I'd dictate that within a subprogram
> > all no_csr patterns *should* end with the same stack offset and I
> > would check for that (so what I mentioned before that instead of min
> > or max we need assignment and check for equality if we already
> > assigned it once).
>
> This makes sense, but does not cover a theoretical corner case:
> - suppose there are two nocsr functions A and B on the kernel side,
>   but only was A marked as nocsr during program compilation;
> - the spill/fill "bracket" was generated for a call to function B by
>   the compiler (because this is just a valid codegen).

In this case the call to A shouldn't be changing nocsr_offset at all,
though? You should find no spill/fill and thus even though A is
*allowed* to use no_csr, it doesn't, so it should have no effect on
nocsr offsets, no?

But your example actually made me think about another (not theoretical
at all) case. What if we have calls to A and B, the kernel is slightly
old and knows that B is nocsr, but A is not. But the BPF program was
compiled with the latest helper/kfunc definitions marking A as no_csr
eligible (as well as B). (btw, THAT'S THE WORD for allow_csr --
ELIGIBLE. csr_eligible FTW! but I digress...)

With the case above we'll disable nocsr for both A and B, right? That
sucks, but not sure if we can do anything about that. (We can probably
assume no_csr pattern and thus allow spill/fill and not disable nocsr
in general, but not remove spill/fills... a bit more complication
upfront for longer term extensibility.. not sure, maybe performance
regression is a fine price, hmm)

So I take it back about unmarking csr in the *entire BPF program*,
let's just limit it to the subprog scope. But I still think we should
do it eagerly, rather than double checking in do_misc_followups().
WDYT?

>
> So I wanted to keep the nocsr check a little bit more permissive.
>
> > Also, instead of doing that extra nocsr offset check in
> > do_misc_fixups(), why don't we just reset all no_csr patterns within a
> > subprogram *if we find a single violation*. Compiler shouldn't ever
> > emit such code, right? So let's be strict and fallback to not
> > recognizing nocsr.
> >
> > And then we won't need that extra check in do_misc_fixups() because we
> > eagerly unset no_csr flag and will never hit that piece of logic in
> > patching.
>
> I can do that, but the detector pass would have to be two pass:
> - on the first pass, find the nocsr_stack_off, add candidate insn marks;
> - on the second pass, remove marks from insns with wrong stack access offset.

It's not really a second pass, it's part of normal validation.
check_nocsr_stack_contract() will detect this and will do, yes, pass
over all instructions of a subprogram to unmark them.

>
> Otherwise I can't discern true patterns from false positives in
> situation described above.

See above, I might be missing what your "theoretical" example changes.

>
> > I'd go even further and say that on any nocsr invariant violation, we
> > just go and reset all nocsr pattern flags across entire BPF program
> > (all subprogs including). In check_nocsr_stack_contract() I mean. Just
> > have a loop over all instructions and set that flag to false.
> >
> > Why not? What realistic application would need to violate nocsr in
> > some subprogs but not in others?
> >
> > KISS or it's too simplistic for some reason?
>
> I can invalidate nocsr per-program.
> If so, I would use an "allow_nocsr" flag in program aux to avoid
> additional passes over instructions.

yep, see above (but let's keep it per-subprog for now)

>
> [...]
>
> > > > > bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> > > > > Here I need to replace many instructions with a single instruction.
> > > > > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.
> > > >
> > > > ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I
> > > > thought that it does range for range replacement of instructions.
> > > > Never mind then (it's a bit sad that we don't have a proper flexible
> > > > and powerful patching primitive, though, but oh well).
> > >
> > > That is true.
> > > I'll think about it once other issues with this patch-set are resolved.
> >
> > yeah, it's completely unrelated, but having a bpf_patch_insns that
> > takes input instruction range to be replaced and replacing it with
> > another set of instructions would cover all existing use cases and
> > some more. We wouldn't need verifier_remove_insns(), because that's
> > just replacing existing instructions with empty new set of
> > instructions. We would now have "insert instructions" primitive if we
> > specify empty input range of instructions. If we add a flag whether to
> > adjust jump offsets and make it configurable, we'd need the thing that
> > Alexei needed for may_goto without any extra hacks.
> >
> > Furthermore, I find it wrong and silly that we keep having manual
> > delta+insn adjustments in every single piece of patching logic. I
> > think this should be done by bpf_patch_insns(). We need to have a
> > small "insn_patcher" struct that we use during instruction patching,
> > and a small instruction iterator on top that would keep returning next
> > instruction to process (and its index, probably). This will allow us
> > to optimize patching and instead of doing O(N) we can have something
> > faster and smarter (if we hide direct insn array accesses during
> > patching).
> >
> > But this is all a completely separate story from all of this.
>
> I actually suggested something along these lines very long time ago:
> https://lore.kernel.org/bpf/5c297e5009bcd4f0becf59ccd97cfd82bb3a49ec.camel@gmail.com/

We had a few discussions around patching improvements (you can see I
mentioned that in that thread as well). We all like to talk about
improving this, but not really doing anything about this. :) Tech debt
at its best.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10 17:50               ` Andrii Nakryiko
@ 2024-07-10 18:40                 ` Eduard Zingerman
  2024-07-10 18:49                   ` Andrii Nakryiko
  2024-07-10 19:07                   ` Alexei Starovoitov
  0 siblings, 2 replies; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-10 18:40 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Wed, 2024-07-10 at 10:50 -0700, Andrii Nakryiko wrote:

[...]

> > > Ok, I see, this wasn't obvious that you want this behavior. I actually
> > > find it surprising (and so at least let's call this possibility out).
> > > 
> > > But, tbh, I'd make this stricter. I'd dictate that within a subprogram
> > > all no_csr patterns *should* end with the same stack offset and I
> > > would check for that (so what I mentioned before that instead of min
> > > or max we need assignment and check for equality if we already
> > > assigned it once).
> > 
> > This makes sense, but does not cover a theoretical corner case:
> > - suppose there are two nocsr functions A and B on the kernel side,
> >   but only was A marked as nocsr during program compilation;
> > - the spill/fill "bracket" was generated for a call to function B by
> >   the compiler (because this is just a valid codegen).
> 
> In this case the call to A shouldn't be changing nocsr_offset at all,
> though? You should find no spill/fill and thus even though A is
> *allowed* to use no_csr, it doesn't, so it should have no effect on
> nocsr offsets, no?

Consider the following code:

    *(u64 *)(r10 - 24) = r1;
    call A                     // kernel and clang know that A is nocsr
    r1 = *(u64 *)(r10 - 24);   // kernel assumes .nocsr_stack_offset -24
    ...
    *(u64 *)(r10 - 16) = r1;
    call B                     // only kernel knows that B is nocsr
    r1 = *(u64 *)(r10 - 16);   // with stricter logic this would disable
                               // nocsr for the whole sub-probram

> But your example actually made me think about another (not theoretical
> at all) case. What if we have calls to A and B, the kernel is slightly
> old and knows that B is nocsr, but A is not. But the BPF program was
> compiled with the latest helper/kfunc definitions marking A as no_csr
> eligible (as well as B). (btw, THAT'S THE WORD for allow_csr --
> ELIGIBLE. csr_eligible FTW! but I digress...)
> 
> With the case above we'll disable nocsr for both A and B, right? That
> sucks, but not sure if we can do anything about that. (We can probably
> assume no_csr pattern and thus allow spill/fill and not disable nocsr
> in general, but not remove spill/fills... a bit more complication
> upfront for longer term extensibility.. not sure, maybe performance
> regression is a fine price, hmm)
> 
> So I take it back about unmarking csr in the *entire BPF program*,
> let's just limit it to the subprog scope. But I still think we should
> do it eagerly, rather than double checking in do_misc_followups().
> WDYT?

With current algo this situation would disable nocsr indeed.
The problem is that checks for spilled stack slots usage is too simplistic.
However, it could be covered if the check is performed using e.g.
process I described earlier:
- for spill, remember the defining instruction in the stack slot structure;
- for fill, "merge" the defining instruction index;
- for other stack access mark defining instruction as escaping.

> > So I wanted to keep the nocsr check a little bit more permissive.
> > 
> > > Also, instead of doing that extra nocsr offset check in
> > > do_misc_fixups(), why don't we just reset all no_csr patterns within a
> > > subprogram *if we find a single violation*. Compiler shouldn't ever
> > > emit such code, right? So let's be strict and fallback to not
> > > recognizing nocsr.
> > > 
> > > And then we won't need that extra check in do_misc_fixups() because we
> > > eagerly unset no_csr flag and will never hit that piece of logic in
> > > patching.
> > 
> > I can do that, but the detector pass would have to be two pass:
> > - on the first pass, find the nocsr_stack_off, add candidate insn marks;
> > - on the second pass, remove marks from insns with wrong stack access offset.
> 
> It's not really a second pass, it's part of normal validation.
> check_nocsr_stack_contract() will detect this and will do, yes, pass
> over all instructions of a subprogram to unmark them.

- on the first pass true .nocsr_stack_off is not yet known,
  so .nocsr_pattern is set optimistically;
- on the second pass .nocsr_stack_off is already known,
  so .nocsr_pattern can be removed from spills/fills outside the range;
- check_nocsr_stack_contract() does not need to scan full sub-program
  on each violation, it can set a flag disabling nocsr in subprogram info.

> > Otherwise I can't discern true patterns from false positives in
> > situation described above.
> 
> See above, I might be missing what your "theoretical" example changes.
> 
> > 
> > > I'd go even further and say that on any nocsr invariant violation, we
> > > just go and reset all nocsr pattern flags across entire BPF program
> > > (all subprogs including). In check_nocsr_stack_contract() I mean. Just
> > > have a loop over all instructions and set that flag to false.
> > > 
> > > Why not? What realistic application would need to violate nocsr in
> > > some subprogs but not in others?
> > > 
> > > KISS or it's too simplistic for some reason?
> > 
> > I can invalidate nocsr per-program.
> > If so, I would use an "allow_nocsr" flag in program aux to avoid
> > additional passes over instructions.
> 
> yep, see above (but let's keep it per-subprog for now)

Ok

> > > > > > bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> > > > > > Here I need to replace many instructions with a single instruction.
> > > > > > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.
> > > > > 
> > > > > ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I
> > > > > thought that it does range for range replacement of instructions.
> > > > > Never mind then (it's a bit sad that we don't have a proper flexible
> > > > > and powerful patching primitive, though, but oh well).
> > > > 
> > > > That is true.
> > > > I'll think about it once other issues with this patch-set are resolved.
> > > 
> > > yeah, it's completely unrelated, but having a bpf_patch_insns that
> > > takes input instruction range to be replaced and replacing it with
> > > another set of instructions would cover all existing use cases and
> > > some more. We wouldn't need verifier_remove_insns(), because that's
> > > just replacing existing instructions with empty new set of
> > > instructions. We would now have "insert instructions" primitive if we
> > > specify empty input range of instructions. If we add a flag whether to
> > > adjust jump offsets and make it configurable, we'd need the thing that
> > > Alexei needed for may_goto without any extra hacks.
> > > 
> > > Furthermore, I find it wrong and silly that we keep having manual
> > > delta+insn adjustments in every single piece of patching logic. I
> > > think this should be done by bpf_patch_insns(). We need to have a
> > > small "insn_patcher" struct that we use during instruction patching,
> > > and a small instruction iterator on top that would keep returning next
> > > instruction to process (and its index, probably). This will allow us
> > > to optimize patching and instead of doing O(N) we can have something
> > > faster and smarter (if we hide direct insn array accesses during
> > > patching).
> > > 
> > > But this is all a completely separate story from all of this.
> > 
> > I actually suggested something along these lines very long time ago:
> > https://lore.kernel.org/bpf/5c297e5009bcd4f0becf59ccd97cfd82bb3a49ec.camel@gmail.com/
> 
> We had a few discussions around patching improvements (you can see I
> mentioned that in that thread as well). We all like to talk about
> improving this, but not really doing anything about this. :) Tech debt
> at its best.

If you are not completely opposed to that idea I can follow-up with a
complete version as an RFC.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10 18:40                 ` Eduard Zingerman
@ 2024-07-10 18:49                   ` Andrii Nakryiko
  2024-07-10 19:03                     ` Eduard Zingerman
  2024-07-10 19:07                   ` Alexei Starovoitov
  1 sibling, 1 reply; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-10 18:49 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Wed, Jul 10, 2024 at 11:41 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2024-07-10 at 10:50 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > > > Ok, I see, this wasn't obvious that you want this behavior. I actually
> > > > find it surprising (and so at least let's call this possibility out).
> > > >
> > > > But, tbh, I'd make this stricter. I'd dictate that within a subprogram
> > > > all no_csr patterns *should* end with the same stack offset and I
> > > > would check for that (so what I mentioned before that instead of min
> > > > or max we need assignment and check for equality if we already
> > > > assigned it once).
> > >
> > > This makes sense, but does not cover a theoretical corner case:
> > > - suppose there are two nocsr functions A and B on the kernel side,
> > >   but only was A marked as nocsr during program compilation;
> > > - the spill/fill "bracket" was generated for a call to function B by
> > >   the compiler (because this is just a valid codegen).
> >
> > In this case the call to A shouldn't be changing nocsr_offset at all,
> > though? You should find no spill/fill and thus even though A is
> > *allowed* to use no_csr, it doesn't, so it should have no effect on
> > nocsr offsets, no?
>
> Consider the following code:
>
>     *(u64 *)(r10 - 24) = r1;
>     call A                     // kernel and clang know that A is nocsr
>     r1 = *(u64 *)(r10 - 24);   // kernel assumes .nocsr_stack_offset -24
>     ...
>     *(u64 *)(r10 - 16) = r1;
>     call B                     // only kernel knows that B is nocsr
>     r1 = *(u64 *)(r10 - 16);   // with stricter logic this would disable
>                                // nocsr for the whole sub-probram

oh, you mean that r10-16 accesses are valid instructions emitted by
the compiler but not due to nocsr? I mean, tough luck?... We shouldn't
reject this, but nocsr is ultimately a performance optimization, so
not critical if it doesn't work within some subprogram.

>
> > But your example actually made me think about another (not theoretical
> > at all) case. What if we have calls to A and B, the kernel is slightly
> > old and knows that B is nocsr, but A is not. But the BPF program was
> > compiled with the latest helper/kfunc definitions marking A as no_csr
> > eligible (as well as B). (btw, THAT'S THE WORD for allow_csr --
> > ELIGIBLE. csr_eligible FTW! but I digress...)
> >
> > With the case above we'll disable nocsr for both A and B, right? That
> > sucks, but not sure if we can do anything about that. (We can probably
> > assume no_csr pattern and thus allow spill/fill and not disable nocsr
> > in general, but not remove spill/fills... a bit more complication
> > upfront for longer term extensibility.. not sure, maybe performance
> > regression is a fine price, hmm)
> >
> > So I take it back about unmarking csr in the *entire BPF program*,
> > let's just limit it to the subprog scope. But I still think we should
> > do it eagerly, rather than double checking in do_misc_followups().
> > WDYT?
>
> With current algo this situation would disable nocsr indeed.
> The problem is that checks for spilled stack slots usage is too simplistic.
> However, it could be covered if the check is performed using e.g.
> process I described earlier:
> - for spill, remember the defining instruction in the stack slot structure;
> - for fill, "merge" the defining instruction index;
> - for other stack access mark defining instruction as escaping.

Sorry, I have no idea what the above means and implies. "defining
instruction", "merge", "escaping"

As I mentioned above, I'd keep it as simple as reasonably possible.

>
> > > So I wanted to keep the nocsr check a little bit more permissive.
> > >
> > > > Also, instead of doing that extra nocsr offset check in
> > > > do_misc_fixups(), why don't we just reset all no_csr patterns within a
> > > > subprogram *if we find a single violation*. Compiler shouldn't ever
> > > > emit such code, right? So let's be strict and fallback to not
> > > > recognizing nocsr.
> > > >
> > > > And then we won't need that extra check in do_misc_fixups() because we
> > > > eagerly unset no_csr flag and will never hit that piece of logic in
> > > > patching.
> > >
> > > I can do that, but the detector pass would have to be two pass:
> > > - on the first pass, find the nocsr_stack_off, add candidate insn marks;
> > > - on the second pass, remove marks from insns with wrong stack access offset.
> >
> > It's not really a second pass, it's part of normal validation.
> > check_nocsr_stack_contract() will detect this and will do, yes, pass
> > over all instructions of a subprogram to unmark them.
>
> - on the first pass true .nocsr_stack_off is not yet known,
>   so .nocsr_pattern is set optimistically;
> - on the second pass .nocsr_stack_off is already known,
>   so .nocsr_pattern can be removed from spills/fills outside the range;
> - check_nocsr_stack_contract() does not need to scan full sub-program
>   on each violation, it can set a flag disabling nocsr in subprogram info.

I'm not even sure anymore if we are agreeing or disagreeing, let's
just look at the next revision or something, I'm getting lost.

I was saying that as soon as check_nocsr_stack_contract() detects that
contract is breached, we eagerly set nocsr_pattern and
nocsr_spills_num to 0 for all instructions within the current subprog
and setting nocsr_stack_off for subprog info to 0 (there is no more
csr). There will be no other violations after that within that
subprog. Maybe that's what you mean as well and I just can't read.

>
> > > Otherwise I can't discern true patterns from false positives in
> > > situation described above.
> >
> > See above, I might be missing what your "theoretical" example changes.
> >
> > >
> > > > I'd go even further and say that on any nocsr invariant violation, we
> > > > just go and reset all nocsr pattern flags across entire BPF program
> > > > (all subprogs including). In check_nocsr_stack_contract() I mean. Just
> > > > have a loop over all instructions and set that flag to false.
> > > >
> > > > Why not? What realistic application would need to violate nocsr in
> > > > some subprogs but not in others?
> > > >
> > > > KISS or it's too simplistic for some reason?
> > >
> > > I can invalidate nocsr per-program.
> > > If so, I would use an "allow_nocsr" flag in program aux to avoid
> > > additional passes over instructions.
> >
> > yep, see above (but let's keep it per-subprog for now)
>
> Ok
>
> > > > > > > bpf_patch_insn_data() assumes that one instruction has to be replaced with many.
> > > > > > > Here I need to replace many instructions with a single instruction.
> > > > > > > I'd prefer not to tweak bpf_patch_insn_data() for this patch-set.
> > > > > >
> > > > > > ah, bpf_patch_insn_data just does bpf_patch_insn_single, somehow I
> > > > > > thought that it does range for range replacement of instructions.
> > > > > > Never mind then (it's a bit sad that we don't have a proper flexible
> > > > > > and powerful patching primitive, though, but oh well).
> > > > >
> > > > > That is true.
> > > > > I'll think about it once other issues with this patch-set are resolved.
> > > >
> > > > yeah, it's completely unrelated, but having a bpf_patch_insns that
> > > > takes input instruction range to be replaced and replacing it with
> > > > another set of instructions would cover all existing use cases and
> > > > some more. We wouldn't need verifier_remove_insns(), because that's
> > > > just replacing existing instructions with empty new set of
> > > > instructions. We would now have "insert instructions" primitive if we
> > > > specify empty input range of instructions. If we add a flag whether to
> > > > adjust jump offsets and make it configurable, we'd need the thing that
> > > > Alexei needed for may_goto without any extra hacks.
> > > >
> > > > Furthermore, I find it wrong and silly that we keep having manual
> > > > delta+insn adjustments in every single piece of patching logic. I
> > > > think this should be done by bpf_patch_insns(). We need to have a
> > > > small "insn_patcher" struct that we use during instruction patching,
> > > > and a small instruction iterator on top that would keep returning next
> > > > instruction to process (and its index, probably). This will allow us
> > > > to optimize patching and instead of doing O(N) we can have something
> > > > faster and smarter (if we hide direct insn array accesses during
> > > > patching).
> > > >
> > > > But this is all a completely separate story from all of this.
> > >
> > > I actually suggested something along these lines very long time ago:
> > > https://lore.kernel.org/bpf/5c297e5009bcd4f0becf59ccd97cfd82bb3a49ec.camel@gmail.com/
> >
> > We had a few discussions around patching improvements (you can see I
> > mentioned that in that thread as well). We all like to talk about
> > improving this, but not really doing anything about this. :) Tech debt
> > at its best.
>
> If you are not completely opposed to that idea I can follow-up with a
> complete version as an RFC.

let's do one thing at a time, there is plenty to review for nocsr
without having to switch to patching.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10 15:36           ` Andrii Nakryiko
  2024-07-10 16:15             ` Eduard Zingerman
@ 2024-07-10 19:01             ` Alexei Starovoitov
  1 sibling, 0 replies; 36+ messages in thread
From: Alexei Starovoitov @ 2024-07-10 19:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song,
	Puranjay Mohan, Jose E. Marchesi

On Wed, Jul 10, 2024 at 8:36 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> But, tbh, I'd make this stricter. I'd dictate that within a subprogram
> all no_csr patterns *should* end with the same stack offset and I
> would check for that (so what I mentioned before that instead of min
> or max we need assignment and check for equality if we already
> assigned it once).

No.
Compiler has full right to use different slots in nocsr region
for spills/fills. At every nocsr callsite there could be a different number
of live registers too.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10 18:49                   ` Andrii Nakryiko
@ 2024-07-10 19:03                     ` Eduard Zingerman
  2024-07-10 19:16                       ` Andrii Nakryiko
  0 siblings, 1 reply; 36+ messages in thread
From: Eduard Zingerman @ 2024-07-10 19:03 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Wed, 2024-07-10 at 11:49 -0700, Andrii Nakryiko wrote:
> On Wed, Jul 10, 2024 at 11:41 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > On Wed, 2024-07-10 at 10:50 -0700, Andrii Nakryiko wrote:
> > 
> > [...]
> > 
> > > > > Ok, I see, this wasn't obvious that you want this behavior. I actually
> > > > > find it surprising (and so at least let's call this possibility out).
> > > > > 
> > > > > But, tbh, I'd make this stricter. I'd dictate that within a subprogram
> > > > > all no_csr patterns *should* end with the same stack offset and I
> > > > > would check for that (so what I mentioned before that instead of min
> > > > > or max we need assignment and check for equality if we already
> > > > > assigned it once).
> > > > 
> > > > This makes sense, but does not cover a theoretical corner case:
> > > > - suppose there are two nocsr functions A and B on the kernel side,
> > > >   but only was A marked as nocsr during program compilation;
> > > > - the spill/fill "bracket" was generated for a call to function B by
> > > >   the compiler (because this is just a valid codegen).
> > > 
> > > In this case the call to A shouldn't be changing nocsr_offset at all,
> > > though? You should find no spill/fill and thus even though A is
> > > *allowed* to use no_csr, it doesn't, so it should have no effect on
> > > nocsr offsets, no?
> > 
> > Consider the following code:
> > 
> >     *(u64 *)(r10 - 24) = r1;
> >     call A                     // kernel and clang know that A is nocsr
> >     r1 = *(u64 *)(r10 - 24);   // kernel assumes .nocsr_stack_offset -24
> >     ...
> >     *(u64 *)(r10 - 16) = r1;
> >     call B                     // only kernel knows that B is nocsr
> >     r1 = *(u64 *)(r10 - 16);   // with stricter logic this would disable
> >                                // nocsr for the whole sub-probram
> 
> oh, you mean that r10-16 accesses are valid instructions emitted by
> the compiler but not due to nocsr? I mean, tough luck?... We shouldn't
> reject this, but nocsr is ultimately a performance optimization, so
> not critical if it doesn't work within some subprogram.

Not critical, but the difference between allowing and disallowing nocsr
for this case is '<' (current) vs '==' (suggested) check for .nocsr_stack_offset.
I think that current algo should not be made more strict.

> > > But your example actually made me think about another (not theoretical
> > > at all) case. What if we have calls to A and B, the kernel is slightly
> > > old and knows that B is nocsr, but A is not. But the BPF program was
> > > compiled with the latest helper/kfunc definitions marking A as no_csr
> > > eligible (as well as B). (btw, THAT'S THE WORD for allow_csr --
> > > ELIGIBLE. csr_eligible FTW! but I digress...)
> > > 
> > > With the case above we'll disable nocsr for both A and B, right? That
> > > sucks, but not sure if we can do anything about that. (We can probably
> > > assume no_csr pattern and thus allow spill/fill and not disable nocsr
> > > in general, but not remove spill/fills... a bit more complication
> > > upfront for longer term extensibility.. not sure, maybe performance
> > > regression is a fine price, hmm)
> > > 
> > > So I take it back about unmarking csr in the *entire BPF program*,
> > > let's just limit it to the subprog scope. But I still think we should
> > > do it eagerly, rather than double checking in do_misc_followups().
> > > WDYT?
> > 
> > With current algo this situation would disable nocsr indeed.
> > The problem is that checks for spilled stack slots usage is too simplistic.
> > However, it could be covered if the check is performed using e.g.
> > process I described earlier:
> > - for spill, remember the defining instruction in the stack slot structure;
> > - for fill, "merge" the defining instruction index;
> > - for other stack access mark defining instruction as escaping.
> 
> Sorry, I have no idea what the above means and implies. "defining
> instruction", "merge", "escaping"
> 
> As I mentioned above, I'd keep it as simple as reasonably possible.

It should not be much more complex compared to current implementation.

    1: *(u64 *)(r10 - 16) = r1; // for stack slot -16 remember that
                                // it is defined at insn (1)
    2: call %[nocsr_func]
    3: r1 = *(u64 *)(r10 - 16); // the value read from stack is defined
                                // at (1), so remember this in insn aux

If (1) is the only defining instruction for (3) and value written at (1)
is not used by other instructions (e.g. not passed as function argument,
"escapes"), the pair (1) and (3) could be removed.

> > > 
> > - on the first pass true .nocsr_stack_off is not yet known,
> >   so .nocsr_pattern is set optimistically;
> > - on the second pass .nocsr_stack_off is already known,
> >   so .nocsr_pattern can be removed from spills/fills outside the range;
> > - check_nocsr_stack_contract() does not need to scan full sub-program
> >   on each violation, it can set a flag disabling nocsr in subprogram info.
> 
> I'm not even sure anymore if we are agreeing or disagreeing, let's
> just look at the next revision or something, I'm getting lost.
> 
> I was saying that as soon as check_nocsr_stack_contract() detects that
> contract is breached, we eagerly set nocsr_pattern and
> nocsr_spills_num to 0 for all instructions within the current subprog
> and setting nocsr_stack_off for subprog info to 0 (there is no more
> csr). There will be no other violations after that within that
> subprog. Maybe that's what you mean as well and I just can't read.

Ok, I'll send v3, let's proceed from there.

[...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10 18:40                 ` Eduard Zingerman
  2024-07-10 18:49                   ` Andrii Nakryiko
@ 2024-07-10 19:07                   ` Alexei Starovoitov
  2024-07-10 19:17                     ` Andrii Nakryiko
  1 sibling, 1 reply; 36+ messages in thread
From: Alexei Starovoitov @ 2024-07-10 19:07 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song,
	Puranjay Mohan, Jose E. Marchesi

On Wed, Jul 10, 2024 at 11:41 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> > > So I wanted to keep the nocsr check a little bit more permissive.
> > >
> > > > Also, instead of doing that extra nocsr offset check in
> > > > do_misc_fixups(), why don't we just reset all no_csr patterns within a
> > > > subprogram *if we find a single violation*. Compiler shouldn't ever
> > > > emit such code, right? So let's be strict and fallback to not
> > > > recognizing nocsr.

no.

> > > > And then we won't need that extra check in do_misc_fixups() because we
> > > > eagerly unset no_csr flag and will never hit that piece of logic in
> > > > patching.
> > >
> > > I can do that, but the detector pass would have to be two pass:
> > > - on the first pass, find the nocsr_stack_off, add candidate insn marks;
> > > - on the second pass, remove marks from insns with wrong stack access offset.
> >
> > It's not really a second pass, it's part of normal validation.
> > check_nocsr_stack_contract() will detect this and will do, yes, pass
> > over all instructions of a subprogram to unmark them.
>
> - on the first pass true .nocsr_stack_off is not yet known,
>   so .nocsr_pattern is set optimistically;
> - on the second pass .nocsr_stack_off is already known,
>   so .nocsr_pattern can be removed from spills/fills outside the range;
> - check_nocsr_stack_contract() does not need to scan full sub-program
>   on each violation, it can set a flag disabling nocsr in subprogram info.

I really don't like where this discussion is going.
Keep the current algo it's simple enough and matches what the compiler
will generate. Obscure cases of users manually writing such
patterns in asm are out of scope. Do not complicate the verifier for them.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10 19:03                     ` Eduard Zingerman
@ 2024-07-10 19:16                       ` Andrii Nakryiko
  0 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-10 19:16 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	puranjay, jose.marchesi, Alexei Starovoitov

On Wed, Jul 10, 2024 at 12:03 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Wed, 2024-07-10 at 11:49 -0700, Andrii Nakryiko wrote:
> > On Wed, Jul 10, 2024 at 11:41 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > On Wed, 2024-07-10 at 10:50 -0700, Andrii Nakryiko wrote:
> > >
> > > [...]
> > >
> > > > > > Ok, I see, this wasn't obvious that you want this behavior. I actually
> > > > > > find it surprising (and so at least let's call this possibility out).
> > > > > >
> > > > > > But, tbh, I'd make this stricter. I'd dictate that within a subprogram
> > > > > > all no_csr patterns *should* end with the same stack offset and I
> > > > > > would check for that (so what I mentioned before that instead of min
> > > > > > or max we need assignment and check for equality if we already
> > > > > > assigned it once).
> > > > >
> > > > > This makes sense, but does not cover a theoretical corner case:
> > > > > - suppose there are two nocsr functions A and B on the kernel side,
> > > > >   but only was A marked as nocsr during program compilation;
> > > > > - the spill/fill "bracket" was generated for a call to function B by
> > > > >   the compiler (because this is just a valid codegen).
> > > >
> > > > In this case the call to A shouldn't be changing nocsr_offset at all,
> > > > though? You should find no spill/fill and thus even though A is
> > > > *allowed* to use no_csr, it doesn't, so it should have no effect on
> > > > nocsr offsets, no?
> > >
> > > Consider the following code:
> > >
> > >     *(u64 *)(r10 - 24) = r1;
> > >     call A                     // kernel and clang know that A is nocsr
> > >     r1 = *(u64 *)(r10 - 24);   // kernel assumes .nocsr_stack_offset -24
> > >     ...
> > >     *(u64 *)(r10 - 16) = r1;
> > >     call B                     // only kernel knows that B is nocsr
> > >     r1 = *(u64 *)(r10 - 16);   // with stricter logic this would disable
> > >                                // nocsr for the whole sub-probram
> >
> > oh, you mean that r10-16 accesses are valid instructions emitted by
> > the compiler but not due to nocsr? I mean, tough luck?... We shouldn't
> > reject this, but nocsr is ultimately a performance optimization, so
> > not critical if it doesn't work within some subprogram.
>
> Not critical, but the difference between allowing and disallowing nocsr
> for this case is '<' (current) vs '==' (suggested) check for .nocsr_stack_offset.
> I think that current algo should not be made more strict.
>

ok

> > > > But your example actually made me think about another (not theoretical
> > > > at all) case. What if we have calls to A and B, the kernel is slightly
> > > > old and knows that B is nocsr, but A is not. But the BPF program was
> > > > compiled with the latest helper/kfunc definitions marking A as no_csr
> > > > eligible (as well as B). (btw, THAT'S THE WORD for allow_csr --
> > > > ELIGIBLE. csr_eligible FTW! but I digress...)
> > > >
> > > > With the case above we'll disable nocsr for both A and B, right? That
> > > > sucks, but not sure if we can do anything about that. (We can probably
> > > > assume no_csr pattern and thus allow spill/fill and not disable nocsr
> > > > in general, but not remove spill/fills... a bit more complication
> > > > upfront for longer term extensibility.. not sure, maybe performance
> > > > regression is a fine price, hmm)
> > > >
> > > > So I take it back about unmarking csr in the *entire BPF program*,
> > > > let's just limit it to the subprog scope. But I still think we should
> > > > do it eagerly, rather than double checking in do_misc_followups().
> > > > WDYT?
> > >
> > > With current algo this situation would disable nocsr indeed.
> > > The problem is that checks for spilled stack slots usage is too simplistic.
> > > However, it could be covered if the check is performed using e.g.
> > > process I described earlier:
> > > - for spill, remember the defining instruction in the stack slot structure;
> > > - for fill, "merge" the defining instruction index;
> > > - for other stack access mark defining instruction as escaping.
> >
> > Sorry, I have no idea what the above means and implies. "defining
> > instruction", "merge", "escaping"
> >
> > As I mentioned above, I'd keep it as simple as reasonably possible.
>
> It should not be much more complex compared to current implementation.
>
>     1: *(u64 *)(r10 - 16) = r1; // for stack slot -16 remember that
>                                 // it is defined at insn (1)
>     2: call %[nocsr_func]
>     3: r1 = *(u64 *)(r10 - 16); // the value read from stack is defined
>                                 // at (1), so remember this in insn aux
>
> If (1) is the only defining instruction for (3) and value written at (1)
> is not used by other instructions (e.g. not passed as function argument,
> "escapes"), the pair (1) and (3) could be removed.

> > As I mentioned above, I'd keep it as simple as reasonably possible.

>
> > > >
> > > - on the first pass true .nocsr_stack_off is not yet known,
> > >   so .nocsr_pattern is set optimistically;
> > > - on the second pass .nocsr_stack_off is already known,
> > >   so .nocsr_pattern can be removed from spills/fills outside the range;
> > > - check_nocsr_stack_contract() does not need to scan full sub-program
> > >   on each violation, it can set a flag disabling nocsr in subprogram info.
> >
> > I'm not even sure anymore if we are agreeing or disagreeing, let's
> > just look at the next revision or something, I'm getting lost.
> >
> > I was saying that as soon as check_nocsr_stack_contract() detects that
> > contract is breached, we eagerly set nocsr_pattern and
> > nocsr_spills_num to 0 for all instructions within the current subprog
> > and setting nocsr_stack_off for subprog info to 0 (there is no more
> > csr). There will be no other violations after that within that
> > subprog. Maybe that's what you mean as well and I just can't read.
>
> Ok, I'll send v3, let's proceed from there.
>
> [...]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-10 19:07                   ` Alexei Starovoitov
@ 2024-07-10 19:17                     ` Andrii Nakryiko
  0 siblings, 0 replies; 36+ messages in thread
From: Andrii Nakryiko @ 2024-07-10 19:17 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song,
	Puranjay Mohan, Jose E. Marchesi

On Wed, Jul 10, 2024 at 12:07 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jul 10, 2024 at 11:41 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > > > So I wanted to keep the nocsr check a little bit more permissive.
> > > >
> > > > > Also, instead of doing that extra nocsr offset check in
> > > > > do_misc_fixups(), why don't we just reset all no_csr patterns within a
> > > > > subprogram *if we find a single violation*. Compiler shouldn't ever
> > > > > emit such code, right? So let's be strict and fallback to not
> > > > > recognizing nocsr.
>
> no.

I agree with your arguments.

>
> > > > > And then we won't need that extra check in do_misc_fixups() because we
> > > > > eagerly unset no_csr flag and will never hit that piece of logic in
> > > > > patching.
> > > >
> > > > I can do that, but the detector pass would have to be two pass:
> > > > - on the first pass, find the nocsr_stack_off, add candidate insn marks;
> > > > - on the second pass, remove marks from insns with wrong stack access offset.
> > >
> > > It's not really a second pass, it's part of normal validation.
> > > check_nocsr_stack_contract() will detect this and will do, yes, pass
> > > over all instructions of a subprogram to unmark them.
> >
> > - on the first pass true .nocsr_stack_off is not yet known,
> >   so .nocsr_pattern is set optimistically;
> > - on the second pass .nocsr_stack_off is already known,
> >   so .nocsr_pattern can be removed from spills/fills outside the range;
> > - check_nocsr_stack_contract() does not need to scan full sub-program
> >   on each violation, it can set a flag disabling nocsr in subprogram info.
>
> I really don't like where this discussion is going.
> Keep the current algo it's simple enough and matches what the compiler
> will generate. Obscure cases of users manually writing such
> patterns in asm are out of scope. Do not complicate the verifier for them.

That's what I'm pushing for as well.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2024-07-10 19:17 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-04 10:23 [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 1/9] bpf: add a get_helper_proto() utility function Eduard Zingerman
2024-07-09 23:42   ` Andrii Nakryiko
2024-07-10  0:26     ` Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 2/9] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
2024-07-09 23:42   ` Andrii Nakryiko
2024-07-10  3:00     ` Eduard Zingerman
2024-07-10  6:01       ` Andrii Nakryiko
2024-07-10  7:57         ` Eduard Zingerman
2024-07-10 15:36           ` Andrii Nakryiko
2024-07-10 16:15             ` Eduard Zingerman
2024-07-10 17:50               ` Andrii Nakryiko
2024-07-10 18:40                 ` Eduard Zingerman
2024-07-10 18:49                   ` Andrii Nakryiko
2024-07-10 19:03                     ` Eduard Zingerman
2024-07-10 19:16                       ` Andrii Nakryiko
2024-07-10 19:07                   ` Alexei Starovoitov
2024-07-10 19:17                     ` Andrii Nakryiko
2024-07-10 19:01             ` Alexei Starovoitov
2024-07-10  9:46     ` Eduard Zingerman
2024-07-10 15:23       ` Andrii Nakryiko
2024-07-10  1:09   ` Alexei Starovoitov
2024-07-10  3:06     ` Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 3/9] bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 4/9] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
2024-07-09 23:46   ` Andrii Nakryiko
2024-07-04 10:23 ` [RFC bpf-next v2 5/9] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 6/9] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
2024-07-04 10:23 ` [RFC bpf-next v2 7/9] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
2024-07-04 10:24 ` [RFC bpf-next v2 8/9] selftests/bpf: __arch_* macro to limit test cases to specific archs Eduard Zingerman
2024-07-09 23:50   ` Andrii Nakryiko
2024-07-04 10:24 ` [RFC bpf-next v2 9/9] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
2024-07-08 11:44 ` [RFC bpf-next v2 0/9] no_caller_saved_registers attribute for helper calls Puranjay Mohan
2024-07-08 17:29   ` Eduard Zingerman
2024-07-10  1:18 ` Alexei Starovoitov
2024-07-10  3:35   ` Eduard Zingerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox