[RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls

public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed

* [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls
@ 2024-06-29  9:47 Eduard Zingerman
  2024-06-29  9:47 ` [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function Eduard Zingerman
                   ` (8 more replies)
  0 siblings, 9 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

This RFC seeks to allow using no_caller_saved_registers gcc/clang
attribute with some BPF helper functions (and kfuncs in the future).

As documented in [1], this attribute means that function scratches
only some of the caller saved registers defined by ABI.
For BPF the set of such registers could be defined as follows:
- R0 is scratched only if function is non-void;
- R1-R5 are scratched only if corresponding parameter type is defined
  in the function prototype.

The goal of the RFC is to implement no_caller_saved_registers
(nocsr for short) in a backwards compatible manner:
- for kernels that support the feature, gain some performance boost
  from better register allocation;
- for kernels that don't support the feature, allow programs execution
  with minor performance losses.

To achieve this, use a scheme suggested by Alexei Starovoitov:
- for nocsr calls clang allocates registers as-if relevant r0-r5
  registers are not scratched by the call;
- as a post-processing step, clang visits each nocsr call and adds
  spill/fill for every live r0-r5;
- stack offsets used for spills/fills are allocated as minimal
  stack offsets in whole function and are not used for any other
  purposes;
- when kernel loads a program, it looks for such patterns
  (nocsr function surrounded by spills/fills) and checks if
  spill/fill stack offsets are used exclusively in nocsr patterns;
- if so, and if current JIT inlines the call to the nocsr function
  (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
- when old kernel loads a program, presence of spill/fill pairs
  keeps BPF program valid, albeit slightly less efficient.

Corresponding clang/llvm changes are available in [2].

The patch-set uses bpf_get_smp_processor_id() function as a canary,
making it the first helper with nocsr attribute.

For example, consider the following program:

  #define __no_csr __attribute__((no_caller_saved_registers))
  #define SEC(name) __attribute__((section(name), used))
  #define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)

  typedef unsigned int __u32;

  static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
  static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;

  SEC("raw_tp")
  int test(void *ctx)
  {
          __u32 task = bpf_get_smp_processor_id();
  	bpf_printk("ctx=%p, smp=%d", ctx, task);
  	return 0;
  }

  char _license[] SEC("license") = "GPL";

Compiled (using [2]) as follows:

  $ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
  $ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
    ...
  3rd parameter for printk call     removable spill/fill pair
  .--- 0:       r3 = r1                             |
; |       __u32 task = bpf_get_smp_processor_id();  |
  |    1:       *(u64 *)(r10 - 0x8) = r3 <----------|
  |    2:       call 0x8                            |
  |    3:       r3 = *(u64 *)(r10 - 0x8) <----------'
; |     bpf_printk("ctx=%p, smp=%d", ctx, task);
  |    4:       r1 = 0x0 ll
  |    6:       r2 = 0xf
  |    7:       r4 = r0
  '--> 8:       call 0x6
;       return 0;
       9:       r0 = 0x0
      10:       exit

Here is how the program looks after verifier processing:

  # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
  # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
  int test(void * ctx):
  ; int test(void *ctx)
     0: (bf) r3 = r1               <--------- 3rd printk parameter
  ; __u32 task = bpf_get_smp_processor_id();
     1: (b4) w0 = 197132           <--------- inlined helper call,
     2: (bf) r0 = r0               <--------- spill/fill pair removed
     3: (61) r0 = *(u32 *)(r0 +0)  <---------
  ; bpf_printk("ctx=%p, smp=%d", ctx, task);
     4: (18) r1 = map[id:13][0]+0
     6: (b7) r2 = 15
     7: (bf) r4 = r0
     8: (85) call bpf_trace_printk#-125920
  ; return 0;
     9: (b7) r0 = 0
    10: (95) exit

[1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
[2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers

Eduard Zingerman (8):
  bpf: add a get_helper_proto() utility function
  bpf: no_caller_saved_registers attribute for helper calls
  bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  selftests/bpf: extract utility function for BPF disassembly
  selftests/bpf: no need to track next_match_pos in struct test_loader
  selftests/bpf: extract test_loader->expect_msgs as a data structure
  selftests/bpf: allow checking xlated programs in verifier_* tests
  selftests/bpf: test no_caller_saved_registers spill/fill removal

 include/linux/bpf.h                           |   6 +
 include/linux/bpf_verifier.h                  |   9 +
 kernel/bpf/helpers.c                          |   1 +
 kernel/bpf/verifier.c                         | 346 +++++++++++++-
 tools/testing/selftests/bpf/Makefile          |   1 +
 tools/testing/selftests/bpf/disasm_helpers.c  |  50 ++
 tools/testing/selftests/bpf/disasm_helpers.h  |  12 +
 .../selftests/bpf/prog_tests/ctx_rewrite.c    |  71 +--
 .../selftests/bpf/prog_tests/verifier.c       |   7 +
 tools/testing/selftests/bpf/progs/bpf_misc.h  |   6 +
 .../selftests/bpf/progs/verifier_nocsr.c      | 437 ++++++++++++++++++
 tools/testing/selftests/bpf/test_loader.c     | 170 +++++--
 tools/testing/selftests/bpf/test_progs.h      |   1 -
 tools/testing/selftests/bpf/testing_helpers.c |   1 +
 14 files changed, 986 insertions(+), 132 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c

-- 
2.45.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function
  2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
@ 2024-06-29  9:47 ` Eduard Zingerman
  2024-07-02  0:41   ` Andrii Nakryiko
  2024-06-29  9:47 ` [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

Extract the part of check_helper_call() as a utility function allowing
to query 'struct bpf_func_proto' for a specific helper function id.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++-----------
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d3927d819465..8dd3385cf925 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10261,6 +10261,24 @@ static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno
 				 state->callback_subprogno == subprogno);
 }
 
+static int get_helper_proto(struct bpf_verifier_env *env, int func_id,
+			    const struct bpf_func_proto **ptr)
+{
+	const struct bpf_func_proto *result = NULL;
+
+	if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID)
+		return -ERANGE;
+
+	if (env->ops->get_func_proto)
+		result = env->ops->get_func_proto(func_id, env->prog);
+
+	if (!result)
+		return -EINVAL;
+
+	*ptr = result;
+	return 0;
+}
+
 static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx_p)
 {
@@ -10277,17 +10295,14 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 
 	/* find function prototype */
 	func_id = insn->imm;
-	if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID) {
-		verbose(env, "invalid func %s#%d\n", func_id_name(func_id),
-			func_id);
-		return -EINVAL;
-	}
-
-	if (env->ops->get_func_proto)
-		fn = env->ops->get_func_proto(func_id, env->prog);
-	if (!fn) {
-		verbose(env, "program of this type cannot use helper %s#%d\n",
-			func_id_name(func_id), func_id);
+	err = get_helper_proto(env, insn->imm, &fn);
+	if (err) {
+		if (err == -ERANGE)
+			verbose(env, "invalid func %s#%d\n", func_id_name(func_id),
+				func_id);
+		else
+			verbose(env, "program of this type cannot use helper %s#%d\n",
+				func_id_name(func_id), func_id);
 		return -EINVAL;
 	}
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function
  2024-06-29  9:47 ` [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function Eduard Zingerman
@ 2024-07-02  0:41   ` Andrii Nakryiko
  2024-07-02 20:07     ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02  0:41 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Extract the part of check_helper_call() as a utility function allowing
> to query 'struct bpf_func_proto' for a specific helper function id.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  kernel/bpf/verifier.c | 37 ++++++++++++++++++++++++++-----------
>  1 file changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index d3927d819465..8dd3385cf925 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -10261,6 +10261,24 @@ static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno
>                                  state->callback_subprogno == subprogno);
>  }
>
> +static int get_helper_proto(struct bpf_verifier_env *env, int func_id,
> +                           const struct bpf_func_proto **ptr)
> +{
> +       const struct bpf_func_proto *result = NULL;
> +
> +       if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID)
> +               return -ERANGE;
> +
> +       if (env->ops->get_func_proto)
> +               result = env->ops->get_func_proto(func_id, env->prog);
> +
> +       if (!result)
> +               return -EINVAL;
> +
> +       *ptr = result;
> +       return 0;
> +}
> +
>  static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>                              int *insn_idx_p)
>  {
> @@ -10277,17 +10295,14 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>
>         /* find function prototype */
>         func_id = insn->imm;
> -       if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID) {
> -               verbose(env, "invalid func %s#%d\n", func_id_name(func_id),
> -                       func_id);
> -               return -EINVAL;
> -       }
> -
> -       if (env->ops->get_func_proto)
> -               fn = env->ops->get_func_proto(func_id, env->prog);
> -       if (!fn) {
> -               verbose(env, "program of this type cannot use helper %s#%d\n",
> -                       func_id_name(func_id), func_id);
> +       err = get_helper_proto(env, insn->imm, &fn);
> +       if (err) {
> +               if (err == -ERANGE)
> +                       verbose(env, "invalid func %s#%d\n", func_id_name(func_id),
> +                               func_id);
> +               else
> +                       verbose(env, "program of this type cannot use helper %s#%d\n",
> +                               func_id_name(func_id), func_id);
>                 return -EINVAL;
>         }

subjective, but this might be cleaner and will keep first verbose() on
a single line:

err = get_helper_proto(...);
if (err == -ERANGE) {
    verbose(...);
    return -EINVAL;
} else if (err) {
    verbose(...);
    return err;
}


>
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function
  2024-07-02  0:41   ` Andrii Nakryiko
@ 2024-07-02 20:07     ` Eduard Zingerman
  0 siblings, 0 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 20:07 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:

[...]

> subjective, but this might be cleaner and will keep first verbose() on
> a single line:
> 
> err = get_helper_proto(...);
> if (err == -ERANGE) {
>     verbose(...);
>     return -EINVAL;
> } else if (err) {
>     verbose(...);
>     return err;
> }

Will do.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
  2024-06-29  9:47 ` [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function Eduard Zingerman
@ 2024-06-29  9:47 ` Eduard Zingerman
  2024-07-01 19:01   ` Eduard Zingerman
                     ` (2 more replies)
  2024-06-29  9:47 ` [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
                   ` (6 subsequent siblings)
  8 siblings, 3 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman, Alexei Starovoitov

GCC and LLVM define a no_caller_saved_registers function attribute.
This attribute means that function scratches only some of
the caller saved registers defined by ABI.
For BPF the set of such registers could be defined as follows:
- R0 is scratched only if function is non-void;
- R1-R5 are scratched only if corresponding parameter type is defined
  in the function prototype.

This commit introduces flag bpf_func_prot->nocsr.
If this flag is set for some helper function, verifier assumes that
it follows no_caller_saved_registers calling convention.

The contract between kernel and clang allows to simultaneously use
such functions and maintain backwards compatibility with old
kernels that don't understand no_caller_saved_registers calls
(nocsr for short):

- clang generates a simple pattern for nocsr calls, e.g.:

    r1 = 1;
    r2 = 2;
    *(u64 *)(r10 - 8)  = r1;
    *(u64 *)(r10 - 16) = r2;
    call %[to_be_inlined_by_jit]
    r2 = *(u64 *)(r10 - 16);
    r1 = *(u64 *)(r10 - 8);
    r0 = r1;
    r0 += r2;
    exit;

- kernel removes unnecessary spills and fills, if called function is
  inlined by current JIT (with assumption that patch inserted by JIT
  honors nocsr contract, e.g. does not scratch r3-r5 for the example
  above), e.g. the code above would be transformed to:

    r1 = 1;
    r2 = 2;
    call %[to_be_inlined_by_jit]
    r0 = r1;
    r0 += r2;
    exit;

Technically, the transformation is split into the following phases:
- during check_cfg() function update_nocsr_pattern_marks() is used to
  find potential patterns;
- upon stack read or write access,
  function check_nocsr_stack_contract() is used to verify if
  stack offsets, presumably reserved for nocsr patterns, are used
  only from those patterns;
- function remove_nocsr_spills_fills(), called from bpf_check(),
  applies the rewrite for valid patterns.

See comment in match_and_mark_nocsr_pattern() for more details.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf.h          |   6 +
 include/linux/bpf_verifier.h |   9 ++
 kernel/bpf/verifier.c        | 300 ++++++++++++++++++++++++++++++++++-
 3 files changed, 307 insertions(+), 8 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 960780ef04e1..f4faa6b8cb5b 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -807,6 +807,12 @@ struct bpf_func_proto {
 	bool gpl_only;
 	bool pkt_access;
 	bool might_sleep;
+	/* set to true if helper follows contract for gcc/llvm
+	 * attribute no_caller_saved_registers:
+	 * - void functions do not scratch r0
+	 * - functions taking N arguments scratch only registers r1-rN
+	 */
+	bool nocsr;
 	enum bpf_return_type ret_type;
 	union {
 		struct {
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 2b54e25d2364..67dbe4cb1529 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -585,6 +585,10 @@ struct bpf_insn_aux_data {
 	 * accepts callback function as a parameter.
 	 */
 	bool calls_callback;
+	/* true if STX or LDX instruction is a part of a spill/fill
+	 * pattern for a no_caller_saved_registers call.
+	 */
+	bool nocsr_pattern;
 };
 
 #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
@@ -641,6 +645,11 @@ struct bpf_subprog_info {
 	u32 linfo_idx; /* The idx to the main_prog->aux->linfo */
 	u16 stack_depth; /* max. stack depth used by this function */
 	u16 stack_extra;
+	/* stack depth after which slots reserved for
+	 * no_caller_saved_registers spills/fills start,
+	 * value <= nocsr_stack_off belongs to the spill/fill area.
+	 */
+	s16 nocsr_stack_off;
 	bool has_tail_call: 1;
 	bool tail_call_reachable: 1;
 	bool has_ld_abs: 1;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8dd3385cf925..1340d3e60d30 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2471,16 +2471,41 @@ static int cmp_subprogs(const void *a, const void *b)
 	       ((struct bpf_subprog_info *)b)->start;
 }
 
-static int find_subprog(struct bpf_verifier_env *env, int off)
+/* Find subprogram that contains instruction at 'off' */
+static int find_containing_subprog(struct bpf_verifier_env *env, int off)
 {
-	struct bpf_subprog_info *p;
+	struct bpf_subprog_info *vals = env->subprog_info;
+	int high = env->subprog_cnt - 1;
+	int low = 0, ret = -ENOENT;
 
-	p = bsearch(&off, env->subprog_info, env->subprog_cnt,
-		    sizeof(env->subprog_info[0]), cmp_subprogs);
-	if (!p)
+	if (off >= env->prog->len || off < 0)
 		return -ENOENT;
-	return p - env->subprog_info;
 
+	while (low <= high) {
+		int mid = (low + high)/2;
+		struct bpf_subprog_info *val = &vals[mid];
+		int diff = off - val->start;
+
+		if (diff < 0) {
+			high = mid - 1;
+		} else {
+			low = mid + 1;
+			/* remember last time mid.start <= off */
+			ret = mid;
+		}
+	}
+	return ret;
+}
+
+/* Find subprogram that starts exactly at 'off' */
+static int find_subprog(struct bpf_verifier_env *env, int off)
+{
+	int idx;
+
+	idx = find_containing_subprog(env, off);
+	if (idx < 0 || env->subprog_info[idx].start != off)
+		return -ENOENT;
+	return idx;
 }
 
 static int add_subprog(struct bpf_verifier_env *env, int off)
@@ -4501,6 +4526,23 @@ static int get_reg_width(struct bpf_reg_state *reg)
 	return fls64(reg->umax_value);
 }
 
+/* See comment for match_and_mark_nocsr_pattern() */
+static void check_nocsr_stack_contract(struct bpf_verifier_env *env, struct bpf_func_state *state,
+				       int insn_idx, int off)
+{
+	struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno];
+	struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
+
+	if (subprog->nocsr_stack_off <= off || aux->nocsr_pattern)
+		return;
+	/* access to the region [max_stack_depth .. nocsr_stack_off]
+	 * from something that is not a part of the nocsr pattern,
+	 * disable nocsr rewrites for current subprogram by setting
+	 * nocsr_stack_off to a value smaller than any possible offset.
+	 */
+	subprog->nocsr_stack_off = S16_MIN;
+}
+
 /* check_stack_{read,write}_fixed_off functions track spill/fill of registers,
  * stack boundary and alignment are checked in check_mem_access()
  */
@@ -4549,6 +4591,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
 	if (err)
 		return err;
 
+	check_nocsr_stack_contract(env, state, insn_idx, off);
 	mark_stack_slot_scratched(env, spi);
 	if (reg && !(off % BPF_REG_SIZE) && reg->type == SCALAR_VALUE && env->bpf_capable) {
 		bool reg_value_fits;
@@ -4682,6 +4725,7 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
 			return err;
 	}
 
+	check_nocsr_stack_contract(env, state, insn_idx, min_off);
 	/* Variable offset writes destroy any spilled pointers in range. */
 	for (i = min_off; i < max_off; i++) {
 		u8 new_type, *stype;
@@ -4820,6 +4864,7 @@ static int check_stack_read_fixed_off(struct bpf_verifier_env *env,
 	reg = &reg_state->stack[spi].spilled_ptr;
 
 	mark_stack_slot_scratched(env, spi);
+	check_nocsr_stack_contract(env, state, env->insn_idx, off);
 
 	if (is_spilled_reg(&reg_state->stack[spi])) {
 		u8 spill_size = 1;
@@ -4980,6 +5025,7 @@ static int check_stack_read_var_off(struct bpf_verifier_env *env,
 	min_off = reg->smin_value + off;
 	max_off = reg->smax_value + off;
 	mark_reg_stack_read(env, ptr_state, min_off, max_off + size, dst_regno);
+	check_nocsr_stack_contract(env, ptr_state, env->insn_idx, min_off);
 	return 0;
 }
 
@@ -15950,6 +15996,205 @@ static int visit_func_call_insn(int t, struct bpf_insn *insns,
 	return ret;
 }
 
+/* Bitmask with 1s for all caller saved registers */
+#define ALL_CALLER_SAVED_REGS ((1u << CALLER_SAVED_REGS) - 1)
+
+/* Return a bitmask specifying which caller saved registers are
+ * modified by a call to a helper.
+ * (Either as a return value or as scratch registers).
+ *
+ * For normal helpers registers R0-R5 are scratched.
+ * For helpers marked as no_csr:
+ * - scratch R0 if function is non-void;
+ * - scratch R1-R5 if corresponding parameter type is set
+ *   in the function prototype.
+ */
+static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
+{
+	u8 mask;
+	int i;
+
+	if (!fn->nocsr)
+		return ALL_CALLER_SAVED_REGS;
+
+	mask = 0;
+	mask |= fn->ret_type == RET_VOID ? 0 : BIT(BPF_REG_0);
+	for (i = 0; i < ARRAY_SIZE(fn->arg_type); ++i)
+		mask |= fn->arg_type[i] == ARG_DONTCARE ? 0 : BIT(BPF_REG_1 + i);
+	return mask;
+}
+
+/* True if do_misc_fixups() replaces calls to helper number 'imm',
+ * replacement patch is presumed to follow no_caller_saved_registers contract
+ * (see match_and_mark_nocsr_pattern() below).
+ */
+static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
+{
+	return false;
+}
+
+/* If 'insn' is a call that follows no_caller_saved_registers contract
+ * and called function is inlined by current jit, return a mask with
+ * 1s corresponding to registers that are scratched by this call
+ * (depends on return type and number of return parameters).
+ * Otherwise return ALL_CALLER_SAVED_REGS mask.
+ */
+static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
+{
+	const struct bpf_func_proto *fn;
+
+	if (bpf_helper_call(insn) &&
+	    verifier_inlines_helper_call(env, insn->imm) &&
+	    get_helper_proto(env, insn->imm, &fn) == 0 &&
+	    fn->nocsr)
+		return ~get_helper_reg_mask(fn);
+
+	return ALL_CALLER_SAVED_REGS;
+}
+
+/* GCC and LLVM define a no_caller_saved_registers function attribute.
+ * This attribute means that function scratches only some of
+ * the caller saved registers defined by ABI.
+ * For BPF the set of such registers could be defined as follows:
+ * - R0 is scratched only if function is non-void;
+ * - R1-R5 are scratched only if corresponding parameter type is defined
+ *   in the function prototype.
+ *
+ * The contract between kernel and clang allows to simultaneously use
+ * such functions and maintain backwards compatibility with old
+ * kernels that don't understand no_caller_saved_registers calls
+ * (nocsr for short):
+ *
+ * - for nocsr calls clang allocates registers as-if relevant r0-r5
+ *   registers are not scratched by the call;
+ *
+ * - as a post-processing step, clang visits each nocsr call and adds
+ *   spill/fill for every live r0-r5;
+ *
+ * - stack offsets used for the spill/fill are allocated as minimal
+ *   stack offsets in whole function and are not used for any other
+ *   purposes;
+ *
+ * - when kernel loads a program, it looks for such patterns
+ *   (nocsr function surrounded by spills/fills) and checks if
+ *   spill/fill stack offsets are used exclusively in nocsr patterns;
+ *
+ * - if so, and if current JIT inlines the call to the nocsr function
+ *   (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
+ *
+ * - when old kernel loads a program, presence of spill/fill pairs
+ *   keeps BPF program valid, albeit slightly less efficient.
+ *
+ * For example:
+ *
+ *   r1 = 1;
+ *   r2 = 2;
+ *   *(u64 *)(r10 - 8)  = r1;            r1 = 1;
+ *   *(u64 *)(r10 - 16) = r2;            r2 = 2;
+ *   call %[to_be_inlined_by_jit]  -->   call %[to_be_inlined_by_jit]
+ *   r2 = *(u64 *)(r10 - 16);            r0 = r1;
+ *   r1 = *(u64 *)(r10 - 8);             r0 += r2;
+ *   r0 = r1;                            exit;
+ *   r0 += r2;
+ *   exit;
+ *
+ * The purpose of match_and_mark_nocsr_pattern is to:
+ * - look for such patterns;
+ * - mark spill and fill instructions in env->insn_aux_data[*].nocsr_pattern;
+ * - update env->subprog_info[*]->nocsr_stack_off to find an offset
+ *   at which nocsr spill/fill stack slots start.
+ *
+ * The .nocsr_pattern and .nocsr_stack_off are used by
+ * check_nocsr_stack_contract() to check if every stack access to
+ * nocsr spill/fill stack slot originates from spill/fill
+ * instructions, members of nocsr patterns.
+ *
+ * If such condition holds true for a subprogram, nocsr patterns could
+ * be rewritten by remove_nocsr_spills_fills().
+ * Otherwise nocsr patterns are not changed in the subprogram
+ * (code, presumably, generated by an older clang version).
+ *
+ * For example, it is *not* safe to remove spill/fill below:
+ *
+ *   r1 = 1;
+ *   *(u64 *)(r10 - 8)  = r1;            r1 = 1;
+ *   call %[to_be_inlined_by_jit]  -->   call %[to_be_inlined_by_jit]
+ *   r1 = *(u64 *)(r10 - 8);             r0 = *(u64 *)(r10 - 8);  <---- wrong !!!
+ *   r0 = *(u64 *)(r10 - 8);             r0 += r1;
+ *   r0 += r1;                           exit;
+ *   exit;
+ */
+static int match_and_mark_nocsr_pattern(struct bpf_verifier_env *env, int t, bool mark)
+{
+	struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
+	struct bpf_subprog_info *subprog;
+	u32 csr_mask = call_csr_mask(env, &insns[t]);
+	u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;
+	int s, i;
+	s16 off;
+
+	if (csr_mask == ALL_CALLER_SAVED_REGS)
+		return false;
+
+	for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
+		if (t - i < 0 || t + i >= env->prog->len)
+			break;
+		stx = &insns[t - i];
+		ldx = &insns[t + i];
+		if (off == 0) {
+			off = stx->off;
+			if (off % BPF_REG_SIZE != 0)
+				break;
+		}
+		if (/* *(u64 *)(r10 - off) = r[0-5]? */
+		    stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
+		    stx->dst_reg != BPF_REG_10 ||
+		    /* r[0-5] = *(u64 *)(r10 - off)? */
+		    ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
+		    ldx->src_reg != BPF_REG_10 ||
+		    /* check spill/fill for the same reg and offset */
+		    stx->src_reg != ldx->dst_reg ||
+		    stx->off != ldx->off ||
+		    stx->off != off ||
+		    /* this should be a previously unseen register */
+		    BIT(stx->src_reg) & reg_mask)
+			break;
+		reg_mask |= BIT(stx->src_reg);
+		if (mark) {
+			env->insn_aux_data[t - i].nocsr_pattern = true;
+			env->insn_aux_data[t + i].nocsr_pattern = true;
+		}
+	}
+	if (i == 1)
+		return 0;
+	if (mark) {
+		s = find_containing_subprog(env, t);
+		/* can't happen */
+		if (WARN_ON_ONCE(s < 0))
+			return 0;
+		subprog = &env->subprog_info[s];
+		subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
+	}
+	return i - 1;
+}
+
+/* If instruction 't' is a nocsr call surrounded by spill/fill pairs,
+ * update env->subprog_info[_]->nocsr_stack_off and
+ * env->insn_aux_data[_].nocsr_pattern fields.
+ */
+static void update_nocsr_pattern_marks(struct bpf_verifier_env *env, int t)
+{
+	match_and_mark_nocsr_pattern(env, t, true);
+}
+
+/* If instruction 't' is a nocsr call surrounded by spill/fill pairs,
+ * return the number of such pairs.
+ */
+static int match_nocsr_pattern(struct bpf_verifier_env *env, int t)
+{
+	return match_and_mark_nocsr_pattern(env, t, false);
+}
+
 /* Visits the instruction at index t and returns one of the following:
  *  < 0 - an error occurred
  *  DONE_EXPLORING - the instruction was fully explored
@@ -16017,6 +16262,8 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
 				mark_force_checkpoint(env, t);
 			}
 		}
+		if (insn->src_reg == 0)
+			update_nocsr_pattern_marks(env, t);
 		return visit_func_call_insn(t, insns, env, insn->src_reg == BPF_PSEUDO_CALL);
 
 	case BPF_JA:
@@ -19063,15 +19310,16 @@ static int opt_remove_dead_code(struct bpf_verifier_env *env)
 	return 0;
 }
 
+static const struct bpf_insn NOP = BPF_JMP_IMM(BPF_JA, 0, 0, 0);
+
 static int opt_remove_nops(struct bpf_verifier_env *env)
 {
-	const struct bpf_insn ja = BPF_JMP_IMM(BPF_JA, 0, 0, 0);
 	struct bpf_insn *insn = env->prog->insnsi;
 	int insn_cnt = env->prog->len;
 	int i, err;
 
 	for (i = 0; i < insn_cnt; i++) {
-		if (memcmp(&insn[i], &ja, sizeof(ja)))
+		if (memcmp(&insn[i], &NOP, sizeof(NOP)))
 			continue;
 
 		err = verifier_remove_insns(env, i, 1);
@@ -20801,6 +21049,39 @@ static int optimize_bpf_loop(struct bpf_verifier_env *env)
 	return 0;
 }
 
+/* Remove unnecessary spill/fill pairs, members of nocsr pattern.
+ * Do this as a separate pass to avoid interfering with helper/kfunc
+ * inlining logic in do_misc_fixups().
+ * See comment for match_and_mark_nocsr_pattern().
+ */
+static int remove_nocsr_spills_fills(struct bpf_verifier_env *env)
+{
+	struct bpf_subprog_info *subprogs = env->subprog_info;
+	int i, j, spills_num, cur_subprog = 0;
+	struct bpf_insn *insn = env->prog->insnsi;
+	int insn_cnt = env->prog->len;
+
+	for (i = 0; i < insn_cnt; i++, insn++) {
+		spills_num = match_nocsr_pattern(env, i);
+		if (spills_num == 0)
+			goto next_insn;
+		for (j = 1; j <= spills_num; ++j)
+			if ((insn - j)->off >= subprogs[cur_subprog].nocsr_stack_off ||
+			    (insn + j)->off >= subprogs[cur_subprog].nocsr_stack_off)
+				goto next_insn;
+		/* NOPs are removed by opt_remove_nops() later */
+		for (j = 1; j <= spills_num; ++j) {
+			*(insn - j) = NOP;
+			*(insn + j) = NOP;
+		}
+
+next_insn:
+		if (subprogs[cur_subprog + 1].start == i + 1)
+			cur_subprog++;
+	}
+	return 0;
+}
+
 static void free_states(struct bpf_verifier_env *env)
 {
 	struct bpf_verifier_state_list *sl, *sln;
@@ -21719,6 +22000,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (ret == 0)
 		ret = optimize_bpf_loop(env);
 
+	if (ret == 0)
+		ret = remove_nocsr_spills_fills(env);
+
 	if (is_priv) {
 		if (ret == 0)
 			opt_hard_wire_dead_code_branches(env);
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-06-29  9:47 ` [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
@ 2024-07-01 19:01   ` Eduard Zingerman
  2024-07-02  0:41   ` Andrii Nakryiko
  2024-07-03 11:57   ` Puranjay Mohan
  2 siblings, 0 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-01 19:01 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Alexei Starovoitov

On Sat, 2024-06-29 at 02:47 -0700, Eduard Zingerman wrote:

[...]

> Technically, the transformation is split into the following phases:
> - during check_cfg() function update_nocsr_pattern_marks() is used to
>   find potential patterns;
> - upon stack read or write access,
>   function check_nocsr_stack_contract() is used to verify if
>   stack offsets, presumably reserved for nocsr patterns, are used
>   only from those patterns;
> - function remove_nocsr_spills_fills(), called from bpf_check(),
>   applies the rewrite for valid patterns.

Talked to Andrii today, he asked to make the following changes:
- move update_nocsr_pattern_marks() from check_cfg() to a separate pass;
- make remove_nocsr_spills_fills() a part of do_misc_fixups()

I'll wait for some comments before submitting v2.

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-06-29  9:47 ` [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
  2024-07-01 19:01   ` Eduard Zingerman
@ 2024-07-02  0:41   ` Andrii Nakryiko
  2024-07-02 20:38     ` Eduard Zingerman
  2024-07-03 11:57   ` Puranjay Mohan
  2 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02  0:41 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Alexei Starovoitov

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> GCC and LLVM define a no_caller_saved_registers function attribute.
> This attribute means that function scratches only some of
> the caller saved registers defined by ABI.
> For BPF the set of such registers could be defined as follows:
> - R0 is scratched only if function is non-void;
> - R1-R5 are scratched only if corresponding parameter type is defined
>   in the function prototype.
>
> This commit introduces flag bpf_func_prot->nocsr.
> If this flag is set for some helper function, verifier assumes that
> it follows no_caller_saved_registers calling convention.
>
> The contract between kernel and clang allows to simultaneously use
> such functions and maintain backwards compatibility with old
> kernels that don't understand no_caller_saved_registers calls
> (nocsr for short):
>
> - clang generates a simple pattern for nocsr calls, e.g.:
>
>     r1 = 1;
>     r2 = 2;
>     *(u64 *)(r10 - 8)  = r1;
>     *(u64 *)(r10 - 16) = r2;
>     call %[to_be_inlined_by_jit]

"inline_by_jit" is misleading, it can be inlined by BPF verifier using
BPF instructions, not just by BPF JIT

>     r2 = *(u64 *)(r10 - 16);
>     r1 = *(u64 *)(r10 - 8);
>     r0 = r1;
>     r0 += r2;
>     exit;
>
> - kernel removes unnecessary spills and fills, if called function is
>   inlined by current JIT (with assumption that patch inserted by JIT
>   honors nocsr contract, e.g. does not scratch r3-r5 for the example
>   above), e.g. the code above would be transformed to:
>
>     r1 = 1;
>     r2 = 2;
>     call %[to_be_inlined_by_jit]
>     r0 = r1;
>     r0 += r2;
>     exit;
>
> Technically, the transformation is split into the following phases:
> - during check_cfg() function update_nocsr_pattern_marks() is used to
>   find potential patterns;
> - upon stack read or write access,
>   function check_nocsr_stack_contract() is used to verify if
>   stack offsets, presumably reserved for nocsr patterns, are used
>   only from those patterns;
> - function remove_nocsr_spills_fills(), called from bpf_check(),
>   applies the rewrite for valid patterns.
>
> See comment in match_and_mark_nocsr_pattern() for more details.
>
> Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  include/linux/bpf.h          |   6 +
>  include/linux/bpf_verifier.h |   9 ++
>  kernel/bpf/verifier.c        | 300 ++++++++++++++++++++++++++++++++++-
>  3 files changed, 307 insertions(+), 8 deletions(-)
>

[...]

> -static int find_subprog(struct bpf_verifier_env *env, int off)
> +/* Find subprogram that contains instruction at 'off' */
> +static int find_containing_subprog(struct bpf_verifier_env *env, int off)
>  {
> -       struct bpf_subprog_info *p;
> +       struct bpf_subprog_info *vals = env->subprog_info;
> +       int high = env->subprog_cnt - 1;
> +       int low = 0, ret = -ENOENT;
>
> -       p = bsearch(&off, env->subprog_info, env->subprog_cnt,
> -                   sizeof(env->subprog_info[0]), cmp_subprogs);
> -       if (!p)
> +       if (off >= env->prog->len || off < 0)
>                 return -ENOENT;
> -       return p - env->subprog_info;
>
> +       while (low <= high) {
> +               int mid = (low + high)/2;

styling nit: (...) / 2

> +               struct bpf_subprog_info *val = &vals[mid];
> +               int diff = off - val->start;
> +
> +               if (diff < 0) {

tbh, this hurts my brain. Why not write human-readable and more meaningful

if (off < val->start)

?


> +                       high = mid - 1;
> +               } else {
> +                       low = mid + 1;
> +                       /* remember last time mid.start <= off */
> +                       ret = mid;
> +               }

feel free to ignore, but I find this unnecessary `ret = mid` part a
bit inelegant. See find_linfo in kernel/bpf/log.c for how
lower_bound-like binary search could be implemented without this (I
mean the pattern where invariant keeps low or high as always
satisfying the condition and the other one being adjusted with +1 or
-1, depending on desired logic).


> +       }
> +       return ret;
> +}
> +
> +/* Find subprogram that starts exactly at 'off' */
> +static int find_subprog(struct bpf_verifier_env *env, int off)
> +{
> +       int idx;
> +
> +       idx = find_containing_subprog(env, off);
> +       if (idx < 0 || env->subprog_info[idx].start != off)
> +               return -ENOENT;
> +       return idx;
>  }
>

[...]

> +static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
> +{
> +       u8 mask;
> +       int i;
> +
> +       if (!fn->nocsr)
> +               return ALL_CALLER_SAVED_REGS;
> +
> +       mask = 0;
> +       mask |= fn->ret_type == RET_VOID ? 0 : BIT(BPF_REG_0);
> +       for (i = 0; i < ARRAY_SIZE(fn->arg_type); ++i)
> +               mask |= fn->arg_type[i] == ARG_DONTCARE ? 0 : BIT(BPF_REG_1 + i);

again subjective, but

if (fn->ret_type != RET_VOID)
    mask |= BIT(BPF_REG_0);

(and similarly for ARG_DONTCARE)

seems a bit more readable and not that much more verbose

> +       return mask;
> +}
> +
> +/* True if do_misc_fixups() replaces calls to helper number 'imm',
> + * replacement patch is presumed to follow no_caller_saved_registers contract
> + * (see match_and_mark_nocsr_pattern() below).
> + */
> +static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> +{

note that there is now also bpf_jit_inlines_helper_call()


> +       return false;
> +}
> +
> +/* If 'insn' is a call that follows no_caller_saved_registers contract
> + * and called function is inlined by current jit, return a mask with
> + * 1s corresponding to registers that are scratched by this call
> + * (depends on return type and number of return parameters).
> + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> + */
> +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> +{
> +       const struct bpf_func_proto *fn;
> +
> +       if (bpf_helper_call(insn) &&
> +           verifier_inlines_helper_call(env, insn->imm) &&

strictly speaking, does nocsr have anything to do with inlining,
though? E.g., if we know for sure (however, that's a separate issue)
that helper implementation doesn't touch extra registers, why do we
need inlining to make use of nocsr?

> +           get_helper_proto(env, insn->imm, &fn) == 0 &&
> +           fn->nocsr)
> +               return ~get_helper_reg_mask(fn);
> +
> +       return ALL_CALLER_SAVED_REGS;
> +}
> +

[...]

> + * For example, it is *not* safe to remove spill/fill below:
> + *
> + *   r1 = 1;
> + *   *(u64 *)(r10 - 8)  = r1;            r1 = 1;
> + *   call %[to_be_inlined_by_jit]  -->   call %[to_be_inlined_by_jit]
> + *   r1 = *(u64 *)(r10 - 8);             r0 = *(u64 *)(r10 - 8);  <---- wrong !!!
> + *   r0 = *(u64 *)(r10 - 8);             r0 += r1;
> + *   r0 += r1;                           exit;
> + *   exit;
> + */
> +static int match_and_mark_nocsr_pattern(struct bpf_verifier_env *env, int t, bool mark)
> +{
> +       struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
> +       struct bpf_subprog_info *subprog;
> +       u32 csr_mask = call_csr_mask(env, &insns[t]);
> +       u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;
> +       int s, i;
> +       s16 off;
> +
> +       if (csr_mask == ALL_CALLER_SAVED_REGS)
> +               return false;

false -> 0 ?

> +
> +       for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
> +               if (t - i < 0 || t + i >= env->prog->len)
> +                       break;
> +               stx = &insns[t - i];
> +               ldx = &insns[t + i];
> +               if (off == 0) {
> +                       off = stx->off;
> +                       if (off % BPF_REG_SIZE != 0)
> +                               break;

just return here?

> +               }
> +               if (/* *(u64 *)(r10 - off) = r[0-5]? */
> +                   stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> +                   stx->dst_reg != BPF_REG_10 ||
> +                   /* r[0-5] = *(u64 *)(r10 - off)? */
> +                   ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
> +                   ldx->src_reg != BPF_REG_10 ||
> +                   /* check spill/fill for the same reg and offset */
> +                   stx->src_reg != ldx->dst_reg ||
> +                   stx->off != ldx->off ||
> +                   stx->off != off ||
> +                   /* this should be a previously unseen register */
> +                   BIT(stx->src_reg) & reg_mask)
> +                       break;
> +               reg_mask |= BIT(stx->src_reg);
> +               if (mark) {
> +                       env->insn_aux_data[t - i].nocsr_pattern = true;
> +                       env->insn_aux_data[t + i].nocsr_pattern = true;
> +               }
> +       }
> +       if (i == 1)
> +               return 0;
> +       if (mark) {
> +               s = find_containing_subprog(env, t);
> +               /* can't happen */
> +               if (WARN_ON_ONCE(s < 0))
> +                       return 0;
> +               subprog = &env->subprog_info[s];
> +               subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
> +       }

why not split pattern detection and all this other marking logic? You
can return "the size of csr pattern", meaning how many spills/fills
are there surrounding the call, no? Then all the marking can be done
(if necessary) by the caller.

The question is what to do with zero patter (no spills/fills for nocsr
call, is that valid case?)

> +       return i - 1;
> +}
> +
> +/* If instruction 't' is a nocsr call surrounded by spill/fill pairs,
> + * update env->subprog_info[_]->nocsr_stack_off and
> + * env->insn_aux_data[_].nocsr_pattern fields.
> + */
> +static void update_nocsr_pattern_marks(struct bpf_verifier_env *env, int t)
> +{
> +       match_and_mark_nocsr_pattern(env, t, true);
> +}
> +
> +/* If instruction 't' is a nocsr call surrounded by spill/fill pairs,
> + * return the number of such pairs.
> + */
> +static int match_nocsr_pattern(struct bpf_verifier_env *env, int t)
> +{
> +       return match_and_mark_nocsr_pattern(env, t, false);
> +}
> +
>  /* Visits the instruction at index t and returns one of the following:
>   *  < 0 - an error occurred
>   *  DONE_EXPLORING - the instruction was fully explored
> @@ -16017,6 +16262,8 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
>                                 mark_force_checkpoint(env, t);
>                         }
>                 }
> +               if (insn->src_reg == 0)
> +                       update_nocsr_pattern_marks(env, t);

as you mentioned, we discussed moving this from check_cfg() step, as
it doesn't seem to be coupled with "graph" part of the algorithm

>                 return visit_func_call_insn(t, insns, env, insn->src_reg == BPF_PSEUDO_CALL);
>
>         case BPF_JA:
> @@ -19063,15 +19310,16 @@ static int opt_remove_dead_code(struct bpf_verifier_env *env)
>         return 0;
>  }
>
> +static const struct bpf_insn NOP = BPF_JMP_IMM(BPF_JA, 0, 0, 0);
> +
>  static int opt_remove_nops(struct bpf_verifier_env *env)
>  {
> -       const struct bpf_insn ja = BPF_JMP_IMM(BPF_JA, 0, 0, 0);
>         struct bpf_insn *insn = env->prog->insnsi;
>         int insn_cnt = env->prog->len;
>         int i, err;
>
>         for (i = 0; i < insn_cnt; i++) {
> -               if (memcmp(&insn[i], &ja, sizeof(ja)))
> +               if (memcmp(&insn[i], &NOP, sizeof(NOP)))
>                         continue;
>
>                 err = verifier_remove_insns(env, i, 1);
> @@ -20801,6 +21049,39 @@ static int optimize_bpf_loop(struct bpf_verifier_env *env)
>         return 0;
>  }
>
> +/* Remove unnecessary spill/fill pairs, members of nocsr pattern.
> + * Do this as a separate pass to avoid interfering with helper/kfunc
> + * inlining logic in do_misc_fixups().
> + * See comment for match_and_mark_nocsr_pattern().
> + */
> +static int remove_nocsr_spills_fills(struct bpf_verifier_env *env)
> +{
> +       struct bpf_subprog_info *subprogs = env->subprog_info;
> +       int i, j, spills_num, cur_subprog = 0;
> +       struct bpf_insn *insn = env->prog->insnsi;
> +       int insn_cnt = env->prog->len;
> +
> +       for (i = 0; i < insn_cnt; i++, insn++) {
> +               spills_num = match_nocsr_pattern(env, i);

we can probably afford a single-byte field somewhere in
bpf_insn_aux_data to remember "csr pattern size" instead of just a
true/false fact that it is nocsr call. And so we wouldn't need to do
pattern matching again here, we'll just have all the data.


> +               if (spills_num == 0)
> +                       goto next_insn;
> +               for (j = 1; j <= spills_num; ++j)
> +                       if ((insn - j)->off >= subprogs[cur_subprog].nocsr_stack_off ||
> +                           (insn + j)->off >= subprogs[cur_subprog].nocsr_stack_off)
> +                               goto next_insn;
> +               /* NOPs are removed by opt_remove_nops() later */
> +               for (j = 1; j <= spills_num; ++j) {
> +                       *(insn - j) = NOP;
> +                       *(insn + j) = NOP;
> +               }
> +
> +next_insn:
> +               if (subprogs[cur_subprog + 1].start == i + 1)
> +                       cur_subprog++;
> +       }
> +       return 0;
> +}
> +
>  static void free_states(struct bpf_verifier_env *env)
>  {
>         struct bpf_verifier_state_list *sl, *sln;
> @@ -21719,6 +22000,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>         if (ret == 0)
>                 ret = optimize_bpf_loop(env);
>
> +       if (ret == 0)
> +               ret = remove_nocsr_spills_fills(env);
> +
>         if (is_priv) {
>                 if (ret == 0)
>                         opt_hard_wire_dead_code_branches(env);
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-02  0:41   ` Andrii Nakryiko
@ 2024-07-02 20:38     ` Eduard Zingerman
  2024-07-02 21:09       ` Andrii Nakryiko
  0 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 20:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Alexei Starovoitov

On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:

[...]

> > - clang generates a simple pattern for nocsr calls, e.g.:
> > 
> >     r1 = 1;
> >     r2 = 2;
> >     *(u64 *)(r10 - 8)  = r1;
> >     *(u64 *)(r10 - 16) = r2;
> >     call %[to_be_inlined_by_jit]
> 
> "inline_by_jit" is misleading, it can be inlined by BPF verifier using
> BPF instructions, not just by BPF JIT

Will change

> >     r2 = *(u64 *)(r10 - 16);
> >     r1 = *(u64 *)(r10 - 8);
> >     r0 = r1;
> >     r0 += r2;
> >     exit;

[...]
> 
> > -static int find_subprog(struct bpf_verifier_env *env, int off)
> > +/* Find subprogram that contains instruction at 'off' */
> > +static int find_containing_subprog(struct bpf_verifier_env *env, int off)
> >  {
> > -       struct bpf_subprog_info *p;
> > +       struct bpf_subprog_info *vals = env->subprog_info;
> > +       int high = env->subprog_cnt - 1;
> > +       int low = 0, ret = -ENOENT;
> > 
> > -       p = bsearch(&off, env->subprog_info, env->subprog_cnt,
> > -                   sizeof(env->subprog_info[0]), cmp_subprogs);
> > -       if (!p)
> > +       if (off >= env->prog->len || off < 0)
> >                 return -ENOENT;
> > -       return p - env->subprog_info;
> > 
> > +       while (low <= high) {
> > +               int mid = (low + high)/2;
> 
> styling nit: (...) / 2
> 
> > +               struct bpf_subprog_info *val = &vals[mid];
> > +               int diff = off - val->start;
> > +
> > +               if (diff < 0) {
> 
> tbh, this hurts my brain. Why not write human-readable and more meaningful
> 
> if (off < val->start)
> 
> ?

No reason, will change.

> > +                       high = mid - 1;
> > +               } else {
> > +                       low = mid + 1;
> > +                       /* remember last time mid.start <= off */
> > +                       ret = mid;
> > +               }
> 
> feel free to ignore, but I find this unnecessary `ret = mid` part a
> bit inelegant. See find_linfo in kernel/bpf/log.c for how
> lower_bound-like binary search could be implemented without this (I
> mean the pattern where invariant keeps low or high as always
> satisfying the condition and the other one being adjusted with +1 or
> -1, depending on desired logic).

Will steal the code from there, thank you.

[...]

> > +static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
> > +{
> > +       u8 mask;
> > +       int i;
> > +
> > +       if (!fn->nocsr)
> > +               return ALL_CALLER_SAVED_REGS;
> > +
> > +       mask = 0;
> > +       mask |= fn->ret_type == RET_VOID ? 0 : BIT(BPF_REG_0);
> > +       for (i = 0; i < ARRAY_SIZE(fn->arg_type); ++i)
> > +               mask |= fn->arg_type[i] == ARG_DONTCARE ? 0 : BIT(BPF_REG_1 + i);
> 
> again subjective, but
> 
> if (fn->ret_type != RET_VOID)
>     mask |= BIT(BPF_REG_0);
> 
> (and similarly for ARG_DONTCARE)
> 
> seems a bit more readable and not that much more verbose

Sure, will change.

[...]

> > +static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> > +{
> 
> note that there is now also bpf_jit_inlines_helper_call()

There is.
I'd like authors of jit inline patches to explicitly opt-in
for nocsr, vouching that inline patch follows nocsr contract.
The idea is that people would extend call_csr_mask() as necessary.

> 
> 
> > +       return false;
> > +}
> > +
> > +/* If 'insn' is a call that follows no_caller_saved_registers contract
> > + * and called function is inlined by current jit, return a mask with
> > + * 1s corresponding to registers that are scratched by this call
> > + * (depends on return type and number of return parameters).
> > + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> > + */
> > +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> > +{
> > +       const struct bpf_func_proto *fn;
> > +
> > +       if (bpf_helper_call(insn) &&
> > +           verifier_inlines_helper_call(env, insn->imm) &&
> 
> strictly speaking, does nocsr have anything to do with inlining,
> though? E.g., if we know for sure (however, that's a separate issue)
> that helper implementation doesn't touch extra registers, why do we
> need inlining to make use of nocsr?

Technically, alternative for nocsr is for C version of the
helper/kfunc itself has no_caller_saved_registers attribute.
Grep shows a single function annotated as such in kernel tree:
stackleak_track_stack().
Or, maybe, for helpers/kfuncs implemented in assembly.
Whenever such helpers/kfuncs are added, this function could be extended.

> 
> > +           get_helper_proto(env, insn->imm, &fn) == 0 &&
> > +           fn->nocsr)
> > +               return ~get_helper_reg_mask(fn);
> > +
> > +       return ALL_CALLER_SAVED_REGS;
> > +}
> > +

[...]

> > +static int match_and_mark_nocsr_pattern(struct bpf_verifier_env *env, int t, bool mark)
> > +{
> > +       struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx;
> > +       struct bpf_subprog_info *subprog;
> > +       u32 csr_mask = call_csr_mask(env, &insns[t]);
> > +       u32 reg_mask = ~csr_mask | ~ALL_CALLER_SAVED_REGS;
> > +       int s, i;
> > +       s16 off;
> > +
> > +       if (csr_mask == ALL_CALLER_SAVED_REGS)
> > +               return false;
> 
> false -> 0 ?

Right.

> 
> > +
> > +       for (i = 1, off = 0; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) {
> > +               if (t - i < 0 || t + i >= env->prog->len)
> > +                       break;
> > +               stx = &insns[t - i];
> > +               ldx = &insns[t + i];
> > +               if (off == 0) {
> > +                       off = stx->off;
> > +                       if (off % BPF_REG_SIZE != 0)
> > +                               break;
> 

Makes sense.

> > +               }
> > +               if (/* *(u64 *)(r10 - off) = r[0-5]? */
> > +                   stx->code != (BPF_STX | BPF_MEM | BPF_DW) ||
> > +                   stx->dst_reg != BPF_REG_10 ||
> > +                   /* r[0-5] = *(u64 *)(r10 - off)? */
> > +                   ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) ||
> > +                   ldx->src_reg != BPF_REG_10 ||
> > +                   /* check spill/fill for the same reg and offset */
> > +                   stx->src_reg != ldx->dst_reg ||
> > +                   stx->off != ldx->off ||
> > +                   stx->off != off ||
> > +                   /* this should be a previously unseen register */
> > +                   BIT(stx->src_reg) & reg_mask)
> > +                       break;
> > +               reg_mask |= BIT(stx->src_reg);
> > +               if (mark) {
> > +                       env->insn_aux_data[t - i].nocsr_pattern = true;
> > +                       env->insn_aux_data[t + i].nocsr_pattern = true;
> > +               }
> > +       }
> > +       if (i == 1)
> > +               return 0;
> > +       if (mark) {
> > +               s = find_containing_subprog(env, t);
> > +               /* can't happen */
> > +               if (WARN_ON_ONCE(s < 0))
> > +                       return 0;
> > +               subprog = &env->subprog_info[s];
> > +               subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
> > +       }
> 
> why not split pattern detection and all this other marking logic? You
> can return "the size of csr pattern", meaning how many spills/fills
> are there surrounding the call, no? Then all the marking can be done
> (if necessary) by the caller.

I'll try recording pattern size for function call,
so no split would be necessary.

> The question is what to do with zero patter (no spills/fills for nocsr
> call, is that valid case?)

Zero pattern -- no spills/fills to remove, so nothing to do.
I will add a test case, but it should be handled by current implementation fine.

[...]

> > +/* Remove unnecessary spill/fill pairs, members of nocsr pattern.
> > + * Do this as a separate pass to avoid interfering with helper/kfunc
> > + * inlining logic in do_misc_fixups().
> > + * See comment for match_and_mark_nocsr_pattern().
> > + */
> > +static int remove_nocsr_spills_fills(struct bpf_verifier_env *env)
> > +{
> > +       struct bpf_subprog_info *subprogs = env->subprog_info;
> > +       int i, j, spills_num, cur_subprog = 0;
> > +       struct bpf_insn *insn = env->prog->insnsi;
> > +       int insn_cnt = env->prog->len;
> > +
> > +       for (i = 0; i < insn_cnt; i++, insn++) {
> > +               spills_num = match_nocsr_pattern(env, i);
> 
> we can probably afford a single-byte field somewhere in
> bpf_insn_aux_data to remember "csr pattern size" instead of just a
> true/false fact that it is nocsr call. And so we wouldn't need to do
> pattern matching again here, we'll just have all the data.

Makes sense.

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-02 20:38     ` Eduard Zingerman
@ 2024-07-02 21:09       ` Andrii Nakryiko
  2024-07-02 21:19         ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02 21:09 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Alexei Starovoitov

On Tue, Jul 2, 2024 at 1:38 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:
>
> [...]
>

[...]

> > > +static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> > > +{
> >
> > note that there is now also bpf_jit_inlines_helper_call()
>
> There is.
> I'd like authors of jit inline patches to explicitly opt-in
> for nocsr, vouching that inline patch follows nocsr contract.
> The idea is that people would extend call_csr_mask() as necessary.
>

you are defining a general framework with these changes, though, so
let's introduce a standard and simple way to do this. Say, in addition
to having arch-specific bpf_jit_inlines_helper_call() we can have
bpf_jit_supports_helper_nocsr() or something. And they should be
defined next to each other, so whenever one changes it's easier to
remember to change the other one.

I don't think requiring arm64 contributors to change the code of
call_csr_mask() is the right approach.

> >
> >
> > > +       return false;
> > > +}
> > > +
> > > +/* If 'insn' is a call that follows no_caller_saved_registers contract
> > > + * and called function is inlined by current jit, return a mask with
> > > + * 1s corresponding to registers that are scratched by this call
> > > + * (depends on return type and number of return parameters).
> > > + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> > > + */
> > > +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> > > +{
> > > +       const struct bpf_func_proto *fn;
> > > +
> > > +       if (bpf_helper_call(insn) &&
> > > +           verifier_inlines_helper_call(env, insn->imm) &&
> >
> > strictly speaking, does nocsr have anything to do with inlining,
> > though? E.g., if we know for sure (however, that's a separate issue)
> > that helper implementation doesn't touch extra registers, why do we
> > need inlining to make use of nocsr?
>
> Technically, alternative for nocsr is for C version of the
> helper/kfunc itself has no_caller_saved_registers attribute.
> Grep shows a single function annotated as such in kernel tree:
> stackleak_track_stack().
> Or, maybe, for helpers/kfuncs implemented in assembly.

Yes, I suppose it's too dangerous to rely on the compiler to not use
some extra register. I guess worst case we can "inline" helper by
keeping call to it intact :)

> Whenever such helpers/kfuncs are added, this function could be extended.
>
> >
> > > +           get_helper_proto(env, insn->imm, &fn) == 0 &&
> > > +           fn->nocsr)
> > > +               return ~get_helper_reg_mask(fn);
> > > +
> > > +       return ALL_CALLER_SAVED_REGS;
> > > +}
> > > +
>
> [...]
>

[...]

> > > +       if (i == 1)
> > > +               return 0;
> > > +       if (mark) {
> > > +               s = find_containing_subprog(env, t);
> > > +               /* can't happen */
> > > +               if (WARN_ON_ONCE(s < 0))
> > > +                       return 0;
> > > +               subprog = &env->subprog_info[s];
> > > +               subprog->nocsr_stack_off = min(subprog->nocsr_stack_off, off);
> > > +       }
> >
> > why not split pattern detection and all this other marking logic? You
> > can return "the size of csr pattern", meaning how many spills/fills
> > are there surrounding the call, no? Then all the marking can be done
> > (if necessary) by the caller.
>
> I'll try recording pattern size for function call,
> so no split would be necessary.
>

ack

> > The question is what to do with zero patter (no spills/fills for nocsr
> > call, is that valid case?)
>
> Zero pattern -- no spills/fills to remove, so nothing to do.
> I will add a test case, but it should be handled by current implementation fine.

yep, thanks

>
> [...]
>
> > > +/* Remove unnecessary spill/fill pairs, members of nocsr pattern.
> > > + * Do this as a separate pass to avoid interfering with helper/kfunc
> > > + * inlining logic in do_misc_fixups().
> > > + * See comment for match_and_mark_nocsr_pattern().
> > > + */
> > > +static int remove_nocsr_spills_fills(struct bpf_verifier_env *env)
> > > +{
> > > +       struct bpf_subprog_info *subprogs = env->subprog_info;
> > > +       int i, j, spills_num, cur_subprog = 0;
> > > +       struct bpf_insn *insn = env->prog->insnsi;
> > > +       int insn_cnt = env->prog->len;
> > > +
> > > +       for (i = 0; i < insn_cnt; i++, insn++) {
> > > +               spills_num = match_nocsr_pattern(env, i);
> >
> > we can probably afford a single-byte field somewhere in
> > bpf_insn_aux_data to remember "csr pattern size" instead of just a
> > true/false fact that it is nocsr call. And so we wouldn't need to do
> > pattern matching again here, we'll just have all the data.
>
> Makes sense.
>
> [...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-02 21:09       ` Andrii Nakryiko
@ 2024-07-02 21:19         ` Eduard Zingerman
  2024-07-02 21:22           ` Andrii Nakryiko
  0 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 21:19 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Alexei Starovoitov

On Tue, 2024-07-02 at 14:09 -0700, Andrii Nakryiko wrote:

[...]

> you are defining a general framework with these changes, though, so
> let's introduce a standard and simple way to do this. Say, in addition
> to having arch-specific bpf_jit_inlines_helper_call() we can have
> bpf_jit_supports_helper_nocsr() or something. And they should be
> defined next to each other, so whenever one changes it's easier to
> remember to change the other one.
> 
> I don't think requiring arm64 contributors to change the code of
> call_csr_mask() is the right approach.

I'd change the return value for bpf_jit_inlines_helper_call() to enum,
to avoid naming functions several times.

[...]

> > > strictly speaking, does nocsr have anything to do with inlining,
> > > though? E.g., if we know for sure (however, that's a separate issue)
> > > that helper implementation doesn't touch extra registers, why do we
> > > need inlining to make use of nocsr?
> > 
> > Technically, alternative for nocsr is for C version of the
> > helper/kfunc itself has no_caller_saved_registers attribute.
> > Grep shows a single function annotated as such in kernel tree:
> > stackleak_track_stack().
> > Or, maybe, for helpers/kfuncs implemented in assembly.
> 
> Yes, I suppose it's too dangerous to rely on the compiler to not use
> some extra register. I guess worst case we can "inline" helper by
> keeping call to it intact :)

Something like that.

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-02 21:19         ` Eduard Zingerman
@ 2024-07-02 21:22           ` Andrii Nakryiko
  0 siblings, 0 replies; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02 21:22 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Alexei Starovoitov

On Tue, Jul 2, 2024 at 2:19 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2024-07-02 at 14:09 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > you are defining a general framework with these changes, though, so
> > let's introduce a standard and simple way to do this. Say, in addition
> > to having arch-specific bpf_jit_inlines_helper_call() we can have
> > bpf_jit_supports_helper_nocsr() or something. And they should be
> > defined next to each other, so whenever one changes it's easier to
> > remember to change the other one.
> >
> > I don't think requiring arm64 contributors to change the code of
> > call_csr_mask() is the right approach.
>
> I'd change the return value for bpf_jit_inlines_helper_call() to enum,
> to avoid naming functions several times.

Hm... I don't know, feels ugly. nocsr is sort of a dependent but
orthogonal axis, so I'd keep it separate. Imagine we add yet another
property that relies on inlining, but is independent of nocsr. I guess
we can have enum be a bit set, but I'd start simple with two
functions.

>
> [...]
>
> > > > strictly speaking, does nocsr have anything to do with inlining,
> > > > though? E.g., if we know for sure (however, that's a separate issue)
> > > > that helper implementation doesn't touch extra registers, why do we
> > > > need inlining to make use of nocsr?
> > >
> > > Technically, alternative for nocsr is for C version of the
> > > helper/kfunc itself has no_caller_saved_registers attribute.
> > > Grep shows a single function annotated as such in kernel tree:
> > > stackleak_track_stack().
> > > Or, maybe, for helpers/kfuncs implemented in assembly.
> >
> > Yes, I suppose it's too dangerous to rely on the compiler to not use
> > some extra register. I guess worst case we can "inline" helper by
> > keeping call to it intact :)
>
> Something like that.
>
> [...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-06-29  9:47 ` [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
  2024-07-01 19:01   ` Eduard Zingerman
  2024-07-02  0:41   ` Andrii Nakryiko
@ 2024-07-03 11:57   ` Puranjay Mohan
  2024-07-03 16:13     ` Eduard Zingerman
  2 siblings, 1 reply; 47+ messages in thread
From: Puranjay Mohan @ 2024-07-03 11:57 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman, Alexei Starovoitov, puranjay12

[-- Attachment #1: Type: text/plain, Size: 6702 bytes --]

Eduard Zingerman <eddyz87@gmail.com> writes:

> GCC and LLVM define a no_caller_saved_registers function attribute.
> This attribute means that function scratches only some of
> the caller saved registers defined by ABI.
> For BPF the set of such registers could be defined as follows:
> - R0 is scratched only if function is non-void;
> - R1-R5 are scratched only if corresponding parameter type is defined
>   in the function prototype.
>
> This commit introduces flag bpf_func_prot->nocsr.
> If this flag is set for some helper function, verifier assumes that
> it follows no_caller_saved_registers calling convention.
>
> The contract between kernel and clang allows to simultaneously use
> such functions and maintain backwards compatibility with old
> kernels that don't understand no_caller_saved_registers calls
> (nocsr for short):
>
> - clang generates a simple pattern for nocsr calls, e.g.:
>
>     r1 = 1;
>     r2 = 2;
>     *(u64 *)(r10 - 8)  = r1;
>     *(u64 *)(r10 - 16) = r2;
>     call %[to_be_inlined_by_jit]

The call can be inlined by the verifier (in do_misc_fixups()) or by the
JIT if bpf_jit_inlines_helper_call() is true for a helper.

>     r2 = *(u64 *)(r10 - 16);
>     r1 = *(u64 *)(r10 - 8);
>     r0 = r1;
>     r0 += r2;
>     exit;
>
> - kernel removes unnecessary spills and fills, if called function is
>   inlined by current JIT (with assumption that patch inserted by JIT
>   honors nocsr contract, e.g. does not scratch r3-r5 for the example
>   above), e.g. the code above would be transformed to:
>
>     r1 = 1;
>     r2 = 2;
>     call %[to_be_inlined_by_jit]

Same as above

>     r0 = r1;
>     r0 += r2;
>     exit;
>

[.....]

> +/* True if do_misc_fixups() replaces calls to helper number 'imm',
> + * replacement patch is presumed to follow no_caller_saved_registers contract
> + * (see match_and_mark_nocsr_pattern() below).
> + */
> +static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> +{
> +	return false;
> +}
> +
> +/* If 'insn' is a call that follows no_caller_saved_registers contract
> + * and called function is inlined by current jit, return a mask with

We should update the comment to: inlined by the verifier or by the
current JIT.

> + * 1s corresponding to registers that are scratched by this call
> + * (depends on return type and number of return parameters).
> + * Otherwise return ALL_CALLER_SAVED_REGS mask.
> + */
> +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> +{
> +	const struct bpf_func_proto *fn;
> +
> +	if (bpf_helper_call(insn) &&
> +	    verifier_inlines_helper_call(env, insn->imm) &&

This should also check bpf_jit_inlines_helper_call(insn->imm) as the JIT
can also inline helper calls separately from the verifier.

    if (bpf_helper_call(insn) &&
        (verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)) &&

This is currently being done by the arm64 and risc-v JITs and they don't
scratch any register except R0 (The helpers inlined by these JITs are
non-void).

> +	    get_helper_proto(env, insn->imm, &fn) == 0 &&
> +	    fn->nocsr)
> +		return ~get_helper_reg_mask(fn);
> +
> +	return ALL_CALLER_SAVED_REGS;
> +}
> +
> +/* GCC and LLVM define a no_caller_saved_registers function attribute.
> + * This attribute means that function scratches only some of
> + * the caller saved registers defined by ABI.
> + * For BPF the set of such registers could be defined as follows:
> + * - R0 is scratched only if function is non-void;
> + * - R1-R5 are scratched only if corresponding parameter type is defined
> + *   in the function prototype.
> + *
> + * The contract between kernel and clang allows to simultaneously use
> + * such functions and maintain backwards compatibility with old
> + * kernels that don't understand no_caller_saved_registers calls
> + * (nocsr for short):
> + *
> + * - for nocsr calls clang allocates registers as-if relevant r0-r5
> + *   registers are not scratched by the call;
> + *
> + * - as a post-processing step, clang visits each nocsr call and adds
> + *   spill/fill for every live r0-r5;
> + *
> + * - stack offsets used for the spill/fill are allocated as minimal
> + *   stack offsets in whole function and are not used for any other
> + *   purposes;
> + *
> + * - when kernel loads a program, it looks for such patterns
> + *   (nocsr function surrounded by spills/fills) and checks if
> + *   spill/fill stack offsets are used exclusively in nocsr patterns;
> + *
> + * - if so, and if current JIT inlines the call to the nocsr function

We should update the comment to: if the verifier or the current JIT.

> + *   (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
> + *
> + * - when old kernel loads a program, presence of spill/fill pairs
> + *   keeps BPF program valid, albeit slightly less efficient.
> + *
> + * For example:
> + *
> + *   r1 = 1;
> + *   r2 = 2;
> + *   *(u64 *)(r10 - 8)  = r1;            r1 = 1;
> + *   *(u64 *)(r10 - 16) = r2;            r2 = 2;
> + *   call %[to_be_inlined_by_jit]  -->   call %[to_be_inlined_by_jit]

We should update this to say: to_be_inlined_by_verifier_or_jit

> + *   r2 = *(u64 *)(r10 - 16);            r0 = r1;
> + *   r1 = *(u64 *)(r10 - 8);             r0 += r2;
> + *   r0 = r1;                            exit;
> + *   r0 += r2;
> + *   exit;
> + *
> + * The purpose of match_and_mark_nocsr_pattern is to:
> + * - look for such patterns;
> + * - mark spill and fill instructions in env->insn_aux_data[*].nocsr_pattern;
> + * - update env->subprog_info[*]->nocsr_stack_off to find an offset
> + *   at which nocsr spill/fill stack slots start.
> + *
> + * The .nocsr_pattern and .nocsr_stack_off are used by
> + * check_nocsr_stack_contract() to check if every stack access to
> + * nocsr spill/fill stack slot originates from spill/fill
> + * instructions, members of nocsr patterns.
> + *
> + * If such condition holds true for a subprogram, nocsr patterns could
> + * be rewritten by remove_nocsr_spills_fills().
> + * Otherwise nocsr patterns are not changed in the subprogram
> + * (code, presumably, generated by an older clang version).
> + *
> + * For example, it is *not* safe to remove spill/fill below:
> + *
> + *   r1 = 1;
> + *   *(u64 *)(r10 - 8)  = r1;            r1 = 1;
> + *   call %[to_be_inlined_by_jit]  -->   call %[to_be_inlined_by_jit]

Same as above.

Thanks for working on this feature.

Most of my comments except for bpf_jit_inlines_helper_call() are nit
picks regarding the comments or the commit message, but as the verifier
is already complicated, I feel it is useful to have extremely accurate
comments/commit messages for new contributors.

Thanks,
Puranjay

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 255 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-03 11:57   ` Puranjay Mohan
@ 2024-07-03 16:13     ` Eduard Zingerman
  2024-07-04 10:55       ` Puranjay Mohan
  0 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-03 16:13 UTC (permalink / raw)
  To: Puranjay Mohan, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Alexei Starovoitov, puranjay12

On Wed, 2024-07-03 at 11:57 +0000, Puranjay Mohan wrote:

[...]

> > +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
> > +{
> > +	const struct bpf_func_proto *fn;
> > +
> > +	if (bpf_helper_call(insn) &&
> > +	    verifier_inlines_helper_call(env, insn->imm) &&
> 
> This should also check bpf_jit_inlines_helper_call(insn->imm) as the JIT
> can also inline helper calls separately from the verifier.
> 
>     if (bpf_helper_call(insn) &&
>         (verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)) &&
> 
> This is currently being done by the arm64 and risc-v JITs and they don't
> scratch any register except R0 (The helpers inlined by these JITs are
> non-void).

Hello Puranjay, thank you for commenting.
In a sibling email Andrii suggested to also add a function like below:

    __weak bool bpf_jit_supports_helper_nocsr(s32)

At the moment I see the following helpers inlined by jits:
- arm64:
  - BPF_FUNC_get_smp_processor_id
  - BPF_FUNC_get_current_task
  - BPF_FUNC_get_current_task_btf
- riscv:
  - BPF_FUNC_get_smp_processor_id

I suspect (but need to double check) that all of these patches conform
to nocsr. If so, what are you thoughts regarding bpf_jit_supports_helper_nocsr():
do we need it, or should inlining imply nocsr?

Thanks,
Eduard

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls
  2024-07-03 16:13     ` Eduard Zingerman
@ 2024-07-04 10:55       ` Puranjay Mohan
  0 siblings, 0 replies; 47+ messages in thread
From: Puranjay Mohan @ 2024-07-04 10:55 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Alexei Starovoitov

[-- Attachment #1: Type: text/plain, Size: 1744 bytes --]

Eduard Zingerman <eddyz87@gmail.com> writes:

> On Wed, 2024-07-03 at 11:57 +0000, Puranjay Mohan wrote:
>
> [...]
>
>> > +static u32 call_csr_mask(struct bpf_verifier_env *env, struct bpf_insn *insn)
>> > +{
>> > +	const struct bpf_func_proto *fn;
>> > +
>> > +	if (bpf_helper_call(insn) &&
>> > +	    verifier_inlines_helper_call(env, insn->imm) &&
>> 
>> This should also check bpf_jit_inlines_helper_call(insn->imm) as the JIT
>> can also inline helper calls separately from the verifier.
>> 
>>     if (bpf_helper_call(insn) &&
>>         (verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)) &&
>> 
>> This is currently being done by the arm64 and risc-v JITs and they don't
>> scratch any register except R0 (The helpers inlined by these JITs are
>> non-void).
>
> Hello Puranjay, thank you for commenting.
> In a sibling email Andrii suggested to also add a function like below:
>
>     __weak bool bpf_jit_supports_helper_nocsr(s32)
>
> At the moment I see the following helpers inlined by jits:
> - arm64:
>   - BPF_FUNC_get_smp_processor_id
>   - BPF_FUNC_get_current_task
>   - BPF_FUNC_get_current_task_btf
> - riscv:
>   - BPF_FUNC_get_smp_processor_id

Yes, all of the above conform to nocsr.

> I suspect (but need to double check) that all of these patches conform
> to nocsr. If so, what are you thoughts regarding bpf_jit_supports_helper_nocsr():
> do we need it, or should inlining imply nocsr?

I don't think we need bpf_jit_supports_helper_nocsr() because JITs will
be only inlining very simple helpers and will not clobber caller saved
registers. JITs anyway use temperory registers for intermediate steps if
required.

Thanks,
Puranjay

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 255 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
  2024-06-29  9:47 ` [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function Eduard Zingerman
  2024-06-29  9:47 ` [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
@ 2024-06-29  9:47 ` Eduard Zingerman
  2024-07-02  0:41   ` Andrii Nakryiko
  2024-06-29  9:47 ` [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

On x86 verifier replaces call to bpf_get_smp_processor_id() with a
sequence of instructions that modify only R0.
This satisfies attribute no_caller_saved_registers contract.
Allow rewrite of no_caller_saved_registers patterns for
bpf_get_smp_processor_id() in order to use this function
as a canary for no_caller_saved_registers tests.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/helpers.c  |  1 +
 kernel/bpf/verifier.c | 11 +++++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 229396172026..1df01f454590 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -158,6 +158,7 @@ const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
 	.func		= bpf_get_smp_processor_id,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
+	.nocsr		= true,
 };
 
 BPF_CALL_0(bpf_get_numa_node_id)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1340d3e60d30..d68ed70186c8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16030,7 +16030,14 @@ static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
  */
 static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
 {
-	return false;
+	switch (imm) {
+#ifdef CONFIG_X86_64
+	case BPF_FUNC_get_smp_processor_id:
+		return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
+#endif
+	default:
+		return false;
+	}
 }
 
 /* If 'insn' is a call that follows no_caller_saved_registers contract
@@ -20666,7 +20673,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 #ifdef CONFIG_X86_64
 		/* Implement bpf_get_smp_processor_id() inline. */
 		if (insn->imm == BPF_FUNC_get_smp_processor_id &&
-		    prog->jit_requested && bpf_jit_supports_percpu_insn()) {
+		    verifier_inlines_helper_call(env, insn->imm)) {
 			/* BPF_FUNC_get_smp_processor_id inlining is an
 			 * optimization, so if pcpu_hot.cpu_number is ever
 			 * changed in some incompatible and hard to support
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-06-29  9:47 ` [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
@ 2024-07-02  0:41   ` Andrii Nakryiko
  2024-07-02 20:43     ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02  0:41 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On x86 verifier replaces call to bpf_get_smp_processor_id() with a
> sequence of instructions that modify only R0.
> This satisfies attribute no_caller_saved_registers contract.
> Allow rewrite of no_caller_saved_registers patterns for
> bpf_get_smp_processor_id() in order to use this function
> as a canary for no_caller_saved_registers tests.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  kernel/bpf/helpers.c  |  1 +
>  kernel/bpf/verifier.c | 11 +++++++++--
>  2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 229396172026..1df01f454590 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -158,6 +158,7 @@ const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
>         .func           = bpf_get_smp_processor_id,
>         .gpl_only       = false,
>         .ret_type       = RET_INTEGER,
> +       .nocsr          = true,

I'm wondering if we should call this flag in such a way that it's
clear that this is more of an request, while the actual nocsr cleanup
and stuff is done only if BPF verifier/BPF JIT support that for
specific architecture/config/etc?

>  };
>
>  BPF_CALL_0(bpf_get_numa_node_id)
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 1340d3e60d30..d68ed70186c8 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -16030,7 +16030,14 @@ static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
>   */
>  static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
>  {
> -       return false;
> +       switch (imm) {
> +#ifdef CONFIG_X86_64
> +       case BPF_FUNC_get_smp_processor_id:
> +               return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
> +#endif

please see bpf_jit_inlines_helper_call(), arm64 and risc-v inline it
in JIT, so we need to validate they don't assume any of R1-R5 register
to be a scratch register


> +       default:
> +               return false;
> +       }
>  }
>
>  /* If 'insn' is a call that follows no_caller_saved_registers contract
> @@ -20666,7 +20673,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>  #ifdef CONFIG_X86_64
>                 /* Implement bpf_get_smp_processor_id() inline. */
>                 if (insn->imm == BPF_FUNC_get_smp_processor_id &&
> -                   prog->jit_requested && bpf_jit_supports_percpu_insn()) {
> +                   verifier_inlines_helper_call(env, insn->imm)) {
>                         /* BPF_FUNC_get_smp_processor_id inlining is an
>                          * optimization, so if pcpu_hot.cpu_number is ever
>                          * changed in some incompatible and hard to support
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-02  0:41   ` Andrii Nakryiko
@ 2024-07-02 20:43     ` Eduard Zingerman
  2024-07-02 21:11       ` Andrii Nakryiko
  0 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 20:43 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:

[...]

> > @@ -158,6 +158,7 @@ const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> >         .func           = bpf_get_smp_processor_id,
> >         .gpl_only       = false,
> >         .ret_type       = RET_INTEGER,
> > +       .nocsr          = true,
> 
> I'm wondering if we should call this flag in such a way that it's
> clear that this is more of an request, while the actual nocsr cleanup
> and stuff is done only if BPF verifier/BPF JIT support that for
> specific architecture/config/etc?

Can change to .allow_nocsr. On the other hand, can remove this flag
completely and rely on call_csr_mask().

[...]

> > @@ -16030,7 +16030,14 @@ static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
> >   */
> >  static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> >  {
> > -       return false;
> > +       switch (imm) {
> > +#ifdef CONFIG_X86_64
> > +       case BPF_FUNC_get_smp_processor_id:
> > +               return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
> > +#endif
> 
> please see bpf_jit_inlines_helper_call(), arm64 and risc-v inline it
> in JIT, so we need to validate they don't assume any of R1-R5 register
> to be a scratch register

At the moment I return false for this archs.
Or do you suggest these to be added in the current patch-set?

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-02 20:43     ` Eduard Zingerman
@ 2024-07-02 21:11       ` Andrii Nakryiko
  2024-07-02 21:25         ` Eduard Zingerman
  2024-07-03 11:27         ` Puranjay Mohan
  0 siblings, 2 replies; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02 21:11 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Tue, Jul 2, 2024 at 1:44 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > > @@ -158,6 +158,7 @@ const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> > >         .func           = bpf_get_smp_processor_id,
> > >         .gpl_only       = false,
> > >         .ret_type       = RET_INTEGER,
> > > +       .nocsr          = true,
> >
> > I'm wondering if we should call this flag in such a way that it's
> > clear that this is more of an request, while the actual nocsr cleanup
> > and stuff is done only if BPF verifier/BPF JIT support that for
> > specific architecture/config/etc?
>
> Can change to .allow_nocsr. On the other hand, can remove this flag
> completely and rely on call_csr_mask().

I like the declaration that helper is eligible to be close to helper
definition, so I'd definitely keep it, but yeah "allow_nocsr" seems
betterto me

>
> [...]
>
> > > @@ -16030,7 +16030,14 @@ static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
> > >   */
> > >  static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> > >  {
> > > -       return false;
> > > +       switch (imm) {
> > > +#ifdef CONFIG_X86_64
> > > +       case BPF_FUNC_get_smp_processor_id:
> > > +               return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
> > > +#endif
> >
> > please see bpf_jit_inlines_helper_call(), arm64 and risc-v inline it
> > in JIT, so we need to validate they don't assume any of R1-R5 register
> > to be a scratch register
>
> At the moment I return false for this archs.
> Or do you suggest these to be added in the current patch-set?

I'd add them from the get go. CC Puranjay to double-check?

>
> [...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-02 21:11       ` Andrii Nakryiko
@ 2024-07-02 21:25         ` Eduard Zingerman
  2024-07-03 11:27         ` Puranjay Mohan
  1 sibling, 0 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 21:25 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Tue, 2024-07-02 at 14:11 -0700, Andrii Nakryiko wrote:

[...]

> > > > +#ifdef CONFIG_X86_64
> > > > +       case BPF_FUNC_get_smp_processor_id:
> > > > +               return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
> > > > +#endif
> > > 
> > > please see bpf_jit_inlines_helper_call(), arm64 and risc-v inline it
> > > in JIT, so we need to validate they don't assume any of R1-R5 register
> > > to be a scratch register
> > 
> > At the moment I return false for this archs.
> > Or do you suggest these to be added in the current patch-set?
> 
> I'd add them from the get go. CC Puranjay to double-check?

I'll give it a try.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-02 21:11       ` Andrii Nakryiko
  2024-07-02 21:25         ` Eduard Zingerman
@ 2024-07-03 11:27         ` Puranjay Mohan
  2024-07-03 23:14           ` Eduard Zingerman
  2024-07-04 17:00           ` Eduard Zingerman
  1 sibling, 2 replies; 47+ messages in thread
From: Puranjay Mohan @ 2024-07-03 11:27 UTC (permalink / raw)
  To: Andrii Nakryiko, Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, puranjay12

[-- Attachment #1: Type: text/plain, Size: 2280 bytes --]

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Tue, Jul 2, 2024 at 1:44 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>>
>> On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:
>>
>> [...]
>>
>> > > @@ -158,6 +158,7 @@ const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
>> > >         .func           = bpf_get_smp_processor_id,
>> > >         .gpl_only       = false,
>> > >         .ret_type       = RET_INTEGER,
>> > > +       .nocsr          = true,
>> >
>> > I'm wondering if we should call this flag in such a way that it's
>> > clear that this is more of an request, while the actual nocsr cleanup
>> > and stuff is done only if BPF verifier/BPF JIT support that for
>> > specific architecture/config/etc?
>>
>> Can change to .allow_nocsr. On the other hand, can remove this flag
>> completely and rely on call_csr_mask().
>
> I like the declaration that helper is eligible to be close to helper
> definition, so I'd definitely keep it, but yeah "allow_nocsr" seems
> betterto me
>
>>
>> [...]
>>
>> > > @@ -16030,7 +16030,14 @@ static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
>> > >   */
>> > >  static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
>> > >  {
>> > > -       return false;
>> > > +       switch (imm) {
>> > > +#ifdef CONFIG_X86_64
>> > > +       case BPF_FUNC_get_smp_processor_id:
>> > > +               return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
>> > > +#endif
>> >
>> > please see bpf_jit_inlines_helper_call(), arm64 and risc-v inline it
>> > in JIT, so we need to validate they don't assume any of R1-R5 register
>> > to be a scratch register

They don't assume any register to be scratch (except R0) so we can
enable this on arm64 and riscv.

>>
>> At the moment I return false for this archs.

Yes, verifier_inlines_helper_call() should keep returning false for
arm64 and risc-v. 

>> Or do you suggest these to be added in the current patch-set?

The correct way to do this would be to change call_csr_mask() to have:

verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)

> I'd add them from the get go. CC Puranjay to double-check?


Thanks,
Puranjay

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 255 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-03 11:27         ` Puranjay Mohan
@ 2024-07-03 23:14           ` Eduard Zingerman
  2024-07-04 11:19             ` Puranjay Mohan
  2024-07-04 17:00           ` Eduard Zingerman
  1 sibling, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-03 23:14 UTC (permalink / raw)
  To: Puranjay Mohan, Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, puranjay12

On Wed, 2024-07-03 at 11:27 +0000, Puranjay Mohan wrote:

[...]

> > > > > @@ -16030,7 +16030,14 @@ static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
> > > > >   */
> > > > >  static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
> > > > >  {
> > > > > -       return false;
> > > > > +       switch (imm) {
> > > > > +#ifdef CONFIG_X86_64
> > > > > +       case BPF_FUNC_get_smp_processor_id:
> > > > > +               return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
> > > > > +#endif
> > > > 
> > > > please see bpf_jit_inlines_helper_call(), arm64 and risc-v inline it
> > > > in JIT, so we need to validate they don't assume any of R1-R5 register
> > > > to be a scratch register
> 
> They don't assume any register to be scratch (except R0) so we can
> enable this on arm64 and riscv.

Puranjay, just out of curiosity and tangential to this patch-set,
I see that get_smp_processor_id is implemented as follows in riscv jit:

  emit_ld(bpf_to_rv_reg(BPF_REG_0, ctx), offsetof(struct thread_info, cpu),
	  RV_REG_TP, ctx);

Where bpf_to_rv_reg() refers to regmap, which in turn has the following line:

  static const int regmap[] = {
	[BPF_REG_0] =	RV_REG_A5,
	...
  }

At the same time, [1] says:

> 18.2 RVG Calling Convention
> ...
> Values are returned from functions in integer registers a0 and a1 and
> floating-point registers fa0 and fa1.

[1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf

So, I would expect r0 to be mapped to a0, do you happen to know why is it a5?

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-03 23:14           ` Eduard Zingerman
@ 2024-07-04 11:19             ` Puranjay Mohan
  2024-07-04 16:39               ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Puranjay Mohan @ 2024-07-04 11:19 UTC (permalink / raw)
  To: Eduard Zingerman, Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

[-- Attachment #1: Type: text/plain, Size: 3080 bytes --]

Eduard Zingerman <eddyz87@gmail.com> writes:

> On Wed, 2024-07-03 at 11:27 +0000, Puranjay Mohan wrote:
>
> [...]
>
>> > > > > @@ -16030,7 +16030,14 @@ static u8 get_helper_reg_mask(const struct bpf_func_proto *fn)
>> > > > >   */
>> > > > >  static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm)
>> > > > >  {
>> > > > > -       return false;
>> > > > > +       switch (imm) {
>> > > > > +#ifdef CONFIG_X86_64
>> > > > > +       case BPF_FUNC_get_smp_processor_id:
>> > > > > +               return env->prog->jit_requested && bpf_jit_supports_percpu_insn();
>> > > > > +#endif
>> > > > 
>> > > > please see bpf_jit_inlines_helper_call(), arm64 and risc-v inline it
>> > > > in JIT, so we need to validate they don't assume any of R1-R5 register
>> > > > to be a scratch register
>> 
>> They don't assume any register to be scratch (except R0) so we can
>> enable this on arm64 and riscv.
>
> Puranjay, just out of curiosity and tangential to this patch-set,
> I see that get_smp_processor_id is implemented as follows in riscv jit:
>
>   emit_ld(bpf_to_rv_reg(BPF_REG_0, ctx), offsetof(struct thread_info, cpu),
> 	  RV_REG_TP, ctx);
>
> Where bpf_to_rv_reg() refers to regmap, which in turn has the following line:
>
>   static const int regmap[] = {
> 	[BPF_REG_0] =	RV_REG_A5,
> 	...
>   }
>
> At the same time, [1] says:
>
>> 18.2 RVG Calling Convention
>> ...
>> Values are returned from functions in integer registers a0 and a1 and
>> floating-point registers fa0 and fa1.
>
> [1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf
>
> So, I would expect r0 to be mapped to a0, do you happen to know why is it a5?

I had the same question when I started working with the JITs. This is
seen on both risc-v and arm64, where as you said on risc-v R0 should be
mapped to A0 but is mapped to A5. Similarly, on ARM64, BPF_R0 should be
mapped to ARM64_R0 but is mapped to ARM64_R7.

Here is my understanding of this:

The reason for this quirk is the usage of BPF register R0 as defined by
BPF Registers and calling convention [1]

[1] says:

```
* R0: return value from function calls, and exit value for BPF programs
* R1 - R5: arguments for function calls
```

On arm64 and risc-v the first argument and the return value are
passed/returned in the same register, A0 on risc-v and R0 on arm64.

In BPF, the first argument to a function is passed in R1 and not in R0.
So when we map these registers to riscv or arm64 calling convention, we
have to map BPF_R1 to A0 on risc-v and to R0 on ARM64. This is to make
argument passing easy. Therefore BPF_R0 is mapped to A5 on risc-v and
ARM64_R7 on arm64.

And when we JIT the 'BPF_JMP | BPF_CALL' we add a mov instruction at the
end to move A0 to A5 on risc-v and R0 to R7 on arm64.

But when inlining the call we can directly put the result in A5 or R7. 

Thanks,
Puranjay

[1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next/+/refs/heads/master/Documentation/bpf/standardization/abi.rst

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 255 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-04 11:19             ` Puranjay Mohan
@ 2024-07-04 16:39               ` Eduard Zingerman
  0 siblings, 0 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-04 16:39 UTC (permalink / raw)
  To: Puranjay Mohan, Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Thu, 2024-07-04 at 11:19 +0000, Puranjay Mohan wrote:

[...]

> And when we JIT the 'BPF_JMP | BPF_CALL' we add a mov instruction at the
> end to move A0 to A5 on risc-v and R0 to R7 on arm64.
> 
> But when inlining the call we can directly put the result in A5 or R7. 

Oh, I see that additional move, thank you for explaining!

Best regards,
Eduard

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-03 11:27         ` Puranjay Mohan
  2024-07-03 23:14           ` Eduard Zingerman
@ 2024-07-04 17:00           ` Eduard Zingerman
  2024-07-04 17:24             ` Puranjay Mohan
  1 sibling, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-04 17:00 UTC (permalink / raw)
  To: Puranjay Mohan, Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, puranjay12

On Wed, 2024-07-03 at 11:27 +0000, Puranjay Mohan wrote:

[...]

> The correct way to do this would be to change call_csr_mask() to have:
> 
> verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)

Hi Puranjay,

I've added bpf_jit_inlines_helper_call() logic in RFC v2 [1].
If you have a riscv setup at hand, would it be possible to ask you to
run test case 'verifier_nocsr/canary_arm64_riscv64' on it?
I verified that it works for arm64 in [2,3] but it would be nice to
have it checked on riscv, which is currently not a part of the CI.

Thanks,
Eduard

[1] https://lore.kernel.org/bpf/20240704102402.1644916-1-eddyz87@gmail.com/
[2] https://github.com/kernel-patches/bpf/actions/runs/9792217835/job/27037905408?pr=7274
[3] https://productionresultssa19.blob.core.windows.net/actions-results/6fb742ca-e78f-420e-9f1c-66e1e23365ef/workflow-job-run-0caa83a1-3221-5ca7-0e7d-7eb42ae68938/logs/job/job-logs.txt?rsct=text%2Fplain&se=2024-07-04T10%3A32%3A00Z&sig=gb0wAZ%2FPMOfZwCCtJyTqAogIyKZ%2F3lCZCfPUpMdN8rE%3D&ske=2024-07-04T18%3A36%3A08Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2024-07-04T06%3A36%3A08Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2023-11-03&sp=r&spr=https&sr=b&st=2024-07-04T10%3A21%3A55Z&sv=2023-11-03

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-04 17:00           ` Eduard Zingerman
@ 2024-07-04 17:24             ` Puranjay Mohan
  2024-07-04 17:39               ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Puranjay Mohan @ 2024-07-04 17:24 UTC (permalink / raw)
  To: Eduard Zingerman, Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

[-- Attachment #1: Type: text/plain, Size: 773 bytes --]

Eduard Zingerman <eddyz87@gmail.com> writes:

> On Wed, 2024-07-03 at 11:27 +0000, Puranjay Mohan wrote:
>
> [...]
>
>> The correct way to do this would be to change call_csr_mask() to have:
>> 
>> verifier_inlines_helper_call(env, insn->imm) || bpf_jit_inlines_helper_call(insn->imm)
>
> Hi Puranjay,
>
> I've added bpf_jit_inlines_helper_call() logic in RFC v2 [1].
> If you have a riscv setup at hand, would it be possible to ask you to
> run test case 'verifier_nocsr/canary_arm64_riscv64' on it?
> I verified that it works for arm64 in [2,3] but it would be nice to
> have it checked on riscv, which is currently not a part of the CI.

Hi Eduard,

I have qemu setup for risc-v. I will test this and let you know the
results.

Thanks,
Puranjay

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 255 bytes --]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
  2024-07-04 17:24             ` Puranjay Mohan
@ 2024-07-04 17:39               ` Eduard Zingerman
  0 siblings, 0 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-04 17:39 UTC (permalink / raw)
  To: Puranjay Mohan, Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Thu, 2024-07-04 at 17:24 +0000, Puranjay Mohan wrote:

[...]

> I have qemu setup for risc-v. I will test this and let you know the
> results.

Great, thank you!

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly
  2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (2 preceding siblings ...)
  2024-06-29  9:47 ` [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
@ 2024-06-29  9:47 ` Eduard Zingerman
  2024-07-02  0:41   ` Andrii Nakryiko
  2024-06-29  9:47 ` [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

uint32_t disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz);

  Disassembles instruction 'insn' to a text buffer 'buf'.
  Removes insn->code hex prefix added by kernel disassemly routine.
  Returns the length of decoded instruction (either 1 or 2).

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/Makefile          |  1 +
 tools/testing/selftests/bpf/disasm_helpers.c  | 50 +++++++++++++
 tools/testing/selftests/bpf/disasm_helpers.h  | 12 ++++
 .../selftests/bpf/prog_tests/ctx_rewrite.c    | 71 +++----------------
 tools/testing/selftests/bpf/testing_helpers.c |  1 +
 5 files changed, 72 insertions(+), 63 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
 create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index e0b3887b3d2d..5eb7b5eb89d2 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -636,6 +636,7 @@ TRUNNER_EXTRA_SOURCES := test_progs.c		\
 			 test_loader.c		\
 			 xsk.c			\
 			 disasm.c		\
+			 disasm_helpers.c	\
 			 json_writer.c 		\
 			 flow_dissector_load.h	\
 			 ip_check_defrag_frags.h
diff --git a/tools/testing/selftests/bpf/disasm_helpers.c b/tools/testing/selftests/bpf/disasm_helpers.c
new file mode 100644
index 000000000000..7c29d294a456
--- /dev/null
+++ b/tools/testing/selftests/bpf/disasm_helpers.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+#include <bpf/bpf.h>
+#include "disasm.h"
+
+struct print_insn_context {
+	char *buf;
+	size_t sz;
+};
+
+static void print_insn_cb(void *private_data, const char *fmt, ...)
+{
+	struct print_insn_context *ctx = private_data;
+	va_list args;
+
+	va_start(args, fmt);
+	vsnprintf(ctx->buf, ctx->sz, fmt, args);
+	va_end(args);
+}
+
+uint32_t disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz)
+{
+	struct print_insn_context ctx = {
+		.buf = buf,
+		.sz = buf_sz,
+	};
+	struct bpf_insn_cbs cbs = {
+		.cb_print	= print_insn_cb,
+		.private_data	= &ctx,
+	};
+	int pfx_end, sfx_start, len;
+	bool double_insn;
+
+	print_bpf_insn(&cbs, insn, true);
+	/* We share code with kernel BPF disassembler, it adds '(FF) ' prefix
+	 * for each instruction (FF stands for instruction `code` byte).
+	 * Remove the prefix inplace, and also simplify call instructions.
+	 * E.g.: "(85) call foo#10" -> "call foo".
+	 */
+	pfx_end = 0;
+	sfx_start = max((int)strlen(buf) - 1, 0);
+	/* For whatever reason %n is not counted in sscanf return value */
+	sscanf(buf, "(%*[^)]) %n", &pfx_end);
+	sscanf(buf, "(%*[^)]) call %*[^#]%n", &sfx_start);
+	len = sfx_start - pfx_end;
+	memmove(buf, buf + pfx_end, len);
+	buf[len] = 0;
+	double_insn = insn->code == (BPF_LD | BPF_IMM | BPF_DW);
+	return double_insn ? 2 : 1;
+}
diff --git a/tools/testing/selftests/bpf/disasm_helpers.h b/tools/testing/selftests/bpf/disasm_helpers.h
new file mode 100644
index 000000000000..db3dfe9f93dd
--- /dev/null
+++ b/tools/testing/selftests/bpf/disasm_helpers.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+
+#ifndef __DISASM_HELPERS_H
+#define __DISASM_HELPERS_H
+
+#include <stdlib.h>
+
+struct bpf_insn;
+
+uint32_t disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz);
+
+#endif /* __DISASM_HELPERS_H */
diff --git a/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c b/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c
index 08b6391f2f56..55e41167f1f3 100644
--- a/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c
+++ b/tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c
@@ -10,7 +10,8 @@
 #include "bpf/btf.h"
 #include "bpf_util.h"
 #include "linux/filter.h"
-#include "disasm.h"
+#include "linux/kernel.h"
+#include "disasm_helpers.h"
 
 #define MAX_PROG_TEXT_SZ (32 * 1024)
 
@@ -628,63 +629,6 @@ static bool match_pattern(struct btf *btf, char *pattern, char *text, char *reg_
 	return false;
 }
 
-static void print_insn(void *private_data, const char *fmt, ...)
-{
-	va_list args;
-
-	va_start(args, fmt);
-	vfprintf((FILE *)private_data, fmt, args);
-	va_end(args);
-}
-
-/* Disassemble instructions to a stream */
-static void print_xlated(FILE *out, struct bpf_insn *insn, __u32 len)
-{
-	const struct bpf_insn_cbs cbs = {
-		.cb_print	= print_insn,
-		.cb_call	= NULL,
-		.cb_imm		= NULL,
-		.private_data	= out,
-	};
-	bool double_insn = false;
-	int i;
-
-	for (i = 0; i < len; i++) {
-		if (double_insn) {
-			double_insn = false;
-			continue;
-		}
-
-		double_insn = insn[i].code == (BPF_LD | BPF_IMM | BPF_DW);
-		print_bpf_insn(&cbs, insn + i, true);
-	}
-}
-
-/* We share code with kernel BPF disassembler, it adds '(FF) ' prefix
- * for each instruction (FF stands for instruction `code` byte).
- * This function removes the prefix inplace for each line in `str`.
- */
-static void remove_insn_prefix(char *str, int size)
-{
-	const int prefix_size = 5;
-
-	int write_pos = 0, read_pos = prefix_size;
-	int len = strlen(str);
-	char c;
-
-	size = min(size, len);
-
-	while (read_pos < size) {
-		c = str[read_pos++];
-		if (c == 0)
-			break;
-		str[write_pos++] = c;
-		if (c == '\n')
-			read_pos += prefix_size;
-	}
-	str[write_pos] = 0;
-}
-
 struct prog_info {
 	char *prog_kind;
 	enum bpf_prog_type prog_type;
@@ -702,8 +646,10 @@ static void match_program(struct btf *btf,
 	struct bpf_insn *buf = NULL;
 	int err = 0, prog_fd = 0;
 	FILE *prog_out = NULL;
+	char insn_buf[64];
 	char *text = NULL;
 	__u32 cnt = 0;
+	int i;
 
 	text = calloc(MAX_PROG_TEXT_SZ, 1);
 	if (!text) {
@@ -739,12 +685,11 @@ static void match_program(struct btf *btf,
 		PRINT_FAIL("Can't open memory stream\n");
 		goto out;
 	}
-	if (skip_first_insn)
-		print_xlated(prog_out, buf + 1, cnt - 1);
-	else
-		print_xlated(prog_out, buf, cnt);
+	for (i = skip_first_insn ? 1 : 0; i < cnt;) {
+		i += disasm_insn(buf + i, insn_buf, sizeof(insn_buf));
+		fprintf(prog_out, "%s\n", insn_buf);
+	}
 	fclose(prog_out);
-	remove_insn_prefix(text, MAX_PROG_TEXT_SZ);
 
 	ASSERT_TRUE(match_pattern(btf, pattern, text, reg_map),
 		    pinfo->prog_kind);
diff --git a/tools/testing/selftests/bpf/testing_helpers.c b/tools/testing/selftests/bpf/testing_helpers.c
index d5379a0e6da8..ac7c66f4fc7b 100644
--- a/tools/testing/selftests/bpf/testing_helpers.c
+++ b/tools/testing/selftests/bpf/testing_helpers.c
@@ -7,6 +7,7 @@
 #include <errno.h>
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
+#include "disasm.h"
 #include "test_progs.h"
 #include "testing_helpers.h"
 #include <linux/membarrier.h>
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly
  2024-06-29  9:47 ` [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
@ 2024-07-02  0:41   ` Andrii Nakryiko
  2024-07-02 20:59     ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02  0:41 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> uint32_t disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz);

or you can return `struct bpf_insn *` which will point to the next
hypothetical instruction?

>
>   Disassembles instruction 'insn' to a text buffer 'buf'.
>   Removes insn->code hex prefix added by kernel disassemly routine.
>   Returns the length of decoded instruction (either 1 or 2).
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  tools/testing/selftests/bpf/Makefile          |  1 +
>  tools/testing/selftests/bpf/disasm_helpers.c  | 50 +++++++++++++
>  tools/testing/selftests/bpf/disasm_helpers.h  | 12 ++++
>  .../selftests/bpf/prog_tests/ctx_rewrite.c    | 71 +++----------------
>  tools/testing/selftests/bpf/testing_helpers.c |  1 +
>  5 files changed, 72 insertions(+), 63 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
>  create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h
>

[...]

> +uint32_t disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz)
> +{
> +       struct print_insn_context ctx = {
> +               .buf = buf,
> +               .sz = buf_sz,
> +       };
> +       struct bpf_insn_cbs cbs = {
> +               .cb_print       = print_insn_cb,
> +               .private_data   = &ctx,
> +       };
> +       int pfx_end, sfx_start, len;
> +       bool double_insn;
> +
> +       print_bpf_insn(&cbs, insn, true);
> +       /* We share code with kernel BPF disassembler, it adds '(FF) ' prefix
> +        * for each instruction (FF stands for instruction `code` byte).
> +        * Remove the prefix inplace, and also simplify call instructions.
> +        * E.g.: "(85) call foo#10" -> "call foo".
> +        */
> +       pfx_end = 0;
> +       sfx_start = max((int)strlen(buf) - 1, 0);
> +       /* For whatever reason %n is not counted in sscanf return value */
> +       sscanf(buf, "(%*[^)]) %n", &pfx_end);

let me simplify this a bit ;)

pfx_end = 5;

not as sophisticated, but equivalent

> +       sscanf(buf, "(%*[^)]) call %*[^#]%n", &sfx_start);

is it documented that sfx_start won't be updated if sscanf() doesn't
successfully match?

if not, maybe let's do something like below

if (strcmp(buf + 5, "call ", 5) == 0 && (tmp = strrchr(buf, '#')))
    sfx_start = tmp - buf;

> +       len = sfx_start - pfx_end;
> +       memmove(buf, buf + pfx_end, len);
> +       buf[len] = 0;
> +       double_insn = insn->code == (BPF_LD | BPF_IMM | BPF_DW);
> +       return double_insn ? 2 : 1;
> +}

[...]


> @@ -739,12 +685,11 @@ static void match_program(struct btf *btf,
>                 PRINT_FAIL("Can't open memory stream\n");
>                 goto out;
>         }
> -       if (skip_first_insn)
> -               print_xlated(prog_out, buf + 1, cnt - 1);
> -       else
> -               print_xlated(prog_out, buf, cnt);
> +       for (i = skip_first_insn ? 1 : 0; i < cnt;) {
> +               i += disasm_insn(buf + i, insn_buf, sizeof(insn_buf));
> +               fprintf(prog_out, "%s\n", insn_buf);
> +       }
>         fclose(prog_out);
> -       remove_insn_prefix(text, MAX_PROG_TEXT_SZ);
>
>         ASSERT_TRUE(match_pattern(btf, pattern, text, reg_map),
>                     pinfo->prog_kind);
> diff --git a/tools/testing/selftests/bpf/testing_helpers.c b/tools/testing/selftests/bpf/testing_helpers.c
> index d5379a0e6da8..ac7c66f4fc7b 100644
> --- a/tools/testing/selftests/bpf/testing_helpers.c
> +++ b/tools/testing/selftests/bpf/testing_helpers.c
> @@ -7,6 +7,7 @@
>  #include <errno.h>
>  #include <bpf/bpf.h>
>  #include <bpf/libbpf.h>
> +#include "disasm.h"
>  #include "test_progs.h"
>  #include "testing_helpers.h"
>  #include <linux/membarrier.h>
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly
  2024-07-02  0:41   ` Andrii Nakryiko
@ 2024-07-02 20:59     ` Eduard Zingerman
  2024-07-02 21:16       ` Andrii Nakryiko
  0 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 20:59 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:
> On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > uint32_t disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz);
> 
> or you can return `struct bpf_insn *` which will point to the next
> hypothetical instruction?

Not sure if it simplifies clients, e.g. from this patch, the following:

+	for (i = skip_first_insn ? 1 : 0; i < cnt;) {
+		i += disasm_insn(buf + i, insn_buf, sizeof(insn_buf));
+		fprintf(prog_out, "%s\n", insn_buf);
+	}

Would become:

+	for (i = buf + skip_first_insn ? 1 : 0; i < buf + cnt;) {
+		i = disasm_insn(buf + i, insn_buf, sizeof(insn_buf));
+		fprintf(prog_out, "%s\n", insn_buf);
+	}

idk, can change if you insist.

[...]

> > +       sscanf(buf, "(%*[^)]) %n", &pfx_end);
> 
> let me simplify this a bit ;)
> 
> pfx_end = 5;
> 
> not as sophisticated, but equivalent

Okay :(

> 
> > +       sscanf(buf, "(%*[^)]) call %*[^#]%n", &sfx_start);
> 
> is it documented that sfx_start won't be updated if sscanf() doesn't
> successfully match?
> 
> if not, maybe let's do something like below
> 
> if (strcmp(buf + 5, "call ", 5) == 0 && (tmp = strrchr(buf, '#')))
>     sfx_start = tmp - buf;

Will change, the doc is obscure.

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly
  2024-07-02 20:59     ` Eduard Zingerman
@ 2024-07-02 21:16       ` Andrii Nakryiko
  2024-07-02 21:23         ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02 21:16 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Tue, Jul 2, 2024 at 1:59 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:
> > On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > uint32_t disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz);
> >
> > or you can return `struct bpf_insn *` which will point to the next
> > hypothetical instruction?
>
> Not sure if it simplifies clients, e.g. from this patch, the following:
>
> +       for (i = skip_first_insn ? 1 : 0; i < cnt;) {
> +               i += disasm_insn(buf + i, insn_buf, sizeof(insn_buf));
> +               fprintf(prog_out, "%s\n", insn_buf);
> +       }
>
> Would become:
>
> +       for (i = buf + skip_first_insn ? 1 : 0; i < buf + cnt;) {
> +               i = disasm_insn(buf + i, insn_buf, sizeof(insn_buf));
> +               fprintf(prog_out, "%s\n", insn_buf);
> +       }
>

struct bpf_insn *insn = skip_first_insn ? buf + 1 : buf, *insn_end = buf + cnt;

while (insn != insn_end) {
    insn = disasm_insn(insn, insn_buf, sizeof(insn_buf));
    fprintf(prog_out, "%s\n", insn_buf);
}

less addition, but it's simple enough in both cases, of course (I just
find 1 or 2 as a result kind of a bad contract, but whatever)


> idk, can change if you insist.
>
> [...]
>
> > > +       sscanf(buf, "(%*[^)]) %n", &pfx_end);
> >
> > let me simplify this a bit ;)
> >
> > pfx_end = 5;
> >
> > not as sophisticated, but equivalent
>
> Okay :(

if 5 makes you sad, do keep sscanf(), of course, no worries :)

>
> >
> > > +       sscanf(buf, "(%*[^)]) call %*[^#]%n", &sfx_start);
> >
> > is it documented that sfx_start won't be updated if sscanf() doesn't
> > successfully match?
> >
> > if not, maybe let's do something like below
> >
> > if (strcmp(buf + 5, "call ", 5) == 0 && (tmp = strrchr(buf, '#')))
> >     sfx_start = tmp - buf;
>
> Will change, the doc is obscure.
>
> [...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly
  2024-07-02 21:16       ` Andrii Nakryiko
@ 2024-07-02 21:23         ` Eduard Zingerman
  0 siblings, 0 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 21:23 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Tue, 2024-07-02 at 14:16 -0700, Andrii Nakryiko wrote:

[...]

> struct bpf_insn *insn = skip_first_insn ? buf + 1 : buf, *insn_end = buf + cnt;
> 
> while (insn != insn_end) {
>     insn = disasm_insn(insn, insn_buf, sizeof(insn_buf));
>     fprintf(prog_out, "%s\n", insn_buf);
> }
> 
> less addition, but it's simple enough in both cases, of course (I just
> find 1 or 2 as a result kind of a bad contract, but whatever)

Will change.

[...]

> > > > +       sscanf(buf, "(%*[^)]) %n", &pfx_end);
> > > 
> > > let me simplify this a bit ;)
> > > 
> > > pfx_end = 5;
> > > 
> > > not as sophisticated, but equivalent
> > 
> > Okay :(
> 
> if 5 makes you sad, do keep sscanf(), of course, no worries :)

The obscure doc makes me even more sad.

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader
  2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (3 preceding siblings ...)
  2024-06-29  9:47 ` [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
@ 2024-06-29  9:47 ` Eduard Zingerman
  2024-07-02  0:41   ` Andrii Nakryiko
  2024-06-29  9:47 ` [RFC bpf-next v1 6/8] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

The call stack for validate_case() function looks as follows:
- test_loader__run_subtests()
  - process_subtest()
    - run_subtest()
      - prepare_case(), which does 'tester->next_match_pos = 0';
      - validate_case(), which increments tester->next_match_pos.

Hence, each subtest is run with next_match_pos freshly set to zero.
Meaning that there is no need to persist this variable in the
struct test_loader, use local variable instead.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/test_loader.c | 17 ++++++++---------
 tools/testing/selftests/bpf/test_progs.h  |  1 -
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index f14e10b0de96..ac9d3e81abdb 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -434,7 +434,6 @@ static void prepare_case(struct test_loader *tester,
 	bpf_program__set_flags(prog, prog_flags | spec->prog_flags);
 
 	tester->log_buf[0] = '\0';
-	tester->next_match_pos = 0;
 }
 
 static void emit_verifier_log(const char *log_buf, bool force)
@@ -450,23 +449,23 @@ static void validate_case(struct test_loader *tester,
 			  struct bpf_program *prog,
 			  int load_err)
 {
-	int i, j, err;
-	char *match;
 	regmatch_t reg_match[1];
+	const char *match;
+	const char *log = tester->log_buf;
+	int i, j, err;
 
 	for (i = 0; i < subspec->expect_msg_cnt; i++) {
 		struct expect_msg *msg = &subspec->expect_msgs[i];
 
 		if (msg->substr) {
-			match = strstr(tester->log_buf + tester->next_match_pos, msg->substr);
+			match = strstr(log, msg->substr);
 			if (match)
-				tester->next_match_pos = match - tester->log_buf + strlen(msg->substr);
+				log += strlen(msg->substr);
 		} else {
-			err = regexec(&msg->regex,
-				      tester->log_buf + tester->next_match_pos, 1, reg_match, 0);
+			err = regexec(&msg->regex, log, 1, reg_match, 0);
 			if (err == 0) {
-				match = tester->log_buf + tester->next_match_pos + reg_match[0].rm_so;
-				tester->next_match_pos += reg_match[0].rm_eo;
+				match = log + reg_match[0].rm_so;
+				log += reg_match[0].rm_eo;
 			} else {
 				match = NULL;
 			}
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index 0ba5a20b19ba..8e997de596db 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -438,7 +438,6 @@ typedef int (*pre_execution_cb)(struct bpf_object *obj);
 struct test_loader {
 	char *log_buf;
 	size_t log_buf_sz;
-	size_t next_match_pos;
 	pre_execution_cb pre_execution_cb;
 
 	struct bpf_object *obj;
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader
  2024-06-29  9:47 ` [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
@ 2024-07-02  0:41   ` Andrii Nakryiko
  2024-07-02 21:05     ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02  0:41 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> The call stack for validate_case() function looks as follows:
> - test_loader__run_subtests()
>   - process_subtest()
>     - run_subtest()
>       - prepare_case(), which does 'tester->next_match_pos = 0';
>       - validate_case(), which increments tester->next_match_pos.
>
> Hence, each subtest is run with next_match_pos freshly set to zero.
> Meaning that there is no need to persist this variable in the
> struct test_loader, use local variable instead.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  tools/testing/selftests/bpf/test_loader.c | 17 ++++++++---------
>  tools/testing/selftests/bpf/test_progs.h  |  1 -
>  2 files changed, 8 insertions(+), 10 deletions(-)
>

Nice cleanup:

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
> index f14e10b0de96..ac9d3e81abdb 100644
> --- a/tools/testing/selftests/bpf/test_loader.c
> +++ b/tools/testing/selftests/bpf/test_loader.c
> @@ -434,7 +434,6 @@ static void prepare_case(struct test_loader *tester,
>         bpf_program__set_flags(prog, prog_flags | spec->prog_flags);
>
>         tester->log_buf[0] = '\0';
> -       tester->next_match_pos = 0;
>  }
>
>  static void emit_verifier_log(const char *log_buf, bool force)
> @@ -450,23 +449,23 @@ static void validate_case(struct test_loader *tester,
>                           struct bpf_program *prog,
>                           int load_err)
>  {
> -       int i, j, err;
> -       char *match;
>         regmatch_t reg_match[1];
> +       const char *match;
> +       const char *log = tester->log_buf;
> +       int i, j, err;
>
>         for (i = 0; i < subspec->expect_msg_cnt; i++) {
>                 struct expect_msg *msg = &subspec->expect_msgs[i];
>
>                 if (msg->substr) {
> -                       match = strstr(tester->log_buf + tester->next_match_pos, msg->substr);
> +                       match = strstr(log, msg->substr);
>                         if (match)
> -                               tester->next_match_pos = match - tester->log_buf + strlen(msg->substr);
> +                               log += strlen(msg->substr);
>                 } else {
> -                       err = regexec(&msg->regex,
> -                                     tester->log_buf + tester->next_match_pos, 1, reg_match, 0);
> +                       err = regexec(&msg->regex, log, 1, reg_match, 0);
>                         if (err == 0) {
> -                               match = tester->log_buf + tester->next_match_pos + reg_match[0].rm_so;
> -                               tester->next_match_pos += reg_match[0].rm_eo;
> +                               match = log + reg_match[0].rm_so;
> +                               log += reg_match[0].rm_eo;

invert and simplify:

log += reg_match[0].rm_eo;
match = log;

?

>                         } else {
>                                 match = NULL;
>                         }

how about we move this to the beginning of iteration (before `if
(msg->substr)`) and so we'll assume the match is NULL on regexec
failing?


> diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
> index 0ba5a20b19ba..8e997de596db 100644
> --- a/tools/testing/selftests/bpf/test_progs.h
> +++ b/tools/testing/selftests/bpf/test_progs.h
> @@ -438,7 +438,6 @@ typedef int (*pre_execution_cb)(struct bpf_object *obj);
>  struct test_loader {
>         char *log_buf;
>         size_t log_buf_sz;
> -       size_t next_match_pos;
>         pre_execution_cb pre_execution_cb;
>
>         struct bpf_object *obj;
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader
  2024-07-02  0:41   ` Andrii Nakryiko
@ 2024-07-02 21:05     ` Eduard Zingerman
  2024-07-02 21:18       ` Andrii Nakryiko
  0 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 21:05 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:

[...]

> >  static void emit_verifier_log(const char *log_buf, bool force)
> > @@ -450,23 +449,23 @@ static void validate_case(struct test_loader *tester,
> >                           struct bpf_program *prog,
> >                           int load_err)
> >  {
> > -       int i, j, err;
> > -       char *match;
> >         regmatch_t reg_match[1];
> > +       const char *match;
> > +       const char *log = tester->log_buf;
> > +       int i, j, err;
> > 
> >         for (i = 0; i < subspec->expect_msg_cnt; i++) {
> >                 struct expect_msg *msg = &subspec->expect_msgs[i];
> > 
> >                 if (msg->substr) {
> > -                       match = strstr(tester->log_buf + tester->next_match_pos, msg->substr);
> > +                       match = strstr(log, msg->substr);
> >                         if (match)
> > -                               tester->next_match_pos = match - tester->log_buf + strlen(msg->substr);
> > +                               log += strlen(msg->substr);
> >                 } else {
> > -                       err = regexec(&msg->regex,
> > -                                     tester->log_buf + tester->next_match_pos, 1, reg_match, 0);
> > +                       err = regexec(&msg->regex, log, 1, reg_match, 0);
> >                         if (err == 0) {
> > -                               match = tester->log_buf + tester->next_match_pos + reg_match[0].rm_so;
> > -                               tester->next_match_pos += reg_match[0].rm_eo;
> > +                               match = log + reg_match[0].rm_so;
> > +                               log += reg_match[0].rm_eo;
> 
> invert and simplify:
> 
> log += reg_match[0].rm_eo;
> match = log;
> 
> ?

The 'match' is at 'log + rm_so' (start offset).
The 'log'   is at 'log + rm_eo' (end offset).

The brilliance of standard library naming.

> 
> >                         } else {
> >                                 match = NULL;
> >                         }
> 
> how about we move this to the beginning of iteration (before `if
> (msg->substr)`) and so we'll assume the match is NULL on regexec
> failing?

Ok, but this would require explicit match re-initialization to NULL at
each iteration.

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader
  2024-07-02 21:05     ` Eduard Zingerman
@ 2024-07-02 21:18       ` Andrii Nakryiko
  0 siblings, 0 replies; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02 21:18 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Tue, Jul 2, 2024 at 2:05 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-07-01 at 17:41 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > >  static void emit_verifier_log(const char *log_buf, bool force)
> > > @@ -450,23 +449,23 @@ static void validate_case(struct test_loader *tester,
> > >                           struct bpf_program *prog,
> > >                           int load_err)
> > >  {
> > > -       int i, j, err;
> > > -       char *match;
> > >         regmatch_t reg_match[1];
> > > +       const char *match;
> > > +       const char *log = tester->log_buf;
> > > +       int i, j, err;
> > >
> > >         for (i = 0; i < subspec->expect_msg_cnt; i++) {
> > >                 struct expect_msg *msg = &subspec->expect_msgs[i];
> > >
> > >                 if (msg->substr) {
> > > -                       match = strstr(tester->log_buf + tester->next_match_pos, msg->substr);
> > > +                       match = strstr(log, msg->substr);
> > >                         if (match)
> > > -                               tester->next_match_pos = match - tester->log_buf + strlen(msg->substr);
> > > +                               log += strlen(msg->substr);
> > >                 } else {
> > > -                       err = regexec(&msg->regex,
> > > -                                     tester->log_buf + tester->next_match_pos, 1, reg_match, 0);
> > > +                       err = regexec(&msg->regex, log, 1, reg_match, 0);
> > >                         if (err == 0) {
> > > -                               match = tester->log_buf + tester->next_match_pos + reg_match[0].rm_so;
> > > -                               tester->next_match_pos += reg_match[0].rm_eo;
> > > +                               match = log + reg_match[0].rm_so;
> > > +                               log += reg_match[0].rm_eo;
> >
> > invert and simplify:
> >
> > log += reg_match[0].rm_eo;
> > match = log;
> >
> > ?
>
> The 'match' is at 'log + rm_so' (start offset).
> The 'log'   is at 'log + rm_eo' (end offset).
>

oh... yeah... never mind... */me retreats*

> The brilliance of standard library naming.
>
> >
> > >                         } else {
> > >                                 match = NULL;
> > >                         }
> >
> > how about we move this to the beginning of iteration (before `if
> > (msg->substr)`) and so we'll assume the match is NULL on regexec
> > failing?
>
> Ok, but this would require explicit match re-initialization to NULL at
> each iteration.

yes, which also makes it clear that we don't carry over match in
between iterations (we can move `const char *match` inside the for
loop to make it even clearer)

>
> [...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [RFC bpf-next v1 6/8] selftests/bpf: extract test_loader->expect_msgs as a data structure
  2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (4 preceding siblings ...)
  2024-06-29  9:47 ` [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
@ 2024-06-29  9:47 ` Eduard Zingerman
  2024-07-02  0:42   ` Andrii Nakryiko
  2024-06-29  9:47 ` [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

Non-functional change: use a separate data structure to represented
expected messages in test_loader.
This would allow to use the same functionality for expected set of
disassembled instructions in the follow-up commit.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/test_loader.c | 81 ++++++++++++-----------
 1 file changed, 41 insertions(+), 40 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index ac9d3e81abdb..d4bb68685ba5 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -55,11 +55,15 @@ struct expect_msg {
 	regex_t regex;
 };
 
+struct msgs {
+	struct expect_msg *patterns;
+	size_t cnt;
+};
+
 struct test_subspec {
 	char *name;
 	bool expect_failure;
-	struct expect_msg *expect_msgs;
-	size_t expect_msg_cnt;
+	struct msgs expect_msgs;
 	int retval;
 	bool execute;
 };
@@ -96,44 +100,45 @@ void test_loader_fini(struct test_loader *tester)
 	free(tester->log_buf);
 }
 
-static void free_test_spec(struct test_spec *spec)
+static void free_msgs(struct msgs *msgs)
 {
 	int i;
 
+	for (i = 0; i < msgs->cnt; i++)
+		if (msgs->patterns[i].regex_str)
+			regfree(&msgs->patterns[i].regex);
+	free(msgs->patterns);
+	msgs->patterns = NULL;
+	msgs->cnt = 0;
+}
+
+static void free_test_spec(struct test_spec *spec)
+{
 	/* Deallocate expect_msgs arrays. */
-	for (i = 0; i < spec->priv.expect_msg_cnt; i++)
-		if (spec->priv.expect_msgs[i].regex_str)
-			regfree(&spec->priv.expect_msgs[i].regex);
-	for (i = 0; i < spec->unpriv.expect_msg_cnt; i++)
-		if (spec->unpriv.expect_msgs[i].regex_str)
-			regfree(&spec->unpriv.expect_msgs[i].regex);
+	free_msgs(&spec->priv.expect_msgs);
+	free_msgs(&spec->unpriv.expect_msgs);
 
 	free(spec->priv.name);
 	free(spec->unpriv.name);
-	free(spec->priv.expect_msgs);
-	free(spec->unpriv.expect_msgs);
-
 	spec->priv.name = NULL;
 	spec->unpriv.name = NULL;
-	spec->priv.expect_msgs = NULL;
-	spec->unpriv.expect_msgs = NULL;
 }
 
-static int push_msg(const char *substr, const char *regex_str, struct test_subspec *subspec)
+static int push_msg(const char *substr, const char *regex_str, struct msgs *msgs)
 {
 	void *tmp;
 	int regcomp_res;
 	char error_msg[100];
 	struct expect_msg *msg;
 
-	tmp = realloc(subspec->expect_msgs,
-		      (1 + subspec->expect_msg_cnt) * sizeof(struct expect_msg));
+	tmp = realloc(msgs->patterns,
+		      (1 + msgs->cnt) * sizeof(struct expect_msg));
 	if (!tmp) {
 		ASSERT_FAIL("failed to realloc memory for messages\n");
 		return -ENOMEM;
 	}
-	subspec->expect_msgs = tmp;
-	msg = &subspec->expect_msgs[subspec->expect_msg_cnt];
+	msgs->patterns = tmp;
+	msg = &msgs->patterns[msgs->cnt];
 
 	if (substr) {
 		msg->substr = substr;
@@ -150,7 +155,7 @@ static int push_msg(const char *substr, const char *regex_str, struct test_subsp
 		}
 	}
 
-	subspec->expect_msg_cnt += 1;
+	msgs->cnt += 1;
 	return 0;
 }
 
@@ -272,25 +277,25 @@ static int parse_test_spec(struct test_loader *tester,
 			spec->mode_mask |= UNPRIV;
 		} else if (str_has_pfx(s, TEST_TAG_EXPECT_MSG_PFX)) {
 			msg = s + sizeof(TEST_TAG_EXPECT_MSG_PFX) - 1;
-			err = push_msg(msg, NULL, &spec->priv);
+			err = push_msg(msg, NULL, &spec->priv.expect_msgs);
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= PRIV;
 		} else if (str_has_pfx(s, TEST_TAG_EXPECT_MSG_PFX_UNPRIV)) {
 			msg = s + sizeof(TEST_TAG_EXPECT_MSG_PFX_UNPRIV) - 1;
-			err = push_msg(msg, NULL, &spec->unpriv);
+			err = push_msg(msg, NULL, &spec->unpriv.expect_msgs);
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= UNPRIV;
 		} else if (str_has_pfx(s, TEST_TAG_EXPECT_REGEX_PFX)) {
 			msg = s + sizeof(TEST_TAG_EXPECT_REGEX_PFX) - 1;
-			err = push_msg(NULL, msg, &spec->priv);
+			err = push_msg(NULL, msg, &spec->priv.expect_msgs);
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= PRIV;
 		} else if (str_has_pfx(s, TEST_TAG_EXPECT_REGEX_PFX_UNPRIV)) {
 			msg = s + sizeof(TEST_TAG_EXPECT_REGEX_PFX_UNPRIV) - 1;
-			err = push_msg(NULL, msg, &spec->unpriv);
+			err = push_msg(NULL, msg, &spec->unpriv.expect_msgs);
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= UNPRIV;
@@ -387,11 +392,12 @@ static int parse_test_spec(struct test_loader *tester,
 			spec->unpriv.execute = spec->priv.execute;
 		}
 
-		if (!spec->unpriv.expect_msgs) {
-			for (i = 0; i < spec->priv.expect_msg_cnt; i++) {
-				struct expect_msg *msg = &spec->priv.expect_msgs[i];
+		if (spec->unpriv.expect_msgs.cnt == 0) {
+			for (i = 0; i < spec->priv.expect_msgs.cnt; i++) {
+				struct expect_msg *msg = &spec->priv.expect_msgs.patterns[i];
 
-				err = push_msg(msg->substr, msg->regex_str, &spec->unpriv);
+				err = push_msg(msg->substr, msg->regex_str,
+					       &spec->unpriv.expect_msgs);
 				if (err)
 					goto cleanup;
 			}
@@ -443,19 +449,15 @@ static void emit_verifier_log(const char *log_buf, bool force)
 	fprintf(stdout, "VERIFIER LOG:\n=============\n%s=============\n", log_buf);
 }
 
-static void validate_case(struct test_loader *tester,
-			  struct test_subspec *subspec,
-			  struct bpf_object *obj,
-			  struct bpf_program *prog,
-			  int load_err)
+static void validate_msgs(char *log_buf, struct msgs *msgs)
 {
 	regmatch_t reg_match[1];
 	const char *match;
-	const char *log = tester->log_buf;
+	const char *log = log_buf;
 	int i, j, err;
 
-	for (i = 0; i < subspec->expect_msg_cnt; i++) {
-		struct expect_msg *msg = &subspec->expect_msgs[i];
+	for (i = 0; i < msgs->cnt; i++) {
+		struct expect_msg *msg = &msgs->patterns[i];
 
 		if (msg->substr) {
 			match = strstr(log, msg->substr);
@@ -473,9 +475,9 @@ static void validate_case(struct test_loader *tester,
 
 		if (!ASSERT_OK_PTR(match, "expect_msg")) {
 			if (env.verbosity == VERBOSE_NONE)
-				emit_verifier_log(tester->log_buf, true /*force*/);
+				emit_verifier_log(log_buf, true /*force*/);
 			for (j = 0; j <= i; j++) {
-				msg = &subspec->expect_msgs[j];
+				msg = &msgs->patterns[j];
 				fprintf(stderr, "%s %s: '%s'\n",
 					j < i ? "MATCHED " : "EXPECTED",
 					msg->substr ? "SUBSTR" : " REGEX",
@@ -694,9 +696,8 @@ void run_subtest(struct test_loader *tester,
 			goto tobj_cleanup;
 		}
 	}
-
 	emit_verifier_log(tester->log_buf, false /*force*/);
-	validate_case(tester, subspec, tobj, tprog, err);
+	validate_msgs(tester->log_buf, &subspec->expect_msgs);
 
 	if (should_do_test_run(spec, subspec)) {
 		/* For some reason test_verifier executes programs
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 6/8] selftests/bpf: extract test_loader->expect_msgs as a data structure
  2024-06-29  9:47 ` [RFC bpf-next v1 6/8] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
@ 2024-07-02  0:42   ` Andrii Nakryiko
  2024-07-02 21:06     ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02  0:42 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Non-functional change: use a separate data structure to represented
> expected messages in test_loader.
> This would allow to use the same functionality for expected set of
> disassembled instructions in the follow-up commit.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  tools/testing/selftests/bpf/test_loader.c | 81 ++++++++++++-----------
>  1 file changed, 41 insertions(+), 40 deletions(-)
>

Just being a PITA below :)

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
> index ac9d3e81abdb..d4bb68685ba5 100644
> --- a/tools/testing/selftests/bpf/test_loader.c
> +++ b/tools/testing/selftests/bpf/test_loader.c
> @@ -55,11 +55,15 @@ struct expect_msg {
>         regex_t regex;
>  };
>
> +struct msgs {

but then "expected_msgs"? It's not messages it's definitions of
expected message specifier (which is a substring or regex), seems
useful to preserve distinction/specificity?

> +       struct expect_msg *patterns;
> +       size_t cnt;
> +};
> +
>  struct test_subspec {
>         char *name;
>         bool expect_failure;
> -       struct expect_msg *expect_msgs;
> -       size_t expect_msg_cnt;
> +       struct msgs expect_msgs;
>         int retval;
>         bool execute;
>  };

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 6/8] selftests/bpf: extract test_loader->expect_msgs as a data structure
  2024-07-02  0:42   ` Andrii Nakryiko
@ 2024-07-02 21:06     ` Eduard Zingerman
  0 siblings, 0 replies; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 21:06 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Mon, 2024-07-01 at 17:42 -0700, Andrii Nakryiko wrote:

[...]

> > +struct msgs {
> 
> but then "expected_msgs"? It's not messages it's definitions of
> expected message specifier (which is a substring or regex), seems
> useful to preserve distinction/specificity?

Will change.

> 
> > +       struct expect_msg *patterns;
> > +       size_t cnt;
> > +};

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests
  2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (5 preceding siblings ...)
  2024-06-29  9:47 ` [RFC bpf-next v1 6/8] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
@ 2024-06-29  9:47 ` Eduard Zingerman
  2024-07-02  0:42   ` Andrii Nakryiko
  2024-06-29  9:47 ` [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
  2024-07-02  0:41 ` [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Andrii Nakryiko
  8 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

Add a macro __xlated("...") for use with test_loader tests.

When such annotations are present for the test case:
- bpf_prog_get_info_by_fd() is used to get BPF program after all
  rewrites are applied by verifier.
- the program is diassembled and patterns specified in __xlated are
  searched for in the disassembly text.

__xlated matching follows the same mechanics as __msg:
each subsequent pattern is matched from the point where
previous pattern ended.

This allows to write tests like below, where the goal is to verify the
behavior of one of the of the transformations applied by verifier:

    SEC("raw_tp")
    __xlated("1: w0 = ")
    __xlated("2: r0 = &(void __percpu *)(r0)")
    __xlated("3: r0 = *(u32 *)(r0 +0)")
    __xlated("4: exit")
    __success __naked void simple(void)
    {
            asm volatile (
            "call %[bpf_get_smp_processor_id];"
            "exit;"
            :
            : __imm(bpf_get_smp_processor_id)
            : __clobber_all);
    }

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/progs/bpf_misc.h |  6 ++
 tools/testing/selftests/bpf/test_loader.c    | 80 +++++++++++++++++++-
 2 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index 81097a3f15eb..fac131a23578 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -26,6 +26,9 @@
  *
  * __regex           Same as __msg, but using a regular expression.
  * __regex_unpriv    Same as __msg_unpriv but using a regular expression.
+ * __xlated          Expect a line in a disassembly log after verifier applies rewrites.
+ *                   Multiple __xlated attributes could be specified.
+ * __xlated_unpriv   Same as __xlated but for unprivileged mode.
  *
  * __success         Expect program load success in privileged mode.
  * __success_unpriv  Expect program load success in unprivileged mode.
@@ -63,11 +66,14 @@
  */
 #define __msg(msg)		__attribute__((btf_decl_tag("comment:test_expect_msg=" msg)))
 #define __regex(regex)		__attribute__((btf_decl_tag("comment:test_expect_regex=" regex)))
+#define __xlated(msg)		__attribute__((btf_decl_tag("comment:test_expect_xlated=" msg)))
 #define __failure		__attribute__((btf_decl_tag("comment:test_expect_failure")))
 #define __success		__attribute__((btf_decl_tag("comment:test_expect_success")))
 #define __description(desc)	__attribute__((btf_decl_tag("comment:test_description=" desc)))
 #define __msg_unpriv(msg)	__attribute__((btf_decl_tag("comment:test_expect_msg_unpriv=" msg)))
 #define __regex_unpriv(regex)	__attribute__((btf_decl_tag("comment:test_expect_regex_unpriv=" regex)))
+#define __xlated_unpriv(msg)	\
+	__attribute__((btf_decl_tag("comment:test_expect_xlated_unpriv=" msg)))
 #define __failure_unpriv	__attribute__((btf_decl_tag("comment:test_expect_failure_unpriv")))
 #define __success_unpriv	__attribute__((btf_decl_tag("comment:test_expect_success_unpriv")))
 #define __log_level(lvl)	__attribute__((btf_decl_tag("comment:test_log_level="#lvl)))
diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c
index d4bb68685ba5..8e5f051801db 100644
--- a/tools/testing/selftests/bpf/test_loader.c
+++ b/tools/testing/selftests/bpf/test_loader.c
@@ -7,6 +7,7 @@
 #include <bpf/btf.h>
 
 #include "autoconf_helper.h"
+#include "disasm_helpers.h"
 #include "unpriv_helpers.h"
 #include "cap_helpers.h"
 
@@ -19,10 +20,12 @@
 #define TEST_TAG_EXPECT_SUCCESS "comment:test_expect_success"
 #define TEST_TAG_EXPECT_MSG_PFX "comment:test_expect_msg="
 #define TEST_TAG_EXPECT_REGEX_PFX "comment:test_expect_regex="
+#define TEST_TAG_EXPECT_XLATED_PFX "comment:test_expect_xlated="
 #define TEST_TAG_EXPECT_FAILURE_UNPRIV "comment:test_expect_failure_unpriv"
 #define TEST_TAG_EXPECT_SUCCESS_UNPRIV "comment:test_expect_success_unpriv"
 #define TEST_TAG_EXPECT_MSG_PFX_UNPRIV "comment:test_expect_msg_unpriv="
 #define TEST_TAG_EXPECT_REGEX_PFX_UNPRIV "comment:test_expect_regex_unpriv="
+#define TEST_TAG_EXPECT_XLATED_PFX_UNPRIV "comment:test_expect_xlated_unpriv="
 #define TEST_TAG_LOG_LEVEL_PFX "comment:test_log_level="
 #define TEST_TAG_PROG_FLAGS_PFX "comment:test_prog_flags="
 #define TEST_TAG_DESCRIPTION_PFX "comment:test_description="
@@ -64,6 +67,7 @@ struct test_subspec {
 	char *name;
 	bool expect_failure;
 	struct msgs expect_msgs;
+	struct msgs expect_xlated;
 	int retval;
 	bool execute;
 };
@@ -117,6 +121,8 @@ static void free_test_spec(struct test_spec *spec)
 	/* Deallocate expect_msgs arrays. */
 	free_msgs(&spec->priv.expect_msgs);
 	free_msgs(&spec->unpriv.expect_msgs);
+	free_msgs(&spec->priv.expect_xlated);
+	free_msgs(&spec->unpriv.expect_xlated);
 
 	free(spec->priv.name);
 	free(spec->unpriv.name);
@@ -299,6 +305,18 @@ static int parse_test_spec(struct test_loader *tester,
 			if (err)
 				goto cleanup;
 			spec->mode_mask |= UNPRIV;
+		} else if (str_has_pfx(s, TEST_TAG_EXPECT_XLATED_PFX)) {
+			msg = s + sizeof(TEST_TAG_EXPECT_XLATED_PFX) - 1;
+			err = push_msg(msg, NULL, &spec->priv.expect_xlated);
+			if (err)
+				goto cleanup;
+			spec->mode_mask |= PRIV;
+		} else if (str_has_pfx(s, TEST_TAG_EXPECT_XLATED_PFX_UNPRIV)) {
+			msg = s + sizeof(TEST_TAG_EXPECT_XLATED_PFX_UNPRIV) - 1;
+			err = push_msg(msg, NULL, &spec->unpriv.expect_xlated);
+			if (err)
+				goto cleanup;
+			spec->mode_mask |= UNPRIV;
 		} else if (str_has_pfx(s, TEST_TAG_RETVAL_PFX)) {
 			val = s + sizeof(TEST_TAG_RETVAL_PFX) - 1;
 			err = parse_retval(val, &spec->priv.retval, "__retval");
@@ -402,6 +420,16 @@ static int parse_test_spec(struct test_loader *tester,
 					goto cleanup;
 			}
 		}
+		if (spec->unpriv.expect_xlated.cnt == 0) {
+			for (i = 0; i < spec->priv.expect_xlated.cnt; i++) {
+				struct expect_msg *msg = &spec->priv.expect_xlated.patterns[i];
+
+				err = push_msg(msg->substr, msg->regex_str,
+					       &spec->unpriv.expect_xlated);
+				if (err)
+					goto cleanup;
+			}
+		}
 	}
 
 	spec->valid = true;
@@ -449,7 +477,15 @@ static void emit_verifier_log(const char *log_buf, bool force)
 	fprintf(stdout, "VERIFIER LOG:\n=============\n%s=============\n", log_buf);
 }
 
-static void validate_msgs(char *log_buf, struct msgs *msgs)
+static void emit_xlated(const char *xlated, bool force)
+{
+	if (!force && env.verbosity == VERBOSE_NONE)
+		return;
+	fprintf(stdout, "XLATED:\n=============\n%s=============\n", xlated);
+}
+
+static void validate_msgs(char *log_buf, struct msgs *msgs,
+			  void (*emit_fn)(const char *buf, bool force))
 {
 	regmatch_t reg_match[1];
 	const char *match;
@@ -475,7 +511,7 @@ static void validate_msgs(char *log_buf, struct msgs *msgs)
 
 		if (!ASSERT_OK_PTR(match, "expect_msg")) {
 			if (env.verbosity == VERBOSE_NONE)
-				emit_verifier_log(log_buf, true /*force*/);
+				emit_fn(log_buf, true /*force*/);
 			for (j = 0; j <= i; j++) {
 				msg = &msgs->patterns[j];
 				fprintf(stderr, "%s %s: '%s'\n",
@@ -612,6 +648,35 @@ static bool should_do_test_run(struct test_spec *spec, struct test_subspec *subs
 	return true;
 }
 
+/* Get a disassembly of BPF program after verifier applies all rewrites */
+static int get_xlated_program_text(int prog_fd, char *text, size_t text_sz)
+{
+	__u32 insns_cnt = 0, i, insn_sz;
+	struct bpf_insn *insns = NULL;
+	char buf[64];
+	FILE *out = NULL;
+	int err;
+
+	err = get_xlated_program(prog_fd, &insns, &insns_cnt);
+	if (!ASSERT_OK(err, "get_xlated_program"))
+		goto out;
+	out = fmemopen(text, text_sz, "w");
+	if (!ASSERT_OK_PTR(out, "open_memstream"))
+		goto out;
+	for (i = 0; i < insns_cnt;) {
+		insn_sz = disasm_insn(insns + i, buf, sizeof(buf));
+		fprintf(out, "%d: %s\n", i, buf);
+		i += insn_sz;
+	}
+	fflush(out);
+
+out:
+	free(insns);
+	if (out)
+		fclose(out);
+	return err;
+}
+
 /* this function is forced noinline and has short generic name to look better
  * in test_progs output (in case of a failure)
  */
@@ -697,7 +762,16 @@ void run_subtest(struct test_loader *tester,
 		}
 	}
 	emit_verifier_log(tester->log_buf, false /*force*/);
-	validate_msgs(tester->log_buf, &subspec->expect_msgs);
+	validate_msgs(tester->log_buf, &subspec->expect_msgs, emit_verifier_log);
+
+	if (subspec->expect_xlated.cnt) {
+		err = get_xlated_program_text(bpf_program__fd(tprog),
+					      tester->log_buf, tester->log_buf_sz);
+		if (err)
+			goto tobj_cleanup;
+		emit_xlated(tester->log_buf, false /*force*/);
+		validate_msgs(tester->log_buf, &subspec->expect_xlated, emit_xlated);
+	}
 
 	if (should_do_test_run(spec, subspec)) {
 		/* For some reason test_verifier executes programs
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests
  2024-06-29  9:47 ` [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
@ 2024-07-02  0:42   ` Andrii Nakryiko
  2024-07-02 21:07     ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02  0:42 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Add a macro __xlated("...") for use with test_loader tests.
>
> When such annotations are present for the test case:
> - bpf_prog_get_info_by_fd() is used to get BPF program after all
>   rewrites are applied by verifier.
> - the program is diassembled and patterns specified in __xlated are

typo: google says there is a typo in disassembled

>   searched for in the disassembly text.
>
> __xlated matching follows the same mechanics as __msg:
> each subsequent pattern is matched from the point where
> previous pattern ended.
>
> This allows to write tests like below, where the goal is to verify the
> behavior of one of the of the transformations applied by verifier:
>
>     SEC("raw_tp")
>     __xlated("1: w0 = ")
>     __xlated("2: r0 = &(void __percpu *)(r0)")
>     __xlated("3: r0 = *(u32 *)(r0 +0)")
>     __xlated("4: exit")
>     __success __naked void simple(void)
>     {
>             asm volatile (
>             "call %[bpf_get_smp_processor_id];"
>             "exit;"
>             :
>             : __imm(bpf_get_smp_processor_id)
>             : __clobber_all);
>     }
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  tools/testing/selftests/bpf/progs/bpf_misc.h |  6 ++
>  tools/testing/selftests/bpf/test_loader.c    | 80 +++++++++++++++++++-
>  2 files changed, 83 insertions(+), 3 deletions(-)
>

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
> index 81097a3f15eb..fac131a23578 100644
> --- a/tools/testing/selftests/bpf/progs/bpf_misc.h
> +++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
> @@ -26,6 +26,9 @@
>   *
>   * __regex           Same as __msg, but using a regular expression.
>   * __regex_unpriv    Same as __msg_unpriv but using a regular expression.
> + * __xlated          Expect a line in a disassembly log after verifier applies rewrites.
> + *                   Multiple __xlated attributes could be specified.
> + * __xlated_unpriv   Same as __xlated but for unprivileged mode.
>   *
>   * __success         Expect program load success in privileged mode.
>   * __success_unpriv  Expect program load success in unprivileged mode.
> @@ -63,11 +66,14 @@
>   */
>  #define __msg(msg)             __attribute__((btf_decl_tag("comment:test_expect_msg=" msg)))
>  #define __regex(regex)         __attribute__((btf_decl_tag("comment:test_expect_regex=" regex)))
> +#define __xlated(msg)          __attribute__((btf_decl_tag("comment:test_expect_xlated=" msg)))
>  #define __failure              __attribute__((btf_decl_tag("comment:test_expect_failure")))
>  #define __success              __attribute__((btf_decl_tag("comment:test_expect_success")))
>  #define __description(desc)    __attribute__((btf_decl_tag("comment:test_description=" desc)))
>  #define __msg_unpriv(msg)      __attribute__((btf_decl_tag("comment:test_expect_msg_unpriv=" msg)))
>  #define __regex_unpriv(regex)  __attribute__((btf_decl_tag("comment:test_expect_regex_unpriv=" regex)))
> +#define __xlated_unpriv(msg)   \
> +       __attribute__((btf_decl_tag("comment:test_expect_xlated_unpriv=" msg)))

nit: keep on a single line? you are ruining the beauty :)

>  #define __failure_unpriv       __attribute__((btf_decl_tag("comment:test_expect_failure_unpriv")))
>  #define __success_unpriv       __attribute__((btf_decl_tag("comment:test_expect_success_unpriv")))
>  #define __log_level(lvl)       __attribute__((btf_decl_tag("comment:test_log_level="#lvl)))

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests
  2024-07-02  0:42   ` Andrii Nakryiko
@ 2024-07-02 21:07     ` Eduard Zingerman
  2024-07-02 21:19       ` Andrii Nakryiko
  0 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 21:07 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Mon, 2024-07-01 at 17:42 -0700, Andrii Nakryiko wrote:

[...]

> > +#define __xlated_unpriv(msg)   \
> > +       __attribute__((btf_decl_tag("comment:test_expect_xlated_unpriv=" msg)))
> 
> nit: keep on a single line? you are ruining the beauty :)

checkpatch.pl won't be happy but makes sense.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests
  2024-07-02 21:07     ` Eduard Zingerman
@ 2024-07-02 21:19       ` Andrii Nakryiko
  0 siblings, 0 replies; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02 21:19 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Tue, Jul 2, 2024 at 2:07 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-07-01 at 17:42 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > > +#define __xlated_unpriv(msg)   \
> > > +       __attribute__((btf_decl_tag("comment:test_expect_xlated_unpriv=" msg)))
> >
> > nit: keep on a single line? you are ruining the beauty :)
>
> checkpatch.pl won't be happy but makes sense.

checkpath.pl will cope just fine, let's keep the beauty of regularity intact

^ permalink raw reply	[flat|nested] 47+ messages in thread

* [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal
  2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (6 preceding siblings ...)
  2024-06-29  9:47 ` [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
@ 2024-06-29  9:47 ` Eduard Zingerman
  2024-07-02  0:42   ` Andrii Nakryiko
  2024-07-02  0:41 ` [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Andrii Nakryiko
  8 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-06-29  9:47 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi, Eduard Zingerman

Tests for no_caller_saved_registers processing logic
(see verifier.c:match_and_mark_nocsr_pattern()):
- a canary positive test case;
- various tests with broken patterns;
- tests with read/write fixed/varying stack access that violate nocsr
  stack access contract;
- tests with multiple subprograms.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/prog_tests/verifier.c       |   7 +
 .../selftests/bpf/progs/verifier_nocsr.c      | 437 ++++++++++++++++++
 2 files changed, 444 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c

diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index 6816ff064516..8e056c36c549 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -53,6 +53,7 @@
 #include "verifier_movsx.skel.h"
 #include "verifier_netfilter_ctx.skel.h"
 #include "verifier_netfilter_retcode.skel.h"
+#include "verifier_nocsr.skel.h"
 #include "verifier_precision.skel.h"
 #include "verifier_prevent_map_lookup.skel.h"
 #include "verifier_raw_stack.skel.h"
@@ -171,6 +172,12 @@ void test_verifier_meta_access(void)          { RUN(verifier_meta_access); }
 void test_verifier_movsx(void)                 { RUN(verifier_movsx); }
 void test_verifier_netfilter_ctx(void)        { RUN(verifier_netfilter_ctx); }
 void test_verifier_netfilter_retcode(void)    { RUN(verifier_netfilter_retcode); }
+void test_verifier_nocsr(void)
+{
+#if defined(__x86_64__)
+	RUN(verifier_nocsr);
+#endif /* __x86_64__ */
+}
 void test_verifier_precision(void)            { RUN(verifier_precision); }
 void test_verifier_prevent_map_lookup(void)   { RUN(verifier_prevent_map_lookup); }
 void test_verifier_raw_stack(void)            { RUN(verifier_raw_stack); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_nocsr.c b/tools/testing/selftests/bpf/progs/verifier_nocsr.c
new file mode 100644
index 000000000000..5ddc2c97ada6
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_nocsr.c
@@ -0,0 +1,437 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+#define __xlated_bpf_get_smp_processor_id		\
+	__xlated(": w0 = ")				\
+	__xlated(": r0 = &(void __percpu *)(r0)")	\
+	__xlated(": r0 = *(u32 *)(r0 +0)")
+
+SEC("raw_tp")
+__xlated("4: r5 = 5")
+__xlated_bpf_get_smp_processor_id
+__xlated("8: exit")
+__success
+__naked void simple(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"r2 = 2;"
+	"r3 = 3;"
+	"r4 = 4;"
+	"r5 = 5;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"*(u64 *)(r10 - 24) = r2;"
+	"*(u64 *)(r10 - 32) = r3;"
+	"*(u64 *)(r10 - 40) = r4;"
+	"*(u64 *)(r10 - 48) = r5;"
+	"call %[bpf_get_smp_processor_id];"
+	"r5 = *(u64 *)(r10 - 48);"
+	"r4 = *(u64 *)(r10 - 40);"
+	"r3 = *(u64 *)(r10 - 32);"
+	"r2 = *(u64 *)(r10 - 24);"
+	"r1 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("1: *(u64 *)(r10 -16) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("5: r2 = *(u64 *)(r10 -16)")
+__success
+__naked void wrong_reg_in_pattern1(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("1: *(u64 *)(r10 -16) = r6")
+__xlated_bpf_get_smp_processor_id
+__xlated("5: r6 = *(u64 *)(r10 -16)")
+__success
+__naked void wrong_reg_in_pattern2(void)
+{
+	asm volatile (
+	"r6 = 1;"
+	"*(u64 *)(r10 - 16) = r6;"
+	"call %[bpf_get_smp_processor_id];"
+	"r6 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("1: *(u64 *)(r10 -16) = r0")
+__xlated_bpf_get_smp_processor_id
+__xlated("5: r0 = *(u64 *)(r10 -16)")
+__success
+__naked void wrong_reg_in_pattern3(void)
+{
+	asm volatile (
+	"r0 = 1;"
+	"*(u64 *)(r10 - 16) = r0;"
+	"call %[bpf_get_smp_processor_id];"
+	"r0 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("2: *(u64 *)(r2 -16) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("6: r1 = *(u64 *)(r10 -16)")
+__success
+__naked void wrong_base_in_pattern(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"r2 = r10;"
+	"*(u64 *)(r2 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("1: *(u64 *)(r10 -16) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("5: r2 = 1")
+__success
+__naked void wrong_insn_in_pattern(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = 1;"
+	"r1 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("2: *(u64 *)(r10 -16) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("6: r1 = *(u64 *)(r10 -8)")
+__success
+__naked void wrong_off_in_pattern(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("1: *(u32 *)(r10 -16) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("5: r1 = *(u32 *)(r10 -16)")
+__success
+__naked void wrong_size_in_pattern(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u32 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u32 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("2: *(u32 *)(r10 -8) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("6: r1 = *(u32 *)(r10 -8)")
+__success
+__naked void partial_pattern(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"r2 = 2;"
+	"*(u32 *)(r10 - 8) = r1;"
+	"*(u64 *)(r10 - 16) = r2;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = *(u64 *)(r10 - 16);"
+	"r1 = *(u32 *)(r10 - 8);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("0: r1 = 1")
+__xlated("1: r2 = 2")
+/* not patched, spills for -8, -16 not removed */
+__xlated("2: *(u64 *)(r10 -8) = r1")
+__xlated("3: *(u64 *)(r10 -16) = r2")
+__xlated_bpf_get_smp_processor_id
+__xlated("7: r2 = *(u64 *)(r10 -16)")
+__xlated("8: r1 = *(u64 *)(r10 -8)")
+/* patched, spills for -16, -24 removed */
+__xlated_bpf_get_smp_processor_id
+__xlated("12: exit")
+__success
+__naked void min_stack_offset(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"r2 = 2;"
+	/* this call won't be patched */
+	"*(u64 *)(r10 - 8) = r1;"
+	"*(u64 *)(r10 - 16) = r2;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = *(u64 *)(r10 - 16);"
+	"r1 = *(u64 *)(r10 - 8);"
+	/* this call would be patched */
+	"*(u64 *)(r10 - 16) = r1;"
+	"*(u64 *)(r10 - 24) = r2;"
+	"call %[bpf_get_smp_processor_id];"
+	"r2 = *(u64 *)(r10 - 24);"
+	"r1 = *(u64 *)(r10 - 16);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("1: *(u64 *)(r10 -8) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("5: r1 = *(u64 *)(r10 -8)")
+__success
+__naked void bad_fixed_read(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"r1 = r10;"
+	"r1 += -8;"
+	"r1 = *(u64 *)(r1 - 0);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("1: *(u64 *)(r10 -8) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("5: r1 = *(u64 *)(r10 -8)")
+__success
+__naked void bad_fixed_write(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"r1 = r10;"
+	"r1 += -8;"
+	"*(u64 *)(r1 - 0) = r1;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("6: *(u64 *)(r10 -16) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("10: r1 = *(u64 *)(r10 -16)")
+__success
+__naked void bad_varying_read(void)
+{
+	asm volatile (
+	"r6 = *(u64 *)(r1 + 0);" /* random scalar value */
+	"r6 &= 0x7;"		 /* r6 range [0..7] */
+	"r6 += 0x2;"		 /* r6 range [2..9] */
+	"r7 = 0;"
+	"r7 -= r6;"		 /* r7 range [-9..-2] */
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 16);"
+	"r1 = r10;"
+	"r1 += r7;"
+	"r1 = *(u8 *)(r1 - 0);" /* touches slot [-16..-9] where spills are stored */
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("6: *(u64 *)(r10 -16) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("10: r1 = *(u64 *)(r10 -16)")
+__success
+__naked void bad_varying_write(void)
+{
+	asm volatile (
+	"r6 = *(u64 *)(r1 + 0);" /* random scalar value */
+	"r6 &= 0x7;"		 /* r6 range [0..7] */
+	"r6 += 0x2;"		 /* r6 range [2..9] */
+	"r7 = 0;"
+	"r7 -= r6;"		 /* r7 range [-9..-2] */
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 16);"
+	"r1 = r10;"
+	"r1 += r7;"
+	"*(u8 *)(r1 - 0) = r7;" /* touches slot [-16..-9] where spills are stored */
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+__xlated("1: *(u64 *)(r10 -8) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("5: r1 = *(u64 *)(r10 -8)")
+__success
+__naked void bad_write_in_subprog(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"r1 = r10;"
+	"r1 += -8;"
+	"call bad_write_in_subprog_aux;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+__used
+__naked static void bad_write_in_subprog_aux(void)
+{
+	asm volatile (
+	"r0 = 1;"
+	"*(u64 *)(r1 - 0) = r0;"	/* invalidates nocsr contract for caller: */
+	"exit;"				/* caller stack at -8 used outside of the pattern */
+	::: __clobber_all);
+}
+
+SEC("raw_tp")
+/* main, not patched */
+__xlated("1: *(u64 *)(r10 -8) = r1")
+__xlated_bpf_get_smp_processor_id
+__xlated("5: r1 = *(u64 *)(r10 -8)")
+__xlated("9: call pc+1")
+__xlated("10: exit")
+/* subprogram, patched */
+__xlated("11: r1 = 1")
+__xlated_bpf_get_smp_processor_id
+__xlated("15: exit")
+__success
+__naked void invalidate_one_subprog(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"r1 = r10;"
+	"r1 += -8;"
+	"r1 = *(u64 *)(r1 - 0);"
+	"call invalidate_one_subprog_aux;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+__used
+__naked static void invalidate_one_subprog_aux(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 8);"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+SEC("raw_tp")
+/* main */
+__xlated("0: r1 = 1")
+__xlated_bpf_get_smp_processor_id
+__xlated("4: call pc+1")
+__xlated("5: exit")
+/* subprogram */
+__xlated("6: r1 = 1")
+__xlated_bpf_get_smp_processor_id
+__xlated("10: *(u64 *)(r10 -16) = r1")
+__xlated("11: exit")
+__success
+__naked void subprogs_use_independent_offsets(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 16) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 16);"
+	"call subprogs_use_independent_offsets_aux;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+__used
+__naked static void subprogs_use_independent_offsets_aux(void)
+{
+	asm volatile (
+	"r1 = 1;"
+	"*(u64 *)(r10 - 24) = r1;"
+	"call %[bpf_get_smp_processor_id];"
+	"r1 = *(u64 *)(r10 - 24);"
+	"*(u64 *)(r10 - 16) = r1;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id)
+	: __clobber_all);
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal
  2024-06-29  9:47 ` [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
@ 2024-07-02  0:42   ` Andrii Nakryiko
  2024-07-02 21:12     ` Eduard Zingerman
  0 siblings, 1 reply; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02  0:42 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Tests for no_caller_saved_registers processing logic
> (see verifier.c:match_and_mark_nocsr_pattern()):
> - a canary positive test case;
> - various tests with broken patterns;
> - tests with read/write fixed/varying stack access that violate nocsr
>   stack access contract;
> - tests with multiple subprograms.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  .../selftests/bpf/prog_tests/verifier.c       |   7 +
>  .../selftests/bpf/progs/verifier_nocsr.c      | 437 ++++++++++++++++++
>  2 files changed, 444 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c
>

You don't seem to have a case where offset is not a multiple of 8
(though it would have to be a sub-register spill which would be
"rejected" anyway, so not sure if there is anything to add here)

> diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> index 6816ff064516..8e056c36c549 100644
> --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> @@ -53,6 +53,7 @@
>  #include "verifier_movsx.skel.h"
>  #include "verifier_netfilter_ctx.skel.h"
>  #include "verifier_netfilter_retcode.skel.h"
> +#include "verifier_nocsr.skel.h"
>  #include "verifier_precision.skel.h"
>  #include "verifier_prevent_map_lookup.skel.h"
>  #include "verifier_raw_stack.skel.h"
> @@ -171,6 +172,12 @@ void test_verifier_meta_access(void)          { RUN(verifier_meta_access); }
>  void test_verifier_movsx(void)                 { RUN(verifier_movsx); }
>  void test_verifier_netfilter_ctx(void)        { RUN(verifier_netfilter_ctx); }
>  void test_verifier_netfilter_retcode(void)    { RUN(verifier_netfilter_retcode); }
> +void test_verifier_nocsr(void)
> +{
> +#if defined(__x86_64__)
> +       RUN(verifier_nocsr);
> +#endif /* __x86_64__ */

maybe #else <mark-as-skipped> ?

> +}
>  void test_verifier_precision(void)            { RUN(verifier_precision); }
>  void test_verifier_prevent_map_lookup(void)   { RUN(verifier_prevent_map_lookup); }
>  void test_verifier_raw_stack(void)            { RUN(verifier_raw_stack); }
> diff --git a/tools/testing/selftests/bpf/progs/verifier_nocsr.c b/tools/testing/selftests/bpf/progs/verifier_nocsr.c
> new file mode 100644
> index 000000000000..5ddc2c97ada6
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/verifier_nocsr.c
> @@ -0,0 +1,437 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include "bpf_misc.h"
> +
> +#define __xlated_bpf_get_smp_processor_id              \
> +       __xlated(": w0 = ")                             \

how will this work for no_alu32 mode?

> +       __xlated(": r0 = &(void __percpu *)(r0)")       \
> +       __xlated(": r0 = *(u32 *)(r0 +0)")
> +
> +SEC("raw_tp")
> +__xlated("4: r5 = 5")
> +__xlated_bpf_get_smp_processor_id
> +__xlated("8: exit")
> +__success
> +__naked void simple(void)
> +{
> +       asm volatile (
> +       "r1 = 1;"
> +       "r2 = 2;"
> +       "r3 = 3;"
> +       "r4 = 4;"
> +       "r5 = 5;"
> +       "*(u64 *)(r10 - 16) = r1;"
> +       "*(u64 *)(r10 - 24) = r2;"
> +       "*(u64 *)(r10 - 32) = r3;"
> +       "*(u64 *)(r10 - 40) = r4;"
> +       "*(u64 *)(r10 - 48) = r5;"
> +       "call %[bpf_get_smp_processor_id];"
> +       "r5 = *(u64 *)(r10 - 48);"
> +       "r4 = *(u64 *)(r10 - 40);"
> +       "r3 = *(u64 *)(r10 - 32);"
> +       "r2 = *(u64 *)(r10 - 24);"
> +       "r1 = *(u64 *)(r10 - 16);"
> +       "exit;"
> +       :
> +       : __imm(bpf_get_smp_processor_id)
> +       : __clobber_all);
> +}

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal
  2024-07-02  0:42   ` Andrii Nakryiko
@ 2024-07-02 21:12     ` Eduard Zingerman
  2024-07-02 21:20       ` Andrii Nakryiko
  0 siblings, 1 reply; 47+ messages in thread
From: Eduard Zingerman @ 2024-07-02 21:12 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Mon, 2024-07-01 at 17:42 -0700, Andrii Nakryiko wrote:

[...]

> You don't seem to have a case where offset is not a multiple of 8
> (though it would have to be a sub-register spill which would be
> "rejected" anyway, so not sure if there is anything to add here)

My bad, will add such a test-case.

> 
> > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c

[...]

> > +void test_verifier_nocsr(void)
> > +{
> > +#if defined(__x86_64__)
> > +       RUN(verifier_nocsr);
> > +#endif /* __x86_64__ */
> 
> maybe #else <mark-as-skipped> ?

Right, makes sense.

[...]

> > +++ b/tools/testing/selftests/bpf/progs/verifier_nocsr.c
> > @@ -0,0 +1,437 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include <linux/bpf.h>
> > +#include <bpf/bpf_helpers.h>
> > +#include "bpf_misc.h"
> > +
> > +#define __xlated_bpf_get_smp_processor_id              \
> > +       __xlated(": w0 = ")                             \
> 
> how will this work for no_alu32 mode?

The patch is applied by kernel, and it does not care about alu32
compiler flags:

	insn_buf[0] = BPF_MOV32_IMM(BPF_REG_0, (u32)(unsigned long)&pcpu_hot.cpu_number);
	insn_buf[1] = BPF_MOV64_PERCPU_REG(BPF_REG_0, BPF_REG_0);
	insn_buf[2] = BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, 0);

[...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal
  2024-07-02 21:12     ` Eduard Zingerman
@ 2024-07-02 21:20       ` Andrii Nakryiko
  0 siblings, 0 replies; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02 21:20 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Tue, Jul 2, 2024 at 2:12 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-07-01 at 17:42 -0700, Andrii Nakryiko wrote:
>
> [...]
>
> > You don't seem to have a case where offset is not a multiple of 8
> > (though it would have to be a sub-register spill which would be
> > "rejected" anyway, so not sure if there is anything to add here)
>
> My bad, will add such a test-case.
>
> >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
>
> [...]
>
> > > +void test_verifier_nocsr(void)
> > > +{
> > > +#if defined(__x86_64__)
> > > +       RUN(verifier_nocsr);
> > > +#endif /* __x86_64__ */
> >
> > maybe #else <mark-as-skipped> ?
>
> Right, makes sense.
>
> [...]
>
> > > +++ b/tools/testing/selftests/bpf/progs/verifier_nocsr.c
> > > @@ -0,0 +1,437 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +#include <linux/bpf.h>
> > > +#include <bpf/bpf_helpers.h>
> > > +#include "bpf_misc.h"
> > > +
> > > +#define __xlated_bpf_get_smp_processor_id              \
> > > +       __xlated(": w0 = ")                             \
> >
> > how will this work for no_alu32 mode?
>
> The patch is applied by kernel, and it does not care about alu32
> compiler flags:
>
>         insn_buf[0] = BPF_MOV32_IMM(BPF_REG_0, (u32)(unsigned long)&pcpu_hot.cpu_number);

ah, right-right, of course!

>         insn_buf[1] = BPF_MOV64_PERCPU_REG(BPF_REG_0, BPF_REG_0);
>         insn_buf[2] = BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, 0);
>
> [...]

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls
  2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
                   ` (7 preceding siblings ...)
  2024-06-29  9:47 ` [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
@ 2024-07-02  0:41 ` Andrii Nakryiko
  8 siblings, 0 replies; 47+ messages in thread
From: Andrii Nakryiko @ 2024-07-02  0:41 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	jose.marchesi

On Sat, Jun 29, 2024 at 2:48 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> This RFC seeks to allow using no_caller_saved_registers gcc/clang
> attribute with some BPF helper functions (and kfuncs in the future).
>
> As documented in [1], this attribute means that function scratches
> only some of the caller saved registers defined by ABI.
> For BPF the set of such registers could be defined as follows:
> - R0 is scratched only if function is non-void;
> - R1-R5 are scratched only if corresponding parameter type is defined
>   in the function prototype.
>
> The goal of the RFC is to implement no_caller_saved_registers
> (nocsr for short) in a backwards compatible manner:
> - for kernels that support the feature, gain some performance boost
>   from better register allocation;
> - for kernels that don't support the feature, allow programs execution
>   with minor performance losses.
>
> To achieve this, use a scheme suggested by Alexei Starovoitov:
> - for nocsr calls clang allocates registers as-if relevant r0-r5
>   registers are not scratched by the call;
> - as a post-processing step, clang visits each nocsr call and adds
>   spill/fill for every live r0-r5;
> - stack offsets used for spills/fills are allocated as minimal
>   stack offsets in whole function and are not used for any other
>   purposes;
> - when kernel loads a program, it looks for such patterns
>   (nocsr function surrounded by spills/fills) and checks if
>   spill/fill stack offsets are used exclusively in nocsr patterns;
> - if so, and if current JIT inlines the call to the nocsr function

JIT inlines or BPF verifier can inline as well?


>   (e.g. a helper call), kernel removes unnecessary spill/fill pairs;
> - when old kernel loads a program, presence of spill/fill pairs
>   keeps BPF program valid, albeit slightly less efficient.
>
> Corresponding clang/llvm changes are available in [2].
>
> The patch-set uses bpf_get_smp_processor_id() function as a canary,
> making it the first helper with nocsr attribute.
>
> For example, consider the following program:
>
>   #define __no_csr __attribute__((no_caller_saved_registers))
>   #define SEC(name) __attribute__((section(name), used))
>   #define bpf_printk(fmt, ...) bpf_trace_printk((fmt), sizeof(fmt), __VA_ARGS__)
>
>   typedef unsigned int __u32;
>
>   static long (* const bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) 6;
>   static __u32 (*const bpf_get_smp_processor_id)(void) __no_csr = (void *)8;
>
>   SEC("raw_tp")
>   int test(void *ctx)
>   {
>           __u32 task = bpf_get_smp_processor_id();
>         bpf_printk("ctx=%p, smp=%d", ctx, task);
>         return 0;
>   }
>
>   char _license[] SEC("license") = "GPL";
>
> Compiled (using [2]) as follows:
>
>   $ clang --target=bpf -O2 -g -c -o nocsr.bpf.o nocsr.bpf.c
>   $ llvm-objdump --no-show-raw-insn -Sd nocsr.bpf.o
>     ...
>   3rd parameter for printk call     removable spill/fill pair
>   .--- 0:       r3 = r1                             |
> ; |       __u32 task = bpf_get_smp_processor_id();  |
>   |    1:       *(u64 *)(r10 - 0x8) = r3 <----------|
>   |    2:       call 0x8                            |
>   |    3:       r3 = *(u64 *)(r10 - 0x8) <----------'
> ; |     bpf_printk("ctx=%p, smp=%d", ctx, task);
>   |    4:       r1 = 0x0 ll
>   |    6:       r2 = 0xf
>   |    7:       r4 = r0
>   '--> 8:       call 0x6
> ;       return 0;
>        9:       r0 = 0x0
>       10:       exit
>
> Here is how the program looks after verifier processing:
>
>   # bpftool prog load ./nocsr.bpf.o /sys/fs/bpf/nocsr-test
>   # bpftool prog dump xlated pinned /sys/fs/bpf/nocsr-test
>   int test(void * ctx):
>   ; int test(void *ctx)
>      0: (bf) r3 = r1               <--------- 3rd printk parameter
>   ; __u32 task = bpf_get_smp_processor_id();
>      1: (b4) w0 = 197132           <--------- inlined helper call,
>      2: (bf) r0 = r0               <--------- spill/fill pair removed
>      3: (61) r0 = *(u32 *)(r0 +0)  <---------
>   ; bpf_printk("ctx=%p, smp=%d", ctx, task);
>      4: (18) r1 = map[id:13][0]+0
>      6: (b7) r2 = 15
>      7: (bf) r4 = r0
>      8: (85) call bpf_trace_printk#-125920
>   ; return 0;
>      9: (b7) r0 = 0
>     10: (95) exit
>
> [1] https://clang.llvm.org/docs/AttributeReference.html#no-caller-saved-registers
> [2] https://github.com/eddyz87/llvm-project/tree/bpf-no-caller-saved-registers
>
> Eduard Zingerman (8):
>   bpf: add a get_helper_proto() utility function
>   bpf: no_caller_saved_registers attribute for helper calls
>   bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id()
>   selftests/bpf: extract utility function for BPF disassembly
>   selftests/bpf: no need to track next_match_pos in struct test_loader
>   selftests/bpf: extract test_loader->expect_msgs as a data structure
>   selftests/bpf: allow checking xlated programs in verifier_* tests
>   selftests/bpf: test no_caller_saved_registers spill/fill removal
>
>  include/linux/bpf.h                           |   6 +
>  include/linux/bpf_verifier.h                  |   9 +
>  kernel/bpf/helpers.c                          |   1 +
>  kernel/bpf/verifier.c                         | 346 +++++++++++++-
>  tools/testing/selftests/bpf/Makefile          |   1 +
>  tools/testing/selftests/bpf/disasm_helpers.c  |  50 ++
>  tools/testing/selftests/bpf/disasm_helpers.h  |  12 +
>  .../selftests/bpf/prog_tests/ctx_rewrite.c    |  71 +--
>  .../selftests/bpf/prog_tests/verifier.c       |   7 +
>  tools/testing/selftests/bpf/progs/bpf_misc.h  |   6 +
>  .../selftests/bpf/progs/verifier_nocsr.c      | 437 ++++++++++++++++++
>  tools/testing/selftests/bpf/test_loader.c     | 170 +++++--
>  tools/testing/selftests/bpf/test_progs.h      |   1 -
>  tools/testing/selftests/bpf/testing_helpers.c |   1 +
>  14 files changed, 986 insertions(+), 132 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/disasm_helpers.c
>  create mode 100644 tools/testing/selftests/bpf/disasm_helpers.h
>  create mode 100644 tools/testing/selftests/bpf/progs/verifier_nocsr.c
>
> --
> 2.45.2
>

^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2024-07-04 17:39 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-29  9:47 [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 1/8] bpf: add a get_helper_proto() utility function Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:07     ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 2/8] bpf: no_caller_saved_registers attribute for helper calls Eduard Zingerman
2024-07-01 19:01   ` Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:38     ` Eduard Zingerman
2024-07-02 21:09       ` Andrii Nakryiko
2024-07-02 21:19         ` Eduard Zingerman
2024-07-02 21:22           ` Andrii Nakryiko
2024-07-03 11:57   ` Puranjay Mohan
2024-07-03 16:13     ` Eduard Zingerman
2024-07-04 10:55       ` Puranjay Mohan
2024-06-29  9:47 ` [RFC bpf-next v1 3/8] bpf, x86: no_caller_saved_registers for bpf_get_smp_processor_id() Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:43     ` Eduard Zingerman
2024-07-02 21:11       ` Andrii Nakryiko
2024-07-02 21:25         ` Eduard Zingerman
2024-07-03 11:27         ` Puranjay Mohan
2024-07-03 23:14           ` Eduard Zingerman
2024-07-04 11:19             ` Puranjay Mohan
2024-07-04 16:39               ` Eduard Zingerman
2024-07-04 17:00           ` Eduard Zingerman
2024-07-04 17:24             ` Puranjay Mohan
2024-07-04 17:39               ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 4/8] selftests/bpf: extract utility function for BPF disassembly Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 20:59     ` Eduard Zingerman
2024-07-02 21:16       ` Andrii Nakryiko
2024-07-02 21:23         ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 5/8] selftests/bpf: no need to track next_match_pos in struct test_loader Eduard Zingerman
2024-07-02  0:41   ` Andrii Nakryiko
2024-07-02 21:05     ` Eduard Zingerman
2024-07-02 21:18       ` Andrii Nakryiko
2024-06-29  9:47 ` [RFC bpf-next v1 6/8] selftests/bpf: extract test_loader->expect_msgs as a data structure Eduard Zingerman
2024-07-02  0:42   ` Andrii Nakryiko
2024-07-02 21:06     ` Eduard Zingerman
2024-06-29  9:47 ` [RFC bpf-next v1 7/8] selftests/bpf: allow checking xlated programs in verifier_* tests Eduard Zingerman
2024-07-02  0:42   ` Andrii Nakryiko
2024-07-02 21:07     ` Eduard Zingerman
2024-07-02 21:19       ` Andrii Nakryiko
2024-06-29  9:47 ` [RFC bpf-next v1 8/8] selftests/bpf: test no_caller_saved_registers spill/fill removal Eduard Zingerman
2024-07-02  0:42   ` Andrii Nakryiko
2024-07-02 21:12     ` Eduard Zingerman
2024-07-02 21:20       ` Andrii Nakryiko
2024-07-02  0:41 ` [RFC bpf-next v1 0/8] no_caller_saved_registers attribute for helper calls Andrii Nakryiko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox