bpf.vger.kernel.org archive mirror
* [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF
@ 2024-11-07 17:50 Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches() Eduard Zingerman
                   ` (11 more replies)
  0 siblings, 12 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

Some time ago, in an off-list discussion, Alexei Starovoitov suggested
compiling certain kfuncs to BPF to allow inlining calls to such kfuncs
during verification. This RFC explores the idea.

This RFC introduces a notion of inlinable BPF kfuncs.
Inlinable kfuncs are compiled to BPF and are inlined by the verifier
after program verification. Inlined kfunc bodies are subject to dead
code removal and removal of conditional jumps, if such jumps are
proven to always follow a single branch.

Motivation
----------

The patch set uses bpf_dynptr_slice() as a guinea pig, as this function
is relatively complex and has a few conditionals that could be
removed most of the time. The function has the following structure:

    void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset,
                           void *buffer__opt, u32 buffer__szk)
    {
        ...
        type = bpf_dynptr_get_type(ptr);

        switch (type) {
        ...
        case BPF_DYNPTR_TYPE_SKB:
                if (buffer__opt)
                        return skb_header_pointer(...);
                else
                        return skb_pointer_if_linear(...);
        ...
    }

The dynptr type and the 'buffer__opt' parameter are most likely known
at the call site, and thus the function could be inlined without the
'switch' and 'if' checks above.
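
For illustration, if at some call site the verifier can prove that the
dynptr is an SKB dynptr and that 'buffer__opt' is non-NULL, the inlined
body could in principle collapse to just the following (a hand-written
sketch, not actual generated code):

    /* type == BPF_DYNPTR_TYPE_SKB and buffer__opt != NULL are known,
     * so both the 'switch' and the 'if' above go away:
     */
    return skb_header_pointer(...);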

This yields a measurable speedup on a microbenchmark: a simple test
measuring the number of bpf_dynptr_slice() executions per second shows
a ~1.5x speedup compared to master (see the last patch in the series).
Granted, real-world programs do other work besides slice calls.

Mechanism
---------

The implementation pursues three main goals:
- avoid differences in program verification regardless of whether a
  certain kfunc is inlinable;
- predict branches in inlined kfunc bodies;
- allow inlined kfunc bodies to do arbitrary computations.

The goals are achieved in the following steps:
- Inlinable kfuncs are defined in kernel/bpf/inlinable_kfuncs.c.
- In order to include kernel headers as-is, the C file is first
  compiled to LLVM bitcode targeting the native architecture, and then
  to a BPF ELF object using the llc utility.
- The BPF ELF object is embedded in a kernel data section.
- At kernel initialization time the ELF object is parsed and the
  functions defined in it are deemed to be inlinable kfuncs.
- Before the main verification pass, for each inlinable kfunc call
  site, the inlinable kfunc is cloned as a hidden subprogram.
  Such subprograms are called kfunc instances.
- A new KERNEL_VALUE register type is added:
  - ALU operations on any type and KERNEL_VALUE return KERNEL_VALUE;
  - load / store operations with base register having type
    KERNEL_VALUE return KERNEL_VALUE.
- During the main verification pass:
  - inlinable kfunc calls are verified as regular kfunc calls;
  - the bodies of the corresponding kfunc instances are visited
    in a "distilled" context:
    - no caller stack frame;
    - scalar or null pointer values in r1-r5 are copied from the
      caller frame to the instance call frame verbatim;
    - pointers to dynptrs in r1-r5 are represented in the instance
      call frame as pointers to a fake stack frame (see the sketch
      after this list). For each dynptr this fake stack frame contains
      two register spills:
      - one scalar for the 'size' field, which also encodes the type;
      - one KERNEL_VALUE for the 'data' field;
    - r1-r5 of all other types are represented in the instance call
      frame as KERNEL_VALUEs;
  - when an 'exit' instruction within a kfunc instance body is
    processed, verification for the current path is assumed complete.
- After the main verification pass:
  - rely on the existing passes opt_hard_wire_dead_code_branches() and
    opt_remove_dead_code() to simplify kfunc instance bodies;
  - calls to inlinable kfuncs are replaced with the corresponding
    kfunc instance bodies, and the instance subprograms are removed.
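
To make the dynptr handling above more concrete, below is a sketch of
the distilled context for a call like bpf_dynptr_slice(&dptr, off, buf,
sz) (the exact frame layout is illustrative only):

    caller frame                     kfunc instance frame
    ------------                     --------------------
    r1 = &dptr (pointer to stack) -> r1 = pointer to a fake stack frame
                                          holding two spills: a known
                                          scalar for 'size' (which also
                                          encodes the dynptr type) and
                                          a KERNEL_VALUE for 'data'
    r2 = off (scalar)             -> r2 = same scalar, copied verbatim
    r3 = buf (pointer to memory)  -> r3 = KERNEL_VALUE
    r4 = sz  (scalar)             -> r4 = same scalar, copied verbatim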

Patch-set structure
-------------------

Implementation of the above mechanics is split across several patches
to simplify review. The following patches are most interesting:

- "bpf: shared BPF/native kfuncs":
  Build system integration and kfunc inlining after verification.

- "bpf: KERNEL_VALUE register type"
  Adds an opaque register type that can be used in ALU operations
  and memory accesses.

- "bpf: allow specifying inlinable kfuncs in modules"
  This is mostly for tests: for testing purposes it is necessary to
  control the exact assembly code of both the test case calling the
  kfunc and the kfunc itself.

- "bpf: instantiate inlinable kfuncs before verification"
  Adds a pass that clones inlinable kfunc bodies as hidden
  subprograms, one subprogram per callsite.

- "bpf: special rules for kernel function calls inside inlinable kfuncs"
  Allows calling arbitrary kernel functions from kfunc instances.

Limitations
-----------

Some existing verifier restrictions are not lifted for inlinable kfuncs:
- stack is limited to 512 bytes;
- max 8 call frames (7 with the fake call frame);
- loops are still verified in a "brute force" way;
- instructions processed for kfunc instances are counted against the
  1M instruction budget.

The list is probably not exhaustive.

TODO items
----------

The following items are currently on the TODO list:
- consider getting rid of bpftool linking phase for BPF elf object;
- consider getting rid of clang->bitcode->llc->elf dance;
- make bpf_dynptr_from_skb() inlinable and allow passing object state
  between kfunc instances, thus avoiding the special case for dynptr;
- determine the set of kfuncs for which inlining makes sense.

Alternatives
------------

Imo, this RFC is worth following through only if the number of kfuncs
benefiting from inlining is large. If the set is limited to the dynptr
family of functions, it is simpler to add a number of hard-coded
inlining templates for such functions (similar to what is currently
done for some helpers).

Eduard Zingerman (11):
  bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  selftests/bpf: tests for opt_hard_wire_dead_code_branches()
  bpf: shared BPF/native kfuncs
  bpf: allow specifying inlinable kfuncs in modules
  bpf: dynamic allocation for bpf_verifier_env->subprog_info
  bpf: KERNEL_VALUE register type
  bpf: instantiate inlinable kfuncs before verification
  bpf: special rules for kernel function calls inside inlinable kfuncs
  bpf: move selected dynptr kfuncs to inlinable_kfuncs.c
  selftests/bpf: tests to verify handling of inlined kfuncs
  selftests/bpf: dynptr_slice benchmark

 Makefile                                      |   22 +-
 include/linux/bpf.h                           |   40 +-
 include/linux/bpf_verifier.h                  |   12 +-
 include/linux/btf.h                           |    7 +
 kernel/bpf/Makefile                           |   25 +-
 kernel/bpf/btf.c                              |    2 +-
 kernel/bpf/helpers.c                          |  130 +-
 kernel/bpf/inlinable_kfuncs.c                 |  113 ++
 kernel/bpf/log.c                              |    1 +
 kernel/bpf/verifier.c                         | 1187 ++++++++++++++++-
 tools/testing/selftests/bpf/Makefile          |    2 +
 tools/testing/selftests/bpf/bench.c           |    2 +
 .../selftests/bpf/benchs/bench_dynptr_slice.c |   76 ++
 .../selftests/bpf/bpf_testmod/Makefile        |   24 +-
 .../{bpf_testmod.c => bpf_testmod_core.c}     |   25 +
 .../bpf/bpf_testmod/test_inlinable_kfuncs.c   |  132 ++
 .../selftests/bpf/prog_tests/verifier.c       |    9 +
 .../selftests/bpf/progs/dynptr_slice_bench.c  |   29 +
 .../selftests/bpf/progs/verifier_dead_code.c  |   63 +
 .../bpf/progs/verifier_inlinable_kfuncs.c     |  253 ++++
 20 files changed, 1970 insertions(+), 184 deletions(-)
 create mode 100644 kernel/bpf/inlinable_kfuncs.c
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_dynptr_slice.c
 rename tools/testing/selftests/bpf/bpf_testmod/{bpf_testmod.c => bpf_testmod_core.c} (97%)
 create mode 100644 tools/testing/selftests/bpf/bpf_testmod/test_inlinable_kfuncs.c
 create mode 100644 tools/testing/selftests/bpf/progs/dynptr_slice_bench.c
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_dead_code.c
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_inlinable_kfuncs.c

-- 
2.47.0


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-14 22:20   ` Eduard Zingerman
  2024-11-15  0:16   ` Andrii Nakryiko
  2024-11-07 17:50 ` [RFC bpf-next 02/11] selftests/bpf: tests for opt_hard_wire_dead_code_branches() Eduard Zingerman
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

Consider dead code elimination problem for program like below:

    main:
      1: r1 = 42
      2: call <subprogram>;
      3: exit

    subprogram:
      4: r0 = 1
      5: if r1 != 42 goto +1
      6: r0 = 2
      7: exit;

Here the verifier would visit every instruction, and thus the
bpf_insn_aux_data->seen flag would be set for both the true (7)
and fallthrough (6) branches of the conditional (5).
Hence opt_hard_wire_dead_code_branches() will not replace
conditional (5) with an unconditional jump.

To cover such cases:
- add two fields in struct bpf_insn_aux_data:
  - true_branch_taken;
  - false_branch_taken;
- adjust check_cond_jmp_op() to set the fields according to jump
  predictions;
- handle these flags in opt_hard_wire_dead_code_branches()
  (an example of the resulting rewrite is shown after this list):
  - true_branch_taken && !false_branch_taken:
    always jumps, replace the instruction with 'goto off';
  - !true_branch_taken && false_branch_taken:
    always falls through, replace with 'goto +0';
  - true_branch_taken && false_branch_taken:
    both jump and fallthrough are possible, don't change the instruction;
  - !true_branch_taken && !false_branch_taken:
    neither jump nor fallthrough is possible, the condition itself
    must be dead code (and should be removed by opt_remove_dead_code).
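
For the example above the verifier knows that r1 == 42 at insn (5), so
only false_branch_taken would be set for it, and after
opt_hard_wire_dead_code_branches() the subprogram would look roughly as
follows (a sketch of the expected result, not actual verifier output):

    subprogram:
      4: r0 = 1
      5: goto +0      <-- was: if r1 != 42 goto +1
      6: r0 = 2
      7: exit;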

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h |  4 +++-
 kernel/bpf/verifier.c        | 16 ++++++++++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 4513372c5bc8..ed4eacfd4db7 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -570,7 +570,9 @@ struct bpf_insn_aux_data {
 	struct btf_struct_meta *kptr_struct_meta;
 	u64 map_key_state; /* constant (32 bit) key tracking for maps */
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
-	u32 seen; /* this insn was processed by the verifier at env->pass_cnt */
+	bool seen; /* this insn was processed by the verifier at env->pass_cnt */
+	bool true_branch_taken; /* for cond jumps, set if verifier ever took true branch */
+	bool false_branch_taken; /* for cond jumps, set if verifier ever took false branch */
 	bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */
 	bool zext_dst; /* this insn zero extends dst reg */
 	bool needs_zext; /* alu op needs to clear upper bits */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 7958d6ff6b73..3bae0bbc1da9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13265,7 +13265,7 @@ static void sanitize_mark_insn_seen(struct bpf_verifier_env *env)
 	 * rewrite/sanitize them.
 	 */
 	if (!vstate->speculative)
-		env->insn_aux_data[env->insn_idx].seen = env->pass_cnt;
+		env->insn_aux_data[env->insn_idx].seen = env->pass_cnt > 0;
 }
 
 static int sanitize_err(struct bpf_verifier_env *env,
@@ -15484,6 +15484,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 {
 	struct bpf_verifier_state *this_branch = env->cur_state;
 	struct bpf_verifier_state *other_branch;
+	struct bpf_insn_aux_data *aux = &env->insn_aux_data[*insn_idx];
 	struct bpf_reg_state *regs = this_branch->frame[this_branch->curframe]->regs;
 	struct bpf_reg_state *dst_reg, *other_branch_regs, *src_reg = NULL;
 	struct bpf_reg_state *eq_branch_regs;
@@ -15510,6 +15511,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 				insn->off, insn->imm);
 			return -EINVAL;
 		}
+		aux->true_branch_taken = true;
+		aux->false_branch_taken = true;
 		prev_st = find_prev_entry(env, cur_st->parent, idx);
 
 		/* branch out 'fallthrough' insn as a new state to explore */
@@ -15579,6 +15582,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		 * the fall-through branch for simulation under speculative
 		 * execution.
 		 */
+		aux->true_branch_taken = true;
 		if (!env->bypass_spec_v1 &&
 		    !sanitize_speculative_path(env, insn, *insn_idx + 1,
 					       *insn_idx))
@@ -15592,6 +15596,7 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		 * program will go. If needed, push the goto branch for
 		 * simulation under speculative execution.
 		 */
+		aux->false_branch_taken = true;
 		if (!env->bypass_spec_v1 &&
 		    !sanitize_speculative_path(env, insn,
 					       *insn_idx + insn->off + 1,
@@ -15602,6 +15607,9 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		return 0;
 	}
 
+	aux->true_branch_taken = true;
+	aux->false_branch_taken = true;
+
 	/* Push scalar registers sharing same ID to jump history,
 	 * do this before creating 'other_branch', so that both
 	 * 'this_branch' and 'other_branch' share this history
@@ -19288,7 +19296,7 @@ static void adjust_insn_aux_data(struct bpf_verifier_env *env,
 {
 	struct bpf_insn_aux_data *old_data = env->insn_aux_data;
 	struct bpf_insn *insn = new_prog->insnsi;
-	u32 old_seen = old_data[off].seen;
+	bool old_seen = old_data[off].seen;
 	u32 prog_len;
 	int i;
 
@@ -19608,9 +19616,9 @@ static void opt_hard_wire_dead_code_branches(struct bpf_verifier_env *env)
 		if (!insn_is_cond_jump(insn->code))
 			continue;
 
-		if (!aux_data[i + 1].seen)
+		if (aux_data[i].true_branch_taken && !aux_data[i].false_branch_taken)
 			ja.off = insn->off;
-		else if (!aux_data[i + 1 + insn->off].seen)
+		else if (!aux_data[i].true_branch_taken && aux_data[i].false_branch_taken)
 			ja.off = 0;
 		else
 			continue;
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 02/11] selftests/bpf: tests for opt_hard_wire_dead_code_branches()
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches() Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs Eduard Zingerman
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

As opt_hard_wire_dead_code_branches() was changed to react to
accumulated branch prediction flags for conditional jumps,
add tests for various possible predictions.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/prog_tests/verifier.c       |  2 +
 .../selftests/bpf/progs/verifier_dead_code.c  | 63 +++++++++++++++++++
 2 files changed, 65 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_dead_code.c

diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index 75f7a2ce334b..efd42c07f58a 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -26,6 +26,7 @@
 #include "verifier_ctx.skel.h"
 #include "verifier_ctx_sk_msg.skel.h"
 #include "verifier_d_path.skel.h"
+#include "verifier_dead_code.skel.h"
 #include "verifier_direct_packet_access.skel.h"
 #include "verifier_direct_stack_access_wraparound.skel.h"
 #include "verifier_div0.skel.h"
@@ -154,6 +155,7 @@ void test_verifier_const_or(void)             { RUN(verifier_const_or); }
 void test_verifier_ctx(void)                  { RUN(verifier_ctx); }
 void test_verifier_ctx_sk_msg(void)           { RUN(verifier_ctx_sk_msg); }
 void test_verifier_d_path(void)               { RUN(verifier_d_path); }
+void test_verifier_dead_code(void)            { RUN(verifier_dead_code); }
 void test_verifier_direct_packet_access(void) { RUN(verifier_direct_packet_access); }
 void test_verifier_direct_stack_access_wraparound(void) { RUN(verifier_direct_stack_access_wraparound); }
 void test_verifier_div0(void)                 { RUN(verifier_div0); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_dead_code.c b/tools/testing/selftests/bpf/progs/verifier_dead_code.c
new file mode 100644
index 000000000000..b2eed6be0d42
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_dead_code.c
@@ -0,0 +1,63 @@
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+SEC("socket")
+__xlated("0: r1 = 1")
+__xlated("1: r0 = 42")
+__xlated("2: r0 = 24")
+__xlated("3: exit")
+__success
+__retval(24)
+__naked void cond_always_false(void)
+{
+	asm volatile (
+		"r1 = 1;"
+		"r0 = 42;"
+		"if r1 != 1 goto +1;"
+		"r0 = 24;"
+		"exit;"
+		::: __clobber_all
+	);
+}
+
+SEC("socket")
+__xlated("0: r1 = 2")
+__xlated("1: r0 = 42")
+__xlated("2: exit")
+__success
+__retval(42)
+__naked void cond_always_true(void)
+{
+	asm volatile (
+		"r1 = 2;"
+		"r0 = 42;"
+		"if r1 != 1 goto +1;"
+		"r0 = 24;"
+		"exit;"
+		::: __clobber_all
+	);
+}
+
+SEC("socket")
+__xlated("0: call")
+__xlated("1: r1 = r0")
+__xlated("2: r0 = 42")
+__xlated("3: if r1 != 0x1 goto pc+1")
+__xlated("4: r0 = 24")
+__xlated("5: exit")
+__success
+__naked void cond_unknown(void)
+{
+	asm volatile (
+		"call %[bpf_get_prandom_u32];"
+		"r1 = r0;"
+		"r0 = 42;"
+		"if r1 != 1 goto +1;"
+		"r0 = 24;"
+		"exit;"
+		:
+		: __imm(bpf_get_prandom_u32)
+		: __clobber_all
+	);
+}
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches() Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 02/11] selftests/bpf: tests for opt_hard_wire_dead_code_branches() Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-08 20:43   ` Toke Høiland-Jørgensen
  2024-11-15  0:27   ` Andrii Nakryiko
  2024-11-07 17:50 ` [RFC bpf-next 04/11] bpf: allow specifying inlinable kfuncs in modules Eduard Zingerman
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

This commit adds a notion of inlinable kfuncs, compiled both to native
code and to BPF. The BPF-compiled version is embedded in a kernel data
section and is available to the verifier. The verifier uses it to
replace calls to such kfuncs with inlined function bodies.

Inlinable kfuncs are available only if CLANG is used for kernel
compilation.

In the scope of this commit all inlining happens as the last steps of
do_check(), after verification is finished. Follow-up commits extend
this mechanism to allow removal of some conditional branches inside
inlined function bodies.

The commit consists of the following changes:
- main kernel makefile:
  modified to compile a bootstrap version of bpftool;
- kernel/bpf/Makefile:
  - a new file inlinable_kfuncs.c is added;
  - the makefile is modified to compile this file as a BPF ELF object,
    using the following steps:
    - use clang with the native target to produce LLVM bitcode;
    - compile the LLVM bitcode to a BPF object file;
    - resolve relocations inside the BPF object file using bpftool as
      a linker;
    This arrangement allows including unmodified network-related
    header files.
- verifier.c:
  - the generated BPF ELF is included as part of a kernel data section;
  - at kernel initialization phase:
    - the ELF is parsed and each function declared within it is
      recorded as an instance of the 'inlinable_kfunc' structure;
    - calls to extern functions within the ELF file (pointed to by
      relocation records) are replaced with kfunc call instructions;
  - do_check() is modified to replace calls to kfuncs from the
    inlinable kfunc table with function bodies
    (see the sketch after this list):
    - replacement happens after the main verification pass, so the
      bodies of the kfuncs are not analyzed by the verifier;
    - if the kfunc uses callee-saved registers r6-r9, spill/fill pairs
      are generated for these registers before/after the inlined kfunc
      body at the call site;
    - if the kfunc uses r10 as a base pointer for load or store
      instructions, the offsets of these instructions are adjusted;
    - if the kfunc uses r10 in other instructions, such r10 is
      considered as escaping and the kfunc is not inlined.
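
For illustration, assuming the kfunc body uses r6 and the caller's
stack depth at the call site is 64 bytes, the patched call site would
look roughly as follows (registers and offsets are illustrative only):

    *(u64 *)(r10 - 72) = r6    /* spill callee-saved reg used by kfunc */
    ...                        /* inlined kfunc body; its own r10-based
                                * accesses are shifted below -72 and its
                                * 'exit' instructions become jumps to
                                * the fill below */
    r6 = *(u64 *)(r10 - 72)    /* fill */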

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 Makefile                      |  22 +-
 kernel/bpf/Makefile           |  24 +-
 kernel/bpf/inlinable_kfuncs.c |   1 +
 kernel/bpf/verifier.c         | 652 +++++++++++++++++++++++++++++++++-
 4 files changed, 680 insertions(+), 19 deletions(-)
 create mode 100644 kernel/bpf/inlinable_kfuncs.c

diff --git a/Makefile b/Makefile
index a9a7d9ffaa98..4ded57f4b0c2 100644
--- a/Makefile
+++ b/Makefile
@@ -496,6 +496,7 @@ CLIPPY_DRIVER	= clippy-driver
 BINDGEN		= bindgen
 PAHOLE		= pahole
 RESOLVE_BTFIDS	= $(objtree)/tools/bpf/resolve_btfids/resolve_btfids
+BPFTOOL		= $(objtree)/tools/bpf/bpftool/bootstrap/bpftool
 LEX		= flex
 YACC		= bison
 AWK		= awk
@@ -585,7 +586,7 @@ export RUSTC_BOOTSTRAP := 1
 export ARCH SRCARCH CONFIG_SHELL BASH HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE LD CC HOSTPKG_CONFIG
 export RUSTC RUSTDOC RUSTFMT RUSTC_OR_CLIPPY_QUIET RUSTC_OR_CLIPPY BINDGEN
 export HOSTRUSTC KBUILD_HOSTRUSTFLAGS
-export CPP AR NM STRIP OBJCOPY OBJDUMP READELF PAHOLE RESOLVE_BTFIDS LEX YACC AWK INSTALLKERNEL
+export CPP AR NM STRIP OBJCOPY OBJDUMP READELF PAHOLE RESOLVE_BTFIDS BPFTOOL LEX YACC AWK INSTALLKERNEL
 export PERL PYTHON3 CHECK CHECKFLAGS MAKE UTS_MACHINE HOSTCXX
 export KGZIP KBZIP2 KLZOP LZMA LZ4 XZ ZSTD
 export KBUILD_HOSTCXXFLAGS KBUILD_HOSTLDFLAGS KBUILD_HOSTLDLIBS LDFLAGS_MODULE
@@ -1356,6 +1357,25 @@ ifneq ($(wildcard $(resolve_btfids_O)),)
 	$(Q)$(MAKE) -sC $(srctree)/tools/bpf/resolve_btfids O=$(resolve_btfids_O) clean
 endif
 
+# TODO: cross compilation?
+# TODO: bootstrap! (to avoid vmlinux.h generation)
+PHONY += bpftool_bootstrap bpftool_clean
+bpftool_O = $(abspath $(objtree))/tools/bpf/bpftool
+
+ifdef CONFIG_BPF
+ifdef CONFIG_CC_IS_CLANG
+prepare: bpftool_bootstrap
+endif
+endif
+
+bpftool_bootstrap:
+	$(Q)$(MAKE) -sC $(srctree)/tools/bpf/bpftool O=$(bpftool_O) srctree=$(abspath $(srctree)) bootstrap
+
+bpftool_clean:
+ifneq ($(wildcard $(bpftool_O)),)
+	$(Q)$(MAKE) -sC $(srctree)/tools/bpf/bpftool O=$(bpftool_O) srctree=$(abspath $(srctree)) clean
+endif
+
 # Clear a bunch of variables before executing the submake
 ifeq ($(quiet),silent_)
 tools_silent=s
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 105328f0b9c0..3d7ee81c8e2e 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -6,7 +6,7 @@ cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse
 endif
 CFLAGS_core.o += -Wno-override-init $(cflags-nogcse-yy)
 
-obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o token.o
+obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o tnum.o log.o token.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
 obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
@@ -53,3 +53,25 @@ obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
 obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
+obj-$(CONFIG_BPF_SYSCALL) += helpers.o inlinable_kfuncs.o
+
+ifdef CONFIG_CC_IS_CLANG
+
+LLC ?= $(LLVM_PREFIX)llc$(LLVM_SUFFIX)
+
+# -mfentry -pg is $(CC_FLAGS_FTRACE)
+# -fpatchable-function-entry=16,16 is $(PADDING_CFLAGS)
+CFLAGS_REMOVE_inlinable_kfuncs.bpf.bc.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_inlinable_kfuncs.bpf.bc.o += $(PADDING_CFLAGS)
+$(obj)/inlinable_kfuncs.bpf.bc.o: $(src)/inlinable_kfuncs.c
+	$(Q)$(CLANG) $(c_flags) -emit-llvm -c $< -o $@
+
+$(obj)/inlinable_kfuncs.bpf.o: $(obj)/inlinable_kfuncs.bpf.bc.o
+	$(Q)$(LLC) -mcpu=v3 --mtriple=bpf --filetype=obj $< -o $@
+
+$(obj)/inlinable_kfuncs.bpf.linked.o: $(obj)/inlinable_kfuncs.bpf.o
+	$(Q)$(BPFTOOL) gen object $@ $<
+
+$(obj)/verifier.o: $(obj)/inlinable_kfuncs.bpf.linked.o
+
+endif
diff --git a/kernel/bpf/inlinable_kfuncs.c b/kernel/bpf/inlinable_kfuncs.c
new file mode 100644
index 000000000000..7b7dc05fa1a4
--- /dev/null
+++ b/kernel/bpf/inlinable_kfuncs.c
@@ -0,0 +1 @@
+// SPDX-License-Identifier: GPL-2.0-only
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3bae0bbc1da9..fbf51147f319 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20509,6 +20509,622 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 	return 0;
 }
 
+asm (
+"	.pushsection .data, \"a\"		\n"
+"	.global inlinable_kfuncs_data		\n"
+"inlinable_kfuncs_data:				\n"
+"	.incbin \"kernel/bpf/inlinable_kfuncs.bpf.linked.o\"\n"
+"	.global inlinable_kfuncs_data_end		\n"
+"inlinable_kfuncs_data_end:			\n"
+"	.popsection				\n"
+);
+
+extern void inlinable_kfuncs_data;
+extern void inlinable_kfuncs_data_end;
+
+struct blob {
+	void *mem;
+	u32 size;
+};
+
+struct inlinable_kfunc {
+	const char *name;
+	const struct bpf_insn *insns;
+	u32 insn_num;
+	u32 btf_id;
+};
+
+static struct inlinable_kfunc inlinable_kfuncs[16];
+
+static void *check_inlinable_kfuncs_ptr(struct blob *blob,
+				      void *ptr, u64 size, const char *context)
+{
+	if (ptr + size > blob->mem + blob->size) {
+		printk("malformed inlinable kfuncs data: bad offset/size 0x%lx/0x%llx: %s",
+		       ptr - blob->mem, size, context);
+		return NULL;
+	}
+	return ptr;
+}
+
+static void *get_inlinable_kfuncs_ptr(struct blob *blob,
+				    u64 off, u64 size, const char *context)
+{
+	return check_inlinable_kfuncs_ptr(blob, blob->mem + off, size, context);
+}
+
+struct inlinable_kfunc_regs_usage {
+	u32 used_regs_mask;
+	s32 lowest_stack_off;
+	bool r10_escapes;
+};
+
+static void scan_regs_usage(const struct bpf_insn *insns, u32 insn_num,
+			    struct inlinable_kfunc_regs_usage *usage)
+{
+	const struct bpf_insn *insn = insns;
+	s32 lowest_stack_off;
+	bool r10_escapes;
+	u32 i, mask;
+
+	lowest_stack_off = 0;
+	r10_escapes = false;
+	mask = 0;
+	for (i = 0; i < insn_num; ++i, ++insn) {
+		mask |= BIT(insn->src_reg);
+		mask |= BIT(insn->dst_reg);
+		switch (BPF_CLASS(insn->code)) {
+		case BPF_ST:
+		case BPF_STX:
+			if (insn->dst_reg == BPF_REG_10)
+				lowest_stack_off = min(lowest_stack_off, insn->off);
+			if (insn->src_reg == BPF_REG_10 && BPF_SRC(insn->code) == BPF_X)
+				r10_escapes = true;
+			break;
+		case BPF_LDX:
+			if (insn->src_reg == BPF_REG_10)
+				lowest_stack_off = min(lowest_stack_off, insn->off);
+			break;
+		case BPF_ALU:
+		case BPF_ALU64:
+			if (insn->src_reg == BPF_REG_10 && BPF_SRC(insn->code) == BPF_X)
+				r10_escapes = true;
+			break;
+		default:
+			break;
+		}
+	}
+	usage->used_regs_mask = mask;
+	usage->lowest_stack_off = lowest_stack_off;
+	usage->r10_escapes = r10_escapes;
+}
+
+#ifndef Elf_Rel
+#ifdef CONFIG_64BIT
+#define Elf_Rel		Elf64_Rel
+#else
+#define Elf_Rel		Elf32_Rel
+#endif
+#endif
+
+#define R_BPF_64_32 10
+
+struct sh_elf_sections {
+	Elf_Sym *sym;
+	Elf_Rel *rel;
+	void *text;
+	const char *strings;
+	u32 rel_cnt;
+	u32 sym_cnt;
+	u32 strings_sz;
+	u32 text_sz;
+	u32 symtab_idx;
+	u32 text_idx;
+};
+
+static int validate_inlinable_kfuncs_header(Elf_Ehdr *hdr)
+{
+	if (hdr->e_ident[EI_MAG0] != ELFMAG0 || hdr->e_ident[EI_MAG1] != ELFMAG1 ||
+	    hdr->e_ident[EI_MAG2] != ELFMAG2 || hdr->e_ident[EI_MAG3] != ELFMAG3 ||
+	    hdr->e_ident[EI_CLASS] != ELFCLASS64 || hdr->e_ident[EI_VERSION] != EV_CURRENT ||
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+	    hdr->e_ident[EI_DATA] != ELFDATA2LSB ||
+#else
+	    hdr->e_ident[EI_DATA] != ELFDATA2MSB ||
+#endif
+	    hdr->e_type != ET_REL || hdr->e_machine != EM_BPF || hdr->e_version != EV_CURRENT) {
+		printk("malformed inlinable kfuncs data: bad ELF header\n");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int inlinable_kfuncs_parse_sections(struct blob *blob, struct sh_elf_sections *s)
+{
+	Elf_Shdr *sh_strings = NULL;
+	Elf_Shdr *sec_hdrs = NULL;
+	Elf_Shdr *symtab = NULL;
+	Elf_Shdr *text = NULL;
+	Elf_Shdr *text_rel = NULL;
+	Elf_Shdr *shdr;
+	Elf_Ehdr *hdr;
+	Elf_Sym *sym;
+	u32 i, symtab_idx, text_idx;
+	const char *strings, *name;
+	int err;
+
+	hdr = get_inlinable_kfuncs_ptr(blob, 0, sizeof(*hdr), "ELF header");
+	if (!hdr)
+		return -EINVAL;
+	err = validate_inlinable_kfuncs_header(hdr);
+	if (err < 0)
+		return err;
+	sec_hdrs = get_inlinable_kfuncs_ptr(blob, hdr->e_shoff, 0, "section headers table");
+	if (!sec_hdrs)
+		return -EINVAL;
+	sh_strings = check_inlinable_kfuncs_ptr(blob, &sec_hdrs[hdr->e_shstrndx], sizeof(*s),
+					      "string table header");
+	if (!sh_strings)
+		return -EINVAL;
+	strings = get_inlinable_kfuncs_ptr(blob, sh_strings->sh_offset, sh_strings->sh_size,
+					 "strings table");
+	if (!strings)
+		return -EINVAL;
+	if (strings[sh_strings->sh_size - 1] != 0) {
+		printk("malformed inlinable kfuncs data: string table is not null terminated\n");
+		return -EINVAL;
+	}
+	if (!check_inlinable_kfuncs_ptr(blob, &sec_hdrs[hdr->e_shnum - 1], sizeof(*sec_hdrs),
+				      "section table"))
+		return -EINVAL;
+
+	for (i = 1; i < hdr->e_shnum; i++) {
+		shdr = &sec_hdrs[i];
+		name = strings + shdr->sh_name;
+		if (shdr->sh_name >= sh_strings->sh_size) {
+			printk("malformed inlinable kfuncs data: bad section #%u name\n", i);
+			return -EINVAL;
+		}
+		if (!get_inlinable_kfuncs_ptr(blob, shdr->sh_offset, shdr->sh_size,
+					    "section size/offset"))
+			return -EINVAL;
+		switch (shdr->sh_type) {
+		case SHT_NULL:
+		case SHT_STRTAB:
+			continue;
+		case SHT_SYMTAB:
+			symtab = shdr;
+			symtab_idx = i;
+			if (symtab->sh_entsize != sizeof(*sym)) {
+				printk("malformed inlinable kfuncs data: unexpected symtab->sh_entsize: %llu\n",
+				       symtab->sh_entsize);
+				return -EINVAL;
+			}
+			if (symtab->sh_size % sizeof(*sym) != 0) {
+				printk("malformed inlinable kfuncs data: unexpected symtab->sh_size: %llu\n",
+				       symtab->sh_size);
+				return -EINVAL;
+			}
+			break;
+		case SHT_PROGBITS:
+			if (strcmp(name, ".text") == 0) {
+				text = shdr;
+				text_idx = i;
+			} else if (strcmp(name, ".BTF") == 0 || strcmp(name, ".BTF.ext") == 0) {
+				/* ignore BTF for now */
+				break;
+			} else {
+				printk("malformed inlinable kfuncs data: unexpected section #%u name ('%s')\n",
+				       i, name);
+				return -EINVAL;
+			}
+			break;
+		case SHT_REL:
+			if (text_rel) {
+				printk("malformed inlinable kfuncs data: unexpected relocation section #%u name ('%s')\n",
+				       i, name);
+				return -EINVAL;
+			}
+			text_rel = shdr;
+			break;
+		default:
+			printk("malformed inlinable kfuncs data: unexpected section #%u type (0x%x)\n",
+			       i, shdr->sh_type);
+			return -EINVAL;
+		}
+	}
+	if (!symtab) {
+		printk("malformed inlinable kfuncs data: SHT_SYMTAB section is missing\n");
+		return -EINVAL;
+	}
+	if (!text) {
+		printk("malformed inlinable kfuncs data: .text section is missing\n");
+		return -EINVAL;
+	}
+	if (text_rel && (text_rel->sh_info != text_idx || text_rel->sh_link != symtab_idx)) {
+		printk("malformed inlinable kfuncs data: SHT_REL section #%u '%s' unexpected sh_link %u\n",
+		       i, name, shdr->sh_link);
+		return -EINVAL;
+	}
+	s->sym = blob->mem + symtab->sh_offset;
+	s->text = blob->mem + text->sh_offset;
+	s->strings = strings;
+	s->sym_cnt = symtab->sh_size / sizeof(*s->sym);
+	s->text_sz = text->sh_size;
+	s->strings_sz = sh_strings->sh_size;
+	s->text_idx = text_idx;
+	s->symtab_idx = symtab_idx;
+	if (text_rel) {
+		s->rel = blob->mem + text_rel->sh_offset;
+		s->rel_cnt = text_rel->sh_size / sizeof(*s->rel);
+	} else {
+		s->rel = NULL;
+		s->rel_cnt = 0;
+	}
+	return 0;
+}
+
+/* Replace call instructions with function relocations by kfunc call
+ * instructions, looking up corresponding kernel functions by name.
+ * Inserted kfunc calls might refer to any kernel functions,
+ * not necessarily those marked with __bpf_kfunc.
+ * Any other relocation kinds are not supported.
+ */
+static int inlinable_kfuncs_apply_relocs(struct sh_elf_sections *s, struct btf *btf)
+{
+	Elf_Rel *rel;
+	u32 i;
+
+	if (!s->rel)
+		return 0;
+
+	if (!btf) {
+		printk("inlinable_kfuncs_init: no vmlinux btf\n");
+		return -EINVAL;
+	}
+	printk("inlinable_kfuncs_init: relocations:\n");
+	for (rel = s->rel, i = 0; i < s->rel_cnt; i++, rel++) {
+		printk("inlinable_kfuncs_init:  tp=0x%llx, sym=0x%llx, off=0x%llx\n",
+		       ELF_R_TYPE(rel->r_info), ELF_R_SYM(rel->r_info), rel->r_offset);
+	}
+	for (rel = s->rel, i = 0; i < s->rel_cnt; i++, rel++) {
+		struct bpf_insn *rinsn;
+		const char *rname;
+		Elf_Sym *rsym;
+		u32 idx;
+		s32 id;
+
+		if (ELF_R_TYPE(rel->r_info) != R_BPF_64_32) {
+			printk("relocation #%u unexpected relocation type: %llu\n", i, ELF_R_TYPE(rel->r_info));
+			return -EINVAL;
+		}
+		idx = ELF_R_SYM(rel->r_info);
+		rsym = s->sym + idx;
+		if (idx >= s->sym_cnt) {
+			printk("relocation #%u symbol index out of bounds: %u\n", i, idx);
+			return -EINVAL;
+		}
+		if (rsym->st_name >= s->strings_sz) {
+			printk("relocation #%u symbol name out of bounds: %u\n", i, rsym->st_name);
+			return -EINVAL;
+		}
+		rname = s->strings + rsym->st_name;
+		if (rel->r_offset + sizeof(struct bpf_insn) >= s->text_sz ||
+		    rel->r_offset % sizeof(struct bpf_insn) != 0) {
+			printk("relocation #%u invalid offset: %llu\n", i, rel->r_offset);
+			return -EINVAL;
+		}
+		rinsn = s->text + rel->r_offset;
+		if (rinsn->code != (BPF_JMP | BPF_CALL) ||
+		    rinsn->src_reg != BPF_PSEUDO_CALL ||
+		    rinsn->dst_reg != 0 ||
+		    rinsn->off != 0 ||
+		    rinsn->imm != -1) {
+			printk("relocation #%u invalid instruction at offset %llu\n", i, rel->r_offset);
+			return -EINVAL;
+		}
+		id = btf_find_by_name_kind(btf, rname, BTF_KIND_FUNC);
+		if (id < 0) {
+			printk("relocation #%u can't resolve function '%s'\n", i, rname);
+			return -EINVAL;
+		}
+		rinsn->src_reg = BPF_PSEUDO_KFUNC_CALL;
+		rinsn->imm = id;
+		printk("inlinable_kfuncs_init: patching insn %ld, imm=%d, off=%d\n",
+		       rinsn - (struct bpf_insn *)s->text, rinsn->imm, rinsn->off);
+	}
+	return 0;
+}
+
+/* Fill 'inlinable_kfuncs' table with STT_FUNC symbols from symbol table
+ * of the ELF file pointed to by elf_bin.
+ * Do some sanity checks for ELF data structures,
+ * (but refrain from being overly paranoid, as this ELF is a part of kernel build).
+ */
+static int bpf_register_inlinable_kfuncs(void *elf_bin, u32 size)
+{
+	struct blob blob = { .mem = elf_bin, .size = size };
+	struct sh_elf_sections s;
+	struct btf *btf;
+	Elf_Sym *sym;
+	u32 i, idx;
+	int err;
+
+	btf = bpf_get_btf_vmlinux();
+	if (!btf)
+		return -EINVAL;
+
+	err = inlinable_kfuncs_parse_sections(&blob, &s);
+	if (err < 0)
+		return err;
+
+	err = inlinable_kfuncs_apply_relocs(&s, btf);
+	if (err < 0)
+		return err;
+
+	idx = 0;
+	for (sym = s.sym, i = 0; i < s.sym_cnt; i++, sym++) {
+		struct inlinable_kfunc *sh;
+		struct bpf_insn *insns;
+		const char *name;
+		u32 insn_num;
+		int id;
+
+		if (ELF_ST_TYPE(sym->st_info) != STT_FUNC)
+			continue;
+		if (ELF_ST_BIND(sym->st_info) != STB_GLOBAL ||
+		    sym->st_other != 0 ||
+		    sym->st_shndx != s.text_idx ||
+		    sym->st_size % sizeof(struct bpf_insn) != 0 ||
+		    sym->st_value % sizeof(struct bpf_insn) != 0 ||
+		    sym->st_name >= s.strings_sz) {
+			printk("malformed inlinable kfuncs data: bad symbol #%u\n", i);
+			return -EINVAL;
+		}
+		if (idx == ARRAY_SIZE(inlinable_kfuncs) - 1) {
+			printk("malformed inlinable kfuncs data: too many helper functions\n");
+			return -EINVAL;
+		}
+		insn_num = sym->st_size / sizeof(struct bpf_insn);
+		insns = s.text + sym->st_value;
+		name = s.strings + sym->st_name;
+		id = btf_find_by_name_kind(btf, name, BTF_KIND_FUNC);
+		if (id < 0) {
+			printk("can't add inlinable kfunc '%s': no btf_id\n", name);
+			return -EINVAL;
+		}
+		sh = &inlinable_kfuncs[idx++];
+		sh->insn_num = insn_num;
+		sh->insns = insns;
+		sh->name = name;
+		sh->btf_id = id;
+		printk("adding inlinable kfunc %s at 0x%llx, %u instructions, btf_id=%d\n",
+		       sh->name, sym->st_value, sh->insn_num, sh->btf_id);
+	}
+
+	return 0;
+}
+
+static int __init inlinable_kfuncs_init(void)
+{
+	return bpf_register_inlinable_kfuncs(&inlinable_kfuncs_data,
+					   &inlinable_kfuncs_data_end - &inlinable_kfuncs_data);
+}
+
+late_initcall(inlinable_kfuncs_init);
+
+static struct inlinable_kfunc *find_inlinable_kfunc(u32 btf_id)
+{
+	struct inlinable_kfunc *sh = inlinable_kfuncs;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(inlinable_kfuncs); ++i, ++sh)
+		if (sh->btf_id == btf_id)
+			return sh;
+	return NULL;
+}
+
+/* Given a kfunc call instruction, when kfunc is an inlinable kfunc,
+ * replace the call instruction with a body of said kfunc.
+ * Stack slots used within kfunc body become stack slots of with calling function,
+ * report extra stack used in 'stack_depth_extra'.
+ */
+static struct bpf_prog *inline_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
+					  int insn_idx, int *cnt, s32 stack_base, u16 *stack_depth_extra)
+{
+	struct inlinable_kfunc_regs_usage regs_usage;
+	const struct bpf_kfunc_desc *desc;
+	struct bpf_prog *new_prog;
+	struct bpf_insn *insn_buf;
+	struct inlinable_kfunc *sh;
+	int i, j, r, off, err, exit;
+	u32 subprog_insn_cnt;
+	u16 extra_slots;
+	s16 stack_off;
+	u32 insn_num;
+
+	desc = find_kfunc_desc(env->prog, insn->imm, insn->off);
+	if (!desc || IS_ERR(desc))
+		return ERR_PTR(-EFAULT);
+	sh = find_inlinable_kfunc(desc->func_id);
+	if (!sh)
+		return NULL;
+
+	subprog_insn_cnt = sh->insn_num;
+	scan_regs_usage(sh->insns, subprog_insn_cnt, &regs_usage);
+	if (regs_usage.r10_escapes) {
+		if (env->log.level & BPF_LOG_LEVEL2)
+			verbose(env, "can't inline kfunc %s at insn %d, r10 escapes\n",
+				sh->name, insn_idx);
+		return NULL;
+	}
+
+	extra_slots = 0;
+	for (i = BPF_REG_6; i <= BPF_REG_9; ++i)
+		if (regs_usage.used_regs_mask & BIT(i))
+			++extra_slots;
+	*stack_depth_extra = extra_slots * BPF_REG_SIZE + -regs_usage.lowest_stack_off;
+	insn_num = subprog_insn_cnt + extra_slots * 2;
+	insn_buf = kmalloc_array(insn_num, sizeof(*insn_buf), GFP_KERNEL);
+	if (!insn_buf)
+		return ERR_PTR(-ENOMEM);
+
+	if (env->log.level & BPF_LOG_LEVEL2)
+		verbose(env, "inlining kfunc %s at insn %d\n", sh->name, insn_idx);
+	memcpy(insn_buf + extra_slots, sh->insns, subprog_insn_cnt * sizeof(*insn_buf));
+	off = stack_base;
+	i = 0;
+	j = insn_num - 1;
+	/* Generate spill/fill pairs for callee saved registers used
+         * by kfunc before/after inlined function body.
+	 */
+	for (r = BPF_REG_6; r <= BPF_REG_9; ++r) {
+		if ((regs_usage.used_regs_mask & BIT(r)) == 0)
+			continue;
+		off -= BPF_REG_SIZE;
+		insn_buf[i++] = BPF_STX_MEM(BPF_DW, BPF_REG_10, r, off);
+		insn_buf[j--] = BPF_LDX_MEM(BPF_DW, r, BPF_REG_10, off);
+	}
+	exit = insn_idx + subprog_insn_cnt + extra_slots;
+	new_prog = bpf_patch_insn_data(env, insn_idx, insn_buf, insn_num);
+	kfree(insn_buf);
+	if (!new_prog)
+		return ERR_PTR(-ENOMEM);
+
+	stack_off = 0;
+	if (!regs_usage.r10_escapes && (regs_usage.used_regs_mask & BIT(BPF_REG_10)))
+		stack_off = off;
+
+	for (j = 0; j < subprog_insn_cnt; ++j) {
+		i = insn_idx + extra_slots + j;
+		insn = new_prog->insnsi + i;
+
+		/* Replace 'exit' with jump to inlined body end. */
+		if (insn->code == (BPF_JMP | BPF_EXIT)) {
+			off = exit - i - 1;
+			if (off < S16_MAX)
+				*insn = BPF_JMP_A(off);
+			else
+				*insn = BPF_JMP32_A(off);
+		}
+
+		/* Adjust offsets of r10-based load and store instructions
+		 * to use slots not used by calling function.
+		 */
+		switch (BPF_CLASS(insn->code)) {
+		case BPF_ST:
+		case BPF_STX:
+			if (insn->dst_reg == BPF_REG_10)
+				insn->off += stack_off;
+			break;
+		case BPF_LDX:
+			if (insn->src_reg == BPF_REG_10)
+				insn->off += stack_off;
+			break;
+		default:
+			break;
+		}
+
+		/* Make sure kernel function calls from within kfunc body could be jitted. */
+		if (bpf_pseudo_kfunc_call(insn)) {
+			err = add_kfunc_call(env, insn->imm, insn->off);
+			if (err < 0)
+				return ERR_PTR(err);
+		}
+	}
+
+	*cnt = insn_num;
+
+	return new_prog;
+}
+
+/* Do this after all stack depth adjustments */
+static int inline_kfunc_calls(struct bpf_verifier_env *env)
+{
+	struct bpf_prog *prog = env->prog;
+	struct bpf_insn *insn = prog->insnsi;
+	const int insn_cnt = prog->len;
+	struct bpf_prog *new_prog;
+	int i, cnt, delta = 0, cur_subprog = 0;
+	struct bpf_subprog_info *subprogs = env->subprog_info;
+	u16 stack_depth = subprogs[cur_subprog].stack_depth;
+	u16 call_extra_stack = 0, subprog_extra_stack = 0;
+
+	for (i = 0; i < insn_cnt;) {
+		if (!bpf_pseudo_kfunc_call(insn))
+			goto next_insn;
+
+		new_prog = inline_kfunc_call(env, insn, i + delta, &cnt,
+					     -stack_depth, &call_extra_stack);
+		if (IS_ERR(new_prog))
+			return PTR_ERR(new_prog);
+		if (!new_prog)
+			goto next_insn;
+
+		subprog_extra_stack = max(subprog_extra_stack, call_extra_stack);
+		delta	 += cnt - 1;
+		env->prog = prog = new_prog;
+		insn	  = new_prog->insnsi + i + delta;
+
+next_insn:
+		if (subprogs[cur_subprog + 1].start == i + delta + 1) {
+			subprogs[cur_subprog].stack_depth += subprog_extra_stack;
+			cur_subprog++;
+			stack_depth = subprogs[cur_subprog].stack_depth;
+			subprog_extra_stack = 0;
+		}
+		i++;
+		insn++;
+	}
+
+	env->prog->aux->stack_depth = subprogs[0].stack_depth;
+
+	return 0;
+}
+
+/* Prepare kfunc calls for jiting:
+ * - by replacing insn->imm fields with offsets to real functions;
+ * - or by replacing calls to certain kfuncs using hard-coded templates;
+ * - or by replacing calls to inlinable kfuncs by kfunc bodies.
+ */
+static int resolve_kfunc_calls(struct bpf_verifier_env *env)
+{
+	struct bpf_prog *prog = env->prog;
+	struct bpf_insn *insn = prog->insnsi;
+	struct bpf_insn *insn_buf = env->insn_buf;
+	const int insn_cnt = prog->len;
+	struct bpf_prog *new_prog;
+	int i, ret, cnt, delta = 0;
+
+	for (i = 0; i < insn_cnt;) {
+		if (!bpf_pseudo_kfunc_call(insn))
+			goto next_insn;
+
+		ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
+		if (ret)
+			return ret;
+		if (cnt == 0)
+			goto next_insn;
+
+		new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+		if (!new_prog)
+			return -ENOMEM;
+
+		delta	 += cnt - 1;
+		env->prog = prog = new_prog;
+		insn	  = new_prog->insnsi + i + delta;
+		goto next_insn;
+
+next_insn:
+		i++;
+		insn++;
+	}
+
+	sort_kfunc_descs_by_imm_off(env->prog);
+
+	return 0;
+}
+
 /* The function requires that first instruction in 'patch' is insnsi[prog->len - 1] */
 static int add_hidden_subprog(struct bpf_verifier_env *env, struct bpf_insn *patch, int len)
 {
@@ -20848,22 +21464,8 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			goto next_insn;
 		if (insn->src_reg == BPF_PSEUDO_CALL)
 			goto next_insn;
-		if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
-			ret = fixup_kfunc_call(env, insn, insn_buf, i + delta, &cnt);
-			if (ret)
-				return ret;
-			if (cnt == 0)
-				goto next_insn;
-
-			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
-			if (!new_prog)
-				return -ENOMEM;
-
-			delta	 += cnt - 1;
-			env->prog = prog = new_prog;
-			insn	  = new_prog->insnsi + i + delta;
+		if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL)
 			goto next_insn;
-		}
 
 		/* Skip inlining the helper call if the JIT does it. */
 		if (bpf_jit_inlines_helper_call(insn->imm))
@@ -21380,6 +21982,15 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 		WARN_ON(adjust_jmp_off(env->prog, subprog_start, 1));
 	}
 
+	return 0;
+}
+
+static int notify_map_poke_trackers(struct bpf_verifier_env *env)
+{
+	struct bpf_prog *prog = env->prog;
+	struct bpf_map *map_ptr;
+	int i, ret;
+
 	/* Since poke tab is now finalized, publish aux to tracker. */
 	for (i = 0; i < prog->aux->size_poke_tab; i++) {
 		map_ptr = prog->aux->poke_tab[i].tail_call.map;
@@ -21397,8 +22008,6 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 		}
 	}
 
-	sort_kfunc_descs_by_imm_off(env->prog);
-
 	return 0;
 }
 
@@ -22558,6 +23167,15 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (ret == 0)
 		ret = do_misc_fixups(env);
 
+	if (ret == 0)
+		ret = inline_kfunc_calls(env);
+
+	if (ret == 0)
+		ret = resolve_kfunc_calls(env);
+
+	if (ret == 0)
+		ret = notify_map_poke_trackers(env);
+
 	/* do 32-bit optimization after insn patching has done so those patched
 	 * insns could be handled correctly.
 	 */
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 04/11] bpf: allow specifying inlinable kfuncs in modules
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
                   ` (2 preceding siblings ...)
  2024-11-07 17:50 ` [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 05/11] bpf: dynamic allocation for bpf_verifier_env->subprog_info Eduard Zingerman
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

Instead of keeping inlinable kfuncs in a single table,
track them as a list of tables, one table per module providing
such kfuncs. Protect the list with an rwlock to safely deal with
module load/unload.

Provide two interface functions for use in modules:
- int bpf_register_inlinable_kfuncs(void *elf_bin, u32 size,
                                    struct module *module);
- void bpf_unregister_inlinable_kfuncs(struct module *module);

The assumption is that each module providing inlinable kfuncs would
embed the BPF ELF in the same way verifier.c does and call
bpf_register_inlinable_kfuncs() at load time.
bpf_unregister_inlinable_kfuncs() is supposed to be called before
module unload.
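
A minimal usage sketch for a hypothetical module embedding its own BPF
ELF (the data symbol names below are made up, mirroring what
verifier.c does; passing THIS_MODULE is an assumption based on the new
'struct module *' parameter):

    extern void my_inlinable_kfuncs_data;
    extern void my_inlinable_kfuncs_data_end;

    static int __init my_mod_init(void)
    {
            return bpf_register_inlinable_kfuncs(
                            &my_inlinable_kfuncs_data,
                            &my_inlinable_kfuncs_data_end -
                            &my_inlinable_kfuncs_data,
                            THIS_MODULE);
    }

    static void __exit my_mod_exit(void)
    {
            bpf_unregister_inlinable_kfuncs(THIS_MODULE);
    }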

Each table holds a pointer to the module's btf; a refcount on the btf
object is not held by the table. Lifetime management of the module's
btf object relies on:
- correct usage of bpf_register_inlinable_kfuncs /
  bpf_unregister_inlinable_kfuncs to insert/remove tables to/from the
  list, so that tables are members of the list only while the module
  is loaded;
- add_kfunc_call(), so that kfuncs referenced from inlined bodies are
  not unloaded when an inlinable kfunc is inlined.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf.h   |   3 +
 include/linux/btf.h   |   7 +++
 kernel/bpf/btf.c      |   2 +-
 kernel/bpf/verifier.c | 129 +++++++++++++++++++++++++++++++++++-------
 4 files changed, 119 insertions(+), 22 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 1b84613b10ac..2bcc9161687b 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -3501,4 +3501,7 @@ static inline bool bpf_prog_is_raw_tp(const struct bpf_prog *prog)
 	       prog->expected_attach_type == BPF_TRACE_RAW_TP;
 }
 
+void bpf_unregister_inlinable_kfuncs(struct module *module);
+int bpf_register_inlinable_kfuncs(void *elf_bin, u32 size, struct module *module);
+
 #endif /* _LINUX_BPF_H */
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 4214e76c9168..9e8b27493139 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -582,6 +582,7 @@ int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type prog_ty
 bool btf_types_are_same(const struct btf *btf1, u32 id1,
 			const struct btf *btf2, u32 id2);
 int btf_check_iter_arg(struct btf *btf, const struct btf_type *func, int arg_idx);
+struct btf *btf_get_module_btf(const struct module *module);
 
 static inline bool btf_type_is_struct_ptr(struct btf *btf, const struct btf_type *t)
 {
@@ -670,5 +671,11 @@ static inline int btf_check_iter_arg(struct btf *btf, const struct btf_type *fun
 {
 	return -EOPNOTSUPP;
 }
+struct btf *btf_get_module_btf(const struct module *module)
+{
+	return NULL;
+}
+
 #endif
+
 #endif
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index e7a59e6462a9..49b25882fa58 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -8033,7 +8033,7 @@ struct module *btf_try_get_module(const struct btf *btf)
 /* Returns struct btf corresponding to the struct module.
  * This function can return NULL or ERR_PTR.
  */
-static struct btf *btf_get_module_btf(const struct module *module)
+struct btf *btf_get_module_btf(const struct module *module)
 {
 #ifdef CONFIG_DEBUG_INFO_BTF_MODULES
 	struct btf_module *btf_mod, *tmp;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index fbf51147f319..b86308896358 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20534,7 +20534,20 @@ struct inlinable_kfunc {
 	u32 btf_id;
 };
 
-static struct inlinable_kfunc inlinable_kfuncs[16];
+/* Represents inlinable kfuncs provided by a module */
+struct inlinable_kfuncs_block {
+	struct list_head list;
+	/* Module or kernel BTF, refcount not taken. Relies on correct
+         * calls to bpf_[un]register_inlinable_kfuncs to guarantee BTF
+         * object lifetime.
+	 */
+	const struct btf *btf;
+	struct inlinable_kfunc funcs[16];
+};
+
+/* List of inlinable_kfuncs_block objects. */
+static struct list_head inlinable_kfuncs = LIST_HEAD_INIT(inlinable_kfuncs);
+static DEFINE_RWLOCK(inlinable_kfuncs_lock);
 
 static void *check_inlinable_kfuncs_ptr(struct blob *blob,
 				      void *ptr, u64 size, const char *context)
@@ -20710,8 +20723,10 @@ static int inlinable_kfuncs_parse_sections(struct blob *blob, struct sh_elf_sect
 			if (strcmp(name, ".text") == 0) {
 				text = shdr;
 				text_idx = i;
-			} else if (strcmp(name, ".BTF") == 0 || strcmp(name, ".BTF.ext") == 0) {
-				/* ignore BTF for now */
+			} else if (strcmp(name, ".BTF") == 0 ||
+				   strcmp(name, ".BTF.ext") == 0 ||
+				   strcmp(name, ".modinfo") == 0) {
+				/* ignore */
 				break;
 			} else {
 				printk("malformed inlinable kfuncs data: unexpected section #%u name ('%s')\n",
@@ -20841,27 +20856,35 @@ static int inlinable_kfuncs_apply_relocs(struct sh_elf_sections *s, struct btf *
  * Do some sanity checks for ELF data structures,
  * (but refrain from being overly paranoid, as this ELF is a part of kernel build).
  */
-static int bpf_register_inlinable_kfuncs(void *elf_bin, u32 size)
+int bpf_register_inlinable_kfuncs(void *elf_bin, u32 size, struct module *module)
 {
 	struct blob blob = { .mem = elf_bin, .size = size };
+	struct inlinable_kfuncs_block *block = NULL;
 	struct sh_elf_sections s;
 	struct btf *btf;
 	Elf_Sym *sym;
 	u32 i, idx;
 	int err;
 
-	btf = bpf_get_btf_vmlinux();
+	btf = btf_get_module_btf(module);
 	if (!btf)
 		return -EINVAL;
 
 	err = inlinable_kfuncs_parse_sections(&blob, &s);
 	if (err < 0)
-		return err;
+		goto err_out;
 
 	err = inlinable_kfuncs_apply_relocs(&s, btf);
 	if (err < 0)
-		return err;
+		goto err_out;
+
+	block = kvmalloc(sizeof(*block), GFP_KERNEL | __GFP_ZERO);
+	if (!block) {
+		err = -ENOMEM;
+		goto err_out;
+	}
 
+	block->btf = btf;
 	idx = 0;
 	for (sym = s.sym, i = 0; i < s.sym_cnt; i++, sym++) {
 		struct inlinable_kfunc *sh;
@@ -20879,11 +20902,13 @@ static int bpf_register_inlinable_kfuncs(void *elf_bin, u32 size)
 		    sym->st_value % sizeof(struct bpf_insn) != 0 ||
 		    sym->st_name >= s.strings_sz) {
 			printk("malformed inlinable kfuncs data: bad symbol #%u\n", i);
-			return -EINVAL;
+			err = -EINVAL;
+			goto err_out;
 		}
-		if (idx == ARRAY_SIZE(inlinable_kfuncs) - 1) {
+		if (idx == ARRAY_SIZE(block->funcs) - 1) {
 			printk("malformed inlinable kfuncs data: too many helper functions\n");
-			return -EINVAL;
+			err = -EINVAL;
+			goto err_out;
 		}
 		insn_num = sym->st_size / sizeof(struct bpf_insn);
 		insns = s.text + sym->st_value;
@@ -20891,9 +20916,10 @@ static int bpf_register_inlinable_kfuncs(void *elf_bin, u32 size)
 		id = btf_find_by_name_kind(btf, name, BTF_KIND_FUNC);
 		if (id < 0) {
 			printk("can't add inlinable kfunc '%s': no btf_id\n", name);
-			return -EINVAL;
+			err = -EINVAL;
+			goto err_out;
 		}
-		sh = &inlinable_kfuncs[idx++];
+		sh = &block->funcs[idx++];
 		sh->insn_num = insn_num;
 		sh->insns = insns;
 		sh->name = name;
@@ -20902,25 +20928,86 @@ static int bpf_register_inlinable_kfuncs(void *elf_bin, u32 size)
 		       sh->name, sym->st_value, sh->insn_num, sh->btf_id);
 	}
 
+	write_lock(&inlinable_kfuncs_lock);
+	list_add(&block->list, &inlinable_kfuncs);
+	write_unlock(&inlinable_kfuncs_lock);
+	if (module)
+		btf_put(btf);
 	return 0;
+
+err_out:
+	kvfree(block);
+	if (module)
+		btf_put(btf);
+	return err;
+}
+EXPORT_SYMBOL_GPL(bpf_register_inlinable_kfuncs);
+
+void bpf_unregister_inlinable_kfuncs(struct module *module)
+{
+	struct inlinable_kfuncs_block *block;
+	struct list_head *pos;
+	struct btf *btf;
+
+	btf = btf_get_module_btf(module);
+	if (!btf)
+		return;
+
+	write_lock(&inlinable_kfuncs_lock);
+	list_for_each(pos, &inlinable_kfuncs) {
+		if (pos == &inlinable_kfuncs)
+			continue;
+		block = container_of(pos, typeof(*block), list);
+		if (block->btf != btf)
+			continue;
+		list_del(&block->list);
+		kvfree(block);
+		break;
+	}
+	write_unlock(&inlinable_kfuncs_lock);
+	btf_put(btf);
 }
+EXPORT_SYMBOL_GPL(bpf_unregister_inlinable_kfuncs);
 
 static int __init inlinable_kfuncs_init(void)
 {
 	return bpf_register_inlinable_kfuncs(&inlinable_kfuncs_data,
-					   &inlinable_kfuncs_data_end - &inlinable_kfuncs_data);
+					     &inlinable_kfuncs_data_end - &inlinable_kfuncs_data,
+					     NULL);
 }
 
 late_initcall(inlinable_kfuncs_init);
 
-static struct inlinable_kfunc *find_inlinable_kfunc(u32 btf_id)
+/* If a program refers to a kfunc from some module, this module is guaranteed
+ * to not be unloaded for the duration of program verification.
+ * Hence there is no need to protect returned 'inlinable_kfunc' instance,
+ * as long as find_inlinable_kfunc() is called during program verification.
+ * However, read_lock(&inlinable_kfuncs_lock) for the duration of this
+ * function is necessary in case some unrelated modules are loaded/unloaded
+ * concurrently to find_inlinable_kfunc() call.
+ */
+static struct inlinable_kfunc *find_inlinable_kfunc(struct btf *btf, u32 btf_id)
 {
-	struct inlinable_kfunc *sh = inlinable_kfuncs;
+	struct inlinable_kfuncs_block *block;
+	struct inlinable_kfunc *sh;
+	struct list_head *pos;
 	int i;
 
-	for (i = 0; i < ARRAY_SIZE(inlinable_kfuncs); ++i, ++sh)
-		if (sh->btf_id == btf_id)
+	read_lock(&inlinable_kfuncs_lock);
+	list_for_each(pos, &inlinable_kfuncs) {
+		block = container_of(pos, typeof(*block), list);
+		if (block->btf != btf)
+			continue;
+		sh = block->funcs;
+		for (i = 0; i < ARRAY_SIZE(block->funcs); ++i, ++sh) {
+			if (sh->btf_id != btf_id)
+				continue;
+			read_unlock(&inlinable_kfuncs_lock);
 			return sh;
+		}
+		break;
+	}
+	read_unlock(&inlinable_kfuncs_lock);
 	return NULL;
 }
 
@@ -20933,7 +21020,7 @@ static struct bpf_prog *inline_kfunc_call(struct bpf_verifier_env *env, struct b
 					  int insn_idx, int *cnt, s32 stack_base, u16 *stack_depth_extra)
 {
 	struct inlinable_kfunc_regs_usage regs_usage;
-	const struct bpf_kfunc_desc *desc;
+	struct bpf_kfunc_call_arg_meta meta;
 	struct bpf_prog *new_prog;
 	struct bpf_insn *insn_buf;
 	struct inlinable_kfunc *sh;
@@ -20943,10 +21030,10 @@ static struct bpf_prog *inline_kfunc_call(struct bpf_verifier_env *env, struct b
 	s16 stack_off;
 	u32 insn_num;
 
-	desc = find_kfunc_desc(env->prog, insn->imm, insn->off);
-	if (!desc || IS_ERR(desc))
+	err = fetch_kfunc_meta(env, insn, &meta, NULL);
+	if (err < 0)
 		return ERR_PTR(-EFAULT);
-	sh = find_inlinable_kfunc(desc->func_id);
+	sh = find_inlinable_kfunc(meta.btf, meta.func_id);
 	if (!sh)
 		return NULL;
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 05/11] bpf: dynamic allocation for bpf_verifier_env->subprog_info
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
                   ` (3 preceding siblings ...)
  2024-11-07 17:50 ` [RFC bpf-next 04/11] bpf: allow specifying inlinable kfuncs in modules Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 06/11] bpf: KERNEL_VALUE register type Eduard Zingerman
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

Follow-up patches use add_hidden_subprog() to inject inlinable kfunc
bodies into a bpf program as subprograms. At the moment only one hidden
subprogram is allowed, as bpf_verifier_env->subprog_info is allocated
in advance as an array of fixed size. This patch removes that
limitation by allocating the array dynamically.
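
For illustration only (not code from this patch), the growth policy
can be sketched in plain C, with realloc() standing in for the
kernel's vrealloc() and one spare slot always kept for the fake
subprog at the end of the array:

    #include <errno.h>
    #include <stdlib.h>
    #include <string.h>

    /* Double 'cap' once 'cnt' reaches it; keep one spare slot at the end
     * (the fake subprog). Only the newly added slots are zeroed.
     */
    static int ensure_capacity(void **arr, size_t elem_size,
                               unsigned int *cap, unsigned int cnt)
    {
            unsigned int new_cap;
            void *tmp;

            if (cnt < *cap)
                    return 0;
            new_cap = *cap ? *cap * 2 : 1;
            tmp = realloc(*arr, (new_cap + 1) * elem_size);
            if (!tmp)
                    return -ENOMEM;
            memset((char *)tmp + (*cap + 1) * elem_size, 0,
                   (new_cap - *cap) * elem_size);
            *arr = tmp;
            *cap = new_cap;
            return 0;
    }

The patch applies the same policy in add_hidden_subprog(), so any
number of kfunc instances can later be added as hidden subprograms.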

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h |  3 ++-
 kernel/bpf/verifier.c        | 29 ++++++++++++++++++++++-------
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index ed4eacfd4db7..b683dc3ede4a 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -738,7 +738,7 @@ struct bpf_verifier_env {
 	struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
 	const struct bpf_line_info *prev_linfo;
 	struct bpf_verifier_log log;
-	struct bpf_subprog_info subprog_info[BPF_MAX_SUBPROGS + 2]; /* max + 2 for the fake and exception subprogs */
+	struct bpf_subprog_info *subprog_info;
 	union {
 		struct bpf_idmap idmap_scratch;
 		struct bpf_idset idset_scratch;
@@ -751,6 +751,7 @@ struct bpf_verifier_env {
 	struct backtrack_state bt;
 	struct bpf_jmp_history_entry *cur_hist_ent;
 	u32 pass_cnt; /* number of times do_check() was called */
+	u32 subprog_cap;
 	u32 subprog_cnt;
 	/* number of instructions analyzed by the verifier */
 	u32 prev_insn_processed, insn_processed;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b86308896358..d4ea7fd8a967 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -19419,7 +19419,7 @@ static int adjust_jmp_off(struct bpf_prog *prog, u32 tgt_idx, u32 delta)
 static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
 					      u32 off, u32 cnt)
 {
-	int i, j;
+	int i, j, first_hidden = env->subprog_cnt - env->hidden_subprog_cnt;
 
 	/* find first prog starting at or after off (first to remove) */
 	for (i = 0; i < env->subprog_cnt; i++)
@@ -19446,6 +19446,8 @@ static int adjust_subprog_starts_after_remove(struct bpf_verifier_env *env,
 			env->subprog_info + j,
 			sizeof(*env->subprog_info) * move);
 		env->subprog_cnt -= j - i;
+		if (first_hidden <= j - 1)
+			env->hidden_subprog_cnt -= j - first_hidden;
 
 		/* remove func_info */
 		if (aux->func_info) {
@@ -21215,15 +21217,20 @@ static int resolve_kfunc_calls(struct bpf_verifier_env *env)
 /* The function requires that first instruction in 'patch' is insnsi[prog->len - 1] */
 static int add_hidden_subprog(struct bpf_verifier_env *env, struct bpf_insn *patch, int len)
 {
-	struct bpf_subprog_info *info = env->subprog_info;
+	struct bpf_subprog_info *info, *tmp;
 	int cnt = env->subprog_cnt;
 	struct bpf_prog *prog;
 
-	/* We only reserve one slot for hidden subprogs in subprog_info. */
-	if (env->hidden_subprog_cnt) {
-		verbose(env, "verifier internal error: only one hidden subprog supported\n");
-		return -EFAULT;
+	if (cnt == env->subprog_cap) {
+		env->subprog_cap *= 2;
+		tmp = vrealloc(env->subprog_info,
+			       array_size(sizeof(*env->subprog_info), env->subprog_cap + 1),
+			       GFP_KERNEL | __GFP_ZERO);
+		if (!tmp)
+			return -ENOMEM;
+		env->subprog_info = tmp;
 	}
+	info = env->subprog_info;
 	/* We're not patching any existing instruction, just appending the new
 	 * ones for the hidden subprog. Hence all of the adjustment operations
 	 * in bpf_patch_insn_data are no-ops.
@@ -23122,6 +23129,13 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	ret = -ENOMEM;
 	if (!env->insn_aux_data)
 		goto err_free_env;
+	env->subprog_cap = BPF_MAX_SUBPROGS;
+	env->subprog_info = vzalloc(array_size(sizeof(*env->subprog_info),
+					       env->subprog_cap + 1 /* max + 1 for the fake subprog */));
+	if (!env->subprog_info) {
+		ret = -ENOMEM;
+		goto err_free_env;
+	}
 	for (i = 0; i < len; i++)
 		env->insn_aux_data[i].orig_idx = i;
 	env->prog = *prog;
@@ -23353,8 +23367,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 err_unlock:
 	if (!is_priv)
 		mutex_unlock(&bpf_verifier_lock);
-	vfree(env->insn_aux_data);
 err_free_env:
+	vfree(env->subprog_info);
+	vfree(env->insn_aux_data);
 	kvfree(env);
 	return ret;
 }
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 06/11] bpf: KERNEL_VALUE register type
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
                   ` (4 preceding siblings ...)
  2024-11-07 17:50 ` [RFC bpf-next 05/11] bpf: dynamic allocation for bpf_verifier_env->subprog_info Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 07/11] bpf: instantiate inlinable kfuncs before verification Eduard Zingerman
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

A follow-up patch adds inlinable kfuncs as subprograms of the bpf
program being verified. The bodies of these kfuncs are considered
trusted and don't require verification. To facilitate this, add a new
register type: KERNEL_VALUE.
- ALU operations on KERNEL_VALUEs return KERNEL_VALUE;
- stores with KERNEL_VALUE destination registers are legal;
- loads with KERNEL_VALUE source registers are legal and
  set destination registers to KERNEL_VALUE;
- KERNEL_VALUEs do not have any additional associated information:
  no ids, no range, etc.
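
For illustration only (hypothetical function, not part of this
patch), the rules above play out as follows for a naked BPF body in
the style used by later selftest patches, assuming r1 has already
been marked as KERNEL_VALUE by the caller:

    __attribute__((naked))
    int example_kernel_value_flow(void *p)
    {
            asm volatile (
                    "r2 = *(u64 *)(r1 + 0);" /* load via KERNEL_VALUE base: r2 becomes KERNEL_VALUE */
                    "r2 += 42;"              /* ALU with a KERNEL_VALUE operand: r2 stays KERNEL_VALUE */
                    "*(u64 *)(r1 + 8) = r2;" /* store with a KERNEL_VALUE base register is accepted */
                    "r0 = 0;"                /* the return value itself is an ordinary scalar */
                    "exit;"
            );
    }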

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/log.c      |  1 +
 kernel/bpf/verifier.c | 24 +++++++++++++++++++++++-
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 2bcc9161687b..75f57f791cd3 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -941,6 +941,7 @@ enum bpf_reg_type {
 	PTR_TO_BUF,		 /* reg points to a read/write buffer */
 	PTR_TO_FUNC,		 /* reg points to a bpf program function */
 	CONST_PTR_TO_DYNPTR,	 /* reg points to a const struct bpf_dynptr */
+	KERNEL_VALUE,		 /* pointer or scalar, any operation produces another KERNEL_VALUE */
 	__BPF_REG_TYPE_MAX,
 
 	/* Extended reg_types. */
diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index 4a858fdb6476..87ab01b8fc1a 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -463,6 +463,7 @@ const char *reg_type_str(struct bpf_verifier_env *env, enum bpf_reg_type type)
 		[PTR_TO_FUNC]		= "func",
 		[PTR_TO_MAP_KEY]	= "map_key",
 		[CONST_PTR_TO_DYNPTR]	= "dynptr_ptr",
+		[KERNEL_VALUE]		= "kval",
 	};
 
 	if (type & PTR_MAYBE_NULL) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d4ea7fd8a967..f38f73cc740b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2388,6 +2388,12 @@ static void mark_reg_unknown(struct bpf_verifier_env *env,
 	__mark_reg_unknown(env, regs + regno);
 }
 
+static void mark_reg_kernel_value(struct bpf_reg_state *reg)
+{
+	__mark_reg_unknown_imprecise(reg);
+	reg->type = KERNEL_VALUE;
+}
+
 static int __mark_reg_s32_range(struct bpf_verifier_env *env,
 				struct bpf_reg_state *regs,
 				u32 regno,
@@ -4534,6 +4540,9 @@ static bool __is_pointer_value(bool allow_ptr_leaks,
 	if (allow_ptr_leaks)
 		return false;
 
+	if (reg->type == KERNEL_VALUE)
+		return false;
+
 	return reg->type != SCALAR_VALUE;
 }
 
@@ -7208,6 +7217,9 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 	} else if (reg->type == PTR_TO_ARENA) {
 		if (t == BPF_READ && value_regno >= 0)
 			mark_reg_unknown(env, regs, value_regno);
+	} else if (reg->type == KERNEL_VALUE) {
+		if (t == BPF_READ && value_regno >= 0)
+			mark_reg_kernel_value(regs + value_regno);
 	} else {
 		verbose(env, "R%d invalid mem access '%s'\n", regno,
 			reg_type_str(env, reg->type));
@@ -14319,6 +14331,13 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env,
 
 	if (BPF_SRC(insn->code) == BPF_X) {
 		src_reg = &regs[insn->src_reg];
+
+		if (src_reg->type == KERNEL_VALUE || dst_reg->type == KERNEL_VALUE) {
+			mark_reg_kernel_value(src_reg);
+			mark_reg_kernel_value(dst_reg);
+			return 0;
+		}
+
 		if (src_reg->type != SCALAR_VALUE) {
 			if (dst_reg->type != SCALAR_VALUE) {
 				/* Combining two pointers by any ALU op yields
@@ -14358,6 +14377,9 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env,
 				return err;
 		}
 	} else {
+		if (dst_reg->type == KERNEL_VALUE)
+			return 0;
+
 		/* Pretend the src is a reg with a known value, since we only
 		 * need to be able to read from this state.
 		 */
@@ -15976,7 +15998,7 @@ static int check_return_code(struct bpf_verifier_env *env, int regno, const char
 	}
 
 	if (is_subprog && !frame->in_exception_callback_fn) {
-		if (reg->type != SCALAR_VALUE) {
+		if (reg->type != SCALAR_VALUE && reg->type != KERNEL_VALUE) {
 			verbose(env, "At subprogram exit the register R%d is not a scalar value (%s)\n",
 				regno, reg_type_str(env, reg->type));
 			return -EINVAL;
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 07/11] bpf: instantiate inlinable kfuncs before verification
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
                   ` (5 preceding siblings ...)
  2024-11-07 17:50 ` [RFC bpf-next 06/11] bpf: KERNEL_VALUE register type Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 08/11] bpf: special rules for kernel function calls inside inlinable kfuncs Eduard Zingerman
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

In order to detect which branches inside kfuncs are always true or
always false (e.g. because some parameters are constants), visit the
bodies of inlinable kfuncs during the main verification pass. Such
conditionals are replaced by gotos before JITing. Verify kfunc calls
in an (almost) identical way, regardless of whether the kfunc is
inlinable or not.

To facilitate this:
- before main verification pass, for each inlinable kfunc callsite,
  create a hidden subprogram with a body being a copy of inlinable
  kfunc body; let's call such subprograms inlinable kfunc instances;
- during main verification pass:
  - when visiting inlinable kfunc call:
    - verify call instruction using regular kfunc rules;
    - mark scalar call parameters as precise;
    - do a push_stack() for a state visiting kfunc instance body in a
      distilled context:
      - no caller stack frame;
      - values of scalar parameters and null pointers from the calling
        frame are copied as-is (it is ok if some of the scalar
        parameters are marked precise during instance body
        verification, as these parameters are already marked precise
        at the call site);
      - dynptrs are represented as two register spills in a fake stack
        frame crafted so that the verifier can recover the dynptr type
        when the code generated for bpf_dynptr_get_type() is processed;
      - all other parameters are passed as registers of type
        KERNEL_VALUE.
  - when an 'exit' instruction within an instance body is verified, do
    not return to the callsite; treat it as a valid end of the
    verification path;
- after main verification pass:
  - rely on existing passes opt_hard_wire_dead_code_branches() and
    opt_remove_dead_code() to simplify kfunc instance bodies;
  - adjust inline_kfunc_calls() to copy kfunc instance bodies at the
    corresponding call sites;
  - remove kfunc instance bodies before jit.

Note that the steps taken during the main verification pass make kfunc
instance bodies "hermetic": verification of these bodies does not
change the state of the call site.
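
As an illustration only (the kfunc and program below are hypothetical
and not part of this series), a callsite passing a constant scalar
lets the verifier prune the conditional inside the corresponding
instance body before that body is copied back to the callsite:

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    /* Hypothetical inlinable kfunc that branches on 'flags'. */
    extern int bpf_example_kfunc(int flags) __ksym;

    SEC("socket")
    int example_prog(void *ctx)
    {
            /* 'flags' is the constant 1 at this callsite, so inside the
             * kfunc instance body the branch on flags is proved to go a
             * single way and is removed by
             * opt_hard_wire_dead_code_branches() / opt_remove_dead_code()
             * before inline_kfunc_calls() copies the body back here.
             */
            return bpf_example_kfunc(1);
    }

    char _license[] SEC("license") = "GPL";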

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h |   5 +
 kernel/bpf/verifier.c        | 346 +++++++++++++++++++++++++++++++----
 2 files changed, 317 insertions(+), 34 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index b683dc3ede4a..2de3536e4133 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -560,6 +560,8 @@ struct bpf_insn_aux_data {
 		 * the state of the relevant registers to make decision about inlining
 		 */
 		struct bpf_loop_inline_state loop_inline_state;
+		/* for kfunc calls, instance of the inlinable kfunc instance subprog */
+		u32 kfunc_instance_subprog;
 	};
 	union {
 		/* remember the size of type passed to bpf_obj_new to rewrite R1 */
@@ -722,6 +724,8 @@ struct bpf_verifier_env {
 	u32 used_btf_cnt;		/* number of used BTF objects */
 	u32 id_gen;			/* used to generate unique reg IDs */
 	u32 hidden_subprog_cnt;		/* number of hidden subprogs */
+	u32 first_kfunc_instance;	/* first inlinable kfunc instance subprog number */
+	u32 last_kfunc_instance;	/* last inlinable kfunc instance subprog number */
 	int exception_callback_subprog;
 	bool explore_alu_limits;
 	bool allow_ptr_leaks;
@@ -785,6 +789,7 @@ struct bpf_verifier_env {
 	 * e.g., in reg_type_str() to generate reg_type string
 	 */
 	char tmp_str_buf[TMP_STR_BUF_LEN];
+	char tmp_str_buf2[TMP_STR_BUF_LEN];
 	struct bpf_insn insn_buf[INSN_BUF_SIZE];
 	struct bpf_insn epilogue_buf[INSN_BUF_SIZE];
 };
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f38f73cc740b..87b6cc8c94f8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3420,6 +3420,25 @@ static bool is_jmp_point(struct bpf_verifier_env *env, int insn_idx)
 	return env->insn_aux_data[insn_idx].jmp_point;
 }
 
+static bool is_inlinable_kfunc_call(struct bpf_verifier_env *env, int idx)
+{
+	struct bpf_insn_aux_data *aux = &env->insn_aux_data[idx];
+	struct bpf_insn *insn = &env->prog->insnsi[idx];
+
+	return bpf_pseudo_kfunc_call(insn) &&
+	       aux->kfunc_instance_subprog != 0;
+}
+
+static int inlinable_kfunc_instance_start(struct bpf_verifier_env *env, int idx)
+{
+	struct bpf_insn_aux_data *aux = &env->insn_aux_data[idx];
+	int subprog = aux->kfunc_instance_subprog;
+
+	if (!subprog)
+		return -1;
+	return env->subprog_info[subprog].start;
+}
+
 #define LR_FRAMENO_BITS	3
 #define LR_SPI_BITS	6
 #define LR_ENTRY_BITS	(LR_SPI_BITS + LR_FRAMENO_BITS + 1)
@@ -3906,13 +3925,20 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 		if (class == BPF_STX)
 			bt_set_reg(bt, sreg);
 	} else if (class == BPF_JMP || class == BPF_JMP32) {
-		if (bpf_pseudo_call(insn)) {
+		bool return_from_inlinable_kfunc_call =
+			is_inlinable_kfunc_call(env, idx) && subseq_idx == inlinable_kfunc_instance_start(env, idx);
+
+		if (bpf_pseudo_call(insn) || return_from_inlinable_kfunc_call) {
 			int subprog_insn_idx, subprog;
 
-			subprog_insn_idx = idx + insn->imm + 1;
-			subprog = find_subprog(env, subprog_insn_idx);
-			if (subprog < 0)
-				return -EFAULT;
+			if (is_inlinable_kfunc_call(env, idx)) {
+				subprog = env->insn_aux_data[idx].kfunc_instance_subprog;
+			} else {
+				subprog_insn_idx = idx + insn->imm + 1;
+				subprog = find_subprog(env, subprog_insn_idx);
+				if (subprog < 0)
+					return -EFAULT;
+			}
 
 			if (subprog_is_global(env, subprog)) {
 				/* check that jump history doesn't have any
@@ -3934,6 +3960,17 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 				/* global subprog always sets R0 */
 				bt_clear_reg(bt, BPF_REG_0);
 				return 0;
+			} else if (is_inlinable_kfunc_call(env, idx)) {
+				if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) {
+					verbose(env, "BUG regs %x\n", bt_reg_mask(bt));
+					WARN_ONCE(1, "verifier backtracking bug");
+					return -EFAULT;
+				}
+				/* do not backtrack to the callsite, clear any precision marks
+				 * that might be present in the fake frame (e.g. dynptr type spill).
+				 */
+				bt_reset(bt);
+				return 0;
 			} else {
 				/* static subprog call instruction, which
 				 * means that we are exiting current subprog,
@@ -12525,6 +12562,123 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
 	return 0;
 }
 
+/* Establish an independent call stack for inlinable kfunc instance verification,
+ * copy "distilled" view of parameters from the callsite:
+ * - Scalars copied as-is.
+ * - Null pointers copied as-is.
+ * - For dynptrs:
+ *   - a fake frame #0 is created;
+ *   - a register spill corresponding to bpf_dynptr_kern->size field is created,
+ *     this spill range/tnum is set to represent dynptr type;
+ *   - a register spill corresponding to bpf_dynptr_kern->data field is created,
+ *     this spill's type is set to KERNEL_VALUE.
+ *   - the parameter itself is represented as pointer to stack.
+ * - Everything else is copied as KERNEL_VALUE.
+ *
+ * This allows to isolate main program verification from verification of
+ * kfunc instance bodies.
+ */
+static int push_inlinable_kfunc(struct bpf_verifier_env *env, struct bpf_kfunc_call_arg_meta *meta,
+			      u32 subprog)
+{
+	struct bpf_reg_state *reg, *sh_reg, *sh_regs, *regs = cur_regs(env);
+	const struct btf_param *args, *arg;
+	const struct btf *btf = meta->btf;
+	struct bpf_verifier_state *st;
+	struct bpf_func_state *frame;
+	const struct btf_type *t;
+	int fake_spi = 0;
+	u32 i, nargs;
+
+	st = push_async_cb(env, env->subprog_info[subprog].start,
+			   env->insn_idx, subprog, false);
+	if (!st)
+		return -ENOMEM;
+
+	frame = kzalloc(sizeof(*frame), GFP_KERNEL);
+	if (!frame)
+		return -ENOMEM;
+	init_func_state(env, frame,
+			BPF_MAIN_FUNC /* callsite */,
+			1 /* frameno within this callchain */,
+			subprog /* subprog number within this prog */);
+	/* Use frame #0 to represent memory objects with some known bits. */
+	st->frame[1] = frame;
+	st->curframe = 1;
+
+	args = (const struct btf_param *)(meta->func_proto + 1);
+	nargs = btf_type_vlen(meta->func_proto);
+	sh_regs = st->frame[1]->regs;
+	for (i = 0; i < nargs; i++) {
+		arg = &args[i];
+		reg = &regs[i + 1];
+		sh_reg = &sh_regs[i + 1];
+		t = btf_type_skip_modifiers(btf, arg->type, NULL);
+
+		if (is_kfunc_arg_dynptr(meta->btf, arg)) {
+			struct bpf_reg_state *fake_reg = &env->fake_reg[0];
+			enum bpf_dynptr_type type;
+			struct tnum a, b, c;
+			int spi;
+
+			if (reg->type == CONST_PTR_TO_DYNPTR) {
+				type = reg->dynptr.type;
+			} else if (reg->type == PTR_TO_STACK) {
+				spi = dynptr_get_spi(env, reg);
+				if (spi < 0) {
+					verbose(env, "BUG: can't recognize dynptr param\n");
+					return -EFAULT;
+				}
+				type = func(env, reg)->stack[spi].spilled_ptr.dynptr.type;
+			} else {
+				return -EFAULT;
+			}
+			grow_stack_state(env, st->frame[0], (fake_spi + 2) * BPF_REG_SIZE);
+
+			memset(fake_reg, 0, sizeof(*fake_reg));
+			__mark_reg_unknown_imprecise(fake_reg);
+			/* Setup bpf_dynptr_kern->size as expected by bpf_dynptr_get_type().
+			 * Exact value of the dynptr type could be recovered by verifier
+			 * when BPF code generated for bpf_dynptr_get_type() is processed.
+			 */
+			a = tnum_lshift(tnum_const(type), 28);	/* type */
+			b = tnum_rshift(tnum_unknown, 64 - 28);	/* size */
+			c = tnum_lshift(tnum_unknown, 31);		/* read-only bit */
+			fake_reg->var_off = tnum_or(tnum_or(a, b), c);
+			reg_bounds_sync(fake_reg);
+			save_register_state(env, st->frame[0], fake_spi++, fake_reg, 4);
+
+			/* bpf_dynptr_kern->data */
+			mark_reg_kernel_value(fake_reg);
+			save_register_state(env, st->frame[0], fake_spi++, fake_reg, BPF_REG_SIZE);
+
+			sh_reg->type = PTR_TO_STACK;
+			sh_reg->var_off = tnum_const(- fake_spi * BPF_REG_SIZE);
+			sh_reg->frameno = 0;
+			reg_bounds_sync(sh_reg);
+		} else if (register_is_null(reg) && btf_is_ptr(t)) {
+			__mark_reg_known_zero(sh_reg);
+			sh_reg->type = SCALAR_VALUE;
+		} else if (reg->type == SCALAR_VALUE) {
+			copy_register_state(sh_reg, reg);
+			sh_reg->subreg_def = 0;
+			sh_reg->id = 0;
+		} else {
+			mark_reg_kernel_value(sh_reg);
+		}
+	}
+	return 0;
+}
+
+static bool inside_inlinable_kfunc(struct bpf_verifier_env *env, u32 idx)
+{
+	struct bpf_subprog_info *subprog_info = env->subprog_info;
+
+	return env->first_kfunc_instance &&
+	       idx >= subprog_info[env->first_kfunc_instance].start &&
+	       idx < subprog_info[env->last_kfunc_instance + 1].start;
+}
+
 static int fetch_kfunc_meta(struct bpf_verifier_env *env,
 			    struct bpf_insn *insn,
 			    struct bpf_kfunc_call_arg_meta *meta,
@@ -12534,6 +12688,7 @@ static int fetch_kfunc_meta(struct bpf_verifier_env *env,
 	u32 func_id, *kfunc_flags;
 	const char *func_name;
 	struct btf *desc_btf;
+	u32 zero = 0;
 
 	if (kfunc_name)
 		*kfunc_name = NULL;
@@ -12554,7 +12709,13 @@ static int fetch_kfunc_meta(struct bpf_verifier_env *env,
 
 	kfunc_flags = btf_kfunc_id_set_contains(desc_btf, func_id, env->prog);
 	if (!kfunc_flags) {
-		return -EACCES;
+		/* inlinable kfuncs can call any kernel functions,
+		 * not just those residing in id sets.
+		 */
+		if (inside_inlinable_kfunc(env, env->insn_idx))
+			kfunc_flags = &zero;
+		else
+			return -EACCES;
 	}
 
 	memset(meta, 0, sizeof(*meta));
@@ -12595,6 +12756,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		return err;
 	desc_btf = meta.btf;
 	insn_aux = &env->insn_aux_data[insn_idx];
+	nargs = btf_type_vlen(meta.func_proto);
 
 	insn_aux->is_iter_next = is_iter_next_kfunc(&meta);
 
@@ -12614,6 +12776,22 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 	if (err < 0)
 		return err;
 
+	if (is_inlinable_kfunc_call(env, insn_idx)) {
+		err = push_inlinable_kfunc(env, &meta, insn_aux->kfunc_instance_subprog);
+		if (err < 0)
+			return err;
+		/* At the moment mark_chain_precision() does not
+                 * propagate precision from within inlinable kfunc
+                 * instance body. As push_inlinable_kfunc() passes
+                 * scalar parameters as-is any such parameter might be
+                 * used in the precise context. Conservatively mark
+                 * these parameters as precise.
+		 */
+		for (i = 0; i < nargs; ++i)
+			if (regs[i + 1].type == SCALAR_VALUE)
+				mark_chain_precision(env, i + 1);
+	}
+
 	if (meta.func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
 		err = push_callback_call(env, insn, insn_idx, meta.subprogno,
 					 set_rbtree_add_callback_state);
@@ -13010,7 +13188,6 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		}
 	}
 
-	nargs = btf_type_vlen(meta.func_proto);
 	args = (const struct btf_param *)(meta.func_proto + 1);
 	for (i = 0; i < nargs; i++) {
 		u32 regno = i + 1;
@@ -16258,7 +16435,11 @@ static int visit_func_call_insn(int t, struct bpf_insn *insns,
 
 	if (visit_callee) {
 		mark_prune_point(env, t);
-		ret = push_insn(t, t + insns[t].imm + 1, BRANCH, env);
+		if (is_inlinable_kfunc_call(env, t))
+			/* visit inlinable kfunc instance bodies to establish prune point marks */
+			ret = push_insn(t, inlinable_kfunc_instance_start(env, t), BRANCH, env);
+		else
+			ret = push_insn(t, t + insns[t].imm + 1, BRANCH, env);
 	}
 	return ret;
 }
@@ -16595,7 +16776,9 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
 				mark_force_checkpoint(env, t);
 			}
 		}
-		return visit_func_call_insn(t, insns, env, insn->src_reg == BPF_PSEUDO_CALL);
+		return visit_func_call_insn(t, insns, env,
+					    insn->src_reg == BPF_PSEUDO_CALL ||
+					    is_inlinable_kfunc_call(env, t));
 
 	case BPF_JA:
 		if (BPF_SRC(insn->code) != BPF_K)
@@ -18718,7 +18901,8 @@ static int do_check(struct bpf_verifier_env *env)
 				if (exception_exit)
 					goto process_bpf_exit;
 
-				if (state->curframe) {
+				if (state->curframe &&
+				    state->frame[state->curframe]->callsite != BPF_MAIN_FUNC) {
 					/* exit from nested function */
 					err = prepare_func_exit(env, &env->insn_idx);
 					if (err)
@@ -21041,18 +21225,21 @@ static struct inlinable_kfunc *find_inlinable_kfunc(struct btf *btf, u32 btf_id)
  * report extra stack used in 'stack_depth_extra'.
  */
 static struct bpf_prog *inline_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
-					  int insn_idx, int *cnt, s32 stack_base, u16 *stack_depth_extra)
+					  int insn_idx, int *cnt, s32 stack_base, u16 *stack_depth_extra,
+					  u32 subprog)
 {
 	struct inlinable_kfunc_regs_usage regs_usage;
+	struct bpf_insn *insn_buf, *subprog_insns;
+	struct bpf_subprog_info *subprog_info;
 	struct bpf_kfunc_call_arg_meta meta;
 	struct bpf_prog *new_prog;
-	struct bpf_insn *insn_buf;
 	struct inlinable_kfunc *sh;
-	int i, j, r, off, err, exit;
+	int i, j, r, off, exit;
 	u32 subprog_insn_cnt;
 	u16 extra_slots;
 	s16 stack_off;
 	u32 insn_num;
+	int err;
 
 	err = fetch_kfunc_meta(env, insn, &meta, NULL);
 	if (err < 0)
@@ -21061,8 +21248,10 @@ static struct bpf_prog *inline_kfunc_call(struct bpf_verifier_env *env, struct b
 	if (!sh)
 		return NULL;
 
-	subprog_insn_cnt = sh->insn_num;
-	scan_regs_usage(sh->insns, subprog_insn_cnt, &regs_usage);
+	subprog_info = &env->subprog_info[subprog];
+	subprog_insn_cnt = (subprog_info + 1)->start - subprog_info->start;
+	subprog_insns = env->prog->insnsi + subprog_info->start;
+	scan_regs_usage(subprog_insns, subprog_insn_cnt, &regs_usage);
 	if (regs_usage.r10_escapes) {
 		if (env->log.level & BPF_LOG_LEVEL2)
 			verbose(env, "can't inline kfunc %s at insn %d, r10 escapes\n",
@@ -21082,7 +21271,7 @@ static struct bpf_prog *inline_kfunc_call(struct bpf_verifier_env *env, struct b
 
 	if (env->log.level & BPF_LOG_LEVEL2)
 		verbose(env, "inlining kfunc %s at insn %d\n", sh->name, insn_idx);
-	memcpy(insn_buf + extra_slots, sh->insns, subprog_insn_cnt * sizeof(*insn_buf));
+	memcpy(insn_buf + extra_slots, subprog_insns, subprog_insn_cnt * sizeof(*insn_buf));
 	off = stack_base;
 	i = 0;
 	j = insn_num - 1;
@@ -21135,13 +21324,6 @@ static struct bpf_prog *inline_kfunc_call(struct bpf_verifier_env *env, struct b
 		default:
 			break;
 		}
-
-		/* Make sure kernel function calls from within kfunc body could be jitted. */
-		if (bpf_pseudo_kfunc_call(insn)) {
-			err = add_kfunc_call(env, insn->imm, insn->off);
-			if (err < 0)
-				return ERR_PTR(err);
-		}
 	}
 
 	*cnt = insn_num;
@@ -21149,24 +21331,33 @@ static struct bpf_prog *inline_kfunc_call(struct bpf_verifier_env *env, struct b
 	return new_prog;
 }
 
-/* Do this after all stack depth adjustments */
+/* Copy bodies of inlinable kfunc instances to the callsites
+ * and remove instance subprograms.
+ * Do this after all stack depth adjustments.
+ */
 static int inline_kfunc_calls(struct bpf_verifier_env *env)
 {
 	struct bpf_prog *prog = env->prog;
 	struct bpf_insn *insn = prog->insnsi;
 	const int insn_cnt = prog->len;
 	struct bpf_prog *new_prog;
-	int i, cnt, delta = 0, cur_subprog = 0;
+	int err, i, cnt, delta = 0, cur_subprog = 0;
 	struct bpf_subprog_info *subprogs = env->subprog_info;
 	u16 stack_depth = subprogs[cur_subprog].stack_depth;
 	u16 call_extra_stack = 0, subprog_extra_stack = 0;
+	struct bpf_insn_aux_data *aux;
 
 	for (i = 0; i < insn_cnt;) {
 		if (!bpf_pseudo_kfunc_call(insn))
 			goto next_insn;
 
+		aux = &env->insn_aux_data[i + delta];
+		if (!aux->kfunc_instance_subprog)
+			goto next_insn;
+
 		new_prog = inline_kfunc_call(env, insn, i + delta, &cnt,
-					     -stack_depth, &call_extra_stack);
+					     -stack_depth, &call_extra_stack,
+					     aux->kfunc_instance_subprog);
 		if (IS_ERR(new_prog))
 			return PTR_ERR(new_prog);
 		if (!new_prog)
@@ -21189,7 +21380,14 @@ static int inline_kfunc_calls(struct bpf_verifier_env *env)
 	}
 
 	env->prog->aux->stack_depth = subprogs[0].stack_depth;
-
+	if (env->first_kfunc_instance) {
+		/* Do not jit instance subprograms. */
+		cnt = subprogs[env->last_kfunc_instance + 1].start -
+		      subprogs[env->first_kfunc_instance].start;
+		err = verifier_remove_insns(env, subprogs[env->first_kfunc_instance].start, cnt);
+		if (err < 0)
+			return err;
+	}
 	return 0;
 }
 
@@ -21268,6 +21466,82 @@ static int add_hidden_subprog(struct bpf_verifier_env *env, struct bpf_insn *pat
 	return 0;
 }
 
+/* For each callsite of the inlinable kfunc add a hidden subprogram
+ * with a copy of kfunc body (instance). Record the number of the
+ * added subprogram in bpf_insn_aux_data->kfunc_instance_subprog
+ * field for the callsite.
+ *
+ * During main verification pass verifier would discover if some of
+ * the branches / code inside each body could be removed because of
+ * known parameter values.
+ *
+ * At the end of program processing inline_kfunc_calls()
+ * would copy bodies to callsites and delete the subprograms.
+ */
+static int instantiate_inlinable_kfuncs(struct bpf_verifier_env *env)
+{
+	struct bpf_prog_aux *aux = env->prog->aux;
+	const int insn_cnt = env->prog->len;
+	struct bpf_kfunc_call_arg_meta meta;
+	struct bpf_insn *insn, *insn_buf;
+	struct inlinable_kfunc *sh = NULL;
+	int i, j, err;
+	u32 subprog;
+	void *tmp;
+
+	for (i = 0; i < insn_cnt; ++i) {
+		insn = env->prog->insnsi + i;
+		if (!bpf_pseudo_kfunc_call(insn))
+			continue;
+		err = fetch_kfunc_meta(env, insn, &meta, NULL);
+		if (err < 0)
+			/* missing kfunc, error would be reported later */
+			continue;
+		sh = find_inlinable_kfunc(meta.btf, meta.func_id);
+		if (!sh)
+			continue;
+		for (j = 0; j < sh->insn_num; ++j) {
+			if (!bpf_pseudo_kfunc_call(&sh->insns[j]))
+				continue;
+			err = add_kfunc_call(env, sh->insns[j].imm, sh->insns[j].off);
+			if (err < 0)
+				return err;
+		}
+		subprog = env->subprog_cnt;
+		insn_buf = kmalloc_array(sh->insn_num + 1, sizeof(*insn_buf), GFP_KERNEL);
+		if (!insn_buf)
+			return -ENOMEM;
+		/* this is an unfortunate requirement of add_hidden_subprog() */
+		insn_buf[0] = env->prog->insnsi[env->prog->len - 1];
+		memcpy(insn_buf + 1, sh->insns,  sh->insn_num * sizeof(*insn_buf));
+		err = add_hidden_subprog(env, insn_buf, sh->insn_num + 1);
+		kfree(insn_buf);
+		if (err)
+			return err;
+		tmp = krealloc_array(aux->func_info, aux->func_info_cnt + 1,
+				     sizeof(*aux->func_info), GFP_KERNEL);
+		if (!tmp)
+			return -ENOMEM;
+		aux->func_info = tmp;
+		memset(&aux->func_info[aux->func_info_cnt], 0, sizeof(*aux->func_info));
+		aux->func_info[aux->func_info_cnt].insn_off = env->subprog_info[subprog].start;
+		tmp = krealloc_array(aux->func_info_aux, aux->func_info_cnt + 1,
+				     sizeof(*aux->func_info_aux), GFP_KERNEL);
+		if (!tmp)
+			return -ENOMEM;
+		aux->func_info_aux = tmp;
+		memset(&aux->func_info_aux[aux->func_info_cnt], 0, sizeof(*aux->func_info_aux));
+		aux->func_info_aux[aux->func_info_cnt].linkage = BTF_FUNC_STATIC;
+		aux->func_info_cnt++;
+
+		if (!env->first_kfunc_instance)
+			env->first_kfunc_instance = subprog;
+		env->last_kfunc_instance = subprog;
+		env->insn_aux_data[i].kfunc_instance_subprog = subprog;
+	}
+	return 0;
+}
+
 /* Do various post-verification rewrites in a single program pass.
  * These rewrites simplify JIT and interpreter implementations.
  */
@@ -23204,13 +23478,6 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 		env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ;
 	env->test_reg_invariants = attr->prog_flags & BPF_F_TEST_REG_INVARIANTS;
 
-	env->explored_states = kvcalloc(state_htab_size(env),
-				       sizeof(struct bpf_verifier_state_list *),
-				       GFP_USER);
-	ret = -ENOMEM;
-	if (!env->explored_states)
-		goto skip_full_check;
-
 	ret = check_btf_info_early(env, attr, uattr);
 	if (ret < 0)
 		goto skip_full_check;
@@ -23241,10 +23508,21 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 			goto skip_full_check;
 	}
 
+	ret = instantiate_inlinable_kfuncs(env);
+	if (ret < 0)
+		goto skip_full_check;
+
 	ret = check_cfg(env);
 	if (ret < 0)
 		goto skip_full_check;
 
+	env->explored_states = kvcalloc(state_htab_size(env),
+				       sizeof(struct bpf_verifier_state_list *),
+				       GFP_USER);
+	ret = -ENOMEM;
+	if (!env->explored_states)
+		goto skip_full_check;
+
 	ret = mark_fastcall_patterns(env);
 	if (ret < 0)
 		goto skip_full_check;
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 08/11] bpf: special rules for kernel function calls inside inlinable kfuncs
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
                   ` (6 preceding siblings ...)
  2024-11-07 17:50 ` [RFC bpf-next 07/11] bpf: instantiate inlinable kfuncs before verification Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 09/11] bpf: move selected dynptr kfuncs to inlinable_kfuncs.c Eduard Zingerman
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

Inlinable kfuncs can call arbitrary kernel functions; there is no need
to check whether these functions conform to kfunc or helper usage
rules. Upon seeing such a call:
- mark registers R0-R5 as KERNEL_VALUEs after the call;
- if there are any PTR_TO_STACK parameters, mark all
  allocated stack slots as KERNEL_VALUEs.

The assumption is that KERNEL_VALUE marks never escape from the kfunc
instance body: at the call site R0 is set in accordance with kfunc
processing rules, and PTR_TO_STACK parameters are never passed from a
bpf program to an inlinable kfunc.
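
For illustration only (hypothetical function, not part of this
series; assumes the same includes as inlinable_kfuncs.c), the body
below exercises both rules: after the skb_header_pointer() call
registers R0-R5 become KERNEL_VALUEs, and because 'buf' is passed as
a PTR_TO_STACK argument, every allocated stack slot of the frame is
marked as a KERNEL_VALUE spill:

    __bpf_kfunc int bpf_example_peek(struct sk_buff *skb, u32 off)
    {
            char buf[4];    /* passed below as a PTR_TO_STACK argument */

            /* arbitrary kernel call from within an inlinable kfunc body */
            (void)skb_header_pointer(skb, off, sizeof(buf), buf);
            return 0;
    }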

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 67 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 66 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 87b6cc8c94f8..5b109139f356 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13209,6 +13209,67 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 	return 0;
 }
 
+static int mark_stack_as_kernel_values(struct bpf_verifier_env *env, struct bpf_func_state *func)
+{
+	struct bpf_stack_state *slot;
+	struct bpf_reg_state *spill;
+	int spi, i;
+
+	if (!inside_inlinable_kfunc(env, func->callsite)) {
+		verbose(env, "verifier bug: shouldn't mark frame#%d as kernel values\n",
+			func->frameno);
+		return -EFAULT;
+	}
+
+	for (spi = 0; spi < func->allocated_stack / BPF_REG_SIZE; spi++) {
+		slot = &func->stack[spi];
+		spill = &slot->spilled_ptr;
+		mark_reg_kernel_value(spill);
+		spill->live |= REG_LIVE_WRITTEN;
+		for (i = 0; i < BPF_REG_SIZE; i++)
+			slot->slot_type[i] = STACK_SPILL;
+		mark_stack_slot_scratched(env, spi);
+	}
+
+	return 0;
+}
+
+static int check_internal_call(struct bpf_verifier_env *env, struct bpf_insn *insn)
+{
+	struct bpf_reg_state *reg, *regs = cur_regs(env);
+	struct bpf_kfunc_call_arg_meta meta;
+	int err, i, nargs;
+
+	err = fetch_kfunc_meta(env, insn, &meta, NULL);
+	if (err < 0)
+		return -EFAULT;
+
+	nargs = btf_type_vlen(meta.func_proto);
+	for (i = 0; i < nargs; i++) {
+		reg = &regs[BPF_REG_1 + i];
+		switch (reg->type) {
+		case SCALAR_VALUE:
+		case KERNEL_VALUE:
+			break;
+		case PTR_TO_STACK:
+			err = mark_stack_as_kernel_values(env, func(env, reg));
+			if (err)
+				return err;
+			break;
+		default:
+			verbose(env, "verifier bug: arg#%i unexpected register type %s\n",
+				i, reg_type_str(env, reg->type));
+			return -EFAULT;
+		}
+	}
+	for (i = 0; i < CALLER_SAVED_REGS; i++) {
+		mark_reg_not_init(env, regs, caller_saved[i]);
+		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
+	}
+	mark_reg_kernel_value(&regs[BPF_REG_0]);
+	return 0;
+}
+
 static bool check_reg_sane_offset(struct bpf_verifier_env *env,
 				  const struct bpf_reg_state *reg,
 				  enum bpf_reg_type type)
@@ -18828,7 +18889,8 @@ static int do_check(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}
 
-				if (env->cur_state->active_lock.ptr) {
+				if (env->cur_state->active_lock.ptr &&
+				    !inside_inlinable_kfunc(env, env->insn_idx)) {
 					if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) ||
 					    (insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
 					     (insn->off != 0 || !is_bpf_graph_api_kfunc(insn->imm)))) {
@@ -18838,6 +18900,9 @@ static int do_check(struct bpf_verifier_env *env)
 				}
 				if (insn->src_reg == BPF_PSEUDO_CALL) {
 					err = check_func_call(env, insn, &env->insn_idx);
+				} else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
+					   inside_inlinable_kfunc(env, env->insn_idx)) {
+					err = check_internal_call(env, insn);
 				} else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
 					err = check_kfunc_call(env, insn, &env->insn_idx);
 					if (!err && is_bpf_throw_kfunc(insn)) {
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 09/11] bpf: move selected dynptr kfuncs to inlinable_kfuncs.c
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
                   ` (7 preceding siblings ...)
  2024-11-07 17:50 ` [RFC bpf-next 08/11] bpf: special rules for kernel function calls inside inlinable kfuncs Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-07 17:50 ` [RFC bpf-next 10/11] selftests/bpf: tests to verify handling of inlined kfuncs Eduard Zingerman
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

Namely, move the following kfuncs:
- bpf_dynptr_is_null
- bpf_dynptr_is_rdonly
- bpf_dynptr_size
- bpf_dynptr_slice

Thus allowing the verifier to inline these functions.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf.h           |  36 +++++++++-
 kernel/bpf/Makefile           |   1 +
 kernel/bpf/helpers.c          | 130 +---------------------------------
 kernel/bpf/inlinable_kfuncs.c | 112 +++++++++++++++++++++++++++++
 4 files changed, 148 insertions(+), 131 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 75f57f791cd3..7ca53e165ab0 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1319,11 +1319,43 @@ enum bpf_dynptr_type {
 	BPF_DYNPTR_TYPE_XDP,
 };
 
+/* Since the upper 8 bits of dynptr->size is reserved, the
+ * maximum supported size is 2^24 - 1.
+ */
+#define DYNPTR_MAX_SIZE	((1UL << 24) - 1)
+#define DYNPTR_TYPE_SHIFT	28
+#define DYNPTR_SIZE_MASK	0xFFFFFF
+#define DYNPTR_RDONLY_BIT	BIT(31)
+
 int bpf_dynptr_check_size(u32 size);
-u32 __bpf_dynptr_size(const struct bpf_dynptr_kern *ptr);
+
+static inline u32 __bpf_dynptr_size(const struct bpf_dynptr_kern *ptr)
+{
+	return ptr->size & DYNPTR_SIZE_MASK;
+}
+
 const void *__bpf_dynptr_data(const struct bpf_dynptr_kern *ptr, u32 len);
 void *__bpf_dynptr_data_rw(const struct bpf_dynptr_kern *ptr, u32 len);
-bool __bpf_dynptr_is_rdonly(const struct bpf_dynptr_kern *ptr);
+
+static inline bool __bpf_dynptr_is_rdonly(const struct bpf_dynptr_kern *ptr)
+{
+	return ptr->size & DYNPTR_RDONLY_BIT;
+}
+
+static inline enum bpf_dynptr_type bpf_dynptr_get_type(const struct bpf_dynptr_kern *ptr)
+{
+	return (ptr->size & ~(DYNPTR_RDONLY_BIT)) >> DYNPTR_TYPE_SHIFT;
+}
+
+static inline int bpf_dynptr_check_off_len(const struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
+{
+	u32 size = __bpf_dynptr_size(ptr);
+
+	if (len > size || offset > size - len)
+		return -E2BIG;
+
+	return 0;
+}
 
 #ifdef CONFIG_BPF_JIT
 int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 3d7ee81c8e2e..e806b2ea5d81 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -63,6 +63,7 @@ LLC ?= $(LLVM_PREFIX)llc$(LLVM_SUFFIX)
 # -fpatchable-function-entry=16,16 is $(PADDING_CFLAGS)
 CFLAGS_REMOVE_inlinable_kfuncs.bpf.bc.o += $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_inlinable_kfuncs.bpf.bc.o += $(PADDING_CFLAGS)
+CFLAGS_inlinable_kfuncs.bpf.bc.o += -D__FOR_BPF
 $(obj)/inlinable_kfuncs.bpf.bc.o: $(src)/inlinable_kfuncs.c
 	$(Q)$(CLANG) $(c_flags) -emit-llvm -c $< -o $@
 
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 395221e53832..75dae5d3f05e 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1641,19 +1641,6 @@ static const struct bpf_func_proto bpf_kptr_xchg_proto = {
 	.arg2_btf_id  = BPF_PTR_POISON,
 };
 
-/* Since the upper 8 bits of dynptr->size is reserved, the
- * maximum supported size is 2^24 - 1.
- */
-#define DYNPTR_MAX_SIZE	((1UL << 24) - 1)
-#define DYNPTR_TYPE_SHIFT	28
-#define DYNPTR_SIZE_MASK	0xFFFFFF
-#define DYNPTR_RDONLY_BIT	BIT(31)
-
-bool __bpf_dynptr_is_rdonly(const struct bpf_dynptr_kern *ptr)
-{
-	return ptr->size & DYNPTR_RDONLY_BIT;
-}
-
 void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr)
 {
 	ptr->size |= DYNPTR_RDONLY_BIT;
@@ -1664,16 +1651,6 @@ static void bpf_dynptr_set_type(struct bpf_dynptr_kern *ptr, enum bpf_dynptr_typ
 	ptr->size |= type << DYNPTR_TYPE_SHIFT;
 }
 
-static enum bpf_dynptr_type bpf_dynptr_get_type(const struct bpf_dynptr_kern *ptr)
-{
-	return (ptr->size & ~(DYNPTR_RDONLY_BIT)) >> DYNPTR_TYPE_SHIFT;
-}
-
-u32 __bpf_dynptr_size(const struct bpf_dynptr_kern *ptr)
-{
-	return ptr->size & DYNPTR_SIZE_MASK;
-}
-
 static void bpf_dynptr_set_size(struct bpf_dynptr_kern *ptr, u32 new_size)
 {
 	u32 metadata = ptr->size & ~DYNPTR_SIZE_MASK;
@@ -1700,16 +1677,6 @@ void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr)
 	memset(ptr, 0, sizeof(*ptr));
 }
 
-static int bpf_dynptr_check_off_len(const struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
-{
-	u32 size = __bpf_dynptr_size(ptr);
-
-	if (len > size || offset > size - len)
-		return -E2BIG;
-
-	return 0;
-}
-
 BPF_CALL_4(bpf_dynptr_from_mem, void *, data, u32, size, u64, flags, struct bpf_dynptr_kern *, ptr)
 {
 	int err;
@@ -2540,76 +2507,8 @@ __bpf_kfunc struct task_struct *bpf_task_from_vpid(s32 vpid)
 	return p;
 }
 
-/**
- * bpf_dynptr_slice() - Obtain a read-only pointer to the dynptr data.
- * @p: The dynptr whose data slice to retrieve
- * @offset: Offset into the dynptr
- * @buffer__opt: User-provided buffer to copy contents into.  May be NULL
- * @buffer__szk: Size (in bytes) of the buffer if present. This is the
- *               length of the requested slice. This must be a constant.
- *
- * For non-skb and non-xdp type dynptrs, there is no difference between
- * bpf_dynptr_slice and bpf_dynptr_data.
- *
- *  If buffer__opt is NULL, the call will fail if buffer_opt was needed.
- *
- * If the intention is to write to the data slice, please use
- * bpf_dynptr_slice_rdwr.
- *
- * The user must check that the returned pointer is not null before using it.
- *
- * Please note that in the case of skb and xdp dynptrs, bpf_dynptr_slice
- * does not change the underlying packet data pointers, so a call to
- * bpf_dynptr_slice will not invalidate any ctx->data/data_end pointers in
- * the bpf program.
- *
- * Return: NULL if the call failed (eg invalid dynptr), pointer to a read-only
- * data slice (can be either direct pointer to the data or a pointer to the user
- * provided buffer, with its contents containing the data, if unable to obtain
- * direct pointer)
- */
 __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset,
-				   void *buffer__opt, u32 buffer__szk)
-{
-	const struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
-	enum bpf_dynptr_type type;
-	u32 len = buffer__szk;
-	int err;
-
-	if (!ptr->data)
-		return NULL;
-
-	err = bpf_dynptr_check_off_len(ptr, offset, len);
-	if (err)
-		return NULL;
-
-	type = bpf_dynptr_get_type(ptr);
-
-	switch (type) {
-	case BPF_DYNPTR_TYPE_LOCAL:
-	case BPF_DYNPTR_TYPE_RINGBUF:
-		return ptr->data + ptr->offset + offset;
-	case BPF_DYNPTR_TYPE_SKB:
-		if (buffer__opt)
-			return skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer__opt);
-		else
-			return skb_pointer_if_linear(ptr->data, ptr->offset + offset, len);
-	case BPF_DYNPTR_TYPE_XDP:
-	{
-		void *xdp_ptr = bpf_xdp_pointer(ptr->data, ptr->offset + offset, len);
-		if (!IS_ERR_OR_NULL(xdp_ptr))
-			return xdp_ptr;
-
-		if (!buffer__opt)
-			return NULL;
-		bpf_xdp_copy_buf(ptr->data, ptr->offset + offset, buffer__opt, len, false);
-		return buffer__opt;
-	}
-	default:
-		WARN_ONCE(true, "unknown dynptr type %d\n", type);
-		return NULL;
-	}
-}
+				   void *buffer__opt, u32 buffer__szk);
 
 /**
  * bpf_dynptr_slice_rdwr() - Obtain a writable pointer to the dynptr data.
@@ -2705,33 +2604,6 @@ __bpf_kfunc int bpf_dynptr_adjust(const struct bpf_dynptr *p, u32 start, u32 end
 	return 0;
 }
 
-__bpf_kfunc bool bpf_dynptr_is_null(const struct bpf_dynptr *p)
-{
-	struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
-
-	return !ptr->data;
-}
-
-__bpf_kfunc bool bpf_dynptr_is_rdonly(const struct bpf_dynptr *p)
-{
-	struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
-
-	if (!ptr->data)
-		return false;
-
-	return __bpf_dynptr_is_rdonly(ptr);
-}
-
-__bpf_kfunc __u32 bpf_dynptr_size(const struct bpf_dynptr *p)
-{
-	struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
-
-	if (!ptr->data)
-		return -EINVAL;
-
-	return __bpf_dynptr_size(ptr);
-}
-
 __bpf_kfunc int bpf_dynptr_clone(const struct bpf_dynptr *p,
 				 struct bpf_dynptr *clone__uninit)
 {
diff --git a/kernel/bpf/inlinable_kfuncs.c b/kernel/bpf/inlinable_kfuncs.c
index 7b7dc05fa1a4..aeb3e3f209f7 100644
--- a/kernel/bpf/inlinable_kfuncs.c
+++ b/kernel/bpf/inlinable_kfuncs.c
@@ -1 +1,113 @@
 // SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/bpf.h>
+#include <linux/btf.h>
+#include <linux/skbuff.h>
+#include <linux/filter.h>
+
+__bpf_kfunc bool bpf_dynptr_is_null(const struct bpf_dynptr *p);
+__bpf_kfunc bool bpf_dynptr_is_rdonly(const struct bpf_dynptr *p);
+__bpf_kfunc __u32 bpf_dynptr_size(const struct bpf_dynptr *p);
+__bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset,
+				   void *buffer__opt, u32 buffer__szk);
+
+__bpf_kfunc bool bpf_dynptr_is_null(const struct bpf_dynptr *p)
+{
+	struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
+
+	return !ptr->data;
+}
+
+__bpf_kfunc bool bpf_dynptr_is_rdonly(const struct bpf_dynptr *p)
+{
+	struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
+
+	if (!ptr->data)
+		return false;
+
+	return __bpf_dynptr_is_rdonly(ptr);
+}
+
+__bpf_kfunc __u32 bpf_dynptr_size(const struct bpf_dynptr *p)
+{
+	struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
+
+	if (!ptr->data)
+		return -EINVAL;
+
+	return __bpf_dynptr_size(ptr);
+}
+
+/**
+ * bpf_dynptr_slice() - Obtain a read-only pointer to the dynptr data.
+ * @p: The dynptr whose data slice to retrieve
+ * @offset: Offset into the dynptr
+ * @buffer__opt: User-provided buffer to copy contents into.  May be NULL
+ * @buffer__szk: Size (in bytes) of the buffer if present. This is the
+ *               length of the requested slice. This must be a constant.
+ *
+ * For non-skb and non-xdp type dynptrs, there is no difference between
+ * bpf_dynptr_slice and bpf_dynptr_data.
+ *
+ *  If buffer__opt is NULL, the call will fail if buffer_opt was needed.
+ *
+ * If the intention is to write to the data slice, please use
+ * bpf_dynptr_slice_rdwr.
+ *
+ * The user must check that the returned pointer is not null before using it.
+ *
+ * Please note that in the case of skb and xdp dynptrs, bpf_dynptr_slice
+ * does not change the underlying packet data pointers, so a call to
+ * bpf_dynptr_slice will not invalidate any ctx->data/data_end pointers in
+ * the bpf program.
+ *
+ * Return: NULL if the call failed (eg invalid dynptr), pointer to a read-only
+ * data slice (can be either direct pointer to the data or a pointer to the user
+ * provided buffer, with its contents containing the data, if unable to obtain
+ * direct pointer)
+ */
+__bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr *p, u32 offset,
+				   void *buffer__opt, u32 buffer__szk)
+{
+	const struct bpf_dynptr_kern *ptr = (struct bpf_dynptr_kern *)p;
+	enum bpf_dynptr_type type;
+	u32 len = buffer__szk;
+	int err;
+
+	if (!ptr->data)
+		return NULL;
+
+	err = bpf_dynptr_check_off_len(ptr, offset, len);
+	if (err)
+		return NULL;
+
+	type = bpf_dynptr_get_type(ptr);
+
+	switch (type) {
+	case BPF_DYNPTR_TYPE_LOCAL:
+	case BPF_DYNPTR_TYPE_RINGBUF:
+		return ptr->data + ptr->offset + offset;
+	case BPF_DYNPTR_TYPE_SKB:
+		if (buffer__opt)
+			return skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer__opt);
+		else
+			return skb_pointer_if_linear(ptr->data, ptr->offset + offset, len);
+	case BPF_DYNPTR_TYPE_XDP:
+	{
+		void *xdp_ptr = bpf_xdp_pointer(ptr->data, ptr->offset + offset, len);
+		if (!IS_ERR_OR_NULL(xdp_ptr))
+			return xdp_ptr;
+
+		if (!buffer__opt)
+			return NULL;
+		bpf_xdp_copy_buf(ptr->data, ptr->offset + offset, buffer__opt, len, false);
+		return buffer__opt;
+	}
+	default:
+	// TODO: can't handle inline assembly inside this when compiling to BPF
+#ifndef __FOR_BPF
+		WARN_ONCE(true, "unknown dynptr type %d\n", type);
+#endif
+		return NULL;
+	}
+}
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 10/11] selftests/bpf: tests to verify handling of inlined kfuncs
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
                   ` (8 preceding siblings ...)
  2024-11-07 17:50 ` [RFC bpf-next 09/11] bpf: move selected dynptr kfuncs to inlinable_kfuncs.c Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-07 22:04   ` Jeff Johnson
  2024-11-07 17:50 ` [RFC bpf-next 11/11] selftests/bpf: dynptr_slice benchmark Eduard Zingerman
  2024-11-08 20:41 ` [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Toke Høiland-Jørgensen
  11 siblings, 1 reply; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

Verify that:
- kfunc callsites are treated independently;
- the range of scalar parameters is known in the inlined body;
- null pointer parameters are known to be null in the inlined body;
- the type of dynptr parameters is known in the inlined body;
- memory references are passed as KERNEL_VALUE objects;
- callee-saved registers r6-r9 are spilled/filled before/after the
  inlined body;
- inlining is skipped when r10 escapes in the kfunc body.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/bpf_testmod/Makefile        |  24 +-
 .../{bpf_testmod.c => bpf_testmod_core.c}     |  25 ++
 .../bpf/bpf_testmod/test_inlinable_kfuncs.c   | 132 +++++++++
 .../selftests/bpf/prog_tests/verifier.c       |   7 +
 .../bpf/progs/verifier_inlinable_kfuncs.c     | 253 ++++++++++++++++++
 5 files changed, 440 insertions(+), 1 deletion(-)
 rename tools/testing/selftests/bpf/bpf_testmod/{bpf_testmod.c => bpf_testmod_core.c} (97%)
 create mode 100644 tools/testing/selftests/bpf/bpf_testmod/test_inlinable_kfuncs.c
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_inlinable_kfuncs.c

diff --git a/tools/testing/selftests/bpf/bpf_testmod/Makefile b/tools/testing/selftests/bpf/bpf_testmod/Makefile
index 15cb36c4483a..242669a6954a 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
+++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
@@ -10,7 +10,9 @@ endif
 MODULES = bpf_testmod.ko
 
 obj-m += bpf_testmod.o
-CFLAGS_bpf_testmod.o = -I$(src)
+CFLAGS_bpf_testmod_core.o = -I$(src)
+
+bpf_testmod-y := bpf_testmod_core.o test_inlinable_kfuncs.o
 
 all:
 	+$(Q)make -C $(KDIR) M=$(BPF_TESTMOD_DIR) modules
@@ -18,3 +20,23 @@ all:
 clean:
 	+$(Q)make -C $(KDIR) M=$(BPF_TESTMOD_DIR) clean
 
+ifdef CONFIG_CC_IS_CLANG
+
+CLANG ?= $(LLVM_PREFIX)clang$(LLVM_SUFFIX)
+LLC ?= $(LLVM_PREFIX)llc$(LLVM_SUFFIX)
+
+CFLAGS_REMOVE_test_inlinable_kfuncs.bpf.bc.o += $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_test_inlinable_kfuncs.bpf.bc.o += $(PADDING_CFLAGS)
+CFLAGS_test_inlinable_kfuncs.bpf.bc.o += -D__FOR_BPF
+$(obj)/test_inlinable_kfuncs.bpf.bc.o: $(src)/test_inlinable_kfuncs.c
+	$(Q)$(CLANG) $(c_flags) -emit-llvm -c $< -o $@
+
+$(obj)/test_inlinable_kfuncs.bpf.o: $(obj)/test_inlinable_kfuncs.bpf.bc.o
+	$(Q)$(LLC) -mcpu=v3 --mtriple=bpf --filetype=obj $< -o $@
+
+$(obj)/test_inlinable_kfuncs.bpf.linked.o: $(obj)/test_inlinable_kfuncs.bpf.o
+	$(Q)$(BPFTOOL) gen object $@ $<
+
+$(obj)/bpf_testmod_core.o: $(obj)/test_inlinable_kfuncs.bpf.linked.o
+
+endif
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_core.c
similarity index 97%
rename from tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
rename to tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_core.c
index 987d41af71d2..586b752ad6eb 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_core.c
@@ -586,6 +586,14 @@ BTF_ID_FLAGS(func, bpf_kfunc_trusted_num_test, KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_kfunc_rcu_task_test, KF_RCU)
 BTF_ID_FLAGS(func, bpf_testmod_ctx_create, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_testmod_ctx_release, KF_RELEASE)
+BTF_ID_FLAGS(func, bpf_test_inline_kfunc1)
+BTF_ID_FLAGS(func, bpf_test_inline_kfunc2)
+BTF_ID_FLAGS(func, bpf_test_inline_kfunc3)
+BTF_ID_FLAGS(func, bpf_test_inline_kfunc4)
+BTF_ID_FLAGS(func, bpf_test_inline_kfunc5)
+BTF_ID_FLAGS(func, bpf_test_inline_kfunc6)
+BTF_ID_FLAGS(func, bpf_test_inline_kfunc7)
+BTF_ID_FLAGS(func, bpf_test_inline_kfunc8)
 BTF_KFUNCS_END(bpf_testmod_common_kfunc_ids)
 
 BTF_ID_LIST(bpf_testmod_dtor_ids)
@@ -1315,6 +1323,19 @@ static struct bpf_struct_ops testmod_st_ops = {
 
 extern int bpf_fentry_test1(int a);
 
+asm (
+"	.pushsection .data, \"a\"			\n"
+"	.global test_inlinable_kfuncs_data		\n"
+"test_inlinable_kfuncs_data:				\n"
+"	.incbin \"test_inlinable_kfuncs.bpf.linked.o\"	\n"
+"	.global test_inlinable_kfuncs_data_end		\n"
+"test_inlinable_kfuncs_data_end:				\n"
+"	.popsection					\n"
+);
+
+extern void test_inlinable_kfuncs_data;
+extern void test_inlinable_kfuncs_data_end;
+
 static int bpf_testmod_init(void)
 {
 	const struct btf_id_dtor_kfunc bpf_testmod_dtors[] = {
@@ -1337,6 +1358,9 @@ static int bpf_testmod_init(void)
 	ret = ret ?: register_btf_id_dtor_kfuncs(bpf_testmod_dtors,
 						 ARRAY_SIZE(bpf_testmod_dtors),
 						 THIS_MODULE);
+	ret = ret ?: bpf_register_inlinable_kfuncs(&test_inlinable_kfuncs_data,
+						 &test_inlinable_kfuncs_data_end - &test_inlinable_kfuncs_data,
+						 THIS_MODULE);
 	if (ret < 0)
 		return ret;
 	if (bpf_fentry_test1(0) < 0)
@@ -1373,6 +1397,7 @@ static void bpf_testmod_exit(void)
 	bpf_kfunc_close_sock();
 	sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
 	unregister_bpf_testmod_uprobe();
+	bpf_unregister_inlinable_kfuncs(THIS_MODULE);
 }
 
 module_init(bpf_testmod_init);
diff --git a/tools/testing/selftests/bpf/bpf_testmod/test_inlinable_kfuncs.c b/tools/testing/selftests/bpf/bpf_testmod/test_inlinable_kfuncs.c
new file mode 100644
index 000000000000..d8b90ee7f2b0
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_testmod/test_inlinable_kfuncs.c
@@ -0,0 +1,132 @@
+#include <linux/bpf.h>
+
+#define __imm(name) [name]"i"(name)
+
+__bpf_kfunc int bpf_test_inline_kfunc1(int p);
+__bpf_kfunc int bpf_test_inline_kfunc2(int *p);
+__bpf_kfunc int bpf_test_inline_kfunc3(void *p, u64 p__sz);
+__bpf_kfunc int bpf_test_inline_kfunc4(void);
+__bpf_kfunc int bpf_test_inline_kfunc5(struct bpf_dynptr *d);
+__bpf_kfunc int bpf_test_inline_kfunc6(void);
+__bpf_kfunc int bpf_test_inline_kfunc7(void);
+__bpf_kfunc int bpf_test_inline_kfunc8(void);
+
+#ifdef __FOR_BPF
+__attribute__((naked))
+int bpf_test_inline_kfunc1(int p)
+{
+	asm volatile (
+		"r0 = 42;"
+		"if r1 != 1 goto +1;"
+		"r0 = 11;"
+		"if r1 != 2 goto +1;"
+		"r0 = 22;"
+		"exit;"
+	);
+}
+
+__attribute__((naked))
+int bpf_test_inline_kfunc2(int *p)
+{
+	asm volatile (
+		"r0 = 42;"
+		"if r1 != 0 goto +1;"
+		"r0 = 24;"
+		"exit;"
+	);
+}
+
+__attribute__((naked))
+int bpf_test_inline_kfunc3(void *p, u64 p__sz)
+{
+	asm volatile (
+		"r1 = *(u64 *)(r1 + 0);"
+		"*(u64 *)(r1 + 0) = 42;"
+		"r0 = 0;"
+		"exit;"
+	);
+}
+
+__attribute__((naked))
+int bpf_test_inline_kfunc4(void)
+{
+	asm volatile (
+		"r0 = 0;"
+		"r1 = 1;"
+		"r2 = 2;"
+		"r6 = 3;"
+		"r7 = 4;"
+		"exit;"
+	);
+}
+
+__attribute__((naked))
+int bpf_test_inline_kfunc5(struct bpf_dynptr *d)
+{
+	asm volatile (
+		"   r1 = *(u32 *)(r1 + 8);"
+		"   r1 &= %[INV_RDONLY_BIT];"
+		"   r1 >>= %[TYPE_SHIFT];"
+		"   if r1 != %[BPF_DYNPTR_TYPE_SKB] goto 1f;"
+		"   r0 = 1;"
+		"   goto 3f;"
+		"1: if r1 != %[BPF_DYNPTR_TYPE_XDP] goto 2f;"
+		"   r0 = 2;"
+		"   goto 3f;"
+		"2: r0 = 3;"
+		"3: exit;"
+	:: __imm(BPF_DYNPTR_TYPE_SKB),
+	   __imm(BPF_DYNPTR_TYPE_XDP),
+	   [INV_RDONLY_BIT]"i"(~DYNPTR_RDONLY_BIT),
+	   [TYPE_SHIFT]"i"(DYNPTR_TYPE_SHIFT));
+}
+
+__attribute__((naked))
+int bpf_test_inline_kfunc6(void)
+{
+	asm volatile (
+		"r0 = 0;"
+		"*(u64 *)(r10 - 8) = r0;"
+		"r0 = *(u64 *)(r10 - 8);"
+		"r6 = 1;"
+		"exit;"
+	);
+}
+
+__attribute__((naked))
+int bpf_test_inline_kfunc7(void)
+{
+	asm volatile (
+		"r0 = 0;"
+		"*(u64 *)(r10 - 8) = r10;"
+		"exit;"
+	);
+}
+
+__attribute__((naked))
+int bpf_test_inline_kfunc8(void)
+{
+	asm volatile (
+		"r0 = 0;"
+		"r1 = r10;"
+		"exit;"
+	);
+}
+
+#endif  /* __FOR_BPF */
+
+#ifndef __FOR_BPF
+
+/* Only interested in BPF assembly bodies of these functions, keep dummy bodies */
+__bpf_kfunc int bpf_test_inline_kfunc1(int p) { return 0; }
+__bpf_kfunc int bpf_test_inline_kfunc2(int *p) { return 0; }
+__bpf_kfunc int bpf_test_inline_kfunc3(void *p, u64 p__sz) { return 0; }
+__bpf_kfunc int bpf_test_inline_kfunc4(void) { return 0; }
+__bpf_kfunc int bpf_test_inline_kfunc5(struct bpf_dynptr *p) { return 0; }
+__bpf_kfunc int bpf_test_inline_kfunc6(void) { return 0; }
+__bpf_kfunc int bpf_test_inline_kfunc7(void) { return 0; }
+__bpf_kfunc int bpf_test_inline_kfunc8(void) { return 0; }
+
+#endif /* __FOR_BPF not defined */
+
+MODULE_LICENSE("Dual BSD/GPL");
diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index efd42c07f58a..730631603870 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -78,6 +78,7 @@
 #include "verifier_spill_fill.skel.h"
 #include "verifier_spin_lock.skel.h"
 #include "verifier_stack_ptr.skel.h"
+#include "verifier_inlinable_kfuncs.skel.h"
 #include "verifier_subprog_precision.skel.h"
 #include "verifier_subreg.skel.h"
 #include "verifier_tailcall_jit.skel.h"
@@ -291,3 +292,9 @@ void test_verifier_value_ptr_arith(void)
 		      verifier_value_ptr_arith__elf_bytes,
 		      init_value_ptr_arith_maps);
 }
+
+/* Do not drop CAP_SYS_ADMIN for these tests */
+void test_verifier_inlinable_kfuncs(void)
+{
+	RUN_TESTS(verifier_inlinable_kfuncs);
+}
diff --git a/tools/testing/selftests/bpf/progs/verifier_inlinable_kfuncs.c b/tools/testing/selftests/bpf/progs/verifier_inlinable_kfuncs.c
new file mode 100644
index 000000000000..bd1cf5f5956c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_inlinable_kfuncs.c
@@ -0,0 +1,253 @@
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_misc.h"
+#include <stdbool.h>
+#include "bpf_kfuncs.h"
+
+extern int bpf_test_inline_kfunc1(int p) __ksym;
+extern int bpf_test_inline_kfunc2(int *p) __ksym;
+extern int bpf_test_inline_kfunc3(void *p, __u64 p__sz) __ksym;
+extern int bpf_test_inline_kfunc4(void) __ksym;
+extern int bpf_test_inline_kfunc5(const struct bpf_dynptr *p) __ksym;
+extern int bpf_test_inline_kfunc6(void) __ksym;
+extern int bpf_test_inline_kfunc7(void) __ksym;
+extern int bpf_test_inline_kfunc8(void) __ksym;
+
+SEC("socket")
+/* verify that scalar params are marked as precise */
+__log_level(2)
+/* first call to kfunc */
+__msg("1: (85) call bpf_test_inline_kfunc1")
+__msg("mark_precise: frame0: last_idx 1 first_idx 0 subseq_idx -1")
+__msg("mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 1")
+/* second call to kfunc */
+__msg("3: (85) call bpf_test_inline_kfunc1")
+__msg("mark_precise: frame0: last_idx 3 first_idx 0 subseq_idx -1")
+__msg("mark_precise: frame0: regs=r1 stack= before 2: (b7) r1 = 2")
+/* check that dead code elimination took place independently for both callsites */
+__xlated("0: r1 = 1")
+__xlated("1: r0 = 42")
+__xlated("2: r0 = 11")
+__xlated("3: goto pc+0")
+__xlated("4: r1 = 2")
+__xlated("5: r0 = 42")
+__xlated("6: r0 = 22")
+__xlated("7: goto pc+0")
+__xlated("8: exit")
+__success
+__naked void two_callsites_scalar_param(void)
+{
+	asm volatile (
+		"r1 = 1;"
+		"call %[bpf_test_inline_kfunc1];"
+		"r1 = 2;"
+		"call %[bpf_test_inline_kfunc1];"
+		"exit;"
+		:
+		: __imm(bpf_test_inline_kfunc1)
+		: __clobber_all
+	);
+}
+
+SEC("socket")
+__xlated("0: r1 = 0")
+__xlated("1: r0 = 42")
+__xlated("2: r0 = 24")
+__xlated("3: goto pc+0")
+__xlated("4: exit")
+__success
+__naked void param_null(void)
+{
+	asm volatile (
+		"r1 = 0;"
+		"call %[bpf_test_inline_kfunc2];"
+		"exit;"
+		:
+		: __imm(bpf_test_inline_kfunc2)
+		: __clobber_all
+	);
+}
+
+SEC("socket")
+__xlated("0: r1 = r10")
+__xlated("1: r1 += -8")
+__xlated("2: r2 = 8")
+__xlated("3: r1 = *(u64 *)(r1 +0)")
+__xlated("4: *(u64 *)(r1 +0) = 42")
+__xlated("5: r0 = 0")
+__xlated("6: goto pc+0")
+__xlated("7: exit")
+__success
+__naked void param_kernel_value(void)
+{
+	asm volatile (
+		"r1 = r10;"
+		"r1 += -8;"
+		"r2 = 8;"
+		"call %[bpf_test_inline_kfunc3];"
+		"exit;"
+		:
+		: __imm(bpf_test_inline_kfunc3)
+		: __clobber_all
+	);
+}
+
+SEC("socket")
+__xlated("0: *(u64 *)(r10 -128) = r1")
+__xlated("1: *(u64 *)(r10 -136) = r6")
+__xlated("2: *(u64 *)(r10 -144) = r7")
+__xlated("3: r0 = 0")
+__xlated("4: r1 = 1")
+__xlated("5: r2 = 2")
+__xlated("6: r6 = 3")
+__xlated("7: r7 = 4")
+__xlated("8: goto pc+0")
+__xlated("9: r7 = *(u64 *)(r10 -144)")
+__xlated("10: r6 = *(u64 *)(r10 -136)")
+__xlated("11: r1 = *(u64 *)(r10 -128)")
+__xlated("12: exit")
+__success
+__naked void clobbered_regs(void)
+{
+	asm volatile (
+		"*(u64 *)(r10 - 128) = r1;"
+		"call %[bpf_test_inline_kfunc4];"
+		"r1 = *(u64 *)(r10 - 128);"
+		"exit;"
+		:
+		: __imm(bpf_test_inline_kfunc4)
+		: __clobber_all
+	);
+}
+
+SEC("socket")
+__xlated("0: *(u64 *)(r10 -32) = r1")
+__xlated("1: *(u64 *)(r10 -40) = r6")
+__xlated("2: r0 = 0")
+__xlated("3: *(u64 *)(r10 -48) = r0")
+__xlated("4: r0 = *(u64 *)(r10 -48)")
+__xlated("5: r6 = 1")
+__xlated("6: goto pc+0")
+__xlated("7: r6 = *(u64 *)(r10 -40)")
+__xlated("8: r1 = *(u64 *)(r10 -32)")
+__xlated("9: exit")
+__success
+__naked void clobbered_regs_and_stack(void)
+{
+	asm volatile (
+		"*(u64 *)(r10 - 32) = r1;"
+		"call %[bpf_test_inline_kfunc6];"
+		"r1 = *(u64 *)(r10 - 32);"
+		"exit;"
+		:
+		: __imm(bpf_test_inline_kfunc6)
+		: __clobber_all
+	);
+}
+
+SEC("socket")
+__xlated("0: call kernel-function")
+__xlated("1: exit")
+__success
+__naked void r10_escapes1(void)
+{
+	asm volatile (
+		"call %[bpf_test_inline_kfunc7];"
+		"exit;"
+		:
+		: __imm(bpf_test_inline_kfunc7)
+		: __clobber_all
+	);
+}
+
+SEC("socket")
+__xlated("0: call kernel-function")
+__xlated("1: exit")
+__success
+__naked void r10_escapes2(void)
+{
+	asm volatile (
+		"call %[bpf_test_inline_kfunc8];"
+		"exit;"
+		:
+		: __imm(bpf_test_inline_kfunc8)
+		: __clobber_all
+	);
+}
+
+SEC("xdp")
+__xlated("5: r1 = r10")
+__xlated("6: r1 += -16")
+__xlated("7: r1 = *(u32 *)(r1 +8)")
+__xlated("8: r1 &= ")
+__xlated("9: r1 >>= ")
+__xlated("10: r0 = 2")
+__xlated("11: goto pc+0")
+__xlated("12: exit")
+__success
+__naked void param_dynptr1(void)
+{
+	asm volatile (
+		"r1 = r1;"
+		"r2 = 0;"
+		"r3 = r10;"
+		"r3 += -16;"
+		"call %[bpf_dynptr_from_xdp];"
+		"r1 = r10;"
+		"r1 += -16;"
+		"call %[bpf_test_inline_kfunc5];"
+		"exit;"
+		:
+		: __imm(bpf_test_inline_kfunc5),
+		  __imm(bpf_dynptr_from_xdp)
+		: __clobber_all
+	);
+}
+
+SEC("cgroup_skb/egress")
+__xlated("5: r1 = r10")
+__xlated("6: r1 += -16")
+__xlated("7: r1 = *(u32 *)(r1 +8)")
+__xlated("8: r1 &= ")
+__xlated("9: r1 >>= ")
+__xlated("10: r0 = 1")
+__xlated("11: goto pc+0")
+__xlated("12: r0 &= 3")
+__xlated("13: exit")
+__success
+__naked void param_dynptr2(void)
+{
+	asm volatile (
+		"r1 = r1;"
+		"r2 = 0;"
+		"r3 = r10;"
+		"r3 += -16;"
+		"call %[bpf_dynptr_from_skb];"
+		"r1 = r10;"
+		"r1 += -16;"
+		"call %[bpf_test_inline_kfunc5];"
+		"r0 &= 3;"
+		"exit;"
+		:
+		: __imm(bpf_test_inline_kfunc5),
+		  __imm(bpf_dynptr_from_skb)
+		: __clobber_all
+	);
+}
+
+void __kfunc_btf_root(void)
+{
+	bpf_test_inline_kfunc1(0);
+	bpf_test_inline_kfunc2(0);
+	bpf_test_inline_kfunc3(0, 0);
+	bpf_test_inline_kfunc4();
+	bpf_test_inline_kfunc5(0);
+	bpf_test_inline_kfunc6();
+	bpf_test_inline_kfunc7();
+	bpf_test_inline_kfunc8();
+	bpf_dynptr_from_skb(0, 0, 0);
+	bpf_dynptr_from_xdp(0, 0, 0);
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [RFC bpf-next 11/11] selftests/bpf: dynptr_slice benchmark
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
                   ` (9 preceding siblings ...)
  2024-11-07 17:50 ` [RFC bpf-next 10/11] selftests/bpf: tests to verify handling of inlined kfuncs Eduard Zingerman
@ 2024-11-07 17:50 ` Eduard Zingerman
  2024-11-08 20:41 ` [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Toke Høiland-Jørgensen
  11 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 17:50 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

Benchmark to measure the execution time of a simple bpf_dynptr_slice()
call, when the dynptr type is known and the buffer__opt parameter is null.
Under such conditions the verifier inlines the kfunc call and deletes a
couple of conditionals in the body of the kfunc.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/Makefile          |  2 +
 tools/testing/selftests/bpf/bench.c           |  2 +
 .../selftests/bpf/benchs/bench_dynptr_slice.c | 76 +++++++++++++++++++
 .../selftests/bpf/progs/dynptr_slice_bench.c  | 29 +++++++
 4 files changed, 109 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_dynptr_slice.c
 create mode 100644 tools/testing/selftests/bpf/progs/dynptr_slice_bench.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index edef5df08cb2..3a938d5b295f 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -845,6 +845,7 @@ $(OUTPUT)/bench_local_storage_create.o: $(OUTPUT)/bench_local_storage_create.ske
 $(OUTPUT)/bench_bpf_hashmap_lookup.o: $(OUTPUT)/bpf_hashmap_lookup.skel.h
 $(OUTPUT)/bench_htab_mem.o: $(OUTPUT)/htab_mem_bench.skel.h
 $(OUTPUT)/bench_bpf_crypto.o: $(OUTPUT)/crypto_bench.skel.h
+$(OUTPUT)/bench_dynptr_slice.o: $(OUTPUT)/dynptr_slice_bench.skel.h
 $(OUTPUT)/bench.o: bench.h testing_helpers.h $(BPFOBJ)
 $(OUTPUT)/bench: LDLIBS += -lm
 $(OUTPUT)/bench: $(OUTPUT)/bench.o \
@@ -865,6 +866,7 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o \
 		 $(OUTPUT)/bench_local_storage_create.o \
 		 $(OUTPUT)/bench_htab_mem.o \
 		 $(OUTPUT)/bench_bpf_crypto.o \
+		 $(OUTPUT)/bench_dynptr_slice.o \
 		 #
 	$(call msg,BINARY,,$@)
 	$(Q)$(CC) $(CFLAGS) $(LDFLAGS) $(filter %.a %.o,$^) $(LDLIBS) -o $@
diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index 1bd403a5ef7b..74053665465c 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -549,6 +549,7 @@ extern const struct bench bench_local_storage_create;
 extern const struct bench bench_htab_mem;
 extern const struct bench bench_crypto_encrypt;
 extern const struct bench bench_crypto_decrypt;
+extern const struct bench bench_dynptr_slice;
 
 static const struct bench *benchs[] = {
 	&bench_count_global,
@@ -609,6 +610,7 @@ static const struct bench *benchs[] = {
 	&bench_htab_mem,
 	&bench_crypto_encrypt,
 	&bench_crypto_decrypt,
+	&bench_dynptr_slice,
 };
 
 static void find_benchmark(void)
diff --git a/tools/testing/selftests/bpf/benchs/bench_dynptr_slice.c b/tools/testing/selftests/bpf/benchs/bench_dynptr_slice.c
new file mode 100644
index 000000000000..957ecc6f1531
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/bench_dynptr_slice.c
@@ -0,0 +1,76 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+
+#include <argp.h>
+#include "bench.h"
+#include "dynptr_slice_bench.skel.h"
+
+static struct dynptr_slice_ctx {
+	struct dynptr_slice_bench *skel;
+	int pfd;
+} ctx;
+
+static void dynptr_slice_validate(void)
+{
+	if (env.consumer_cnt != 0) {
+		fprintf(stderr, "bpf dynptr_slice benchmark doesn't support consumer!\n");
+		exit(1);
+	}
+}
+
+static void dynptr_slice_setup(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, opts);
+	int err;
+
+	setup_libbpf();
+	ctx.skel = dynptr_slice_bench__open();
+	if (!ctx.skel) {
+		fprintf(stderr, "failed to open skeleton\n");
+		exit(1);
+	}
+
+	ctx.skel->bss->hits = 0;
+	err = dynptr_slice_bench__load(ctx.skel);
+	if (err) {
+		fprintf(stderr, "failed to load skeleton\n");
+		dynptr_slice_bench__destroy(ctx.skel);
+		exit(1);
+	}
+}
+
+static void dynptr_slice_encrypt_setup(void)
+{
+	dynptr_slice_setup();
+	ctx.pfd = bpf_program__fd(ctx.skel->progs.dynptr_slice);
+}
+
+
+static void dynptr_slice_measure(struct bench_res *res)
+{
+	res->hits = atomic_swap(&ctx.skel->bss->hits, 0);
+}
+
+static void *dynptr_slice_producer(void *unused)
+{
+	static const char data_in[1000];
+	LIBBPF_OPTS(bpf_test_run_opts, opts,
+		.repeat = 64,
+		.data_in = data_in,
+		.data_size_in = sizeof(data_in),
+	);
+
+	while (true)
+		(void)bpf_prog_test_run_opts(ctx.pfd, &opts);
+	return NULL;
+}
+
+const struct bench bench_dynptr_slice = {
+	.name = "dynptr_slice",
+	.validate = dynptr_slice_validate,
+	.setup = dynptr_slice_encrypt_setup,
+	.producer_thread = dynptr_slice_producer,
+	.measure = dynptr_slice_measure,
+	.report_progress = hits_drops_report_progress,
+	.report_final = hits_drops_report_final,
+};
diff --git a/tools/testing/selftests/bpf/progs/dynptr_slice_bench.c b/tools/testing/selftests/bpf/progs/dynptr_slice_bench.c
new file mode 100644
index 000000000000..65a493426b5e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/dynptr_slice_bench.c
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "bpf_compiler.h"
+#include "bpf_kfuncs.h"
+
+long hits = 0;
+
+SEC("tc")
+int dynptr_slice(struct __sk_buff *skb)
+{
+	struct bpf_dynptr psrc;
+	const int N = 100;
+	int i;
+
+	bpf_dynptr_from_skb(skb, 0, &psrc);
+__pragma_loop_unroll_full
+	for (i = 0; i < N; ++i) {
+		bpf_dynptr_slice(&psrc, i, NULL, 1);
+	}
+	__sync_add_and_fetch(&hits, N);
+
+	return 0;
+}
+
+char __license[] SEC("license") = "GPL";
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 10/11] selftests/bpf: tests to verify handling of inlined kfuncs
  2024-11-07 17:50 ` [RFC bpf-next 10/11] selftests/bpf: tests to verify handling of inlined kfuncs Eduard Zingerman
@ 2024-11-07 22:04   ` Jeff Johnson
  2024-11-07 22:08     ` Eduard Zingerman
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff Johnson @ 2024-11-07 22:04 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor

On 11/7/24 09:50, Eduard Zingerman wrote:
...
> diff --git a/tools/testing/selftests/bpf/bpf_testmod/test_inlinable_kfuncs.c b/tools/testing/selftests/bpf/bpf_testmod/test_inlinable_kfuncs.c
> new file mode 100644
> index 000000000000..d8b90ee7f2b0
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/bpf_testmod/test_inlinable_kfuncs.c
> @@ -0,0 +1,132 @@
...
> +MODULE_LICENSE("Dual BSD/GPL");

Since commit 1fffe7a34c89 ("script: modpost: emit a warning when the
description is missing"), a module without a MODULE_DESCRIPTION() will
result in a warning when built with make W=1. Not sure if this is
applicable to your new module, but if so, please add the missing
MODULE_DESCRIPTION().


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 10/11] selftests/bpf: tests to verify handling of inlined kfuncs
  2024-11-07 22:04   ` Jeff Johnson
@ 2024-11-07 22:08     ` Eduard Zingerman
  2024-11-07 22:19       ` Jeff Johnson
  0 siblings, 1 reply; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 22:08 UTC (permalink / raw)
  To: Jeff Johnson, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor

On Thu, 2024-11-07 at 14:04 -0800, Jeff Johnson wrote:

[...]

> Since commit 1fffe7a34c89 ("script: modpost: emit a warning when the
> description is missing"), a module without a MODULE_DESCRIPTION() will
> result in a warning when built with make W=1. Not sure if this is
> applicable to your new module, but if so, please add the missing
> MODULE_DESCRIPTION().
> 

Hi Jeff,

Thank you for the heads-up.
The MODULE_DESCRIPTION is already present in the bpf_testmod.c
(this file is renamed in the RFC, but remains as a part of the module).


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 10/11] selftests/bpf: tests to verify handling of inlined kfuncs
  2024-11-07 22:08     ` Eduard Zingerman
@ 2024-11-07 22:19       ` Jeff Johnson
  2024-11-07 23:00         ` Eduard Zingerman
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff Johnson @ 2024-11-07 22:19 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor

On 11/7/2024 2:08 PM, Eduard Zingerman wrote:
> On Thu, 2024-11-07 at 14:04 -0800, Jeff Johnson wrote:
> 
> [...]
> 
>> Since commit 1fffe7a34c89 ("script: modpost: emit a warning when the
>> description is missing"), a module without a MODULE_DESCRIPTION() will
>> result in a warning when built with make W=1. Not sure if this is
>> applicable to your new module, but if so, please add the missing
>> MODULE_DESCRIPTION().
>>
> 
> Hi Jeff,
> 
> Thank you for the heads-up.
> The MODULE_DESCRIPTION is already present in the bpf_testmod.c
> (this file is renamed in the RFC, but remains as a part of the module).

Does bpf_testmod.c already have a MODULE_LICENSE()? If so, then I'd drop the
extra one in test_inlinable_kfuncs.c.

My reviews on this subject are triggered by the lore search pattern:
MODULE_LICENSE AND NOT MODULE_DESCRIPTION

Since it is expected that the two should appear together.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 10/11] selftests/bpf: tests to verify handling of inlined kfuncs
  2024-11-07 22:19       ` Jeff Johnson
@ 2024-11-07 23:00         ` Eduard Zingerman
  0 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-07 23:00 UTC (permalink / raw)
  To: Jeff Johnson, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor

On Thu, 2024-11-07 at 14:19 -0800, Jeff Johnson wrote:

[...]

> Does bpf_testmod.c already have a MODULE_LICENSE()? If so, then I'd drop the
> extra one in test_inlinable_kfuncs.c.

It does, will drop the license line from test_inlinable_kfuncs.c.

> My reviews on this subject are triggered by the lore search pattern:
> MODULE_LICENSE AND NOT MODULE_DESCRIPTION
> 
> Since it is expected that the two should appear together.

Understood, thank you for explaining.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF
  2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
                   ` (10 preceding siblings ...)
  2024-11-07 17:50 ` [RFC bpf-next 11/11] selftests/bpf: dynptr_slice benchmark Eduard Zingerman
@ 2024-11-08 20:41 ` Toke Høiland-Jørgensen
  2024-11-08 23:01   ` Eduard Zingerman
  11 siblings, 1 reply; 34+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-11-08 20:41 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman, Jesper Dangaard Brouer

Eduard Zingerman <eddyz87@gmail.com> writes:

> Some time ago, in an off-list discussion, Alexei Starovoitov suggested
> compiling certain kfuncs to BPF to allow inlining calls to such kfuncs
> during verification. This RFC explores the idea.
>
> This RFC introduces a notion of inlinable BPF kfuncs.
> Inlinable kfuncs are compiled to BPF and are inlined by verifier after
> program verification. Inlined kfunc bodies are subject to dead code
> removal and removal of conditional jumps, if such jumps are proved to
> always follow a single branch.

Ohh, this is very exciting!

Mostly want to comment on this bit:

> Imo, this RFC is worth following through only if number of kfuncs
> benefiting from inlining is big. If the set is limited to dynptr
> family of functions, it is simpler to add a number of hard-coded
> inlining templates for such functions (similarly to what is currently
> done for some helpers).

One place where this would definitely be applicable is in all the XDP HW
metadata kfuncs. Right now, there's a function call for each piece of HW
metadata that an XDP program wants to read, which quickly adds up. And
in XDP land we are counting function calls, as the overhead (~1.1 ns) is
directly measurable in XDP PPS performance.

Back when we settled on the kfunc approach to reading metadata, we were
discussing this overhead, obviously, and whether we should do the
bespoke BPF assembly type inlining that we currently do for map lookups
and that sort of thing. We were told that the "right" way to do the
inlining is something along the lines of what you are proposing here, so
I would very much encourage you to continue working on this!

One complication for the XDP kfuncs is that the kfunc that the BPF
program calls is actually a stub function in the kernel core; at
verification time, the actual function call is replaced with one from
the network driver (see bpf_dev_bound_resolve_kfunc()). So somehow
supporting this (with kfuncs defined in drivers, i.e., in modules) would
be needed for the XDP use case.
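
For reference, a minimal sketch of the pattern in question -- a separate
kfunc call for every piece of HW metadata the program reads (this assumes
the bpf_xdp_metadata_rx_timestamp()/_rx_hash() kfuncs plus vmlinux.h and
bpf_helpers.h; it is an illustration, not code from this series):

    extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
                                             __u64 *timestamp) __ksym;
    extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
                                        enum xdp_rss_hash_type *rss_type) __ksym;

    SEC("xdp")
    int rx_metadata(struct xdp_md *ctx)
    {
    	enum xdp_rss_hash_type rss_type;
    	__u64 timestamp;
    	__u32 hash;

    	/* each piece of metadata is a separate call on the hot path */
    	if (bpf_xdp_metadata_rx_timestamp(ctx, &timestamp))
    		timestamp = 0;
    	if (bpf_xdp_metadata_rx_hash(ctx, &hash, &rss_type))
    		hash = 0;

    	return XDP_PASS;
    }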

Happy to help with benchmarking for the XDP use case when/if this can be
supported, of course! :)

(+Jesper, who I'm sure will be happy to help as well)

-Toke

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs
  2024-11-07 17:50 ` [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs Eduard Zingerman
@ 2024-11-08 20:43   ` Toke Høiland-Jørgensen
  2024-11-08 21:25     ` Eduard Zingerman
  2024-11-15  0:27   ` Andrii Nakryiko
  1 sibling, 1 reply; 34+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-11-08 20:43 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Eduard Zingerman

Eduard Zingerman <eddyz87@gmail.com> writes:

> Inlinable kfuncs are available only if CLANG is used for kernel
> compilation.

To what extent is this a fundamental limitation? AFAIU, this comes from
the fact that you are re-using the intermediate compilation stages,
right? But if those are absent, couldn't we just invoke a full clang
compile from source of the same file (so you could get the inlining even
when compiling with GCC)?

-Toke

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs
  2024-11-08 20:43   ` Toke Høiland-Jørgensen
@ 2024-11-08 21:25     ` Eduard Zingerman
  2024-11-11 18:41       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-08 21:25 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor

On Fri, 2024-11-08 at 21:43 +0100, Toke Høiland-Jørgensen wrote:
> Eduard Zingerman <eddyz87@gmail.com> writes:
> 
> > Inlinable kfuncs are available only if CLANG is used for kernel
> > compilation.
> 
> To what extent is this a fundamental limitation? AFAIU, this comes from
> the fact that you are re-using the intermediate compilation stages,
> right?

Yes, the main obstacle is the C --clang--> bitcode (as for the host) --llc--> BPF pipeline.
And this intermediate step is needed to include some of the header
files as-is (but not all of them will work, e.g. those where host inline
assembly is not dead-code-eliminated by the optimizer would error out).
The reason why 'clang --target=bpf' can't be used with these headers
is that headers check current architecture in various places, however:
- there is no BPF architecture defined at the moment;
- most of the time host architecture is what's needed, e.g.
  here is a fragment of arch/x86/include/asm/current.h:

  struct pcpu_hot {
  	union {
  		struct {
  			struct task_struct	*current_task;
  			int			preempt_count;
  			int			cpu_number;
  #ifdef CONFIG_MITIGATION_CALL_DEPTH_TRACKING
  			u64			call_depth;
  #endif
  			unsigned long		top_of_stack;
  			void			*hardirq_stack_ptr;
  			u16			softirq_pending;
  #ifdef CONFIG_X86_64
  			bool			hardirq_stack_inuse;
  #else
  			void			*softirq_stack_ptr;
  #endif
  		};
  		u8	pad[64];
  	};
  };

If an inlinable kfunc operates on the pcpu_hot structure,
it has to see the same binary layout as the host.
So, technically, the 'llc' step is not necessary, but if it is not present
something else should be done about the header files.

> But if those are absent, couldn't we just invoke a full clang
> compile from source of the same file (so you could get the inlining even
> when compiling with GCC)?

Yes, hybrid should work w/o problems if headers are dealt with in some
other way.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF
  2024-11-08 20:41 ` [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Toke Høiland-Jørgensen
@ 2024-11-08 23:01   ` Eduard Zingerman
  2024-11-11 18:42     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-08 23:01 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Jesper Dangaard Brouer

On Fri, 2024-11-08 at 21:41 +0100, Toke Høiland-Jørgensen wrote:

[...]

Hi Toke,

> Back when we settled on the kfunc approach to reading metadata, we were
> discussing this overhead, obviously, and whether we should do the
> bespoke BPF assembly type inlining that we currently do for map lookups
> and that sort of thing. We were told that the "right" way to do the
> inlining is something along the lines of what you are proposing here, so
> I would very much encourage you to continue working on this!
> 
> One complication for the XDP kfuncs is that the kfunc that the BPF
> program calls is actually a stub function in the kernel core; at
> verification time, the actual function call is replaced with one from
> the network driver (see bpf_dev_bound_resolve_kfunc()). So somehow
> supporting this (with kfuncs defined in drivers, i.e., in modules) would
> be needed for the XDP use case.

Thank you for the pointer to bpf_dev_bound_resolve_kfunc().
Looking at specialize_kfunc(), I will have to extend the interface for
selecting inlinable function body. The inlinable kfuncs already could
be defined in modules, so this should be a relatively small adjustment.

> Happy to help with benchmarking for the XDP use case when/if this can be
> supported, of course! :)
> 
> (+Jesper, who I'm sure will be happy to help as well)

Thank you, help with benchmarking is most welcome.
I'm very interested in real-world benchmarks, as I'm not fully sold on
this feature: it adds a significant layer of complexity to the verifier.
I'll reach out to you and Jesper after adding support for inlining of
XDP metadata kfuncs.

Thanks,
Eduard


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs
  2024-11-08 21:25     ` Eduard Zingerman
@ 2024-11-11 18:41       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 34+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-11-11 18:41 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor

Eduard Zingerman <eddyz87@gmail.com> writes:

> On Fri, 2024-11-08 at 21:43 +0100, Toke Høiland-Jørgensen wrote:
>> Eduard Zingerman <eddyz87@gmail.com> writes:
>> 
>> > Inlinable kfuncs are available only if CLANG is used for kernel
>> > compilation.
>> 
>> To what extent is this a fundamental limitation? AFAIU, this comes from
>> the fact that you are re-using the intermediate compilation stages,
>> right?
>
> Yes, the main obstacle is the C --clang--> bitcode (as for the host) --llc--> BPF pipeline.
> And this intermediate step is needed to include some of the header
> files as-is (but not all of them will work, e.g. those where host inline
> assembly is not dead-code-eliminated by the optimizer would error out).
> The reason why 'clang --target=bpf' can't be used with these headers
> is that headers check current architecture in various places, however:
> - there is no BPF architecture defined at the moment;
> - most of the time host architecture is what's needed, e.g.
>   here is a fragment of arch/x86/include/asm/current.h:
>
>   struct pcpu_hot {
>   	union {
>   		struct {
>   			struct task_struct	*current_task;
>   			int			preempt_count;
>   			int			cpu_number;
>   #ifdef CONFIG_MITIGATION_CALL_DEPTH_TRACKING
>   			u64			call_depth;
>   #endif
>   			unsigned long		top_of_stack;
>   			void			*hardirq_stack_ptr;
>   			u16			softirq_pending;
>   #ifdef CONFIG_X86_64
>   			bool			hardirq_stack_inuse;
>   #else
>   			void			*softirq_stack_ptr;
>   #endif
>   		};
>   		u8	pad[64];
>   	};
>   };
>
> If an inlinable kfunc operates on the pcpu_hot structure,
> it has to see the same binary layout as the host.
> So, technically, the 'llc' step is not necessary, but if it is not present
> something else should be done about the header files.

Right, makes sense. Do any of the kfuncs you are targeting currently use
headers that have this problem? If not, could a stopgap solution be to
just restrict the set of kfuncs that can be inlined to those that can be
compiled with `clang --target=bpf`? That may require moving around some
code a bit, but there are other examples where all the kfuncs for a
subsystem are kept in a separate .c file anyway (IIRC, netfilter does this).

>> But if those are absent, couldn't we just invoke a full clang
>> compile from source of the same file (so you could get the inlining even
>> when compiling with GCC)?
>
> Yes, hybrid should work w/o problems if headers are dealt with in some
> other way.

But couldn't a hybrid approach be used even in the case of GCC
compilation? I.e., compile it once with GCC (for inclusion into the
vmlinux/.ko file) and once with clang (in host mode), and then pass it
through LLC to generate BPF?

-Toke

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF
  2024-11-08 23:01   ` Eduard Zingerman
@ 2024-11-11 18:42     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 34+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-11-11 18:42 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
	Jesper Dangaard Brouer

Eduard Zingerman <eddyz87@gmail.com> writes:

> On Fri, 2024-11-08 at 21:41 +0100, Toke Høiland-Jørgensen wrote:
>
> [...]
>
> Hi Toke,
>
>> Back when we settled on the kfunc approach to reading metadata, we were
>> discussing this overhead, obviously, and whether we should do the
>> bespoke BPF assembly type inlining that we currently do for map lookups
>> and that sort of thing. We were told that the "right" way to do the
>> inlining is something along the lines of what you are proposing here, so
>> I would very much encourage you to continue working on this!
>> 
>> One complication for the XDP kfuncs is that the kfunc that the BPF
>> program calls is actually a stub function in the kernel core; at
>> verification time, the actual function call is replaced with one from
>> the network driver (see bpf_dev_bound_resolve_kfunc()). So somehow
>> supporting this (with kfuncs defined in drivers, i.e., in modules) would
>> be needed for the XDP use case.
>
> Thank you for the pointer to bpf_dev_bound_resolve_kfunc().
> Looking at specialize_kfunc(), I will have to extend the interface for
> selecting inlinable function body. The inlinable kfuncs already could
> be defined in modules, so this should be a relatively small adjustment.

Awesome!

>> Happy to help with benchmarking for the XDP use case when/if this can be
>> supported, of course! :)
>> 
>> (+Jesper, who I'm sure will be happy to help as well)
>
> Thank you, help with benchmarking is most welcome.
> I'm very interested in real-world benchmarks, as I'm not fully sold on
> this feature: it adds a significant layer of complexity to the verifier.
> I'll reach out to you and Jesper after adding support for inlining of
> XDP metadata kfuncs.

Sounds good, thanks!

-Toke

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-07 17:50 ` [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches() Eduard Zingerman
@ 2024-11-14 22:20   ` Eduard Zingerman
  2024-11-15  0:17     ` Andrii Nakryiko
  2024-11-15  0:16   ` Andrii Nakryiko
  1 sibling, 1 reply; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-14 22:20 UTC (permalink / raw)
  To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor

On Thu, 2024-11-07 at 09:50 -0800, Eduard Zingerman wrote:
> Consider dead code elimination problem for program like below:
> 
>     main:
>       1: r1 = 42
>       2: call <subprogram>;
>       3: exit
> 
>     subprogram:
>       4: r0 = 1
>       5: if r1 != 42 goto +1
>       6: r0 = 2
>       7: exit;
> 
> Here verifier would visit every instruction and thus
> bpf_insn_aux_data->seen flag would be set for both true (7)
> and fallthrough (6) branches of conditional (5).
> Hence opt_hard_wire_dead_code_branches() will not replace
> conditional (5) with unconditional jump.

[...]

Had an off-list discussion with Alexei yesterday,
here are some answers to questions raised:
- The patches #1,2 with opt_hard_wire_dead_code_branches() changes are
  not necessary for dynptr_slice kfunc inlining / branch removal.
  I will drop these patches and adjust test cases.
- Did some measurements for dynptr_slice call using simple benchmark
  from patch #11:
  - baseline:
    76.167 ± 0.030M/s million calls per second;
  - with call inlining, but without branch pruning (only patch #3):
    101.198 ± 0.101M/s million calls per second;
  - with call inlining and with branch pruning (full patch-set):
    116.935 ± 0.142M/s million calls per second.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-07 17:50 ` [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches() Eduard Zingerman
  2024-11-14 22:20   ` Eduard Zingerman
@ 2024-11-15  0:16   ` Andrii Nakryiko
  1 sibling, 0 replies; 34+ messages in thread
From: Andrii Nakryiko @ 2024-11-15  0:16 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	memxor

On Thu, Nov 7, 2024 at 9:51 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Consider dead code elimination problem for program like below:
>
>     main:
>       1: r1 = 42
>       2: call <subprogram>;
>       3: exit
>
>     subprogram:
>       4: r0 = 1
>       5: if r1 != 42 goto +1
>       6: r0 = 2
>       7: exit;
>
> Here verifier would visit every instruction and thus
> bpf_insn_aux_data->seen flag would be set for both true (7)
> and falltrhough (6) branches of conditional (5).
> Hence opt_hard_wire_dead_code_branches() will not replace
> conditional (5) with unconditional jump.
>
> To cover such cases:
> - add two fields in struct bpf_insn_aux_data:
>   - true_branch_taken;
>   - false_branch_taken;
> - adjust check_cond_jmp_op() to set the fields according to jump
>   predictions;
> - handle these flags in the opt_hard_wire_dead_code_branches():
>   - true_branch_taken && !false_branch_taken
>     always jump, replace instruction with 'goto off';
>   - !true_branch_taken && false_branch_taken
>     always fallthrough, replace with 'goto +0';
>   - true_branch_taken && false_branch_taken
>     jump and fallthrough are possible, don't change the instruction;
>   - !true_branch_taken && !false_branch_taken
>     neither jump, nor fallthrough are possible, if condition itself
>     must be a dead code (should be removed by opt_remove_dead_code).
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  include/linux/bpf_verifier.h |  4 +++-
>  kernel/bpf/verifier.c        | 16 ++++++++++++----
>  2 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 4513372c5bc8..ed4eacfd4db7 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -570,7 +570,9 @@ struct bpf_insn_aux_data {
>         struct btf_struct_meta *kptr_struct_meta;
>         u64 map_key_state; /* constant (32 bit) key tracking for maps */
>         int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
> -       u32 seen; /* this insn was processed by the verifier at env->pass_cnt */
> +       bool seen; /* this insn was processed by the verifier at env->pass_cnt */
> +       bool true_branch_taken; /* for cond jumps, set if verifier ever took true branch */
> +       bool false_branch_taken; /* for cond jumps, set if verifier ever took false branch */

we'll now have 12 bool fields in bpf_insn_aux_data, probably it's time
to switch to bitfields for those (even though you are trading 4 for 3
bytes here), 72+ bytes per instruction is not unnoticeable, especially
for big BPF programs

>         bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */
>         bool zext_dst; /* this insn zero extends dst reg */
>         bool needs_zext; /* alu op needs to clear upper bits */

[...]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-14 22:20   ` Eduard Zingerman
@ 2024-11-15  0:17     ` Andrii Nakryiko
  2024-11-15  0:19       ` Andrii Nakryiko
  2024-11-15  0:20       ` Eduard Zingerman
  0 siblings, 2 replies; 34+ messages in thread
From: Andrii Nakryiko @ 2024-11-15  0:17 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	memxor

On Thu, Nov 14, 2024 at 2:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-11-07 at 09:50 -0800, Eduard Zingerman wrote:
> > Consider dead code elimination problem for program like below:
> >
> >     main:
> >       1: r1 = 42
> >       2: call <subprogram>;
> >       3: exit
> >
> >     subprogram:
> >       4: r0 = 1
> >       5: if r1 != 42 goto +1
> >       6: r0 = 2
> >       7: exit;
> >
> > Here verifier would visit every instruction and thus
> > bpf_insn_aux_data->seen flag would be set for both true (7)
> > and fallthrough (6) branches of conditional (5).
> > Hence opt_hard_wire_dead_code_branches() will not replace
> > conditional (5) with unconditional jump.
>
> [...]
>
> Had an off-list discussion with Alexei yesterday,
> here are some answers to questions raised:
> - The patches #1,2 with opt_hard_wire_dead_code_branches() changes are
>   not necessary for dynptr_slice kfunc inlining / branch removal.
>   I will drop these patches and adjust test cases.
> - Did some measurements for dynptr_slice call using simple benchmark
>   from patch #11:
>   - baseline:
>     76.167 ± 0.030M/s million calls per second;
>   - with call inlining, but without branch pruning (only patch #3):
>     101.198 ± 0.101M/s million calls per second;
>   - with call inlining and with branch pruning (full patch-set):
>     116.935 ± 0.142M/s million calls per second.
>

This true/false logic seems generally useful not just for this use
case, is there anything wrong with landing it? Seems pretty
straightforward. I'd split it from the kfunc inlining and land
independently.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-15  0:17     ` Andrii Nakryiko
@ 2024-11-15  0:19       ` Andrii Nakryiko
  2024-11-15  0:50         ` Eduard Zingerman
  2024-11-15  0:20       ` Eduard Zingerman
  1 sibling, 1 reply; 34+ messages in thread
From: Andrii Nakryiko @ 2024-11-15  0:19 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	memxor

On Thu, Nov 14, 2024 at 4:17 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Nov 14, 2024 at 2:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Thu, 2024-11-07 at 09:50 -0800, Eduard Zingerman wrote:
> > > Consider dead code elimination problem for program like below:
> > >
> > >     main:
> > >       1: r1 = 42
> > >       2: call <subprogram>;
> > >       3: exit
> > >
> > >     subprogram:
> > >       4: r0 = 1
> > >       5: if r1 != 42 goto +1
> > >       6: r0 = 2
> > >       7: exit;
> > >
> > > Here verifier would visit every instruction and thus
> > > bpf_insn_aux_data->seen flag would be set for both true (7)
> > > and fallthrough (6) branches of conditional (5).
> > > Hence opt_hard_wire_dead_code_branches() will not replace
> > > conditional (5) with unconditional jump.
> >
> > [...]
> >
> > Had an off-list discussion with Alexei yesterday,
> > here are some answers to questions raised:
> > - The patches #1,2 with opt_hard_wire_dead_code_branches() changes are
> >   not necessary for dynptr_slice kfunc inlining / branch removal.
> >   I will drop these patches and adjust test cases.
> > - Did some measurements for dynptr_slice call using simple benchmark
> >   from patch #11:
> >   - baseline:
> >     76.167 ± 0.030M/s million calls per second;
> >   - with call inlining, but without branch pruning (only patch #3):
> >     101.198 ± 0.101M/s million calls per second;
> >   - with call inlining and with branch pruning (full patch-set):
> >     116.935 ± 0.142M/s million calls per second.
> >
>
> This true/false logic seems generally useful not just for this use
> case, is there anything wrong with landing it? Seems pretty
> straightforward. I'd split it from the kfunc inlining and land
> independently.

I was also always hoping that we'll eventually optimize the following pattern:

r1 = *(global var)
if r1 == 1 /* always 1 or 0 */
   goto +...
...


This is extremely common with .rodata global variables, and while the
branches are dead code eliminated, memory reads are not. Not sure how
involved it would be to do this.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-15  0:17     ` Andrii Nakryiko
  2024-11-15  0:19       ` Andrii Nakryiko
@ 2024-11-15  0:20       ` Eduard Zingerman
  2024-11-15  0:27         ` Alexei Starovoitov
  1 sibling, 1 reply; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-15  0:20 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	memxor

On Thu, 2024-11-14 at 16:17 -0800, Andrii Nakryiko wrote:

[...]

> This true/false logic seems generally useful not just for this use
> case, is there anything wrong with landing it? Seems pretty
> straightforward. I'd split it from the kfunc inlining and land
> independently.

I don't see anything wrong with it, but Alexei didn't like it.
Agree with your comment about organizing flags as bit-fields.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs
  2024-11-07 17:50 ` [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs Eduard Zingerman
  2024-11-08 20:43   ` Toke Høiland-Jørgensen
@ 2024-11-15  0:27   ` Andrii Nakryiko
  1 sibling, 0 replies; 34+ messages in thread
From: Andrii Nakryiko @ 2024-11-15  0:27 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	memxor

On Thu, Nov 7, 2024 at 9:51 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> This commit adds a notion of inlinable kfuncs, compiled both to native
> code and BPF. BPF-compiled version is embedded in kernel data section
> and is available to verifier. Verifier uses it to replace calls to
> such kfuncs with inlined function bodies.
>
> Inlinable kfuncs are available only if CLANG is used for kernel
> compilation.
>
> In the scope of this commit all inlining happens as last steps of
> do_check(), after verification is finished. Follow up commits would
> extend this mechanism to allow removal of some conditional branches
> inside inlined function bodies.
>
> The commit consists of the following changes:
> - main kernel makefile:
>   modified to compile a bootstrap version of the bpftool;
> - kernel/bpf/Makefile:
>   - a new file inlinable_kfuncs.c is added;
>   - makefile is modified to compile this file as BPF elf,
>     using the following steps:
>     - use clang with native target to produce LLVM bitcode;
>     - compile LLVM bitcode to BPF object file;
>     - resolve relocations inside BPF object file using bpftool as a
>       linker;
>     Such arrangement allows including unmodified network related
>     header files.
> - verifier.c:
>   - generated BPF elf is included as a part of kernel data section;
>   - at kernel initialization phase:
>     - the elf is parsed and each function declared within it is
>       recorded as an instance of 'inlinable_kfunc' structure;
>     - calls to extern functions within elf file (pointed to by
>       relocation records) are replaced with kfunc call instructions;
>   - do_check() is modified to replace calls to kfuncs from inlinable
>     kfunc table with function bodies:
>     - replacement happens after main verification pass, so the bodies
>       of the kfuncs are not analyzed by verifier;
>     - if kfunc uses callee saved registers r6-r9 the spill/fill pairs
>       are generated for these register before/after inlined kfunc body
>       at call site;
>     - if kfunc uses r10 as a base pointer for load or store
>       instructions, offsets of these instructions are adjusted;
>     - if kfunc uses r10 in other instructions, such r10 is considered
>       as escaping and kfunc is not inlined.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  Makefile                      |  22 +-
>  kernel/bpf/Makefile           |  24 +-
>  kernel/bpf/inlinable_kfuncs.c |   1 +
>  kernel/bpf/verifier.c         | 652 +++++++++++++++++++++++++++++++++-
>  4 files changed, 680 insertions(+), 19 deletions(-)
>  create mode 100644 kernel/bpf/inlinable_kfuncs.c
>
> diff --git a/Makefile b/Makefile
> index a9a7d9ffaa98..4ded57f4b0c2 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -496,6 +496,7 @@ CLIPPY_DRIVER       = clippy-driver
>  BINDGEN                = bindgen
>  PAHOLE         = pahole
>  RESOLVE_BTFIDS = $(objtree)/tools/bpf/resolve_btfids/resolve_btfids
> +BPFTOOL                = $(objtree)/tools/bpf/bpftool/bootstrap/bpftool
>  LEX            = flex
>  YACC           = bison
>  AWK            = awk
> @@ -585,7 +586,7 @@ export RUSTC_BOOTSTRAP := 1
>  export ARCH SRCARCH CONFIG_SHELL BASH HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE LD CC HOSTPKG_CONFIG
>  export RUSTC RUSTDOC RUSTFMT RUSTC_OR_CLIPPY_QUIET RUSTC_OR_CLIPPY BINDGEN
>  export HOSTRUSTC KBUILD_HOSTRUSTFLAGS
> -export CPP AR NM STRIP OBJCOPY OBJDUMP READELF PAHOLE RESOLVE_BTFIDS LEX YACC AWK INSTALLKERNEL
> +export CPP AR NM STRIP OBJCOPY OBJDUMP READELF PAHOLE RESOLVE_BTFIDS BPFTOOL LEX YACC AWK INSTALLKERNEL
>  export PERL PYTHON3 CHECK CHECKFLAGS MAKE UTS_MACHINE HOSTCXX
>  export KGZIP KBZIP2 KLZOP LZMA LZ4 XZ ZSTD
>  export KBUILD_HOSTCXXFLAGS KBUILD_HOSTLDFLAGS KBUILD_HOSTLDLIBS LDFLAGS_MODULE
> @@ -1356,6 +1357,25 @@ ifneq ($(wildcard $(resolve_btfids_O)),)
>         $(Q)$(MAKE) -sC $(srctree)/tools/bpf/resolve_btfids O=$(resolve_btfids_O) clean
>  endif
>
> +# TODO: cross compilation?
> +# TODO: bootstrap! (to avoid vmlinux.h generation)
> +PHONY += bpftool_bootstrap bpftool_clean
> +bpftool_O = $(abspath $(objtree))/tools/bpf/bpftool
> +
> +ifdef CONFIG_BPF
> +ifdef CONFIG_CC_IS_CLANG
> +prepare: bpftool_bootstrap
> +endif
> +endif
> +
> +bpftool_bootstrap:
> +       $(Q)$(MAKE) -sC $(srctree)/tools/bpf/bpftool O=$(bpftool_O) srctree=$(abspath $(srctree)) bootstrap
> +
> +bpftool_clean:
> +ifneq ($(wildcard $(bpftool_O)),)
> +       $(Q)$(MAKE) -sC $(srctree)/tools/bpf/bpftool O=$(bpftool_O) srctree=$(abspath $(srctree)) clean
> +endif
> +
>  # Clear a bunch of variables before executing the submake
>  ifeq ($(quiet),silent_)
>  tools_silent=s
> diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
> index 105328f0b9c0..3d7ee81c8e2e 100644
> --- a/kernel/bpf/Makefile
> +++ b/kernel/bpf/Makefile
> @@ -6,7 +6,7 @@ cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse
>  endif
>  CFLAGS_core.o += -Wno-override-init $(cflags-nogcse-yy)
>
> -obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o token.o
> +obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o tnum.o log.o token.o
>  obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o
>  obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
>  obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
> @@ -53,3 +53,25 @@ obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
>  obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o
>  obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
>  obj-$(CONFIG_BPF_SYSCALL) += kmem_cache_iter.o
> +obj-$(CONFIG_BPF_SYSCALL) += helpers.o inlinable_kfuncs.o
> +
> +ifdef CONFIG_CC_IS_CLANG
> +
> +LLC ?= $(LLVM_PREFIX)llc$(LLVM_SUFFIX)
> +
> +# -mfentry -pg is $(CC_FLAGS_FTRACE)
> +# -fpatchable-function-entry=16,16 is $(PADDING_CFLAGS)
> +CFLAGS_REMOVE_inlinable_kfuncs.bpf.bc.o += $(CC_FLAGS_FTRACE)
> +CFLAGS_REMOVE_inlinable_kfuncs.bpf.bc.o += $(PADDING_CFLAGS)
> +$(obj)/inlinable_kfuncs.bpf.bc.o: $(src)/inlinable_kfuncs.c
> +       $(Q)$(CLANG) $(c_flags) -emit-llvm -c $< -o $@
> +
> +$(obj)/inlinable_kfuncs.bpf.o: $(obj)/inlinable_kfuncs.bpf.bc.o
> +       $(Q)$(LLC) -mcpu=v3 --mtriple=bpf --filetype=obj $< -o $@
> +
> +$(obj)/inlinable_kfuncs.bpf.linked.o: $(obj)/inlinable_kfuncs.bpf.o
> +       $(Q)$(BPFTOOL) gen object $@ $<

what's the point? Just to strip DWARF? `strip -g` then?

but honestly, why even bother embedding entire ELF? Get binary data
from .text, separately record ELF symbols (to know function name,
size, and where they start). Keep it simple and minimal.


> +
> +$(obj)/verifier.o: $(obj)/inlinable_kfuncs.bpf.linked.o
> +
> +endif
> diff --git a/kernel/bpf/inlinable_kfuncs.c b/kernel/bpf/inlinable_kfuncs.c
> new file mode 100644
> index 000000000000..7b7dc05fa1a4
> --- /dev/null
> +++ b/kernel/bpf/inlinable_kfuncs.c
> @@ -0,0 +1 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 3bae0bbc1da9..fbf51147f319 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -20509,6 +20509,622 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>         return 0;
>  }
>

[...]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-15  0:20       ` Eduard Zingerman
@ 2024-11-15  0:27         ` Alexei Starovoitov
  2024-11-15  0:33           ` Eduard Zingerman
  0 siblings, 1 reply; 34+ messages in thread
From: Alexei Starovoitov @ 2024-11-15  0:27 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song,
	Kumar Kartikeya Dwivedi

On Thu, Nov 14, 2024 at 4:20 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-11-14 at 16:17 -0800, Andrii Nakryiko wrote:
>
> [...]
>
> > This true/false logic seems generally useful not just for this use
> > case, is there anything wrong with landing it? Seems pretty
> > straightforward. I'd split it from the kfunc inlining and land
> > independently.
>
> I don't see anything wrong with it, but Alexei didn't like it.
> Agree with your comment about organizing flags as bit-fields.

Well, I asked whether it makes a difference.
And looks like your numbers show that this optimization adds 101m to 116m ?

If so, it's worth doing. I still don't like two bool flags though.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-15  0:27         ` Alexei Starovoitov
@ 2024-11-15  0:33           ` Eduard Zingerman
  2024-11-15  0:38             ` Alexei Starovoitov
  0 siblings, 1 reply; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-15  0:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song,
	Kumar Kartikeya Dwivedi

On Thu, 2024-11-14 at 16:27 -0800, Alexei Starovoitov wrote:

[...]

> Well, I asked whether it makes a difference.
> And looks like your numbers show that this optimization adds 101m to 116m ?

No, it does not make a difference for dynptr_slice:

> > - The patches #1,2 with opt_hard_wire_dead_code_branches() changes are
> >   not necessary for dynptr_slice kfunc inlining / branch removal.
> >  I will drop these patches and adjust test cases.

The 101m -> 116m is for inlining w/o known branch removal -> inlining with branch removal.
(With 76m being no inlining at all).

> If so, it's worth doing. I still don't like two bool flags though.

The pieces of information to encode here are:
- whether the branch is always predicted;
- if so, which branch.

I don't think this could be narrowed down to a single bit.
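
FWIW, a sketch of one possible two-bit encoding (the names are made up for
illustration, this is not part of the series):

    enum branch_prediction {
    	BRANCH_NOT_SEEN = 0, /* neither outcome observed, condition is dead code */
    	BRANCH_ALWAYS   = 1, /* only the true branch was ever taken */
    	BRANCH_NEVER    = 2, /* only the fallthrough was ever taken */
    	BRANCH_EITHER   = 3, /* both outcomes observed, keep the conditional */
    };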


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-15  0:33           ` Eduard Zingerman
@ 2024-11-15  0:38             ` Alexei Starovoitov
  2024-11-15  0:43               ` Eduard Zingerman
  0 siblings, 1 reply; 34+ messages in thread
From: Alexei Starovoitov @ 2024-11-15  0:38 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song,
	Kumar Kartikeya Dwivedi

On Thu, Nov 14, 2024 at 4:34 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-11-14 at 16:27 -0800, Alexei Starovoitov wrote:
>
> [...]
>
> > Well, I asked whether it makes a difference.
> > And looks like your numbers show that this optimization adds 101m to 116m ?
>
> No, it does not make a difference for dynptr_slice:

Then let's not do it. If there is no observable performance
difference there is no need to do it just to make selftests shorter.

> > > - The patches #1,2 with opt_hard_wire_dead_code_branches() changes are
> > >   not necessary for dynptr_slice kfunc inlining / branch removal.
> > >  I will drop these patches and adjust test cases.
>
> The 101m -> 116m is for inlining w/o known branch removal -> inlining with branch removal.
> (With 76m being no inlining at all).

Not following. Which patch # does branch removal then?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-15  0:38             ` Alexei Starovoitov
@ 2024-11-15  0:43               ` Eduard Zingerman
  0 siblings, 0 replies; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-15  0:43 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song,
	Kumar Kartikeya Dwivedi

On Thu, 2024-11-14 at 16:38 -0800, Alexei Starovoitov wrote:

[...]

> > The 101m -> 116m is for inlining w/o known branch removal -> inlining with branch removal.
> > (With 76m being no inlining at all).
> 
> Not following. Which patch # does branch removal then?

- "bpf: shared BPF/native kfuncs" (patch #3)
  Build system integration and kfuncs inlining after verification.

- "bpf: instantiate inlinable kfuncs before verification" (patch #7)
  Adds a pass that clones inlinable kfunc bodies as hidden
  subprograms, one subprogram per callsite.

#3 does inlining, but does not remove any branches.
#7 moves where inlining is done, which makes it possible to remove branches.

Performance numbers for the simple test:
- #3 alone : 76m -> 101m
- #3 + #7  : 76m -> 116m
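
To illustrate with a made-up kfunc (path_a()/path_b() are hypothetical
helpers, this is not the actual dynptr code):

    extern void *path_a(void *buf);
    extern void *path_b(void);

    /* hypothetical inlinable kfunc */
    void *my_kfunc(void *buffer__opt)
    {
            if (buffer__opt)
                    return path_a(buffer__opt);
            return path_b();
    }

With #3 alone the body is pasted at the callsite after verification, so
the 'if' and both arms stay in the program. With #3 + #7 the body is
cloned per callsite before verification; if the verifier proves that the
buffer argument is non-null at a given callsite, the conditional and the
path_b() arm can be dropped from that clone when it is inlined.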


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-15  0:19       ` Andrii Nakryiko
@ 2024-11-15  0:50         ` Eduard Zingerman
  2024-11-15  3:03           ` Andrii Nakryiko
  0 siblings, 1 reply; 34+ messages in thread
From: Eduard Zingerman @ 2024-11-15  0:50 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	memxor

On Thu, 2024-11-14 at 16:19 -0800, Andrii Nakryiko wrote:

[...]

> I was also always hoping that we'll eventually optimize the following pattern:
> 
> r1 = *(global var)
> if r1 == 1 /* always 1 or 0 */
>    goto +...
> ...
> 
> 
> This is extremely common with .rodata global variables, and while the
> branches are dead code eliminated, memory reads are not. Not sure how
> involved it would be to do this.

Could you please elaborate a bit?
For a simple test like the one below, the compiler replaces 'flag' with 1,
so no action is needed from the verifier:

    const int flag = 1;

    SEC("socket")
    __success
    __xlated("foobar")
    int rodata_test(void *ctx)
    {
    	if (flag)
    		return 1;
    	return 0;
    }


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches()
  2024-11-15  0:50         ` Eduard Zingerman
@ 2024-11-15  3:03           ` Andrii Nakryiko
  0 siblings, 0 replies; 34+ messages in thread
From: Andrii Nakryiko @ 2024-11-15  3:03 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
	memxor

On Thu, Nov 14, 2024 at 4:50 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-11-14 at 16:19 -0800, Andrii Nakryiko wrote:
>
> [...]
>
> > I was also always hoping that we'll eventually optimize the following pattern:
> >
> > r1 = *(global var)
> > if r1 == 1 /* always 1 or 0 */
> >    goto +...
> > ...
> >
> >
> > This is extremely common with .rodata global variables, and while the
> > branches are dead code eliminated, memory reads are not. Not sure how
> > involved it would be to do this.
>
> Could you please elaborate a bit?
> For a simple test like the one below, the compiler replaces 'flag' with 1,
> so no action is needed from the verifier:
>
>     const int flag = 1;

now change this to an actual .rodata global variable:

const volatile int flag = 1;

>
>     SEC("socket")
>     __success
>     __xlated("foobar")
>     int rodata_test(void *ctx)
>     {
>         if (flag)
>                 return 1;
>         return 0;
>     }
>
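
Roughly the code the compiler emits for this (illustrative BPF
pseudo-assembly, not actual compiler output):

    r1 = <ld_imm64, pointer to 'flag' in the .rodata map>
    r1 = *(u32 *)(r1 + 0)     /* the read of 'flag' stays */
    if r1 == 0 goto +2        /* value is known from frozen .rodata */
    r0 = 1
    exit
    r0 = 0
    exit

The verifier proves r1 == 1, so the branch and the unreachable tail are
dead-code eliminated, but the two instructions that load 'flag' are kept.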

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2024-11-15  3:03 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-07 17:50 [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Eduard Zingerman
2024-11-07 17:50 ` [RFC bpf-next 01/11] bpf: use branch predictions in opt_hard_wire_dead_code_branches() Eduard Zingerman
2024-11-14 22:20   ` Eduard Zingerman
2024-11-15  0:17     ` Andrii Nakryiko
2024-11-15  0:19       ` Andrii Nakryiko
2024-11-15  0:50         ` Eduard Zingerman
2024-11-15  3:03           ` Andrii Nakryiko
2024-11-15  0:20       ` Eduard Zingerman
2024-11-15  0:27         ` Alexei Starovoitov
2024-11-15  0:33           ` Eduard Zingerman
2024-11-15  0:38             ` Alexei Starovoitov
2024-11-15  0:43               ` Eduard Zingerman
2024-11-15  0:16   ` Andrii Nakryiko
2024-11-07 17:50 ` [RFC bpf-next 02/11] selftests/bpf: tests for opt_hard_wire_dead_code_branches() Eduard Zingerman
2024-11-07 17:50 ` [RFC bpf-next 03/11] bpf: shared BPF/native kfuncs Eduard Zingerman
2024-11-08 20:43   ` Toke Høiland-Jørgensen
2024-11-08 21:25     ` Eduard Zingerman
2024-11-11 18:41       ` Toke Høiland-Jørgensen
2024-11-15  0:27   ` Andrii Nakryiko
2024-11-07 17:50 ` [RFC bpf-next 04/11] bpf: allow specifying inlinable kfuncs in modules Eduard Zingerman
2024-11-07 17:50 ` [RFC bpf-next 05/11] bpf: dynamic allocation for bpf_verifier_env->subprog_info Eduard Zingerman
2024-11-07 17:50 ` [RFC bpf-next 06/11] bpf: KERNEL_VALUE register type Eduard Zingerman
2024-11-07 17:50 ` [RFC bpf-next 07/11] bpf: instantiate inlinable kfuncs before verification Eduard Zingerman
2024-11-07 17:50 ` [RFC bpf-next 08/11] bpf: special rules for kernel function calls inside inlinable kfuncs Eduard Zingerman
2024-11-07 17:50 ` [RFC bpf-next 09/11] bpf: move selected dynptr kfuncs to inlinable_kfuncs.c Eduard Zingerman
2024-11-07 17:50 ` [RFC bpf-next 10/11] selftests/bpf: tests to verify handling of inlined kfuncs Eduard Zingerman
2024-11-07 22:04   ` Jeff Johnson
2024-11-07 22:08     ` Eduard Zingerman
2024-11-07 22:19       ` Jeff Johnson
2024-11-07 23:00         ` Eduard Zingerman
2024-11-07 17:50 ` [RFC bpf-next 11/11] selftests/bpf: dynptr_slice benchmark Eduard Zingerman
2024-11-08 20:41 ` [RFC bpf-next 00/11] bpf: inlinable kfuncs for BPF Toke Høiland-Jørgensen
2024-11-08 23:01   ` Eduard Zingerman
2024-11-11 18:42     ` Toke Høiland-Jørgensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).