public inbox for bpf@vger.kernel.org
* [PATCH v3 bpf-next 00/13] BPF indirect jumps
@ 2025-09-18  9:38 Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 01/13] bpf: fix the return value of push_stack Anton Protopopov
                   ` (13 more replies)
  0 siblings, 14 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

This patchset implements a new map type, the instruction array, and
uses it to build support for indirect branches in BPF (on x86). (The
same map type will later be used to provide support for indirect calls
and static keys.) See [1], [2] for more context.

Short table of contents:

  * Patches 1-6 implement the new map type,
    BPF_MAP_TYPE_INSN_ARRAY, and corresponding selftests. This map
    can be used to track the "original -> xlated -> jitted" mapping
    for a given program. Patches 5 and 6 add support for the
    "blinded" variant.

  * Patches 7-9 implement support for indirect jumps.

  * Patches 10-13 add support for LLVM-compiled programs containing
    indirect jumps.

A special LLVM build should be used for this, see [3] for the details
and some related discussions. For this reason, selftests for indirect
jumps which directly use `goto *rX` are commented out (so that
CI can run). Instead, I've run test_progs compiled with
indirect jumps as described in [4] (in brief, all tests which
normally pass on my setup also pass with indirect jumps).

There is a list of TBDs (mostly, more selftests), but the list of
changes is big enough to warrant sending this version.

See individual patches for more implementation details.

v2 -> v3:
  * fix build failure when CONFIG_BPF_SYSCALL is not set (kbuild-bot)
  * reformat bpftool help messages (Quentin)

v1 -> v2:

  * push_stack changes:
    * sanitize_speculative_path should just return int (Eduard)
    * return code from sanitize_speculative_path, not EFAULT (Eduard)
    * when BPF_COMPLEXITY_LIMIT_JMP_SEQ is reached, return E2BIG (Eduard)

  * indirect jumps:
    * omit support for .imm=fd in gotox, as we're not using it for now (Eduard)
    * struct jt -> struct bpf_iarray (Eduard)
    * insn_successors: rewrite the interface to just return a pointer (Eduard)
    * remove min_index/max_index, use umin_value/umax_value instead (Alexei, Eduard)
    * move emit_indirect_jump args change to the previous patch (Eduard)
    * add a comment to map_mem_size() (Eduard)
    * use verifier_bug for some error cases in check_indirect_jump (Eduard)
    * clear_insn_aux_data: use start,len instead of start,end (Eduard)
    * make regs[insn->dst_reg].type = PTR_TO_INSN part of check_mem_access (Eduard)

  * constant blinding changes:
    * make subprog_start adjustment better readable (Eduard)
    * do not set subprog len, it is already set (Eduard)

  * libbpf:
    * remove check that relocations from .rodata are ok (Anton)
    * do not freeze the map, it is not necessary anymore (Anton)
    * rename the goto_x -> gotox everywhere (Anton)
    * use u64 when parsing LLVM jump tables (Eduard)
    * split patch in two due to spaces->tabs change (Eduard)
    * split bpftool changes to bpftool patch (Andrii)
    * make sym_size a union with ext_idx (Andrii)
    * properly copy/free the jumptables_data section from elf (Andrii)
    * a few cosmetic changes around create_jt_map (Andrii)
    * fix some comments + rewrite patch description (Andrii)
    * inline bpf_prog__append_subprog_offsets (Andrii)
    * subprog_sec_offst -> subprog_sec_off (Andrii)
    * !strcmp -> strcmp() == 0 (Andrii)
    * make some function names more readable (Andrii)
    * allocate table of subfunc offsets via libbpf_reallocarray (Andrii)

  * selftests:
    * squash insn_array* tests together (Anton)

  * fixed build warnings (kernel test robot)

RFC -> v1:

  * I've tried to address all the comments provided by Alexei and
    Eduard on the RFC. The most important of them are listed below.
  * One big change: move from the older LLVM version [5] to the
    newer one [4]. Now LLVM generates jump tables as symbols in the
    new special section ".jumptables". Another part of this change is
    that libbpf no longer tries to link the map load and goto *rX, as
    1) this is not reliable, and 2) for some use cases it is
    impossible (namely, when more than one jump table can be used
    in the same gotox instruction).
  * Added insn_successors() support (Alexei, Eduard). This includes
    getting rid of the ugly bpf_insn_set_iter_xlated_offset()
    interface (Eduard).
  * Removed the hack for the unreachable instruction, as the new
    LLVM, thanks to Eduard, doesn't generate it.
  * Set mem_size for direct map access properly instead of hacking
    around it; remove the off>0 check. (Alexei)
  * Do not allocate new memory for min_index/max_index (Alexei, Eduard)
  * Information required during check_cfg is now cached to be reused
    later (Alexei + general logic for supporting multiple JT per jump)
  * Properly compare registers in regsafe (Alexei, Eduard)
  * Remove support for JMP32 (Eduard)
  * Better checks in adjust_ptr_min_max_vals (Eduard)
  * More selftests were added (but still there's room for more) which
    directly use gotox (Alexei)
  * More checks and verbose messages added
  * the map no longer stores "unique pointers"

Links:
  1. https://lpc.events/event/18/contributions/1941/
  2. https://lwn.net/Articles/1017439/
  3. https://github.com/llvm/llvm-project/pull/149715
  4. https://github.com/llvm/llvm-project/pull/149715#issuecomment-3274833753
  5. v1: https://lore.kernel.org/bpf/20250816180631.952085-1-a.s.protopopov@gmail.com/
  6. rfc: https://lore.kernel.org/bpf/20250615085943.3871208-1-a.s.protopopov@gmail.com/


Anton Protopopov (13):
  bpf: fix the return value of push_stack
  bpf: save the start of functions in bpf_prog_aux
  bpf, x86: add new map type: instructions array
  selftests/bpf: add selftests for new insn_array map
  bpf: support instructions arrays with constants blinding
  selftests/bpf: test instructions arrays with blinding
  bpf, x86: allow indirect jumps to r8...r15
  bpf, x86: add support for indirect jumps
  bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X
  libbpf: fix formatting of bpf_object__append_subprog_code
  libbpf: support llvm-generated indirect jumps
  bpftool: Recognize insn_array map type
  selftests/bpf: add selftests for indirect jumps

 arch/x86/net/bpf_jit_comp.c                   |  39 +-
 include/linux/bpf.h                           |  40 ++
 include/linux/bpf_types.h                     |   1 +
 include/linux/bpf_verifier.h                  |  17 +
 include/uapi/linux/bpf.h                      |  11 +
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/bpf_insn_array.c                   | 350 ++++++++++
 kernel/bpf/core.c                             |  21 +
 kernel/bpf/disasm.c                           |   9 +
 kernel/bpf/log.c                              |   1 +
 kernel/bpf/syscall.c                          |  22 +
 kernel/bpf/verifier.c                         | 646 ++++++++++++++++--
 .../bpf/bpftool/Documentation/bpftool-map.rst |   3 +-
 tools/bpf/bpftool/map.c                       |   3 +-
 tools/include/uapi/linux/bpf.h                |  11 +
 tools/lib/bpf/libbpf.c                        | 192 +++++-
 tools/lib/bpf/libbpf_probes.c                 |   4 +
 tools/lib/bpf/linker.c                        |  10 +-
 tools/testing/selftests/bpf/Makefile          |   4 +-
 .../selftests/bpf/prog_tests/bpf_gotox.c      | 132 ++++
 .../selftests/bpf/prog_tests/bpf_insn_array.c | 497 ++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_gotox.c | 384 +++++++++++
 22 files changed, 2289 insertions(+), 110 deletions(-)
 create mode 100644 kernel/bpf/bpf_insn_array.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_gotox.c

-- 
2.34.1



* [PATCH v3 bpf-next 01/13] bpf: fix the return value of push_stack
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-19  0:17   ` Eduard Zingerman
  2025-09-18  9:38 ` [PATCH v3 bpf-next 02/13] bpf: save the start of functions in bpf_prog_aux Anton Protopopov
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

In [1] Eduard mentioned that on push_stack failure the verifier
should return -ENOMEM instead of -EFAULT. After checking the other
call sites, I found that the code inconsistently returns either
-ENOMEM or -EFAULT. This patch unifies the return values of
push_stack (and the similar push_async_cb) so that error codes are
always propagated properly.

  [1] https://lore.kernel.org/bpf/20250615085943.3871208-1-a.s.protopopov@gmail.com

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 kernel/bpf/verifier.c | 80 +++++++++++++++++++++----------------------
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index beaa391e02fb..6e4abb06d5e4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2120,7 +2120,7 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 
 	elem = kzalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL_ACCOUNT);
 	if (!elem)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	elem->insn_idx = insn_idx;
 	elem->prev_insn_idx = prev_insn_idx;
@@ -2130,12 +2130,12 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 	env->stack_size++;
 	err = copy_verifier_state(&elem->st, cur);
 	if (err)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 	elem->st.speculative |= speculative;
 	if (env->stack_size > BPF_COMPLEXITY_LIMIT_JMP_SEQ) {
 		verbose(env, "The sequence of %d jumps is too complex.\n",
 			env->stack_size);
-		return NULL;
+		return ERR_PTR(-E2BIG);
 	}
 	if (elem->st.parent) {
 		++elem->st.parent->branches;
@@ -2932,7 +2932,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 
 	elem = kzalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL_ACCOUNT);
 	if (!elem)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	elem->insn_idx = insn_idx;
 	elem->prev_insn_idx = prev_insn_idx;
@@ -2944,7 +2944,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 		verbose(env,
 			"The sequence of %d jumps is too complex for async cb.\n",
 			env->stack_size);
-		return NULL;
+		return ERR_PTR(-E2BIG);
 	}
 	/* Unlike push_stack() do not copy_verifier_state().
 	 * The caller state doesn't matter.
@@ -2955,7 +2955,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 	elem->st.in_sleepable = is_sleepable;
 	frame = kzalloc(sizeof(*frame), GFP_KERNEL_ACCOUNT);
 	if (!frame)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 	init_func_state(env, frame,
 			BPF_MAIN_FUNC /* callsite */,
 			0 /* frameno within this callchain */,
@@ -9070,8 +9070,8 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
 		prev_st = find_prev_entry(env, cur_st->parent, insn_idx);
 		/* branch out active iter state */
 		queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
-		if (!queued_st)
-			return -ENOMEM;
+		if (IS_ERR(queued_st))
+			return PTR_ERR(queued_st);
 
 		queued_iter = get_iter_from_state(queued_st, meta);
 		queued_iter->iter.state = BPF_ITER_STATE_ACTIVE;
@@ -10641,8 +10641,8 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
 		async_cb = push_async_cb(env, env->subprog_info[subprog].start,
 					 insn_idx, subprog,
 					 is_bpf_wq_set_callback_impl_kfunc(insn->imm));
-		if (!async_cb)
-			return -EFAULT;
+		if (IS_ERR(async_cb))
+			return PTR_ERR(async_cb);
 		callee = async_cb->frame[0];
 		callee->async_entry_cnt = caller->async_entry_cnt + 1;
 
@@ -10658,8 +10658,8 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
 	 * proceed with next instruction within current frame.
 	 */
 	callback_state = push_stack(env, env->subprog_info[subprog].start, insn_idx, false);
-	if (!callback_state)
-		return -ENOMEM;
+	if (IS_ERR(callback_state))
+		return PTR_ERR(callback_state);
 
 	err = setup_func_entry(env, subprog, insn_idx, set_callee_state_cb,
 			       callback_state);
@@ -13808,9 +13808,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		struct bpf_reg_state *regs;
 
 		branch = push_stack(env, env->insn_idx + 1, env->insn_idx, false);
-		if (!branch) {
+		if (IS_ERR(branch)) {
 			verbose(env, "failed to push state for failed lock acquisition\n");
-			return -ENOMEM;
+			return PTR_ERR(branch);
 		}
 
 		regs = branch->frame[branch->curframe]->regs;
@@ -14238,16 +14238,15 @@ struct bpf_sanitize_info {
 	bool mask_to_left;
 };
 
-static struct bpf_verifier_state *
-sanitize_speculative_path(struct bpf_verifier_env *env,
-			  const struct bpf_insn *insn,
-			  u32 next_idx, u32 curr_idx)
+static int sanitize_speculative_path(struct bpf_verifier_env *env,
+				     const struct bpf_insn *insn,
+				     u32 next_idx, u32 curr_idx)
 {
 	struct bpf_verifier_state *branch;
 	struct bpf_reg_state *regs;
 
 	branch = push_stack(env, next_idx, curr_idx, true);
-	if (branch && insn) {
+	if (!IS_ERR(branch) && insn) {
 		regs = branch->frame[branch->curframe]->regs;
 		if (BPF_SRC(insn->code) == BPF_K) {
 			mark_reg_unknown(env, regs, insn->dst_reg);
@@ -14256,7 +14255,7 @@ sanitize_speculative_path(struct bpf_verifier_env *env,
 			mark_reg_unknown(env, regs, insn->src_reg);
 		}
 	}
-	return branch;
+	return IS_ERR(branch) ? PTR_ERR(branch) : 0;
 }
 
 static int sanitize_ptr_alu(struct bpf_verifier_env *env,
@@ -14275,7 +14274,6 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
 	u8 opcode = BPF_OP(insn->code);
 	u32 alu_state, alu_limit;
 	struct bpf_reg_state tmp;
-	bool ret;
 	int err;
 
 	if (can_skip_alu_sanitation(env, insn))
@@ -14348,11 +14346,12 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
 		tmp = *dst_reg;
 		copy_register_state(dst_reg, ptr_reg);
 	}
-	ret = sanitize_speculative_path(env, NULL, env->insn_idx + 1,
-					env->insn_idx);
-	if (!ptr_is_dst_reg && ret)
+	err = sanitize_speculative_path(env, NULL, env->insn_idx + 1, env->insn_idx);
+	if (err < 0)
+		return REASON_STACK;
+	if (!ptr_is_dst_reg)
 		*dst_reg = tmp;
-	return !ret ? REASON_STACK : 0;
+	return 0;
 }
 
 static void sanitize_mark_insn_seen(struct bpf_verifier_env *env)
@@ -16675,8 +16674,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 
 		/* branch out 'fallthrough' insn as a new state to explore */
 		queued_st = push_stack(env, idx + 1, idx, false);
-		if (!queued_st)
-			return -ENOMEM;
+		if (IS_ERR(queued_st))
+			return PTR_ERR(queued_st);
 
 		queued_st->may_goto_depth++;
 		if (prev_st)
@@ -16754,10 +16753,11 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		 * the fall-through branch for simulation under speculative
 		 * execution.
 		 */
-		if (!env->bypass_spec_v1 &&
-		    !sanitize_speculative_path(env, insn, *insn_idx + 1,
-					       *insn_idx))
-			return -EFAULT;
+		if (!env->bypass_spec_v1) {
+			err = sanitize_speculative_path(env, insn, *insn_idx + 1, *insn_idx);
+			if (err < 0)
+				return err;
+		}
 		if (env->log.level & BPF_LOG_LEVEL)
 			print_insn_state(env, this_branch, this_branch->curframe);
 		*insn_idx += insn->off;
@@ -16767,11 +16767,12 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		 * program will go. If needed, push the goto branch for
 		 * simulation under speculative execution.
 		 */
-		if (!env->bypass_spec_v1 &&
-		    !sanitize_speculative_path(env, insn,
-					       *insn_idx + insn->off + 1,
-					       *insn_idx))
-			return -EFAULT;
+		if (!env->bypass_spec_v1) {
+			err = sanitize_speculative_path(env, insn, *insn_idx + insn->off + 1,
+							*insn_idx);
+			if (err < 0)
+				return err;
+		}
 		if (env->log.level & BPF_LOG_LEVEL)
 			print_insn_state(env, this_branch, this_branch->curframe);
 		return 0;
@@ -16792,10 +16793,9 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 			return err;
 	}
 
-	other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx,
-				  false);
-	if (!other_branch)
-		return -EFAULT;
+	other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx, false);
+	if (IS_ERR(other_branch))
+		return PTR_ERR(other_branch);
 	other_branch_regs = other_branch->frame[other_branch->curframe]->regs;
 
 	if (BPF_SRC(insn->code) == BPF_X) {
-- 
2.34.1



* [PATCH v3 bpf-next 02/13] bpf: save the start of functions in bpf_prog_aux
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 01/13] bpf: fix the return value of push_stack Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 03/13] bpf, x86: add new map type: instructions array Anton Protopopov
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Introduce a new subprog_start field in bpf_prog_aux. This field may
be used by JIT compilers that need to know the real absolute xlated
offset of the function being jitted. func_info[func_id] could have
served this purpose, but func_info may be NULL, so JIT compilers
can't rely on it.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 include/linux/bpf.h   | 1 +
 kernel/bpf/verifier.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 41f776071ff5..1056fd0d54d3 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1601,6 +1601,7 @@ struct bpf_prog_aux {
 	u32 ctx_arg_info_size;
 	u32 max_rdonly_access;
 	u32 max_rdwr_access;
+	u32 subprog_start;
 	struct btf *attach_btf;
 	struct bpf_ctx_arg_aux *ctx_arg_info;
 	void __percpu *priv_stack_ptr;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6e4abb06d5e4..b8c4b4dd2ddf 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21611,6 +21611,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->func_idx = i;
 		/* Below members will be freed only at prog->aux */
 		func[i]->aux->btf = prog->aux->btf;
+		func[i]->aux->subprog_start = subprog_start;
 		func[i]->aux->func_info = prog->aux->func_info;
 		func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
 		func[i]->aux->poke_tab = prog->aux->poke_tab;
-- 
2.34.1



* [PATCH v3 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 01/13] bpf: fix the return value of push_stack Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 02/13] bpf: save the start of functions in bpf_prog_aux Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 04/13] selftests/bpf: add selftests for new insn_array map Anton Protopopov
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

On the bpf(BPF_PROG_LOAD) syscall, user-supplied BPF programs are
translated by the verifier into "xlated" BPF programs. During this
process the original instruction offsets might be adjusted and/or
individual instructions might be replaced by new sequences of
instructions, or deleted.

Add a new BPF map type aimed at keeping track of how, for a given
program, the original instructions were relocated during
verification. Besides keeping track of the original -> xlated
mapping, make the x86 JIT build the xlated -> jitted mapping for
every instruction listed in an instruction array. This is required
for every future application of instruction arrays: static keys,
indirect jumps and indirect calls.

A map of the BPF_MAP_TYPE_INSN_ARRAY type must be created with u32
keys and 8-byte values. The values have different semantics for
userspace and for BPF space. For userspace a value consists of two
u32 values: the xlated and jitted offsets. On the BPF side the value
is a real pointer to a jitted instruction.

On map creation/initialization, before loading the program, each
element of the map should be initialized to point to an instruction
offset within the program. Before the program is loaded, such maps
must be frozen. After program verification the xlated and jitted
offsets can be read via the bpf(2) syscall.

If a tracked instruction is removed by the verifier, then the xlated
offset is set to (u32)-1 which is considered to be too big for a valid
BPF program offset.

One such map can, obviously, be used to track one and only one BPF
program. If the verification process was unsuccessful, the same
map can be re-used to verify the program with a different log level.
However, if the program was loaded successfully, then such a map,
being frozen in any case, can't be reused by other programs even
after the program is released.

Example. Consider the following original and xlated programs:

    Original prog:                      Xlated prog:

     0:  r1 = 0x0                        0: r1 = 0
     1:  *(u32 *)(r10 - 0x4) = r1        1: *(u32 *)(r10 -4) = r1
     2:  r2 = r10                        2: r2 = r10
     3:  r2 += -0x4                      3: r2 += -4
     4:  r1 = 0x0 ll                     4: r1 = map[id:88]
     6:  call 0x1                        6: r1 += 272
                                         7: r0 = *(u32 *)(r2 +0)
                                         8: if r0 >= 0x1 goto pc+3
                                         9: r0 <<= 3
                                        10: r0 += r1
                                        11: goto pc+1
                                        12: r0 = 0
     7:  r6 = r0                        13: r6 = r0
     8:  if r6 == 0x0 goto +0x2         14: if r6 == 0x0 goto pc+4
     9:  call 0x76                      15: r0 = 0xffffffff8d2079c0
                                        17: r0 = *(u64 *)(r0 +0)
    10:  *(u64 *)(r6 + 0x0) = r0        18: *(u64 *)(r6 +0) = r0
    11:  r0 = 0x0                       19: r0 = 0x0
    12:  exit                           20: exit

An instruction array map containing, e.g., instructions [0,4,7,12]
will be translated by the verifier to [0,4,13,20]. A map containing
index 5 (the middle of a 16-byte instruction) or an index greater
than 12 (outside the program boundaries) would be rejected.

The functionality provided by this patch will be extended in
subsequent patches to implement BPF static keys, indirect jumps, and
indirect calls.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 arch/x86/net/bpf_jit_comp.c    |   8 +
 include/linux/bpf.h            |  38 ++++
 include/linux/bpf_types.h      |   1 +
 include/linux/bpf_verifier.h   |   2 +
 include/uapi/linux/bpf.h       |  11 ++
 kernel/bpf/Makefile            |   2 +-
 kernel/bpf/bpf_insn_array.c    | 336 +++++++++++++++++++++++++++++++++
 kernel/bpf/syscall.c           |  22 +++
 kernel/bpf/verifier.c          |  43 +++++
 tools/include/uapi/linux/bpf.h |  11 ++
 10 files changed, 473 insertions(+), 1 deletion(-)
 create mode 100644 kernel/bpf/bpf_insn_array.c

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 8d34a9400a5e..8792d7f371d3 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1664,6 +1664,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 	prog = temp;
 
 	for (i = 1; i <= insn_cnt; i++, insn++) {
+		u32 abs_xlated_off = bpf_prog->aux->subprog_start + i - 1;
 		const s32 imm32 = insn->imm;
 		u32 dst_reg = insn->dst_reg;
 		u32 src_reg = insn->src_reg;
@@ -2717,6 +2718,13 @@ st:			if (is_imm8(insn->off))
 				return -EFAULT;
 			}
 			memcpy(rw_image + proglen, temp, ilen);
+
+			/*
+			 * Instruction arrays need to know how xlated code
+			 * maps to jitted code
+			 */
+			bpf_prog_update_insn_ptr(bpf_prog, abs_xlated_off, proglen,
+						 image + proglen);
 		}
 		proglen += ilen;
 		addrs[i] = proglen;
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 1056fd0d54d3..ee3967f2a025 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -3717,4 +3717,42 @@ int bpf_prog_get_file_line(struct bpf_prog *prog, unsigned long ip, const char *
 			   const char **linep, int *nump);
 struct bpf_prog *bpf_prog_find_from_stack(void);
 
+int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog);
+int bpf_insn_array_ready(struct bpf_map *map);
+void bpf_insn_array_release(struct bpf_map *map);
+void bpf_insn_array_adjust(struct bpf_map *map, u32 off, u32 len);
+void bpf_insn_array_adjust_after_remove(struct bpf_map *map, u32 off, u32 len);
+
+/*
+ * The struct bpf_insn_ptr structure describes a pointer to a
+ * particular instruction in a loaded BPF program. Initially
+ * it is initialised from userspace via user_value.xlated_off.
+ * During the program verification all other fields are populated
+ * accordingly:
+ *
+ *   jitted_ip:       address of the instruction in the jitted image
+ *   user_value:      user-visible xlated and jitted offsets
+ *   orig_xlated_off: original offset of the instruction
+ */
+struct bpf_insn_ptr {
+	void *jitted_ip;
+	struct bpf_insn_array_value user_value;
+	u32 orig_xlated_off;
+};
+
+#ifdef CONFIG_BPF_SYSCALL
+void bpf_prog_update_insn_ptr(struct bpf_prog *prog,
+			      u32 xlated_off,
+			      u32 jitted_off,
+			      void *jitted_ip);
+#else
+static inline void
+bpf_prog_update_insn_ptr(struct bpf_prog *prog,
+			 u32 xlated_off,
+			 u32 jitted_off,
+			 void *jitted_ip)
+{
+}
+#endif
+
 #endif /* _LINUX_BPF_H */
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index fa78f49d4a9a..b13de31e163f 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -133,6 +133,7 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_RINGBUF, ringbuf_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_BLOOM_FILTER, bloom_filter_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_USER_RINGBUF, user_ringbuf_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARENA, arena_map_ops)
+BPF_MAP_TYPE(BPF_MAP_TYPE_INSN_ARRAY, insn_array_map_ops)
 
 BPF_LINK_TYPE(BPF_LINK_TYPE_RAW_TRACEPOINT, raw_tracepoint)
 BPF_LINK_TYPE(BPF_LINK_TYPE_TRACING, tracing)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 020de62bd09c..aca43c284203 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -766,8 +766,10 @@ struct bpf_verifier_env {
 	struct list_head free_list;	/* list of struct bpf_verifier_state_list */
 	struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */
 	struct btf_mod_pair used_btfs[MAX_USED_BTFS]; /* array of BTF's used by BPF program */
+	struct bpf_map *insn_array_maps[MAX_USED_MAPS]; /* array of INSN_ARRAY map's to be relocated */
 	u32 used_map_cnt;		/* number of used maps */
 	u32 used_btf_cnt;		/* number of used BTF objects */
+	u32 insn_array_map_cnt;		/* number of used maps of type BPF_MAP_TYPE_INSN_ARRAY */
 	u32 id_gen;			/* used to generate unique reg IDs */
 	u32 hidden_subprog_cnt;		/* number of hidden subprogs */
 	int exception_callback_subprog;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 233de8677382..021c27ee5591 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1026,6 +1026,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_USER_RINGBUF,
 	BPF_MAP_TYPE_CGRP_STORAGE,
 	BPF_MAP_TYPE_ARENA,
+	BPF_MAP_TYPE_INSN_ARRAY,
 	__MAX_BPF_MAP_TYPE
 };
 
@@ -7623,4 +7624,14 @@ enum bpf_kfunc_flags {
 	BPF_F_PAD_ZEROS = (1ULL << 0),
 };
 
+/*
+ * Values of a BPF_MAP_TYPE_INSN_ARRAY entry must be of this type.
+ * On updates jitted_off must be equal to 0.
+ */
+struct bpf_insn_array_value {
+	__u32 jitted_off;
+	__u32 xlated_off;
+};
+
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index f6cf8c2af5f7..e596b66a48e6 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -9,7 +9,7 @@ CFLAGS_core.o += -Wno-override-init $(cflags-nogcse-yy)
 obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o token.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
-obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
+obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o bpf_insn_array.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o
 obj-${CONFIG_BPF_LSM}	  += bpf_inode_storage.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o mprog.o
diff --git a/kernel/bpf/bpf_insn_array.c b/kernel/bpf/bpf_insn_array.c
new file mode 100644
index 000000000000..0c8dac62f457
--- /dev/null
+++ b/kernel/bpf/bpf_insn_array.c
@@ -0,0 +1,336 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/bpf.h>
+#include <linux/sort.h>
+
+#define MAX_INSN_ARRAY_ENTRIES 256
+
+struct bpf_insn_array {
+	struct bpf_map map;
+	struct mutex state_mutex;
+	int state;
+	long *ips;
+	DECLARE_FLEX_ARRAY(struct bpf_insn_ptr, ptrs);
+};
+
+enum {
+	INSN_ARRAY_STATE_FREE = 0,
+	INSN_ARRAY_STATE_INIT,
+	INSN_ARRAY_STATE_READY,
+};
+
+#define cast_insn_array(MAP_PTR) \
+	container_of(MAP_PTR, struct bpf_insn_array, map)
+
+#define INSN_DELETED ((u32)-1)
+
+static inline u32 insn_array_alloc_size(u32 max_entries)
+{
+	const u32 base_size = sizeof(struct bpf_insn_array);
+	const u32 entry_size = sizeof(struct bpf_insn_ptr);
+
+	return base_size + entry_size * max_entries;
+}
+
+static int insn_array_alloc_check(union bpf_attr *attr)
+{
+	if (attr->max_entries == 0 ||
+	    attr->key_size != 4 ||
+	    attr->value_size != 8 ||
+	    attr->map_flags != 0)
+		return -EINVAL;
+
+	if (attr->max_entries > MAX_INSN_ARRAY_ENTRIES)
+		return -E2BIG;
+
+	return 0;
+}
+
+static void insn_array_free(struct bpf_map *map)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+
+	kfree(insn_array->ips);
+	bpf_map_area_free(insn_array);
+}
+
+static struct bpf_map *insn_array_alloc(union bpf_attr *attr)
+{
+	u64 size = insn_array_alloc_size(attr->max_entries);
+	struct bpf_insn_array *insn_array;
+
+	insn_array = bpf_map_area_alloc(size, NUMA_NO_NODE);
+	if (!insn_array)
+		return ERR_PTR(-ENOMEM);
+
+	insn_array->ips = kcalloc(attr->max_entries, sizeof(long), GFP_KERNEL);
+	if (!insn_array->ips) {
+		insn_array_free(&insn_array->map);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	bpf_map_init_from_attr(&insn_array->map, attr);
+
+	mutex_init(&insn_array->state_mutex);
+	insn_array->state = INSN_ARRAY_STATE_FREE;
+
+	return &insn_array->map;
+}
+
+static int insn_array_get_next_key(struct bpf_map *map, void *key, void *next_key)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	u32 index = key ? *(u32 *)key : U32_MAX;
+	u32 *next = (u32 *)next_key;
+
+	if (index >= insn_array->map.max_entries) {
+		*next = 0;
+		return 0;
+	}
+
+	if (index == insn_array->map.max_entries - 1)
+		return -ENOENT;
+
+	*next = index + 1;
+	return 0;
+}
+
+static void *insn_array_lookup_elem(struct bpf_map *map, void *key)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	u32 index = *(u32 *)key;
+
+	if (unlikely(index >= insn_array->map.max_entries))
+		return NULL;
+
+	return &insn_array->ptrs[index].user_value;
+}
+
+static long insn_array_update_elem(struct bpf_map *map, void *key, void *value, u64 map_flags)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	u32 index = *(u32 *)key;
+	struct bpf_insn_array_value val = {};
+	int err = 0;
+
+	if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST))
+		return -EINVAL;
+
+	if (unlikely(index >= insn_array->map.max_entries))
+		return -E2BIG;
+
+	if (unlikely(map_flags & BPF_NOEXIST))
+		return -EEXIST;
+
+	/* No updates for maps in use */
+	if (!mutex_trylock(&insn_array->state_mutex))
+		return -EBUSY;
+
+	if (insn_array->state != INSN_ARRAY_STATE_FREE) {
+		err = -EBUSY;
+		goto unlock;
+	}
+
+	copy_map_value(map, &val, value);
+	if (val.jitted_off || val.xlated_off == INSN_DELETED) {
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	insn_array->ptrs[index].orig_xlated_off = val.xlated_off;
+	insn_array->ptrs[index].user_value.xlated_off = val.xlated_off;
+
+unlock:
+	mutex_unlock(&insn_array->state_mutex);
+	return err;
+}
+
+static long insn_array_delete_elem(struct bpf_map *map, void *key)
+{
+	return -EINVAL;
+}
+
+static int insn_array_check_btf(const struct bpf_map *map,
+			      const struct btf *btf,
+			      const struct btf_type *key_type,
+			      const struct btf_type *value_type)
+{
+	if (!btf_type_is_i32(key_type))
+		return -EINVAL;
+
+	if (!btf_type_is_i64(value_type))
+		return -EINVAL;
+
+	return 0;
+}
+
+static u64 insn_array_mem_usage(const struct bpf_map *map)
+{
+	u64 extra_size = 0;
+
+	extra_size += sizeof(long) * map->max_entries; /* insn_array->ips */
+
+	return insn_array_alloc_size(map->max_entries) + extra_size;
+}
+
+BTF_ID_LIST_SINGLE(insn_array_btf_ids, struct, bpf_insn_array)
+
+const struct bpf_map_ops insn_array_map_ops = {
+	.map_alloc_check = insn_array_alloc_check,
+	.map_alloc = insn_array_alloc,
+	.map_free = insn_array_free,
+	.map_get_next_key = insn_array_get_next_key,
+	.map_lookup_elem = insn_array_lookup_elem,
+	.map_update_elem = insn_array_update_elem,
+	.map_delete_elem = insn_array_delete_elem,
+	.map_check_btf = insn_array_check_btf,
+	.map_mem_usage = insn_array_mem_usage,
+	.map_btf_id = &insn_array_btf_ids[0],
+};
+
+static bool is_insn_array(const struct bpf_map *map)
+{
+	return map->map_type == BPF_MAP_TYPE_INSN_ARRAY;
+}
+
+static inline bool valid_offsets(const struct bpf_insn_array *insn_array,
+				 const struct bpf_prog *prog)
+{
+	u32 off;
+	int i;
+
+	for (i = 0; i < insn_array->map.max_entries; i++) {
+		off = insn_array->ptrs[i].orig_xlated_off;
+
+		if (off >= prog->len)
+			return false;
+
+		if (off > 0) {
+			if (prog->insnsi[off-1].code == (BPF_LD | BPF_DW | BPF_IMM))
+				return false;
+		}
+	}
+
+	return true;
+}
+
+int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	int i;
+
+	if (!valid_offsets(insn_array, prog))
+		return -EINVAL;
+
+	/*
+	 * There can be only one program using the map
+	 */
+	mutex_lock(&insn_array->state_mutex);
+	if (insn_array->state != INSN_ARRAY_STATE_FREE) {
+		mutex_unlock(&insn_array->state_mutex);
+		return -EBUSY;
+	}
+	insn_array->state = INSN_ARRAY_STATE_INIT;
+	mutex_unlock(&insn_array->state_mutex);
+
+	/*
+	 * Reset all the map indexes to the original values.  This is needed,
+	 * e.g., when verification is replayed with a different log
+	 * level.
+	 */
+	for (i = 0; i < map->max_entries; i++)
+		insn_array->ptrs[i].user_value.xlated_off = insn_array->ptrs[i].orig_xlated_off;
+
+	return 0;
+}
+
+int bpf_insn_array_ready(struct bpf_map *map)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	guard(mutex)(&insn_array->state_mutex);
+	int i;
+
+	for (i = 0; i < map->max_entries; i++) {
+		if (insn_array->ptrs[i].user_value.xlated_off == INSN_DELETED)
+			continue;
+		if (!insn_array->ips[i]) {
+			/*
+			 * Set the map free on failure; the program owning it
+			 * might be re-loaded with a different log level
+			 */
+			insn_array->state = INSN_ARRAY_STATE_FREE;
+			return -EFAULT;
+		}
+	}
+
+	insn_array->state = INSN_ARRAY_STATE_READY;
+	return 0;
+}
+
+void bpf_insn_array_release(struct bpf_map *map)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	guard(mutex)(&insn_array->state_mutex);
+
+	insn_array->state = INSN_ARRAY_STATE_FREE;
+}
+
+void bpf_insn_array_adjust(struct bpf_map *map, u32 off, u32 len)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	int i;
+
+	if (len <= 1)
+		return;
+
+	for (i = 0; i < map->max_entries; i++) {
+		if (insn_array->ptrs[i].user_value.xlated_off <= off)
+			continue;
+		if (insn_array->ptrs[i].user_value.xlated_off == INSN_DELETED)
+			continue;
+		insn_array->ptrs[i].user_value.xlated_off += len - 1;
+	}
+}
+
+void bpf_insn_array_adjust_after_remove(struct bpf_map *map, u32 off, u32 len)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	int i;
+
+	for (i = 0; i < map->max_entries; i++) {
+		if (insn_array->ptrs[i].user_value.xlated_off < off)
+			continue;
+		if (insn_array->ptrs[i].user_value.xlated_off == INSN_DELETED)
+			continue;
+		if (insn_array->ptrs[i].user_value.xlated_off >= off &&
+		    insn_array->ptrs[i].user_value.xlated_off < off + len)
+			insn_array->ptrs[i].user_value.xlated_off = INSN_DELETED;
+		else
+			insn_array->ptrs[i].user_value.xlated_off -= len;
+	}
+}
+
+void bpf_prog_update_insn_ptr(struct bpf_prog *prog,
+			      u32 xlated_off,
+			      u32 jitted_off,
+			      void *jitted_ip)
+{
+	struct bpf_insn_array *insn_array;
+	struct bpf_map *map;
+	int i, j;
+
+	for (i = 0; i < prog->aux->used_map_cnt; i++) {
+		map = prog->aux->used_maps[i];
+		if (!is_insn_array(map))
+			continue;
+
+		insn_array = cast_insn_array(map);
+		for (j = 0; j < map->max_entries; j++) {
+			if (insn_array->ptrs[j].user_value.xlated_off == xlated_off) {
+				insn_array->ips[j] = (long)jitted_ip;
+				insn_array->ptrs[j].jitted_ip = jitted_ip;
+				insn_array->ptrs[j].user_value.jitted_off = jitted_off;
+			}
+		}
+	}
+}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 3f178a0f8eb1..7b4e7a053aa0 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1461,6 +1461,7 @@ static int map_create(union bpf_attr *attr, bool kernel)
 	case BPF_MAP_TYPE_STRUCT_OPS:
 	case BPF_MAP_TYPE_CPUMAP:
 	case BPF_MAP_TYPE_ARENA:
+	case BPF_MAP_TYPE_INSN_ARRAY:
 		if (!bpf_token_capable(token, CAP_BPF))
 			goto put_token;
 		break;
@@ -2761,6 +2762,23 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 	}
 }
 
+static int bpf_prog_mark_insn_arrays_ready(struct bpf_prog *prog)
+{
+	int err;
+	int i;
+
+	for (i = 0; i < prog->aux->used_map_cnt; i++) {
+		if (prog->aux->used_maps[i]->map_type != BPF_MAP_TYPE_INSN_ARRAY)
+			continue;
+
+		err = bpf_insn_array_ready(prog->aux->used_maps[i]);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 /* last field in 'union bpf_attr' used by this command */
 #define BPF_PROG_LOAD_LAST_FIELD fd_array_cnt
 
@@ -2984,6 +3002,10 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	if (err < 0)
 		goto free_used_maps;
 
+	err = bpf_prog_mark_insn_arrays_ready(prog);
+	if (err < 0)
+		goto free_used_maps;
+
 	err = bpf_prog_alloc_id(prog);
 	if (err)
 		goto free_used_maps;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b8c4b4dd2ddf..a7ad4fe756da 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10108,6 +10108,8 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 		    func_id != BPF_FUNC_map_push_elem)
 			goto error;
 		break;
+	case BPF_MAP_TYPE_INSN_ARRAY:
+		goto error;
 	default:
 		break;
 	}
@@ -20532,6 +20534,15 @@ static int __add_used_map(struct bpf_verifier_env *env, struct bpf_map *map)
 
 	env->used_maps[env->used_map_cnt++] = map;
 
+	if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY) {
+		err = bpf_insn_array_init(map, env->prog);
+		if (err) {
+			verbose(env, "Failed to properly initialize insn array\n");
+			return err;
+		}
+		env->insn_array_maps[env->insn_array_map_cnt++] = map;
+	}
+
 	return env->used_map_cnt - 1;
 }
 
@@ -20778,6 +20789,33 @@ static void adjust_subprog_starts(struct bpf_verifier_env *env, u32 off, u32 len
 	}
 }
 
+static void release_insn_arrays(struct bpf_verifier_env *env)
+{
+	int i;
+
+	for (i = 0; i < env->insn_array_map_cnt; i++)
+		bpf_insn_array_release(env->insn_array_maps[i]);
+}
+
+static void adjust_insn_arrays(struct bpf_verifier_env *env, u32 off, u32 len)
+{
+	int i;
+
+	if (len == 1)
+		return;
+
+	for (i = 0; i < env->insn_array_map_cnt; i++)
+		bpf_insn_array_adjust(env->insn_array_maps[i], off, len);
+}
+
+static void adjust_insn_arrays_after_remove(struct bpf_verifier_env *env, u32 off, u32 len)
+{
+	int i;
+
+	for (i = 0; i < env->insn_array_map_cnt; i++)
+		bpf_insn_array_adjust_after_remove(env->insn_array_maps[i], off, len);
+}
+
 static void adjust_poke_descs(struct bpf_prog *prog, u32 off, u32 len)
 {
 	struct bpf_jit_poke_descriptor *tab = prog->aux->poke_tab;
@@ -20819,6 +20857,7 @@ static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of
 	}
 	adjust_insn_aux_data(env, new_prog, off, len);
 	adjust_subprog_starts(env, off, len);
+	adjust_insn_arrays(env, off, len);
 	adjust_poke_descs(new_prog, off, len);
 	return new_prog;
 }
@@ -21002,6 +21041,8 @@ static int verifier_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt)
 	if (err)
 		return err;
 
+	adjust_insn_arrays_after_remove(env, off, cnt);
+
 	memmove(aux_data + off,	aux_data + off + cnt,
 		sizeof(*aux_data) * (orig_prog_len - off - cnt));
 
@@ -24850,6 +24891,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	adjust_btf_func(env);
 
 err_release_maps:
+	if (ret)
+		release_insn_arrays(env);
 	if (!env->prog->aux->used_maps)
 		/* if we didn't copy map pointers into bpf_prog_info, release
 		 * them now. Otherwise free_used_maps() will release them.
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 233de8677382..021c27ee5591 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1026,6 +1026,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_USER_RINGBUF,
 	BPF_MAP_TYPE_CGRP_STORAGE,
 	BPF_MAP_TYPE_ARENA,
+	BPF_MAP_TYPE_INSN_ARRAY,
 	__MAX_BPF_MAP_TYPE
 };
 
@@ -7623,4 +7624,14 @@ enum bpf_kfunc_flags {
 	BPF_F_PAD_ZEROS = (1ULL << 0),
 };
 
+/*
+ * Values of a BPF_MAP_TYPE_INSN_ARRAY entry must be of this type.
+ * On updates jitted_off must be equal to 0.
+ */
+struct bpf_insn_array_value {
+	__u32 jitted_off;
+	__u32 xlated_off;
+};
+
+
 #endif /* _UAPI__LINUX_BPF_H__ */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 bpf-next 04/13] selftests/bpf: add selftests for new insn_array map
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (2 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 03/13] bpf, x86: add new map type: instructions array Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding Anton Protopopov
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add the following selftests for the new insn_array map:

  * Incorrect instruction indexes are rejected
  * Two programs can't use the same map
  * BPF progs can't operate the map
  * no changes to code => map is the same
  * expected changes when instructions are added
  * expected changes when instructions are deleted
  * expected changes when multiple functions are present

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 .../selftests/bpf/prog_tests/bpf_insn_array.c | 405 ++++++++++++++++++
 1 file changed, 405 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c b/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
new file mode 100644
index 000000000000..f785132497d6
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
@@ -0,0 +1,405 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <bpf/bpf.h>
+#include <test_progs.h>
+
+static int map_create(__u32 map_type, __u32 max_entries)
+{
+	const char *map_name = "insn_array";
+	__u32 key_size = 4;
+	__u32 value_size = sizeof(struct bpf_insn_array_value);
+
+	return bpf_map_create(map_type, map_name, key_size, value_size, max_entries, NULL);
+}
+
+static int prog_load(struct bpf_insn *insns, __u32 insn_cnt, int *fd_array, __u32 fd_array_cnt)
+{
+	LIBBPF_OPTS(bpf_prog_load_opts, opts);
+
+	opts.fd_array = fd_array;
+	opts.fd_array_cnt = fd_array_cnt;
+
+	return bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "GPL", insns, insn_cnt, &opts);
+}
+
+/*
+ * Load a program that will not be mangled in any way by the verifier.  Add an
+ * insn_array map pointing to every instruction. Check that it hasn't changed
+ * after the program load.
+ */
+static void check_one_to_one_mapping(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 4),
+		BPF_MOV64_IMM(BPF_REG_0, 3),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = i;
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0, "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, i, "val should be equal i");
+	}
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+/*
+ * Try to load a program with a map which points outside of the program
+ */
+static void check_out_of_bounds_index(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 4),
+		BPF_MOV64_IMM(BPF_REG_0, 3),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd, map_fd;
+	struct bpf_insn_array_value val = {};
+	int key;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, 1);
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	key = 0;
+	val.xlated_off = ARRAY_SIZE(insns); /* too big */
+	if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &key, &val, 0), 0, "bpf_map_update_elem"))
+		goto cleanup;
+
+	errno = 0;
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_EQ(prog_fd, -EINVAL, "program should have been rejected (prog_fd != -EINVAL)")) {
+		close(prog_fd);
+		goto cleanup;
+	}
+
+cleanup:
+	close(map_fd);
+}
+
+/*
+ * Try to load a program with a map which points to the middle of a 16-byte insn
+ */
+static void check_mid_insn_index(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_LD_IMM64(BPF_REG_0, 0), /* 2 x 8 */
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd, map_fd;
+	struct bpf_insn_array_value val = {};
+	int key;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, 1);
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	key = 0;
+	val.xlated_off = 1; /* middle of 16-byte instruction */
+	if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &key, &val, 0), 0, "bpf_map_update_elem"))
+		goto cleanup;
+
+	errno = 0;
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_EQ(prog_fd, -EINVAL, "program should have been rejected (prog_fd != -EINVAL)")) {
+		close(prog_fd);
+		goto cleanup;
+	}
+
+cleanup:
+	close(map_fd);
+}
+
+static void check_incorrect_index(void)
+{
+	check_out_of_bounds_index();
+	check_mid_insn_index();
+}
+
+/*
+ * Load a program with two patches (get jiffies, for simplicity). Add an
+ * insn_array map pointing to every instruction. Check how the map changed
+ * after the program was loaded.
+ */
+static void check_simple(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	__u32 map_in[] = {0, 1, 2, 3, 4, 5};
+	__u32 map_out[] = {0, 1, 4, 5, 8, 9};
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = map_in[i];
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0,
+			       "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, map_out[i], "val should be equal map_out[i]");
+	}
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+/*
+ * The verifier can delete code in two cases: nops and dead code. From the
+ * insn array's point of view the two cases are the same, so test using
+ * the simplest method: by loading some nops.
+ */
+static void check_deletions(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	__u32 map_in[] = {0, 1, 2, 3, 4, 5};
+	__u32 map_out[] = {0, -1, 1, -1, 2, 3};
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = map_in[i];
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0,
+			       "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, map_out[i], "val should be equal map_out[i]");
+	}
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+static void check_with_functions(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	__u32 map_in[] =  { 0, 1,  2, 3, 4, 5, /* func */  6, 7,  8, 9, 10};
+	__u32 map_out[] = {-1, 0, -1, 3, 4, 5, /* func */ -1, 6, -1, 9, 10};
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = map_in[i];
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0,
+			       "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, map_out[i], "val should be equal map_out[i]");
+	}
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+/* A map can be used by only one BPF program */
+static void check_no_map_reuse(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd, extra_fd = -1;
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = i;
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0, "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, i, "val should be equal i");
+	}
+
+	errno = 0;
+	extra_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_EQ(extra_fd, -EBUSY, "program should have been rejected (extra_fd != -EBUSY)"))
+		goto cleanup;
+
+	/* correctness: check that prog is still loadable without fd_array */
+	extra_fd = prog_load(insns, ARRAY_SIZE(insns), NULL, 0);
+	if (!ASSERT_GE(extra_fd, 0, "bpf(BPF_PROG_LOAD): expected no error"))
+		goto cleanup;
+
+cleanup:
+	close(extra_fd);
+	close(prog_fd);
+	close(map_fd);
+}
+
+static void check_bpf_no_lookup(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_LD_MAP_FD(BPF_REG_1, 0),
+		BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+		BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+		BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, 1);
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	insns[0].imm = map_fd;
+
+	errno = 0;
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), NULL, 0);
+	if (!ASSERT_EQ(prog_fd, -EINVAL, "program should have been rejected (prog_fd != -EINVAL)"))
+		goto cleanup;
+
+	/* correctness: check that prog is still loadable with normal map */
+	close(map_fd);
+	map_fd = map_create(BPF_MAP_TYPE_ARRAY, 1);
+	insns[0].imm = map_fd;
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), NULL, 0);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+static void check_bpf_side(void)
+{
+	check_bpf_no_lookup();
+}
+
+void test_bpf_insn_array(void)
+{
+	/* Test if offsets are adjusted properly */
+
+	if (test__start_subtest("one2one"))
+		check_one_to_one_mapping();
+
+	if (test__start_subtest("simple"))
+		check_simple();
+
+	if (test__start_subtest("deletions"))
+		check_deletions();
+
+	if (test__start_subtest("multiple-functions"))
+		check_with_functions();
+
+	/* Check all kinds of operations and related restrictions */
+
+	if (test__start_subtest("incorrect-index"))
+		check_incorrect_index();
+
+	if (test__start_subtest("no-map-reuse"))
+		check_no_map_reuse();
+
+	if (test__start_subtest("bpf-side-ops"))
+		check_bpf_side();
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (3 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 04/13] selftests/bpf: add selftests for new insn_array map Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-19  6:35   ` Eduard Zingerman
  2025-09-18  9:38 ` [PATCH v3 bpf-next 06/13] selftests/bpf: test instructions arrays with blinding Anton Protopopov
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

When bpf_jit_harden is enabled, all constants in the BPF code are
blinded to prevent JIT spraying attacks. This happens during the JIT
phase. Adjust all the related instruction arrays accordingly.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 kernel/bpf/core.c     | 20 ++++++++++++++++++++
 kernel/bpf/verifier.c | 11 ++++++++++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 1cda2589d4b3..90f201a6f51d 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1451,6 +1451,23 @@ void bpf_jit_prog_release_other(struct bpf_prog *fp, struct bpf_prog *fp_other)
 	bpf_prog_clone_free(fp_other);
 }
 
+static void adjust_insn_arrays(struct bpf_prog *prog, u32 off, u32 len)
+{
+#ifdef CONFIG_BPF_SYSCALL
+	struct bpf_map *map;
+	int i;
+
+	if (len <= 1)
+		return;
+
+	for (i = 0; i < prog->aux->used_map_cnt; i++) {
+		map = prog->aux->used_maps[i];
+		if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY)
+			bpf_insn_array_adjust(map, off, len);
+	}
+#endif
+}
+
 struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
 {
 	struct bpf_insn insn_buff[16], aux[2];
@@ -1506,6 +1523,9 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
 		clone = tmp;
 		insn_delta = rewritten - 1;
 
+		/* Instructions arrays must be updated using absolute xlated offsets */
+		adjust_insn_arrays(clone, prog->aux->subprog_start + i, rewritten);
+
 		/* Walk new program and skip insns we just inserted. */
 		insn = clone->insnsi + i + insn_delta;
 		insn_cnt += insn_delta;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a7ad4fe756da..5c1e4e37d1f8 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	struct bpf_insn *insn;
 	void *old_bpf_func;
 	int err, num_exentries;
+	int old_len, subprog_start_adjustment = 0;
 
 	if (env->subprog_cnt <= 1)
 		return 0;
@@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->func_idx = i;
 		/* Below members will be freed only at prog->aux */
 		func[i]->aux->btf = prog->aux->btf;
-		func[i]->aux->subprog_start = subprog_start;
+		func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
 		func[i]->aux->func_info = prog->aux->func_info;
 		func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
 		func[i]->aux->poke_tab = prog->aux->poke_tab;
@@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
 		if (!i)
 			func[i]->aux->exception_boundary = env->seen_exception;
+
+		/*
+		 * To pass the absolute subprog start to the JIT, all
+		 * instruction length adjustments must be accumulated.
+		 */
+		old_len = func[i]->len;
 		func[i] = bpf_int_jit_compile(func[i]);
+		subprog_start_adjustment += func[i]->len - old_len;
+
 		if (!func[i]->jited) {
 			err = -ENOTSUPP;
 			goto out_free;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 bpf-next 06/13] selftests/bpf: test instructions arrays with blinding
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (4 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15 Anton Protopopov
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add a specific test for instructions arrays with blinding enabled.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 .../selftests/bpf/prog_tests/bpf_insn_array.c | 92 +++++++++++++++++++
 1 file changed, 92 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c b/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
index f785132497d6..489badc17a2d 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
@@ -287,6 +287,95 @@ static void check_with_functions(void)
 	close(map_fd);
 }
 
+static int set_bpf_jit_harden(char *level)
+{
+	char old_level;
+	int err = -1;
+	int fd = -1;
+
+	fd = open("/proc/sys/net/core/bpf_jit_harden", O_RDWR | O_NONBLOCK);
+	if (fd < 0) {
+		ASSERT_FAIL("open .../bpf_jit_harden returned %d (errno=%d)", fd, errno);
+		return -1;
+	}
+
+	err = read(fd, &old_level, 1);
+	if (err != 1) {
+		ASSERT_FAIL("read from .../bpf_jit_harden returned %d (errno=%d)", err, errno);
+		err = -1;
+		goto end;
+	}
+
+	lseek(fd, 0, SEEK_SET);
+
+	err = write(fd, level, 1);
+	if (err != 1) {
+		ASSERT_FAIL("write to .../bpf_jit_harden returned %d (errno=%d)", err, errno);
+		err = -1;
+		goto end;
+	}
+
+	err = 0;
+	*level = old_level;
+end:
+	if (fd >= 0)
+		close(fd);
+	return err;
+}
+
+static void check_blindness(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 4),
+		BPF_MOV64_IMM(BPF_REG_0, 3),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	struct bpf_insn_array_value val = {};
+	char bpf_jit_harden = '@'; /* non-existing value */
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = i;
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0, "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	bpf_jit_harden = '2';
+	if (set_bpf_jit_harden(&bpf_jit_harden)) {
+		bpf_jit_harden = '@'; /* open, read or write failed => no write was done */
+		goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		char fmt[32];
+
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		snprintf(fmt, sizeof(fmt), "val should be equal 3*%d", i);
+		ASSERT_EQ(val.xlated_off, i * 3, fmt);
+	}
+
+cleanup:
+	/* restore the old one */
+	if (bpf_jit_harden != '@')
+		set_bpf_jit_harden(&bpf_jit_harden);
+
+	close(prog_fd);
+	close(map_fd);
+}
+
 /* Map can be used only by one BPF program */
 static void check_no_map_reuse(void)
 {
@@ -392,6 +481,9 @@ void test_bpf_insn_array(void)
 	if (test__start_subtest("multiple-functions"))
 		check_with_functions();
 
+	if (test__start_subtest("blindness"))
+		check_blindness();
+
 	/* Check all kinds of operations and related restrictions */
 
 	if (test__start_subtest("incorrect-index"))
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (5 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 06/13] selftests/bpf: test instructions arrays with blinding Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-19 18:25   ` Eduard Zingerman
  2025-09-18  9:38 ` [PATCH v3 bpf-next 08/13] bpf, x86: add support for indirect jumps Anton Protopopov
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Currently the emit_indirect_jump() function only accepts one of the
RAX, RCX, ..., RBP registers as the destination. Make it accept
R8, R9, ..., R15 as well, and make callers pass BPF registers, not
native registers. This is required to enable support for indirect
jumps in eBPF.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 arch/x86/net/bpf_jit_comp.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 8792d7f371d3..fcebb48742ae 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -660,24 +660,38 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 
 #define EMIT_LFENCE()	EMIT3(0x0F, 0xAE, 0xE8)
 
-static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
+static void __emit_indirect_jump(u8 **pprog, int reg, bool ereg)
 {
 	u8 *prog = *pprog;
 
+	if (ereg)
+		EMIT1(0x41);
+
+	EMIT2(0xFF, 0xE0 + reg);
+
+	*pprog = prog;
+}
+
+static void emit_indirect_jump(u8 **pprog, int bpf_reg, u8 *ip)
+{
+	u8 *prog = *pprog;
+	int reg = reg2hex[bpf_reg];
+	bool ereg = is_ereg(bpf_reg);
+
 	if (cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) {
 		OPTIMIZER_HIDE_VAR(reg);
 		emit_jump(&prog, its_static_thunk(reg), ip);
 	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {
 		EMIT_LFENCE();
-		EMIT2(0xFF, 0xE0 + reg);
+		__emit_indirect_jump(pprog, reg, ereg);
 	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
 		OPTIMIZER_HIDE_VAR(reg);
 		if (cpu_feature_enabled(X86_FEATURE_CALL_DEPTH))
-			emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg], ip);
+			emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg + 8*ereg], ip);
 		else
-			emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip);
+			emit_jump(&prog, &__x86_indirect_thunk_array[reg + 8*ereg], ip);
 	} else {
-		EMIT2(0xFF, 0xE0 + reg);	/* jmp *%\reg */
+		__emit_indirect_jump(pprog, reg, ereg);
 		if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) || IS_ENABLED(CONFIG_MITIGATION_SLS))
 			EMIT1(0xCC);		/* int3 */
 	}
@@ -797,7 +811,7 @@ static void emit_bpf_tail_call_indirect(struct bpf_prog *bpf_prog,
 	 * rdi == ctx (1st arg)
 	 * rcx == prog->bpf_func + X86_TAIL_CALL_OFFSET
 	 */
-	emit_indirect_jump(&prog, 1 /* rcx */, ip + (prog - start));
+	emit_indirect_jump(&prog, BPF_REG_4 /* R4 -> rcx */, ip + (prog - start));
 
 	/* out: */
 	ctx->tail_call_indirect_label = prog - start;
@@ -3517,7 +3531,7 @@ static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs, u8 *image,
 		if (err)
 			return err;
 
-		emit_indirect_jump(&prog, 2 /* rdx */, image + (prog - buf));
+		emit_indirect_jump(&prog, BPF_REG_3 /* R3 -> rdx */, image + (prog - buf));
 
 		*pprog = prog;
 		return 0;
-- 
2.34.1



* [PATCH v3 bpf-next 08/13] bpf, x86: add support for indirect jumps
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (6 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15 Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-20  0:28   ` Eduard Zingerman
  2025-09-18  9:38 ` [PATCH v3 bpf-next 09/13] bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X Anton Protopopov
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add support for a new instruction

    BPF_JMP|BPF_X|BPF_JA, SRC=0, DST=Rx, off=0, imm=0

which does an indirect jump to a location stored in Rx.  The register
Rx must have type PTR_TO_INSN. This new type ensures that the Rx
register contains a value (or a range of values) loaded from a
valid jump table: a map of type instruction array.

For example, for a C switch LLVM will generate the following code:

    0:   r3 = r1                    # "switch (r3)"
    1:   if r3 > 0x13 goto +0x666   # check r3 boundaries
    2:   r3 <<= 0x3                 # adjust to an index in array of addresses
    3:   r1 = 0xbeef ll             # r1 is PTR_TO_MAP_VALUE, r1->map_ptr=M
    5:   r1 += r3                   # r1 inherits boundaries from r3
    6:   r1 = *(u64 *)(r1 + 0x0)    # r1 now has type PTR_TO_INSN
    7:   gotox r1[,imm=fd(M)]       # jit will generate proper code

Here the gotox instruction corresponds to one particular map. It is,
however, possible to have a gotox instruction whose target can be
loaded from different maps, e.g.

    0:	 r1 &= 0x1
    1:	 r2 <<= 0x3
    2:	 r3 = 0x0 ll                # load from map M_1
    4:	 r3 += r2
    5:	 if r1 == 0x0 goto +0x4
    6:	 r1 <<= 0x3
    7:	 r3 = 0x0 ll                # load from map M_2
    9:	 r3 += r1
    A:	 r1 = *(u64 *)(r3 + 0x0)
    B:	 gotox r1                   # jump to target loaded from M_1 or M_2

During the check_cfg stage the verifier collects all the maps which
point into the subprog being verified. When building the CFG, the
high 16 bits of insn_state are used, so this patch (theoretically)
supports jump tables of up to 2^16 slots.

During the later stage, in check_indirect_jump, the verifier checks
that the register Rx was loaded from a particular instruction array.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 arch/x86/net/bpf_jit_comp.c  |   3 +
 include/linux/bpf.h          |   1 +
 include/linux/bpf_verifier.h |  15 +
 kernel/bpf/bpf_insn_array.c  |  16 +-
 kernel/bpf/core.c            |   1 +
 kernel/bpf/log.c             |   1 +
 kernel/bpf/verifier.c        | 513 ++++++++++++++++++++++++++++++++---
 7 files changed, 514 insertions(+), 36 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index fcebb48742ae..095d249eb235 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2595,6 +2595,9 @@ st:			if (is_imm8(insn->off))
 
 			break;
 
+		case BPF_JMP | BPF_JA | BPF_X:
+			emit_indirect_jump(&prog, insn->dst_reg, image + addrs[i - 1]);
+			break;
 		case BPF_JMP | BPF_JA:
 		case BPF_JMP32 | BPF_JA:
 			if (BPF_CLASS(insn->code) == BPF_JMP) {
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ee3967f2a025..be6019bf6a57 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -973,6 +973,7 @@ enum bpf_reg_type {
 	PTR_TO_ARENA,
 	PTR_TO_BUF,		 /* reg points to a read/write buffer */
 	PTR_TO_FUNC,		 /* reg points to a bpf program function */
+	PTR_TO_INSN,		 /* reg points to a bpf program instruction */
 	CONST_PTR_TO_DYNPTR,	 /* reg points to a const struct bpf_dynptr */
 	__BPF_REG_TYPE_MAX,
 
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index aca43c284203..607a684642e5 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -533,6 +533,16 @@ struct bpf_map_ptr_state {
 #define BPF_ALU_SANITIZE		(BPF_ALU_SANITIZE_SRC | \
 					 BPF_ALU_SANITIZE_DST)
 
+/*
+ * A structure defining an array of BPF instructions.  Can be used,
+ * for example, as a return value of the insn_successors() function
+ * and in the struct bpf_insn_aux_data below.
+ */
+struct bpf_iarray {
+	int off_cnt;
+	u32 off[];
+};
+
 struct bpf_insn_aux_data {
 	union {
 		enum bpf_reg_type ptr_type;	/* pointer type for load/store insns */
@@ -542,6 +552,7 @@ struct bpf_insn_aux_data {
 		struct {
 			u32 map_index;		/* index into used_maps[] */
 			u32 map_off;		/* offset from value base address */
+			struct bpf_iarray *jt;	/* jump table for gotox instruction */
 		};
 		struct {
 			enum bpf_reg_type reg_type;	/* type of pseudo_btf_id */
@@ -586,6 +597,9 @@ struct bpf_insn_aux_data {
 	u8 fastcall_spills_num:3;
 	u8 arg_prog:4;
 
+	/* true if jt->off was allocated */
+	bool jt_allocated;
+
 	/* below fields are initialized once */
 	unsigned int orig_idx; /* original instruction index */
 	bool jmp_point;
@@ -847,6 +861,7 @@ struct bpf_verifier_env {
 	/* array of pointers to bpf_scc_info indexed by SCC id */
 	struct bpf_scc_info **scc_info;
 	u32 scc_cnt;
+	struct bpf_iarray *succ;
 };
 
 static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog)
diff --git a/kernel/bpf/bpf_insn_array.c b/kernel/bpf/bpf_insn_array.c
index 0c8dac62f457..4b945b7e31b8 100644
--- a/kernel/bpf/bpf_insn_array.c
+++ b/kernel/bpf/bpf_insn_array.c
@@ -1,7 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include <linux/bpf.h>
-#include <linux/sort.h>
 
 #define MAX_INSN_ARRAY_ENTRIES 256
 
@@ -173,6 +172,20 @@ static u64 insn_array_mem_usage(const struct bpf_map *map)
 	return insn_array_alloc_size(map->max_entries) + extra_size;
 }
 
+static int insn_array_map_direct_value_addr(const struct bpf_map *map, u64 *imm, u32 off)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+
+	if ((off % sizeof(long)) != 0 ||
+	    (off / sizeof(long)) >= map->max_entries)
+		return -EINVAL;
+
+	/* from BPF's point of view, this map is a jump table */
+	*imm = (unsigned long)insn_array->ips + off;
+
+	return 0;
+}
+
 BTF_ID_LIST_SINGLE(insn_array_btf_ids, struct, bpf_insn_array)
 
 const struct bpf_map_ops insn_array_map_ops = {
@@ -185,6 +198,7 @@ const struct bpf_map_ops insn_array_map_ops = {
 	.map_delete_elem = insn_array_delete_elem,
 	.map_check_btf = insn_array_check_btf,
 	.map_mem_usage = insn_array_mem_usage,
+	.map_direct_value_addr = insn_array_map_direct_value_addr,
 	.map_btf_id = &insn_array_btf_ids[0],
 };
 
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 90f201a6f51d..1f933857ca1d 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1709,6 +1709,7 @@ bool bpf_opcode_in_insntable(u8 code)
 		[BPF_LD | BPF_IND | BPF_B] = true,
 		[BPF_LD | BPF_IND | BPF_H] = true,
 		[BPF_LD | BPF_IND | BPF_W] = true,
+		[BPF_JMP | BPF_JA | BPF_X] = true,
 		[BPF_JMP | BPF_JCOND] = true,
 	};
 #undef BPF_INSN_3_TBL
diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index e4983c1303e7..75adfe7914f2 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -461,6 +461,7 @@ const char *reg_type_str(struct bpf_verifier_env *env, enum bpf_reg_type type)
 		[PTR_TO_ARENA]		= "arena",
 		[PTR_TO_BUF]		= "buf",
 		[PTR_TO_FUNC]		= "func",
+		[PTR_TO_INSN]		= "insn",
 		[PTR_TO_MAP_KEY]	= "map_key",
 		[CONST_PTR_TO_DYNPTR]	= "dynptr_ptr",
 	};
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5c1e4e37d1f8..839260e62fa9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -212,6 +212,7 @@ static int ref_set_non_owning(struct bpf_verifier_env *env,
 static void specialize_kfunc(struct bpf_verifier_env *env,
 			     u32 func_id, u16 offset, unsigned long *addr);
 static bool is_trusted_reg(const struct bpf_reg_state *reg);
+static int add_used_map(struct bpf_verifier_env *env, int fd);
 
 static bool bpf_map_ptr_poisoned(const struct bpf_insn_aux_data *aux)
 {
@@ -2977,14 +2978,13 @@ static int cmp_subprogs(const void *a, const void *b)
 	       ((struct bpf_subprog_info *)b)->start;
 }
 
-/* Find subprogram that contains instruction at 'off' */
-static struct bpf_subprog_info *find_containing_subprog(struct bpf_verifier_env *env, int off)
+static int find_containing_subprog_idx(struct bpf_verifier_env *env, int off)
 {
 	struct bpf_subprog_info *vals = env->subprog_info;
 	int l, r, m;
 
 	if (off >= env->prog->len || off < 0 || env->subprog_cnt == 0)
-		return NULL;
+		return -1;
 
 	l = 0;
 	r = env->subprog_cnt - 1;
@@ -2995,7 +2995,19 @@ static struct bpf_subprog_info *find_containing_subprog(struct bpf_verifier_env
 		else
 			r = m - 1;
 	}
-	return &vals[l];
+	return l;
+}
+
+/* Find subprogram that contains instruction at 'off' */
+static struct bpf_subprog_info *find_containing_subprog(struct bpf_verifier_env *env, int off)
+{
+	int subprog_idx;
+
+	subprog_idx = find_containing_subprog_idx(env, off);
+	if (subprog_idx < 0)
+		return NULL;
+
+	return &env->subprog_info[subprog_idx];
 }
 
 /* Find subprogram that starts exactly at 'off' */
@@ -6092,6 +6104,18 @@ static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno,
 	return 0;
 }
 
+/*
+ * Return the size of the memory region accessible from a pointer to map value.
+ * For INSN_ARRAY maps the whole bpf_insn_array->ips array is accessible.
+ */
+static u32 map_mem_size(const struct bpf_map *map)
+{
+	if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY)
+		return map->max_entries * sizeof(long);
+
+	return map->value_size;
+}
+
 /* check read/write into a map element with possible variable offset */
 static int check_map_access(struct bpf_verifier_env *env, u32 regno,
 			    int off, int size, bool zero_size_allowed,
@@ -6101,11 +6125,11 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
 	struct bpf_func_state *state = vstate->frame[vstate->curframe];
 	struct bpf_reg_state *reg = &state->regs[regno];
 	struct bpf_map *map = reg->map_ptr;
+	u32 mem_size = map_mem_size(map);
 	struct btf_record *rec;
 	int err, i;
 
-	err = check_mem_region_access(env, regno, off, size, map->value_size,
-				      zero_size_allowed);
+	err = check_mem_region_access(env, regno, off, size, mem_size, zero_size_allowed);
 	if (err)
 		return err;
 
@@ -7620,6 +7644,19 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 
 				regs[value_regno].type = SCALAR_VALUE;
 				__mark_reg_known(&regs[value_regno], val);
+			} else if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY) {
+				regs[value_regno].type = PTR_TO_INSN;
+				regs[value_regno].map_ptr = map;
+				regs[value_regno].off = reg->off;
+				regs[value_regno].umin_value = reg->umin_value;
+				regs[value_regno].umax_value = reg->umax_value;
+				regs[value_regno].smin_value = reg->smin_value;
+				regs[value_regno].smax_value = reg->smax_value;
+				regs[value_regno].s32_min_value = reg->s32_min_value;
+				regs[value_regno].s32_max_value = reg->s32_max_value;
+				regs[value_regno].u32_min_value = reg->u32_min_value;
+				regs[value_regno].u32_max_value = reg->u32_max_value;
+				regs[value_regno].var_off = reg->var_off;
 			} else {
 				mark_reg_unknown(env, regs, value_regno);
 			}
@@ -7810,6 +7847,11 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 static int save_aux_ptr_type(struct bpf_verifier_env *env, enum bpf_reg_type type,
 			     bool allow_trust_mismatch);
 
+static bool map_is_insn_array(struct bpf_map *map)
+{
+	return map && map->map_type == BPF_MAP_TYPE_INSN_ARRAY;
+}
+
 static int check_load_mem(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			  bool strict_alignment_once, bool is_ldsx,
 			  bool allow_trust_mismatch, const char *ctx)
@@ -14487,6 +14529,8 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 	struct bpf_func_state *state = vstate->frame[vstate->curframe];
 	struct bpf_reg_state *regs = state->regs, *dst_reg;
 	bool known = tnum_is_const(off_reg->var_off);
+	bool ptr_to_insn_array = base_type(ptr_reg->type) == PTR_TO_MAP_VALUE &&
+				 map_is_insn_array(ptr_reg->map_ptr);
 	s64 smin_val = off_reg->smin_value, smax_val = off_reg->smax_value,
 	    smin_ptr = ptr_reg->smin_value, smax_ptr = ptr_reg->smax_value;
 	u64 umin_val = off_reg->umin_value, umax_val = off_reg->umax_value,
@@ -14628,6 +14672,11 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 		}
 		break;
 	case BPF_SUB:
+		if (ptr_to_insn_array) {
+			verbose(env, "Operation %s on ptr to instruction set map is prohibited\n",
+				bpf_alu_string[opcode >> 4]);
+			return -EACCES;
+		}
 		if (dst_reg == off_reg) {
 			/* scalar -= pointer.  Creates an unknown scalar */
 			verbose(env, "R%d tried to subtract pointer from scalar\n",
@@ -16980,7 +17029,8 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		}
 		dst_reg->type = PTR_TO_MAP_VALUE;
 		dst_reg->off = aux->map_off;
-		WARN_ON_ONCE(map->max_entries != 1);
+		WARN_ON_ONCE(map->map_type != BPF_MAP_TYPE_INSN_ARRAY &&
+			     map->max_entries != 1);
 		/* We want reg->id to be same (0) as map_value is not distinct */
 	} else if (insn->src_reg == BPF_PSEUDO_MAP_FD ||
 		   insn->src_reg == BPF_PSEUDO_MAP_IDX) {
@@ -17733,6 +17783,234 @@ static int mark_fastcall_patterns(struct bpf_verifier_env *env)
 	return 0;
 }
 
+#define SET_HIGH(STATE, LAST)	STATE = (STATE & 0xffffU) | ((LAST) << 16)
+#define GET_HIGH(STATE)		((u16)((STATE) >> 16))
+
+static int push_gotox_edge(int t, struct bpf_verifier_env *env, struct bpf_iarray *jt)
+{
+	int *insn_stack = env->cfg.insn_stack;
+	int *insn_state = env->cfg.insn_state;
+	u16 prev;
+	int w;
+
+	for (prev = GET_HIGH(insn_state[t]); prev < jt->off_cnt; prev++) {
+		w = jt->off[prev];
+
+		/* EXPLORED || DISCOVERED */
+		if (insn_state[w])
+			continue;
+
+		break;
+	}
+
+	if (prev == jt->off_cnt)
+		return DONE_EXPLORING;
+
+	mark_prune_point(env, t);
+
+	if (env->cfg.cur_stack >= env->prog->len)
+		return -E2BIG;
+	insn_stack[env->cfg.cur_stack++] = w;
+
+	mark_jmp_point(env, w);
+
+	SET_HIGH(insn_state[t], prev + 1);
+	return KEEP_EXPLORING;
+}
+
+static int copy_insn_array(struct bpf_map *map, u32 start, u32 end, u32 *off)
+{
+	struct bpf_insn_array_value *value;
+	u32 i;
+
+	for (i = start; i <= end; i++) {
+		value = map->ops->map_lookup_elem(map, &i);
+		if (!value)
+			return -EINVAL;
+		off[i - start] = value->xlated_off;
+	}
+	return 0;
+}
+
+static int cmp_ptr_to_u32(const void *a, const void *b)
+{
+	return *(u32 *)a - *(u32 *)b;
+}
+
+static int sort_insn_array_uniq(u32 *off, int off_cnt)
+{
+	int unique = 1;
+	int i;
+
+	sort(off, off_cnt, sizeof(off[0]), cmp_ptr_to_u32, NULL);
+
+	for (i = 1; i < off_cnt; i++)
+		if (off[i] != off[unique - 1])
+			off[unique++] = off[i];
+
+	return unique;
+}
+
+/*
+ * sort_unique({map[start], ..., map[end]}) into off
+ */
+static int copy_insn_array_uniq(struct bpf_map *map, u32 start, u32 end, u32 *off)
+{
+	u32 n = end - start + 1;
+	int err;
+
+	err = copy_insn_array(map, start, end, off);
+	if (err)
+		return err;
+
+	return sort_insn_array_uniq(off, n);
+}
+
+static struct bpf_iarray *iarray_realloc(struct bpf_iarray *old, size_t n_elem)
+{
+	size_t new_size = sizeof(struct bpf_iarray) + n_elem * 4;
+	struct bpf_iarray *new;
+
+	new = kvrealloc(old, new_size, GFP_KERNEL_ACCOUNT);
+	if (!new) {
+		/* this is what callers always want, so simplify the call site */
+		kvfree(old);
+		return NULL;
+	}
+
+	new->off_cnt = n_elem;
+	return new;
+}
+
+/*
+ * Copy all unique offsets from the map
+ */
+static struct bpf_iarray *jt_from_map(struct bpf_map *map)
+{
+	struct bpf_iarray *jt;
+	int n;
+
+	jt = iarray_realloc(NULL, map->max_entries);
+	if (!jt)
+		return ERR_PTR(-ENOMEM);
+
+	n = copy_insn_array_uniq(map, 0, map->max_entries - 1, jt->off);
+	if (n < 0) {
+		kvfree(jt);
+		return ERR_PTR(n);
+	}
+
+	return jt;
+}
+
+/*
+ * Find and collect all maps which point into the subprog. Return the result
+ * as one combined jump table in jt->off (allocated via iarray_realloc())
+ */
+static struct bpf_iarray *jt_from_subprog(struct bpf_verifier_env *env,
+					  int subprog_start, int subprog_end)
+{
+	struct bpf_iarray *jt = NULL;
+	struct bpf_map *map;
+	struct bpf_iarray *jt_cur;
+	int i;
+
+	for (i = 0; i < env->insn_array_map_cnt; i++) {
+		/*
+		 * TODO (when needed): collect only jump tables, not static keys
+		 * or maps for indirect calls
+		 */
+		map = env->insn_array_maps[i];
+
+		jt_cur = jt_from_map(map);
+		if (IS_ERR(jt_cur)) {
+			kvfree(jt);
+			return jt_cur;
+		}
+
+		/*
+		 * This is enough to check one element. The full table is
+		 * checked to fit inside the subprog later in create_jt()
+		 */
+		if (jt_cur->off[0] >= subprog_start && jt_cur->off[0] < subprog_end) {
+			u32 old_cnt = jt ? jt->off_cnt : 0;
+			jt = iarray_realloc(jt, old_cnt + jt_cur->off_cnt);
+			if (!jt) {
+				kvfree(jt_cur);
+				return ERR_PTR(-ENOMEM);
+			}
+			memcpy(jt->off + old_cnt, jt_cur->off, jt_cur->off_cnt << 2);
+		}
+
+		kvfree(jt_cur);
+	}
+
+	if (!jt) {
+		verbose(env, "no jump tables found for subprog starting at %u\n", subprog_start);
+		return ERR_PTR(-EINVAL);
+	}
+
+	jt->off_cnt = sort_insn_array_uniq(jt->off, jt->off_cnt);
+	return jt;
+}
+
+static struct bpf_iarray *
+create_jt(int t, struct bpf_verifier_env *env, int fd)
+{
+	struct bpf_subprog_info *subprog;
+	int subprog_idx, subprog_start, subprog_end;
+	struct bpf_iarray *jt;
+	int i;
+
+	if (env->subprog_cnt == 0)
+		return ERR_PTR(-EFAULT);
+
+	subprog_idx = find_containing_subprog_idx(env, t);
+	if (subprog_idx < 0) {
+		verbose(env, "can't find subprog containing instruction %d\n", t);
+		return ERR_PTR(-EFAULT);
+	}
+	subprog = &env->subprog_info[subprog_idx];
+	subprog_start = subprog->start;
+	subprog_end = (subprog + 1)->start;
+	jt = jt_from_subprog(env, subprog_start, subprog_end);
+	if (IS_ERR(jt))
+		return jt;
+
+	/* Check that every element of the jump table fits within the given subprogram */
+	for (i = 0; i < jt->off_cnt; i++) {
+		if (jt->off[i] < subprog_start || jt->off[i] >= subprog_end) {
+			verbose(env, "jump table for insn %d points outside of the subprog [%u,%u]",
+					t, subprog_start, subprog_end);
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	return jt;
+}
+
+/* "conditional jump with N edges" */
+static int visit_gotox_insn(int t, struct bpf_verifier_env *env, int fd)
+{
+	struct bpf_iarray *jt = env->insn_aux_data[t].jt;
+
+	if (!jt) {
+		jt = create_jt(t, env, fd);
+		if (IS_ERR(jt))
+			return PTR_ERR(jt);
+	}
+
+	/*
+	 * Mark jt as allocated. Otherwise, it is not possible to tell whether
+	 * it was allocated or not in the code which frees memory (jt is a
+	 * part of a union)
+	 */
+	env->insn_aux_data[t].jt_allocated = true;
+	env->insn_aux_data[t].jt = jt;
+
+	return push_gotox_edge(t, env, jt);
+}
+
 /* Visits the instruction at index t and returns one of the following:
  *  < 0 - an error occurred
  *  DONE_EXPLORING - the instruction was fully explored
@@ -17823,8 +18101,8 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
 		return visit_func_call_insn(t, insns, env, insn->src_reg == BPF_PSEUDO_CALL);
 
 	case BPF_JA:
-		if (BPF_SRC(insn->code) != BPF_K)
-			return -EINVAL;
+		if (BPF_SRC(insn->code) == BPF_X)
+			return visit_gotox_insn(t, env, insn->imm);
 
 		if (BPF_CLASS(insn->code) == BPF_JMP)
 			off = insn->off;
@@ -17855,6 +18133,13 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
 	}
 }
 
+static bool insn_is_gotox(struct bpf_insn *insn)
+{
+	return BPF_CLASS(insn->code) == BPF_JMP &&
+	       BPF_OP(insn->code) == BPF_JA &&
+	       BPF_SRC(insn->code) == BPF_X;
+}
+
 /* non-recursive depth-first-search to detect loops in BPF program
  * loop == back-edge in directed graph
  */
@@ -18716,6 +19001,10 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 		return regs_exact(rold, rcur, idmap) && rold->frameno == rcur->frameno;
 	case PTR_TO_ARENA:
 		return true;
+	case PTR_TO_INSN:
+		/* is rcur a subset of rold? */
+		return (rcur->umin_value >= rold->umin_value &&
+			rcur->umax_value <= rold->umax_value);
 	default:
 		return regs_exact(rold, rcur, idmap);
 	}
@@ -19862,6 +20151,102 @@ static int process_bpf_exit_full(struct bpf_verifier_env *env,
 	return PROCESS_BPF_EXIT;
 }
 
+static int indirect_jump_min_max_index(struct bpf_verifier_env *env,
+				       int regno,
+				       struct bpf_map *map,
+				       u32 *pmin_index, u32 *pmax_index)
+{
+	struct bpf_reg_state *reg = reg_state(env, regno);
+	u64 min_index, max_index;
+
+	if (check_add_overflow(reg->umin_value, reg->off, &min_index) ||
+		(min_index > (u64) U32_MAX * sizeof(long))) {
+		verbose(env, "the sum of R%u umin_value %llu and off %u is too big\n",
+			     regno, reg->umin_value, reg->off);
+		return -ERANGE;
+	}
+	if (check_add_overflow(reg->umax_value, reg->off, &max_index) ||
+		(max_index > (u64) U32_MAX * sizeof(long))) {
+		verbose(env, "the sum of R%u umax_value %llu and off %u is too big\n",
+			     regno, reg->umax_value, reg->off);
+		return -ERANGE;
+	}
+
+	min_index /= sizeof(long);
+	max_index /= sizeof(long);
+
+	if (min_index >= map->max_entries || max_index >= map->max_entries) {
+		verbose(env, "R%u points to outside of jump table: [%llu,%llu] max_entries %u\n",
+			     regno, min_index, max_index, map->max_entries);
+		return -EINVAL;
+	}
+
+	*pmin_index = min_index;
+	*pmax_index = max_index;
+	return 0;
+}
+
+/* gotox *dst_reg */
+static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *insn)
+{
+	struct bpf_verifier_state *other_branch;
+	struct bpf_reg_state *dst_reg;
+	struct bpf_map *map;
+	u32 min_index, max_index;
+	int err = 0;
+	u32 *xoff;
+	int n;
+	int i;
+
+	dst_reg = reg_state(env, insn->dst_reg);
+	if (dst_reg->type != PTR_TO_INSN) {
+		verbose(env, "R%d has type %d, expected PTR_TO_INSN\n",
+			     insn->dst_reg, dst_reg->type);
+		return -EINVAL;
+	}
+
+	map = dst_reg->map_ptr;
+	if (verifier_bug_if(!map, env, "R%d has an empty map pointer", insn->dst_reg))
+		return -EFAULT;
+
+	if (verifier_bug_if(map->map_type != BPF_MAP_TYPE_INSN_ARRAY, env,
+			    "R%d has incorrect map type %d", insn->dst_reg, map->map_type))
+		return -EFAULT;
+
+	err = indirect_jump_min_max_index(env, insn->dst_reg, map, &min_index, &max_index);
+	if (err)
+		return err;
+
+	xoff = kvcalloc(max_index - min_index + 1, sizeof(u32), GFP_KERNEL_ACCOUNT);
+	if (!xoff)
+		return -ENOMEM;
+
+	n = copy_insn_array_uniq(map, min_index, max_index, xoff);
+	if (n < 0) {
+		err = n;
+		goto free_off;
+	}
+	if (n == 0) {
+		verbose(env, "register R%d doesn't point to any offset in map id=%d\n",
+			     insn->dst_reg, map->id);
+		err = -EINVAL;
+		goto free_off;
+	}
+
+	for (i = 0; i < n - 1; i++) {
+		other_branch = push_stack(env, xoff[i], env->insn_idx, false);
+		if (IS_ERR(other_branch)) {
+			err = PTR_ERR(other_branch);
+			goto free_off;
+		}
+	}
+	env->insn_idx = xoff[n-1];
+
+free_off:
+	kvfree(xoff);
+	return err;
+}
+
 static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
 {
 	int err;
@@ -19964,6 +20349,9 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
 
 			mark_reg_scratched(env, BPF_REG_0);
 		} else if (opcode == BPF_JA) {
+			if (BPF_SRC(insn->code) == BPF_X)
+				return check_indirect_jump(env, insn);
+
 			if (BPF_SRC(insn->code) != BPF_K ||
 			    insn->src_reg != BPF_REG_0 ||
 			    insn->dst_reg != BPF_REG_0 ||
@@ -20463,6 +20851,7 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
 		case BPF_MAP_TYPE_QUEUE:
 		case BPF_MAP_TYPE_STACK:
 		case BPF_MAP_TYPE_ARENA:
+		case BPF_MAP_TYPE_INSN_ARRAY:
 			break;
 		default:
 			verbose(env,
@@ -21020,6 +21409,23 @@ static int bpf_adj_linfo_after_remove(struct bpf_verifier_env *env, u32 off,
 	return 0;
 }
 
+/*
+ * Clean up dynamically allocated fields of aux data for instructions [start, ...]
+ */
+static void clear_insn_aux_data(struct bpf_insn_aux_data *aux_data, int start, int len)
+{
+	int end = start + len;
+	int i;
+
+	for (i = start; i < end; i++) {
+		if (aux_data[i].jt_allocated) {
+			kvfree(aux_data[i].jt);
+			aux_data[i].jt = NULL;
+			aux_data[i].jt_allocated = false;
+		}
+	}
+}
+
 static int verifier_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt)
 {
 	struct bpf_insn_aux_data *aux_data = env->insn_aux_data;
@@ -21043,6 +21449,8 @@ static int verifier_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt)
 
 	adjust_insn_arrays_after_remove(env, off, cnt);
 
+	clear_insn_aux_data(aux_data, off, cnt);
+
 	memmove(aux_data + off,	aux_data + off + cnt,
 		sizeof(*aux_data) * (orig_prog_len - off - cnt));
 
@@ -21683,6 +22091,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->jited_linfo = prog->aux->jited_linfo;
 		func[i]->aux->linfo_idx = env->subprog_info[i].linfo_idx;
 		func[i]->aux->arena = prog->aux->arena;
+		func[i]->aux->used_maps = env->used_maps;
+		func[i]->aux->used_map_cnt = env->used_map_cnt;
 		num_exentries = 0;
 		insn = func[i]->insnsi;
 		for (j = 0; j < func[i]->len; j++, insn++) {
@@ -24215,23 +24625,41 @@ static bool can_jump(struct bpf_insn *insn)
 	return false;
 }
 
-static int insn_successors(struct bpf_prog *prog, u32 idx, u32 succ[2])
+/*
+ * Returns an array succ in which succ->off[0], ..., succ->off[n-1]
+ * are the successor instructions, where n = succ->off_cnt
+ */
+static struct bpf_iarray *
+insn_successors(struct bpf_verifier_env *env, u32 insn_idx)
 {
-	struct bpf_insn *insn = &prog->insnsi[idx];
-	int i = 0, insn_sz;
+	struct bpf_prog *prog = env->prog;
+	struct bpf_insn *insn = &prog->insnsi[insn_idx];
+	struct bpf_iarray *succ;
+	int insn_sz;
 	u32 dst;
 
-	insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
-	if (can_fallthrough(insn) && idx + 1 < prog->len)
-		succ[i++] = idx + insn_sz;
+	if (unlikely(insn_is_gotox(insn))) {
+		succ = env->insn_aux_data[insn_idx].jt;
+		if (verifier_bug_if(!succ, env,
+				    "aux data for insn %u doesn't contain a jump table\n",
+				    insn_idx))
+			return ERR_PTR(-EFAULT);
+	} else {
+		/* pre-allocated array of size up to 2; reset cnt, as it may be used already */
+		succ = env->succ;
+		succ->off_cnt = 0;
 
-	if (can_jump(insn)) {
-		dst = idx + jmp_offset(insn) + 1;
-		if (i == 0 || succ[0] != dst)
-			succ[i++] = dst;
-	}
+		insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
+		if (can_fallthrough(insn) && insn_idx + 1 < prog->len)
+			succ->off[succ->off_cnt++] = insn_idx + insn_sz;
 
-	return i;
+		if (can_jump(insn)) {
+			dst = insn_idx + jmp_offset(insn) + 1;
+			if (succ->off_cnt == 0 || succ->off[0] != dst)
+				succ->off[succ->off_cnt++] = dst;
+		}
+	}
+	return succ;
 }
 
 /* Each field is a register bitmask */
@@ -24426,14 +24854,18 @@ static int compute_live_registers(struct bpf_verifier_env *env)
 		for (i = 0; i < env->cfg.cur_postorder; ++i) {
 			int insn_idx = env->cfg.insn_postorder[i];
 			struct insn_live_regs *live = &state[insn_idx];
-			int succ_num;
-			u32 succ[2];
+			struct bpf_iarray *succ;
 			u16 new_out = 0;
 			u16 new_in = 0;
 
-			succ_num = insn_successors(env->prog, insn_idx, succ);
-			for (int s = 0; s < succ_num; ++s)
-				new_out |= state[succ[s]].in;
+			succ = insn_successors(env, insn_idx);
+			if (IS_ERR(succ)) {
+				err = PTR_ERR(succ);
+				goto out;
+
+			}
+			for (int s = 0; s < succ->off_cnt; ++s)
+				new_out |= state[succ->off[s]].in;
 			new_in = (new_out & ~live->def) | live->use;
 			if (new_out != live->out || new_in != live->in) {
 				live->in = new_in;
@@ -24489,11 +24921,10 @@ static int compute_scc(struct bpf_verifier_env *env)
 	const u32 insn_cnt = env->prog->len;
 	int stack_sz, dfs_sz, err = 0;
 	u32 *stack, *pre, *low, *dfs;
-	u32 succ_cnt, i, j, t, w;
+	u32 i, j, t, w;
 	u32 next_preorder_num;
 	u32 next_scc_id;
 	bool assign_scc;
-	u32 succ[2];
 
 	next_preorder_num = 1;
 	next_scc_id = 1;
@@ -24592,6 +25023,8 @@ static int compute_scc(struct bpf_verifier_env *env)
 		dfs[0] = i;
 dfs_continue:
 		while (dfs_sz) {
+			struct bpf_iarray *succ;
+
 			w = dfs[dfs_sz - 1];
 			if (pre[w] == 0) {
 				low[w] = next_preorder_num;
@@ -24600,12 +25033,17 @@ static int compute_scc(struct bpf_verifier_env *env)
 				stack[stack_sz++] = w;
 			}
 			/* Visit 'w' successors */
-			succ_cnt = insn_successors(env->prog, w, succ);
-			for (j = 0; j < succ_cnt; ++j) {
-				if (pre[succ[j]]) {
-					low[w] = min(low[w], low[succ[j]]);
+			succ = insn_successors(env, w);
+			if (IS_ERR(succ)) {
+				err = PTR_ERR(succ);
+				goto exit;
+			}
+			for (j = 0; j < succ->off_cnt; ++j) {
+				if (pre[succ->off[j]]) {
+					low[w] = min(low[w], low[succ->off[j]]);
 				} else {
-					dfs[dfs_sz++] = succ[j];
+					dfs[dfs_sz++] = succ->off[j];
 					goto dfs_continue;
 				}
 			}
@@ -24622,8 +25060,8 @@ static int compute_scc(struct bpf_verifier_env *env)
 			 * or if component has a self reference.
 			 */
 			assign_scc = stack[stack_sz - 1] != w;
-			for (j = 0; j < succ_cnt; ++j) {
-				if (succ[j] == w) {
+			for (j = 0; j < succ->off_cnt; ++j) {
+				if (succ->off[j] == w) {
 					assign_scc = true;
 					break;
 				}
@@ -24683,6 +25121,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	ret = -ENOMEM;
 	if (!env->insn_aux_data)
 		goto err_free_env;
+	env->succ = iarray_realloc(NULL, 2);
+	if (!env->succ)
+		goto err_free_env;
 	for (i = 0; i < len; i++)
 		env->insn_aux_data[i].orig_idx = i;
 	env->prog = *prog;
@@ -24922,10 +25363,12 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 err_unlock:
 	if (!is_priv)
 		mutex_unlock(&bpf_verifier_lock);
+	clear_insn_aux_data(env->insn_aux_data, 0, env->prog->len);
 	vfree(env->insn_aux_data);
 err_free_env:
 	kvfree(env->cfg.insn_postorder);
 	kvfree(env->scc_info);
+	kvfree(env->succ);
 	kvfree(env);
 	return ret;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 bpf-next 09/13] bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (7 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 08/13] bpf, x86: add support for indirect jumps Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code Anton Protopopov
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add disassembler support for the indirect jump instruction.

Example output from bpftool:

   0: (79) r3 = *(u64 *)(r1 +0)
   1: (25) if r3 > 0x4 goto pc+666
   2: (67) r3 <<= 3
   3: (18) r1 = 0xffffbeefspameggs
   5: (0f) r1 += r3
   6: (79) r1 = *(u64 *)(r1 +0)
   7: (0d) gotox r1

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 kernel/bpf/disasm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index 20883c6b1546..4a1ecc6f7582 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -183,6 +183,13 @@ static inline bool is_mov_percpu_addr(const struct bpf_insn *insn)
 	return insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) && insn->off == BPF_ADDR_PERCPU;
 }
 
+static void print_bpf_ja_indirect(bpf_insn_print_t verbose,
+				  void *private_data,
+				  const struct bpf_insn *insn)
+{
+	verbose(private_data, "(%02x) gotox r%d\n", insn->code, insn->dst_reg);
+}
+
 void print_bpf_insn(const struct bpf_insn_cbs *cbs,
 		    const struct bpf_insn *insn,
 		    bool allow_ptr_leaks)
@@ -358,6 +365,8 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
 		} else if (insn->code == (BPF_JMP | BPF_JA)) {
 			verbose(cbs->private_data, "(%02x) goto pc%+d\n",
 				insn->code, insn->off);
+		} else if (insn->code == (BPF_JMP | BPF_JA | BPF_X)) {
+			print_bpf_ja_indirect(verbose, cbs->private_data, insn);
 		} else if (insn->code == (BPF_JMP | BPF_JCOND) &&
 			   insn->src_reg == BPF_MAY_GOTO) {
 			verbose(cbs->private_data, "(%02x) may_goto pc%+d\n",
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (8 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 09/13] bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-19 23:18   ` Andrii Nakryiko
  2025-09-18  9:38 ` [PATCH v3 bpf-next 11/13] libbpf: support llvm-generated indirect jumps Anton Protopopov
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

The commit 6c918709bd30 ("libbpf: Refactor bpf_object__reloc_code")
added the bpf_object__append_subprog_code() function with incorrect
indentation. Use tabs instead. (This also makes a subsequent commit more
readable.)

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 tools/lib/bpf/libbpf.c | 52 +++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index fe4fc5438678..2c1f48f77680 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -6393,32 +6393,32 @@ static int
 bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main_prog,
 				struct bpf_program *subprog)
 {
-       struct bpf_insn *insns;
-       size_t new_cnt;
-       int err;
-
-       subprog->sub_insn_off = main_prog->insns_cnt;
-
-       new_cnt = main_prog->insns_cnt + subprog->insns_cnt;
-       insns = libbpf_reallocarray(main_prog->insns, new_cnt, sizeof(*insns));
-       if (!insns) {
-               pr_warn("prog '%s': failed to realloc prog code\n", main_prog->name);
-               return -ENOMEM;
-       }
-       main_prog->insns = insns;
-       main_prog->insns_cnt = new_cnt;
-
-       memcpy(main_prog->insns + subprog->sub_insn_off, subprog->insns,
-              subprog->insns_cnt * sizeof(*insns));
-
-       pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n",
-                main_prog->name, subprog->insns_cnt, subprog->name);
-
-       /* The subprog insns are now appended. Append its relos too. */
-       err = append_subprog_relos(main_prog, subprog);
-       if (err)
-               return err;
-       return 0;
+	struct bpf_insn *insns;
+	size_t new_cnt;
+	int err;
+
+	subprog->sub_insn_off = main_prog->insns_cnt;
+
+	new_cnt = main_prog->insns_cnt + subprog->insns_cnt;
+	insns = libbpf_reallocarray(main_prog->insns, new_cnt, sizeof(*insns));
+	if (!insns) {
+		pr_warn("prog '%s': failed to realloc prog code\n", main_prog->name);
+		return -ENOMEM;
+	}
+	main_prog->insns = insns;
+	main_prog->insns_cnt = new_cnt;
+
+	memcpy(main_prog->insns + subprog->sub_insn_off, subprog->insns,
+	       subprog->insns_cnt * sizeof(*insns));
+
+	pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n",
+		 main_prog->name, subprog->insns_cnt, subprog->name);
+
+	/* The subprog insns are now appended. Append its relos too. */
+	err = append_subprog_relos(main_prog, subprog);
+	if (err)
+		return err;
+	return 0;
 }
 
 static int
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 bpf-next 11/13] libbpf: support llvm-generated indirect jumps
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (9 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-19 23:18   ` Andrii Nakryiko
  2025-09-18  9:38 ` [PATCH v3 bpf-next 12/13] bpftool: Recognize insn_array map type Anton Protopopov
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

For the v5 instruction set LLVM is allowed to generate indirect jumps for
switch statements and for 'goto *rX' assembly. Every such jump will
be accompanied by the necessary metadata, e.g. (`llvm-objdump -Sr ...`):

       0:       r2 = 0x0 ll
                0000000000000030:  R_BPF_64_64  BPF.JT.0.0

Here BPF.JT.0.0 is a symbol residing in the .jumptables section:

    Symbol table:
       4: 0000000000000000   240 OBJECT  GLOBAL DEFAULT     4 BPF.JT.0.0

The -bpf-min-jump-table-entries LLVM option may be used to control the
minimal size of a switch which will be converted to an indirect jump.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 tools/lib/bpf/libbpf.c        | 150 +++++++++++++++++++++++++++++++++-
 tools/lib/bpf/libbpf_probes.c |   4 +
 tools/lib/bpf/linker.c        |  10 ++-
 3 files changed, 161 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2c1f48f77680..57cac0810d2e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -191,6 +191,7 @@ static const char * const map_type_name[] = {
 	[BPF_MAP_TYPE_USER_RINGBUF]             = "user_ringbuf",
 	[BPF_MAP_TYPE_CGRP_STORAGE]		= "cgrp_storage",
 	[BPF_MAP_TYPE_ARENA]			= "arena",
+	[BPF_MAP_TYPE_INSN_ARRAY]		= "insn_array",
 };
 
 static const char * const prog_type_name[] = {
@@ -372,6 +373,7 @@ enum reloc_type {
 	RELO_EXTERN_CALL,
 	RELO_SUBPROG_ADDR,
 	RELO_CORE,
+	RELO_INSN_ARRAY,
 };
 
 struct reloc_desc {
@@ -382,7 +384,10 @@ struct reloc_desc {
 		struct {
 			int map_idx;
 			int sym_off;
-			int ext_idx;
+			union {
+				int ext_idx;
+				int sym_size;
+			};
 		};
 	};
 };
@@ -424,6 +429,11 @@ struct bpf_sec_def {
 	libbpf_prog_attach_fn_t prog_attach_fn;
 };
 
+struct bpf_light_subprog {
+	__u32 sec_insn_off;
+	__u32 sub_insn_off;
+};
+
 /*
  * bpf_prog should be a better name but it has been used in
  * linux/filter.h.
@@ -496,6 +506,9 @@ struct bpf_program {
 	__u32 line_info_rec_size;
 	__u32 line_info_cnt;
 	__u32 prog_flags;
+
+	struct bpf_light_subprog *subprog;
+	__u32 subprog_cnt;
 };
 
 struct bpf_struct_ops {
@@ -525,6 +538,7 @@ struct bpf_struct_ops {
 #define STRUCT_OPS_SEC ".struct_ops"
 #define STRUCT_OPS_LINK_SEC ".struct_ops.link"
 #define ARENA_SEC ".addr_space.1"
+#define JUMPTABLES_SEC ".jumptables"
 
 enum libbpf_map_type {
 	LIBBPF_MAP_UNSPEC,
@@ -668,6 +682,7 @@ struct elf_state {
 	int symbols_shndx;
 	bool has_st_ops;
 	int arena_data_shndx;
+	int jumptables_data_shndx;
 };
 
 struct usdt_manager;
@@ -739,6 +754,9 @@ struct bpf_object {
 	void *arena_data;
 	size_t arena_data_sz;
 
+	void *jumptables_data;
+	size_t jumptables_data_sz;
+
 	struct kern_feature_cache *feat_cache;
 	char *token_path;
 	int token_fd;
@@ -765,6 +783,7 @@ void bpf_program__unload(struct bpf_program *prog)
 
 	zfree(&prog->func_info);
 	zfree(&prog->line_info);
+	zfree(&prog->subprog);
 }
 
 static void bpf_program__exit(struct bpf_program *prog)
@@ -3945,6 +3964,13 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 			} else if (strcmp(name, ARENA_SEC) == 0) {
 				obj->efile.arena_data = data;
 				obj->efile.arena_data_shndx = idx;
+			} else if (strcmp(name, JUMPTABLES_SEC) == 0) {
+				obj->jumptables_data = malloc(data->d_size);
+				if (!obj->jumptables_data)
+					return -ENOMEM;
+				memcpy(obj->jumptables_data, data->d_buf, data->d_size);
+				obj->jumptables_data_sz = data->d_size;
+				obj->efile.jumptables_data_shndx = idx;
 			} else {
 				pr_info("elf: skipping unrecognized data section(%d) %s\n",
 					idx, name);
@@ -4599,6 +4625,16 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
 		return 0;
 	}
 
+	/* jump table data relocation */
+	if (shdr_idx == obj->efile.jumptables_data_shndx) {
+		reloc_desc->type = RELO_INSN_ARRAY;
+		reloc_desc->insn_idx = insn_idx;
+		reloc_desc->map_idx = -1;
+		reloc_desc->sym_off = sym->st_value;
+		reloc_desc->sym_size = sym->st_size;
+		return 0;
+	}
+
 	/* generic map reference relocation */
 	if (type == LIBBPF_MAP_UNSPEC) {
 		if (!bpf_object__shndx_is_maps(obj, shdr_idx)) {
@@ -6101,6 +6137,74 @@ static void poison_kfunc_call(struct bpf_program *prog, int relo_idx,
 	insn->imm = POISON_CALL_KFUNC_BASE + ext_idx;
 }
 
+static int create_jt_map(struct bpf_object *obj, int off, int size, int adjust_off)
+{
+	const __u32 value_size = sizeof(struct bpf_insn_array_value);
+	const __u32 max_entries = size / value_size;
+	struct bpf_insn_array_value val = {};
+	int map_fd, err;
+	__u64 xlated_off;
+	__u64 *jt;
+	__u32 i;
+
+	if (!obj->jumptables_data) {
+		pr_warn("object contains no jumptables_data\n");
+		return -EINVAL;
+	}
+	if ((off + size) > obj->jumptables_data_sz) {
+		pr_warn("jumptables_data size is %zd, trying to access %d\n",
+			obj->jumptables_data_sz, off + size);
+		return -EINVAL;
+	}
+
+	map_fd = bpf_map_create(BPF_MAP_TYPE_INSN_ARRAY, "jt",
+				4, value_size, max_entries, NULL);
+	if (map_fd < 0)
+		return map_fd;
+
+	jt = (__u64 *)(obj->jumptables_data + off);
+	for (i = 0; i < max_entries; i++) {
+		/*
+		 * LLVM-generated jump tables contain u64 records, but the
+		 * values must fit in u32. The adjust_off provided by the
+		 * caller makes the offset relative to the beginning of the
+		 * main function.
+		 */
+		xlated_off = jt[i] / sizeof(struct bpf_insn) + adjust_off;
+		if (xlated_off > UINT32_MAX) {
+			pr_warn("invalid jump table value %llx at offset %d (adjust_off %d)\n",
+				jt[i], off + i, adjust_off);
+			close(map_fd);
+			return -EINVAL;
+		}
+
+		val.xlated_off = xlated_off;
+		err = bpf_map_update_elem(map_fd, &i, &val, 0);
+		if (err) {
+			close(map_fd);
+			return err;
+		}
+	}
+	return map_fd;
+}
+
+/*
+ * In LLVM the .jumptables section contains jump table entries relative to the
+ * section start. The BPF kernel-side code expects jump table offsets relative
+ * to the beginning of the program (passed in bpf(BPF_PROG_LOAD)). This helper
+ * computes a delta to be added when creating a map.
+ */
+static int jt_adjust_off(struct bpf_program *prog, int insn_idx)
+{
+	int i;
+
+	for (i = prog->subprog_cnt - 1; i >= 0; i--)
+		if (insn_idx >= prog->subprog[i].sub_insn_off)
+			return prog->subprog[i].sub_insn_off - prog->subprog[i].sec_insn_off;
+
+	return -prog->sec_insn_off;
+}
+
 /* Relocate data references within program code:
  *  - map references;
  *  - global variable references;
@@ -6192,6 +6296,21 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 		case RELO_CORE:
 			/* will be handled by bpf_program_record_relos() */
 			break;
+		case RELO_INSN_ARRAY: {
+			int map_fd;
+
+			map_fd = create_jt_map(obj, relo->sym_off, relo->sym_size,
+					       jt_adjust_off(prog, relo->insn_idx));
+			if (map_fd < 0) {
+				pr_warn("prog '%s': relo #%d: can't create jump table: sym_off %u\n",
+						prog->name, i, relo->sym_off);
+				return map_fd;
+			}
+			insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+			insn->imm = map_fd;
+			insn->off = 0;
+		}
+			break;
 		default:
 			pr_warn("prog '%s': relo #%d: bad relo type %d\n",
 				prog->name, i, relo->type);
@@ -6389,6 +6508,24 @@ static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_progra
 	return 0;
 }
 
+static int save_subprog_offsets(struct bpf_program *main_prog, struct bpf_program *subprog)
+{
+	size_t size = sizeof(main_prog->subprog[0]);
+	int new_cnt = main_prog->subprog_cnt + 1;
+	void *tmp;
+
+	tmp = libbpf_reallocarray(main_prog->subprog, new_cnt, size);
+	if (!tmp)
+		return -ENOMEM;
+
+	main_prog->subprog = tmp;
+	main_prog->subprog[new_cnt - 1].sec_insn_off = subprog->sec_insn_off;
+	main_prog->subprog[new_cnt - 1].sub_insn_off = subprog->sub_insn_off;
+	main_prog->subprog_cnt = new_cnt;
+
+	return 0;
+}
+
 static int
 bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main_prog,
 				struct bpf_program *subprog)
@@ -6418,6 +6555,14 @@ bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main
 	err = append_subprog_relos(main_prog, subprog);
 	if (err)
 		return err;
+
+	/* Save subprogram offsets */
+	err = save_subprog_offsets(main_prog, subprog);
+	if (err) {
+		pr_warn("prog '%s': failed to add subprog offsets\n", main_prog->name);
+		return err;
+	}
+
 	return 0;
 }
 
@@ -9185,6 +9330,9 @@ void bpf_object__close(struct bpf_object *obj)
 
 	zfree(&obj->arena_data);
 
+	zfree(&obj->jumptables_data);
+	obj->jumptables_data_sz = 0;
+
 	free(obj);
 }
 
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index 9dfbe7750f56..bccf4bb747e1 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -364,6 +364,10 @@ static int probe_map_create(enum bpf_map_type map_type)
 	case BPF_MAP_TYPE_SOCKHASH:
 	case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
 		break;
+	case BPF_MAP_TYPE_INSN_ARRAY:
+		key_size	= sizeof(__u32);
+		value_size	= sizeof(struct bpf_insn_array_value);
+		break;
 	case BPF_MAP_TYPE_UNSPEC:
 	default:
 		return -EOPNOTSUPP;
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
index a469e5d4fee7..d1585baa9f14 100644
--- a/tools/lib/bpf/linker.c
+++ b/tools/lib/bpf/linker.c
@@ -28,6 +28,8 @@
 #include "str_error.h"
 
 #define BTF_EXTERN_SEC ".extern"
+#define JUMPTABLES_SEC ".jumptables"
+#define JUMPTABLES_REL_SEC ".rel.jumptables"
 
 struct src_sec {
 	const char *sec_name;
@@ -2026,6 +2028,9 @@ static int linker_append_elf_sym(struct bpf_linker *linker, struct src_obj *obj,
 			obj->sym_map[src_sym_idx] = dst_sec->sec_sym_idx;
 			return 0;
 		}
+
+		if (strcmp(src_sec->sec_name, JUMPTABLES_SEC) == 0)
+			goto add_sym;
 	}
 
 	if (sym_bind == STB_LOCAL)
@@ -2272,8 +2277,9 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob
 						insn->imm += sec->dst_off / sizeof(struct bpf_insn);
 					else
 						insn->imm += sec->dst_off;
-				} else {
-					pr_warn("relocation against STT_SECTION in non-exec section is not supported!\n");
+				} else if (strcmp(src_sec->sec_name, JUMPTABLES_REL_SEC)) {
+					pr_warn("relocation against STT_SECTION in section %s is not supported!\n",
+						src_sec->sec_name);
 					return -EINVAL;
 				}
 			}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 bpf-next 12/13] bpftool: Recognize insn_array map type
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (10 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 11/13] libbpf: support llvm-generated indirect jumps Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-18  9:38 ` [PATCH v3 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps Anton Protopopov
  2025-09-19  6:46 ` [PATCH v3 bpf-next 00/13] BPF " Eduard Zingerman
  13 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Teach bpftool to recognize the instruction array (insn_array) map type.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Acked-by: Quentin Monnet <qmo@kernel.org>
---
 tools/bpf/bpftool/Documentation/bpftool-map.rst | 3 ++-
 tools/bpf/bpftool/map.c                         | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst
index 252e4c538edb..1af3305ea2b2 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-map.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst
@@ -55,7 +55,8 @@ MAP COMMANDS
 |     | **devmap** | **devmap_hash** | **sockmap** | **cpumap** | **xskmap** | **sockhash**
 |     | **cgroup_storage** | **reuseport_sockarray** | **percpu_cgroup_storage**
 |     | **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** | **inode_storage**
-|     | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** | **arena** }
+|     | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** | **arena**
+|     | **insn_array** }
 
 DESCRIPTION
 ===========
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index c9de44a45778..7ebf7dbcfba4 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -1477,7 +1477,8 @@ static int do_help(int argc, char **argv)
 		"                 devmap | devmap_hash | sockmap | cpumap | xskmap | sockhash |\n"
 		"                 cgroup_storage | reuseport_sockarray | percpu_cgroup_storage |\n"
 		"                 queue | stack | sk_storage | struct_ops | ringbuf | inode_storage |\n"
-		"                 task_storage | bloom_filter | user_ringbuf | cgrp_storage | arena }\n"
+		"                 task_storage | bloom_filter | user_ringbuf | cgrp_storage | arena |\n"
+		"                 insn_array }\n"
 		"       " HELP_SPEC_OPTIONS " |\n"
 		"                    {-f|--bpffs} | {-n|--nomount} }\n"
 		"",
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v3 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (11 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 12/13] bpftool: Recognize insn_array map type Anton Protopopov
@ 2025-09-18  9:38 ` Anton Protopopov
  2025-09-20  0:58   ` Eduard Zingerman
  2025-09-19  6:46 ` [PATCH v3 bpf-next 00/13] BPF " Eduard Zingerman
  13 siblings, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-18  9:38 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add selftests for indirect jumps. All the indirect jumps are
generated from C switch statements, so if compiled by a compiler
which doesn't support indirect jumps, they should pass as well.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 tools/testing/selftests/bpf/Makefile          |   4 +-
 .../selftests/bpf/prog_tests/bpf_gotox.c      | 132 ++++++
 tools/testing/selftests/bpf/progs/bpf_gotox.c | 384 ++++++++++++++++++
 3 files changed, 519 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_gotox.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 11d2a368db3e..606d7d5a48a7 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -453,7 +453,9 @@ BPF_CFLAGS = -g -Wall -Werror -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN)	\
 	     -I$(abspath $(OUTPUT)/../usr/include)			\
 	     -std=gnu11		 					\
 	     -fno-strict-aliasing 					\
-	     -Wno-compare-distinct-pointer-types
+	     -Wno-compare-distinct-pointer-types			\
+	     -Wno-initializer-overrides					\
+	     #
 # TODO: enable me -Wsign-compare
 
 CLANG_CFLAGS = $(CLANG_SYS_INCLUDES)
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c b/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
new file mode 100644
index 000000000000..90647c080579
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+
+#include <linux/if_ether.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/in6.h>
+#include <linux/udp.h>
+#include <linux/tcp.h>
+
+#include <sys/syscall.h>
+#include <bpf/bpf.h>
+
+#include "bpf_gotox.skel.h"
+
+static void __test_run(struct bpf_program *prog, void *ctx_in, size_t ctx_size_in)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, topts,
+			    .ctx_in = ctx_in,
+			    .ctx_size_in = ctx_size_in,
+		   );
+	int err, prog_fd;
+
+	prog_fd = bpf_program__fd(prog);
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "test_run_opts err");
+}
+
+static void check_simple(struct bpf_gotox *skel,
+			 struct bpf_program *prog,
+			 __u64 ctx_in,
+			 __u64 expected)
+{
+	skel->bss->ret_user = 0;
+
+	__test_run(prog, &ctx_in, sizeof(ctx_in));
+
+	if (!ASSERT_EQ(skel->bss->ret_user, expected, "skel->bss->ret_user"))
+		return;
+}
+
+static void check_simple_fentry(struct bpf_gotox *skel,
+				struct bpf_program *prog,
+				__u64 ctx_in,
+				__u64 expected)
+{
+	skel->bss->in_user = ctx_in;
+	skel->bss->ret_user = 0;
+
+	/* trigger */
+	usleep(1);
+
+	if (!ASSERT_EQ(skel->bss->ret_user, expected, "skel->bss->ret_user"))
+		return;
+}
+
+static void check_gotox_skel(struct bpf_gotox *skel)
+{
+	int i;
+	__u64 in[]   = {0, 1, 2, 3, 4,  5, 77};
+	__u64 out[]  = {2, 3, 4, 5, 7, 19, 19};
+	__u64 out2[] = {103, 104, 107, 205, 115, 1019, 1019};
+	__u64 in3[]  = {0, 11, 27, 31, 22, 45, 99};
+	__u64 out3[] = {2,  3,  4,  5, 19, 19, 19};
+	__u64 in4[]  = {0, 1, 2, 3, 4,  5, 77};
+	__u64 out4[] = {12, 15, 7, 15, 12, 15, 15};
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.simple_test, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.simple_test2, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.two_switches, in[i], out2[i]);
+
+	if (0) for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.big_jump_table, in3[i], out3[i]);
+
+	if (0) for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.one_jump_two_maps, in4[i], out4[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.use_static_global1, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.use_static_global2, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.use_nonstatic_global1, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.use_nonstatic_global2, in[i], out[i]);
+
+	bpf_program__attach(skel->progs.simple_test_other_sec);
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple_fentry(skel, skel->progs.simple_test_other_sec, in[i], out[i]);
+
+	bpf_program__attach(skel->progs.use_static_global_other_sec);
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple_fentry(skel, skel->progs.use_static_global_other_sec, in[i], out[i]);
+
+	bpf_program__attach(skel->progs.use_nonstatic_global_other_sec);
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple_fentry(skel, skel->progs.use_nonstatic_global_other_sec, in[i], out[i]);
+}
+
+void gotox_skel(void)
+{
+	struct bpf_gotox *skel;
+	int ret;
+
+	skel = bpf_gotox__open();
+	if (!ASSERT_NEQ(skel, NULL, "bpf_gotox__open"))
+		return;
+
+	ret = bpf_gotox__load(skel);
+	if (!ASSERT_OK(ret, "bpf_gotox__load"))
+		goto cleanup;
+
+	check_gotox_skel(skel);
+
+cleanup:
+	bpf_gotox__destroy(skel);
+}
+
+void test_bpf_gotox(void)
+{
+	if (test__start_subtest("gotox_skel"))
+		gotox_skel();
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_gotox.c b/tools/testing/selftests/bpf/progs/bpf_gotox.c
new file mode 100644
index 000000000000..72917f34315c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_gotox.c
@@ -0,0 +1,384 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_misc.h"
+
+__u64 in_user;
+__u64 ret_user;
+
+struct simple_ctx {
+	__u64 x;
+};
+
+__u64 some_var;
+
+/*
+ * This function adds code which will be replaced by a different
+ * number of instructions by the verifier. This adds additional
+ * stress on testing the insn_array maps corresponding to indirect jumps.
+ */
+static __always_inline void adjust_insns(__u64 x)
+{
+	some_var ^= x + bpf_jiffies64();
+}
+
+SEC("syscall")
+int simple_test(struct simple_ctx *ctx)
+{
+	switch (ctx->x) {
+	case 0:
+		adjust_insns(ctx->x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(ctx->x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(ctx->x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(ctx->x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(ctx->x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(ctx->x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int simple_test2(struct simple_ctx *ctx)
+{
+	switch (ctx->x) {
+	case 0:
+		adjust_insns(ctx->x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(ctx->x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(ctx->x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(ctx->x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(ctx->x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(ctx->x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int simple_test_other_sec(struct pt_regs *ctx)
+{
+	__u64 x = in_user;
+
+	switch (x) {
+	case 0:
+		adjust_insns(x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int two_switches(struct simple_ctx *ctx)
+{
+	switch (ctx->x) {
+	case 0:
+		adjust_insns(ctx->x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(ctx->x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(ctx->x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(ctx->x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(ctx->x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(ctx->x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	switch (ctx->x + !!ret_user) {
+	case 1:
+		adjust_insns(ctx->x + 7);
+		ret_user = 103;
+		break;
+	case 2:
+		adjust_insns(ctx->x + 9);
+		ret_user = 104;
+		break;
+	case 3:
+		adjust_insns(ctx->x + 11);
+		ret_user = 107;
+		break;
+	case 4:
+		adjust_insns(ctx->x + 11);
+		ret_user = 205;
+		break;
+	case 5:
+		adjust_insns(ctx->x + 11);
+		ret_user = 115;
+		break;
+	default:
+		adjust_insns(ctx->x + 177);
+		ret_user = 1019;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int big_jump_table(struct simple_ctx *ctx __attribute__((unused)))
+{
+#if 0
+	const void *const jt[256] = {
+		[0 ... 255] = &&default_label,
+		[0] = &&l0,
+		[11] = &&l11,
+		[27] = &&l27,
+		[31] = &&l31,
+	};
+
+	goto *jt[ctx->x & 0xff];
+
+l0:
+	adjust_insns(ctx->x + 1);
+	ret_user = 2;
+	return 0;
+
+l11:
+	adjust_insns(ctx->x + 7);
+	ret_user = 3;
+	return 0;
+
+l27:
+	adjust_insns(ctx->x + 9);
+	ret_user = 4;
+	return 0;
+
+l31:
+	adjust_insns(ctx->x + 11);
+	ret_user = 5;
+	return 0;
+
+default_label:
+	adjust_insns(ctx->x + 177);
+	ret_user = 19;
+	return 0;
+#else
+	return 0;
+#endif
+}
+
+SEC("syscall")
+int one_jump_two_maps(struct simple_ctx *ctx __attribute__((unused)))
+{
+#if 0
+	__label__ l1, l2, l3, l4;
+	void *jt1[2] = { &&l1, &&l2 };
+	void *jt2[2] = { &&l3, &&l4 };
+	unsigned int a = ctx->x % 2;
+	unsigned int b = (ctx->x / 2) % 2;
+	volatile int ret = 0;
+
+	if (!(a < 2 && b < 2))
+		return 19;
+
+	if (ctx->x % 2)
+		goto *jt1[a];
+	else
+		goto *jt2[b];
+
+	l1: ret += 1;
+	l2: ret += 3;
+	l3: ret += 5;
+	l4: ret += 7;
+
+	ret_user = ret;
+	return ret;
+#else
+	return 0;
+#endif
+}
+
+/* Just to introduce some non-zero offsets in .text */
+static __noinline int f0(volatile struct simple_ctx *ctx __arg_ctx)
+{
+	if (ctx)
+		return 1;
+	else
+		return 13;
+}
+
+SEC("syscall") int f1(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	return f0(ctx);
+}
+
+static __noinline int __static_global(__u64 x)
+{
+	switch (x) {
+	case 0:
+		adjust_insns(x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int use_static_global1(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	return __static_global(ctx->x);
+}
+
+SEC("syscall")
+int use_static_global2(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	adjust_insns(ctx->x + 1);
+	return __static_global(ctx->x);
+}
+
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int use_static_global_other_sec(void *ctx)
+{
+	return __static_global(in_user);
+}
+
+__noinline int __nonstatic_global(__u64 x)
+{
+	switch (x) {
+	case 0:
+		adjust_insns(x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int use_nonstatic_global1(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	return __nonstatic_global(ctx->x);
+}
+
+SEC("syscall")
+int use_nonstatic_global2(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	adjust_insns(ctx->x + 1);
+	return __nonstatic_global(ctx->x);
+}
+
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int use_nonstatic_global_other_sec(void *ctx)
+{
+	return __nonstatic_global(in_user);
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 01/13] bpf: fix the return value of push_stack
  2025-09-18  9:38 ` [PATCH v3 bpf-next 01/13] bpf: fix the return value of push_stack Anton Protopopov
@ 2025-09-19  0:17   ` Eduard Zingerman
  2025-09-19  7:18     ` Anton Protopopov
  0 siblings, 1 reply; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19  0:17 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> In [1] Eduard mentioned that on push_stack failure verifier code
> should return -ENOMEM instead of -EFAULT. After checking the
> other call sites I found that the code inconsistently returns either
> -ENOMEM or -EFAULT. This patch unifies the return values for the
> push_stack (and the similar push_async_cb) functions so that error
> codes are always assigned properly.
> 
>   [1] https://lore.kernel.org/bpf/20250615085943.3871208-1-a.s.protopopov@gmail.com
> 
> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> ---

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

> @@ -14256,7 +14255,7 @@ sanitize_speculative_path(struct bpf_verifier_env *env,
>  			mark_reg_unknown(env, regs, insn->src_reg);
>  		}
>  	}
> -	return branch;
> +	return IS_ERR(branch) ? PTR_ERR(branch) : 0;

Nit: this is the same as PTR_ERR_OR_ZERO.

>  }
>  
>  static int sanitize_ptr_alu(struct bpf_verifier_env *env,

[...]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-18  9:38 ` [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding Anton Protopopov
@ 2025-09-19  6:35   ` Eduard Zingerman
  2025-09-19  7:05     ` Anton Protopopov
  0 siblings, 1 reply; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19  6:35 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index a7ad4fe756da..5c1e4e37d1f8 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>  	struct bpf_insn *insn;
>  	void *old_bpf_func;
>  	int err, num_exentries;
> +	int old_len, subprog_start_adjustment = 0;
>  
>  	if (env->subprog_cnt <= 1)
>  		return 0;
> @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>  		func[i]->aux->func_idx = i;
>  		/* Below members will be freed only at prog->aux */
>  		func[i]->aux->btf = prog->aux->btf;
> -		func[i]->aux->subprog_start = subprog_start;
> +		func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
>  		func[i]->aux->func_info = prog->aux->func_info;
>  		func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
>  		func[i]->aux->poke_tab = prog->aux->poke_tab;
> @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>  		func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
>  		if (!i)
>  			func[i]->aux->exception_boundary = env->seen_exception;
> +
> +		/*
> +		 * To properly pass the absolute subprog start to jit
> +		 * all instruction adjustments should be accumulated
> +		 */
> +		old_len = func[i]->len;
>  		func[i] = bpf_int_jit_compile(func[i]);
> +		subprog_start_adjustment += func[i]->len - old_len;
> +
>  		if (!func[i]->jited) {
>  			err = -ENOTSUPP;
>  			goto out_free;

This change makes sense, however, would it be possible to move
bpf_jit_blind_constants() out from jit to verifier.c:do_check,
somewhere after do_misc_fixups?
Looking at the source code, bpf_jit_blind_constants() is the first
thing any bpf_int_jit_compile() does.
Another alternative is to add adjust_subprog_starts() call to this
function. Wdyt?


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 00/13] BPF indirect jumps
  2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (12 preceding siblings ...)
  2025-09-18  9:38 ` [PATCH v3 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps Anton Protopopov
@ 2025-09-19  6:46 ` Eduard Zingerman
  2025-09-19 14:57   ` Anton Protopopov
  2025-09-19 17:27   ` Eduard Zingerman
  13 siblings, 2 replies; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19  6:46 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> This patchset implements a new type of map, instruction set, and uses
> it to build support for indirect branches in BPF (on x86). (The same
> map will be later used to provide support for indirect calls and static
> keys.) See [1], [2] for more context.

With this patch-set on top of the bpf-next at commit [1],
I get a KASAN bug report [2] when running `./test_progs -t tailcalls`.
Does not happen w/o this series applied.
Kernel is compiled with gcc 15.2.1, selftests are compiled with clang
20.1.8 (w/o gotox support).

[1] 3547a61ee2fe ("Merge branch 'update-kf_rcu_protected'")
[2] https://gist.github.com/eddyz87/8f82545db32223d8a80d2ca69a47bbc2

[...]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-19  6:35   ` Eduard Zingerman
@ 2025-09-19  7:05     ` Anton Protopopov
  2025-09-19  7:12       ` Eduard Zingerman
  0 siblings, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-19  7:05 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Quentin Monnet, Yonghong Song

On 25/09/18 11:35PM, Eduard Zingerman wrote:
> On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> 
> [...]
> 
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index a7ad4fe756da..5c1e4e37d1f8 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> >  	struct bpf_insn *insn;
> >  	void *old_bpf_func;
> >  	int err, num_exentries;
> > +	int old_len, subprog_start_adjustment = 0;
> >  
> >  	if (env->subprog_cnt <= 1)
> >  		return 0;
> > @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> >  		func[i]->aux->func_idx = i;
> >  		/* Below members will be freed only at prog->aux */
> >  		func[i]->aux->btf = prog->aux->btf;
> > -		func[i]->aux->subprog_start = subprog_start;
> > +		func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
> >  		func[i]->aux->func_info = prog->aux->func_info;
> >  		func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
> >  		func[i]->aux->poke_tab = prog->aux->poke_tab;
> > @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> >  		func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
> >  		if (!i)
> >  			func[i]->aux->exception_boundary = env->seen_exception;
> > +
> > +		/*
> > +		 * To properly pass the absolute subprog start to jit
> > +		 * all instruction adjustments should be accumulated
> > +		 */
> > +		old_len = func[i]->len;
> >  		func[i] = bpf_int_jit_compile(func[i]);
> > +		subprog_start_adjustment += func[i]->len - old_len;
> > +
> >  		if (!func[i]->jited) {
> >  			err = -ENOTSUPP;
> >  			goto out_free;
> 
> This change makes sense, however, would it be possible to move
> bpf_jit_blind_constants() out from jit to verifier.c:do_check,
> somewhere after do_misc_fixups?
> Looking at the source code, bpf_jit_blind_constants() is the first
> thing any bpf_int_jit_compile() does.
> Another alternative is to add adjust_subprog_starts() call to this
> function. Wdyt?

Yes, it makes total sense. Blinding was added to the x86 JIT initially and then
every other JIT copy-pasted it.  I considered moving blinding up some time
back (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
but decided against it, as it requires patching every JIT, and I am not sure
how to test such a change (any hints?)

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-19  7:05     ` Anton Protopopov
@ 2025-09-19  7:12       ` Eduard Zingerman
  2025-09-19 18:26         ` Alexei Starovoitov
  0 siblings, 1 reply; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19  7:12 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Quentin Monnet, Yonghong Song

On Fri, 2025-09-19 at 07:05 +0000, Anton Protopopov wrote:
> On 25/09/18 11:35PM, Eduard Zingerman wrote:
> > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > 
> > [...]
> > 
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index a7ad4fe756da..5c1e4e37d1f8 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > >  	struct bpf_insn *insn;
> > >  	void *old_bpf_func;
> > >  	int err, num_exentries;
> > > +	int old_len, subprog_start_adjustment = 0;
> > >  
> > >  	if (env->subprog_cnt <= 1)
> > >  		return 0;
> > > @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > >  		func[i]->aux->func_idx = i;
> > >  		/* Below members will be freed only at prog->aux */
> > >  		func[i]->aux->btf = prog->aux->btf;
> > > -		func[i]->aux->subprog_start = subprog_start;
> > > +		func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
> > >  		func[i]->aux->func_info = prog->aux->func_info;
> > >  		func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
> > >  		func[i]->aux->poke_tab = prog->aux->poke_tab;
> > > @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > >  		func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
> > >  		if (!i)
> > >  			func[i]->aux->exception_boundary = env->seen_exception;
> > > +
> > > +		/*
> > > +		 * To properly pass the absolute subprog start to jit
> > > +		 * all instruction adjustments should be accumulated
> > > +		 */
> > > +		old_len = func[i]->len;
> > >  		func[i] = bpf_int_jit_compile(func[i]);
> > > +		subprog_start_adjustment += func[i]->len - old_len;
> > > +
> > >  		if (!func[i]->jited) {
> > >  			err = -ENOTSUPP;
> > >  			goto out_free;
> > 
> > This change makes sense, however, would it be possible to move
> > bpf_jit_blind_constants() out from jit to verifier.c:do_check,
> > somewhere after do_misc_fixups?
> > Looking at the source code, bpf_jit_blind_constants() is the first
> > thing any bpf_int_jit_compile() does.
> > Another alternative is to add adjust_subprog_starts() call to this
> > function. Wdyt?
> 
> Yes, it makes total sense. Blinding was added to the x86 JIT initially and then
> every other JIT copy-pasted it.  I considered moving blinding up some time
> back (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
> but decided against it, as it requires patching every JIT, and I am not sure
> how to test such a change (any hints?)

We have the following covered by CI:
- arch/x86/net/bpf_jit_comp.c
- arch/s390/net/bpf_jit_comp.c
- arch/arm64/net/bpf_jit_comp.c

People work on these jits actively:
- arch/riscv/net/bpf_jit_core.c
- arch/loongarch/net/bpf_jit.c
- arch/powerpc/net/bpf_jit_comp.c

So, we can probably ask to test the patch-set.

The remaining are:
- arch/x86/net/bpf_jit_comp32.c
- arch/parisc/net/bpf_jit_core.c
- arch/mips/net/bpf_jit_comp.c
- arch/arm/net/bpf_jit_32.c
- arch/sparc/net/bpf_jit_comp_64.c
- arch/arc/net/bpf_jit_core.c

The change to each individual jit is not complicated, just removing
the transformation call. Idk, I'd just go for it.
Maybe Alexei has concerns?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 01/13] bpf: fix the return value of push_stack
  2025-09-19  0:17   ` Eduard Zingerman
@ 2025-09-19  7:18     ` Anton Protopopov
  0 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-19  7:18 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Quentin Monnet, Yonghong Song

On 25/09/18 05:17PM, Eduard Zingerman wrote:
> On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > In [1] Eduard mentioned that on push_stack failure verifier code
> > should return -ENOMEM instead of -EFAULT. After checking the
> > other call sites I found that the code inconsistently returns either
> > -ENOMEM or -EFAULT. This patch unifies the return values for the
> > push_stack (and the similar push_async_cb) functions so that error
> > codes are always assigned properly.
> > 
> >   [1] https://lore.kernel.org/bpf/20250615085943.3871208-1-a.s.protopopov@gmail.com
> > 
> > Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> > ---
> 
> Acked-by: Eduard Zingerman <eddyz87@gmail.com>
> 
> > @@ -14256,7 +14255,7 @@ sanitize_speculative_path(struct bpf_verifier_env *env,
> >  			mark_reg_unknown(env, regs, insn->src_reg);
> >  		}
> >  	}
> > -	return branch;
> > +	return IS_ERR(branch) ? PTR_ERR(branch) : 0;
> 
> Nit: this is the same as PTR_ERR_OR_ZERO.

thanks, fixed

> >  }
> >  
> >  static int sanitize_ptr_alu(struct bpf_verifier_env *env,
> 
> [...]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 00/13] BPF indirect jumps
  2025-09-19  6:46 ` [PATCH v3 bpf-next 00/13] BPF " Eduard Zingerman
@ 2025-09-19 14:57   ` Anton Protopopov
  2025-09-19 16:49     ` Eduard Zingerman
  2025-09-19 17:27   ` Eduard Zingerman
  1 sibling, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-19 14:57 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Quentin Monnet, Yonghong Song

On 25/09/18 11:46PM, Eduard Zingerman wrote:
> On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > This patchset implements a new type of map, instruction set, and uses
> > it to build support for indirect branches in BPF (on x86). (The same
> > map will be later used to provide support for indirect calls and static
> > keys.) See [1], [2] for more context.
> 
> With this patch-set on top of the bpf-next at commit [1],
> I get a KASAN bug report [2] when running `./test_progs -t tailcalls`.
> Does not happen w/o this series applied.
> Kernel is compiled with gcc 15.2.1, selftests are compiled with clang
> 20.1.8 (w/o gotox support).
> 
> [1] 3547a61ee2fe ("Merge branch 'update-kf_rcu_protected'")
> [2] https://gist.github.com/eddyz87/8f82545db32223d8a80d2ca69a47bbc2
> 
> [...]

Can't reproduce on my setup yet (though I am using a different set of
compilers; will try to reproduce with ~ your versions).

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 00/13] BPF indirect jumps
  2025-09-19 14:57   ` Anton Protopopov
@ 2025-09-19 16:49     ` Eduard Zingerman
  0 siblings, 0 replies; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19 16:49 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Quentin Monnet, Yonghong Song

[-- Attachment #1: Type: text/plain, Size: 1167 bytes --]

On Fri, 2025-09-19 at 14:57 +0000, Anton Protopopov wrote:
> 25/09/18 11:46PM, Eduard Zingerman wrote:
> > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > > This patchset implements a new type of map, instruction set, and uses
> > > it to build support for indirect branches in BPF (on x86). (The same
> > > map will be later used to provide support for indirect calls and static
> > > keys.) See [1], [2] for more context.
> > 
> > With this patch-set on top of the bpf-next at commit [1],
> > I get a KASAN bug report [2] when running `./test_progs -t tailcalls`.
> > Does not happen w/o this series applied.
> > Kernel is compiled with gcc 15.2.1, selftests are compiled with clang
> > 20.1.8 (w/o gotox support).
> > 
> > [1] 3547a61ee2fe ("Merge branch 'update-kf_rcu_protected'")
> > [2] https://gist.github.com/eddyz87/8f82545db32223d8a80d2ca69a47bbc2
> > 
> > [...]
> 
> Can't reproduce on my setup yet (though I am using a different set of
> compilers; will try to reproduce with ~ your versions).

Double-checked with `git clean -xfd` between retries, see same outcome
as before. Attaching script I use for kernel configuration.

[-- Attachment #2: kernel-config-addons --]
[-- Type: text/plain, Size: 939 bytes --]

# need ethernet device for ssh connection
CONFIG_E1000=y

# use sda as boot device
CONFIG_ATA=y
CONFIG_SATA_AHCI=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y

# allow using gdb from outside the VM
CONFIG_GDB_SCRIPTS=y
CONFIG_DEBUG_PREEMPT=y
CONFIG_KGDB=y
CONFIG_UNWINDER_FRAME_POINTER=y
CONFIG_FRAME_POINTER=y

# should be useful, but idk
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
CONFIG_KVM_GUEST=y

# used by BPF exceptions
CONFIG_UNWINDER_ORC=y

# boot from vmlinux
CONFIG_PVH=y

# to avoid kernel build error about legacy openssl ciphers
CONFIG_MODULE_SIG_SHA512=y

# sched_ext
CONFIG_SCHED_CLASS_EXT=y

# address sanitizer
#CONFIG_HAVE_ARCH_KASAN=y
#CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
#CONFIG_CC_HAS_KASAN_GENERIC=y
#CONFIG_CC_HAS_KASAN_SW_TAGS=y
#CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
CONFIG_KASAN=y
#CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX=y
#CONFIG_KASAN_GENERIC=y
#CONFIG_KASAN_INLINE=y
#CONFIG_KASAN_STACK=y
#CONFIG_KASAN_VMALLOC=y

[-- Attachment #3: make-kernel-config.sh --]
[-- Type: application/x-shellscript, Size: 643 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 00/13] BPF indirect jumps
  2025-09-19  6:46 ` [PATCH v3 bpf-next 00/13] BPF " Eduard Zingerman
  2025-09-19 14:57   ` Anton Protopopov
@ 2025-09-19 17:27   ` Eduard Zingerman
  2025-09-19 18:03     ` Eduard Zingerman
  1 sibling, 1 reply; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19 17:27 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Thu, 2025-09-18 at 23:46 -0700, Eduard Zingerman wrote:
> On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > This patchset implements a new type of map, instruction set, and uses
> > it to build support for indirect branches in BPF (on x86). (The same
> > map will be later used to provide support for indirect calls and static
> > keys.) See [1], [2] for more context.
> 
> With this patch-set on top of the bpf-next at commit [1],
> I get a KASAN bug report [2] when running `./test_progs -t tailcalls`.
> Does not happen w/o this series applied.
> Kernel is compiled with gcc 15.2.1, selftests are compiled with clang
> 20.1.8 (w/o gotox support).
> 
> [1] 3547a61ee2fe ("Merge branch 'update-kf_rcu_protected'")
> [2] https://gist.github.com/eddyz87/8f82545db32223d8a80d2ca69a47bbc2
> 
> [...]

Bisect points to patch #7 "bpf, x86: allow indirect jumps to r8...r15".

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 00/13] BPF indirect jumps
  2025-09-19 17:27   ` Eduard Zingerman
@ 2025-09-19 18:03     ` Eduard Zingerman
  0 siblings, 0 replies; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19 18:03 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Fri, 2025-09-19 at 10:27 -0700, Eduard Zingerman wrote:
> On Thu, 2025-09-18 at 23:46 -0700, Eduard Zingerman wrote:
> > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > > This patchset implements a new type of map, instruction set, and uses
> > > it to build support for indirect branches in BPF (on x86). (The same
> > > map will be later used to provide support for indirect calls and static
> > > keys.) See [1], [2] for more context.
> > 
> > With this patch-set on top of the bpf-next at commit [1],
> > I get a KASAN bug report [2] when running `./test_progs -t tailcalls`.
> > Does not happen w/o this series applied.
> > Kernel is compiled with gcc 15.2.1, selftests are compiled with clang
> > 20.1.8 (w/o gotox support).
> > 
> > [1] 3547a61ee2fe ("Merge branch 'update-kf_rcu_protected'")
> > [2] https://gist.github.com/eddyz87/8f82545db32223d8a80d2ca69a47bbc2
> > 
> > [...]
> 
> Bisect points to patch #7 "bpf, x86: allow indirect jumps to r8...r15".

And this does not happen on my other machine.
I inserted a few printks, on the good machine #3 is printed,
on the bad machine #4 is printed:

  static void emit_indirect_jump(u8 **pprog, int bpf_reg, u8 *ip)
  {
	u8 *prog = *pprog;
	int reg = reg2hex[bpf_reg];
	bool ereg = is_ereg(bpf_reg);

	if (cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) {
+		printk("emit_indirect_jump #1\n");
		OPTIMIZER_HIDE_VAR(reg);
		emit_jump(&prog, its_static_thunk(reg), ip);
	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {
+		printk("emit_indirect_jump #2\n");
		EMIT_LFENCE();
		__emit_indirect_jump(pprog, reg, ereg);
	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
+		printk("emit_indirect_jump #3\n");
		OPTIMIZER_HIDE_VAR(reg);
		if (cpu_feature_enabled(X86_FEATURE_CALL_DEPTH))
			emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg + 8*ereg], ip);
		else
			emit_jump(&prog, &__x86_indirect_thunk_array[reg + 8*ereg], ip);
	} else {
+		printk("emit_indirect_jump #4\n");
		__emit_indirect_jump(pprog, reg, ereg);
		if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) || IS_ENABLED(CONFIG_MITIGATION_SLS))
			EMIT1(0xCC);		/* int3 */
	}

	*pprog = prog;
  }

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15
  2025-09-18  9:38 ` [PATCH v3 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15 Anton Protopopov
@ 2025-09-19 18:25   ` Eduard Zingerman
  2025-09-19 18:38     ` Eduard Zingerman
  0 siblings, 1 reply; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19 18:25 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> Currently the emit_indirect_jump() function only accepts one of the
> RAX, RCX, ..., RBP registers as the destination. Make it accept
> R8, R9, ..., R15 as well, and make callers pass BPF registers, not
> native registers. This is required to enable indirect jump support
> in eBPF.
>
> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> ---
>  arch/x86/net/bpf_jit_comp.c | 28 +++++++++++++++++++++-------
>  1 file changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 8792d7f371d3..fcebb48742ae 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -660,24 +660,38 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
>
>  #define EMIT_LFENCE()	EMIT3(0x0F, 0xAE, 0xE8)
>
> -static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
> +static void __emit_indirect_jump(u8 **pprog, int reg, bool ereg)
>  {
>  	u8 *prog = *pprog; <------------------------------------------------------------------------.
>                                                                                                   |
> +	if (ereg)                                                                                   |
> +		EMIT1(0x41);                                                                        |
> +                                                                                                 |
> +	EMIT2(0xFF, 0xE0 + reg);                                                                    |
> +                                                                                                 |
> +	*pprog = prog;     <------------------------------------------------------------------------|
> +}                                                                                                |
> +                                                                                                 |
> +static void emit_indirect_jump(u8 **pprog, int bpf_reg, u8 *ip)                                  |
> +{                                                                                                |
> +	u8 *prog = *pprog; <------------------------------------------------------------------------|
> +	int reg = reg2hex[bpf_reg];                                                                 |
> +	bool ereg = is_ereg(bpf_reg);                                                               |
> +                                                                                                 |
>  	if (cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) {                                  |
>  		OPTIMIZER_HIDE_VAR(reg);                                                            |
>  		emit_jump(&prog, its_static_thunk(reg), ip);                                        |
>  	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {                             |
>  		EMIT_LFENCE();                                                                      |
> -		EMIT2(0xFF, 0xE0 + reg);                                                            |
> +		__emit_indirect_jump(pprog, reg, ereg);                                             |
>  	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {                                    |
>  		OPTIMIZER_HIDE_VAR(reg);                                                            |
>  		if (cpu_feature_enabled(X86_FEATURE_CALL_DEPTH))                                    |
> -			emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg], ip);                |
> +			emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg + 8*ereg], ip);       |
>  		else                                                                                |
> -			emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip);                     |
> +			emit_jump(&prog, &__x86_indirect_thunk_array[reg + 8*ereg], ip);            |
>  	} else {                                                                                    |
> -		EMIT2(0xFF, 0xE0 + reg);	/* jmp *%\reg */                                    |
> +		__emit_indirect_jump(pprog, reg, ereg);    <----------------------------------------|
                                                                                                    |
                You need to re-read *pprog after __emit_indirect_jump() call                        |
                this is what causes KASAN error I reported in the sibling thread.                   |
                W/o re-reading it the FF E1 emitted above is overwritten to CC E1                   |
                by EMIT1(0xCC) below.                                                               |
                                                                                                    |
>  		if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) || IS_ENABLED(CONFIG_MITIGATION_SLS))   |
>  			EMIT1(0xCC);		/* int3 */ <----------------------------------------'
>  	}

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-19  7:12       ` Eduard Zingerman
@ 2025-09-19 18:26         ` Alexei Starovoitov
  2025-09-19 19:28           ` Daniel Borkmann
  0 siblings, 1 reply; 46+ messages in thread
From: Alexei Starovoitov @ 2025-09-19 18:26 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Fri, Sep 19, 2025 at 12:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Fri, 2025-09-19 at 07:05 +0000, Anton Protopopov wrote:
> > On 25/09/18 11:35PM, Eduard Zingerman wrote:
> > > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > >
> > > [...]
> > >
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index a7ad4fe756da..5c1e4e37d1f8 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > >   struct bpf_insn *insn;
> > > >   void *old_bpf_func;
> > > >   int err, num_exentries;
> > > > + int old_len, subprog_start_adjustment = 0;
> > > >
> > > >   if (env->subprog_cnt <= 1)
> > > >           return 0;
> > > > @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > >           func[i]->aux->func_idx = i;
> > > >           /* Below members will be freed only at prog->aux */
> > > >           func[i]->aux->btf = prog->aux->btf;
> > > > -         func[i]->aux->subprog_start = subprog_start;
> > > > +         func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
> > > >           func[i]->aux->func_info = prog->aux->func_info;
> > > >           func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
> > > >           func[i]->aux->poke_tab = prog->aux->poke_tab;
> > > > @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > >           func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
> > > >           if (!i)
> > > >                   func[i]->aux->exception_boundary = env->seen_exception;
> > > > +
> > > > +         /*
> > > > +          * To properly pass the absolute subprog start to jit
> > > > +          * all instruction adjustments should be accumulated
> > > > +          */
> > > > +         old_len = func[i]->len;
> > > >           func[i] = bpf_int_jit_compile(func[i]);
> > > > +         subprog_start_adjustment += func[i]->len - old_len;
> > > > +
> > > >           if (!func[i]->jited) {
> > > >                   err = -ENOTSUPP;
> > > >                   goto out_free;
> > >
> > > This change makes sense, however, would it be possible to move
> > > bpf_jit_blind_constants() out from jit to verifier.c:do_check,
> > > somewhere after do_misc_fixups?
> > > Looking at the source code, bpf_jit_blind_constants() is the first
> > > thing any bpf_int_jit_compile() does.
> > > Another alternative is to add adjust_subprog_starts() call to this
> > > function. Wdyt?
> >
> > Yes, it makes total sense. Blinding was added to x86 jit initially and then
> > every other jit copy-pasted it.  I was considering to move blinding up some
> > time back (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
> > but then I've decided to avoid this, as this requires to patch every JIT, and I
> > am not sure what is the way to test such a change (any hints?)
>
> We have the following covered by CI:
> - arch/x86/net/bpf_jit_comp.c
> - arch/s390/net/bpf_jit_comp.c
> - arch/arm64/net/bpf_jit_comp.c
>
> People work on these jits actively:
> - arch/riscv/net/bpf_jit_core.c
> - arch/loongarch/net/bpf_jit.c
> - arch/powerpc/net/bpf_jit_comp.c
>
> So, we can probably ask to test the patch-set.
>
> The remaining are:
> - arch/x86/net/bpf_jit_comp32.c
> - arch/parisc/net/bpf_jit_core.c
> - arch/mips/net/bpf_jit_comp.c
> - arch/arm/net/bpf_jit_32.c
> - arch/sparc/net/bpf_jit_comp_64.c
> - arch/arc/net/bpf_jit_core.c
>
> The change to each individual jit is not complicated, just removing
> the transformation call. Idk, I'd just go for it.
> Maybe Alexei has concerns?

No concerns.
I don't remember why JIT calls it instead of the verifier.

Daniel,
do you recall? Any concern?

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15
  2025-09-19 18:25   ` Eduard Zingerman
@ 2025-09-19 18:38     ` Eduard Zingerman
  2025-09-19 19:25       ` Anton Protopopov
  0 siblings, 1 reply; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19 18:38 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Fri, 2025-09-19 at 11:25 -0700, Eduard Zingerman wrote:

[...]

> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -660,24 +660,38 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
> >
> >  #define EMIT_LFENCE()	EMIT3(0x0F, 0xAE, 0xE8)
> >
> > -static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
> > +static void __emit_indirect_jump(u8 **pprog, int reg, bool ereg)
> >  {
> >  	u8 *prog = *pprog;
> >
> > +	if (ereg)
> > +		EMIT1(0x41);
> > +
> > +	EMIT2(0xFF, 0xE0 + reg);
> > +
> > +	*pprog = prog;
> > +}
> > +
> > +static void emit_indirect_jump(u8 **pprog, int bpf_reg, u8 *ip)
> > +{

[...]

> >  	} else {
> > -		EMIT2(0xFF, 0xE0 + reg);	/* jmp *%\reg */
> > +		__emit_indirect_jump(pprog, reg, ereg);
>
>                 You need to re-read *pprog after __emit_indirect_jump() call
>                 this is what causes KASAN error I reported in the sibling thread.
>                 W/o re-reading it the FF E1 emitted above is overwritten to CC E1
>                 by EMIT1(0xCC) below.
>
> >  		if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) || IS_ENABLED(CONFIG_MITIGATION_SLS))
> >  			EMIT1(0xCC);		/* int3 */
> >  	}

Or just move the EMIT1(0xCC) inside __emit_indirect_jump().
It is probably necessary to apply the mitigations correctly anyway, wdyt?


* Re: [PATCH v3 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15
  2025-09-19 18:38     ` Eduard Zingerman
@ 2025-09-19 19:25       ` Anton Protopopov
  0 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-19 19:25 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Quentin Monnet, Yonghong Song

On 25/09/19 11:38AM, Eduard Zingerman wrote:
> On Fri, 2025-09-19 at 11:25 -0700, Eduard Zingerman wrote:
> 
> [...]
> 
> > > --- a/arch/x86/net/bpf_jit_comp.c
> > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > @@ -660,24 +660,38 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
> > >
> > >  #define EMIT_LFENCE()	EMIT3(0x0F, 0xAE, 0xE8)
> > >
> > > -static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
> > > +static void __emit_indirect_jump(u8 **pprog, int reg, bool ereg)
> > >  {
> > >  	u8 *prog = *pprog;
> > >
> > > +	if (ereg)
> > > +		EMIT1(0x41);
> > > +
> > > +	EMIT2(0xFF, 0xE0 + reg);
> > > +
> > > +	*pprog = prog;
> > > +}
> > > +
> > > +static void emit_indirect_jump(u8 **pprog, int bpf_reg, u8 *ip)
> > > +{
> 
> [...]
> 
> > >  	} else {
> > > -		EMIT2(0xFF, 0xE0 + reg);	/* jmp *%\reg */
> > > +		__emit_indirect_jump(pprog, reg, ereg);
> >
> >                 You need to re-read *pprog after __emit_indirect_jump() call
> >                 this is what causes KASAN error I reported in the sibling thread.
> >                 W/o re-reading it the FF E1 emitted above is overwritten to CC E1
> >                 by EMIT1(0xCC) below.
> >
> > >  		if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) || IS_ENABLED(CONFIG_MITIGATION_SLS))
> > >  			EMIT1(0xCC);		/* int3 */
> > >  	}
> 
> Or just move the EMIT1(0xCC) inside __emit_indirect_jump().
> It is probably necessary to correctly do mitigations anyway, wdyt?

Thanks a lot for bisecting! I think it is better to re-read,
as in the other branch we do not emit the 0xCC byte.
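For illustration, the bug and the re-read fix can be sketched as a
self-contained userspace program (the EMIT macros and the byte buffer below
are simplified stand-ins for the kernel's JIT machinery, not the actual
bpf_jit_comp.c code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's EMIT macros: each emit helper
 * keeps a local cursor and publishes it back through *pprog. */
#define EMIT1(p, b)      (*(p)++ = (uint8_t)(b))
#define EMIT2(p, b1, b2) do { EMIT1(p, b1); EMIT1(p, b2); } while (0)

static void __emit_indirect_jump(uint8_t **pprog, int reg, bool ereg)
{
	uint8_t *prog = *pprog;

	if (ereg)
		EMIT1(prog, 0x41);		/* REX.B prefix for r8..r15 */
	EMIT2(prog, 0xFF, 0xE0 + reg);		/* jmp *%reg */

	*pprog = prog;				/* publish advanced cursor */
}

static void emit_indirect_jump(uint8_t **pprog, int reg, bool ereg)
{
	uint8_t *prog = *pprog;

	__emit_indirect_jump(pprog, reg, ereg);
	prog = *pprog;	/* re-read: the helper moved the cursor; without
			 * this line the int3 below lands on top of the
			 * already-emitted jmp opcode */
	EMIT1(prog, 0xCC);			/* int3 (SLS mitigation) */

	*pprog = prog;
}
```

With the `prog = *pprog` re-read the buffer ends up as `FF E1 CC` (or
`41 FF E1 CC` for extended registers); without it, the first byte of the
jump is clobbered by 0xCC, matching the KASAN-visible corruption reported
in the sibling thread.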


* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-19 18:26         ` Alexei Starovoitov
@ 2025-09-19 19:28           ` Daniel Borkmann
  2025-09-19 19:44             ` Eduard Zingerman
  0 siblings, 1 reply; 46+ messages in thread
From: Daniel Borkmann @ 2025-09-19 19:28 UTC (permalink / raw)
  To: Alexei Starovoitov, Eduard Zingerman
  Cc: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Quentin Monnet, Yonghong Song

On 9/19/25 8:26 PM, Alexei Starovoitov wrote:
> On Fri, Sep 19, 2025 at 12:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>> On Fri, 2025-09-19 at 07:05 +0000, Anton Protopopov wrote:
>>> On 25/09/18 11:35PM, Eduard Zingerman wrote:
>>>> On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
>>>>
>>>> [...]
>>>>
>>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>>>> index a7ad4fe756da..5c1e4e37d1f8 100644
>>>>> --- a/kernel/bpf/verifier.c
>>>>> +++ b/kernel/bpf/verifier.c
>>>>> @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>>>    struct bpf_insn *insn;
>>>>>    void *old_bpf_func;
>>>>>    int err, num_exentries;
>>>>> + int old_len, subprog_start_adjustment = 0;
>>>>>
>>>>>    if (env->subprog_cnt <= 1)
>>>>>            return 0;
>>>>> @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>>>            func[i]->aux->func_idx = i;
>>>>>            /* Below members will be freed only at prog->aux */
>>>>>            func[i]->aux->btf = prog->aux->btf;
>>>>> -         func[i]->aux->subprog_start = subprog_start;
>>>>> +         func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
>>>>>            func[i]->aux->func_info = prog->aux->func_info;
>>>>>            func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
>>>>>            func[i]->aux->poke_tab = prog->aux->poke_tab;
>>>>> @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>>>            func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
>>>>>            if (!i)
>>>>>                    func[i]->aux->exception_boundary = env->seen_exception;
>>>>> +
>>>>> +         /*
>>>>> +          * To properly pass the absolute subprog start to jit
>>>>> +          * all instruction adjustments should be accumulated
>>>>> +          */
>>>>> +         old_len = func[i]->len;
>>>>>            func[i] = bpf_int_jit_compile(func[i]);
>>>>> +         subprog_start_adjustment += func[i]->len - old_len;
>>>>> +
>>>>>            if (!func[i]->jited) {
>>>>>                    err = -ENOTSUPP;
>>>>>                    goto out_free;
>>>>
>>>> This change makes sense, however, would it be possible to move
>>>> bpf_jit_blind_constants() out from jit to verifier.c:do_check,
>>>> somewhere after do_misc_fixups?
>>>> Looking at the source code, bpf_jit_blind_constants() is the first
>>>> thing any bpf_int_jit_compile() does.
>>>> Another alternative is to add adjust_subprog_starts() call to this
>>>> function. Wdyt?
>>>
>>> Yes, it makes total sense. Blinding was added to x86 jit initially and then
>>> every other jit copy-pasted it.  I was considering to move blinding up some
>>> time back (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
>>> but then I've decided to avoid this, as this requires to patch every JIT, and I
>>> am not sure what is the way to test such a change (any hints?)
>>
>> We have the following covered by CI:
>> - arch/x86/net/bpf_jit_comp.c
>> - arch/s390/net/bpf_jit_comp.c
>> - arch/arm64/net/bpf_jit_comp.c
>>
>> People work on these jits actively:
>> - arch/riscv/net/bpf_jit_core.c
>> - arch/loongarch/net/bpf_jit.c
>> - arch/powerpc/net/bpf_jit_comp.c
>>
>> So, we can probably ask to test the patch-set.
>>
>> The remaining are:
>> - arch/x86/net/bpf_jit_comp32.c
>> - arch/parisc/net/bpf_jit_core.c
>> - arch/mips/net/bpf_jit_comp.c
>> - arch/arm/net/bpf_jit_32.c
>> - arch/sparc/net/bpf_jit_comp_64.c
>> - arch/arc/net/bpf_jit_core.c
>>
>> The change to each individual jit is not complicated, just removing
>> the transformation call. Idk, I'd just go for it.
>> Maybe Alexei has concerns?
> 
> No concerns.
> I don't remember why JIT calls it instead of the verifier.
> 
> Daniel,
> do you recall? Any concern?

Hm, I think we did this in the JIT back then for a couple of reasons, iirc:
the constant blinding needs to work from native bpf(2) as well as from
cbpf->ebpf (seccomp-bpf, socket filters, etc), so the JIT was a natural
location to capture them all, and to fall back to the interpreter with the
non-blinded BPF insns when something goes wrong during the blinding or JIT
process (e.g. the JIT hits some internal limits etc). Moving
bpf_jit_blind_constants() out from the JIT to verifier.c:do_check() means
constant blinding of cbpf->ebpf is not covered anymore (and in this case
it's reachable from unpriv).

Cheers,
Daniel


* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-19 19:28           ` Daniel Borkmann
@ 2025-09-19 19:44             ` Eduard Zingerman
  2025-09-19 20:27               ` Anton Protopopov
  2025-09-19 21:41               ` Daniel Borkmann
  0 siblings, 2 replies; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19 19:44 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Quentin Monnet, Yonghong Song

On Fri, 2025-09-19 at 21:28 +0200, Daniel Borkmann wrote:
> On 9/19/25 8:26 PM, Alexei Starovoitov wrote:
> > On Fri, Sep 19, 2025 at 12:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > On Fri, 2025-09-19 at 07:05 +0000, Anton Protopopov wrote:
> > > > On 25/09/18 11:35PM, Eduard Zingerman wrote:
> > > > > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > index a7ad4fe756da..5c1e4e37d1f8 100644
> > > > > > --- a/kernel/bpf/verifier.c
> > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > >    struct bpf_insn *insn;
> > > > > >    void *old_bpf_func;
> > > > > >    int err, num_exentries;
> > > > > > + int old_len, subprog_start_adjustment = 0;
> > > > > > 
> > > > > >    if (env->subprog_cnt <= 1)
> > > > > >            return 0;
> > > > > > @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > >            func[i]->aux->func_idx = i;
> > > > > >            /* Below members will be freed only at prog->aux */
> > > > > >            func[i]->aux->btf = prog->aux->btf;
> > > > > > -         func[i]->aux->subprog_start = subprog_start;
> > > > > > +         func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
> > > > > >            func[i]->aux->func_info = prog->aux->func_info;
> > > > > >            func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
> > > > > >            func[i]->aux->poke_tab = prog->aux->poke_tab;
> > > > > > @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > >            func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
> > > > > >            if (!i)
> > > > > >                    func[i]->aux->exception_boundary = env->seen_exception;
> > > > > > +
> > > > > > +         /*
> > > > > > +          * To properly pass the absolute subprog start to jit
> > > > > > +          * all instruction adjustments should be accumulated
> > > > > > +          */
> > > > > > +         old_len = func[i]->len;
> > > > > >            func[i] = bpf_int_jit_compile(func[i]);
> > > > > > +         subprog_start_adjustment += func[i]->len - old_len;
> > > > > > +
> > > > > >            if (!func[i]->jited) {
> > > > > >                    err = -ENOTSUPP;
> > > > > >                    goto out_free;
> > > > > 
> > > > > This change makes sense, however, would it be possible to move
> > > > > bpf_jit_blind_constants() out from jit to verifier.c:do_check,
> > > > > somewhere after do_misc_fixups?
> > > > > Looking at the source code, bpf_jit_blind_constants() is the first
> > > > > thing any bpf_int_jit_compile() does.
> > > > > Another alternative is to add adjust_subprog_starts() call to this
> > > > > function. Wdyt?
> > > > 
> > > > Yes, it makes total sense. Blinding was added to x86 jit initially and then
> > > > every other jit copy-pasted it.  I was considering to move blinding up some
> > > > time back (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
> > > > but then I've decided to avoid this, as this requires to patch every JIT, and I
> > > > am not sure what is the way to test such a change (any hints?)
> > > 
> > > We have the following covered by CI:
> > > - arch/x86/net/bpf_jit_comp.c
> > > - arch/s390/net/bpf_jit_comp.c
> > > - arch/arm64/net/bpf_jit_comp.c
> > > 
> > > People work on these jits actively:
> > > - arch/riscv/net/bpf_jit_core.c
> > > - arch/loongarch/net/bpf_jit.c
> > > - arch/powerpc/net/bpf_jit_comp.c
> > > 
> > > So, we can probably ask to test the patch-set.
> > > 
> > > The remaining are:
> > > - arch/x86/net/bpf_jit_comp32.c
> > > - arch/parisc/net/bpf_jit_core.c
> > > - arch/mips/net/bpf_jit_comp.c
> > > - arch/arm/net/bpf_jit_32.c
> > > - arch/sparc/net/bpf_jit_comp_64.c
> > > - arch/arc/net/bpf_jit_core.c
> > > 
> > > The change to each individual jit is not complicated, just removing
> > > the transformation call. Idk, I'd just go for it.
> > > Maybe Alexei has concerns?
> > 
> > No concerns.
> > I don't remember why JIT calls it instead of the verifier.
> > 
> > Daniel,
> > do you recall? Any concern?
> 
> Hm, I think we did this in the JIT back then for couple of reasons iirc,
> the constant blinding needs to work from native bpf(2) as well as from
> cbpf->ebpf (seccomp-bpf, filters, etc), so the JIT was a natural location
> to capture them all, and to fallback to interpreter with the non-blinded
> BPF-insns when something went wrong during blinding or JIT process (e.g.
> JIT hits some internal limits etc). Moving bpf_jit_blind_constants() out
> from JIT to verifier.c:do_check() means constant blinding of cbpf->ebpf
> are not covered anymore (and in this case its reachable from unpriv).

Hi Daniel,

Thank you for the context.
So, the ideal location for bpf_jit_blind_constants() would be in
core.c in some wrapper function for bpf_int_jit_compile():

  static struct bpf_prog *jit_compile(struct bpf_prog *prog)
  {
  	struct bpf_prog *tmp = bpf_jit_blind_constants(prog);

  	if (IS_ERR(tmp))
  		return prog;
  	return bpf_int_jit_compile(tmp);
  }

A bit of a hassle.

Anton, wdyt about a second option: adding adjust_subprog_starts()
to bpf_jit_blind_constants() and leaving all the rest as-is?
It would have to happen either way if the call to bpf_jit_blind_constants()
itself is moved.


* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-19 19:44             ` Eduard Zingerman
@ 2025-09-19 20:27               ` Anton Protopopov
  2025-09-19 20:47                 ` Eduard Zingerman
  2025-09-19 21:41               ` Daniel Borkmann
  1 sibling, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-19 20:27 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Daniel Borkmann, Alexei Starovoitov, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Anton Protopopov, Quentin Monnet, Yonghong Song

On 25/09/19 12:44PM, Eduard Zingerman wrote:
> On Fri, 2025-09-19 at 21:28 +0200, Daniel Borkmann wrote:
> > On 9/19/25 8:26 PM, Alexei Starovoitov wrote:
> > > On Fri, Sep 19, 2025 at 12:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > On Fri, 2025-09-19 at 07:05 +0000, Anton Protopopov wrote:
> > > > > On 25/09/18 11:35PM, Eduard Zingerman wrote:
> > > > > > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > > > > > 
> > > > > > [...]
> > > > > > 
> > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > > index a7ad4fe756da..5c1e4e37d1f8 100644
> > > > > > > --- a/kernel/bpf/verifier.c
> > > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > > @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > >    struct bpf_insn *insn;
> > > > > > >    void *old_bpf_func;
> > > > > > >    int err, num_exentries;
> > > > > > > + int old_len, subprog_start_adjustment = 0;
> > > > > > > 
> > > > > > >    if (env->subprog_cnt <= 1)
> > > > > > >            return 0;
> > > > > > > @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > >            func[i]->aux->func_idx = i;
> > > > > > >            /* Below members will be freed only at prog->aux */
> > > > > > >            func[i]->aux->btf = prog->aux->btf;
> > > > > > > -         func[i]->aux->subprog_start = subprog_start;
> > > > > > > +         func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
> > > > > > >            func[i]->aux->func_info = prog->aux->func_info;
> > > > > > >            func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
> > > > > > >            func[i]->aux->poke_tab = prog->aux->poke_tab;
> > > > > > > @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > >            func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
> > > > > > >            if (!i)
> > > > > > >                    func[i]->aux->exception_boundary = env->seen_exception;
> > > > > > > +
> > > > > > > +         /*
> > > > > > > +          * To properly pass the absolute subprog start to jit
> > > > > > > +          * all instruction adjustments should be accumulated
> > > > > > > +          */
> > > > > > > +         old_len = func[i]->len;
> > > > > > >            func[i] = bpf_int_jit_compile(func[i]);
> > > > > > > +         subprog_start_adjustment += func[i]->len - old_len;
> > > > > > > +
> > > > > > >            if (!func[i]->jited) {
> > > > > > >                    err = -ENOTSUPP;
> > > > > > >                    goto out_free;
> > > > > > 
> > > > > > This change makes sense, however, would it be possible to move
> > > > > > bpf_jit_blind_constants() out from jit to verifier.c:do_check,
> > > > > > somewhere after do_misc_fixups?
> > > > > > Looking at the source code, bpf_jit_blind_constants() is the first
> > > > > > thing any bpf_int_jit_compile() does.
> > > > > > Another alternative is to add adjust_subprog_starts() call to this
> > > > > > function. Wdyt?
> > > > > 
> > > > > Yes, it makes total sense. Blinding was added to x86 jit initially and then
> > > > > every other jit copy-pasted it.  I was considering to move blinding up some
> > > > > time back (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
> > > > > but then I've decided to avoid this, as this requires to patch every JIT, and I
> > > > > am not sure what is the way to test such a change (any hints?)
> > > > 
> > > > We have the following covered by CI:
> > > > - arch/x86/net/bpf_jit_comp.c
> > > > - arch/s390/net/bpf_jit_comp.c
> > > > - arch/arm64/net/bpf_jit_comp.c
> > > > 
> > > > People work on these jits actively:
> > > > - arch/riscv/net/bpf_jit_core.c
> > > > - arch/loongarch/net/bpf_jit.c
> > > > - arch/powerpc/net/bpf_jit_comp.c
> > > > 
> > > > So, we can probably ask to test the patch-set.
> > > > 
> > > > The remaining are:
> > > > - arch/x86/net/bpf_jit_comp32.c
> > > > - arch/parisc/net/bpf_jit_core.c
> > > > - arch/mips/net/bpf_jit_comp.c
> > > > - arch/arm/net/bpf_jit_32.c
> > > > - arch/sparc/net/bpf_jit_comp_64.c
> > > > - arch/arc/net/bpf_jit_core.c
> > > > 
> > > > The change to each individual jit is not complicated, just removing
> > > > the transformation call. Idk, I'd just go for it.
> > > > Maybe Alexei has concerns?
> > > 
> > > No concerns.
> > > I don't remember why JIT calls it instead of the verifier.
> > > 
> > > Daniel,
> > > do you recall? Any concern?
> > 
> > Hm, I think we did this in the JIT back then for couple of reasons iirc,
> > the constant blinding needs to work from native bpf(2) as well as from
> > cbpf->ebpf (seccomp-bpf, filters, etc), so the JIT was a natural location
> > to capture them all, and to fallback to interpreter with the non-blinded
> > BPF-insns when something went wrong during blinding or JIT process (e.g.
> > JIT hits some internal limits etc). Moving bpf_jit_blind_constants() out
> > from JIT to verifier.c:do_check() means constant blinding of cbpf->ebpf
> > are not covered anymore (and in this case its reachable from unpriv).
> 
> Hi Daniel,
> 
> Thank you for the context.
> So, the ideal location for bpf_jit_blind_constants() would be in
> core.c in some wrapper function for bpf_int_jit_compile():
> 
>   static struct bpf_prog *jit_compile(struct bpf_prog *prog)
>   {
>   	struct bpf_prog *tmp = bpf_jit_blind_constants(prog);
> 
>   	if (IS_ERR(tmp))
>   		return prog;
>   	return bpf_int_jit_compile(tmp);
>   }
> 
> A bit of a hassle.
> 
> Anton, wdyt about a second option: adding adjust_subprog_starts()
> to bpf_jit_blind_constants() and leaving all the rest as-is?
> It would have to happen either way if the call to bpf_jit_blind_constants()
> itself is moved.

So, to be clear: in this case adjust_insn_arrays() stays as in the
original patch, but the "subprog_start_adjustment" chunks are
replaced by calls to adjust_subprog_starts() (for better
readability and consistency, right?)


* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-19 20:27               ` Anton Protopopov
@ 2025-09-19 20:47                 ` Eduard Zingerman
  2025-09-22  9:28                   ` Anton Protopopov
  0 siblings, 1 reply; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-19 20:47 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: Daniel Borkmann, Alexei Starovoitov, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Anton Protopopov, Quentin Monnet, Yonghong Song

On Fri, 2025-09-19 at 20:27 +0000, Anton Protopopov wrote:
> On 25/09/19 12:44PM, Eduard Zingerman wrote:
> > On Fri, 2025-09-19 at 21:28 +0200, Daniel Borkmann wrote:
> > > On 9/19/25 8:26 PM, Alexei Starovoitov wrote:
> > > > On Fri, Sep 19, 2025 at 12:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > On Fri, 2025-09-19 at 07:05 +0000, Anton Protopopov wrote:
> > > > > > On 25/09/18 11:35PM, Eduard Zingerman wrote:
> > > > > > > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > > > > > > 
> > > > > > > [...]
> > > > > > > 
> > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > > > index a7ad4fe756da..5c1e4e37d1f8 100644
> > > > > > > > --- a/kernel/bpf/verifier.c
> > > > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > > > @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > > >    struct bpf_insn *insn;
> > > > > > > >    void *old_bpf_func;
> > > > > > > >    int err, num_exentries;
> > > > > > > > + int old_len, subprog_start_adjustment = 0;
> > > > > > > > 
> > > > > > > >    if (env->subprog_cnt <= 1)
> > > > > > > >            return 0;
> > > > > > > > @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > > >            func[i]->aux->func_idx = i;
> > > > > > > >            /* Below members will be freed only at prog->aux */
> > > > > > > >            func[i]->aux->btf = prog->aux->btf;
> > > > > > > > -         func[i]->aux->subprog_start = subprog_start;
> > > > > > > > +         func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
> > > > > > > >            func[i]->aux->func_info = prog->aux->func_info;
> > > > > > > >            func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
> > > > > > > >            func[i]->aux->poke_tab = prog->aux->poke_tab;
> > > > > > > > @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > > >            func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
> > > > > > > >            if (!i)
> > > > > > > >                    func[i]->aux->exception_boundary = env->seen_exception;
> > > > > > > > +
> > > > > > > > +         /*
> > > > > > > > +          * To properly pass the absolute subprog start to jit
> > > > > > > > +          * all instruction adjustments should be accumulated
> > > > > > > > +          */
> > > > > > > > +         old_len = func[i]->len;
> > > > > > > >            func[i] = bpf_int_jit_compile(func[i]);
> > > > > > > > +         subprog_start_adjustment += func[i]->len - old_len;
> > > > > > > > +
> > > > > > > >            if (!func[i]->jited) {
> > > > > > > >                    err = -ENOTSUPP;
> > > > > > > >                    goto out_free;
> > > > > > > 
> > > > > > > This change makes sense, however, would it be possible to move
> > > > > > > bpf_jit_blind_constants() out from jit to verifier.c:do_check,
> > > > > > > somewhere after do_misc_fixups?
> > > > > > > Looking at the source code, bpf_jit_blind_constants() is the first
> > > > > > > thing any bpf_int_jit_compile() does.
> > > > > > > Another alternative is to add adjust_subprog_starts() call to this
> > > > > > > function. Wdyt?
> > > > > > 
> > > > > > Yes, it makes total sense. Blinding was added to x86 jit initially and then
> > > > > > every other jit copy-pasted it.  I was considering to move blinding up some
> > > > > > time back (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
> > > > > > but then I've decided to avoid this, as this requires to patch every JIT, and I
> > > > > > am not sure what is the way to test such a change (any hints?)
> > > > > 
> > > > > We have the following covered by CI:
> > > > > - arch/x86/net/bpf_jit_comp.c
> > > > > - arch/s390/net/bpf_jit_comp.c
> > > > > - arch/arm64/net/bpf_jit_comp.c
> > > > > 
> > > > > People work on these jits actively:
> > > > > - arch/riscv/net/bpf_jit_core.c
> > > > > - arch/loongarch/net/bpf_jit.c
> > > > > - arch/powerpc/net/bpf_jit_comp.c
> > > > > 
> > > > > So, we can probably ask to test the patch-set.
> > > > > 
> > > > > The remaining are:
> > > > > - arch/x86/net/bpf_jit_comp32.c
> > > > > - arch/parisc/net/bpf_jit_core.c
> > > > > - arch/mips/net/bpf_jit_comp.c
> > > > > - arch/arm/net/bpf_jit_32.c
> > > > > - arch/sparc/net/bpf_jit_comp_64.c
> > > > > - arch/arc/net/bpf_jit_core.c
> > > > > 
> > > > > The change to each individual jit is not complicated, just removing
> > > > > the transformation call. Idk, I'd just go for it.
> > > > > Maybe Alexei has concerns?
> > > > 
> > > > No concerns.
> > > > I don't remember why JIT calls it instead of the verifier.
> > > > 
> > > > Daniel,
> > > > do you recall? Any concern?
> > > 
> > > Hm, I think we did this in the JIT back then for couple of reasons iirc,
> > > the constant blinding needs to work from native bpf(2) as well as from
> > > cbpf->ebpf (seccomp-bpf, filters, etc), so the JIT was a natural location
> > > to capture them all, and to fallback to interpreter with the non-blinded
> > > BPF-insns when something went wrong during blinding or JIT process (e.g.
> > > JIT hits some internal limits etc). Moving bpf_jit_blind_constants() out
> > > from JIT to verifier.c:do_check() means constant blinding of cbpf->ebpf
> > > are not covered anymore (and in this case its reachable from unpriv).
> > 
> > Hi Daniel,
> > 
> > Thank you for the context.
> > So, the ideal location for bpf_jit_blind_constants() would be in
> > core.c in some wrapper function for bpf_int_jit_compile():
> > 
> >   static struct bpf_prog *jit_compile(struct bpf_prog *prog)
> >   {
> >   	struct bpf_prog *tmp = bpf_jit_blind_constants(prog);
> > 
> >   	if (IS_ERR(tmp))
> >   		return prog;
> >   	return bpf_int_jit_compile(tmp);
> >   }
> > 
> > A bit of a hassle.
> > 
> > Anton, wdyt about a second option: adding adjust_subprog_starts()
> > to bpf_jit_blind_constants() and leaving all the rest as-is?
> > It would have to happen either way if the call to bpf_jit_blind_constants()
> > itself is moved.
> 
> So, to be clear, in this case adjust_insn_arrays() stays as in the
> original patch, but the "subprog_start_adjustment" chunks are
> replaced by calling the adjust_subprog_starts() (for better
> readability and consistency, right?)

Yes, by adding an adjust_subprog_starts() call inside
bpf_jit_blind_constants() it should be possible to read
env->subprog_info[*].start in the jit_subprogs() loop directly,
w/o tracking the subprog_start_adjustment delta.
(At least I think this should work.)
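For reference, the delta accumulation from the patch hunk quoted above can
be sketched as a self-contained C program; fake_jit() and record_starts()
are hypothetical stand-ins for bpf_int_jit_compile() and the jit_subprogs()
loop, not kernel code:

```c
#include <assert.h>

struct subprog {
	int start;	/* insn offset before blinding */
	int len;	/* insn count */
};

/* Hypothetical stand-in for bpf_int_jit_compile(): constant blinding
 * may grow a subprog by some number of instructions. */
static void fake_jit(struct subprog *sp, int grow)
{
	sp->len += grow;
}

/* Record each subprog's *absolute* start in the post-blinding insn
 * stream by accumulating the length deltas of all earlier subprogs. */
static void record_starts(struct subprog *sp, int n, int grow, int *abs_start)
{
	int adjustment = 0;

	for (int i = 0; i < n; i++) {
		int old_len = sp[i].len;

		abs_start[i] = sp[i].start + adjustment;
		fake_jit(&sp[i], grow);
		adjustment += sp[i].len - old_len;
	}
}
```

Moving the adjustment into bpf_jit_blind_constants() would mean the deltas
are already folded into env->subprog_info[*].start before the loop runs, so
the explicit `adjustment` bookkeeping above disappears from the caller.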


* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-19 19:44             ` Eduard Zingerman
  2025-09-19 20:27               ` Anton Protopopov
@ 2025-09-19 21:41               ` Daniel Borkmann
  1 sibling, 0 replies; 46+ messages in thread
From: Daniel Borkmann @ 2025-09-19 21:41 UTC (permalink / raw)
  To: Eduard Zingerman, Alexei Starovoitov
  Cc: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Quentin Monnet, Yonghong Song

On 9/19/25 9:44 PM, Eduard Zingerman wrote:
> On Fri, 2025-09-19 at 21:28 +0200, Daniel Borkmann wrote:
>> On 9/19/25 8:26 PM, Alexei Starovoitov wrote:
>>> On Fri, Sep 19, 2025 at 12:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>>>> On Fri, 2025-09-19 at 07:05 +0000, Anton Protopopov wrote:
>>>>> On 25/09/18 11:35PM, Eduard Zingerman wrote:
>>>>>> On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>>>>>> index a7ad4fe756da..5c1e4e37d1f8 100644
>>>>>>> --- a/kernel/bpf/verifier.c
>>>>>>> +++ b/kernel/bpf/verifier.c
>>>>>>> @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>>>>>     struct bpf_insn *insn;
>>>>>>>     void *old_bpf_func;
>>>>>>>     int err, num_exentries;
>>>>>>> + int old_len, subprog_start_adjustment = 0;
>>>>>>>
>>>>>>>     if (env->subprog_cnt <= 1)
>>>>>>>             return 0;
>>>>>>> @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>>>>>             func[i]->aux->func_idx = i;
>>>>>>>             /* Below members will be freed only at prog->aux */
>>>>>>>             func[i]->aux->btf = prog->aux->btf;
>>>>>>> -         func[i]->aux->subprog_start = subprog_start;
>>>>>>> +         func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
>>>>>>>             func[i]->aux->func_info = prog->aux->func_info;
>>>>>>>             func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
>>>>>>>             func[i]->aux->poke_tab = prog->aux->poke_tab;
>>>>>>> @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>>>>>             func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
>>>>>>>             if (!i)
>>>>>>>                     func[i]->aux->exception_boundary = env->seen_exception;
>>>>>>> +
>>>>>>> +         /*
>>>>>>> +          * To properly pass the absolute subprog start to jit
>>>>>>> +          * all instruction adjustments should be accumulated
>>>>>>> +          */
>>>>>>> +         old_len = func[i]->len;
>>>>>>>             func[i] = bpf_int_jit_compile(func[i]);
>>>>>>> +         subprog_start_adjustment += func[i]->len - old_len;
>>>>>>> +
>>>>>>>             if (!func[i]->jited) {
>>>>>>>                     err = -ENOTSUPP;
>>>>>>>                     goto out_free;
>>>>>>
>>>>>> This change makes sense, however, would it be possible to move
>>>>>> bpf_jit_blind_constants() out from jit to verifier.c:do_check,
>>>>>> somewhere after do_misc_fixups?
>>>>>> Looking at the source code, bpf_jit_blind_constants() is the first
>>>>>> thing any bpf_int_jit_compile() does.
>>>>>> Another alternative is to add adjust_subprog_starts() call to this
>>>>>> function. Wdyt?
>>>>>
>>>>> Yes, it makes total sense. Blinding was added to the x86 JIT initially and
>>>>> then every other JIT copy-pasted it. I considered moving blinding up some
>>>>> time ago (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
>>>>> but decided against it, as it requires patching every JIT, and I am not
>>>>> sure how to test such a change (any hints?)
>>>>
>>>> We have the following covered by CI:
>>>> - arch/x86/net/bpf_jit_comp.c
>>>> - arch/s390/net/bpf_jit_comp.c
>>>> - arch/arm64/net/bpf_jit_comp.c
>>>>
>>>> People work on these jits actively:
>>>> - arch/riscv/net/bpf_jit_core.c
>>>> - arch/loongarch/net/bpf_jit.c
>>>> - arch/powerpc/net/bpf_jit_comp.c
>>>>
>>>> So, we can probably ask to test the patch-set.
>>>>
>>>> The remaining are:
>>>> - arch/x86/net/bpf_jit_comp32.c
>>>> - arch/parisc/net/bpf_jit_core.c
>>>> - arch/mips/net/bpf_jit_comp.c
>>>> - arch/arm/net/bpf_jit_32.c
>>>> - arch/sparc/net/bpf_jit_comp_64.c
>>>> - arch/arc/net/bpf_jit_core.c
>>>>
>>>> The change to each individual jit is not complicated, just removing
>>>> the transformation call. Idk, I'd just go for it.
>>>> Maybe Alexei has concerns?
>>>
>>> No concerns.
>>> I don't remember why JIT calls it instead of the verifier.
>>>
>>> Daniel,
>>> do you recall? Any concern?
>>
>> Hm, I think we did this in the JIT back then for couple of reasons iirc,
>> the constant blinding needs to work from native bpf(2) as well as from
>> cbpf->ebpf (seccomp-bpf, filters, etc), so the JIT was a natural location
>> to capture them all, and to fallback to interpreter with the non-blinded
>> BPF-insns when something went wrong during blinding or JIT process (e.g.
>> JIT hits some internal limits etc). Moving bpf_jit_blind_constants() out
>> from JIT to verifier.c:do_check() means constant blinding of cbpf->ebpf
>> are not covered anymore (and in this case its reachable from unpriv).
> 
> Thank you for the context.
> So, the ideal location for bpf_jit_blind_constants() would be in
> core.c in some wrapper function for bpf_int_jit_compile():
> 
>    static struct bpf_prog *jit_compile(struct bpf_prog *prog)
>    {
>    	struct bpf_prog *tmp;
> 
>    	tmp = bpf_jit_blind_constants(prog);
>    	if (IS_ERR(tmp))
>    		return prog;
>    	return bpf_int_jit_compile(tmp);
>    }
> 
> A bit of a hassle.

Yes, a hassle, and technically when bpf_int_jit_compile() fails and the
interpreter is compiled in, the latter should only get the non-blinded insns,
so the above would not be sufficient as-is.
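To make the fallback concrete, here is a standalone model of the control flow
described above (illustrative stand-ins only, not the kernel code): blinding
happens inside the JIT, and on any failure the caller keeps the original,
non-blinded program so the interpreter never sees blinded insns.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct bpf_prog; all names here are illustrative. */
struct prog { bool blinded; bool jited; };

static bool blind_should_fail;	/* simulates clone allocation failure */
static bool jit_should_fail;	/* simulates the JIT hitting internal limits */

/* Models bpf_jit_blind_constants(): returns a blinded clone, or NULL
 * on failure (the kernel returns ERR_PTR(-ENOMEM) there). */
static struct prog *blind_constants(struct prog *orig)
{
	static struct prog clone;

	if (blind_should_fail)
		return NULL;
	clone = *orig;
	clone.blinded = true;
	return &clone;
}

/* Models bpf_int_jit_compile(): blind first, JIT the clone, and on any
 * failure hand back the original so only non-blinded instructions ever
 * reach the interpreter fallback. */
static struct prog *int_jit_compile(struct prog *orig)
{
	struct prog *p = blind_constants(orig);

	if (!p)
		return orig;
	if (jit_should_fail)
		return orig;
	p->jited = true;
	return p;
}
```

Moving blinding into verifier.c:do_check() would bypass this path for
cBPF->eBPF programs (seccomp, socket filters), which never go through the
verifier, which is the point above.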

> Anton, wdyt about a second option: adding adjust_subprog_starts()
> to bpf_jit_blind_constants() and leaving all the rest as-is?
> It would have to happen either way, whether or not the call to
> bpf_jit_blind_constants() itself is moved.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 11/13] libbpf: support llvm-generated indirect jumps
  2025-09-18  9:38 ` [PATCH v3 bpf-next 11/13] libbpf: support llvm-generated indirect jumps Anton Protopopov
@ 2025-09-19 23:18   ` Andrii Nakryiko
  2025-09-22 10:13     ` Anton Protopopov
  0 siblings, 1 reply; 46+ messages in thread
From: Andrii Nakryiko @ 2025-09-19 23:18 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On Thu, Sep 18, 2025 at 2:32 AM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
>
> For the v5 instruction set, LLVM is allowed to generate indirect jumps for
> switch statements and for 'goto *rX' assembly. Every such jump is
> accompanied by the necessary metadata, e.g. (`llvm-objdump -Sr ...`):
>
>        0:       r2 = 0x0 ll
>                 0000000000000030:  R_BPF_64_64  BPF.JT.0.0
>
> Here BPF.JT.0.0 is a symbol residing in the .jumptables section:
>
>     Symbol table:
>        4: 0000000000000000   240 OBJECT  GLOBAL DEFAULT     4 BPF.JT.0.0
>
> The -bpf-min-jump-table-entries llvm option may be used to control the
> minimal size of a switch which will be converted to an indirect jump.
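For illustration, a switch of roughly the following shape (hypothetical
example, not from the patch) is what gets lowered to a BPF.JT.* entry in
.jumptables plus an indirect jump, once the case count reaches the
-bpf-min-jump-table-entries threshold; the C itself is ordinary and compiles
anywhere.

```c
/* Hypothetical example: with the special LLVM and a low enough
 * -bpf-min-jump-table-entries, this switch becomes a load from a
 * BPF.JT.* symbol in .jumptables followed by a gotox. */
int classify(unsigned int x)
{
	switch (x) {
	case 0: return 10;
	case 1: return 11;
	case 2: return 12;
	case 3: return 13;
	case 4: return 14;
	default: return -1;
	}
}
```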
>
> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> ---
>  tools/lib/bpf/libbpf.c        | 150 +++++++++++++++++++++++++++++++++-
>  tools/lib/bpf/libbpf_probes.c |   4 +
>  tools/lib/bpf/linker.c        |  10 ++-
>  3 files changed, 161 insertions(+), 3 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 2c1f48f77680..57cac0810d2e 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -191,6 +191,7 @@ static const char * const map_type_name[] = {
>         [BPF_MAP_TYPE_USER_RINGBUF]             = "user_ringbuf",
>         [BPF_MAP_TYPE_CGRP_STORAGE]             = "cgrp_storage",
>         [BPF_MAP_TYPE_ARENA]                    = "arena",
> +       [BPF_MAP_TYPE_INSN_ARRAY]               = "insn_array",
>  };
>
>  static const char * const prog_type_name[] = {
> @@ -372,6 +373,7 @@ enum reloc_type {
>         RELO_EXTERN_CALL,
>         RELO_SUBPROG_ADDR,
>         RELO_CORE,
> +       RELO_INSN_ARRAY,
>  };
>
>  struct reloc_desc {
> @@ -382,7 +384,10 @@ struct reloc_desc {
>                 struct {
>                         int map_idx;
>                         int sym_off;
> -                       int ext_idx;
> +                       union {
> +                               int ext_idx;
> +                               int sym_size;
> +                       };
>                 };
>         };
>  };
> @@ -424,6 +429,11 @@ struct bpf_sec_def {
>         libbpf_prog_attach_fn_t prog_attach_fn;
>  };
>
> +struct bpf_light_subprog {
> +       __u32 sec_insn_off;
> +       __u32 sub_insn_off;
> +};
> +
>  /*
>   * bpf_prog should be a better name but it has been used in
>   * linux/filter.h.
> @@ -496,6 +506,9 @@ struct bpf_program {
>         __u32 line_info_rec_size;
>         __u32 line_info_cnt;
>         __u32 prog_flags;
> +
> +       struct bpf_light_subprog *subprog;

nit: subprogs (but still subprog_cnt, yep)

> +       __u32 subprog_cnt;
>  };
>
>  struct bpf_struct_ops {
> @@ -525,6 +538,7 @@ struct bpf_struct_ops {
>  #define STRUCT_OPS_SEC ".struct_ops"
>  #define STRUCT_OPS_LINK_SEC ".struct_ops.link"
>  #define ARENA_SEC ".addr_space.1"
> +#define JUMPTABLES_SEC ".jumptables"
>
>  enum libbpf_map_type {
>         LIBBPF_MAP_UNSPEC,
> @@ -668,6 +682,7 @@ struct elf_state {
>         int symbols_shndx;
>         bool has_st_ops;
>         int arena_data_shndx;
> +       int jumptables_data_shndx;
>  };
>
>  struct usdt_manager;
> @@ -739,6 +754,9 @@ struct bpf_object {
>         void *arena_data;
>         size_t arena_data_sz;
>
> +       void *jumptables_data;
> +       size_t jumptables_data_sz;
> +
>         struct kern_feature_cache *feat_cache;
>         char *token_path;
>         int token_fd;
> @@ -765,6 +783,7 @@ void bpf_program__unload(struct bpf_program *prog)
>
>         zfree(&prog->func_info);
>         zfree(&prog->line_info);
> +       zfree(&prog->subprog);
>  }
>
>  static void bpf_program__exit(struct bpf_program *prog)
> @@ -3945,6 +3964,13 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
>                         } else if (strcmp(name, ARENA_SEC) == 0) {
>                                 obj->efile.arena_data = data;
>                                 obj->efile.arena_data_shndx = idx;
> +                       } else if (strcmp(name, JUMPTABLES_SEC) == 0) {
> +                               obj->jumptables_data = malloc(data->d_size);
> +                               if (!obj->jumptables_data)
> +                                       return -ENOMEM;
> +                               memcpy(obj->jumptables_data, data->d_buf, data->d_size);
> +                               obj->jumptables_data_sz = data->d_size;
> +                               obj->efile.jumptables_data_shndx = idx;
>                         } else {
>                                 pr_info("elf: skipping unrecognized data section(%d) %s\n",
>                                         idx, name);
> @@ -4599,6 +4625,16 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
>                 return 0;
>         }
>
> +       /* jump table data relocation */
> +       if (shdr_idx == obj->efile.jumptables_data_shndx) {
> +               reloc_desc->type = RELO_INSN_ARRAY;
> +               reloc_desc->insn_idx = insn_idx;
> +               reloc_desc->map_idx = -1;
> +               reloc_desc->sym_off = sym->st_value;
> +               reloc_desc->sym_size = sym->st_size;
> +               return 0;
> +       }
> +
>         /* generic map reference relocation */
>         if (type == LIBBPF_MAP_UNSPEC) {
>                 if (!bpf_object__shndx_is_maps(obj, shdr_idx)) {
> @@ -6101,6 +6137,74 @@ static void poison_kfunc_call(struct bpf_program *prog, int relo_idx,
>         insn->imm = POISON_CALL_KFUNC_BASE + ext_idx;
>  }
>
> +static int create_jt_map(struct bpf_object *obj, int off, int size, int adjust_off)
> +{
> +       const __u32 value_size = sizeof(struct bpf_insn_array_value);
> +       const __u32 max_entries = size / value_size;
> +       struct bpf_insn_array_value val = {};
> +       int map_fd, err;
> +       __u64 xlated_off;
> +       __u64 *jt;
> +       __u32 i;
> +
> +       map_fd = bpf_map_create(BPF_MAP_TYPE_INSN_ARRAY, "jt",

let's call it ".jumptables" just like special global data maps?

> +                               4, value_size, max_entries, NULL);
> +       if (map_fd < 0)
> +               return map_fd;
> +
> +       if (!obj->jumptables_data) {
> +               pr_warn("object contains no jumptables_data\n");

for map-related errors we follow (pretty consistently) error format:

map '%s': whatever bad happened

let's stick to that here? "map '.jumptables': ELF file is missing jump
table data" or something along those lines?

> +               return -EINVAL;
> +       }
> +       if ((off + size) > obj->jumptables_data_sz) {

nit: unnecessary ()

> +               pr_warn("jumptables_data size is %zd, trying to access %d\n",
> +                       obj->jumptables_data_sz, off + size);
> +               return -EINVAL;
> +       }
> +
> +       jt = (__u64 *)(obj->jumptables_data + off);
> +       for (i = 0; i < max_entries; i++) {
> +               /*
> +                * LLVM-generated jump tables contain u64 records, however
> +                * should contain values that fit in u32.
> +                * The adjust_off provided by the caller adjusts the offset to
> +                * be relative to the beginning of the main function
> +                */
> +               xlated_off = jt[i]/sizeof(struct bpf_insn) + adjust_off;
> +               if (xlated_off > UINT32_MAX) {
> +                       pr_warn("invalid jump table value %llx at offset %d (adjust_off %d)\n",
> +                               jt[i], off + i, adjust_off);

no close(map_fd)? same in a bunch of places above? I'd actually move
map create to right before this loop and simplify error handling

pw-bot: cr

> +                       return -EINVAL;
> +               }
> +
> +               val.xlated_off = xlated_off;
> +               err = bpf_map_update_elem(map_fd, &i, &val, 0);
> +               if (err) {
> +                       close(map_fd);
> +                       return err;
> +               }
> +       }
> +       return map_fd;
> +}
> +
> +/*
> + * In LLVM the .jumptables section contains jump tables entries relative to the
> + * section start. The BPF kernel-side code expects jump table offsets relative
> + * to the beginning of the program (passed in bpf(BPF_PROG_LOAD)). This helper
> + * computes a delta to be added when creating a map.
> + */
> +static int jt_adjust_off(struct bpf_program *prog, int insn_idx)
> +{
> +       int i;
> +
> +       for (i = prog->subprog_cnt - 1; i >= 0; i--)
> +               if (insn_idx >= prog->subprog[i].sub_insn_off)
> +                       return prog->subprog[i].sub_insn_off - prog->subprog[i].sec_insn_off;

nit: please add {} around multi-line for loop body (even if it's a
single statement)

> +
> +       return -prog->sec_insn_off;
> +}
> +
> +
>  /* Relocate data references within program code:
>   *  - map references;
>   *  - global variable references;
> @@ -6192,6 +6296,21 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
>                 case RELO_CORE:
>                         /* will be handled by bpf_program_record_relos() */
>                         break;
> +               case RELO_INSN_ARRAY: {
> +                       int map_fd;
> +
> +                       map_fd = create_jt_map(obj, relo->sym_off, relo->sym_size,
> +                                              jt_adjust_off(prog, relo->insn_idx));

Who's closing all these fds? (I feel like we'd want to have all those
maps in a list of bpf_object's maps, just like .rodata and others)

Also, how many of those will we have? Each individual relocation gets
its own map, right?..


> +                       if (map_fd < 0) {
> +                               pr_warn("prog '%s': relo #%d: can't create jump table: sym_off %u\n",
> +                                               prog->name, i, relo->sym_off);
> +                               return map_fd;
> +                       }
> +                       insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
> +                       insn->imm = map_fd;
> +                       insn->off = 0;
> +               }
> +                       break;
>                 default:
>                         pr_warn("prog '%s': relo #%d: bad relo type %d\n",
>                                 prog->name, i, relo->type);
> @@ -6389,6 +6508,24 @@ static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_progra
>         return 0;
>  }
>
> +static int save_subprog_offsets(struct bpf_program *main_prog, struct bpf_program *subprog)
> +{
> +       size_t size = sizeof(main_prog->subprog[0]);
> +       int new_cnt = main_prog->subprog_cnt + 1;
> +       void *tmp;
> +
> +       tmp = libbpf_reallocarray(main_prog->subprog, new_cnt, size);
> +       if (!tmp)
> +               return -ENOMEM;
> +
> +       main_prog->subprog = tmp;
> +       main_prog->subprog[new_cnt - 1].sec_insn_off = subprog->sec_insn_off;
> +       main_prog->subprog[new_cnt - 1].sub_insn_off = subprog->sub_insn_off;
> +       main_prog->subprog_cnt = new_cnt;
> +
> +       return 0;
> +}
> +
>  static int
>  bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main_prog,
>                                 struct bpf_program *subprog)
> @@ -6418,6 +6555,14 @@ bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main
>         err = append_subprog_relos(main_prog, subprog);
>         if (err)
>                 return err;
> +
> +       /* Save subprogram offsets */
> +       err = save_subprog_offsets(main_prog, subprog);
> +       if (err) {
> +               pr_warn("prog '%s': failed to add subprog offsets\n", main_prog->name);

emit error itself as well, use errstr()

> +               return err;
> +       }
> +
>         return 0;
>  }
>
> @@ -9185,6 +9330,9 @@ void bpf_object__close(struct bpf_object *obj)
>
>         zfree(&obj->arena_data);
>
> +       zfree(&obj->jumptables_data);
> +       obj->jumptables_data_sz = 0;
> +
>         free(obj);
>  }
>
> diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
> index 9dfbe7750f56..bccf4bb747e1 100644
> --- a/tools/lib/bpf/libbpf_probes.c
> +++ b/tools/lib/bpf/libbpf_probes.c
> @@ -364,6 +364,10 @@ static int probe_map_create(enum bpf_map_type map_type)
>         case BPF_MAP_TYPE_SOCKHASH:
>         case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
>                 break;
> +       case BPF_MAP_TYPE_INSN_ARRAY:
> +               key_size        = sizeof(__u32);
> +               value_size      = sizeof(struct bpf_insn_array_value);
> +               break;
>         case BPF_MAP_TYPE_UNSPEC:
>         default:
>                 return -EOPNOTSUPP;
> diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
> index a469e5d4fee7..d1585baa9f14 100644
> --- a/tools/lib/bpf/linker.c
> +++ b/tools/lib/bpf/linker.c
> @@ -28,6 +28,8 @@
>  #include "str_error.h"
>
>  #define BTF_EXTERN_SEC ".extern"
> +#define JUMPTABLES_SEC ".jumptables"
> +#define JUMPTABLES_REL_SEC ".rel.jumptables"
>
>  struct src_sec {
>         const char *sec_name;
> @@ -2026,6 +2028,9 @@ static int linker_append_elf_sym(struct bpf_linker *linker, struct src_obj *obj,
>                         obj->sym_map[src_sym_idx] = dst_sec->sec_sym_idx;
>                         return 0;
>                 }
> +
> +               if (strcmp(src_sec->sec_name, JUMPTABLES_SEC) == 0)
> +                       goto add_sym;
>         }
>
>         if (sym_bind == STB_LOCAL)
> @@ -2272,8 +2277,9 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob
>                                                 insn->imm += sec->dst_off / sizeof(struct bpf_insn);
>                                         else
>                                                 insn->imm += sec->dst_off;
> -                               } else {
> -                                       pr_warn("relocation against STT_SECTION in non-exec section is not supported!\n");
> +                               } else if (strcmp(src_sec->sec_name, JUMPTABLES_REL_SEC)) {

please add explicit `!= 0`, but also didn't we agree to have

if (strcmp(..., JUMPTABLES_REL_SEC) == 0) {
    /* no need to adjust .jumptables */
} else {
    ... original default handling of errors ...
}

Also, how did you test that this actually works? Can you add a
selftest demonstrating this?

> +                                       pr_warn("relocation against STT_SECTION in section %s is not supported!\n",
> +                                               src_sec->sec_name);
>                                         return -EINVAL;
>                                 }
>                         }
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code
  2025-09-18  9:38 ` [PATCH v3 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code Anton Protopopov
@ 2025-09-19 23:18   ` Andrii Nakryiko
  0 siblings, 0 replies; 46+ messages in thread
From: Andrii Nakryiko @ 2025-09-19 23:18 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On Thu, Sep 18, 2025 at 2:32 AM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
>
> Commit 6c918709bd30 ("libbpf: Refactor bpf_object__reloc_code")
> added bpf_object__append_subprog_code() with incorrect indentation.
> Use tabs instead. (This also makes a subsequent commit more readable.)
>
> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> ---
>  tools/lib/bpf/libbpf.c | 52 +++++++++++++++++++++---------------------
>  1 file changed, 26 insertions(+), 26 deletions(-)
>

thanks!

Acked-by: Andrii Nakryiko <andrii@kernel.org>


> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index fe4fc5438678..2c1f48f77680 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -6393,32 +6393,32 @@ static int
>  bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main_prog,
>                                 struct bpf_program *subprog)
>  {
> -       struct bpf_insn *insns;
> -       size_t new_cnt;
> -       int err;
> -
> -       subprog->sub_insn_off = main_prog->insns_cnt;
> -
> -       new_cnt = main_prog->insns_cnt + subprog->insns_cnt;
> -       insns = libbpf_reallocarray(main_prog->insns, new_cnt, sizeof(*insns));
> -       if (!insns) {
> -               pr_warn("prog '%s': failed to realloc prog code\n", main_prog->name);
> -               return -ENOMEM;
> -       }
> -       main_prog->insns = insns;
> -       main_prog->insns_cnt = new_cnt;
> -
> -       memcpy(main_prog->insns + subprog->sub_insn_off, subprog->insns,
> -              subprog->insns_cnt * sizeof(*insns));
> -
> -       pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n",
> -                main_prog->name, subprog->insns_cnt, subprog->name);
> -
> -       /* The subprog insns are now appended. Append its relos too. */
> -       err = append_subprog_relos(main_prog, subprog);
> -       if (err)
> -               return err;
> -       return 0;
> +       struct bpf_insn *insns;
> +       size_t new_cnt;
> +       int err;
> +
> +       subprog->sub_insn_off = main_prog->insns_cnt;
> +
> +       new_cnt = main_prog->insns_cnt + subprog->insns_cnt;
> +       insns = libbpf_reallocarray(main_prog->insns, new_cnt, sizeof(*insns));
> +       if (!insns) {
> +               pr_warn("prog '%s': failed to realloc prog code\n", main_prog->name);
> +               return -ENOMEM;
> +       }
> +       main_prog->insns = insns;
> +       main_prog->insns_cnt = new_cnt;
> +
> +       memcpy(main_prog->insns + subprog->sub_insn_off, subprog->insns,
> +              subprog->insns_cnt * sizeof(*insns));
> +
> +       pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n",
> +                main_prog->name, subprog->insns_cnt, subprog->name);
> +
> +       /* The subprog insns are now appended. Append its relos too. */
> +       err = append_subprog_relos(main_prog, subprog);
> +       if (err)
> +               return err;
> +       return 0;
>  }
>
>  static int
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 08/13] bpf, x86: add support for indirect jumps
  2025-09-18  9:38 ` [PATCH v3 bpf-next 08/13] bpf, x86: add support for indirect jumps Anton Protopopov
@ 2025-09-20  0:28   ` Eduard Zingerman
  2025-09-21 19:12     ` Eduard Zingerman
  2025-09-25 18:07     ` Anton Protopopov
  0 siblings, 2 replies; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-20  0:28 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> Add support for a new instruction
> 
>     BPF_JMP|BPF_X|BPF_JA, SRC=0, DST=Rx, off=0, imm=0
> 
> which does an indirect jump to a location stored in Rx. The register
> Rx should have type PTR_TO_INSN. This new type ensures that the Rx
> register contains a value (or a range of values) loaded from a
> correct jump table, i.e. a map of type instruction array.
> 
> For example, for a C switch LLVM will generate the following code:
> 
>     0:   r3 = r1                    # "switch (r3)"
>     1:   if r3 > 0x13 goto +0x666   # check r3 boundaries
>     2:   r3 <<= 0x3                 # adjust to an index in array of addresses
>     3:   r1 = 0xbeef ll             # r1 is PTR_TO_MAP_VALUE, r1->map_ptr=M
>     5:   r1 += r3                   # r1 inherits boundaries from r3
>     6:   r1 = *(u64 *)(r1 + 0x0)    # r1 now has type PTR_TO_INSN
>     7:   gotox r1[,imm=fd(M)]       # jit will generate proper code
                   ^^^^^^^^^^^^
	      Nit: this part is not needed atm.
> 
> Here the gotox instruction corresponds to one particular map. It is,
> however, possible to have a gotox instruction whose target can be loaded
> from different maps, e.g.

[...]

> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index aca43c284203..607a684642e5 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h

[...]

> @@ -586,6 +597,9 @@ struct bpf_insn_aux_data {
>  	u8 fastcall_spills_num:3;
>  	u8 arg_prog:4;
>  
> +	/* true if jt->off was allocated */
> +	bool jt_allocated;
> +

Nit: in clear_insn_aux_data() maybe just check if instruction is a gotox?

>  	/* below fields are initialized once */
>  	unsigned int orig_idx; /* original instruction index */
>  	bool jmp_point;

[...]

>  static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog)
> diff --git a/kernel/bpf/bpf_insn_array.c b/kernel/bpf/bpf_insn_array.c
> index 0c8dac62f457..4b945b7e31b8 100644
> --- a/kernel/bpf/bpf_insn_array.c
> +++ b/kernel/bpf/bpf_insn_array.c
> @@ -1,7 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0-only
>  
>  #include <linux/bpf.h>
> -#include <linux/sort.h>

Nit: remove this include from patch #3?

>  
>  #define MAX_INSN_ARRAY_ENTRIES 256
>  

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 5c1e4e37d1f8..839260e62fa9 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c

[...]

> @@ -7620,6 +7644,19 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
>  
>  				regs[value_regno].type = SCALAR_VALUE;
>  				__mark_reg_known(&regs[value_regno], val);
> +			} else if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY) {
> +				regs[value_regno].type = PTR_TO_INSN;
> +				regs[value_regno].map_ptr = map;
> +				regs[value_regno].off = reg->off;
> +				regs[value_regno].umin_value = reg->umin_value;
> +				regs[value_regno].umax_value = reg->umax_value;
> +				regs[value_regno].smin_value = reg->smin_value;
> +				regs[value_regno].smax_value = reg->smax_value;
> +				regs[value_regno].s32_min_value = reg->s32_min_value;
> +				regs[value_regno].s32_max_value = reg->s32_max_value;
> +				regs[value_regno].u32_min_value = reg->u32_min_value;
> +				regs[value_regno].u32_max_value = reg->u32_max_value;
> +				regs[value_regno].var_off = reg->var_off;

This can be shortened to:

  copy_register_state(regs + value_regno, reg);
  regs[value_regno].type = PTR_TO_INSN;

I think that a check that read is u64 wide is necessary here.
Otherwise e.g. for u8 load you'd need to truncate the bounds set above.
This is also necessary for alignment check at the beginning of this
function (check_ptr_alignment() call).

>  			} else {
>  				mark_reg_unknown(env, regs, value_regno);
>  			}

[...]

> @@ -14628,6 +14672,11 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
>  		}
>  		break;
>  	case BPF_SUB:
> +		if (ptr_to_insn_array) {
> +			verbose(env, "Operation %s on ptr to instruction set map is prohibited\n",
> +				bpf_alu_string[opcode >> 4]);
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
               Nit: Just "subtraction", no need for lookup?
                    Also, maybe put this near the same check for PTR_TO_STACK?

> +			return -EACCES;
> +		}
>  		if (dst_reg == off_reg) {
>  			/* scalar -= pointer.  Creates an unknown scalar */
>  			verbose(env, "R%d tried to subtract pointer from scalar\n",

[...]

> @@ -17733,6 +17783,234 @@ static int mark_fastcall_patterns(struct bpf_verifier_env *env)
>  	return 0;
>  }
>  
> +#define SET_HIGH(STATE, LAST)	STATE = (STATE & 0xffffU) | ((LAST) << 16)
> +#define GET_HIGH(STATE)		((u16)((STATE) >> 16))
> +
> +static int push_gotox_edge(int t, struct bpf_verifier_env *env, struct bpf_iarray *jt)
> +{
> +	int *insn_stack = env->cfg.insn_stack;
> +	int *insn_state = env->cfg.insn_state;
> +	u16 prev;
> +	int w;
> +
> +	for (prev = GET_HIGH(insn_state[t]); prev < jt->off_cnt; prev++) {
> +		w = jt->off[prev];
> +
> +		/* EXPLORED || DISCOVERED */
> +		if (insn_state[w])
> +			continue;

Suppose there is some other way to reach `w` beside gotox.
Also suppose that `w` had been visited already.
In such case `mark_jmp_point(env, w)` might get omitted for `w`.

> +
> +		break;
> +	}
> +
> +	if (prev == jt->off_cnt)
> +		return DONE_EXPLORING;
> +
> +	mark_prune_point(env, t);

Nit: do this from visit_gotox_insn() ?

> +
> +	if (env->cfg.cur_stack >= env->prog->len)
> +		return -E2BIG;
> +	insn_stack[env->cfg.cur_stack++] = w;
> +
> +	mark_jmp_point(env, w);
> +
> +	SET_HIGH(insn_state[t], prev + 1);
> +	return KEEP_EXPLORING;
> +}

[...]

> +/*
> + * Find and collect all maps which fit in the subprog. Return the result as one
> + * combined jump table in jt->off (allocated with kvcalloc
                                                           ^^^
						   nit: missing ')'

> + */
> +static struct bpf_iarray *jt_from_subprog(struct bpf_verifier_env *env,
> +					  int subprog_start, int subprog_end)

[...]

> +static struct bpf_iarray *
> +create_jt(int t, struct bpf_verifier_env *env, int fd)
                                                  ^^^^^^
			fd is unused, same for visit_gotox_insn()

[...]

> @@ -18716,6 +19001,10 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
>  		return regs_exact(rold, rcur, idmap) && rold->frameno == rcur->frameno;
>  	case PTR_TO_ARENA:
>  		return true;
> +	case PTR_TO_INSN:
> +		/* is rcur a subset of rold? */
> +		return (rcur->umin_value >= rold->umin_value &&
> +			rcur->umax_value <= rold->umax_value);

I think this should be:

                 if (rold->off != rcur->off)
                         return false;
                 return range_within(old: rold, cur: rcur) &&
                        tnum_in(a: rold->var_off, b: rcur->var_off);
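For reference, tnum_in() semantics (reimplemented below as a standalone
sketch of what kernel/bpf/tnum.c does) are what catch states that a plain
umin/umax subset test misses: two registers can have identical ranges but
track different known bits via var_off.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standalone sketch of the kernel's tracked-number subset check.
 * value holds the known bits, mask marks the unknown bits. */
struct tnum { uint64_t value; uint64_t mask; };

/* Returns true iff every value representable by b is representable by a. */
static bool tnum_in(struct tnum a, struct tnum b)
{
	if (b.mask & ~a.mask)		/* b unknown where a is known */
		return false;
	b.value &= ~a.mask;		/* compare only a's known bits */
	return a.value == b.value;
}
```

Example: rold tracking "even values in [0, 6]" is var_off {.value=0,
.mask=6}; rcur being the constant 5 sits inside [0, 6] by umin/umax, yet
tnum_in() correctly rejects it because bit 0 is known-zero in rold.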

>  	default:
>  		return regs_exact(rold, rcur, idmap);
>  	}
> @@ -19862,6 +20151,102 @@ static int process_bpf_exit_full(struct bpf_verifier_env *env,
>  	return PROCESS_BPF_EXIT;
>  }
>  
> +static int indirect_jump_min_max_index(struct bpf_verifier_env *env,
> +				       int regno,
> +				       struct bpf_map *map,
> +				       u32 *pmin_index, u32 *pmax_index)
> +{
> +	struct bpf_reg_state *reg = reg_state(env, regno);
> +	u64 min_index, max_index;
> +
> +	if (check_add_overflow(reg->umin_value, reg->off, &min_index) ||
> +		(min_index > (u64) U32_MAX * sizeof(long))) {
> +		verbose(env, "the sum of R%u umin_value %llu and off %u is too big\n",
> +			     regno, reg->umin_value, reg->off);
> +		return -ERANGE;
> +	}
> +	if (check_add_overflow(reg->umax_value, reg->off, &max_index) ||
> +		(max_index > (u64) U32_MAX * sizeof(long))) {
> +		verbose(env, "the sum of R%u umax_value %llu and off %u is too big\n",
> +			     regno, reg->umax_value, reg->off);
> +		return -ERANGE;
> +	}
> +
> +	min_index /= sizeof(long);
> +	max_index /= sizeof(long);

Nit: `long` is 32 bits on x86 without -m64; I understand that the x86 jit
would just reject gotox, but could you please use `sizeof(u64)` here?

> +
> +	if (min_index >= map->max_entries || max_index >= map->max_entries) {
> +		verbose(env, "R%u points to outside of jump table: [%llu,%llu] max_entries %u\n",
> +			     regno, min_index, max_index, map->max_entries);
> +		return -EINVAL;
> +	}
> +
> +	*pmin_index = min_index;
> +	*pmax_index = max_index;
> +	return 0;
> +}
> +
> +/* gotox *dst_reg */
> +static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *insn)
> +{
> +	struct bpf_verifier_state *other_branch;
> +	struct bpf_reg_state *dst_reg;
> +	struct bpf_map *map;
> +	u32 min_index, max_index;
> +	int err = 0;
> +	u32 *xoff;
> +	int n;
> +	int i;
> +
> +	dst_reg = reg_state(env, insn->dst_reg);
> +	if (dst_reg->type != PTR_TO_INSN) {
> +		verbose(env, "R%d has type %d, expected PTR_TO_INSN\n",
> +			     insn->dst_reg, dst_reg->type);
> +		return -EINVAL;
> +	}
> +
> +	map = dst_reg->map_ptr;
> +	if (verifier_bug_if(!map, env, "R%d has an empty map pointer", insn->dst_reg))
> +		return -EFAULT;
> +
> +	if (verifier_bug_if(map->map_type != BPF_MAP_TYPE_INSN_ARRAY, env,
> +			    "R%d has incorrect map type %d", insn->dst_reg, map->map_type))
> +		return -EFAULT;
> +
> +	err = indirect_jump_min_max_index(env, insn->dst_reg, map, &min_index, &max_index);
> +	if (err)
> +		return err;
> +
> +	xoff = kvcalloc(max_index - min_index + 1, sizeof(u32), GFP_KERNEL_ACCOUNT);
> +	if (!xoff)
> +		return -ENOMEM;

Let's keep a buffer for this allocation in `env` and realloc it when needed.
Would be good to avoid allocating memory each time this gotox is visited.

> +
> +	n = copy_insn_array_uniq(map, min_index, max_index, xoff);
> +	if (n < 0) {
> +		err = n;
> +		goto free_off;
> +	}
> +	if (n == 0) {
> +		verbose(env, "register R%d doesn't point to any offset in map id=%d\n",
> +			     insn->dst_reg, map->id);
> +		err = -EINVAL;
> +		goto free_off;
> +	}
> +
> +	for (i = 0; i < n - 1; i++) {
> +		other_branch = push_stack(env, xoff[i], env->insn_idx, false);
                                                                       ^^^^^
                         `is_speculative` has to be inherited from env->cur_state

> +		if (IS_ERR(other_branch)) {
> +			err = PTR_ERR(other_branch);
> +			goto free_off;
> +		}
> +	}
> +	env->insn_idx = xoff[n-1];
> +
> +free_off:
> +	kvfree(xoff);
> +	return err;
> +}
> +
>  static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
>  {
>  	int err;
> @@ -19964,6 +20349,9 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
>  
>  			mark_reg_scratched(env, BPF_REG_0);
>  		} else if (opcode == BPF_JA) {
> +			if (BPF_SRC(insn->code) == BPF_X)
> +				return check_indirect_jump(env, insn);
> +

check_indirect_jump() does not check the reserved fields (like off or src_reg).

>  			if (BPF_SRC(insn->code) != BPF_K ||
>  			    insn->src_reg != BPF_REG_0 ||
>  			    insn->dst_reg != BPF_REG_0 ||

[...]

> @@ -24215,23 +24625,41 @@ static bool can_jump(struct bpf_insn *insn)
>  	return false;
>  }
>  
> -static int insn_successors(struct bpf_prog *prog, u32 idx, u32 succ[2])
> +/*
> + * Returns an array succ of successor instruction indices:
> + * succ->off[0], ..., succ->off[n-1], where n = succ->off_cnt
> + */
> +static struct bpf_iarray *
> +insn_successors(struct bpf_verifier_env *env, u32 insn_idx)

Nit: maybe put insn_successors refactoring to a separate patch?

>  {
> -	struct bpf_insn *insn = &prog->insnsi[idx];
> -	int i = 0, insn_sz;
> +	struct bpf_prog *prog = env->prog;
> +	struct bpf_insn *insn = &prog->insnsi[insn_idx];
> +	struct bpf_iarray *succ;
> +	int insn_sz;
>  	u32 dst;
>  
> -	insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
> -	if (can_fallthrough(insn) && idx + 1 < prog->len)
> -		succ[i++] = idx + insn_sz;
> +	if (unlikely(insn_is_gotox(insn))) {
> +		succ = env->insn_aux_data[insn_idx].jt;
> +		if (verifier_bug_if(!succ, env,
> +				    "aux data for insn %u doesn't contain a jump table\n",
> +				    insn_idx))
> +			return ERR_PTR(-EFAULT);

Requiring each callsite to check error code for this function is very inconvenient.
Moreover, insn_successors() is hot in liveness.c:update_instance().
Let's just assume that NULL here cannot happen.

> +	} else {
> +		/* pre-allocated array of size up to 2; reset cnt, as it may be used already */
> +		succ = env->succ;
> +		succ->off_cnt = 0;
>  
> -	if (can_jump(insn)) {
> -		dst = idx + jmp_offset(insn) + 1;
> -		if (i == 0 || succ[0] != dst)
> -			succ[i++] = dst;
> -	}
> +		insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
> +		if (can_fallthrough(insn) && insn_idx + 1 < prog->len)
> +			succ->off[succ->off_cnt++] = insn_idx + insn_sz;
>  
> -	return i;
> +		if (can_jump(insn)) {
> +			dst = insn_idx + jmp_offset(insn) + 1;
> +			if (succ->off_cnt == 0 || succ->off[0] != dst)
> +				succ->off[succ->off_cnt++] = dst;
> +		}
> +	}
> +	return succ;
>  }
>

[...]

> @@ -24489,11 +24921,10 @@ static int compute_scc(struct bpf_verifier_env *env)
>  	const u32 insn_cnt = env->prog->len;
>  	int stack_sz, dfs_sz, err = 0;
>  	u32 *stack, *pre, *low, *dfs;
> -	u32 succ_cnt, i, j, t, w;
> +	u32 i, j, t, w;
>  	u32 next_preorder_num;
>  	u32 next_scc_id;
>  	bool assign_scc;
> -	u32 succ[2];
>  
>  	next_preorder_num = 1;
>  	next_scc_id = 1;
> @@ -24592,6 +25023,8 @@ static int compute_scc(struct bpf_verifier_env *env)
>  		dfs[0] = i;
>  dfs_continue:
>  		while (dfs_sz) {
> +			struct bpf_iarray *succ;
> +

Nit: please move this declaration up, just to be consistent with other variables.

>  			w = dfs[dfs_sz - 1];
>  			if (pre[w] == 0) {
>  				low[w] = next_preorder_num;
> @@ -24600,12 +25033,17 @@ static int compute_scc(struct bpf_verifier_env *env)
>  				stack[stack_sz++] = w;
>  			}
>  			/* Visit 'w' successors */
> -			succ_cnt = insn_successors(env->prog, w, succ);
> -			for (j = 0; j < succ_cnt; ++j) {
> -				if (pre[succ[j]]) {
> -					low[w] = min(low[w], low[succ[j]]);
> +			succ = insn_successors(env, w);
> +			if (IS_ERR(succ)) {
> +				err = PTR_ERR(succ);
> +				goto exit;
> +			}
> +			for (j = 0; j < succ->off_cnt; ++j) {
> +				if (pre[succ->off[j]]) {
> +					low[w] = min(low[w], low[succ->off[j]]);
>  				} else {
> -					dfs[dfs_sz++] = succ[j];
> +					dfs[dfs_sz++] = succ->off[j];
>  					goto dfs_continue;
>  				}
>  			}

[...]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps
  2025-09-18  9:38 ` [PATCH v3 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps Anton Protopopov
@ 2025-09-20  0:58   ` Eduard Zingerman
  2025-09-20 22:27     ` Eduard Zingerman
  0 siblings, 1 reply; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-20  0:58 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> Add selftests for indirect jumps. All the indirect jumps are
> generated from C switch statements, so, if compiled by a compiler
> which doesn't support indirect jumps, they should pass as well.
> 
> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> ---

Patch #8 adds a lot of error conditions that are effectively untested
at the moment. I think we need to figure out a way to express gotox
tests in inline assembly, independent of clang version, and add a
bunch of correctness tests.

[...]


* Re: [PATCH v3 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps
  2025-09-20  0:58   ` Eduard Zingerman
@ 2025-09-20 22:27     ` Eduard Zingerman
  2025-09-20 22:32       ` Eduard Zingerman
  2025-09-25 18:14       ` Anton Protopopov
  0 siblings, 2 replies; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-20 22:27 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Fri, 2025-09-19 at 17:58 -0700, Eduard Zingerman wrote:
> On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > Add selftests for indirect jumps. All the indirect jumps are
> > generated from C switch statements, so, if compiled by a compiler
> > which doesn't support indirect jumps, they should pass as well.
> >
> > Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> > ---
>
> Patch #8 adds a lot of error conditions that are effectively untested
> at the moment. I think we need to figure out a way to express gotox
> tests in inline assembly, independent of clang version, and add a
> bunch of correctness tests.
>
> [...]

Here is an example (I modified verifier_and.c; the actual patch should use
some verifier_gotox.c, of course):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>
  #include "bpf_misc.h"
  #include "../../../include/linux/filter.h"

  SEC("socket")
  __success
  __retval(1)
  __naked void jump_table1(void)
  {
  	asm volatile (
  ".pushsection .jumptables,\"\",@progbits;\n"
  "jt0_%=:\n"
  	".quad ret0_%=;\n"
  	".quad ret1_%=;\n"
  ".size jt0_%=, 16;\n"
  ".global jt0_%=;\n"
  ".popsection;\n"

  	"r0 = jt0_%= ll;\n"
  	"r0 += 8;\n"
  	"r0 = *(u64 *)(r0 + 0);\n"
  	".8byte %[gotox_r0];\n"
  "ret0_%=:\n"
  	"r0 = 0;\n"
  	"exit;\n"
  "ret1_%=:\n"
  	"r0 = 1;\n"
  	"exit;\n"
  	:
  	: __imm_insn(gotox_r0, BPF_RAW_INSN(BPF_JMP | BPF_JA | BPF_X, BPF_REG_0, 0, 0 , 0))
  	: __clobber_all);
  }

  char _license[] SEC("license") = "GPL";

It verifies and executes (having fix for emit_indirect_jump() applied):

  VERIFIER LOG:
  =============
  func#0 @0
  Live regs before insn:
        0: .......... (18) r0 = 0xffff888108c66700
        2: 0......... (07) r0 += 8
        3: 0......... (79) r0 = *(u64 *)(r0 +0)
        4: .......... (0d) gotox r0
        5: .......... (b7) r0 = 0
        6: 0......... (95) exit
        7: .......... (b7) r0 = 1
        8: 0......... (95) exit
  Global function jump_table1() doesn't return scalar. Only those are supported.
  0: R1=ctx() R10=fp0
  ; asm volatile ( @ verifier_and.c:122
  0: (18) r0 = 0xffff888108c66700       ; R0_w=map_value(map=jt,ks=4,vs=8)
  2: (07) r0 += 8                       ; R0_w=map_value(map=jt,ks=4,vs=8,off=8)
  3: (79) r0 = *(u64 *)(r0 +0)          ; R0_w=insn(off=8)
  4: (0d) gotox r0
  7: (b7) r0 = 1                        ; R0_w=1
  8: (95) exit
  processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
  =============
  do_prog_test_run:PASS:bpf_prog_test_run 0 nsec
  #488/1   verifier_and/jump_table1:OK
  #488     verifier_and:OK
  Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

This example can be mutated in various ways to check behaviour and
error conditions.

Having such complete set of such tests, I'd only keep a few canary
C-level tests.


* Re: [PATCH v3 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps
  2025-09-20 22:27     ` Eduard Zingerman
@ 2025-09-20 22:32       ` Eduard Zingerman
  2025-09-25 18:14       ` Anton Protopopov
  1 sibling, 0 replies; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-20 22:32 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Sat, 2025-09-20 at 15:27 -0700, Eduard Zingerman wrote:

[...]

>   ".pushsection .jumptables,\"\",@progbits;\n"
                                            ^^^^
                                should be w/o newlines, sorry

[...]

> Having such complete set of such tests, I'd only keep a few canary
> C-level tests.

*Having complete set of such tests, ...


* Re: [PATCH v3 bpf-next 08/13] bpf, x86: add support for indirect jumps
  2025-09-20  0:28   ` Eduard Zingerman
@ 2025-09-21 19:12     ` Eduard Zingerman
  2025-09-25 18:07     ` Anton Protopopov
  1 sibling, 0 replies; 46+ messages in thread
From: Eduard Zingerman @ 2025-09-21 19:12 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Quentin Monnet, Yonghong Song

On Fri, 2025-09-19 at 17:28 -0700, Eduard Zingerman wrote:

[...]

> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 5c1e4e37d1f8..839260e62fa9 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c

[...]

> > +/* gotox *dst_reg */
> > +static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *insn)
> > +{
> > +	struct bpf_verifier_state *other_branch;
> > +	struct bpf_reg_state *dst_reg;
> > +	struct bpf_map *map;
> > +	u32 min_index, max_index;
> > +	int err = 0;
> > +	u32 *xoff;
> > +	int n;
> > +	int i;
> > +
> > +	dst_reg = reg_state(env, insn->dst_reg);
> > +	if (dst_reg->type != PTR_TO_INSN) {
> > +		verbose(env, "R%d has type %d, expected PTR_TO_INSN\n",
> > +			     insn->dst_reg, dst_reg->type);
> > +		return -EINVAL;
> > +	}
> > +
> > +	map = dst_reg->map_ptr;
> > +	if (verifier_bug_if(!map, env, "R%d has an empty map pointer", insn->dst_reg))
> > +		return -EFAULT;
> > +
> > +	if (verifier_bug_if(map->map_type != BPF_MAP_TYPE_INSN_ARRAY, env,
> > +			    "R%d has incorrect map type %d", insn->dst_reg, map->map_type))
> > +		return -EFAULT;
> > +
> > +	err = indirect_jump_min_max_index(env, insn->dst_reg, map, &min_index, &max_index);
> > +	if (err)
> > +		return err;
> > +
> > +	xoff = kvcalloc(max_index - min_index + 1, sizeof(u32), GFP_KERNEL_ACCOUNT);
> > +	if (!xoff)
> > +		return -ENOMEM;
> 
> Let's keep a buffer for this allocation in `env` and realloc it when needed.
> Would be good to avoid allocating memory each time this gotox is visited.

On a second thought, maybe put this array into bpf_subprog_info for
each function and avoid copy/sort on each gotox instruction as well?

> > +
> > +	n = copy_insn_array_uniq(map, min_index, max_index, xoff);
> > +	if (n < 0) {
> > +		err = n;
> > +		goto free_off;
> > +	}
> > +	if (n == 0) {
> > +		verbose(env, "register R%d doesn't point to any offset in map id=%d\n",
> > +			     insn->dst_reg, map->id);
> > +		err = -EINVAL;
> > +		goto free_off;
> > +	}

[...]


* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-19 20:47                 ` Eduard Zingerman
@ 2025-09-22  9:28                   ` Anton Protopopov
  2025-09-30  9:07                     ` Anton Protopopov
  0 siblings, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-22  9:28 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Daniel Borkmann, Alexei Starovoitov, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Anton Protopopov, Quentin Monnet, Yonghong Song

On 25/09/19 01:47PM, Eduard Zingerman wrote:
> On Fri, 2025-09-19 at 20:27 +0000, Anton Protopopov wrote:
> > On 25/09/19 12:44PM, Eduard Zingerman wrote:
> > > On Fri, 2025-09-19 at 21:28 +0200, Daniel Borkmann wrote:
> > > > On 9/19/25 8:26 PM, Alexei Starovoitov wrote:
> > > > > On Fri, Sep 19, 2025 at 12:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > On Fri, 2025-09-19 at 07:05 +0000, Anton Protopopov wrote:
> > > > > > > On 25/09/18 11:35PM, Eduard Zingerman wrote:
> > > > > > > > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > > > > > > > 
> > > > > > > > [...]
> > > > > > > > 
> > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > > > > index a7ad4fe756da..5c1e4e37d1f8 100644
> > > > > > > > > --- a/kernel/bpf/verifier.c
> > > > > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > > > > @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > > > >    struct bpf_insn *insn;
> > > > > > > > >    void *old_bpf_func;
> > > > > > > > >    int err, num_exentries;
> > > > > > > > > + int old_len, subprog_start_adjustment = 0;
> > > > > > > > > 
> > > > > > > > >    if (env->subprog_cnt <= 1)
> > > > > > > > >            return 0;
> > > > > > > > > @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > > > >            func[i]->aux->func_idx = i;
> > > > > > > > >            /* Below members will be freed only at prog->aux */
> > > > > > > > >            func[i]->aux->btf = prog->aux->btf;
> > > > > > > > > -         func[i]->aux->subprog_start = subprog_start;
> > > > > > > > > +         func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
> > > > > > > > >            func[i]->aux->func_info = prog->aux->func_info;
> > > > > > > > >            func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
> > > > > > > > >            func[i]->aux->poke_tab = prog->aux->poke_tab;
> > > > > > > > > @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > > > >            func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
> > > > > > > > >            if (!i)
> > > > > > > > >                    func[i]->aux->exception_boundary = env->seen_exception;
> > > > > > > > > +
> > > > > > > > > +         /*
> > > > > > > > > +          * To properly pass the absolute subprog start to jit
> > > > > > > > > +          * all instruction adjustments should be accumulated
> > > > > > > > > +          */
> > > > > > > > > +         old_len = func[i]->len;
> > > > > > > > >            func[i] = bpf_int_jit_compile(func[i]);
> > > > > > > > > +         subprog_start_adjustment += func[i]->len - old_len;
> > > > > > > > > +
> > > > > > > > >            if (!func[i]->jited) {
> > > > > > > > >                    err = -ENOTSUPP;
> > > > > > > > >                    goto out_free;
> > > > > > > > 
> > > > > > > > This change makes sense, however, would it be possible to move
> > > > > > > > bpf_jit_blind_constants() out from jit to verifier.c:do_check,
> > > > > > > > somewhere after do_misc_fixups?
> > > > > > > > Looking at the source code, bpf_jit_blind_constants() is the first
> > > > > > > > thing any bpf_int_jit_compile() does.
> > > > > > > > Another alternative is to add adjust_subprog_starts() call to this
> > > > > > > > function. Wdyt?
> > > > > > > 
> > > > > > > Yes, it makes total sense. Blinding was added to the x86 jit initially and then
> > > > > > > every other jit copy-pasted it.  I was considering moving blinding up some
> > > > > > > time back (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
> > > > > > > but then I decided to avoid this, as it requires patching every JIT, and I
> > > > > > > am not sure how to test such a change (any hints?)
> > > > > > 
> > > > > > We have the following covered by CI:
> > > > > > - arch/x86/net/bpf_jit_comp.c
> > > > > > - arch/s390/net/bpf_jit_comp.c
> > > > > > - arch/arm64/net/bpf_jit_comp.c
> > > > > > 
> > > > > > People work on these jits actively:
> > > > > > - arch/riscv/net/bpf_jit_core.c
> > > > > > - arch/loongarch/net/bpf_jit.c
> > > > > > - arch/powerpc/net/bpf_jit_comp.c
> > > > > > 
> > > > > > So, we can probably ask to test the patch-set.
> > > > > > 
> > > > > > The remaining are:
> > > > > > - arch/x86/net/bpf_jit_comp32.c
> > > > > > - arch/parisc/net/bpf_jit_core.c
> > > > > > - arch/mips/net/bpf_jit_comp.c
> > > > > > - arch/arm/net/bpf_jit_32.c
> > > > > > - arch/sparc/net/bpf_jit_comp_64.c
> > > > > > - arch/arc/net/bpf_jit_core.c
> > > > > > 
> > > > > > The change to each individual jit is not complicated, just removing
> > > > > > the transformation call. Idk, I'd just go for it.
> > > > > > Maybe Alexei has concerns?
> > > > > 
> > > > > No concerns.
> > > > > I don't remember why JIT calls it instead of the verifier.
> > > > > 
> > > > > Daniel,
> > > > > do you recall? Any concern?
> > > > 
> > > > Hm, I think we did this in the JIT back then for a couple of reasons iirc,
> > > > the constant blinding needs to work from native bpf(2) as well as from
> > > > cbpf->ebpf (seccomp-bpf, filters, etc), so the JIT was a natural location
> > > > to capture them all, and to fallback to interpreter with the non-blinded
> > > > BPF-insns when something went wrong during blinding or JIT process (e.g.
> > > > JIT hits some internal limits etc). Moving bpf_jit_blind_constants() out
> > > > from JIT to verifier.c:do_check() means constant blinding of cbpf->ebpf
> > > > are not covered anymore (and in this case its reachable from unpriv).
> > > 
> > > Hi Daniel,
> > > 
> > > Thank you for the context.
> > > So, the ideal location for bpf_jit_blind_constants() would be in
> > > core.c in some wrapper function for bpf_int_jit_compile():
> > > 
> > >   static struct bpf_prog *jit_compile(struct bpf_prog *prog)
> > >   {
> > >   	tmp = bpf_jit_blind_constants(prog);
> > >         if (!tmp)
> > >            return prog;
> > >         return bpf_int_jit_compile(tmp);
> > >   }
> > > 
> > > A bit of a hassle.
> > > 
> > > Anton, wdyt about a second option: adding adjust_subprog_starts()
> > > to bpf_jit_blind_constants() and leaving all the rest as-is?
> > > It would have to happen either way if the call to
> > > bpf_jit_blind_constants() itself is moved.
> > 
> > So, to be clear, in this case adjust_insn_arrays() stays as in the
> > original patch, but the "subprog_start_adjustment" chunks are
> > replaced by calling the adjust_subprog_starts() (for better
> > readability and consistency, right?)
> 
> Yes, by adding adjust_subprog_starts() call inside
> bpf_jit_blind_constants() it should be possible to read
> env->subprog_info[*].start in the jit_subprogs() loop directly,
> w/o tracking the subprog_start_adjustment delta.
> (At least I think this should work.)

Ok, will do this way, thanks.


* Re: [PATCH v3 bpf-next 11/13] libbpf: support llvm-generated indirect jumps
  2025-09-19 23:18   ` Andrii Nakryiko
@ 2025-09-22 10:13     ` Anton Protopopov
  0 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-22 10:13 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On 25/09/19 04:18PM, Andrii Nakryiko wrote:
> On Thu, Sep 18, 2025 at 2:32 AM Anton Protopopov
> <a.s.protopopov@gmail.com> wrote:
> >
> > For the v5 instruction set LLVM is allowed to generate indirect jumps for
> > switch statements and for 'goto *rX' assembly. Every such jump will
> > be accompanied by necessary metadata, e.g. (`llvm-objdump -Sr ...`):
> >
> >        0:       r2 = 0x0 ll
> >                 0000000000000030:  R_BPF_64_64  BPF.JT.0.0
> >
> > Here BPF.JT.0.0 is a symbol residing in the .jumptables section:
> >
> >     Symbol table:
> >        4: 0000000000000000   240 OBJECT  GLOBAL DEFAULT     4 BPF.JT.0.0
> >
> > The -bpf-min-jump-table-entries llvm option may be used to control the
> > minimal size of a switch which will be converted to an indirect jumps.
> >
> > Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> > ---
> >  tools/lib/bpf/libbpf.c        | 150 +++++++++++++++++++++++++++++++++-
> >  tools/lib/bpf/libbpf_probes.c |   4 +
> >  tools/lib/bpf/linker.c        |  10 ++-
> >  3 files changed, 161 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index 2c1f48f77680..57cac0810d2e 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -191,6 +191,7 @@ static const char * const map_type_name[] = {
> >         [BPF_MAP_TYPE_USER_RINGBUF]             = "user_ringbuf",
> >         [BPF_MAP_TYPE_CGRP_STORAGE]             = "cgrp_storage",
> >         [BPF_MAP_TYPE_ARENA]                    = "arena",
> > +       [BPF_MAP_TYPE_INSN_ARRAY]               = "insn_array",
> >  };
> >
> >  static const char * const prog_type_name[] = {
> > @@ -372,6 +373,7 @@ enum reloc_type {
> >         RELO_EXTERN_CALL,
> >         RELO_SUBPROG_ADDR,
> >         RELO_CORE,
> > +       RELO_INSN_ARRAY,
> >  };
> >
> >  struct reloc_desc {
> > @@ -382,7 +384,10 @@ struct reloc_desc {
> >                 struct {
> >                         int map_idx;
> >                         int sym_off;
> > -                       int ext_idx;
> > +                       union {
> > +                               int ext_idx;
> > +                               int sym_size;
> > +                       };
> >                 };
> >         };
> >  };
> > @@ -424,6 +429,11 @@ struct bpf_sec_def {
> >         libbpf_prog_attach_fn_t prog_attach_fn;
> >  };
> >
> > +struct bpf_light_subprog {
> > +       __u32 sec_insn_off;
> > +       __u32 sub_insn_off;
> > +};
> > +
> >  /*
> >   * bpf_prog should be a better name but it has been used in
> >   * linux/filter.h.
> > @@ -496,6 +506,9 @@ struct bpf_program {
> >         __u32 line_info_rec_size;
> >         __u32 line_info_cnt;
> >         __u32 prog_flags;
> > +
> > +       struct bpf_light_subprog *subprog;
> 
> nit: subprogs (but still subprog_cnt, yep)

done

> 
> > +       __u32 subprog_cnt;
> >  };
> >
> >  struct bpf_struct_ops {
> > @@ -525,6 +538,7 @@ struct bpf_struct_ops {
> >  #define STRUCT_OPS_SEC ".struct_ops"
> >  #define STRUCT_OPS_LINK_SEC ".struct_ops.link"
> >  #define ARENA_SEC ".addr_space.1"
> > +#define JUMPTABLES_SEC ".jumptables"
> >
> >  enum libbpf_map_type {
> >         LIBBPF_MAP_UNSPEC,
> > @@ -668,6 +682,7 @@ struct elf_state {
> >         int symbols_shndx;
> >         bool has_st_ops;
> >         int arena_data_shndx;
> > +       int jumptables_data_shndx;
> >  };
> >
> >  struct usdt_manager;
> > @@ -739,6 +754,9 @@ struct bpf_object {
> >         void *arena_data;
> >         size_t arena_data_sz;
> >
> > +       void *jumptables_data;
> > +       size_t jumptables_data_sz;
> > +
> >         struct kern_feature_cache *feat_cache;
> >         char *token_path;
> >         int token_fd;
> > @@ -765,6 +783,7 @@ void bpf_program__unload(struct bpf_program *prog)
> >
> >         zfree(&prog->func_info);
> >         zfree(&prog->line_info);
> > +       zfree(&prog->subprog);
> >  }
> >
> >  static void bpf_program__exit(struct bpf_program *prog)
> > @@ -3945,6 +3964,13 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
> >                         } else if (strcmp(name, ARENA_SEC) == 0) {
> >                                 obj->efile.arena_data = data;
> >                                 obj->efile.arena_data_shndx = idx;
> > +                       } else if (strcmp(name, JUMPTABLES_SEC) == 0) {
> > +                               obj->jumptables_data = malloc(data->d_size);
> > +                               if (!obj->jumptables_data)
> > +                                       return -ENOMEM;
> > +                               memcpy(obj->jumptables_data, data->d_buf, data->d_size);
> > +                               obj->jumptables_data_sz = data->d_size;
> > +                               obj->efile.jumptables_data_shndx = idx;
> >                         } else {
> >                                 pr_info("elf: skipping unrecognized data section(%d) %s\n",
> >                                         idx, name);
> > @@ -4599,6 +4625,16 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
> >                 return 0;
> >         }
> >
> > +       /* jump table data relocation */
> > +       if (shdr_idx == obj->efile.jumptables_data_shndx) {
> > +               reloc_desc->type = RELO_INSN_ARRAY;
> > +               reloc_desc->insn_idx = insn_idx;
> > +               reloc_desc->map_idx = -1;
> > +               reloc_desc->sym_off = sym->st_value;
> > +               reloc_desc->sym_size = sym->st_size;
> > +               return 0;
> > +       }
> > +
> >         /* generic map reference relocation */
> >         if (type == LIBBPF_MAP_UNSPEC) {
> >                 if (!bpf_object__shndx_is_maps(obj, shdr_idx)) {
> > @@ -6101,6 +6137,74 @@ static void poison_kfunc_call(struct bpf_program *prog, int relo_idx,
> >         insn->imm = POISON_CALL_KFUNC_BASE + ext_idx;
> >  }
> >
> > +static int create_jt_map(struct bpf_object *obj, int off, int size, int adjust_off)
> > +{
> > +       const __u32 value_size = sizeof(struct bpf_insn_array_value);
> > +       const __u32 max_entries = size / value_size;
> > +       struct bpf_insn_array_value val = {};
> > +       int map_fd, err;
> > +       __u64 xlated_off;
> > +       __u64 *jt;
> > +       __u32 i;
> > +
> > +       map_fd = bpf_map_create(BPF_MAP_TYPE_INSN_ARRAY, "jt",
> 
> let's call it ".jumptables" just like special global data maps?

done

> > +                               4, value_size, max_entries, NULL);
> > +       if (map_fd < 0)
> > +               return map_fd;
> > +
> > +       if (!obj->jumptables_data) {
> > +               pr_warn("object contains no jumptables_data\n");
> 
> for map-related errors we follow (pretty consistently) error format:
> 
> map '%s': whatever bad happened
> 
> let's stick to that here? "map '.jumptables': ELF file is missing jump
> table data" or something along those lines?

sure, thanks

> > +               return -EINVAL;
> > +       }
> > +       if ((off + size) > obj->jumptables_data_sz) {
> 
> nit: unnecessary ()

Thanks, removed

> > +               pr_warn("jumptables_data size is %zd, trying to access %d\n",
> > +                       obj->jumptables_data_sz, off + size);
> > +               return -EINVAL;
> > +       }
> > +
> > +       jt = (__u64 *)(obj->jumptables_data + off);
> > +       for (i = 0; i < max_entries; i++) {
> > +               /*
> > +                * LLVM-generated jump tables contain u64 records, however
> > +                * should contain values that fit in u32.
> > +                * The adjust_off provided by the caller adjusts the offset to
> > +                * be relative to the beginning of the main function
> > +                */
> > +               xlated_off = jt[i]/sizeof(struct bpf_insn) + adjust_off;
> > +               if (xlated_off > UINT32_MAX) {
> > +                       pr_warn("invalid jump table value %llx at offset %d (adjust_off %d)\n",
> > +                               jt[i], off + i, adjust_off);
> 
> no close(map_fd)? same in a bunch of places above? I'd actually move
> map create to right before this loop and simplify error handling

oops, thanks...

> pw-bot: cr
> 
> > +                       return -EINVAL;
> > +               }
> > +
> > +               val.xlated_off = xlated_off;
> > +               err = bpf_map_update_elem(map_fd, &i, &val, 0);
> > +               if (err) {
> > +                       close(map_fd);
> > +                       return err;
> > +               }
> > +       }
> > +       return map_fd;
> > +}
> > +
> > +/*
> > + * In LLVM the .jumptables section contains jump tables entries relative to the
> > + * section start. The BPF kernel-side code expects jump table offsets relative
> > + * to the beginning of the program (passed in bpf(BPF_PROG_LOAD)). This helper
> > + * computes a delta to be added when creating a map.
> > + */
> > +static int jt_adjust_off(struct bpf_program *prog, int insn_idx)
> > +{
> > +       int i;
> > +
> > +       for (i = prog->subprog_cnt - 1; i >= 0; i--)
> > +               if (insn_idx >= prog->subprog[i].sub_insn_off)
> > +                       return prog->subprog[i].sub_insn_off - prog->subprog[i].sec_insn_off;
> 
> nit: please add {} around multi-line for loop body (even if it's a
> single statement)

Sure, done.

> > +
> > +       return -prog->sec_insn_off;
> > +}
> > +
> > +
> >  /* Relocate data references within program code:
> >   *  - map references;
> >   *  - global variable references;
> > @@ -6192,6 +6296,21 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
> >                 case RELO_CORE:
> >                         /* will be handled by bpf_program_record_relos() */
> >                         break;
> > +               case RELO_INSN_ARRAY: {
> > +                       int map_fd;
> > +
> > +                       map_fd = create_jt_map(obj, relo->sym_off, relo->sym_size,
> > +                                              jt_adjust_off(prog, relo->insn_idx));
> 
> Who's closing all these fds? (I feel like we'd want to have all those
> maps in a list of bpf_object's maps, just like .rodata and others)

Ok, thanks, I've overlooked this.

> Also, how many of those will we have? Each individual relocation gets
> its own map, right?..

Yes. I don't think I had a case where there are two loads of the same
table. I will check whether it makes sense to add such a use case, and
then change this code to create only one map.

> 
> > +                       if (map_fd < 0) {
> > +                               pr_warn("prog '%s': relo #%d: can't create jump table: sym_off %u\n",
> > +                                               prog->name, i, relo->sym_off);
> > +                               return map_fd;
> > +                       }
> > +                       insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
> > +                       insn->imm = map_fd;
> > +                       insn->off = 0;
> > +               }
> > +                       break;
> >                 default:
> >                         pr_warn("prog '%s': relo #%d: bad relo type %d\n",
> >                                 prog->name, i, relo->type);
> > @@ -6389,6 +6508,24 @@ static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_progra
> >         return 0;
> >  }
> >
> > +static int save_subprog_offsets(struct bpf_program *main_prog, struct bpf_program *subprog)
> > +{
> > +       size_t size = sizeof(main_prog->subprog[0]);
> > +       int new_cnt = main_prog->subprog_cnt + 1;
> > +       void *tmp;
> > +
> > +       tmp = libbpf_reallocarray(main_prog->subprog, new_cnt, size);
> > +       if (!tmp)
> > +               return -ENOMEM;
> > +
> > +       main_prog->subprog = tmp;
> > +       main_prog->subprog[new_cnt - 1].sec_insn_off = subprog->sec_insn_off;
> > +       main_prog->subprog[new_cnt - 1].sub_insn_off = subprog->sub_insn_off;
> > +       main_prog->subprog_cnt = new_cnt;
> > +
> > +       return 0;
> > +}
> > +
> >  static int
> >  bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main_prog,
> >                                 struct bpf_program *subprog)
> > @@ -6418,6 +6555,14 @@ bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main
> >         err = append_subprog_relos(main_prog, subprog);
> >         if (err)
> >                 return err;
> > +
> > +       /* Save subprogram offsets */
> > +       err = save_subprog_offsets(main_prog, subprog);
> > +       if (err) {
> > +               pr_warn("prog '%s': failed to add subprog offsets\n", main_prog->name);
> 
> emit error itself as well, use errstr()

ok, done

> > +               return err;
> > +       }
> > +
> >         return 0;
> >  }
> >
> > @@ -9185,6 +9330,9 @@ void bpf_object__close(struct bpf_object *obj)
> >
> >         zfree(&obj->arena_data);
> >
> > +       zfree(&obj->jumptables_data);
> > +       obj->jumptables_data_sz = 0;
> > +
> >         free(obj);
> >  }
> >
> > diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
> > index 9dfbe7750f56..bccf4bb747e1 100644
> > --- a/tools/lib/bpf/libbpf_probes.c
> > +++ b/tools/lib/bpf/libbpf_probes.c
> > @@ -364,6 +364,10 @@ static int probe_map_create(enum bpf_map_type map_type)
> >         case BPF_MAP_TYPE_SOCKHASH:
> >         case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
> >                 break;
> > +       case BPF_MAP_TYPE_INSN_ARRAY:
> > +               key_size        = sizeof(__u32);
> > +               value_size      = sizeof(struct bpf_insn_array_value);
> > +               break;
> >         case BPF_MAP_TYPE_UNSPEC:
> >         default:
> >                 return -EOPNOTSUPP;
> > diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
> > index a469e5d4fee7..d1585baa9f14 100644
> > --- a/tools/lib/bpf/linker.c
> > +++ b/tools/lib/bpf/linker.c
> > @@ -28,6 +28,8 @@
> >  #include "str_error.h"
> >
> >  #define BTF_EXTERN_SEC ".extern"
> > +#define JUMPTABLES_SEC ".jumptables"
> > +#define JUMPTABLES_REL_SEC ".rel.jumptables"
> >
> >  struct src_sec {
> >         const char *sec_name;
> > @@ -2026,6 +2028,9 @@ static int linker_append_elf_sym(struct bpf_linker *linker, struct src_obj *obj,
> >                         obj->sym_map[src_sym_idx] = dst_sec->sec_sym_idx;
> >                         return 0;
> >                 }
> > +
> > +               if (strcmp(src_sec->sec_name, JUMPTABLES_SEC) == 0)
> > +                       goto add_sym;
> >         }
> >
> >         if (sym_bind == STB_LOCAL)
> > @@ -2272,8 +2277,9 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob
> >                                                 insn->imm += sec->dst_off / sizeof(struct bpf_insn);
> >                                         else
> >                                                 insn->imm += sec->dst_off;
> > -                               } else {
> > -                                       pr_warn("relocation against STT_SECTION in non-exec section is not supported!\n");
> > +                               } else if (strcmp(src_sec->sec_name, JUMPTABLES_REL_SEC)) {
> 
> please add explicit `!= 0`, but also didn't we agree to have
> 
> if (strcmp(..., JUMPTABLES_REL_SEC) == 0) {
>     /* no need to adjust .jumptables */
> } else {
>     ... original default handling of errors ...
>
> 
> Also, how did you test that this actually works? Can you add a
> selftest demonstrating this?

I see that I've missed your comment about linking two objects.
I will add a selftest and patch the code above as you've suggested.

> }
> 
> > +                                       pr_warn("relocation against STT_SECTION in section %s is not supported!\n",
> > +                                               src_sec->sec_name);
> >                                         return -EINVAL;
> >                                 }
> >                         }
> > --
> > 2.34.1
> >

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 08/13] bpf, x86: add support for indirect jumps
  2025-09-20  0:28   ` Eduard Zingerman
  2025-09-21 19:12     ` Eduard Zingerman
@ 2025-09-25 18:07     ` Anton Protopopov
  2025-09-29 14:10       ` Anton Protopopov
  1 sibling, 1 reply; 46+ messages in thread
From: Anton Protopopov @ 2025-09-25 18:07 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Quentin Monnet, Yonghong Song

On 25/09/19 05:28PM, Eduard Zingerman wrote:
> On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > Add support for a new instruction
> > 
> >     BPF_JMP|BPF_X|BPF_JA, SRC=0, DST=Rx, off=0, imm=0
> > 
> > which does an indirect jump to a location stored in Rx.  The register
> > Rx should have type PTR_TO_INSN. This new type assures that the Rx
> > register contains a value (or a range of values) loaded from a
> > correct jump table – map of type instruction array.
> > 
> > For example, for a C switch LLVM will generate the following code:
> > 
> >     0:   r3 = r1                    # "switch (r3)"
> >     1:   if r3 > 0x13 goto +0x666   # check r3 boundaries
> >     2:   r3 <<= 0x3                 # adjust to an index in array of addresses
> >     3:   r1 = 0xbeef ll             # r1 is PTR_TO_MAP_VALUE, r1->map_ptr=M
> >     5:   r1 += r3                   # r1 inherits boundaries from r3
> >     6:   r1 = *(u64 *)(r1 + 0x0)    # r1 now has type PTR_TO_INSN
> >     7:   gotox r1[,imm=fd(M)]       # jit will generate proper code
>                    ^^^^^^^^^^^^
> 	      Nit: this part is not needed atm.

Thanks, removed.
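To make the quoted lowering concrete: a dense C switch like the sketch below (case values and density are illustrative only) is what a jump-table-capable LLVM turns into the bounds check, table load, and gotox shown in the listing above:

```c
/*
 * Illustrative only: a dense switch that LLVM can lower into
 * "if (op > 4) goto default; r = table[op]; gotox r".
 */
static int dispatch(unsigned int op)
{
	switch (op) {
	case 0: return 100;
	case 1: return 101;
	case 2: return 102;
	case 3: return 103;
	case 4: return 104;
	default: return -1;
	}
}
```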

> > 
> > Here the gotox instruction corresponds to one particular map. This is
> > possible however to have a gotox instruction which can be loaded from
> > different maps, e.g.
> 
> [...]
> 
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index aca43c284203..607a684642e5 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> 
> [...]
> 
> > @@ -586,6 +597,9 @@ struct bpf_insn_aux_data {
> >  	u8 fastcall_spills_num:3;
> >  	u8 arg_prog:4;
> >  
> > +	/* true if jt->off was allocated */
> > +	bool jt_allocated;
> > +
> 
> Nit: in clear_insn_aux_data() maybe just check if instruction is a gotox?

Yes, this should work, thanks

> 
> >  	/* below fields are initialized once */
> >  	unsigned int orig_idx; /* original instruction index */
> >  	bool jmp_point;
> 
> [...]
> 
> >  static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog)
> > diff --git a/kernel/bpf/bpf_insn_array.c b/kernel/bpf/bpf_insn_array.c
> > index 0c8dac62f457..4b945b7e31b8 100644
> > --- a/kernel/bpf/bpf_insn_array.c
> > +++ b/kernel/bpf/bpf_insn_array.c
> > @@ -1,7 +1,6 @@
> >  // SPDX-License-Identifier: GPL-2.0-only
> >  
> >  #include <linux/bpf.h>
> > -#include <linux/sort.h>
> 
> Nit: remove this include from patch #3?

sure, thanks!

> >  
> >  #define MAX_INSN_ARRAY_ENTRIES 256
> >  
> 
> [...]
> 
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 5c1e4e37d1f8..839260e62fa9 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> 
> [...]
> 
> > @@ -7620,6 +7644,19 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
> >  
> >  				regs[value_regno].type = SCALAR_VALUE;
> >  				__mark_reg_known(&regs[value_regno], val);
> > +			} else if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY) {
> > +				regs[value_regno].type = PTR_TO_INSN;
> > +				regs[value_regno].map_ptr = map;
> > +				regs[value_regno].off = reg->off;
> > +				regs[value_regno].umin_value = reg->umin_value;
> > +				regs[value_regno].umax_value = reg->umax_value;
> > +				regs[value_regno].smin_value = reg->smin_value;
> > +				regs[value_regno].smax_value = reg->smax_value;
> > +				regs[value_regno].s32_min_value = reg->s32_min_value;
> > +				regs[value_regno].s32_max_value = reg->s32_max_value;
> > +				regs[value_regno].u32_min_value = reg->u32_min_value;
> > +				regs[value_regno].u32_max_value = reg->u32_max_value;
> > +				regs[value_regno].var_off = reg->var_off;
> 
> This can be shortened to:
> 
>   copy_register_state(regs + value_regno, reg);
>   regs[value_regno].type = PTR_TO_INSN;
> 
> I think that a check that read is u64 wide is necessary here.
> Otherwise e.g. for u8 load you'd need to truncate the bounds set above.
> This is also necessary for alignment check at the beginning of this
> function (check_ptr_alignment() call).

will fix, thanks!

> >  			} else {
> >  				mark_reg_unknown(env, regs, value_regno);
> >  			}
> 
> [...]
> 
> > @@ -14628,6 +14672,11 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
> >  		}
> >  		break;
> >  	case BPF_SUB:
> > +		if (ptr_to_insn_array) {
> > +			verbose(env, "Operation %s on ptr to instruction set map is prohibited\n",
> > +				bpf_alu_string[opcode >> 4]);
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                Nit: Just "subtraction", no need for lookup?
>                     Also, maybe put this near the same check for PTR_TO_STACK?

ok

> 
> > +			return -EACCES;
> > +		}
> >  		if (dst_reg == off_reg) {
> >  			/* scalar -= pointer.  Creates an unknown scalar */
> >  			verbose(env, "R%d tried to subtract pointer from scalar\n",
> 
> [...]
> 
> > @@ -17733,6 +17783,234 @@ static int mark_fastcall_patterns(struct bpf_verifier_env *env)
> >  	return 0;
> >  }
> >  
> > +#define SET_HIGH(STATE, LAST)	STATE = (STATE & 0xffffU) | ((LAST) << 16)
> > +#define GET_HIGH(STATE)		((u16)((STATE) >> 16))
> > +
> > +static int push_gotox_edge(int t, struct bpf_verifier_env *env, struct bpf_iarray *jt)
> > +{
> > +	int *insn_stack = env->cfg.insn_stack;
> > +	int *insn_state = env->cfg.insn_state;
> > +	u16 prev;
> > +	int w;
> > +
> > +	for (prev = GET_HIGH(insn_state[t]); prev < jt->off_cnt; prev++) {
> > +		w = jt->off[prev];
> > +
> > +		/* EXPLORED || DISCOVERED */
> > +		if (insn_state[w])
> > +			continue;
> 
> Suppose there is some other way to reach `w` beside gotox.
> Also suppose that `w` had been visited already.
> In such case `mark_jmp_point(env, w)` might get omitted for `w`.

thanks

> > +
> > +		break;
> > +	}
> > +
> > +	if (prev == jt->off_cnt)
> > +		return DONE_EXPLORING;
> > +
> > +	mark_prune_point(env, t);
> 
> Nit: do this from visit_gotox_insn() ?

yes, ok
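For reference, the SET_HIGH()/GET_HIGH() pair in the quoted hunk packs a resume cursor into the upper 16 bits of insn_state[], next to the DISCOVERED/EXPLORED flags in the lower bits. A standalone sketch of the same encoding:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Same encoding as in the patch: low 16 bits keep the DFS flags, high
 * 16 bits remember how many jump-table successors were already pushed.
 */
#define SET_HIGH(state, last)	((state) = ((state) & 0xffffU) | ((uint32_t)(last) << 16))
#define GET_HIGH(state)		((uint16_t)((state) >> 16))
```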

> > +
> > +	if (env->cfg.cur_stack >= env->prog->len)
> > +		return -E2BIG;
> > +	insn_stack[env->cfg.cur_stack++] = w;
> > +
> > +	mark_jmp_point(env, w);
> > +
> > +	SET_HIGH(insn_state[t], prev + 1);
> > +	return KEEP_EXPLORING;
> > +}
> 
> [...]
> 
> > +/*
> > + * Find and collect all maps which fit in the subprog. Return the result as one
> > + * combined jump table in jt->off (allocated with kvcalloc
>                                                            ^^^
> 						   nit: missing ')'
> 
> > + */
> > +static struct bpf_iarray *jt_from_subprog(struct bpf_verifier_env *env,
> > +					  int subprog_start, int subprog_end)
> 
> [...]
> 
> > +static struct bpf_iarray *
> > +create_jt(int t, struct bpf_verifier_env *env, int fd)
>                                                   ^^^^^^
> 			fd is unused, same for visit_gotox_insn()
> 
> [...]
> 
> > @@ -18716,6 +19001,10 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
> >  		return regs_exact(rold, rcur, idmap) && rold->frameno == rcur->frameno;
> >  	case PTR_TO_ARENA:
> >  		return true;
> > +	case PTR_TO_INSN:
> > +		/* is rcur a subset of rold? */
> > +		return (rcur->umin_value >= rold->umin_value &&
> > +			rcur->umax_value <= rold->umax_value);
> 
> I think this should be:
> 
>                  if (rold->off != rcur->off)
>                          return false;
>                  return range_within(old: rold, cur: rcur) &&
>                         tnum_in(a: rold->var_off, b: rcur->var_off);

ok, makes sense
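For readers following along: the suggested check combines bounds containment with tnum containment. Below is a sketch of tnum_in() mirroring the semantics of the kernel's kernel/bpf/tnum.c helper (the struct here is a local copy, not the kernel type):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the kernel's tnum: value holds the known bits, mask
 * marks the unknown ones. */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* Is every value representable by b also representable by a? */
static bool tnum_in(struct tnum a, struct tnum b)
{
	/* b must not be unknown where a is known */
	if (b.mask & ~a.mask)
		return false;
	/* compare the bits a knows about */
	b.value &= ~a.mask;
	return a.value == b.value;
}
```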

> >  	default:
> >  		return regs_exact(rold, rcur, idmap);
> >  	}
> > @@ -19862,6 +20151,102 @@ static int process_bpf_exit_full(struct bpf_verifier_env *env,
> >  	return PROCESS_BPF_EXIT;
> >  }
> >  
> > +static int indirect_jump_min_max_index(struct bpf_verifier_env *env,
> > +				       int regno,
> > +				       struct bpf_map *map,
> > +				       u32 *pmin_index, u32 *pmax_index)
> > +{
> > +	struct bpf_reg_state *reg = reg_state(env, regno);
> > +	u64 min_index, max_index;
> > +
> > +	if (check_add_overflow(reg->umin_value, reg->off, &min_index) ||
> > +		(min_index > (u64) U32_MAX * sizeof(long))) {
> > +		verbose(env, "the sum of R%u umin_value %llu and off %u is too big\n",
> > +			     regno, reg->umin_value, reg->off);
> > +		return -ERANGE;
> > +	}
> > +	if (check_add_overflow(reg->umax_value, reg->off, &max_index) ||
> > +		(max_index > (u64) U32_MAX * sizeof(long))) {
> > +		verbose(env, "the sum of R%u umax_value %llu and off %u is too big\n",
> > +			     regno, reg->umax_value, reg->off);
> > +		return -ERANGE;
> > +	}
> > +
> > +	min_index /= sizeof(long);
> > +	max_index /= sizeof(long);
> 
> Nit: `long` is 32-bit long on x86 (w/o -64), I understand that x86 jit
> would just reject gotox, but could you please use `sizeof(u64)` here?

Haven't checked, really, but will the jump table contain 8-byte records
on x86_32? I thought the entries are pointer-sized, which is why I used long.

Still, I can replace it with sizeof(u64), yes.
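A small sketch of the arithmetic in indirect_jump_min_max_index(), using 8-byte entries (sizeof(u64)) as discussed; the overflow checks from the patch are elided here for brevity:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Turn the verified byte range [umin+off, umax+off] of the pointer into
 * a range of jump table indices, with 8-byte table entries.
 */
static int jt_index_range(uint64_t umin, uint64_t umax, uint32_t off,
			  uint32_t max_entries, uint32_t *pmin, uint32_t *pmax)
{
	uint64_t min_index = (umin + off) / sizeof(uint64_t);
	uint64_t max_index = (umax + off) / sizeof(uint64_t);

	if (min_index >= max_entries || max_index >= max_entries)
		return -1; /* register points outside of the jump table */

	*pmin = min_index;
	*pmax = max_index;
	return 0;
}
```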

> > +
> > +	if (min_index >= map->max_entries || max_index >= map->max_entries) {
> > +		verbose(env, "R%u points to outside of jump table: [%llu,%llu] max_entries %u\n",
> > +			     regno, min_index, max_index, map->max_entries);
> > +		return -EINVAL;
> > +	}
> > +
> > +	*pmin_index = min_index;
> > +	*pmax_index = max_index;
> > +	return 0;
> > +}
> > +
> > +/* gotox *dst_reg */
> > +static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *insn)
> > +{
> > +	struct bpf_verifier_state *other_branch;
> > +	struct bpf_reg_state *dst_reg;
> > +	struct bpf_map *map;
> > +	u32 min_index, max_index;
> > +	int err = 0;
> > +	u32 *xoff;
> > +	int n;
> > +	int i;
> > +
> > +	dst_reg = reg_state(env, insn->dst_reg);
> > +	if (dst_reg->type != PTR_TO_INSN) {
> > +		verbose(env, "R%d has type %d, expected PTR_TO_INSN\n",
> > +			     insn->dst_reg, dst_reg->type);
> > +		return -EINVAL;
> > +	}
> > +
> > +	map = dst_reg->map_ptr;
> > +	if (verifier_bug_if(!map, env, "R%d has an empty map pointer", insn->dst_reg))
> > +		return -EFAULT;
> > +
> > +	if (verifier_bug_if(map->map_type != BPF_MAP_TYPE_INSN_ARRAY, env,
> > +			    "R%d has incorrect map type %d", insn->dst_reg, map->map_type))
> > +		return -EFAULT;
> > +
> > +	err = indirect_jump_min_max_index(env, insn->dst_reg, map, &min_index, &max_index);
> > +	if (err)
> > +		return err;
> > +
> > +	xoff = kvcalloc(max_index - min_index + 1, sizeof(u32), GFP_KERNEL_ACCOUNT);
> > +	if (!xoff)
> > +		return -ENOMEM;
> 
> Let's keep a buffer for this allocation in `env` and realloc it when needed.
> Would be good to avoid allocating memory each time this gotox is visited.

Ok (to put it in bpf_subprog_info as suggested in your next letter).
Though, probably it still needs to grow (= realloc).

> > +
> > +	n = copy_insn_array_uniq(map, min_index, max_index, xoff);
> > +	if (n < 0) {
> > +		err = n;
> > +		goto free_off;
> > +	}
> > +	if (n == 0) {
> > +		verbose(env, "register R%d doesn't point to any offset in map id=%d\n",
> > +			     insn->dst_reg, map->id);
> > +		err = -EINVAL;
> > +		goto free_off;
> > +	}
> > +
> > +	for (i = 0; i < n - 1; i++) {
> > +		other_branch = push_stack(env, xoff[i], env->insn_idx, false);
>                                                                        ^^^^^
>                          `is_speculative` has to be inherited from env->cur_state

Ah, yes, thanks

> > +		if (IS_ERR(other_branch)) {
> > +			err = PTR_ERR(other_branch);
> > +			goto free_off;
> > +		}
> > +	}
> > +	env->insn_idx = xoff[n-1];
> > +
> > +free_off:
> > +	kvfree(xoff);
> > +	return err;
> > +}
> > +
> >  static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
> >  {
> >  	int err;
> > @@ -19964,6 +20349,9 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
> >  
> >  			mark_reg_scratched(env, BPF_REG_0);
> >  		} else if (opcode == BPF_JA) {
> > +			if (BPF_SRC(insn->code) == BPF_X)
> > +				return check_indirect_jump(env, insn);
> > +
> 
> check_indirect_jump() does not check reserved fields (like offset or dst_reg).

Ok, thanks, will fix. Though maybe this check belongs in visit_gotox_insn(); why wait until here?

(just in case, should be s/dst_reg/src_reg in your comment)

> 
> >  			if (BPF_SRC(insn->code) != BPF_K ||
> >  			    insn->src_reg != BPF_REG_0 ||
> >  			    insn->dst_reg != BPF_REG_0 ||
> 
> [...]
> 
> > @@ -24215,23 +24625,41 @@ static bool can_jump(struct bpf_insn *insn)
> >  	return false;
> >  }
> >  
> > -static int insn_successors(struct bpf_prog *prog, u32 idx, u32 succ[2])
> > +/*
> > + * Returns an array of instructions succ, with succ->off[0], ...,
> > + * succ->off[n-1] with successor instructions, where n=succ->off_cnt
> > + */
> > +static struct bpf_iarray *
> > +insn_successors(struct bpf_verifier_env *env, u32 insn_idx)
> 
> Nit: maybe put insn_successors refactoring to a separate patch?

Yes, makes sense, will do. (In any case, this piece needs to be
carefully rebased on top of your recent changes.)

> >  {
> > -	struct bpf_insn *insn = &prog->insnsi[idx];
> > -	int i = 0, insn_sz;
> > +	struct bpf_prog *prog = env->prog;
> > +	struct bpf_insn *insn = &prog->insnsi[insn_idx];
> > +	struct bpf_iarray *succ;
> > +	int insn_sz;
> >  	u32 dst;
> >  
> > -	insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
> > -	if (can_fallthrough(insn) && idx + 1 < prog->len)
> > -		succ[i++] = idx + insn_sz;
> > +	if (unlikely(insn_is_gotox(insn))) {
> > +		succ = env->insn_aux_data[insn_idx].jt;
> > +		if (verifier_bug_if(!succ, env,
> > +				    "aux data for insn %u doesn't contain a jump table\n",
> > +				    insn_idx))
> > +			return ERR_PTR(-EFAULT);
> 
> Requiring each callsite to check error code for this function is very inconvenient.
> Moreover, insn_successors() is hot in liveness.c:update_instance().
> Let's just assume that NULL here cannot happen.

Hmm, ok. I will check and fix.

> > +	} else {
> > +		/* pre-allocated array of size up to 2; reset cnt, as it may be used already */
> > +		succ = env->succ;
> > +		succ->off_cnt = 0;
> >  
> > -	if (can_jump(insn)) {
> > -		dst = idx + jmp_offset(insn) + 1;
> > -		if (i == 0 || succ[0] != dst)
> > -			succ[i++] = dst;
> > -	}
> > +		insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
> > +		if (can_fallthrough(insn) && insn_idx + 1 < prog->len)
> > +			succ->off[succ->off_cnt++] = insn_idx + insn_sz;
> >  
> > -	return i;
> > +		if (can_jump(insn)) {
> > +			dst = insn_idx + jmp_offset(insn) + 1;
> > +			if (succ->off_cnt == 0 || succ->off[0] != dst)
> > +				succ->off[succ->off_cnt++] = dst;
> > +		}
> > +	}
> > +	return succ;
> >  }
> >
> 
> [...]
> 
> > @@ -24489,11 +24921,10 @@ static int compute_scc(struct bpf_verifier_env *env)
> >  	const u32 insn_cnt = env->prog->len;
> >  	int stack_sz, dfs_sz, err = 0;
> >  	u32 *stack, *pre, *low, *dfs;
> > -	u32 succ_cnt, i, j, t, w;
> > +	u32 i, j, t, w;
> >  	u32 next_preorder_num;
> >  	u32 next_scc_id;
> >  	bool assign_scc;
> > -	u32 succ[2];
> >  
> >  	next_preorder_num = 1;
> >  	next_scc_id = 1;
> > @@ -24592,6 +25023,8 @@ static int compute_scc(struct bpf_verifier_env *env)
> >  		dfs[0] = i;
> >  dfs_continue:
> >  		while (dfs_sz) {
> > +			struct bpf_iarray *succ;
> > +
> 
> Nit: please move this declaration up, just to be consistent with other variables.

Sure

> >  			w = dfs[dfs_sz - 1];
> >  			if (pre[w] == 0) {
> >  				low[w] = next_preorder_num;
> > @@ -24600,12 +25033,17 @@ static int compute_scc(struct bpf_verifier_env *env)
> >  				stack[stack_sz++] = w;
> >  			}
> >  			/* Visit 'w' successors */
> > -			succ_cnt = insn_successors(env->prog, w, succ);
> > -			for (j = 0; j < succ_cnt; ++j) {
> > -				if (pre[succ[j]]) {
> > -					low[w] = min(low[w], low[succ[j]]);
> > +			succ = insn_successors(env, w);
> > +			if (IS_ERR(succ)) {
> > +				err = PTR_ERR(succ);
> > +				goto exit;
> > +
> > +			}
> > +			for (j = 0; j < succ->off_cnt; ++j) {
> > +				if (pre[succ->off[j]]) {
> > +					low[w] = min(low[w], low[succ->off[j]]);
> >  				} else {
> > -					dfs[dfs_sz++] = succ[j];
> > +					dfs[dfs_sz++] = succ->off[j];
> >  					goto dfs_continue;
> >  				}
> >  			}
> 
> [...]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps
  2025-09-20 22:27     ` Eduard Zingerman
  2025-09-20 22:32       ` Eduard Zingerman
@ 2025-09-25 18:14       ` Anton Protopopov
  1 sibling, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-25 18:14 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Quentin Monnet, Yonghong Song

On 25/09/20 03:27PM, Eduard Zingerman wrote:
> On Fri, 2025-09-19 at 17:58 -0700, Eduard Zingerman wrote:
> > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > > Add selftests for indirect jumps. All the indirect jumps are
> > > generated from C switch statements, so, if compiled by a compiler
> > > which doesn't support indirect jumps, then should pass as well.
> > >
> > > Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> > > ---
> >
> > Patch #8 adds a lot of error conditions that are effectively untested
> > at the moment. I think we need to figure out a way to express gotox
> > tests in inline assembly, independent of clang version, and add a
> > bunch of correctness tests.
> >
> > [...]
> 
> Here is an example (I modified verifier_and.c; the patch should use
> some verifier_gotox.c, of course):
> 
>   #include <linux/bpf.h>
>   #include <bpf/bpf_helpers.h>
>   #include "bpf_misc.h"
>   #include "../../../include/linux/filter.h"
> 
>   SEC("socket")
>   __success
>   __retval(1)
>   __naked void jump_table1(void)
>   {
>   	asm volatile (
>   ".pushsection .jumptables,\"\",@progbits;\n"
>   "jt0_%=:\n"
>   	".quad ret0_%=;\n"
>   	".quad ret1_%=;\n"
>   ".size jt0_%=, 16;\n"
>   ".global jt0_%=;\n"
>   ".popsection;\n"
> 
>   	"r0 = jt0_%= ll;\n"
>   	"r0 += 8;\n"
>   	"r0 = *(u64 *)(r0 + 0);\n"
>   	".8byte %[gotox_r0];\n"
>   "ret0_%=:\n"
>   	"r0 = 0;\n"
>   	"exit;\n"
>   "ret1_%=:\n"
>   	"r0 = 1;\n"
>   	"exit;\n"
>   	:
>   	: __imm_insn(gotox_r0, BPF_RAW_INSN(BPF_JMP | BPF_JA | BPF_X, BPF_REG_0, 0, 0 , 0))
>   	: __clobber_all);
>   }
> 
>   char _license[] SEC("license") = "GPL";
> 
> It verifies and executes (with the fix for emit_indirect_jump() applied):
> 
>   VERIFIER LOG:
>   =============
>   func#0 @0
>   Live regs before insn:
>         0: .......... (18) r0 = 0xffff888108c66700
>         2: 0......... (07) r0 += 8
>         3: 0......... (79) r0 = *(u64 *)(r0 +0)
>         4: .......... (0d) gotox r0
>         5: .......... (b7) r0 = 0
>         6: 0......... (95) exit
>         7: .......... (b7) r0 = 1
>         8: 0......... (95) exit
>   Global function jump_table1() doesn't return scalar. Only those are supported.
>   0: R1=ctx() R10=fp0
>   ; asm volatile ( @ verifier_and.c:122
>   0: (18) r0 = 0xffff888108c66700       ; R0_w=map_value(map=jt,ks=4,vs=8)
>   2: (07) r0 += 8                       ; R0_w=map_value(map=jt,ks=4,vs=8,off=8)
>   3: (79) r0 = *(u64 *)(r0 +0)          ; R0_w=insn(off=8)
>   4: (0d) gotox r0
>   7: (b7) r0 = 1                        ; R0_w=1
>   8: (95) exit
>   processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>   =============
>   do_prog_test_run:PASS:bpf_prog_test_run 0 nsec
>   #488/1   verifier_and/jump_table1:OK
>   #488     verifier_and:OK
>   Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
> 
> This example can be mutated in various ways to check behaviour and
> error conditions.
> 
> Having such complete set of such tests, I'd only keep a few canary
> C-level tests.

Thanks a lot, I can use it for sure!

As for C-level tests, I want to keep a bunch of them in any
case to test libbpf operations.

(I also remember your request to extend compute_live_registers,
just didn't have time to get to it yet.)

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 08/13] bpf, x86: add support for indirect jumps
  2025-09-25 18:07     ` Anton Protopopov
@ 2025-09-29 14:10       ` Anton Protopopov
  0 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-29 14:10 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Quentin Monnet, Yonghong Song

On 25/09/25 06:07PM, Anton Protopopov wrote:
> On 25/09/19 05:28PM, Eduard Zingerman wrote:
> > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> [...]
> > > +static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *insn)
> > > +{
> > > +	struct bpf_verifier_state *other_branch;
> > > +	struct bpf_reg_state *dst_reg;
> > > +	struct bpf_map *map;
> > > +	u32 min_index, max_index;
> > > +	int err = 0;
> > > +	u32 *xoff;
> > > +	int n;
> > > +	int i;
> > > +
> > > +	dst_reg = reg_state(env, insn->dst_reg);
> > > +	if (dst_reg->type != PTR_TO_INSN) {
> > > +		verbose(env, "R%d has type %d, expected PTR_TO_INSN\n",
> > > +			     insn->dst_reg, dst_reg->type);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	map = dst_reg->map_ptr;
> > > +	if (verifier_bug_if(!map, env, "R%d has an empty map pointer", insn->dst_reg))
> > > +		return -EFAULT;
> > > +
> > > +	if (verifier_bug_if(map->map_type != BPF_MAP_TYPE_INSN_ARRAY, env,
> > > +			    "R%d has incorrect map type %d", insn->dst_reg, map->map_type))
> > > +		return -EFAULT;
> > > +
> > > +	err = indirect_jump_min_max_index(env, insn->dst_reg, map, &min_index, &max_index);
> > > +	if (err)
> > > +		return err;
> > > +
> > > +	xoff = kvcalloc(max_index - min_index + 1, sizeof(u32), GFP_KERNEL_ACCOUNT);
> > > +	if (!xoff)
> > > +		return -ENOMEM;
> > 
> > Let's keep a buffer for this allocation in `env` and realloc it when needed.
> > Would be good to avoid allocating memory each time this gotox is visited.
> 
> Ok (to put it in bpf_subprog_info as suggested in your next letter).
> Though, probably it still needs to grow (= realloc).

On second thought, this doesn't work in the general case: it is hard
to avoid copying/sorting. Every other request wants to do
sort(M[start, end]), and all three of M, start, and end are variables.
For now I will just add a buffer to avoid repeated allocations.
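For illustration, such a reusable buffer could look like the sketch below (user-space realloc() standing in for the kernel's kvrealloc(); names are hypothetical): one allocation kept in the env that only grows, instead of a kvcalloc()/kvfree() pair on every visit of a gotox instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Grow-only scratch buffer, reused across gotox visits. */
struct jt_scratch {
	uint32_t *buf;
	size_t cap; /* capacity, in elements */
};

static uint32_t *jt_scratch_get(struct jt_scratch *s, size_t n)
{
	uint32_t *tmp;

	if (n <= s->cap)
		return s->buf;

	tmp = realloc(s->buf, n * sizeof(*tmp));
	if (!tmp)
		return NULL; /* old buffer stays valid */

	s->buf = tmp;
	s->cap = n;
	return s->buf;
}
```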

> [...]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-22  9:28                   ` Anton Protopopov
@ 2025-09-30  9:07                     ` Anton Protopopov
  0 siblings, 0 replies; 46+ messages in thread
From: Anton Protopopov @ 2025-09-30  9:07 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Daniel Borkmann, Alexei Starovoitov, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Anton Protopopov, Quentin Monnet, Yonghong Song

On 25/09/22 09:28AM, Anton Protopopov wrote:
> On 25/09/19 01:47PM, Eduard Zingerman wrote:
> > On Fri, 2025-09-19 at 20:27 +0000, Anton Protopopov wrote:
> > > On 25/09/19 12:44PM, Eduard Zingerman wrote:
> > > > On Fri, 2025-09-19 at 21:28 +0200, Daniel Borkmann wrote:
> > > > > On 9/19/25 8:26 PM, Alexei Starovoitov wrote:
> > > > > > On Fri, Sep 19, 2025 at 12:12 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > > On Fri, 2025-09-19 at 07:05 +0000, Anton Protopopov wrote:
> > > > > > > > On 25/09/18 11:35PM, Eduard Zingerman wrote:
> > > > > > > > > On Thu, 2025-09-18 at 09:38 +0000, Anton Protopopov wrote:
> > > > > > > > > 
> > > > > > > > > [...]
> > > > > > > > > 
> > > > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > > > > > index a7ad4fe756da..5c1e4e37d1f8 100644
> > > > > > > > > > --- a/kernel/bpf/verifier.c
> > > > > > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > > > > > @@ -21578,6 +21578,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > > > > >    struct bpf_insn *insn;
> > > > > > > > > >    void *old_bpf_func;
> > > > > > > > > >    int err, num_exentries;
> > > > > > > > > > + int old_len, subprog_start_adjustment = 0;
> > > > > > > > > > 
> > > > > > > > > >    if (env->subprog_cnt <= 1)
> > > > > > > > > >            return 0;
> > > > > > > > > > @@ -21652,7 +21653,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > > > > >            func[i]->aux->func_idx = i;
> > > > > > > > > >            /* Below members will be freed only at prog->aux */
> > > > > > > > > >            func[i]->aux->btf = prog->aux->btf;
> > > > > > > > > > -         func[i]->aux->subprog_start = subprog_start;
> > > > > > > > > > +         func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
> > > > > > > > > >            func[i]->aux->func_info = prog->aux->func_info;
> > > > > > > > > >            func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
> > > > > > > > > >            func[i]->aux->poke_tab = prog->aux->poke_tab;
> > > > > > > > > > @@ -21705,7 +21706,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> > > > > > > > > >            func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
> > > > > > > > > >            if (!i)
> > > > > > > > > >                    func[i]->aux->exception_boundary = env->seen_exception;
> > > > > > > > > > +
> > > > > > > > > > +         /*
> > > > > > > > > > +          * To properly pass the absolute subprog start to jit
> > > > > > > > > > +          * all instruction adjustments should be accumulated
> > > > > > > > > > +          */
> > > > > > > > > > +         old_len = func[i]->len;
> > > > > > > > > >            func[i] = bpf_int_jit_compile(func[i]);
> > > > > > > > > > +         subprog_start_adjustment += func[i]->len - old_len;
> > > > > > > > > > +
> > > > > > > > > >            if (!func[i]->jited) {
> > > > > > > > > >                    err = -ENOTSUPP;
> > > > > > > > > >                    goto out_free;
> > > > > > > > > 
> > > > > > > > > This change makes sense, however, would it be possible to move
> > > > > > > > > bpf_jit_blind_constants() out from jit to verifier.c:do_check,
> > > > > > > > > somewhere after do_misc_fixups?
> > > > > > > > > Looking at the source code, bpf_jit_blind_constants() is the first
> > > > > > > > > thing any bpf_int_jit_compile() does.
> > > > > > > > > Another alternative is to add an adjust_subprog_starts() call to this
> > > > > > > > > function. Wdyt?
> > > > > > > > 
> > > > > > > > Yes, it makes total sense. Blinding was added to the x86 jit initially, and
> > > > > > > > then every other jit copy-pasted it. I was considering moving blinding up
> > > > > > > > some time ago (see https://lore.kernel.org/bpf/20250318143318.656785-1-aspsk@isovalent.com/),
> > > > > > > > but then decided against it, as it requires patching every JIT, and I am
> > > > > > > > not sure how to test such a change (any hints?)
> > > > > > > 
> > > > > > > We have the following covered by CI:
> > > > > > > - arch/x86/net/bpf_jit_comp.c
> > > > > > > - arch/s390/net/bpf_jit_comp.c
> > > > > > > - arch/arm64/net/bpf_jit_comp.c
> > > > > > > 
> > > > > > > People work on these jits actively:
> > > > > > > - arch/riscv/net/bpf_jit_core.c
> > > > > > > - arch/loongarch/net/bpf_jit.c
> > > > > > > - arch/powerpc/net/bpf_jit_comp.c
> > > > > > > 
> > > > > > > So, we can probably ask to test the patch-set.
> > > > > > > 
> > > > > > > The remaining are:
> > > > > > > - arch/x86/net/bpf_jit_comp32.c
> > > > > > > - arch/parisc/net/bpf_jit_core.c
> > > > > > > - arch/mips/net/bpf_jit_comp.c
> > > > > > > - arch/arm/net/bpf_jit_32.c
> > > > > > > - arch/sparc/net/bpf_jit_comp_64.c
> > > > > > > - arch/arc/net/bpf_jit_core.c
> > > > > > > 
> > > > > > > The change to each individual jit is not complicated, just removing
> > > > > > > the transformation call. Idk, I'd just go for it.
> > > > > > > Maybe Alexei has concerns?
> > > > > > 
> > > > > > No concerns.
> > > > > > I don't remember why JIT calls it instead of the verifier.
> > > > > > 
> > > > > > Daniel,
> > > > > > do you recall? Any concern?
> > > > > 
> > > > > Hm, I think we did this in the JIT back then for a couple of reasons, iirc:
> > > > > constant blinding needs to work for native bpf(2) as well as for
> > > > > cbpf->ebpf (seccomp-bpf, socket filters, etc), so the JIT was a natural
> > > > > place to capture them all, and to fall back to the interpreter with the
> > > > > non-blinded BPF insns when something went wrong during the blinding or JIT
> > > > > process (e.g. the JIT hits some internal limits, etc). Moving
> > > > > bpf_jit_blind_constants() out of the JIT into verifier.c:do_check() means
> > > > > constant blinding of cbpf->ebpf programs is not covered anymore (and in
> > > > > that case it's reachable from unpriv).
> > > > 
> > > > Hi Daniel,
> > > > 
> > > > Thank you for the context.
> > > > So, the ideal location for bpf_jit_blind_constants() would be in
> > > > core.c in some wrapper function for bpf_int_jit_compile():
> > > > 
> > > >   static struct bpf_prog *jit_compile(struct bpf_prog *prog)
> > > >   {
> > > >   	struct bpf_prog *tmp;
> > > > 
> > > >   	tmp = bpf_jit_blind_constants(prog);
> > > >   	if (IS_ERR(tmp))
> > > >   		return prog;
> > > >   	return bpf_int_jit_compile(tmp);
> > > >   }
> > > > 
> > > > A bit of a hassle.
> > > > 
> > > > Anton, wdyt about a second option: adding adjust_subprog_starts()
> > > > to bpf_jit_blind_constants() and leaving all the rest as-is?
> > > > It would have to happen either way if the call to
> > > > bpf_jit_blind_constants() itself is moved.
> > > 
> > > So, to be clear, in this case adjust_insn_arrays() stays as in the
> > > original patch, but the "subprog_start_adjustment" chunks are
> > > replaced by a call to adjust_subprog_starts() (for better
> > > readability and consistency, right?)
> > 
> > Yes, by adding an adjust_subprog_starts() call inside
> > bpf_jit_blind_constants() it should be possible to read
> > env->subprog_info[*].start in the jit_subprogs() loop directly,
> > w/o tracking the subprog_start_adjustment delta.
> > (At least, I think this should work.)
> 
> Ok, will do this way, thanks.

Actually, I think I will skip it this time. During jit_subprogs()
the code of the original program is split into subfuncs via the
_unchanged_ subprog info, as the xlated code is copied for each
new subprog in the loop. So this "adjustment" thing will appear
in some form in any case.

Also, calling adjust_subprog_starts() requires passing env to the
jits, which isn't done yet, and env would need to be faked for
non-eBPF progs, I think. So it is probably better to clean up and
generalize this later, not as part of this patch.

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2025-09-30  9:02 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-09-18  9:38 [PATCH v3 bpf-next 00/13] BPF indirect jumps Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 01/13] bpf: fix the return value of push_stack Anton Protopopov
2025-09-19  0:17   ` Eduard Zingerman
2025-09-19  7:18     ` Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 02/13] bpf: save the start of functions in bpf_prog_aux Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 03/13] bpf, x86: add new map type: instructions array Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 04/13] selftests/bpf: add selftests for new insn_array map Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 05/13] bpf: support instructions arrays with constants blinding Anton Protopopov
2025-09-19  6:35   ` Eduard Zingerman
2025-09-19  7:05     ` Anton Protopopov
2025-09-19  7:12       ` Eduard Zingerman
2025-09-19 18:26         ` Alexei Starovoitov
2025-09-19 19:28           ` Daniel Borkmann
2025-09-19 19:44             ` Eduard Zingerman
2025-09-19 20:27               ` Anton Protopopov
2025-09-19 20:47                 ` Eduard Zingerman
2025-09-22  9:28                   ` Anton Protopopov
2025-09-30  9:07                     ` Anton Protopopov
2025-09-19 21:41               ` Daniel Borkmann
2025-09-18  9:38 ` [PATCH v3 bpf-next 06/13] selftests/bpf: test instructions arrays with blinding Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15 Anton Protopopov
2025-09-19 18:25   ` Eduard Zingerman
2025-09-19 18:38     ` Eduard Zingerman
2025-09-19 19:25       ` Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 08/13] bpf, x86: add support for indirect jumps Anton Protopopov
2025-09-20  0:28   ` Eduard Zingerman
2025-09-21 19:12     ` Eduard Zingerman
2025-09-25 18:07     ` Anton Protopopov
2025-09-29 14:10       ` Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 09/13] bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code Anton Protopopov
2025-09-19 23:18   ` Andrii Nakryiko
2025-09-18  9:38 ` [PATCH v3 bpf-next 11/13] libbpf: support llvm-generated indirect jumps Anton Protopopov
2025-09-19 23:18   ` Andrii Nakryiko
2025-09-22 10:13     ` Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 12/13] bpftool: Recognize insn_array map type Anton Protopopov
2025-09-18  9:38 ` [PATCH v3 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps Anton Protopopov
2025-09-20  0:58   ` Eduard Zingerman
2025-09-20 22:27     ` Eduard Zingerman
2025-09-20 22:32       ` Eduard Zingerman
2025-09-25 18:14       ` Anton Protopopov
2025-09-19  6:46 ` [PATCH v3 bpf-next 00/13] BPF " Eduard Zingerman
2025-09-19 14:57   ` Anton Protopopov
2025-09-19 16:49     ` Eduard Zingerman
2025-09-19 17:27   ` Eduard Zingerman
2025-09-19 18:03     ` Eduard Zingerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox