[PATCH v2 bpf-next 00/13] BPF indirect jumps

* [PATCH v2 bpf-next 00/13] BPF indirect jumps
@ 2025-09-13 19:39 Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 01/13] bpf: fix the return value of push_stack Anton Protopopov
                   ` (12 more replies)
  0 siblings, 13 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

This patchset implements a new type of map, the instruction array, and
uses it to build support for indirect branches in BPF (on x86). (The
same map will later be used to provide support for indirect calls and
static keys.) See [1], [2] for more context.

Short table of contents:

  * Patches 1-6 implement the new map of type
    BPF_MAP_TYPE_INSN_ARRAY and corresponding selftests. This map can
    be used to track the "original -> xlated -> jitted mapping" for
    a given program. Patches 5,6 add support for the "blinded" variant.

  * Patches 7,8,9 implement support for indirect jumps.

  * Patches 10-13 add support for LLVM-compiled programs containing
    indirect jumps.

A special LLVM build is needed for this; see [3] for the details and
some related discussions. Because of this, selftests for indirect
jumps which directly use `goto *rX` are commented out (so that
CI can run). To compensate, I've run test_progs compiled with
indirect jumps as described in [4] (in brief, all tests which
normally pass on my setup also pass with indirect jumps).
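
For readers unfamiliar with the construct, here is a minimal sketch of the
kind of C code that leads to an indirect jump. It uses the GNU C "labels as
values" extension (accepted by both GCC and Clang); the function and opcode
names are invented for illustration, and whether a given compiler lowers
this to a jump table plus `goto *rX` for the BPF target depends on the LLVM
build discussed in [3].

```c
#include <assert.h>

/* Interpret a tiny opcode stream by dispatching through a table of labels. */
static int run(const unsigned char *ops, int n)
{
	/* GNU C "labels as values": each entry is the address of a label. */
	static void *const table[] = { &&op_inc, &&op_dbl, &&op_neg };
	int acc = 0;
	int i = 0;

next:
	/* Stop at the end of the stream or on an unknown opcode. */
	if (i >= n || ops[i] > 2)
		return acc;
	goto *table[ops[i++]];	/* the indirect jump */

op_inc:
	acc += 1;
	goto next;
op_dbl:
	acc *= 2;
	goto next;
op_neg:
	acc = -acc;
	goto next;
}
```

The bounds check before the `goto *` matters: the target index must be
provably inside the table, which is analogous to what the verifier has to
establish for a gotox instruction.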

There is a list of TBDs (mostly, more selftests), but the list of
changes looks big enough to send the v2.

See the individual patches for implementation details.

v1 -> v2:

  * push_stack changes:
    * sanitize_speculative_path should just return int (Eduard)
    * return code from sanitize_speculative_path, not EFAULT (Eduard)
    * when BPF_COMPLEXITY_LIMIT_JMP_SEQ is reached, return E2BIG (Eduard)

  * indirect jumps:
    * omit support for .imm=fd in gotox, as we're not using it for now (Eduard)
    * struct jt -> struct bpf_iarray (Eduard)
    * insn_successors: rewrite the interface to just return a pointer (Eduard)
    * remove min_index/max_index, use umin_value/umax_value instead (Alexei, Eduard)
    * move emit_indirect_jump args change to the previous patch (Eduard)
    * add a comment to map_mem_size() (Eduard)
    * use verifier_bug for some error cases in check_indirect_jump (Eduard)
    * clear_insn_aux_data: use start,len instead of start,end (Eduard)
    * make regs[insn->dst_reg].type = PTR_TO_INSN part of check_mem_access (Eduard)

  * constant blinding changes:
    * make subprog_start adjustment better readable (Eduard)
    * do not set subprog len, it is already set (Eduard)

  * libbpf:
    * remove check that relocations from .rodata are ok (Anton)
    * do not freeze the map, it is not necessary anymore (Anton)
    * rename the goto_x -> gotox everywhere (Anton)
    * use u64 when parsing LLVM jump tables (Eduard)
    * split patch in two due to spaces->tabs change (Eduard)
    * split bpftool changes to bpftool patch (Andrii)
    * make sym_size a union with ext_idx (Andrii)
    * properly copy/free the jumptables_data section from elf (Andrii)
    * a few cosmetic changes around create_jt_map (Andrii)
    * fix some comments + rewrite patch description (Andrii)
    * inline bpf_prog__append_subprog_offsets (Andrii)
    * subprog_sec_offst -> subprog_sec_off (Andrii)
    * !strcmp -> strcmp() == 0 (Andrii)
    * make some function names more readable (Andrii)
    * allocate table of subfunc offsets via libbpf_reallocarray (Andrii)

  * selftests:
    * squash insn_array* tests together (Anton)

  * fixed build warnings (kernel test robot)

RFC -> v1:

  * I've tried to address all the comments provided by Alexei and
    Eduard in RFC. Will try to list the most important of them below.
  * One big change: move from older LLVM version [5] to newer [4].
    Now LLVM generates jump tables as symbols in the new special
    section ".jumptables". Another part of this change is that
    libbpf now doesn't try to link map load and goto *rX, as
    1) this is absolutely not reliable 2) for some use cases this
    is impossible (namely, when more than one jump table can be used
    in the same gotox instruction).
  * Added insn_successors() support (Alexei, Eduard). This includes
    getting rid of the ugly bpf_insn_set_iter_xlated_offset()
    interface (Eduard).
  * Removed hack for the unreachable instruction, as new LLVM, thanks to
    Eduard, doesn't generate it.
  * Set mem_size for direct map access properly instead of hacking.
    Remove off>0 check. (Alexei)
  * Do not allocate new memory for min_index/max_index (Alexei, Eduard)
  * Information required during check_cfg is now cached to be reused
    later (Alexei + general logic for supporting multiple JT per jump)
  * Properly compare registers in regsafe (Alexei, Eduard)
  * Remove support for JMP32 (Eduard)
  * Better checks in adjust_ptr_min_max_vals (Eduard)
  * More selftests were added (but still there's room for more) which
    directly use gotox (Alexei)
  * More checks and verbose messages added
  * "unique pointers" are no longer in the map

Links:
  1. https://lpc.events/event/18/contributions/1941/
  2. https://lwn.net/Articles/1017439/
  3. https://github.com/llvm/llvm-project/pull/149715
  4. https://github.com/llvm/llvm-project/pull/149715#issuecomment-3274833753
  5. v1: https://lore.kernel.org/bpf/20250816180631.952085-1-a.s.protopopov@gmail.com/
  6. rfc: https://lore.kernel.org/bpf/20250615085943.3871208-1-a.s.protopopov@gmail.com/

Anton Protopopov (13):
  bpf: fix the return value of push_stack
  bpf: save the start of functions in bpf_prog_aux
  bpf, x86: add new map type: instructions array
  selftests/bpf: add selftests for new insn_array map
  bpf: support instructions arrays with constants blinding
  selftests/bpf: test instructions arrays with blinding
  bpf, x86: allow indirect jumps to r8...r15
  bpf, x86: add support for indirect jumps
  bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X
  libbpf: fix formatting of bpf_object__append_subprog_code
  libbpf: support llvm-generated indirect jumps
  bpftool: Recognize insn_array map type
  selftests/bpf: add selftests for indirect jumps

 arch/x86/net/bpf_jit_comp.c                   |  39 +-
 include/linux/bpf.h                           |  30 +
 include/linux/bpf_types.h                     |   1 +
 include/linux/bpf_verifier.h                  |  17 +
 include/uapi/linux/bpf.h                      |  11 +
 kernel/bpf/Makefile                           |   2 +-
 kernel/bpf/bpf_insn_array.c                   | 350 ++++++++++
 kernel/bpf/core.c                             |  21 +
 kernel/bpf/disasm.c                           |   9 +
 kernel/bpf/log.c                              |   1 +
 kernel/bpf/syscall.c                          |  22 +
 kernel/bpf/verifier.c                         | 646 ++++++++++++++++--
 .../bpf/bpftool/Documentation/bpftool-map.rst |   2 +-
 tools/bpf/bpftool/map.c                       |   2 +-
 tools/include/uapi/linux/bpf.h                |  11 +
 tools/lib/bpf/libbpf.c                        | 192 +++++-
 tools/lib/bpf/libbpf_probes.c                 |   4 +
 tools/lib/bpf/linker.c                        |  10 +-
 tools/testing/selftests/bpf/Makefile          |   4 +-
 .../selftests/bpf/prog_tests/bpf_gotox.c      | 132 ++++
 .../selftests/bpf/prog_tests/bpf_insn_array.c | 497 ++++++++++++++
 tools/testing/selftests/bpf/progs/bpf_gotox.c | 384 +++++++++++
 22 files changed, 2277 insertions(+), 110 deletions(-)
 create mode 100644 kernel/bpf/bpf_insn_array.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_gotox.c

-- 
2.34.1


* [PATCH v2 bpf-next 01/13] bpf: fix the return value of push_stack
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 02/13] bpf: save the start of functions in bpf_prog_aux Anton Protopopov
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

In [1] Eduard mentioned that on push_stack failure the verifier code
should return -ENOMEM instead of -EFAULT. After checking the other
call sites I found that the code inconsistently returns either -ENOMEM
or -EFAULT. This patch unifies the return values for push_stack
(and the similar push_async_cb) so that error codes are always
assigned properly.
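
As an illustration of the convention this patch switches to, the sketch
below re-implements the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() idea in plain
userspace C. The lower-case helpers, `push_state()`, `explore()` and
`DEPTH_LIMIT` are hypothetical stand-ins, not the verifier's code; the point
is only that the callee encodes the precise errno in the returned pointer
and the caller forwards it instead of guessing one.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline int is_err(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

struct state {
	int depth;
};

/* Hypothetical stand-in for BPF_COMPLEXITY_LIMIT_JMP_SEQ. */
#define DEPTH_LIMIT 4

/* The callee reports *why* it failed, not just that it failed. */
static struct state *push_state(int depth)
{
	struct state *st;

	if (depth > DEPTH_LIMIT)
		return err_ptr(-E2BIG);

	st = calloc(1, sizeof(*st));
	if (!st)
		return err_ptr(-ENOMEM);

	st->depth = depth;
	return st;
}

/* The caller forwards the callee's error code unchanged. */
static int explore(int depth)
{
	struct state *st = push_state(depth);

	if (is_err(st))
		return (int)ptr_err(st);

	free(st);
	return 0;
}
```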

  [1] https://lore.kernel.org/bpf/20250615085943.3871208-1-a.s.protopopov@gmail.com

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 kernel/bpf/verifier.c | 80 +++++++++++++++++++++----------------------
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 17fe623400a5..5b4d28048b19 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2105,7 +2105,7 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 
 	elem = kzalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL_ACCOUNT);
 	if (!elem)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	elem->insn_idx = insn_idx;
 	elem->prev_insn_idx = prev_insn_idx;
@@ -2115,12 +2115,12 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 	env->stack_size++;
 	err = copy_verifier_state(&elem->st, cur);
 	if (err)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 	elem->st.speculative |= speculative;
 	if (env->stack_size > BPF_COMPLEXITY_LIMIT_JMP_SEQ) {
 		verbose(env, "The sequence of %d jumps is too complex.\n",
 			env->stack_size);
-		return NULL;
+		return ERR_PTR(-E2BIG);
 	}
 	if (elem->st.parent) {
 		++elem->st.parent->branches;
@@ -2917,7 +2917,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 
 	elem = kzalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL_ACCOUNT);
 	if (!elem)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	elem->insn_idx = insn_idx;
 	elem->prev_insn_idx = prev_insn_idx;
@@ -2929,7 +2929,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 		verbose(env,
 			"The sequence of %d jumps is too complex for async cb.\n",
 			env->stack_size);
-		return NULL;
+		return ERR_PTR(-E2BIG);
 	}
 	/* Unlike push_stack() do not copy_verifier_state().
 	 * The caller state doesn't matter.
@@ -2940,7 +2940,7 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 	elem->st.in_sleepable = is_sleepable;
 	frame = kzalloc(sizeof(*frame), GFP_KERNEL_ACCOUNT);
 	if (!frame)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 	init_func_state(env, frame,
 			BPF_MAIN_FUNC /* callsite */,
 			0 /* frameno within this callchain */,
@@ -9055,8 +9055,8 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
 		prev_st = find_prev_entry(env, cur_st->parent, insn_idx);
 		/* branch out active iter state */
 		queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
-		if (!queued_st)
-			return -ENOMEM;
+		if (IS_ERR(queued_st))
+			return PTR_ERR(queued_st);
 
 		queued_iter = get_iter_from_state(queued_st, meta);
 		queued_iter->iter.state = BPF_ITER_STATE_ACTIVE;
@@ -10626,8 +10626,8 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
 		async_cb = push_async_cb(env, env->subprog_info[subprog].start,
 					 insn_idx, subprog,
 					 is_bpf_wq_set_callback_impl_kfunc(insn->imm));
-		if (!async_cb)
-			return -EFAULT;
+		if (IS_ERR(async_cb))
+			return PTR_ERR(async_cb);
 		callee = async_cb->frame[0];
 		callee->async_entry_cnt = caller->async_entry_cnt + 1;
 
@@ -10643,8 +10643,8 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
 	 * proceed with next instruction within current frame.
 	 */
 	callback_state = push_stack(env, env->subprog_info[subprog].start, insn_idx, false);
-	if (!callback_state)
-		return -ENOMEM;
+	if (IS_ERR(callback_state))
+		return PTR_ERR(callback_state);
 
 	err = setup_func_entry(env, subprog, insn_idx, set_callee_state_cb,
 			       callback_state);
@@ -13793,9 +13793,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		struct bpf_reg_state *regs;
 
 		branch = push_stack(env, env->insn_idx + 1, env->insn_idx, false);
-		if (!branch) {
+		if (IS_ERR(branch)) {
 			verbose(env, "failed to push state for failed lock acquisition\n");
-			return -ENOMEM;
+			return PTR_ERR(branch);
 		}
 
 		regs = branch->frame[branch->curframe]->regs;
@@ -14223,16 +14223,15 @@ struct bpf_sanitize_info {
 	bool mask_to_left;
 };
 
-static struct bpf_verifier_state *
-sanitize_speculative_path(struct bpf_verifier_env *env,
-			  const struct bpf_insn *insn,
-			  u32 next_idx, u32 curr_idx)
+static int sanitize_speculative_path(struct bpf_verifier_env *env,
+				     const struct bpf_insn *insn,
+				     u32 next_idx, u32 curr_idx)
 {
 	struct bpf_verifier_state *branch;
 	struct bpf_reg_state *regs;
 
 	branch = push_stack(env, next_idx, curr_idx, true);
-	if (branch && insn) {
+	if (!IS_ERR(branch) && insn) {
 		regs = branch->frame[branch->curframe]->regs;
 		if (BPF_SRC(insn->code) == BPF_K) {
 			mark_reg_unknown(env, regs, insn->dst_reg);
@@ -14241,7 +14240,7 @@ sanitize_speculative_path(struct bpf_verifier_env *env,
 			mark_reg_unknown(env, regs, insn->src_reg);
 		}
 	}
-	return branch;
+	return IS_ERR(branch) ? PTR_ERR(branch) : 0;
 }
 
 static int sanitize_ptr_alu(struct bpf_verifier_env *env,
@@ -14260,7 +14259,6 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
 	u8 opcode = BPF_OP(insn->code);
 	u32 alu_state, alu_limit;
 	struct bpf_reg_state tmp;
-	bool ret;
 	int err;
 
 	if (can_skip_alu_sanitation(env, insn))
@@ -14333,11 +14331,12 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
 		tmp = *dst_reg;
 		copy_register_state(dst_reg, ptr_reg);
 	}
-	ret = sanitize_speculative_path(env, NULL, env->insn_idx + 1,
-					env->insn_idx);
-	if (!ptr_is_dst_reg && ret)
+	err = sanitize_speculative_path(env, NULL, env->insn_idx + 1, env->insn_idx);
+	if (err < 0)
+		return REASON_STACK;
+	if (!ptr_is_dst_reg)
 		*dst_reg = tmp;
-	return !ret ? REASON_STACK : 0;
+	return 0;
 }
 
 static void sanitize_mark_insn_seen(struct bpf_verifier_env *env)
@@ -16660,8 +16659,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 
 		/* branch out 'fallthrough' insn as a new state to explore */
 		queued_st = push_stack(env, idx + 1, idx, false);
-		if (!queued_st)
-			return -ENOMEM;
+		if (IS_ERR(queued_st))
+			return PTR_ERR(queued_st);
 
 		queued_st->may_goto_depth++;
 		if (prev_st)
@@ -16739,10 +16738,11 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		 * the fall-through branch for simulation under speculative
 		 * execution.
 		 */
-		if (!env->bypass_spec_v1 &&
-		    !sanitize_speculative_path(env, insn, *insn_idx + 1,
-					       *insn_idx))
-			return -EFAULT;
+		if (!env->bypass_spec_v1) {
+			err = sanitize_speculative_path(env, insn, *insn_idx + 1, *insn_idx);
+			if (err < 0)
+				return err;
+		}
 		if (env->log.level & BPF_LOG_LEVEL)
 			print_insn_state(env, this_branch, this_branch->curframe);
 		*insn_idx += insn->off;
@@ -16752,11 +16752,12 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		 * program will go. If needed, push the goto branch for
 		 * simulation under speculative execution.
 		 */
-		if (!env->bypass_spec_v1 &&
-		    !sanitize_speculative_path(env, insn,
-					       *insn_idx + insn->off + 1,
-					       *insn_idx))
-			return -EFAULT;
+		if (!env->bypass_spec_v1) {
+			err = sanitize_speculative_path(env, insn, *insn_idx + insn->off + 1,
+							*insn_idx);
+			if (err < 0)
+				return err;
+		}
 		if (env->log.level & BPF_LOG_LEVEL)
 			print_insn_state(env, this_branch, this_branch->curframe);
 		return 0;
@@ -16777,10 +16778,9 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 			return err;
 	}
 
-	other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx,
-				  false);
-	if (!other_branch)
-		return -EFAULT;
+	other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx, false);
+	if (IS_ERR(other_branch))
+		return PTR_ERR(other_branch);
 	other_branch_regs = other_branch->frame[other_branch->curframe]->regs;
 
 	if (BPF_SRC(insn->code) == BPF_X) {
-- 
2.34.1


* [PATCH v2 bpf-next 02/13] bpf: save the start of functions in bpf_prog_aux
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 01/13] bpf: fix the return value of push_stack Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array Anton Protopopov
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Introduce a new subprog_start field in bpf_prog_aux. This field may
be used by JIT compilers that need to know the real absolute xlated
offset of the function being jitted. The func_info[func_id] could have
served this purpose, but func_info may be NULL, so JIT compilers
can't rely on it.
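
The arithmetic this field enables is small; a sketch is given below. The
helper name is made up for illustration. It mirrors the expression
`subprog_start + i - 1` used in do_jit() in patch 03, where the loop index
`i` starts at 1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a 1-based instruction index inside a subprogram into an
 * absolute offset within the whole xlated program.
 */
static uint32_t abs_xlated_off(uint32_t subprog_start, uint32_t i)
{
	assert(i >= 1);	/* the JIT loop counts instructions from 1 */
	return subprog_start + i - 1;
}
```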

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 include/linux/bpf.h   | 1 +
 kernel/bpf/verifier.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 41f776071ff5..1056fd0d54d3 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1601,6 +1601,7 @@ struct bpf_prog_aux {
 	u32 ctx_arg_info_size;
 	u32 max_rdonly_access;
 	u32 max_rdwr_access;
+	u32 subprog_start;
 	struct btf *attach_btf;
 	struct bpf_ctx_arg_aux *ctx_arg_info;
 	void __percpu *priv_stack_ptr;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5b4d28048b19..14c0c6fe9416 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21597,6 +21597,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->func_idx = i;
 		/* Below members will be freed only at prog->aux */
 		func[i]->aux->btf = prog->aux->btf;
+		func[i]->aux->subprog_start = subprog_start;
 		func[i]->aux->func_info = prog->aux->func_info;
 		func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
 		func[i]->aux->poke_tab = prog->aux->poke_tab;
-- 
2.34.1


* [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 01/13] bpf: fix the return value of push_stack Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 02/13] bpf: save the start of functions in bpf_prog_aux Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-15  4:09   ` kernel test robot
  2025-09-20  0:30   ` Alexei Starovoitov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 04/13] selftests/bpf: add selftests for new insn_array map Anton Protopopov
                   ` (9 subsequent siblings)
  12 siblings, 2 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

On the bpf(BPF_PROG_LOAD) syscall, user-supplied BPF programs are
translated by the verifier into "xlated" BPF programs. During this
process the original instruction offsets might be adjusted and/or
individual instructions might be replaced by new sets of instructions,
or deleted.

Add a new BPF map type which is intended to keep track of how, for a
given program, the original instructions were relocated during
verification. Also, besides keeping track of the original -> xlated
mapping, make the x86 JIT build the xlated -> jitted mapping for every
instruction listed in an instruction array. This is required for every
future application of instruction arrays: static keys, indirect jumps
and indirect calls.

A map of the BPF_MAP_TYPE_INSN_ARRAY type must be created with u32
keys and 8-byte values. The values have different semantics for
userspace and for the BPF side. For userspace a value consists of two
u32 values: xlated and jitted offsets. For the BPF side the value is
a real pointer to a jitted instruction.
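
A sketch of the userspace view of a value follows. The struct is a local
mirror of struct bpf_insn_array_value from this patch's UAPI header, and
`check_value()` restates the two conditions insn_array_update_elem() rejects
with -EINVAL; `make_value()` and the local type name are illustrative
helpers, not part of any API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local mirror of struct bpf_insn_array_value (two u32 fields, 8 bytes). */
struct insn_array_value {
	uint32_t jitted_off;
	uint32_t xlated_off;
};

_Static_assert(sizeof(struct insn_array_value) == 8,
	       "map value_size must be 8");

#define INSN_DELETED ((uint32_t)-1)

/* Build the value userspace would write for one tracked instruction. */
static struct insn_array_value make_value(uint32_t insn_off)
{
	struct insn_array_value v = {
		.jitted_off = 0,	/* must be 0 on update */
		.xlated_off = insn_off,	/* original instruction offset */
	};

	return v;
}

/* The same conditions the kernel-side update path checks. */
static int check_value(const struct insn_array_value *v)
{
	if (v->jitted_off || v->xlated_off == INSN_DELETED)
		return -EINVAL;
	return 0;
}
```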

On map creation/initialization, before loading the program, each
element of the map should be initialized to point to an instruction
offset within the program. Such maps should be frozen before the
program is loaded. After the program verification xlated and jitted
offsets can be read via the bpf(2) syscall.

If a tracked instruction is removed by the verifier, then the xlated
offset is set to (u32)-1 which is considered to be too big for a valid
BPF program offset.
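
The effect of instruction removal on tracked offsets can be modelled as
below. This is a hypothetical userspace model of the behaviour described
above, not the body of bpf_insn_array_adjust_after_remove(), whose
implementation may differ in detail: offsets inside the removed range become
(u32)-1 and offsets past it move down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INSN_DELETED ((uint32_t)-1)

/*
 * Model: instructions [off, off + len) were removed from the program.
 * Update every tracked xlated offset accordingly.
 */
static void adjust_after_remove(uint32_t *xlated, size_t n,
				uint32_t off, uint32_t len)
{
	for (size_t i = 0; i < n; i++) {
		/* Already deleted, or located before the removed range. */
		if (xlated[i] == INSN_DELETED || xlated[i] < off)
			continue;

		if (xlated[i] < off + len)
			xlated[i] = INSN_DELETED;	/* the insn itself is gone */
		else
			xlated[i] -= len;		/* later insns move up */
	}
}
```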

One such map can be used to track one and only one BPF program. If
the verification process was unsuccessful, then the same map can be
re-used to verify the program with a different log level. However, if
the program was loaded successfully, then such a map, being frozen in
any case, can't be reused by other programs even after the program
release.
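
The lifecycle described above can be pictured as a three-state machine.
The state names echo the INSN_ARRAY_STATE_* enum in bpf_insn_array.c, but
the transition functions below are an assumption made for illustration;
the real bpf_insn_array_init/ready/release may differ.

```c
#include <assert.h>
#include <errno.h>

enum state { STATE_FREE = 0, STATE_INIT, STATE_READY };

struct tracked_map {
	enum state state;
};

/* A program load starts and claims the map. */
static int map_init(struct tracked_map *m)
{
	if (m->state != STATE_FREE)
		return -EBUSY;	/* already bound to a program */
	m->state = STATE_INIT;
	return 0;
}

/* Verification and JIT succeeded: the binding becomes permanent. */
static int map_ready(struct tracked_map *m)
{
	if (m->state != STATE_INIT)
		return -EINVAL;
	m->state = STATE_READY;
	return 0;
}

/* The load failed: hand the map back so that another attempt can use it. */
static void map_release(struct tracked_map *m)
{
	if (m->state == STATE_INIT)
		m->state = STATE_FREE;
}
```

In this model a failed load returns the map to the free state, while a
successful load leaves it in the ready state for good, matching the reuse
rules stated in the text.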

Example. Consider the following original and xlated programs:

    Original prog:                      Xlated prog:

     0:  r1 = 0x0                        0: r1 = 0
     1:  *(u32 *)(r10 - 0x4) = r1        1: *(u32 *)(r10 -4) = r1
     2:  r2 = r10                        2: r2 = r10
     3:  r2 += -0x4                      3: r2 += -4
     4:  r1 = 0x0 ll                     4: r1 = map[id:88]
     6:  call 0x1                        6: r1 += 272
                                         7: r0 = *(u32 *)(r2 +0)
                                         8: if r0 >= 0x1 goto pc+3
                                         9: r0 <<= 3
                                        10: r0 += r1
                                        11: goto pc+1
                                        12: r0 = 0
     7:  r6 = r0                        13: r6 = r0
     8:  if r6 == 0x0 goto +0x2         14: if r6 == 0x0 goto pc+4
     9:  call 0x76                      15: r0 = 0xffffffff8d2079c0
                                        17: r0 = *(u64 *)(r0 +0)
    10:  *(u64 *)(r6 + 0x0) = r0        18: *(u64 *)(r6 +0) = r0
    11:  r0 = 0x0                       19: r0 = 0x0
    12:  exit                           20: exit

An instruction array map, containing, e.g., instructions [0,4,7,12]
will be translated by the verifier to [0,4,13,20]. A map with
index 5 (the middle of a 16-byte instruction) or indexes greater than 12
(outside the program boundaries) would be rejected.
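
The [0,4,7,12] -> [0,4,13,20] translation can be replayed with a small
model. The `adjust()` helper below is a hypothetical simplification (one
instruction replaced by `len` instructions shifts everything after it by
`len - 1`); it is not the body of bpf_insn_array_adjust(). The two patch
points are read off the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model: the single instruction at xlated offset `off` was replaced by
 * `len` instructions, so every tracked offset after it moves by len - 1.
 */
static void adjust(uint32_t *xlated, size_t n, uint32_t off, uint32_t len)
{
	for (size_t i = 0; i < n; i++)
		if (xlated[i] > off)
			xlated[i] += len - 1;
}

/* Replay the two rewrites visible in the example listing. */
static void replay_example(uint32_t tracked[4])
{
	adjust(tracked, 4, 6, 7);	/* "call 0x1" at 6 becomes insns 6..12 */
	adjust(tracked, 4, 15, 3);	/* "call 0x76", now at 15, becomes 15..17 */
}
```

Note that the second rewrite is expressed in already-shifted coordinates:
the original instruction 9 sits at offset 15 once the first rewrite has
been applied.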

The functionality provided by this patch will be extended in subsequent
patches to implement BPF Static Keys, indirect jumps, and indirect calls.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 arch/x86/net/bpf_jit_comp.c    |   8 +
 include/linux/bpf.h            |  28 +++
 include/linux/bpf_types.h      |   1 +
 include/linux/bpf_verifier.h   |   2 +
 include/uapi/linux/bpf.h       |  11 ++
 kernel/bpf/Makefile            |   2 +-
 kernel/bpf/bpf_insn_array.c    | 336 +++++++++++++++++++++++++++++++++
 kernel/bpf/syscall.c           |  22 +++
 kernel/bpf/verifier.c          |  43 +++++
 tools/include/uapi/linux/bpf.h |  11 ++
 10 files changed, 463 insertions(+), 1 deletion(-)
 create mode 100644 kernel/bpf/bpf_insn_array.c

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 8d34a9400a5e..8792d7f371d3 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1664,6 +1664,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 	prog = temp;
 
 	for (i = 1; i <= insn_cnt; i++, insn++) {
+		u32 abs_xlated_off = bpf_prog->aux->subprog_start + i - 1;
 		const s32 imm32 = insn->imm;
 		u32 dst_reg = insn->dst_reg;
 		u32 src_reg = insn->src_reg;
@@ -2717,6 +2718,13 @@ st:			if (is_imm8(insn->off))
 				return -EFAULT;
 			}
 			memcpy(rw_image + proglen, temp, ilen);
+
+			/*
+			 * Instruction arrays need to know how xlated code
+			 * maps to jitted code
+			 */
+			bpf_prog_update_insn_ptr(bpf_prog, abs_xlated_off, proglen,
+						 image + proglen);
 		}
 		proglen += ilen;
 		addrs[i] = proglen;
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 1056fd0d54d3..77fcb508d6ae 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -3717,4 +3717,32 @@ int bpf_prog_get_file_line(struct bpf_prog *prog, unsigned long ip, const char *
 			   const char **linep, int *nump);
 struct bpf_prog *bpf_prog_find_from_stack(void);
 
+int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog);
+int bpf_insn_array_ready(struct bpf_map *map);
+void bpf_insn_array_release(struct bpf_map *map);
+void bpf_insn_array_adjust(struct bpf_map *map, u32 off, u32 len);
+void bpf_insn_array_adjust_after_remove(struct bpf_map *map, u32 off, u32 len);
+
+/*
+ * The struct bpf_insn_ptr structure describes a pointer to a
+ * particular instruction in a loaded BPF program. Initially
+ * it is initialised from userspace via user_value.xlated_off.
+ * During the program verification all other fields are populated
+ * accordingly:
+ *
+ *   jitted_ip:       address of the instruction in the jitted image
+ *   user_value:      user-visible xlated and jitted offsets
+ *   orig_xlated_off: original offset of the instruction
+ */
+struct bpf_insn_ptr {
+	void *jitted_ip;
+	struct bpf_insn_array_value user_value;
+	u32 orig_xlated_off;
+};
+
+void bpf_prog_update_insn_ptr(struct bpf_prog *prog,
+			      u32 xlated_off,
+			      u32 jitted_off,
+			      void *jitted_ip);
+
 #endif /* _LINUX_BPF_H */
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index fa78f49d4a9a..b13de31e163f 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -133,6 +133,7 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_RINGBUF, ringbuf_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_BLOOM_FILTER, bloom_filter_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_USER_RINGBUF, user_ringbuf_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARENA, arena_map_ops)
+BPF_MAP_TYPE(BPF_MAP_TYPE_INSN_ARRAY, insn_array_map_ops)
 
 BPF_LINK_TYPE(BPF_LINK_TYPE_RAW_TRACEPOINT, raw_tracepoint)
 BPF_LINK_TYPE(BPF_LINK_TYPE_TRACING, tracing)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 020de62bd09c..aca43c284203 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -766,8 +766,10 @@ struct bpf_verifier_env {
 	struct list_head free_list;	/* list of struct bpf_verifier_state_list */
 	struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */
 	struct btf_mod_pair used_btfs[MAX_USED_BTFS]; /* array of BTF's used by BPF program */
+	struct bpf_map *insn_array_maps[MAX_USED_MAPS]; /* array of INSN_ARRAY map's to be relocated */
 	u32 used_map_cnt;		/* number of used maps */
 	u32 used_btf_cnt;		/* number of used BTF objects */
+	u32 insn_array_map_cnt;		/* number of used maps of type BPF_MAP_TYPE_INSN_ARRAY */
 	u32 id_gen;			/* used to generate unique reg IDs */
 	u32 hidden_subprog_cnt;		/* number of hidden subprogs */
 	int exception_callback_subprog;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 233de8677382..021c27ee5591 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1026,6 +1026,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_USER_RINGBUF,
 	BPF_MAP_TYPE_CGRP_STORAGE,
 	BPF_MAP_TYPE_ARENA,
+	BPF_MAP_TYPE_INSN_ARRAY,
 	__MAX_BPF_MAP_TYPE
 };
 
@@ -7623,4 +7624,14 @@ enum bpf_kfunc_flags {
 	BPF_F_PAD_ZEROS = (1ULL << 0),
 };
 
+/*
+ * Values of a BPF_MAP_TYPE_INSN_ARRAY entry must be of this type.
+ * On updates jitted_off must be equal to 0.
+ */
+struct bpf_insn_array_value {
+	__u32 jitted_off;
+	__u32 xlated_off;
+};
+
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index f6cf8c2af5f7..e596b66a48e6 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -9,7 +9,7 @@ CFLAGS_core.o += -Wno-override-init $(cflags-nogcse-yy)
 obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o token.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
-obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
+obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o bpf_insn_array.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o
 obj-${CONFIG_BPF_LSM}	  += bpf_inode_storage.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o mprog.o
diff --git a/kernel/bpf/bpf_insn_array.c b/kernel/bpf/bpf_insn_array.c
new file mode 100644
index 000000000000..0c8dac62f457
--- /dev/null
+++ b/kernel/bpf/bpf_insn_array.c
@@ -0,0 +1,336 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/bpf.h>
+#include <linux/sort.h>
+
+#define MAX_INSN_ARRAY_ENTRIES 256
+
+struct bpf_insn_array {
+	struct bpf_map map;
+	struct mutex state_mutex;
+	int state;
+	long *ips;
+	DECLARE_FLEX_ARRAY(struct bpf_insn_ptr, ptrs);
+};
+
+enum {
+	INSN_ARRAY_STATE_FREE = 0,
+	INSN_ARRAY_STATE_INIT,
+	INSN_ARRAY_STATE_READY,
+};
+
+#define cast_insn_array(MAP_PTR) \
+	container_of(MAP_PTR, struct bpf_insn_array, map)
+
+#define INSN_DELETED ((u32)-1)
+
+static inline u32 insn_array_alloc_size(u32 max_entries)
+{
+	const u32 base_size = sizeof(struct bpf_insn_array);
+	const u32 entry_size = sizeof(struct bpf_insn_ptr);
+
+	return base_size + entry_size * max_entries;
+}
+
+static int insn_array_alloc_check(union bpf_attr *attr)
+{
+	if (attr->max_entries == 0 ||
+	    attr->key_size != 4 ||
+	    attr->value_size != 8 ||
+	    attr->map_flags != 0)
+		return -EINVAL;
+
+	if (attr->max_entries > MAX_INSN_ARRAY_ENTRIES)
+		return -E2BIG;
+
+	return 0;
+}
+
+static void insn_array_free(struct bpf_map *map)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+
+	kfree(insn_array->ips);
+	bpf_map_area_free(insn_array);
+}
+
+static struct bpf_map *insn_array_alloc(union bpf_attr *attr)
+{
+	u64 size = insn_array_alloc_size(attr->max_entries);
+	struct bpf_insn_array *insn_array;
+
+	insn_array = bpf_map_area_alloc(size, NUMA_NO_NODE);
+	if (!insn_array)
+		return ERR_PTR(-ENOMEM);
+
+	insn_array->ips = kcalloc(attr->max_entries, sizeof(long), GFP_KERNEL);
+	if (!insn_array->ips) {
+		insn_array_free(&insn_array->map);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	bpf_map_init_from_attr(&insn_array->map, attr);
+
+	mutex_init(&insn_array->state_mutex);
+	insn_array->state = INSN_ARRAY_STATE_FREE;
+
+	return &insn_array->map;
+}
+
+static int insn_array_get_next_key(struct bpf_map *map, void *key, void *next_key)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	u32 index = key ? *(u32 *)key : U32_MAX;
+	u32 *next = (u32 *)next_key;
+
+	if (index >= insn_array->map.max_entries) {
+		*next = 0;
+		return 0;
+	}
+
+	if (index == insn_array->map.max_entries - 1)
+		return -ENOENT;
+
+	*next = index + 1;
+	return 0;
+}
+
+static void *insn_array_lookup_elem(struct bpf_map *map, void *key)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	u32 index = *(u32 *)key;
+
+	if (unlikely(index >= insn_array->map.max_entries))
+		return NULL;
+
+	return &insn_array->ptrs[index].user_value;
+}
+
+static long insn_array_update_elem(struct bpf_map *map, void *key, void *value, u64 map_flags)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	u32 index = *(u32 *)key;
+	struct bpf_insn_array_value val = {};
+	int err = 0;
+
+	if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST))
+		return -EINVAL;
+
+	if (unlikely(index >= insn_array->map.max_entries))
+		return -E2BIG;
+
+	if (unlikely(map_flags & BPF_NOEXIST))
+		return -EEXIST;
+
+	/* No updates for maps in use */
+	if (!mutex_trylock(&insn_array->state_mutex))
+		return -EBUSY;
+
+	if (insn_array->state != INSN_ARRAY_STATE_FREE) {
+		err = -EBUSY;
+		goto unlock;
+	}
+
+	copy_map_value(map, &val, value);
+	if (val.jitted_off || val.xlated_off == INSN_DELETED) {
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	insn_array->ptrs[index].orig_xlated_off = val.xlated_off;
+	insn_array->ptrs[index].user_value.xlated_off = val.xlated_off;
+
+unlock:
+	mutex_unlock(&insn_array->state_mutex);
+	return err;
+}
+
+static long insn_array_delete_elem(struct bpf_map *map, void *key)
+{
+	return -EINVAL;
+}
+
+static int insn_array_check_btf(const struct bpf_map *map,
+			      const struct btf *btf,
+			      const struct btf_type *key_type,
+			      const struct btf_type *value_type)
+{
+	if (!btf_type_is_i32(key_type))
+		return -EINVAL;
+
+	if (!btf_type_is_i64(value_type))
+		return -EINVAL;
+
+	return 0;
+}
+
+static u64 insn_array_mem_usage(const struct bpf_map *map)
+{
+	u64 extra_size = 0;
+
+	extra_size += sizeof(long) * map->max_entries; /* insn_array->ips */
+
+	return insn_array_alloc_size(map->max_entries) + extra_size;
+}
+
+BTF_ID_LIST_SINGLE(insn_array_btf_ids, struct, bpf_insn_array)
+
+const struct bpf_map_ops insn_array_map_ops = {
+	.map_alloc_check = insn_array_alloc_check,
+	.map_alloc = insn_array_alloc,
+	.map_free = insn_array_free,
+	.map_get_next_key = insn_array_get_next_key,
+	.map_lookup_elem = insn_array_lookup_elem,
+	.map_update_elem = insn_array_update_elem,
+	.map_delete_elem = insn_array_delete_elem,
+	.map_check_btf = insn_array_check_btf,
+	.map_mem_usage = insn_array_mem_usage,
+	.map_btf_id = &insn_array_btf_ids[0],
+};
+
+static bool is_insn_array(const struct bpf_map *map)
+{
+	return map->map_type == BPF_MAP_TYPE_INSN_ARRAY;
+}
+
+static inline bool valid_offsets(const struct bpf_insn_array *insn_array,
+				 const struct bpf_prog *prog)
+{
+	u32 off;
+	int i;
+
+	for (i = 0; i < insn_array->map.max_entries; i++) {
+		off = insn_array->ptrs[i].orig_xlated_off;
+
+		if (off >= prog->len)
+			return false;
+
+		if (off > 0) {
+			if (prog->insnsi[off-1].code == (BPF_LD | BPF_DW | BPF_IMM))
+				return false;
+		}
+	}
+
+	return true;
+}
+
+int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	int i;
+
+	if (!valid_offsets(insn_array, prog))
+		return -EINVAL;
+
+	/*
+	 * There can be only one program using the map
+	 */
+	mutex_lock(&insn_array->state_mutex);
+	if (insn_array->state != INSN_ARRAY_STATE_FREE) {
+		mutex_unlock(&insn_array->state_mutex);
+		return -EBUSY;
+	}
+	insn_array->state = INSN_ARRAY_STATE_INIT;
+	mutex_unlock(&insn_array->state_mutex);
+
+	/*
+	 * Reset all the map indexes to the original values.  This is needed,
+	 * e.g., when a replay of verification with different log level should
+	 * be performed.
+	 */
+	for (i = 0; i < map->max_entries; i++)
+		insn_array->ptrs[i].user_value.xlated_off = insn_array->ptrs[i].orig_xlated_off;
+
+	return 0;
+}
+
+int bpf_insn_array_ready(struct bpf_map *map)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	guard(mutex)(&insn_array->state_mutex);
+	int i;
+
+	for (i = 0; i < map->max_entries; i++) {
+		if (insn_array->ptrs[i].user_value.xlated_off == INSN_DELETED)
+			continue;
+		if (!insn_array->ips[i]) {
+			/*
+			 * Set the map free on failure; the program owning it
+			 * might be re-loaded with different log level
+			 */
+			insn_array->state = INSN_ARRAY_STATE_FREE;
+			return -EFAULT;
+		}
+	}
+
+	insn_array->state = INSN_ARRAY_STATE_READY;
+	return 0;
+}
+
+void bpf_insn_array_release(struct bpf_map *map)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	guard(mutex)(&insn_array->state_mutex);
+
+	insn_array->state = INSN_ARRAY_STATE_FREE;
+}
+
+void bpf_insn_array_adjust(struct bpf_map *map, u32 off, u32 len)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	int i;
+
+	if (len <= 1)
+		return;
+
+	for (i = 0; i < map->max_entries; i++) {
+		if (insn_array->ptrs[i].user_value.xlated_off <= off)
+			continue;
+		if (insn_array->ptrs[i].user_value.xlated_off == INSN_DELETED)
+			continue;
+		insn_array->ptrs[i].user_value.xlated_off += len - 1;
+	}
+}
+
+void bpf_insn_array_adjust_after_remove(struct bpf_map *map, u32 off, u32 len)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+	int i;
+
+	for (i = 0; i < map->max_entries; i++) {
+		if (insn_array->ptrs[i].user_value.xlated_off < off)
+			continue;
+		if (insn_array->ptrs[i].user_value.xlated_off == INSN_DELETED)
+			continue;
+		if (insn_array->ptrs[i].user_value.xlated_off >= off &&
+		    insn_array->ptrs[i].user_value.xlated_off < off + len)
+			insn_array->ptrs[i].user_value.xlated_off = INSN_DELETED;
+		else
+			insn_array->ptrs[i].user_value.xlated_off -= len;
+	}
+}
+
+void bpf_prog_update_insn_ptr(struct bpf_prog *prog,
+			      u32 xlated_off,
+			      u32 jitted_off,
+			      void *jitted_ip)
+{
+	struct bpf_insn_array *insn_array;
+	struct bpf_map *map;
+	int i, j;
+
+	for (i = 0; i < prog->aux->used_map_cnt; i++) {
+		map = prog->aux->used_maps[i];
+		if (!is_insn_array(map))
+			continue;
+
+		insn_array = cast_insn_array(map);
+		for (j = 0; j < map->max_entries; j++) {
+			if (insn_array->ptrs[j].user_value.xlated_off == xlated_off) {
+				insn_array->ips[j] = (long)jitted_ip;
+				insn_array->ptrs[j].jitted_ip = jitted_ip;
+				insn_array->ptrs[j].user_value.jitted_off = jitted_off;
+			}
+		}
+	}
+}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 3f178a0f8eb1..7b4e7a053aa0 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1461,6 +1461,7 @@ static int map_create(union bpf_attr *attr, bool kernel)
 	case BPF_MAP_TYPE_STRUCT_OPS:
 	case BPF_MAP_TYPE_CPUMAP:
 	case BPF_MAP_TYPE_ARENA:
+	case BPF_MAP_TYPE_INSN_ARRAY:
 		if (!bpf_token_capable(token, CAP_BPF))
 			goto put_token;
 		break;
@@ -2761,6 +2762,23 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 	}
 }
 
+static int bpf_prog_mark_insn_arrays_ready(struct bpf_prog *prog)
+{
+	int err;
+	int i;
+
+	for (i = 0; i < prog->aux->used_map_cnt; i++) {
+		if (prog->aux->used_maps[i]->map_type != BPF_MAP_TYPE_INSN_ARRAY)
+			continue;
+
+		err = bpf_insn_array_ready(prog->aux->used_maps[i]);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 /* last field in 'union bpf_attr' used by this command */
 #define BPF_PROG_LOAD_LAST_FIELD fd_array_cnt
 
@@ -2984,6 +3002,10 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	if (err < 0)
 		goto free_used_maps;
 
+	err = bpf_prog_mark_insn_arrays_ready(prog);
+	if (err < 0)
+		goto free_used_maps;
+
 	err = bpf_prog_alloc_id(prog);
 	if (err)
 		goto free_used_maps;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 14c0c6fe9416..1f1708fd76c4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10093,6 +10093,8 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
 		    func_id != BPF_FUNC_map_push_elem)
 			goto error;
 		break;
+	case BPF_MAP_TYPE_INSN_ARRAY:
+		goto error;
 	default:
 		break;
 	}
@@ -20517,6 +20519,15 @@ static int __add_used_map(struct bpf_verifier_env *env, struct bpf_map *map)
 
 	env->used_maps[env->used_map_cnt++] = map;
 
+	if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY) {
+		err = bpf_insn_array_init(map, env->prog);
+		if (err) {
+			verbose(env, "Failed to properly initialize insn array\n");
+			return err;
+		}
+		env->insn_array_maps[env->insn_array_map_cnt++] = map;
+	}
+
 	return env->used_map_cnt - 1;
 }
 
@@ -20763,6 +20774,33 @@ static void adjust_subprog_starts(struct bpf_verifier_env *env, u32 off, u32 len
 	}
 }
 
+static void release_insn_arrays(struct bpf_verifier_env *env)
+{
+	int i;
+
+	for (i = 0; i < env->insn_array_map_cnt; i++)
+		bpf_insn_array_release(env->insn_array_maps[i]);
+}
+
+static void adjust_insn_arrays(struct bpf_verifier_env *env, u32 off, u32 len)
+{
+	int i;
+
+	if (len == 1)
+		return;
+
+	for (i = 0; i < env->insn_array_map_cnt; i++)
+		bpf_insn_array_adjust(env->insn_array_maps[i], off, len);
+}
+
+static void adjust_insn_arrays_after_remove(struct bpf_verifier_env *env, u32 off, u32 len)
+{
+	int i;
+
+	for (i = 0; i < env->insn_array_map_cnt; i++)
+		bpf_insn_array_adjust_after_remove(env->insn_array_maps[i], off, len);
+}
+
 static void adjust_poke_descs(struct bpf_prog *prog, u32 off, u32 len)
 {
 	struct bpf_jit_poke_descriptor *tab = prog->aux->poke_tab;
@@ -20805,6 +20843,7 @@ static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of
 	}
 	adjust_insn_aux_data(env, new_prog, off, len);
 	adjust_subprog_starts(env, off, len);
+	adjust_insn_arrays(env, off, len);
 	adjust_poke_descs(new_prog, off, len);
 	return new_prog;
 }
@@ -20988,6 +21027,8 @@ static int verifier_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt)
 	if (err)
 		return err;
 
+	adjust_insn_arrays_after_remove(env, off, cnt);
+
 	memmove(aux_data + off,	aux_data + off + cnt,
 		sizeof(*aux_data) * (orig_prog_len - off - cnt));
 
@@ -24836,6 +24877,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	adjust_btf_func(env);
 
 err_release_maps:
+	if (ret)
+		release_insn_arrays(env);
 	if (!env->prog->aux->used_maps)
 		/* if we didn't copy map pointers into bpf_prog_info, release
 		 * them now. Otherwise free_used_maps() will release them.
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 233de8677382..021c27ee5591 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1026,6 +1026,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_USER_RINGBUF,
 	BPF_MAP_TYPE_CGRP_STORAGE,
 	BPF_MAP_TYPE_ARENA,
+	BPF_MAP_TYPE_INSN_ARRAY,
 	__MAX_BPF_MAP_TYPE
 };
 
@@ -7623,4 +7624,14 @@ enum bpf_kfunc_flags {
 	BPF_F_PAD_ZEROS = (1ULL << 0),
 };
 
+/*
+ * Values of a BPF_MAP_TYPE_INSN_ARRAY entry must be of this type.
+ * On updates jitted_off must be equal to 0.
+ */
+struct bpf_insn_array_value {
+	__u32 jitted_off;
+	__u32 xlated_off;
+};
+
+
 #endif /* _UAPI__LINUX_BPF_H__ */
-- 
2.34.1



* [PATCH v2 bpf-next 04/13] selftests/bpf: add selftests for new insn_array map
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (2 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 05/13] bpf: support instructions arrays with constants blinding Anton Protopopov
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add the following selftests for the new insn_array map:

  * Incorrect instruction indexes are rejected
  * Two programs can't use the same map
  * BPF programs can't operate on the map
  * No changes to the code => the map is unchanged
  * Expected changes when instructions are added
  * Expected changes when instructions are deleted
  * Expected changes when multiple functions are present
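
To illustrate the "instructions are deleted" case, here is a minimal
userspace sketch of the offset bookkeeping. It is only a model, not the
kernel code: model_adjust_after_remove() and model_deletions() are
hypothetical names, and the removal order (first nop, then second nop at
its already shifted offset) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INSN_DELETED ((uint32_t)-1)

/* Userspace model: instructions [off, off + len) are removed. */
static void model_adjust_after_remove(uint32_t *xlated, size_t n,
				      uint32_t off, uint32_t len)
{
	size_t i;

	assert(len > 0);
	for (i = 0; i < n; i++) {
		if (xlated[i] == INSN_DELETED || xlated[i] < off)
			continue;
		if (xlated[i] < off + len)
			xlated[i] = INSN_DELETED;	/* target insn is gone */
		else
			xlated[i] -= len;		/* shift down */
	}
}

/* Six instructions with nops at original offsets 1 and 3. */
static void model_deletions(uint32_t out[6])
{
	size_t i;

	for (i = 0; i < 6; i++)
		out[i] = (uint32_t)i;

	model_adjust_after_remove(out, 6, 1, 1);	/* first nop */
	model_adjust_after_remove(out, 6, 2, 1);	/* second nop, shifted by one */
}
```

Under these assumptions the model ends with {0, -1, 1, -1, 2, 3}, which
is the map_out array the "deletions" subtest compares against.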

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 .../selftests/bpf/prog_tests/bpf_insn_array.c | 405 ++++++++++++++++++
 1 file changed, 405 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c b/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
new file mode 100644
index 000000000000..f785132497d6
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
@@ -0,0 +1,405 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <bpf/bpf.h>
+#include <test_progs.h>
+
+static int map_create(__u32 map_type, __u32 max_entries)
+{
+	const char *map_name = "insn_array";
+	__u32 key_size = 4;
+	__u32 value_size = sizeof(struct bpf_insn_array_value);
+
+	return bpf_map_create(map_type, map_name, key_size, value_size, max_entries, NULL);
+}
+
+static int prog_load(struct bpf_insn *insns, __u32 insn_cnt, int *fd_array, __u32 fd_array_cnt)
+{
+	LIBBPF_OPTS(bpf_prog_load_opts, opts);
+
+	opts.fd_array = fd_array;
+	opts.fd_array_cnt = fd_array_cnt;
+
+	return bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "GPL", insns, insn_cnt, &opts);
+}
+
+/*
+ * Load a program which the verifier will not modify in any way.  Add an
+ * insn_array map pointing to every instruction. Check that it hasn't changed
+ * after the program load.
+ */
+static void check_one_to_one_mapping(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 4),
+		BPF_MOV64_IMM(BPF_REG_0, 3),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = i;
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0, "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, i, "val should be equal i");
+	}
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+/*
+ * Try to load a program with a map which points to outside of the program
+ */
+static void check_out_of_bounds_index(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 4),
+		BPF_MOV64_IMM(BPF_REG_0, 3),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd, map_fd;
+	struct bpf_insn_array_value val = {};
+	int key;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, 1);
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	key = 0;
+	val.xlated_off = ARRAY_SIZE(insns); /* too big */
+	if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &key, &val, 0), 0, "bpf_map_update_elem"))
+		goto cleanup;
+
+	errno = 0;
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_EQ(prog_fd, -EINVAL, "program should have been rejected (prog_fd != -EINVAL)")) {
+		close(prog_fd);
+		goto cleanup;
+	}
+
+cleanup:
+	close(map_fd);
+}
+
+/*
+ * Try to load a program with a map which points to the middle of 16-byte insn
+ */
+static void check_mid_insn_index(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_LD_IMM64(BPF_REG_0, 0), /* 2 x 8 */
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd, map_fd;
+	struct bpf_insn_array_value val = {};
+	int key;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, 1);
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	key = 0;
+	val.xlated_off = 1; /* middle of 16-byte instruction */
+	if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &key, &val, 0), 0, "bpf_map_update_elem"))
+		goto cleanup;
+
+	errno = 0;
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_EQ(prog_fd, -EINVAL, "program should have been rejected (prog_fd != -EINVAL)")) {
+		close(prog_fd);
+		goto cleanup;
+	}
+
+cleanup:
+	close(map_fd);
+}
+
+static void check_incorrect_index(void)
+{
+	check_out_of_bounds_index();
+	check_mid_insn_index();
+}
+
+/*
+ * Load a program with two patches (get jiffies, for simplicity). Add an
+ * insn_array map pointing to every instruction. Check how it was changed
+ * after the program load.
+ */
+static void check_simple(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	__u32 map_in[] = {0, 1, 2, 3, 4, 5};
+	__u32 map_out[] = {0, 1, 4, 5, 8, 9};
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = map_in[i];
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0,
+			       "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, map_out[i], "val should be equal map_out[i]");
+	}
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+/*
+ * Verifier can delete code in two cases: nops & dead code. From insn
+ * array's point of view, the two cases are the same, so test using
+ * the simplest method: by loading some nops
+ */
+static void check_deletions(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	__u32 map_in[] = {0, 1, 2, 3, 4, 5};
+	__u32 map_out[] = {0, -1, 1, -1, 2, 3};
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = map_in[i];
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0,
+			       "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, map_out[i], "val should be equal map_out[i]");
+	}
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+static void check_with_functions(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 0), /* nop */
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	__u32 map_in[] =  { 0, 1,  2, 3, 4, 5, /* func */  6, 7,  8, 9, 10};
+	__u32 map_out[] = {-1, 0, -1, 3, 4, 5, /* func */ -1, 6, -1, 9, 10};
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = map_in[i];
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0,
+			       "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, map_out[i], "val should be equal map_out[i]");
+	}
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+/* Map can be used only by one BPF program */
+static void check_no_map_reuse(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd, extra_fd = -1;
+	struct bpf_insn_array_value val = {};
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = i;
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0, "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		ASSERT_EQ(val.xlated_off, i, "val should be equal i");
+	}
+
+	errno = 0;
+	extra_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_EQ(extra_fd, -EBUSY, "program should have been rejected (extra_fd != -EBUSY)"))
+		goto cleanup;
+
+	/* correctness: check that prog is still loadable without fd_array */
+	extra_fd = prog_load(insns, ARRAY_SIZE(insns), NULL, 0);
+	if (!ASSERT_GE(extra_fd, 0, "bpf(BPF_PROG_LOAD): expected no error"))
+		goto cleanup;
+
+cleanup:
+	close(extra_fd);
+	close(prog_fd);
+	close(map_fd);
+}
+
+static void check_bpf_no_lookup(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_LD_MAP_FD(BPF_REG_1, 0),
+		BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+		BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+		BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, 1);
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	insns[0].imm = map_fd;
+
+	errno = 0;
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), NULL, 0);
+	if (!ASSERT_EQ(prog_fd, -EINVAL, "program should have been rejected (prog_fd != -EINVAL)"))
+		goto cleanup;
+
+	/* correctness: check that prog is still loadable with normal map */
+	close(map_fd);
+	map_fd = map_create(BPF_MAP_TYPE_ARRAY, 1);
+	insns[0].imm = map_fd;
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), NULL, 0);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+cleanup:
+	close(prog_fd);
+	close(map_fd);
+}
+
+static void check_bpf_side(void)
+{
+	check_bpf_no_lookup();
+}
+
+void test_bpf_insn_array(void)
+{
+	/* Test if offsets are adjusted properly */
+
+	if (test__start_subtest("one2one"))
+		check_one_to_one_mapping();
+
+	if (test__start_subtest("simple"))
+		check_simple();
+
+	if (test__start_subtest("deletions"))
+		check_deletions();
+
+	if (test__start_subtest("multiple-functions"))
+		check_with_functions();
+
+	/* Check all kinds of operations and related restrictions */
+
+	if (test__start_subtest("incorrect-index"))
+		check_incorrect_index();
+
+	if (test__start_subtest("no-map-reuse"))
+		check_no_map_reuse();
+
+	if (test__start_subtest("bpf-side-ops"))
+		check_bpf_side();
+}
-- 
2.34.1



* [PATCH v2 bpf-next 05/13] bpf: support instructions arrays with constants blinding
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (3 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 04/13] selftests/bpf: add selftests for new insn_array map Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 06/13] selftests/bpf: test instructions arrays with blinding Anton Protopopov
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

When bpf_jit_harden is enabled, all constants in the BPF code are
blinded to prevent JIT spraying attacks. This happens during the JIT
phase and rewrites instructions, which shifts xlated offsets. Adjust
all the related instruction arrays accordingly.
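
A minimal userspace sketch of the resulting bookkeeping follows. It is
a model only: model_adjust() and model_blind_movs() are hypothetical
names, and it assumes that every blinded instruction is rewritten into
a fixed number of instructions (three for a 64-bit mov of an immediate,
which is what the selftest in the next patch expects).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INSN_DELETED ((uint32_t)-1)

/* One instruction at 'off' was replaced by 'len' instructions. */
static void model_adjust(uint32_t *xlated, size_t n, uint32_t off, uint32_t len)
{
	size_t i;

	if (len <= 1)
		return;

	for (i = 0; i < n; i++) {
		if (xlated[i] <= off || xlated[i] == INSN_DELETED)
			continue;
		xlated[i] += len - 1;	/* everything after 'off' moves up */
	}
}

/* Blind 'nr_movs' consecutive instructions starting at offset 0. */
static void model_blind_movs(uint32_t *xlated, size_t n,
			     uint32_t nr_movs, uint32_t rewritten)
{
	uint32_t pos = 0, k;

	assert(rewritten >= 1);
	for (k = 0; k < nr_movs; k++) {
		model_adjust(xlated, n, pos, rewritten);
		pos += rewritten;	/* skip the insns just inserted */
	}
}
```

With offsets {0, 1, 2, 3, 4}, four blinded movs and rewritten == 3, the
model yields {0, 3, 6, 9, 12}, i.e. every entry ends up at 3 * i.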

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 kernel/bpf/core.c     | 20 ++++++++++++++++++++
 kernel/bpf/verifier.c | 11 ++++++++++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 1cda2589d4b3..90f201a6f51d 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1451,6 +1451,23 @@ void bpf_jit_prog_release_other(struct bpf_prog *fp, struct bpf_prog *fp_other)
 	bpf_prog_clone_free(fp_other);
 }
 
+static void adjust_insn_arrays(struct bpf_prog *prog, u32 off, u32 len)
+{
+#ifdef CONFIG_BPF_SYSCALL
+	struct bpf_map *map;
+	int i;
+
+	if (len <= 1)
+		return;
+
+	for (i = 0; i < prog->aux->used_map_cnt; i++) {
+		map = prog->aux->used_maps[i];
+		if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY)
+			bpf_insn_array_adjust(map, off, len);
+	}
+#endif
+}
+
 struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
 {
 	struct bpf_insn insn_buff[16], aux[2];
@@ -1506,6 +1523,9 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
 		clone = tmp;
 		insn_delta = rewritten - 1;
 
+		/* Instructions arrays must be updated using absolute xlated offsets */
+		adjust_insn_arrays(clone, prog->aux->subprog_start + i, rewritten);
+
 		/* Walk new program and skip insns we just inserted. */
 		insn = clone->insnsi + i + insn_delta;
 		insn_cnt += insn_delta;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1f1708fd76c4..4261486981a3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -21564,6 +21564,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	struct bpf_insn *insn;
 	void *old_bpf_func;
 	int err, num_exentries;
+	int old_len, subprog_start_adjustment = 0;
 
 	if (env->subprog_cnt <= 1)
 		return 0;
@@ -21638,7 +21639,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->func_idx = i;
 		/* Below members will be freed only at prog->aux */
 		func[i]->aux->btf = prog->aux->btf;
-		func[i]->aux->subprog_start = subprog_start;
+		func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
 		func[i]->aux->func_info = prog->aux->func_info;
 		func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
 		func[i]->aux->poke_tab = prog->aux->poke_tab;
@@ -21691,7 +21692,15 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->might_sleep = env->subprog_info[i].might_sleep;
 		if (!i)
 			func[i]->aux->exception_boundary = env->seen_exception;
+
+		/*
+		 * To properly pass the absolute subprog start to jit
+		 * all instruction adjustments should be accumulated
+		 */
+		old_len = func[i]->len;
 		func[i] = bpf_int_jit_compile(func[i]);
+		subprog_start_adjustment += func[i]->len - old_len;
+
 		if (!func[i]->jited) {
 			err = -ENOTSUPP;
 			goto out_free;
-- 
2.34.1



* [PATCH v2 bpf-next 06/13] selftests/bpf: test instructions arrays with blinding
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (4 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 05/13] bpf: support instructions arrays with constants blinding Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15 Anton Protopopov
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add a specific test for instructions arrays with blinding enabled.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 .../selftests/bpf/prog_tests/bpf_insn_array.c | 92 +++++++++++++++++++
 1 file changed, 92 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c b/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
index f785132497d6..489badc17a2d 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_insn_array.c
@@ -287,6 +287,95 @@ static void check_with_functions(void)
 	close(map_fd);
 }
 
+static int set_bpf_jit_harden(char *level)
+{
+	char old_level;
+	int err = -1;
+	int fd = -1;
+
+	fd = open("/proc/sys/net/core/bpf_jit_harden", O_RDWR | O_NONBLOCK);
+	if (fd < 0) {
+		ASSERT_FAIL("open .../bpf_jit_harden returned %d (errno=%d)", fd, errno);
+		return -1;
+	}
+
+	err = read(fd, &old_level, 1);
+	if (err != 1) {
+		ASSERT_FAIL("read from .../bpf_jit_harden returned %d (errno=%d)", err, errno);
+		err = -1;
+		goto end;
+	}
+
+	lseek(fd, 0, SEEK_SET);
+
+	err = write(fd, level, 1);
+	if (err != 1) {
+		ASSERT_FAIL("write to .../bpf_jit_harden returned %d (errno=%d)", err, errno);
+		err = -1;
+		goto end;
+	}
+
+	err = 0;
+	*level = old_level;
+end:
+	if (fd >= 0)
+		close(fd);
+	return err;
+}
+
+static void check_blindness(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 4),
+		BPF_MOV64_IMM(BPF_REG_0, 3),
+		BPF_MOV64_IMM(BPF_REG_0, 2),
+		BPF_MOV64_IMM(BPF_REG_0, 1),
+		BPF_EXIT_INSN(),
+	};
+	int prog_fd = -1, map_fd;
+	struct bpf_insn_array_value val = {};
+	char bpf_jit_harden = '@'; /* non-existing value */
+	int i;
+
+	map_fd = map_create(BPF_MAP_TYPE_INSN_ARRAY, ARRAY_SIZE(insns));
+	if (!ASSERT_GE(map_fd, 0, "map_create"))
+		return;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		val.xlated_off = i;
+		if (!ASSERT_EQ(bpf_map_update_elem(map_fd, &i, &val, 0), 0, "bpf_map_update_elem"))
+			goto cleanup;
+	}
+
+	bpf_jit_harden = '2';
+	if (set_bpf_jit_harden(&bpf_jit_harden)) {
+		bpf_jit_harden = '@'; /* open, read or write failed => no write was done */
+		goto cleanup;
+	}
+
+	prog_fd = prog_load(insns, ARRAY_SIZE(insns), &map_fd, 1);
+	if (!ASSERT_GE(prog_fd, 0, "bpf(BPF_PROG_LOAD)"))
+		goto cleanup;
+
+	for (i = 0; i < ARRAY_SIZE(insns); i++) {
+		char fmt[32];
+
+		if (!ASSERT_EQ(bpf_map_lookup_elem(map_fd, &i, &val), 0, "bpf_map_lookup_elem"))
+			goto cleanup;
+
+		snprintf(fmt, sizeof(fmt), "val should be equal 3*%d", i);
+		ASSERT_EQ(val.xlated_off, i * 3, fmt);
+	}
+
+cleanup:
+	/* restore the old one */
+	if (bpf_jit_harden != '@')
+		set_bpf_jit_harden(&bpf_jit_harden);
+
+	close(prog_fd);
+	close(map_fd);
+}
+
 /* Map can be used only by one BPF program */
 static void check_no_map_reuse(void)
 {
@@ -392,6 +481,9 @@ void test_bpf_insn_array(void)
 	if (test__start_subtest("multiple-functions"))
 		check_with_functions();
 
+	if (test__start_subtest("blindness"))
+		check_blindness();
+
 	/* Check all kinds of operations and related restrictions */
 
 	if (test__start_subtest("incorrect-index"))
-- 
2.34.1



* [PATCH v2 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (5 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 06/13] selftests/bpf: test instructions arrays with blinding Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 08/13] bpf, x86: add support for indirect jumps Anton Protopopov
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Currently the emit_indirect_jump() function only accepts one of the
RAX, RCX, ..., RBP registers as the destination. Make it accept
R8, R9, ..., R15 as well, and make callers pass BPF registers, not
native registers. This is required to enable indirect jump support
in eBPF.
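
The encoding difference can be illustrated with a minimal userspace sketch.
This is not the JIT code: the helper name encode_jmp_reg() and the 0..15
register numbering are assumptions made for this example only. It emits the
x86-64 `jmp *%reg` form (opcode 0xFF, ModRM 0xE0 plus the low three bits of
the register), preceded by the REX.B prefix 0x41 for r8...r15:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode "jmp *%reg" for the x86-64 register number
 * reg (0 = rax ... 7 = rdi, 8 = r8 ... 15 = r15) into buf, which must have
 * room for 3 bytes. Returns the number of bytes written, or 0 on bad input.
 */
static size_t encode_jmp_reg(uint8_t *buf, unsigned int reg)
{
	size_t n = 0;

	if (reg > 15)
		return 0;
	if (reg >= 8)
		buf[n++] = 0x41;		/* REX.B selects r8...r15 */
	buf[n++] = 0xFF;			/* opcode group 5 */
	buf[n++] = 0xE0 + (reg & 7);	/* ModRM: mod=11, /4 (jmp), rm=reg */
	return n;
}
```

In the patch itself the low three bits come from reg2hex[] and the prefix
decision from is_ereg(); the sketch folds both into a single register number.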

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 arch/x86/net/bpf_jit_comp.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 8792d7f371d3..fcebb48742ae 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -660,24 +660,38 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 
 #define EMIT_LFENCE()	EMIT3(0x0F, 0xAE, 0xE8)
 
-static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
+static void __emit_indirect_jump(u8 **pprog, int reg, bool ereg)
 {
 	u8 *prog = *pprog;
 
+	if (ereg)
+		EMIT1(0x41);
+
+	EMIT2(0xFF, 0xE0 + reg);
+
+	*pprog = prog;
+}
+
+static void emit_indirect_jump(u8 **pprog, int bpf_reg, u8 *ip)
+{
+	u8 *prog = *pprog;
+	int reg = reg2hex[bpf_reg];
+	bool ereg = is_ereg(bpf_reg);
+
 	if (cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) {
 		OPTIMIZER_HIDE_VAR(reg);
 		emit_jump(&prog, its_static_thunk(reg), ip);
 	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {
 		EMIT_LFENCE();
-		EMIT2(0xFF, 0xE0 + reg);
+		__emit_indirect_jump(&prog, reg, ereg);
 	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
 		OPTIMIZER_HIDE_VAR(reg);
 		if (cpu_feature_enabled(X86_FEATURE_CALL_DEPTH))
-			emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg], ip);
+			emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg + 8*ereg], ip);
 		else
-			emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip);
+			emit_jump(&prog, &__x86_indirect_thunk_array[reg + 8*ereg], ip);
 	} else {
-		EMIT2(0xFF, 0xE0 + reg);	/* jmp *%\reg */
+		__emit_indirect_jump(&prog, reg, ereg);
 		if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) || IS_ENABLED(CONFIG_MITIGATION_SLS))
 			EMIT1(0xCC);		/* int3 */
 	}
@@ -797,7 +811,7 @@ static void emit_bpf_tail_call_indirect(struct bpf_prog *bpf_prog,
 	 * rdi == ctx (1st arg)
 	 * rcx == prog->bpf_func + X86_TAIL_CALL_OFFSET
 	 */
-	emit_indirect_jump(&prog, 1 /* rcx */, ip + (prog - start));
+	emit_indirect_jump(&prog, BPF_REG_4 /* R4 -> rcx */, ip + (prog - start));
 
 	/* out: */
 	ctx->tail_call_indirect_label = prog - start;
@@ -3517,7 +3531,7 @@ static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs, u8 *image,
 		if (err)
 			return err;
 
-		emit_indirect_jump(&prog, 2 /* rdx */, image + (prog - buf));
+		emit_indirect_jump(&prog, BPF_REG_3 /* R3 -> rdx */, image + (prog - buf));
 
 		*pprog = prog;
 		return 0;
-- 
2.34.1



* [PATCH v2 bpf-next 08/13] bpf, x86: add support for indirect jumps
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (6 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15 Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 09/13] bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X Anton Protopopov
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add support for a new instruction

    BPF_JMP|BPF_X|BPF_JA, SRC=0, DST=Rx, off=0, imm=0

which does an indirect jump to a location stored in Rx.  The register
Rx should have type PTR_TO_INSN. This new type ensures that the Rx
register contains a value (or a range of values) loaded from a
correct jump table, i.e., a map of type instruction array.
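
As an illustration of the encoding, here is a minimal standalone sketch. The
opcode constants are local copies of the uapi values, and the names
sketch_insn, make_gotox() and is_gotox() are hypothetical, introduced only
for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the uapi opcode fields, so that the sketch is standalone */
enum {
	BPF_JMP = 0x05,	/* instruction class */
	BPF_X   = 0x08,	/* source operand is a register */
	BPF_JA  = 0x00,	/* jump-always operation */
};

/* Same field layout as the uapi struct bpf_insn; the name is hypothetical */
struct sketch_insn {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

/* Build "gotox Rdst": only the opcode and dst_reg are non-zero */
static struct sketch_insn make_gotox(uint8_t dst_reg)
{
	struct sketch_insn insn = {
		.code = BPF_JMP | BPF_X | BPF_JA,
		.dst_reg = dst_reg,
	};
	return insn;
}

static bool is_gotox(const struct sketch_insn *insn)
{
	return (insn->code & 0x07) == BPF_JMP &&	/* BPF_CLASS() */
	       (insn->code & 0xf0) == BPF_JA &&	/* BPF_OP() */
	       (insn->code & 0x08) == BPF_X;		/* BPF_SRC() */
}
```

With these values the opcode byte of the new instruction is 0x0d.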

For example, for a C switch LLVM will generate the following code:

    0:   r3 = r1                    # "switch (r3)"
    1:   if r3 > 0x13 goto +0x666   # check r3 boundaries
    2:   r3 <<= 0x3                 # adjust to an index in array of addresses
    3:   r1 = 0xbeef ll             # r1 is PTR_TO_MAP_VALUE, r1->map_ptr=M
    5:   r1 += r3                   # r1 inherits boundaries from r3
    6:   r1 = *(u64 *)(r1 + 0x0)    # r1 now has type PTR_TO_INSN
    7:   gotox r1[,imm=fd(M)]       # jit will generate proper code

Here the gotox instruction corresponds to one particular map. It is,
however, possible to have a gotox instruction whose target can be loaded
from different maps, e.g.

    0:	 r1 &= 0x1
    1:	 r2 <<= 0x3
    2:	 r3 = 0x0 ll                # load from map M_1
    4:	 r3 += r2
    5:	 if r1 == 0x0 goto +0x4
    6:	 r1 <<= 0x3
    7:	 r3 = 0x0 ll                # load from map M_2
    9:	 r3 += r1
    A:	 r1 = *(u64 *)(r3 + 0x0)
    B:	 gotox r1                   # jump to target loaded from M_1 or M_2

During the check_cfg stage the verifier collects all the maps which
point inside the subprog being verified. When building the CFG, the
high 16 bits of the insn_state are used to remember the next jump table
slot to visit, so this patch (theoretically) supports jump tables of up
to 2^16 slots.
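
The packing can be sketched as follows. This is a simplified userspace
illustration: the patch uses the SET_HIGH()/GET_HIGH() macros on an int,
while the functions set_high()/get_high() and the uint32_t state below are
assumptions made for the example:

```c
#include <stdint.h>

/*
 * Low 16 bits: DFS state flags of the instruction.
 * High 16 bits: index of the next jump table slot to try.
 */
static inline uint32_t set_high(uint32_t state, uint16_t next)
{
	return (state & 0xffffU) | ((uint32_t)next << 16);
}

static inline uint16_t get_high(uint32_t state)
{
	return (uint16_t)(state >> 16);
}
```

Since the slot index lives in 16 bits, it cannot count past 0xffff, which is
where the limit on the jump table size comes from.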

During the later stage, in check_indirect_jump, it is checked that
the register Rx was loaded from a particular instruction array.
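
The range of slots which Rx may point to is reduced to a set of distinct
jump targets before the branches are pushed. A minimal userspace sketch of
that sort-and-deduplicate step follows; it uses qsort() instead of the
kernel's sort(), and the names cmp_u32() and sort_uniq_u32() are
hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

static int cmp_u32(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort off[0..cnt-1] in place and drop duplicates; return the unique count */
static int sort_uniq_u32(uint32_t *off, int cnt)
{
	int unique = 1;
	int i;

	if (cnt <= 0)
		return 0;

	qsort(off, cnt, sizeof(off[0]), cmp_u32);

	for (i = 1; i < cnt; i++)
		if (off[i] != off[unique - 1])
			off[unique++] = off[i];

	return unique;
}
```

Each of the resulting offsets then becomes one branch to explore, so
duplicate table entries do not multiply the verification work.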

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 arch/x86/net/bpf_jit_comp.c  |   3 +
 include/linux/bpf.h          |   1 +
 include/linux/bpf_verifier.h |  15 +
 kernel/bpf/bpf_insn_array.c  |  16 +-
 kernel/bpf/core.c            |   1 +
 kernel/bpf/log.c             |   1 +
 kernel/bpf/verifier.c        | 513 ++++++++++++++++++++++++++++++++---
 7 files changed, 514 insertions(+), 36 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index fcebb48742ae..095d249eb235 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2595,6 +2595,9 @@ st:			if (is_imm8(insn->off))
 
 			break;
 
+		case BPF_JMP | BPF_JA | BPF_X:
+			emit_indirect_jump(&prog, insn->dst_reg, image + addrs[i - 1]);
+			break;
 		case BPF_JMP | BPF_JA:
 		case BPF_JMP32 | BPF_JA:
 			if (BPF_CLASS(insn->code) == BPF_JMP) {
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 77fcb508d6ae..2c12edfdf63c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -973,6 +973,7 @@ enum bpf_reg_type {
 	PTR_TO_ARENA,
 	PTR_TO_BUF,		 /* reg points to a read/write buffer */
 	PTR_TO_FUNC,		 /* reg points to a bpf program function */
+	PTR_TO_INSN,		 /* reg points to a bpf program instruction */
 	CONST_PTR_TO_DYNPTR,	 /* reg points to a const struct bpf_dynptr */
 	__BPF_REG_TYPE_MAX,
 
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index aca43c284203..607a684642e5 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -533,6 +533,16 @@ struct bpf_map_ptr_state {
 #define BPF_ALU_SANITIZE		(BPF_ALU_SANITIZE_SRC | \
 					 BPF_ALU_SANITIZE_DST)
 
+/*
+ * A structure defining an array of BPF instructions.  Can be used,
+ * for example, as a return value of the insn_successors() function
+ * and in the struct bpf_insn_aux_data below.
+ */
+struct bpf_iarray {
+	int off_cnt;
+	u32 off[];
+};
+
 struct bpf_insn_aux_data {
 	union {
 		enum bpf_reg_type ptr_type;	/* pointer type for load/store insns */
@@ -542,6 +552,7 @@ struct bpf_insn_aux_data {
 		struct {
 			u32 map_index;		/* index into used_maps[] */
 			u32 map_off;		/* offset from value base address */
+			struct bpf_iarray *jt;	/* jump table for gotox instruction */
 		};
 		struct {
 			enum bpf_reg_type reg_type;	/* type of pseudo_btf_id */
@@ -586,6 +597,9 @@ struct bpf_insn_aux_data {
 	u8 fastcall_spills_num:3;
 	u8 arg_prog:4;
 
+	/* true if jt->off was allocated */
+	bool jt_allocated;
+
 	/* below fields are initialized once */
 	unsigned int orig_idx; /* original instruction index */
 	bool jmp_point;
@@ -847,6 +861,7 @@ struct bpf_verifier_env {
 	/* array of pointers to bpf_scc_info indexed by SCC id */
 	struct bpf_scc_info **scc_info;
 	u32 scc_cnt;
+	struct bpf_iarray *succ;
 };
 
 static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog)
diff --git a/kernel/bpf/bpf_insn_array.c b/kernel/bpf/bpf_insn_array.c
index 0c8dac62f457..4b945b7e31b8 100644
--- a/kernel/bpf/bpf_insn_array.c
+++ b/kernel/bpf/bpf_insn_array.c
@@ -1,7 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include <linux/bpf.h>
-#include <linux/sort.h>
 
 #define MAX_INSN_ARRAY_ENTRIES 256
 
@@ -173,6 +172,20 @@ static u64 insn_array_mem_usage(const struct bpf_map *map)
 	return insn_array_alloc_size(map->max_entries) + extra_size;
 }
 
+static int insn_array_map_direct_value_addr(const struct bpf_map *map, u64 *imm, u32 off)
+{
+	struct bpf_insn_array *insn_array = cast_insn_array(map);
+
+	if ((off % sizeof(long)) != 0 ||
+	    (off / sizeof(long)) >= map->max_entries)
+		return -EINVAL;
+
+	/* from BPF's point of view, this map is a jump table */
+	*imm = (unsigned long)insn_array->ips + off;
+
+	return 0;
+}
+
 BTF_ID_LIST_SINGLE(insn_array_btf_ids, struct, bpf_insn_array)
 
 const struct bpf_map_ops insn_array_map_ops = {
@@ -185,6 +198,7 @@ const struct bpf_map_ops insn_array_map_ops = {
 	.map_delete_elem = insn_array_delete_elem,
 	.map_check_btf = insn_array_check_btf,
 	.map_mem_usage = insn_array_mem_usage,
+	.map_direct_value_addr = insn_array_map_direct_value_addr,
 	.map_btf_id = &insn_array_btf_ids[0],
 };
 
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 90f201a6f51d..1f933857ca1d 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1709,6 +1709,7 @@ bool bpf_opcode_in_insntable(u8 code)
 		[BPF_LD | BPF_IND | BPF_B] = true,
 		[BPF_LD | BPF_IND | BPF_H] = true,
 		[BPF_LD | BPF_IND | BPF_W] = true,
+		[BPF_JMP | BPF_JA | BPF_X] = true,
 		[BPF_JMP | BPF_JCOND] = true,
 	};
 #undef BPF_INSN_3_TBL
diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index e4983c1303e7..75adfe7914f2 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -461,6 +461,7 @@ const char *reg_type_str(struct bpf_verifier_env *env, enum bpf_reg_type type)
 		[PTR_TO_ARENA]		= "arena",
 		[PTR_TO_BUF]		= "buf",
 		[PTR_TO_FUNC]		= "func",
+		[PTR_TO_INSN]		= "insn",
 		[PTR_TO_MAP_KEY]	= "map_key",
 		[CONST_PTR_TO_DYNPTR]	= "dynptr_ptr",
 	};
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4261486981a3..5985ad4761ba 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -212,6 +212,7 @@ static int ref_set_non_owning(struct bpf_verifier_env *env,
 static void specialize_kfunc(struct bpf_verifier_env *env,
 			     u32 func_id, u16 offset, unsigned long *addr);
 static bool is_trusted_reg(const struct bpf_reg_state *reg);
+static int add_used_map(struct bpf_verifier_env *env, int fd);
 
 static bool bpf_map_ptr_poisoned(const struct bpf_insn_aux_data *aux)
 {
@@ -2962,14 +2963,13 @@ static int cmp_subprogs(const void *a, const void *b)
 	       ((struct bpf_subprog_info *)b)->start;
 }
 
-/* Find subprogram that contains instruction at 'off' */
-static struct bpf_subprog_info *find_containing_subprog(struct bpf_verifier_env *env, int off)
+static int find_containing_subprog_idx(struct bpf_verifier_env *env, int off)
 {
 	struct bpf_subprog_info *vals = env->subprog_info;
 	int l, r, m;
 
 	if (off >= env->prog->len || off < 0 || env->subprog_cnt == 0)
-		return NULL;
+		return -1;
 
 	l = 0;
 	r = env->subprog_cnt - 1;
@@ -2980,7 +2980,19 @@ static struct bpf_subprog_info *find_containing_subprog(struct bpf_verifier_env
 		else
 			r = m - 1;
 	}
-	return &vals[l];
+	return l;
+}
+
+/* Find subprogram that contains instruction at 'off' */
+static struct bpf_subprog_info *find_containing_subprog(struct bpf_verifier_env *env, int off)
+{
+	int subprog_idx;
+
+	subprog_idx = find_containing_subprog_idx(env, off);
+	if (subprog_idx < 0)
+		return NULL;
+
+	return &env->subprog_info[subprog_idx];
 }
 
 /* Find subprogram that starts exactly at 'off' */
@@ -6077,6 +6089,18 @@ static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno,
 	return 0;
 }
 
+/*
+ * Return the size of the memory region accessible from a pointer to map value.
+ * For INSN_ARRAY maps the whole bpf_insn_array->ips array is accessible.
+ */
+static u32 map_mem_size(const struct bpf_map *map)
+{
+	if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY)
+		return map->max_entries * sizeof(long);
+
+	return map->value_size;
+}
+
 /* check read/write into a map element with possible variable offset */
 static int check_map_access(struct bpf_verifier_env *env, u32 regno,
 			    int off, int size, bool zero_size_allowed,
@@ -6086,11 +6110,11 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
 	struct bpf_func_state *state = vstate->frame[vstate->curframe];
 	struct bpf_reg_state *reg = &state->regs[regno];
 	struct bpf_map *map = reg->map_ptr;
+	u32 mem_size = map_mem_size(map);
 	struct btf_record *rec;
 	int err, i;
 
-	err = check_mem_region_access(env, regno, off, size, map->value_size,
-				      zero_size_allowed);
+	err = check_mem_region_access(env, regno, off, size, mem_size, zero_size_allowed);
 	if (err)
 		return err;
 
@@ -7605,6 +7629,19 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 
 				regs[value_regno].type = SCALAR_VALUE;
 				__mark_reg_known(&regs[value_regno], val);
+			} else if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY) {
+				regs[value_regno].type = PTR_TO_INSN;
+				regs[value_regno].map_ptr = map;
+				regs[value_regno].off = reg->off;
+				regs[value_regno].umin_value = reg->umin_value;
+				regs[value_regno].umax_value = reg->umax_value;
+				regs[value_regno].smin_value = reg->smin_value;
+				regs[value_regno].smax_value = reg->smax_value;
+				regs[value_regno].s32_min_value = reg->s32_min_value;
+				regs[value_regno].s32_max_value = reg->s32_max_value;
+				regs[value_regno].u32_min_value = reg->u32_min_value;
+				regs[value_regno].u32_max_value = reg->u32_max_value;
+				regs[value_regno].var_off = reg->var_off;
 			} else {
 				mark_reg_unknown(env, regs, value_regno);
 			}
@@ -7795,6 +7832,11 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
 static int save_aux_ptr_type(struct bpf_verifier_env *env, enum bpf_reg_type type,
 			     bool allow_trust_mismatch);
 
+static bool map_is_insn_array(struct bpf_map *map)
+{
+	return map && map->map_type == BPF_MAP_TYPE_INSN_ARRAY;
+}
+
 static int check_load_mem(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			  bool strict_alignment_once, bool is_ldsx,
 			  bool allow_trust_mismatch, const char *ctx)
@@ -14472,6 +14514,8 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 	struct bpf_func_state *state = vstate->frame[vstate->curframe];
 	struct bpf_reg_state *regs = state->regs, *dst_reg;
 	bool known = tnum_is_const(off_reg->var_off);
+	bool ptr_to_insn_array = base_type(ptr_reg->type) == PTR_TO_MAP_VALUE &&
+				 map_is_insn_array(ptr_reg->map_ptr);
 	s64 smin_val = off_reg->smin_value, smax_val = off_reg->smax_value,
 	    smin_ptr = ptr_reg->smin_value, smax_ptr = ptr_reg->smax_value;
 	u64 umin_val = off_reg->umin_value, umax_val = off_reg->umax_value,
@@ -14613,6 +14657,11 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 		}
 		break;
 	case BPF_SUB:
+		if (ptr_to_insn_array) {
+			verbose(env, "Operation %s on ptr to instruction set map is prohibited\n",
+				bpf_alu_string[opcode >> 4]);
+			return -EACCES;
+		}
 		if (dst_reg == off_reg) {
 			/* scalar -= pointer.  Creates an unknown scalar */
 			verbose(env, "R%d tried to subtract pointer from scalar\n",
@@ -16965,7 +17014,8 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		}
 		dst_reg->type = PTR_TO_MAP_VALUE;
 		dst_reg->off = aux->map_off;
-		WARN_ON_ONCE(map->max_entries != 1);
+		WARN_ON_ONCE(map->map_type != BPF_MAP_TYPE_INSN_ARRAY &&
+			     map->max_entries != 1);
 		/* We want reg->id to be same (0) as map_value is not distinct */
 	} else if (insn->src_reg == BPF_PSEUDO_MAP_FD ||
 		   insn->src_reg == BPF_PSEUDO_MAP_IDX) {
@@ -17718,6 +17768,234 @@ static int mark_fastcall_patterns(struct bpf_verifier_env *env)
 	return 0;
 }
 
+#define SET_HIGH(STATE, LAST)	STATE = (STATE & 0xffffU) | ((LAST) << 16)
+#define GET_HIGH(STATE)		((u16)((STATE) >> 16))
+
+static int push_gotox_edge(int t, struct bpf_verifier_env *env, struct bpf_iarray *jt)
+{
+	int *insn_stack = env->cfg.insn_stack;
+	int *insn_state = env->cfg.insn_state;
+	u16 prev;
+	int w;
+
+	for (prev = GET_HIGH(insn_state[t]); prev < jt->off_cnt; prev++) {
+		w = jt->off[prev];
+
+		/* EXPLORED || DISCOVERED */
+		if (insn_state[w])
+			continue;
+
+		break;
+	}
+
+	if (prev == jt->off_cnt)
+		return DONE_EXPLORING;
+
+	mark_prune_point(env, t);
+
+	if (env->cfg.cur_stack >= env->prog->len)
+		return -E2BIG;
+	insn_stack[env->cfg.cur_stack++] = w;
+
+	mark_jmp_point(env, w);
+
+	SET_HIGH(insn_state[t], prev + 1);
+	return KEEP_EXPLORING;
+}
+
+static int copy_insn_array(struct bpf_map *map, u32 start, u32 end, u32 *off)
+{
+	struct bpf_insn_array_value *value;
+	u32 i;
+
+	for (i = start; i <= end; i++) {
+		value = map->ops->map_lookup_elem(map, &i);
+		if (!value)
+			return -EINVAL;
+		off[i - start] = value->xlated_off;
+	}
+	return 0;
+}
+
+static int cmp_ptr_to_u32(const void *a, const void *b)
+{
+	return *(u32 *)a - *(u32 *)b;
+}
+
+static int sort_insn_array_uniq(u32 *off, int off_cnt)
+{
+	int unique = 1;
+	int i;
+
+	sort(off, off_cnt, sizeof(off[0]), cmp_ptr_to_u32, NULL);
+
+	for (i = 1; i < off_cnt; i++)
+		if (off[i] != off[unique - 1])
+			off[unique++] = off[i];
+
+	return unique;
+}
+
+/*
+ * sort_unique({map[start], ..., map[end]}) into off
+ */
+static int copy_insn_array_uniq(struct bpf_map *map, u32 start, u32 end, u32 *off)
+{
+	u32 n = end - start + 1;
+	int err;
+
+	err = copy_insn_array(map, start, end, off);
+	if (err)
+		return err;
+
+	return sort_insn_array_uniq(off, n);
+}
+
+static struct bpf_iarray *iarray_realloc(struct bpf_iarray *old, size_t n_elem)
+{
+	size_t new_size = sizeof(struct bpf_iarray) + n_elem * 4;
+	struct bpf_iarray *new;
+
+	new = kvrealloc(old, new_size, GFP_KERNEL_ACCOUNT);
+	if (!new) {
+		/* this is what callers always want, so simplify the call site */
+		kvfree(old);
+		return NULL;
+	}
+
+	new->off_cnt = n_elem;
+	return new;
+}
+
+/*
+ * Copy all unique offsets from the map
+ */
+static struct bpf_iarray *jt_from_map(struct bpf_map *map)
+{
+	struct bpf_iarray *jt;
+	int n;
+
+	jt = iarray_realloc(NULL, map->max_entries);
+	if (!jt)
+		return ERR_PTR(-ENOMEM);
+
+	n = copy_insn_array_uniq(map, 0, map->max_entries - 1, jt->off);
+	if (n < 0) {
+		kvfree(jt);
+		return ERR_PTR(n);
+	}
+
+	return jt;
+}
+
+/*
+ * Find and collect all maps which fit in the subprog. Return the result as one
+ * combined jump table in jt->off (allocated with kvcalloc
+ */
+static struct bpf_iarray *jt_from_subprog(struct bpf_verifier_env *env,
+					  int subprog_start, int subprog_end)
+{
+	struct bpf_iarray *jt = NULL;
+	struct bpf_map *map;
+	struct bpf_iarray *jt_cur;
+	int i;
+
+	for (i = 0; i < env->insn_array_map_cnt; i++) {
+		/*
+		 * TODO (when needed): collect only jump tables, not static keys
+		 * or maps for indirect calls
+		 */
+		map = env->insn_array_maps[i];
+
+		jt_cur = jt_from_map(map);
+		if (IS_ERR(jt_cur)) {
+			kvfree(jt);
+			return jt_cur;
+		}
+
+		/*
+		 * It is enough to check one element. The full table is
+		 * checked to fit inside the subprog later in create_jt()
+		 */
+		if (jt_cur->off[0] >= subprog_start && jt_cur->off[0] < subprog_end) {
+			u32 old_cnt = jt ? jt->off_cnt : 0;
+			jt = iarray_realloc(jt, old_cnt + jt_cur->off_cnt);
+			if (!jt) {
+				kvfree(jt_cur);
+				return ERR_PTR(-ENOMEM);
+			}
+			memcpy(jt->off + old_cnt, jt_cur->off, jt_cur->off_cnt << 2);
+		}
+
+		kvfree(jt_cur);
+	}
+
+	if (!jt) {
+		verbose(env, "no jump tables found for subprog starting at %u\n", subprog_start);
+		return ERR_PTR(-EINVAL);
+	}
+
+	jt->off_cnt = sort_insn_array_uniq(jt->off, jt->off_cnt);
+	return jt;
+}
+
+static struct bpf_iarray *
+create_jt(int t, struct bpf_verifier_env *env, int fd)
+{
+	struct bpf_subprog_info *subprog;
+	int subprog_idx, subprog_start, subprog_end;
+	struct bpf_iarray *jt;
+	int i;
+
+	if (env->subprog_cnt == 0)
+		return ERR_PTR(-EFAULT);
+
+	subprog_idx = find_containing_subprog_idx(env, t);
+	if (subprog_idx < 0) {
+		verbose(env, "can't find subprog containing instruction %d\n", t);
+		return ERR_PTR(-EFAULT);
+	}
+	subprog = &env->subprog_info[subprog_idx];
+	subprog_start = subprog->start;
+	subprog_end = (subprog + 1)->start;
+	jt = jt_from_subprog(env, subprog_start, subprog_end);
+	if (IS_ERR(jt))
+		return jt;
+
+	/* Check that every element of the jump table fits within the given subprogram */
+	for (i = 0; i < jt->off_cnt; i++) {
+		if (jt->off[i] < subprog_start || jt->off[i] >= subprog_end) {
+			verbose(env, "jump table for insn %d points outside of the subprog [%u,%u]\n",
+					t, subprog_start, subprog_end);
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	return jt;
+}
+
+/* "conditional jump with N edges" */
+static int visit_gotox_insn(int t, struct bpf_verifier_env *env, int fd)
+{
+	struct bpf_iarray *jt = env->insn_aux_data[t].jt;
+
+	if (!jt) {
+		jt = create_jt(t, env, fd);
+		if (IS_ERR(jt))
+			return PTR_ERR(jt);
+	}
+
+	/*
+	 * Mark jt as allocated. Otherwise, it is not possible to check
+	 * whether it was allocated in the code which frees memory (jt is
+	 * part of a union)
+	 */
+	env->insn_aux_data[t].jt_allocated = true;
+	env->insn_aux_data[t].jt = jt;
+
+	return push_gotox_edge(t, env, jt);
+}
+
 /* Visits the instruction at index t and returns one of the following:
  *  < 0 - an error occurred
  *  DONE_EXPLORING - the instruction was fully explored
@@ -17808,8 +18086,8 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
 		return visit_func_call_insn(t, insns, env, insn->src_reg == BPF_PSEUDO_CALL);
 
 	case BPF_JA:
-		if (BPF_SRC(insn->code) != BPF_K)
-			return -EINVAL;
+		if (BPF_SRC(insn->code) == BPF_X)
+			return visit_gotox_insn(t, env, insn->imm);
 
 		if (BPF_CLASS(insn->code) == BPF_JMP)
 			off = insn->off;
@@ -17840,6 +18118,13 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
 	}
 }
 
+static bool insn_is_gotox(struct bpf_insn *insn)
+{
+	return BPF_CLASS(insn->code) == BPF_JMP &&
+	       BPF_OP(insn->code) == BPF_JA &&
+	       BPF_SRC(insn->code) == BPF_X;
+}
+
 /* non-recursive depth-first-search to detect loops in BPF program
  * loop == back-edge in directed graph
  */
@@ -18701,6 +18986,10 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 		return regs_exact(rold, rcur, idmap) && rold->frameno == rcur->frameno;
 	case PTR_TO_ARENA:
 		return true;
+	case PTR_TO_INSN:
+		/* is rcur a subset of rold? */
+		return (rcur->umin_value >= rold->umin_value &&
+			rcur->umax_value <= rold->umax_value);
 	default:
 		return regs_exact(rold, rcur, idmap);
 	}
@@ -19847,6 +20136,102 @@ static int process_bpf_exit_full(struct bpf_verifier_env *env,
 	return PROCESS_BPF_EXIT;
 }
 
+static int indirect_jump_min_max_index(struct bpf_verifier_env *env,
+				       int regno,
+				       struct bpf_map *map,
+				       u32 *pmin_index, u32 *pmax_index)
+{
+	struct bpf_reg_state *reg = reg_state(env, regno);
+	u64 min_index, max_index;
+
+	if (check_add_overflow(reg->umin_value, reg->off, &min_index) ||
+		(min_index > (u64) U32_MAX * sizeof(long))) {
+		verbose(env, "the sum of R%u umin_value %llu and off %u is too big\n",
+			     regno, reg->umin_value, reg->off);
+		return -ERANGE;
+	}
+	if (check_add_overflow(reg->umax_value, reg->off, &max_index) ||
+		(max_index > (u64) U32_MAX * sizeof(long))) {
+		verbose(env, "the sum of R%u umax_value %llu and off %u is too big\n",
+			     regno, reg->umax_value, reg->off);
+		return -ERANGE;
+	}
+
+	min_index /= sizeof(long);
+	max_index /= sizeof(long);
+
+	if (min_index >= map->max_entries || max_index >= map->max_entries) {
+		verbose(env, "R%u points to outside of jump table: [%llu,%llu] max_entries %u\n",
+			     regno, min_index, max_index, map->max_entries);
+		return -EINVAL;
+	}
+
+	*pmin_index = min_index;
+	*pmax_index = max_index;
+	return 0;
+}
+
+/* gotox *dst_reg */
+static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *insn)
+{
+	struct bpf_verifier_state *other_branch;
+	struct bpf_reg_state *dst_reg;
+	struct bpf_map *map;
+	u32 min_index, max_index;
+	int err = 0;
+	u32 *xoff;
+	int n;
+	int i;
+
+	dst_reg = reg_state(env, insn->dst_reg);
+	if (dst_reg->type != PTR_TO_INSN) {
+		verbose(env, "R%d has type %d, expected PTR_TO_INSN\n",
+			     insn->dst_reg, dst_reg->type);
+		return -EINVAL;
+	}
+
+	map = dst_reg->map_ptr;
+	if (verifier_bug_if(!map, env, "R%d has an empty map pointer", insn->dst_reg))
+		return -EFAULT;
+
+	if (verifier_bug_if(map->map_type != BPF_MAP_TYPE_INSN_ARRAY, env,
+			    "R%d has incorrect map type %d", insn->dst_reg, map->map_type))
+		return -EFAULT;
+
+	err = indirect_jump_min_max_index(env, insn->dst_reg, map, &min_index, &max_index);
+	if (err)
+		return err;
+
+	xoff = kvcalloc(max_index - min_index + 1, sizeof(u32), GFP_KERNEL_ACCOUNT);
+	if (!xoff)
+		return -ENOMEM;
+
+	n = copy_insn_array_uniq(map, min_index, max_index, xoff);
+	if (n < 0) {
+		err = n;
+		goto free_off;
+	}
+	if (n == 0) {
+		verbose(env, "register R%d doesn't point to any offset in map id=%d\n",
+			     insn->dst_reg, map->id);
+		err = -EINVAL;
+		goto free_off;
+	}
+
+	for (i = 0; i < n - 1; i++) {
+		other_branch = push_stack(env, xoff[i], env->insn_idx, false);
+		if (IS_ERR(other_branch)) {
+			err = PTR_ERR(other_branch);
+			goto free_off;
+		}
+	}
+	env->insn_idx = xoff[n-1];
+
+free_off:
+	kvfree(xoff);
+	return err;
+}
+
 static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
 {
 	int err;
@@ -19949,6 +20334,9 @@ static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
 
 			mark_reg_scratched(env, BPF_REG_0);
 		} else if (opcode == BPF_JA) {
+			if (BPF_SRC(insn->code) == BPF_X)
+				return check_indirect_jump(env, insn);
+
 			if (BPF_SRC(insn->code) != BPF_K ||
 			    insn->src_reg != BPF_REG_0 ||
 			    insn->dst_reg != BPF_REG_0 ||
@@ -20448,6 +20836,7 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
 		case BPF_MAP_TYPE_QUEUE:
 		case BPF_MAP_TYPE_STACK:
 		case BPF_MAP_TYPE_ARENA:
+		case BPF_MAP_TYPE_INSN_ARRAY:
 			break;
 		default:
 			verbose(env,
@@ -21006,6 +21395,23 @@ static int bpf_adj_linfo_after_remove(struct bpf_verifier_env *env, u32 off,
 	return 0;
 }
 
+/*
+ * Clean up dynamically allocated fields of aux data for instructions [start, ...]
+ */
+static void clear_insn_aux_data(struct bpf_insn_aux_data *aux_data, int start, int len)
+{
+	int end = start + len;
+	int i;
+
+	for (i = start; i < end; i++) {
+		if (aux_data[i].jt_allocated) {
+			kvfree(aux_data[i].jt);
+			aux_data[i].jt = NULL;
+			aux_data[i].jt_allocated = false;
+		}
+	}
+}
+
 static int verifier_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt)
 {
 	struct bpf_insn_aux_data *aux_data = env->insn_aux_data;
@@ -21029,6 +21435,8 @@ static int verifier_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt)
 
 	adjust_insn_arrays_after_remove(env, off, cnt);
 
+	clear_insn_aux_data(aux_data, off, cnt);
+
 	memmove(aux_data + off,	aux_data + off + cnt,
 		sizeof(*aux_data) * (orig_prog_len - off - cnt));
 
@@ -21669,6 +22077,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->jited_linfo = prog->aux->jited_linfo;
 		func[i]->aux->linfo_idx = env->subprog_info[i].linfo_idx;
 		func[i]->aux->arena = prog->aux->arena;
+		func[i]->aux->used_maps = env->used_maps;
+		func[i]->aux->used_map_cnt = env->used_map_cnt;
 		num_exentries = 0;
 		insn = func[i]->insnsi;
 		for (j = 0; j < func[i]->len; j++, insn++) {
@@ -24201,23 +24611,41 @@ static bool can_jump(struct bpf_insn *insn)
 	return false;
 }
 
-static int insn_successors(struct bpf_prog *prog, u32 idx, u32 succ[2])
+/*
+ * Returns an array succ of successor instructions succ->off[0], ...,
+ * succ->off[n-1], where n = succ->off_cnt
+ */
+static struct bpf_iarray *
+insn_successors(struct bpf_verifier_env *env, u32 insn_idx)
 {
-	struct bpf_insn *insn = &prog->insnsi[idx];
-	int i = 0, insn_sz;
+	struct bpf_prog *prog = env->prog;
+	struct bpf_insn *insn = &prog->insnsi[insn_idx];
+	struct bpf_iarray *succ;
+	int insn_sz;
 	u32 dst;
 
-	insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
-	if (can_fallthrough(insn) && idx + 1 < prog->len)
-		succ[i++] = idx + insn_sz;
+	if (unlikely(insn_is_gotox(insn))) {
+		succ = env->insn_aux_data[insn_idx].jt;
+		if (verifier_bug_if(!succ, env,
+				    "aux data for insn %u doesn't contain a jump table\n",
+				    insn_idx))
+			return ERR_PTR(-EFAULT);
+	} else {
+		/* pre-allocated array of size up to 2; reset cnt, as it may be used already */
+		succ = env->succ;
+		succ->off_cnt = 0;
 
-	if (can_jump(insn)) {
-		dst = idx + jmp_offset(insn) + 1;
-		if (i == 0 || succ[0] != dst)
-			succ[i++] = dst;
-	}
+		insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
+		if (can_fallthrough(insn) && insn_idx + 1 < prog->len)
+			succ->off[succ->off_cnt++] = insn_idx + insn_sz;
 
-	return i;
+		if (can_jump(insn)) {
+			dst = insn_idx + jmp_offset(insn) + 1;
+			if (succ->off_cnt == 0 || succ->off[0] != dst)
+				succ->off[succ->off_cnt++] = dst;
+		}
+	}
+	return succ;
 }
 
 /* Each field is a register bitmask */
@@ -24412,14 +24840,18 @@ static int compute_live_registers(struct bpf_verifier_env *env)
 		for (i = 0; i < env->cfg.cur_postorder; ++i) {
 			int insn_idx = env->cfg.insn_postorder[i];
 			struct insn_live_regs *live = &state[insn_idx];
-			int succ_num;
-			u32 succ[2];
+			struct bpf_iarray *succ;
 			u16 new_out = 0;
 			u16 new_in = 0;
 
-			succ_num = insn_successors(env->prog, insn_idx, succ);
-			for (int s = 0; s < succ_num; ++s)
-				new_out |= state[succ[s]].in;
+			succ = insn_successors(env, insn_idx);
+			if (IS_ERR(succ)) {
+				err = PTR_ERR(succ);
+				goto out;
+
+			}
+			for (int s = 0; s < succ->off_cnt; ++s)
+				new_out |= state[succ->off[s]].in;
 			new_in = (new_out & ~live->def) | live->use;
 			if (new_out != live->out || new_in != live->in) {
 				live->in = new_in;
@@ -24475,11 +24907,10 @@ static int compute_scc(struct bpf_verifier_env *env)
 	const u32 insn_cnt = env->prog->len;
 	int stack_sz, dfs_sz, err = 0;
 	u32 *stack, *pre, *low, *dfs;
-	u32 succ_cnt, i, j, t, w;
+	u32 i, j, t, w;
 	u32 next_preorder_num;
 	u32 next_scc_id;
 	bool assign_scc;
-	u32 succ[2];
 
 	next_preorder_num = 1;
 	next_scc_id = 1;
@@ -24578,6 +25009,8 @@ static int compute_scc(struct bpf_verifier_env *env)
 		dfs[0] = i;
 dfs_continue:
 		while (dfs_sz) {
+			struct bpf_iarray *succ;
+
 			w = dfs[dfs_sz - 1];
 			if (pre[w] == 0) {
 				low[w] = next_preorder_num;
@@ -24586,12 +25019,17 @@ static int compute_scc(struct bpf_verifier_env *env)
 				stack[stack_sz++] = w;
 			}
 			/* Visit 'w' successors */
-			succ_cnt = insn_successors(env->prog, w, succ);
-			for (j = 0; j < succ_cnt; ++j) {
-				if (pre[succ[j]]) {
-					low[w] = min(low[w], low[succ[j]]);
+			succ = insn_successors(env, w);
+			if (IS_ERR(succ)) {
+				err = PTR_ERR(succ);
+				goto exit;
+
+			}
+			for (j = 0; j < succ->off_cnt; ++j) {
+				if (pre[succ->off[j]]) {
+					low[w] = min(low[w], low[succ->off[j]]);
 				} else {
-					dfs[dfs_sz++] = succ[j];
+					dfs[dfs_sz++] = succ->off[j];
 					goto dfs_continue;
 				}
 			}
@@ -24608,8 +25046,8 @@ static int compute_scc(struct bpf_verifier_env *env)
 			 * or if component has a self reference.
 			 */
 			assign_scc = stack[stack_sz - 1] != w;
-			for (j = 0; j < succ_cnt; ++j) {
-				if (succ[j] == w) {
+			for (j = 0; j < succ->off_cnt; ++j) {
+				if (succ->off[j] == w) {
 					assign_scc = true;
 					break;
 				}
@@ -24669,6 +25107,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	ret = -ENOMEM;
 	if (!env->insn_aux_data)
 		goto err_free_env;
+	env->succ = iarray_realloc(NULL, 2);
+	if (!env->succ)
+		goto err_free_env;
 	for (i = 0; i < len; i++)
 		env->insn_aux_data[i].orig_idx = i;
 	env->prog = *prog;
@@ -24908,10 +25349,12 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 err_unlock:
 	if (!is_priv)
 		mutex_unlock(&bpf_verifier_lock);
+	clear_insn_aux_data(env->insn_aux_data, 0, env->prog->len);
 	vfree(env->insn_aux_data);
 err_free_env:
 	kvfree(env->cfg.insn_postorder);
 	kvfree(env->scc_info);
+	kvfree(env->succ);
 	kvfree(env);
 	return ret;
 }
-- 
2.34.1



* [PATCH v2 bpf-next 09/13] bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (7 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 08/13] bpf, x86: add support for indirect jumps Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code Anton Protopopov
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add support for the indirect jump instruction (BPF_JMP|BPF_JA|BPF_X)
to the BPF disassembler.

Example output from bpftool:

   0: (79) r3 = *(u64 *)(r1 +0)
   1: (25) if r3 > 0x4 goto pc+666
   2: (67) r3 <<= 3
   3: (18) r1 = 0xffffbeefspameggs
   5: (0f) r1 += r3
   6: (79) r1 = *(u64 *)(r1 +0)
   7: (0d) gotox r1

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
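Illustration only, not part of the patch: the output format shown above
can be reproduced in user space with a few lines of C. This is a minimal
sketch under stated assumptions: struct insn is a hypothetical stand-in
that mirrors the field layout of struct bpf_insn, and format_gotox() is
an illustrative helper rather than a kernel function. The opcode
constants use the values from the BPF UAPI headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Opcode building blocks, same values as in the BPF UAPI headers. */
#define BPF_JMP 0x05
#define BPF_JA  0x00
#define BPF_X   0x08

/* Hypothetical stand-in mirroring the layout of struct bpf_insn. */
struct insn {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

/*
 * Format an indirect jump the way the example output shows it.
 * Returns the number of characters written, or -1 if the instruction
 * is not BPF_JMP|BPF_JA|BPF_X.
 */
static int format_gotox(char *buf, size_t len, const struct insn *insn)
{
	assert(buf && insn);

	if (insn->code != (BPF_JMP | BPF_JA | BPF_X))
		return -1;

	return snprintf(buf, len, "(%02x) gotox r%d",
			(unsigned int)insn->code, insn->dst_reg);
}
```

The opcode byte is 0x05 | 0x00 | 0x08 = 0x0d, which is the "(0d)" prefix
in the bpftool example; the target register is taken from dst_reg.
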
 kernel/bpf/disasm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/bpf/disasm.c b/kernel/bpf/disasm.c
index 20883c6b1546..4a1ecc6f7582 100644
--- a/kernel/bpf/disasm.c
+++ b/kernel/bpf/disasm.c
@@ -183,6 +183,13 @@ static inline bool is_mov_percpu_addr(const struct bpf_insn *insn)
 	return insn->code == (BPF_ALU64 | BPF_MOV | BPF_X) && insn->off == BPF_ADDR_PERCPU;
 }
 
+static void print_bpf_ja_indirect(bpf_insn_print_t verbose,
+				  void *private_data,
+				  const struct bpf_insn *insn)
+{
+	verbose(private_data, "(%02x) gotox r%d\n", insn->code, insn->dst_reg);
+}
+
 void print_bpf_insn(const struct bpf_insn_cbs *cbs,
 		    const struct bpf_insn *insn,
 		    bool allow_ptr_leaks)
@@ -358,6 +365,8 @@ void print_bpf_insn(const struct bpf_insn_cbs *cbs,
 		} else if (insn->code == (BPF_JMP | BPF_JA)) {
 			verbose(cbs->private_data, "(%02x) goto pc%+d\n",
 				insn->code, insn->off);
+		} else if (insn->code == (BPF_JMP | BPF_JA | BPF_X)) {
+			print_bpf_ja_indirect(verbose, cbs->private_data, insn);
 		} else if (insn->code == (BPF_JMP | BPF_JCOND) &&
 			   insn->src_reg == BPF_MAY_GOTO) {
 			verbose(cbs->private_data, "(%02x) may_goto pc%+d\n",
-- 
2.34.1



* [PATCH v2 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (8 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 09/13] bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 11/13] libbpf: support llvm-generated indirect jumps Anton Protopopov
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Commit 6c918709bd30 ("libbpf: Refactor bpf_object__reloc_code") added
bpf_object__append_subprog_code() with incorrect indentation (spaces
instead of tabs). Use tabs instead. (This also makes a subsequent
commit easier to read.)

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 tools/lib/bpf/libbpf.c | 52 +++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index fe4fc5438678..2c1f48f77680 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -6393,32 +6393,32 @@ static int
 bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main_prog,
 				struct bpf_program *subprog)
 {
-       struct bpf_insn *insns;
-       size_t new_cnt;
-       int err;
-
-       subprog->sub_insn_off = main_prog->insns_cnt;
-
-       new_cnt = main_prog->insns_cnt + subprog->insns_cnt;
-       insns = libbpf_reallocarray(main_prog->insns, new_cnt, sizeof(*insns));
-       if (!insns) {
-               pr_warn("prog '%s': failed to realloc prog code\n", main_prog->name);
-               return -ENOMEM;
-       }
-       main_prog->insns = insns;
-       main_prog->insns_cnt = new_cnt;
-
-       memcpy(main_prog->insns + subprog->sub_insn_off, subprog->insns,
-              subprog->insns_cnt * sizeof(*insns));
-
-       pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n",
-                main_prog->name, subprog->insns_cnt, subprog->name);
-
-       /* The subprog insns are now appended. Append its relos too. */
-       err = append_subprog_relos(main_prog, subprog);
-       if (err)
-               return err;
-       return 0;
+	struct bpf_insn *insns;
+	size_t new_cnt;
+	int err;
+
+	subprog->sub_insn_off = main_prog->insns_cnt;
+
+	new_cnt = main_prog->insns_cnt + subprog->insns_cnt;
+	insns = libbpf_reallocarray(main_prog->insns, new_cnt, sizeof(*insns));
+	if (!insns) {
+		pr_warn("prog '%s': failed to realloc prog code\n", main_prog->name);
+		return -ENOMEM;
+	}
+	main_prog->insns = insns;
+	main_prog->insns_cnt = new_cnt;
+
+	memcpy(main_prog->insns + subprog->sub_insn_off, subprog->insns,
+	       subprog->insns_cnt * sizeof(*insns));
+
+	pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n",
+		 main_prog->name, subprog->insns_cnt, subprog->name);
+
+	/* The subprog insns are now appended. Append its relos too. */
+	err = append_subprog_relos(main_prog, subprog);
+	if (err)
+		return err;
+	return 0;
 }
 
 static int
-- 
2.34.1



* [PATCH v2 bpf-next 11/13] libbpf: support llvm-generated indirect jumps
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (9 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 12/13] bpftool: Recognize insn_array map type Anton Protopopov
  2025-09-13 19:39 ` [PATCH v2 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps Anton Protopopov
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

For the v5 instruction set, LLVM is allowed to generate indirect jumps
for switch statements and for 'goto *rX' assembly. Every such jump is
accompanied by the necessary metadata, e.g. (`llvm-objdump -Sr ...`):

       0:       r2 = 0x0 ll
                0000000000000030:  R_BPF_64_64  BPF.JT.0.0

Here BPF.JT.0.0 is a symbol residing in the .jumptables section:

    Symbol table:
       4: 0000000000000000   240 OBJECT  GLOBAL DEFAULT     4 BPF.JT.0.0

The -bpf-min-jump-table-entries LLVM option may be used to control the
minimal size of a switch which will be converted to an indirect jump.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
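Illustration only, not part of the patch: a minimal user-space sketch of
the offset arithmetic libbpf applies to jump table entries. It assumes
that each .jumptables entry is a byte offset from the start of its code
section and that instructions are 8 bytes wide. The names subprog_off,
jt_adjust() and jt_entry_to_insn_idx() are hypothetical and exist only
for this example. Unlike the patch, the sketch does the addition in
signed 64-bit arithmetic so that a negative result is rejected
explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define INSN_SIZE 8 /* assumed size of one BPF instruction in bytes */

/* Hypothetical record of where a subprog lives before and after linking. */
struct subprog_off {
	uint32_t sec_insn_off; /* offset inside its original ELF section */
	uint32_t sub_insn_off; /* offset inside the final main program */
};

/*
 * Pick the delta to add to a section-relative instruction index.
 * Subprogs are scanned from the last appended one backwards; if the
 * referencing instruction belongs to none of them, it is part of the
 * main program, which starts at main_sec_insn_off in its section.
 */
static int jt_adjust(const struct subprog_off *sp, int cnt,
		     uint32_t main_sec_insn_off, uint32_t insn_idx)
{
	for (int i = cnt - 1; i >= 0; i--)
		if (insn_idx >= sp[i].sub_insn_off)
			return (int)sp[i].sub_insn_off - (int)sp[i].sec_insn_off;

	return -(int)main_sec_insn_off;
}

/*
 * Convert one jump table entry (byte offset) into an instruction index
 * relative to the loaded program. Returns 0 on success, -1 if the
 * result does not fit in a u32.
 */
static int jt_entry_to_insn_idx(uint64_t entry, int adjust, uint32_t *out)
{
	int64_t idx = (int64_t)(entry / INSN_SIZE) + adjust;

	assert(out);
	if (idx < 0 || idx > (int64_t)UINT32_MAX)
		return -1;

	*out = (uint32_t)idx;
	return 0;
}
```

For example, with a main program starting at instruction 20 of its
section, an entry holding byte offset 240 is instruction 30 in the
section and becomes index 10 in the loaded program.
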
 tools/lib/bpf/libbpf.c        | 150 +++++++++++++++++++++++++++++++++-
 tools/lib/bpf/libbpf_probes.c |   4 +
 tools/lib/bpf/linker.c        |  10 ++-
 3 files changed, 161 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2c1f48f77680..57cac0810d2e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -191,6 +191,7 @@ static const char * const map_type_name[] = {
 	[BPF_MAP_TYPE_USER_RINGBUF]             = "user_ringbuf",
 	[BPF_MAP_TYPE_CGRP_STORAGE]		= "cgrp_storage",
 	[BPF_MAP_TYPE_ARENA]			= "arena",
+	[BPF_MAP_TYPE_INSN_ARRAY]		= "insn_array",
 };
 
 static const char * const prog_type_name[] = {
@@ -372,6 +373,7 @@ enum reloc_type {
 	RELO_EXTERN_CALL,
 	RELO_SUBPROG_ADDR,
 	RELO_CORE,
+	RELO_INSN_ARRAY,
 };
 
 struct reloc_desc {
@@ -382,7 +384,10 @@ struct reloc_desc {
 		struct {
 			int map_idx;
 			int sym_off;
-			int ext_idx;
+			union {
+				int ext_idx;
+				int sym_size;
+			};
 		};
 	};
 };
@@ -424,6 +429,11 @@ struct bpf_sec_def {
 	libbpf_prog_attach_fn_t prog_attach_fn;
 };
 
+struct bpf_light_subprog {
+	__u32 sec_insn_off;
+	__u32 sub_insn_off;
+};
+
 /*
  * bpf_prog should be a better name but it has been used in
  * linux/filter.h.
@@ -496,6 +506,9 @@ struct bpf_program {
 	__u32 line_info_rec_size;
 	__u32 line_info_cnt;
 	__u32 prog_flags;
+
+	struct bpf_light_subprog *subprog;
+	__u32 subprog_cnt;
 };
 
 struct bpf_struct_ops {
@@ -525,6 +538,7 @@ struct bpf_struct_ops {
 #define STRUCT_OPS_SEC ".struct_ops"
 #define STRUCT_OPS_LINK_SEC ".struct_ops.link"
 #define ARENA_SEC ".addr_space.1"
+#define JUMPTABLES_SEC ".jumptables"
 
 enum libbpf_map_type {
 	LIBBPF_MAP_UNSPEC,
@@ -668,6 +682,7 @@ struct elf_state {
 	int symbols_shndx;
 	bool has_st_ops;
 	int arena_data_shndx;
+	int jumptables_data_shndx;
 };
 
 struct usdt_manager;
@@ -739,6 +754,9 @@ struct bpf_object {
 	void *arena_data;
 	size_t arena_data_sz;
 
+	void *jumptables_data;
+	size_t jumptables_data_sz;
+
 	struct kern_feature_cache *feat_cache;
 	char *token_path;
 	int token_fd;
@@ -765,6 +783,7 @@ void bpf_program__unload(struct bpf_program *prog)
 
 	zfree(&prog->func_info);
 	zfree(&prog->line_info);
+	zfree(&prog->subprog);
 }
 
 static void bpf_program__exit(struct bpf_program *prog)
@@ -3945,6 +3964,13 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 			} else if (strcmp(name, ARENA_SEC) == 0) {
 				obj->efile.arena_data = data;
 				obj->efile.arena_data_shndx = idx;
+			} else if (strcmp(name, JUMPTABLES_SEC) == 0) {
+				obj->jumptables_data = malloc(data->d_size);
+				if (!obj->jumptables_data)
+					return -ENOMEM;
+				memcpy(obj->jumptables_data, data->d_buf, data->d_size);
+				obj->jumptables_data_sz = data->d_size;
+				obj->efile.jumptables_data_shndx = idx;
 			} else {
 				pr_info("elf: skipping unrecognized data section(%d) %s\n",
 					idx, name);
@@ -4599,6 +4625,16 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
 		return 0;
 	}
 
+	/* jump table data relocation */
+	if (shdr_idx == obj->efile.jumptables_data_shndx) {
+		reloc_desc->type = RELO_INSN_ARRAY;
+		reloc_desc->insn_idx = insn_idx;
+		reloc_desc->map_idx = -1;
+		reloc_desc->sym_off = sym->st_value;
+		reloc_desc->sym_size = sym->st_size;
+		return 0;
+	}
+
 	/* generic map reference relocation */
 	if (type == LIBBPF_MAP_UNSPEC) {
 		if (!bpf_object__shndx_is_maps(obj, shdr_idx)) {
@@ -6101,6 +6137,74 @@ static void poison_kfunc_call(struct bpf_program *prog, int relo_idx,
 	insn->imm = POISON_CALL_KFUNC_BASE + ext_idx;
 }
 
+static int create_jt_map(struct bpf_object *obj, int off, int size, int adjust_off)
+{
+	const __u32 value_size = sizeof(struct bpf_insn_array_value);
+	const __u32 max_entries = size / value_size;
+	struct bpf_insn_array_value val = {};
+	int map_fd, err;
+	__u64 xlated_off;
+	__u64 *jt;
+	__u32 i;
+
+	map_fd = bpf_map_create(BPF_MAP_TYPE_INSN_ARRAY, "jt",
+				4, value_size, max_entries, NULL);
+	if (map_fd < 0)
+		return map_fd;
+
+	if (!obj->jumptables_data) {
+		pr_warn("object contains no jumptables_data\n");
+		return -EINVAL;
+	}
+	if ((off + size) > obj->jumptables_data_sz) {
+		pr_warn("jumptables_data size is %zd, trying to access %d\n",
+			obj->jumptables_data_sz, off + size);
+		return -EINVAL;
+	}
+
+	jt = (__u64 *)(obj->jumptables_data + off);
+	for (i = 0; i < max_entries; i++) {
+		/*
+		 * LLVM-generated jump tables contain u64 records, however
+		 * should contain values that fit in u32.
+		 * The adjust_off provided by the caller adjusts the offset to
+		 * be relative to the beginning of the main function
+		 */
+		xlated_off = jt[i]/sizeof(struct bpf_insn) + adjust_off;
+		if (xlated_off > UINT32_MAX) {
+			pr_warn("invalid jump table value %llx at offset %d (adjust_off %d)\n",
+				jt[i], off + i, adjust_off);
+			return -EINVAL;
+		}
+
+		val.xlated_off = xlated_off;
+		err = bpf_map_update_elem(map_fd, &i, &val, 0);
+		if (err) {
+			close(map_fd);
+			return err;
+		}
+	}
+	return map_fd;
+}
+
+/*
+ * In LLVM the .jumptables section contains jump tables entries relative to the
+ * section start. The BPF kernel-side code expects jump table offsets relative
+ * to the beginning of the program (passed in bpf(BPF_PROG_LOAD)). This helper
+ * computes a delta to be added when creating a map.
+ */
+static int jt_adjust_off(struct bpf_program *prog, int insn_idx)
+{
+	int i;
+
+	for (i = prog->subprog_cnt - 1; i >= 0; i--)
+		if (insn_idx >= prog->subprog[i].sub_insn_off)
+			return prog->subprog[i].sub_insn_off - prog->subprog[i].sec_insn_off;
+
+	return -prog->sec_insn_off;
+}
+
+
 /* Relocate data references within program code:
  *  - map references;
  *  - global variable references;
@@ -6192,6 +6296,21 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 		case RELO_CORE:
 			/* will be handled by bpf_program_record_relos() */
 			break;
+		case RELO_INSN_ARRAY: {
+			int map_fd;
+
+			map_fd = create_jt_map(obj, relo->sym_off, relo->sym_size,
+					       jt_adjust_off(prog, relo->insn_idx));
+			if (map_fd < 0) {
+				pr_warn("prog '%s': relo #%d: can't create jump table: sym_off %u\n",
+						prog->name, i, relo->sym_off);
+				return map_fd;
+			}
+			insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+			insn->imm = map_fd;
+			insn->off = 0;
+		}
+			break;
 		default:
 			pr_warn("prog '%s': relo #%d: bad relo type %d\n",
 				prog->name, i, relo->type);
@@ -6389,6 +6508,24 @@ static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_progra
 	return 0;
 }
 
+static int save_subprog_offsets(struct bpf_program *main_prog, struct bpf_program *subprog)
+{
+	size_t size = sizeof(main_prog->subprog[0]);
+	int new_cnt = main_prog->subprog_cnt + 1;
+	void *tmp;
+
+	tmp = libbpf_reallocarray(main_prog->subprog, new_cnt, size);
+	if (!tmp)
+		return -ENOMEM;
+
+	main_prog->subprog = tmp;
+	main_prog->subprog[new_cnt - 1].sec_insn_off = subprog->sec_insn_off;
+	main_prog->subprog[new_cnt - 1].sub_insn_off = subprog->sub_insn_off;
+	main_prog->subprog_cnt = new_cnt;
+
+	return 0;
+}
+
 static int
 bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main_prog,
 				struct bpf_program *subprog)
@@ -6418,6 +6555,14 @@ bpf_object__append_subprog_code(struct bpf_object *obj, struct bpf_program *main
 	err = append_subprog_relos(main_prog, subprog);
 	if (err)
 		return err;
+
+	/* Save subprogram offsets */
+	err = save_subprog_offsets(main_prog, subprog);
+	if (err) {
+		pr_warn("prog '%s': failed to add subprog offsets\n", main_prog->name);
+		return err;
+	}
+
 	return 0;
 }
 
@@ -9185,6 +9330,9 @@ void bpf_object__close(struct bpf_object *obj)
 
 	zfree(&obj->arena_data);
 
+	zfree(&obj->jumptables_data);
+	obj->jumptables_data_sz = 0;
+
 	free(obj);
 }
 
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index 9dfbe7750f56..bccf4bb747e1 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -364,6 +364,10 @@ static int probe_map_create(enum bpf_map_type map_type)
 	case BPF_MAP_TYPE_SOCKHASH:
 	case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
 		break;
+	case BPF_MAP_TYPE_INSN_ARRAY:
+		key_size	= sizeof(__u32);
+		value_size	= sizeof(struct bpf_insn_array_value);
+		break;
 	case BPF_MAP_TYPE_UNSPEC:
 	default:
 		return -EOPNOTSUPP;
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
index a469e5d4fee7..d1585baa9f14 100644
--- a/tools/lib/bpf/linker.c
+++ b/tools/lib/bpf/linker.c
@@ -28,6 +28,8 @@
 #include "str_error.h"
 
 #define BTF_EXTERN_SEC ".extern"
+#define JUMPTABLES_SEC ".jumptables"
+#define JUMPTABLES_REL_SEC ".rel.jumptables"
 
 struct src_sec {
 	const char *sec_name;
@@ -2026,6 +2028,9 @@ static int linker_append_elf_sym(struct bpf_linker *linker, struct src_obj *obj,
 			obj->sym_map[src_sym_idx] = dst_sec->sec_sym_idx;
 			return 0;
 		}
+
+		if (strcmp(src_sec->sec_name, JUMPTABLES_SEC) == 0)
+			goto add_sym;
 	}
 
 	if (sym_bind == STB_LOCAL)
@@ -2272,8 +2277,9 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob
 						insn->imm += sec->dst_off / sizeof(struct bpf_insn);
 					else
 						insn->imm += sec->dst_off;
-				} else {
-					pr_warn("relocation against STT_SECTION in non-exec section is not supported!\n");
+				} else if (strcmp(src_sec->sec_name, JUMPTABLES_REL_SEC)) {
+					pr_warn("relocation against STT_SECTION in section %s is not supported!\n",
+						src_sec->sec_name);
 					return -EINVAL;
 				}
 			}
-- 
2.34.1



* [PATCH v2 bpf-next 12/13] bpftool: Recognize insn_array map type
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (10 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 11/13] libbpf: support llvm-generated indirect jumps Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  2025-09-16 20:33   ` Quentin Monnet
  2025-09-13 19:39 ` [PATCH v2 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps Anton Protopopov
  12 siblings, 1 reply; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Teach bpftool to recognize the instruction array map type (insn_array).

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
 tools/bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
 tools/bpf/bpftool/map.c                         | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst
index 252e4c538edb..3377d4a01c62 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-map.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst
@@ -55,7 +55,7 @@ MAP COMMANDS
 |     | **devmap** | **devmap_hash** | **sockmap** | **cpumap** | **xskmap** | **sockhash**
 |     | **cgroup_storage** | **reuseport_sockarray** | **percpu_cgroup_storage**
 |     | **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** | **inode_storage**
-|     | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** | **arena** }
+|     | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** | **arena** | **insn_array** }
 
 DESCRIPTION
 ===========
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index c9de44a45778..79b90f274bef 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -1477,7 +1477,7 @@ static int do_help(int argc, char **argv)
 		"                 devmap | devmap_hash | sockmap | cpumap | xskmap | sockhash |\n"
 		"                 cgroup_storage | reuseport_sockarray | percpu_cgroup_storage |\n"
 		"                 queue | stack | sk_storage | struct_ops | ringbuf | inode_storage |\n"
-		"                 task_storage | bloom_filter | user_ringbuf | cgrp_storage | arena }\n"
+		"                 task_storage | bloom_filter | user_ringbuf | cgrp_storage | arena | insn_array }\n"
 		"       " HELP_SPEC_OPTIONS " |\n"
 		"                    {-f|--bpffs} | {-n|--nomount} }\n"
 		"",
-- 
2.34.1



* [PATCH v2 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps
  2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
                   ` (11 preceding siblings ...)
  2025-09-13 19:39 ` [PATCH v2 bpf-next 12/13] bpftool: Recognize insn_array map type Anton Protopopov
@ 2025-09-13 19:39 ` Anton Protopopov
  12 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-13 19:39 UTC (permalink / raw)
  To: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song
  Cc: Anton Protopopov

Add selftests for indirect jumps. All the indirect jumps are generated
from C switch statements, so the tests should also pass when built with
a compiler which doesn't support indirect jumps.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
---
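Illustration only, not part of the patch: a plain C model of the value
mapping that the switch-based tests check. It is a sketch that leaves
out the adjust_insns() calls and the BPF context; model_simple() and
model_two_switches() are hypothetical names used only here. A compiler
with indirect jump support may lower such dense switches to a jump
table, while others emit a chain of conditional branches; the results
are the same either way.

```c
#include <assert.h>
#include <stdint.h>

/* Mapping checked for simple_test and the *_global variants. */
static uint64_t model_simple(uint64_t x)
{
	switch (x) {
	case 0:  return 2;
	case 1:  return 3;
	case 2:  return 4;
	case 3:  return 5;
	case 4:  return 7;
	default: return 19;
	}
}

/*
 * Mapping checked for two_switches: the second switch is keyed on
 * x + !!ret, where ret is the (always non-zero) result of the first.
 */
static uint64_t model_two_switches(uint64_t x)
{
	uint64_t ret = model_simple(x);

	switch (x + !!ret) {
	case 1:  return 103;
	case 2:  return 104;
	case 3:  return 107;
	case 4:  return 205;
	case 5:  return 115;
	default: return 1019;
	}
}
```

Feeding the inputs {0, 1, 2, 3, 4, 5, 77} through these models gives
the out[] and out2[] arrays used in prog_tests/bpf_gotox.c.
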
 tools/testing/selftests/bpf/Makefile          |   4 +-
 .../selftests/bpf/prog_tests/bpf_gotox.c      | 132 ++++++
 tools/testing/selftests/bpf/progs/bpf_gotox.c | 384 ++++++++++++++++++
 3 files changed, 519 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_gotox.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 11d2a368db3e..606d7d5a48a7 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -453,7 +453,9 @@ BPF_CFLAGS = -g -Wall -Werror -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN)	\
 	     -I$(abspath $(OUTPUT)/../usr/include)			\
 	     -std=gnu11		 					\
 	     -fno-strict-aliasing 					\
-	     -Wno-compare-distinct-pointer-types
+	     -Wno-compare-distinct-pointer-types			\
+	     -Wno-initializer-overrides					\
+	     #
 # TODO: enable me -Wsign-compare
 
 CLANG_CFLAGS = $(CLANG_SYS_INCLUDES)
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c b/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
new file mode 100644
index 000000000000..90647c080579
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_gotox.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+
+#include <linux/if_ether.h>
+#include <linux/in.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/in6.h>
+#include <linux/udp.h>
+#include <linux/tcp.h>
+
+#include <sys/syscall.h>
+#include <bpf/bpf.h>
+
+#include "bpf_gotox.skel.h"
+
+static void __test_run(struct bpf_program *prog, void *ctx_in, size_t ctx_size_in)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, topts,
+			    .ctx_in = ctx_in,
+			    .ctx_size_in = ctx_size_in,
+		   );
+	int err, prog_fd;
+
+	prog_fd = bpf_program__fd(prog);
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "test_run_opts err");
+}
+
+static void check_simple(struct bpf_gotox *skel,
+			 struct bpf_program *prog,
+			 __u64 ctx_in,
+			 __u64 expected)
+{
+	skel->bss->ret_user = 0;
+
+	__test_run(prog, &ctx_in, sizeof(ctx_in));
+
+	if (!ASSERT_EQ(skel->bss->ret_user, expected, "skel->bss->ret_user"))
+		return;
+}
+
+static void check_simple_fentry(struct bpf_gotox *skel,
+				struct bpf_program *prog,
+				__u64 ctx_in,
+				__u64 expected)
+{
+	skel->bss->in_user = ctx_in;
+	skel->bss->ret_user = 0;
+
+	/* trigger */
+	usleep(1);
+
+	if (!ASSERT_EQ(skel->bss->ret_user, expected, "skel->bss->ret_user"))
+		return;
+}
+
+static void check_gotox_skel(struct bpf_gotox *skel)
+{
+	int i;
+	__u64 in[]   = {0, 1, 2, 3, 4,  5, 77};
+	__u64 out[]  = {2, 3, 4, 5, 7, 19, 19};
+	__u64 out2[] = {103, 104, 107, 205, 115, 1019, 1019};
+	__u64 in3[]  = {0, 11, 27, 31, 22, 45, 99};
+	__u64 out3[] = {2,  3,  4,  5, 19, 19, 19};
+	__u64 in4[]  = {0, 1, 2, 3, 4,  5, 77};
+	__u64 out4[] = {12, 15, 7 , 15, 12, 15, 15};
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.simple_test, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.simple_test2, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.two_switches, in[i], out2[i]);
+
+	if (0) for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.big_jump_table, in3[i], out3[i]);
+
+	if (0) for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.one_jump_two_maps, in4[i], out4[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.use_static_global1, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.use_static_global2, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.use_nonstatic_global1, in[i], out[i]);
+
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple(skel, skel->progs.use_nonstatic_global2, in[i], out[i]);
+
+	bpf_program__attach(skel->progs.simple_test_other_sec);
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple_fentry(skel, skel->progs.simple_test_other_sec, in[i], out[i]);
+
+	bpf_program__attach(skel->progs.use_static_global_other_sec);
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple_fentry(skel, skel->progs.use_static_global_other_sec, in[i], out[i]);
+
+	bpf_program__attach(skel->progs.use_nonstatic_global_other_sec);
+	for (i = 0; i < ARRAY_SIZE(in); i++)
+		check_simple_fentry(skel, skel->progs.use_nonstatic_global_other_sec, in[i], out[i]);
+}
+
+void gotox_skel(void)
+{
+	struct bpf_gotox *skel;
+	int ret;
+
+	skel = bpf_gotox__open();
+	if (!ASSERT_NEQ(skel, NULL, "bpf_gotox__open"))
+		return;
+
+	ret = bpf_gotox__load(skel);
+	if (!ASSERT_OK(ret, "bpf_gotox__load"))
+		return;
+
+	check_gotox_skel(skel);
+
+	bpf_gotox__destroy(skel);
+}
+
+void test_bpf_gotox(void)
+{
+	if (test__start_subtest("gotox_skel"))
+		gotox_skel();
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_gotox.c b/tools/testing/selftests/bpf/progs/bpf_gotox.c
new file mode 100644
index 000000000000..72917f34315c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/bpf_gotox.c
@@ -0,0 +1,384 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_misc.h"
+
+__u64 in_user;
+__u64 ret_user;
+
+struct simple_ctx {
+	__u64 x;
+};
+
+__u64 some_var;
+
+/*
+ * This function adds code which will be replaced by a different
+ * number of instructions by the verifier. This adds additional
+ * stress on testing the insn_array maps corresponding to indirect jumps.
+ */
+static __always_inline void adjust_insns(__u64 x)
+{
+	some_var ^= x + bpf_jiffies64();
+}
+
+SEC("syscall")
+int simple_test(struct simple_ctx *ctx)
+{
+	switch (ctx->x) {
+	case 0:
+		adjust_insns(ctx->x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(ctx->x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(ctx->x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(ctx->x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(ctx->x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(ctx->x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int simple_test2(struct simple_ctx *ctx)
+{
+	switch (ctx->x) {
+	case 0:
+		adjust_insns(ctx->x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(ctx->x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(ctx->x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(ctx->x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(ctx->x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(ctx->x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int simple_test_other_sec(struct pt_regs *ctx)
+{
+	__u64 x = in_user;
+
+	switch (x) {
+	case 0:
+		adjust_insns(x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int two_switches(struct simple_ctx *ctx)
+{
+	switch (ctx->x) {
+	case 0:
+		adjust_insns(ctx->x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(ctx->x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(ctx->x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(ctx->x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(ctx->x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(ctx->x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	switch (ctx->x + !!ret_user) {
+	case 1:
+		adjust_insns(ctx->x + 7);
+		ret_user = 103;
+		break;
+	case 2:
+		adjust_insns(ctx->x + 9);
+		ret_user = 104;
+		break;
+	case 3:
+		adjust_insns(ctx->x + 11);
+		ret_user = 107;
+		break;
+	case 4:
+		adjust_insns(ctx->x + 11);
+		ret_user = 205;
+		break;
+	case 5:
+		adjust_insns(ctx->x + 11);
+		ret_user = 115;
+		break;
+	default:
+		adjust_insns(ctx->x + 177);
+		ret_user = 1019;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int big_jump_table(struct simple_ctx *ctx __attribute__((unused)))
+{
+#if 0
+	const void *const jt[256] = {
+		[0 ... 255] = &&default_label,
+		[0] = &&l0,
+		[11] = &&l11,
+		[27] = &&l27,
+		[31] = &&l31,
+	};
+
+	goto *jt[ctx->x & 0xff];
+
+l0:
+	adjust_insns(ctx->x + 1);
+	ret_user = 2;
+	return 0;
+
+l11:
+	adjust_insns(ctx->x + 7);
+	ret_user = 3;
+	return 0;
+
+l27:
+	adjust_insns(ctx->x + 9);
+	ret_user = 4;
+	return 0;
+
+l31:
+	adjust_insns(ctx->x + 11);
+	ret_user = 5;
+	return 0;
+
+default_label:
+	adjust_insns(ctx->x + 177);
+	ret_user = 19;
+	return 0;
+#else
+	return 0;
+#endif
+}
+
+SEC("syscall")
+int one_jump_two_maps(struct simple_ctx *ctx __attribute__((unused)))
+{
+#if 0
+	__label__ l1, l2, l3, l4;
+	void *jt1[2] = { &&l1, &&l2 };
+	void *jt2[2] = { &&l3, &&l4 };
+	unsigned int a = ctx->x % 2;
+	unsigned int b = (ctx->x / 2) % 2;
+	volatile int ret = 0;
+
+	if (!(a < 2 && b < 2))
+		return 19;
+
+	if (ctx->x % 2)
+		goto *jt1[a];
+	else
+		goto *jt2[b];
+
+	l1: ret += 1;
+	l2: ret += 3;
+	l3: ret += 5;
+	l4: ret += 7;
+
+	ret_user = ret;
+	return ret;
+#else
+	return 0;
+#endif
+}
+
+/* Just to introduce some non-zero offsets in .text */
+static __noinline int f0(volatile struct simple_ctx *ctx __arg_ctx)
+{
+	if (ctx)
+		return 1;
+	else
+		return 13;
+}
+
+SEC("syscall") int f1(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	return f0(ctx);
+}
+
+static __noinline int __static_global(__u64 x)
+{
+	switch (x) {
+	case 0:
+		adjust_insns(x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int use_static_global1(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	return __static_global(ctx->x);
+}
+
+SEC("syscall")
+int use_static_global2(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	adjust_insns(ctx->x + 1);
+	return __static_global(ctx->x);
+}
+
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int use_static_global_other_sec(void *ctx)
+{
+	return __static_global(in_user);
+}
+
+__noinline int __nonstatic_global(__u64 x)
+{
+	switch (x) {
+	case 0:
+		adjust_insns(x + 1);
+		ret_user = 2;
+		break;
+	case 1:
+		adjust_insns(x + 7);
+		ret_user = 3;
+		break;
+	case 2:
+		adjust_insns(x + 9);
+		ret_user = 4;
+		break;
+	case 3:
+		adjust_insns(x + 11);
+		ret_user = 5;
+		break;
+	case 4:
+		adjust_insns(x + 17);
+		ret_user = 7;
+		break;
+	default:
+		adjust_insns(x + 177);
+		ret_user = 19;
+		break;
+	}
+
+	return 0;
+}
+
+SEC("syscall")
+int use_nonstatic_global1(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	return __nonstatic_global(ctx->x);
+}
+
+SEC("syscall")
+int use_nonstatic_global2(struct simple_ctx *ctx)
+{
+	ret_user = 0;
+	adjust_insns(ctx->x + 1);
+	return __nonstatic_global(ctx->x);
+}
+
+SEC("fentry/" SYS_PREFIX "sys_nanosleep")
+int use_nonstatic_global_other_sec(void *ctx)
+{
+	return __nonstatic_global(in_user);
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.34.1



* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-13 19:39 ` [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array Anton Protopopov
@ 2025-09-15  4:09   ` kernel test robot
  2025-09-20  0:30   ` Alexei Starovoitov
  1 sibling, 0 replies; 26+ messages in thread
From: kernel test robot @ 2025-09-15  4:09 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Eduard Zingerman,
	Quentin Monnet, Yonghong Song
  Cc: oe-kbuild-all

Hi Anton,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Anton-Protopopov/bpf-fix-the-return-value-of-push_stack/20250914-033453
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20250913193922.1910480-4-a.s.protopopov%40gmail.com
patch subject: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
config: x86_64-randconfig-078-20250914 (https://download.01.org/0day-ci/archive/20250915/202509151152.1FcyFoR8-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250915/202509151152.1FcyFoR8-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202509151152.1FcyFoR8-lkp@intel.com/

All errors (new ones prefixed by >>):

   ld: arch/x86/net/bpf_jit_comp.o: in function `do_jit':
>> arch/x86/net/bpf_jit_comp.c:2726:(.text+0xbb78): undefined reference to `bpf_prog_update_insn_ptr'
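
An "undefined reference" at link time for a randconfig build usually means
that the caller (here do_jit() in arch/x86/net/bpf_jit_comp.c) is compiled
under a configuration in which the object file defining the callee is not
built. A common kernel idiom for this situation is to declare the real
function under the relevant config symbol and provide an empty static inline
stub otherwise. The sketch below only illustrates that idiom in standalone C.
It is not the fix chosen by the author, and every name in it
(CONFIG_DEMO_INSN_ARRAY, struct demo_prog, demo_update_insn_ptr, demo_jit) is
hypothetical. The actual config symbol and the real signature of
bpf_prog_update_insn_ptr() are not visible in this report.

```c
/* Standalone illustration of the "stub under #ifdef" idiom; not kernel code. */
#include <assert.h>

/*
 * Hypothetical config symbol. Leaving it undefined models a randconfig
 * in which the file providing the real implementation is not built.
 */
/* #define CONFIG_DEMO_INSN_ARRAY 1 */

/* Hypothetical stand-in for a program object tracked by the JIT. */
struct demo_prog {
	unsigned int updates;	/* number of recorded offset updates */
};

#ifdef CONFIG_DEMO_INSN_ARRAY
/* "Real" implementation: records that an instruction offset was updated. */
static inline void demo_update_insn_ptr(struct demo_prog *prog,
					unsigned int xlated_off,
					unsigned int jitted_off)
{
	(void)xlated_off;
	(void)jitted_off;
	prog->updates++;
}
#else
/* Stub: keeps callers compiling and linking when the feature is off. */
static inline void demo_update_insn_ptr(struct demo_prog *prog,
					unsigned int xlated_off,
					unsigned int jitted_off)
{
	(void)prog;
	(void)xlated_off;
	(void)jitted_off;
}
#endif

/* Caller standing in for the per-instruction JIT loop. */
static unsigned int demo_jit(struct demo_prog *prog, unsigned int insn_cnt)
{
	unsigned int i;

	assert(prog);
	for (i = 0; i < insn_cnt; i++)
		demo_update_insn_ptr(prog, i, i * 4);

	return prog->updates;
}
```

With the symbol left undefined, demo_jit() calls the stub and the counter
stays at zero. Defining CONFIG_DEMO_INSN_ARRAY switches in the counting
version without touching the caller, so the call site needs no #ifdef of
its own.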


vim +2726 arch/x86/net/bpf_jit_comp.c

  1603	
  1604	static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image,
  1605			  int oldproglen, struct jit_context *ctx, bool jmp_padding)
  1606	{
  1607		bool tail_call_reachable = bpf_prog->aux->tail_call_reachable;
  1608		struct bpf_insn *insn = bpf_prog->insnsi;
  1609		bool callee_regs_used[4] = {};
  1610		int insn_cnt = bpf_prog->len;
  1611		bool seen_exit = false;
  1612		u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
  1613		void __percpu *priv_frame_ptr = NULL;
  1614		u64 arena_vm_start, user_vm_start;
  1615		void __percpu *priv_stack_ptr;
  1616		int i, excnt = 0;
  1617		int ilen, proglen = 0;
  1618		u8 *prog = temp;
  1619		u32 stack_depth;
  1620		int err;
  1621	
  1622		stack_depth = bpf_prog->aux->stack_depth;
  1623		priv_stack_ptr = bpf_prog->aux->priv_stack_ptr;
  1624		if (priv_stack_ptr) {
  1625			priv_frame_ptr = priv_stack_ptr + PRIV_STACK_GUARD_SZ + round_up(stack_depth, 8);
  1626			stack_depth = 0;
  1627		}
  1628	
  1629		arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
  1630		user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);
  1631	
  1632		detect_reg_usage(insn, insn_cnt, callee_regs_used);
  1633	
  1634		emit_prologue(&prog, image, stack_depth,
  1635			      bpf_prog_was_classic(bpf_prog), tail_call_reachable,
  1636			      bpf_is_subprog(bpf_prog), bpf_prog->aux->exception_cb);
  1637		/* Exception callback will clobber callee regs for its own use, and
  1638		 * restore the original callee regs from main prog's stack frame.
  1639		 */
  1640		if (bpf_prog->aux->exception_boundary) {
  1641			/* We also need to save r12, which is not mapped to any BPF
  1642			 * register, as we throw after entry into the kernel, which may
  1643			 * overwrite r12.
  1644			 */
  1645			push_r12(&prog);
  1646			push_callee_regs(&prog, all_callee_regs_used);
  1647		} else {
  1648			if (arena_vm_start)
  1649				push_r12(&prog);
  1650			push_callee_regs(&prog, callee_regs_used);
  1651		}
  1652		if (arena_vm_start)
  1653			emit_mov_imm64(&prog, X86_REG_R12,
  1654				       arena_vm_start >> 32, (u32) arena_vm_start);
  1655	
  1656		if (priv_frame_ptr)
  1657			emit_priv_frame_ptr(&prog, priv_frame_ptr);
  1658	
  1659		ilen = prog - temp;
  1660		if (rw_image)
  1661			memcpy(rw_image + proglen, temp, ilen);
  1662		proglen += ilen;
  1663		addrs[0] = proglen;
  1664		prog = temp;
  1665	
  1666		for (i = 1; i <= insn_cnt; i++, insn++) {
  1667			u32 abs_xlated_off = bpf_prog->aux->subprog_start + i - 1;
  1668			const s32 imm32 = insn->imm;
  1669			u32 dst_reg = insn->dst_reg;
  1670			u32 src_reg = insn->src_reg;
  1671			u8 b2 = 0, b3 = 0;
  1672			u8 *start_of_ldx;
  1673			s64 jmp_offset;
  1674			s16 insn_off;
  1675			u8 jmp_cond;
  1676			u8 *func;
  1677			int nops;
  1678	
  1679			if (priv_frame_ptr) {
  1680				if (src_reg == BPF_REG_FP)
  1681					src_reg = X86_REG_R9;
  1682	
  1683				if (dst_reg == BPF_REG_FP)
  1684					dst_reg = X86_REG_R9;
  1685			}
  1686	
  1687			switch (insn->code) {
  1688				/* ALU */
  1689			case BPF_ALU | BPF_ADD | BPF_X:
  1690			case BPF_ALU | BPF_SUB | BPF_X:
  1691			case BPF_ALU | BPF_AND | BPF_X:
  1692			case BPF_ALU | BPF_OR | BPF_X:
  1693			case BPF_ALU | BPF_XOR | BPF_X:
  1694			case BPF_ALU64 | BPF_ADD | BPF_X:
  1695			case BPF_ALU64 | BPF_SUB | BPF_X:
  1696			case BPF_ALU64 | BPF_AND | BPF_X:
  1697			case BPF_ALU64 | BPF_OR | BPF_X:
  1698			case BPF_ALU64 | BPF_XOR | BPF_X:
  1699				maybe_emit_mod(&prog, dst_reg, src_reg,
  1700					       BPF_CLASS(insn->code) == BPF_ALU64);
  1701				b2 = simple_alu_opcodes[BPF_OP(insn->code)];
  1702				EMIT2(b2, add_2reg(0xC0, dst_reg, src_reg));
  1703				break;
  1704	
  1705			case BPF_ALU64 | BPF_MOV | BPF_X:
  1706				if (insn_is_cast_user(insn)) {
  1707					if (dst_reg != src_reg)
  1708						/* 32-bit mov */
  1709						emit_mov_reg(&prog, false, dst_reg, src_reg);
  1710					/* shl dst_reg, 32 */
  1711					maybe_emit_1mod(&prog, dst_reg, true);
  1712					EMIT3(0xC1, add_1reg(0xE0, dst_reg), 32);
  1713	
  1714					/* or dst_reg, user_vm_start */
  1715					maybe_emit_1mod(&prog, dst_reg, true);
  1716					if (is_axreg(dst_reg))
  1717						EMIT1_off32(0x0D,  user_vm_start >> 32);
  1718					else
  1719						EMIT2_off32(0x81, add_1reg(0xC8, dst_reg),  user_vm_start >> 32);
  1720	
  1721					/* rol dst_reg, 32 */
  1722					maybe_emit_1mod(&prog, dst_reg, true);
  1723					EMIT3(0xC1, add_1reg(0xC0, dst_reg), 32);
  1724	
  1725					/* xor r11, r11 */
  1726					EMIT3(0x4D, 0x31, 0xDB);
  1727	
  1728					/* test dst_reg32, dst_reg32; check if lower 32-bit are zero */
  1729					maybe_emit_mod(&prog, dst_reg, dst_reg, false);
  1730					EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg));
  1731	
  1732					/* cmove r11, dst_reg; if so, set dst_reg to zero */
  1733					/* WARNING: Intel swapped src/dst register encoding in CMOVcc !!! */
  1734					maybe_emit_mod(&prog, AUX_REG, dst_reg, true);
  1735					EMIT3(0x0F, 0x44, add_2reg(0xC0, AUX_REG, dst_reg));
  1736					break;
  1737				} else if (insn_is_mov_percpu_addr(insn)) {
  1738					/* mov <dst>, <src> (if necessary) */
  1739					EMIT_mov(dst_reg, src_reg);
  1740	#ifdef CONFIG_SMP
  1741					/* add <dst>, gs:[<off>] */
  1742					EMIT2(0x65, add_1mod(0x48, dst_reg));
  1743					EMIT3(0x03, add_2reg(0x04, 0, dst_reg), 0x25);
  1744					EMIT((u32)(unsigned long)&this_cpu_off, 4);
  1745	#endif
  1746					break;
  1747				}
  1748				fallthrough;
  1749			case BPF_ALU | BPF_MOV | BPF_X:
  1750				if (insn->off == 0)
  1751					emit_mov_reg(&prog,
  1752						     BPF_CLASS(insn->code) == BPF_ALU64,
  1753						     dst_reg, src_reg);
  1754				else
  1755					emit_movsx_reg(&prog, insn->off,
  1756						       BPF_CLASS(insn->code) == BPF_ALU64,
  1757						       dst_reg, src_reg);
  1758				break;
  1759	
  1760				/* neg dst */
  1761			case BPF_ALU | BPF_NEG:
  1762			case BPF_ALU64 | BPF_NEG:
  1763				maybe_emit_1mod(&prog, dst_reg,
  1764						BPF_CLASS(insn->code) == BPF_ALU64);
  1765				EMIT2(0xF7, add_1reg(0xD8, dst_reg));
  1766				break;
  1767	
  1768			case BPF_ALU | BPF_ADD | BPF_K:
  1769			case BPF_ALU | BPF_SUB | BPF_K:
  1770			case BPF_ALU | BPF_AND | BPF_K:
  1771			case BPF_ALU | BPF_OR | BPF_K:
  1772			case BPF_ALU | BPF_XOR | BPF_K:
  1773			case BPF_ALU64 | BPF_ADD | BPF_K:
  1774			case BPF_ALU64 | BPF_SUB | BPF_K:
  1775			case BPF_ALU64 | BPF_AND | BPF_K:
  1776			case BPF_ALU64 | BPF_OR | BPF_K:
  1777			case BPF_ALU64 | BPF_XOR | BPF_K:
  1778				maybe_emit_1mod(&prog, dst_reg,
  1779						BPF_CLASS(insn->code) == BPF_ALU64);
  1780	
  1781				/*
  1782				 * b3 holds 'normal' opcode, b2 short form only valid
  1783				 * in case dst is eax/rax.
  1784				 */
  1785				switch (BPF_OP(insn->code)) {
  1786				case BPF_ADD:
  1787					b3 = 0xC0;
  1788					b2 = 0x05;
  1789					break;
  1790				case BPF_SUB:
  1791					b3 = 0xE8;
  1792					b2 = 0x2D;
  1793					break;
  1794				case BPF_AND:
  1795					b3 = 0xE0;
  1796					b2 = 0x25;
  1797					break;
  1798				case BPF_OR:
  1799					b3 = 0xC8;
  1800					b2 = 0x0D;
  1801					break;
  1802				case BPF_XOR:
  1803					b3 = 0xF0;
  1804					b2 = 0x35;
  1805					break;
  1806				}
  1807	
  1808				if (is_imm8(imm32))
  1809					EMIT3(0x83, add_1reg(b3, dst_reg), imm32);
  1810				else if (is_axreg(dst_reg))
  1811					EMIT1_off32(b2, imm32);
  1812				else
  1813					EMIT2_off32(0x81, add_1reg(b3, dst_reg), imm32);
  1814				break;
  1815	
  1816			case BPF_ALU64 | BPF_MOV | BPF_K:
  1817			case BPF_ALU | BPF_MOV | BPF_K:
  1818				emit_mov_imm32(&prog, BPF_CLASS(insn->code) == BPF_ALU64,
  1819					       dst_reg, imm32);
  1820				break;
  1821	
  1822			case BPF_LD | BPF_IMM | BPF_DW:
  1823				emit_mov_imm64(&prog, dst_reg, insn[1].imm, insn[0].imm);
  1824				insn++;
  1825				i++;
  1826				break;
  1827	
  1828				/* dst %= src, dst /= src, dst %= imm32, dst /= imm32 */
  1829			case BPF_ALU | BPF_MOD | BPF_X:
  1830			case BPF_ALU | BPF_DIV | BPF_X:
  1831			case BPF_ALU | BPF_MOD | BPF_K:
  1832			case BPF_ALU | BPF_DIV | BPF_K:
  1833			case BPF_ALU64 | BPF_MOD | BPF_X:
  1834			case BPF_ALU64 | BPF_DIV | BPF_X:
  1835			case BPF_ALU64 | BPF_MOD | BPF_K:
  1836			case BPF_ALU64 | BPF_DIV | BPF_K: {
  1837				bool is64 = BPF_CLASS(insn->code) == BPF_ALU64;
  1838	
  1839				if (dst_reg != BPF_REG_0)
  1840					EMIT1(0x50); /* push rax */
  1841				if (dst_reg != BPF_REG_3)
  1842					EMIT1(0x52); /* push rdx */
  1843	
  1844				if (BPF_SRC(insn->code) == BPF_X) {
  1845					if (src_reg == BPF_REG_0 ||
  1846					    src_reg == BPF_REG_3) {
  1847						/* mov r11, src_reg */
  1848						EMIT_mov(AUX_REG, src_reg);
  1849						src_reg = AUX_REG;
  1850					}
  1851				} else {
  1852					/* mov r11, imm32 */
  1853					EMIT3_off32(0x49, 0xC7, 0xC3, imm32);
  1854					src_reg = AUX_REG;
  1855				}
  1856	
  1857				if (dst_reg != BPF_REG_0)
  1858					/* mov rax, dst_reg */
  1859					emit_mov_reg(&prog, is64, BPF_REG_0, dst_reg);
  1860	
  1861				if (insn->off == 0) {
  1862					/*
  1863					 * xor edx, edx
  1864					 * equivalent to 'xor rdx, rdx', but one byte less
  1865					 */
  1866					EMIT2(0x31, 0xd2);
  1867	
  1868					/* div src_reg */
  1869					maybe_emit_1mod(&prog, src_reg, is64);
  1870					EMIT2(0xF7, add_1reg(0xF0, src_reg));
  1871				} else {
  1872					if (BPF_CLASS(insn->code) == BPF_ALU)
  1873						EMIT1(0x99); /* cdq */
  1874					else
  1875						EMIT2(0x48, 0x99); /* cqo */
  1876	
  1877					/* idiv src_reg */
  1878					maybe_emit_1mod(&prog, src_reg, is64);
  1879					EMIT2(0xF7, add_1reg(0xF8, src_reg));
  1880				}
  1881	
  1882				if (BPF_OP(insn->code) == BPF_MOD &&
  1883				    dst_reg != BPF_REG_3)
  1884					/* mov dst_reg, rdx */
  1885					emit_mov_reg(&prog, is64, dst_reg, BPF_REG_3);
  1886				else if (BPF_OP(insn->code) == BPF_DIV &&
  1887					 dst_reg != BPF_REG_0)
  1888					/* mov dst_reg, rax */
  1889					emit_mov_reg(&prog, is64, dst_reg, BPF_REG_0);
  1890	
  1891				if (dst_reg != BPF_REG_3)
  1892					EMIT1(0x5A); /* pop rdx */
  1893				if (dst_reg != BPF_REG_0)
  1894					EMIT1(0x58); /* pop rax */
  1895				break;
  1896			}
  1897	
  1898			case BPF_ALU | BPF_MUL | BPF_K:
  1899			case BPF_ALU64 | BPF_MUL | BPF_K:
  1900				maybe_emit_mod(&prog, dst_reg, dst_reg,
  1901					       BPF_CLASS(insn->code) == BPF_ALU64);
  1902	
  1903				if (is_imm8(imm32))
  1904					/* imul dst_reg, dst_reg, imm8 */
  1905					EMIT3(0x6B, add_2reg(0xC0, dst_reg, dst_reg),
  1906					      imm32);
  1907				else
  1908					/* imul dst_reg, dst_reg, imm32 */
  1909					EMIT2_off32(0x69,
  1910						    add_2reg(0xC0, dst_reg, dst_reg),
  1911						    imm32);
  1912				break;
  1913	
  1914			case BPF_ALU | BPF_MUL | BPF_X:
  1915			case BPF_ALU64 | BPF_MUL | BPF_X:
  1916				maybe_emit_mod(&prog, src_reg, dst_reg,
  1917					       BPF_CLASS(insn->code) == BPF_ALU64);
  1918	
  1919				/* imul dst_reg, src_reg */
  1920				EMIT3(0x0F, 0xAF, add_2reg(0xC0, src_reg, dst_reg));
  1921				break;
  1922	
  1923				/* Shifts */
  1924			case BPF_ALU | BPF_LSH | BPF_K:
  1925			case BPF_ALU | BPF_RSH | BPF_K:
  1926			case BPF_ALU | BPF_ARSH | BPF_K:
  1927			case BPF_ALU64 | BPF_LSH | BPF_K:
  1928			case BPF_ALU64 | BPF_RSH | BPF_K:
  1929			case BPF_ALU64 | BPF_ARSH | BPF_K:
  1930				maybe_emit_1mod(&prog, dst_reg,
  1931						BPF_CLASS(insn->code) == BPF_ALU64);
  1932	
  1933				b3 = simple_alu_opcodes[BPF_OP(insn->code)];
  1934				if (imm32 == 1)
  1935					EMIT2(0xD1, add_1reg(b3, dst_reg));
  1936				else
  1937					EMIT3(0xC1, add_1reg(b3, dst_reg), imm32);
  1938				break;
  1939	
  1940			case BPF_ALU | BPF_LSH | BPF_X:
  1941			case BPF_ALU | BPF_RSH | BPF_X:
  1942			case BPF_ALU | BPF_ARSH | BPF_X:
  1943			case BPF_ALU64 | BPF_LSH | BPF_X:
  1944			case BPF_ALU64 | BPF_RSH | BPF_X:
  1945			case BPF_ALU64 | BPF_ARSH | BPF_X:
  1946				/* BMI2 shifts aren't better when shift count is already in rcx */
  1947				if (boot_cpu_has(X86_FEATURE_BMI2) && src_reg != BPF_REG_4) {
  1948					/* shrx/sarx/shlx dst_reg, dst_reg, src_reg */
  1949					bool w = (BPF_CLASS(insn->code) == BPF_ALU64);
  1950					u8 op;
  1951	
  1952					switch (BPF_OP(insn->code)) {
  1953					case BPF_LSH:
  1954						op = 1; /* prefix 0x66 */
  1955						break;
  1956					case BPF_RSH:
  1957						op = 3; /* prefix 0xf2 */
  1958						break;
  1959					case BPF_ARSH:
  1960						op = 2; /* prefix 0xf3 */
  1961						break;
  1962					}
  1963	
  1964					emit_shiftx(&prog, dst_reg, src_reg, w, op);
  1965	
  1966					break;
  1967				}
  1968	
  1969				if (src_reg != BPF_REG_4) { /* common case */
  1970					/* Check for bad case when dst_reg == rcx */
  1971					if (dst_reg == BPF_REG_4) {
  1972						/* mov r11, dst_reg */
  1973						EMIT_mov(AUX_REG, dst_reg);
  1974						dst_reg = AUX_REG;
  1975					} else {
  1976						EMIT1(0x51); /* push rcx */
  1977					}
  1978					/* mov rcx, src_reg */
  1979					EMIT_mov(BPF_REG_4, src_reg);
  1980				}
  1981	
  1982				/* shl %rax, %cl | shr %rax, %cl | sar %rax, %cl */
  1983				maybe_emit_1mod(&prog, dst_reg,
  1984						BPF_CLASS(insn->code) == BPF_ALU64);
  1985	
  1986				b3 = simple_alu_opcodes[BPF_OP(insn->code)];
  1987				EMIT2(0xD3, add_1reg(b3, dst_reg));
  1988	
  1989				if (src_reg != BPF_REG_4) {
  1990					if (insn->dst_reg == BPF_REG_4)
  1991						/* mov dst_reg, r11 */
  1992						EMIT_mov(insn->dst_reg, AUX_REG);
  1993					else
  1994						EMIT1(0x59); /* pop rcx */
  1995				}
  1996	
  1997				break;
  1998	
  1999			case BPF_ALU | BPF_END | BPF_FROM_BE:
  2000			case BPF_ALU64 | BPF_END | BPF_FROM_LE:
  2001				switch (imm32) {
  2002				case 16:
  2003					/* Emit 'ror %ax, 8' to swap lower 2 bytes */
  2004					EMIT1(0x66);
  2005					if (is_ereg(dst_reg))
  2006						EMIT1(0x41);
  2007					EMIT3(0xC1, add_1reg(0xC8, dst_reg), 8);
  2008	
  2009					/* Emit 'movzwl eax, ax' */
  2010					if (is_ereg(dst_reg))
  2011						EMIT3(0x45, 0x0F, 0xB7);
  2012					else
  2013						EMIT2(0x0F, 0xB7);
  2014					EMIT1(add_2reg(0xC0, dst_reg, dst_reg));
  2015					break;
  2016				case 32:
  2017					/* Emit 'bswap eax' to swap lower 4 bytes */
  2018					if (is_ereg(dst_reg))
  2019						EMIT2(0x41, 0x0F);
  2020					else
  2021						EMIT1(0x0F);
  2022					EMIT1(add_1reg(0xC8, dst_reg));
  2023					break;
  2024				case 64:
  2025					/* Emit 'bswap rax' to swap 8 bytes */
  2026					EMIT3(add_1mod(0x48, dst_reg), 0x0F,
  2027					      add_1reg(0xC8, dst_reg));
  2028					break;
  2029				}
  2030				break;
  2031	
  2032			case BPF_ALU | BPF_END | BPF_FROM_LE:
  2033				switch (imm32) {
  2034				case 16:
  2035					/*
  2036					 * Emit 'movzwl eax, ax' to zero extend 16-bit
  2037					 * into 64 bit
  2038					 */
  2039					if (is_ereg(dst_reg))
  2040						EMIT3(0x45, 0x0F, 0xB7);
  2041					else
  2042						EMIT2(0x0F, 0xB7);
  2043					EMIT1(add_2reg(0xC0, dst_reg, dst_reg));
  2044					break;
  2045				case 32:
  2046					/* Emit 'mov eax, eax' to clear upper 32-bits */
  2047					if (is_ereg(dst_reg))
  2048						EMIT1(0x45);
  2049					EMIT2(0x89, add_2reg(0xC0, dst_reg, dst_reg));
  2050					break;
  2051				case 64:
  2052					/* nop */
  2053					break;
  2054				}
  2055				break;
  2056	
  2057				/* speculation barrier */
  2058			case BPF_ST | BPF_NOSPEC:
  2059				EMIT_LFENCE();
  2060				break;
  2061	
  2062				/* ST: *(u8*)(dst_reg + off) = imm */
  2063			case BPF_ST | BPF_MEM | BPF_B:
  2064				if (is_ereg(dst_reg))
  2065					EMIT2(0x41, 0xC6);
  2066				else
  2067					EMIT1(0xC6);
  2068				goto st;
  2069			case BPF_ST | BPF_MEM | BPF_H:
  2070				if (is_ereg(dst_reg))
  2071					EMIT3(0x66, 0x41, 0xC7);
  2072				else
  2073					EMIT2(0x66, 0xC7);
  2074				goto st;
  2075			case BPF_ST | BPF_MEM | BPF_W:
  2076				if (is_ereg(dst_reg))
  2077					EMIT2(0x41, 0xC7);
  2078				else
  2079					EMIT1(0xC7);
  2080				goto st;
  2081			case BPF_ST | BPF_MEM | BPF_DW:
  2082				EMIT2(add_1mod(0x48, dst_reg), 0xC7);
  2083	
  2084	st:			if (is_imm8(insn->off))
  2085					EMIT2(add_1reg(0x40, dst_reg), insn->off);
  2086				else
  2087					EMIT1_off32(add_1reg(0x80, dst_reg), insn->off);
  2088	
  2089				EMIT(imm32, bpf_size_to_x86_bytes(BPF_SIZE(insn->code)));
  2090				break;
  2091	
  2092				/* STX: *(u8*)(dst_reg + off) = src_reg */
  2093			case BPF_STX | BPF_MEM | BPF_B:
  2094			case BPF_STX | BPF_MEM | BPF_H:
  2095			case BPF_STX | BPF_MEM | BPF_W:
  2096			case BPF_STX | BPF_MEM | BPF_DW:
  2097				emit_stx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
  2098				break;
  2099	
  2100			case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
  2101			case BPF_ST | BPF_PROBE_MEM32 | BPF_H:
  2102			case BPF_ST | BPF_PROBE_MEM32 | BPF_W:
  2103			case BPF_ST | BPF_PROBE_MEM32 | BPF_DW:
  2104				start_of_ldx = prog;
  2105				emit_st_r12(&prog, BPF_SIZE(insn->code), dst_reg, insn->off, insn->imm);
  2106				goto populate_extable;
  2107	
  2108				/* LDX: dst_reg = *(u8*)(src_reg + r12 + off) */
  2109			case BPF_LDX | BPF_PROBE_MEM32 | BPF_B:
  2110			case BPF_LDX | BPF_PROBE_MEM32 | BPF_H:
  2111			case BPF_LDX | BPF_PROBE_MEM32 | BPF_W:
  2112			case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW:
  2113			case BPF_STX | BPF_PROBE_MEM32 | BPF_B:
  2114			case BPF_STX | BPF_PROBE_MEM32 | BPF_H:
  2115			case BPF_STX | BPF_PROBE_MEM32 | BPF_W:
  2116			case BPF_STX | BPF_PROBE_MEM32 | BPF_DW:
  2117				start_of_ldx = prog;
  2118				if (BPF_CLASS(insn->code) == BPF_LDX)
  2119					emit_ldx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
  2120				else
  2121					emit_stx_r12(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn->off);
  2122	populate_extable:
  2123				{
  2124					struct exception_table_entry *ex;
  2125					u8 *_insn = image + proglen + (start_of_ldx - temp);
  2126					u32 arena_reg, fixup_reg;
  2127					s64 delta;
  2128	
  2129					if (!bpf_prog->aux->extable)
  2130						break;
  2131	
  2132					if (excnt >= bpf_prog->aux->num_exentries) {
  2133						pr_err("mem32 extable bug\n");
  2134						return -EFAULT;
  2135					}
  2136					ex = &bpf_prog->aux->extable[excnt++];
  2137	
  2138					delta = _insn - (u8 *)&ex->insn;
  2139					/* switch ex to rw buffer for writes */
  2140					ex = (void *)rw_image + ((void *)ex - (void *)image);
  2141	
  2142					ex->insn = delta;
  2143	
  2144					ex->data = EX_TYPE_BPF;
  2145	
  2146					/*
  2147					 * src_reg/dst_reg holds the address in the arena region with upper
  2148					 * 32-bits being zero because of a preceding addr_space_cast(r<n>,
  2149					 * 0x0, 0x1) instruction. This address is adjusted with the addition
  2150					 * of arena_vm_start (see the implementation of BPF_PROBE_MEM32 and
  2151					 * BPF_PROBE_ATOMIC) before being used for the memory access. Pass
  2152					 * the reg holding the unmodified 32-bit address to
  2153					 * ex_handler_bpf().
  2154					 */
  2155					if (BPF_CLASS(insn->code) == BPF_LDX) {
  2156						arena_reg = reg2pt_regs[src_reg];
  2157						fixup_reg = reg2pt_regs[dst_reg];
  2158					} else {
  2159						arena_reg = reg2pt_regs[dst_reg];
  2160						fixup_reg = DONT_CLEAR;
  2161					}
  2162	
  2163					ex->fixup = FIELD_PREP(FIXUP_INSN_LEN_MASK, prog - start_of_ldx) |
  2164						    FIELD_PREP(FIXUP_ARENA_REG_MASK, arena_reg) |
  2165						    FIELD_PREP(FIXUP_REG_MASK, fixup_reg);
  2166					ex->fixup |= FIXUP_ARENA_ACCESS;
  2167	
  2168					ex->data |= FIELD_PREP(DATA_ARENA_OFFSET_MASK, insn->off);
  2169				}
  2170				break;
  2171	
  2172				/* LDX: dst_reg = *(u8*)(src_reg + off) */
  2173			case BPF_LDX | BPF_MEM | BPF_B:
  2174			case BPF_LDX | BPF_PROBE_MEM | BPF_B:
  2175			case BPF_LDX | BPF_MEM | BPF_H:
  2176			case BPF_LDX | BPF_PROBE_MEM | BPF_H:
  2177			case BPF_LDX | BPF_MEM | BPF_W:
  2178			case BPF_LDX | BPF_PROBE_MEM | BPF_W:
  2179			case BPF_LDX | BPF_MEM | BPF_DW:
  2180			case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
  2181				/* LDXS: dst_reg = *(s8*)(src_reg + off) */
  2182			case BPF_LDX | BPF_MEMSX | BPF_B:
  2183			case BPF_LDX | BPF_MEMSX | BPF_H:
  2184			case BPF_LDX | BPF_MEMSX | BPF_W:
  2185			case BPF_LDX | BPF_PROBE_MEMSX | BPF_B:
  2186			case BPF_LDX | BPF_PROBE_MEMSX | BPF_H:
  2187			case BPF_LDX | BPF_PROBE_MEMSX | BPF_W:
  2188				insn_off = insn->off;
  2189	
  2190				if (BPF_MODE(insn->code) == BPF_PROBE_MEM ||
  2191				    BPF_MODE(insn->code) == BPF_PROBE_MEMSX) {
  2192					/* Conservatively check that src_reg + insn->off is a kernel address:
  2193					 *   src_reg + insn->off > TASK_SIZE_MAX + PAGE_SIZE
  2194					 *   and
  2195					 *   src_reg + insn->off < VSYSCALL_ADDR
  2196					 */
  2197	
  2198					u64 limit = TASK_SIZE_MAX + PAGE_SIZE - VSYSCALL_ADDR;
  2199					u8 *end_of_jmp;
  2200	
  2201					/* movabsq r10, VSYSCALL_ADDR */
  2202					emit_mov_imm64(&prog, BPF_REG_AX, (long)VSYSCALL_ADDR >> 32,
  2203						       (u32)(long)VSYSCALL_ADDR);
  2204	
  2205					/* mov src_reg, r11 */
  2206					EMIT_mov(AUX_REG, src_reg);
  2207	
  2208					if (insn->off) {
  2209						/* add r11, insn->off */
  2210						maybe_emit_1mod(&prog, AUX_REG, true);
  2211						EMIT2_off32(0x81, add_1reg(0xC0, AUX_REG), insn->off);
  2212					}
  2213	
  2214					/* sub r11, r10 */
  2215					maybe_emit_mod(&prog, AUX_REG, BPF_REG_AX, true);
  2216					EMIT2(0x29, add_2reg(0xC0, AUX_REG, BPF_REG_AX));
  2217	
  2218					/* movabsq r10, limit */
  2219					emit_mov_imm64(&prog, BPF_REG_AX, (long)limit >> 32,
  2220						       (u32)(long)limit);
  2221	
  2222					/* cmp r10, r11 */
  2223					maybe_emit_mod(&prog, AUX_REG, BPF_REG_AX, true);
  2224					EMIT2(0x39, add_2reg(0xC0, AUX_REG, BPF_REG_AX));
  2225	
  2226					/* if unsigned '>', goto load */
  2227					EMIT2(X86_JA, 0);
  2228					end_of_jmp = prog;
  2229	
  2230					/* xor dst_reg, dst_reg */
  2231					emit_mov_imm32(&prog, false, dst_reg, 0);
  2232					/* jmp byte_after_ldx */
  2233					EMIT2(0xEB, 0);
  2234	
  2235					/* populate jmp_offset for JAE above to jump to start_of_ldx */
  2236					start_of_ldx = prog;
  2237					end_of_jmp[-1] = start_of_ldx - end_of_jmp;
  2238				}
  2239				if (BPF_MODE(insn->code) == BPF_PROBE_MEMSX ||
  2240				    BPF_MODE(insn->code) == BPF_MEMSX)
  2241					emit_ldsx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn_off);
  2242				else
  2243					emit_ldx(&prog, BPF_SIZE(insn->code), dst_reg, src_reg, insn_off);
  2244				if (BPF_MODE(insn->code) == BPF_PROBE_MEM ||
  2245				    BPF_MODE(insn->code) == BPF_PROBE_MEMSX) {
  2246					struct exception_table_entry *ex;
  2247					u8 *_insn = image + proglen + (start_of_ldx - temp);
  2248					s64 delta;
  2249	
  2250					/* populate jmp_offset for JMP above */
  2251					start_of_ldx[-1] = prog - start_of_ldx;
  2252	
  2253					if (!bpf_prog->aux->extable)
  2254						break;
  2255	
  2256					if (excnt >= bpf_prog->aux->num_exentries) {
  2257						pr_err("ex gen bug\n");
  2258						return -EFAULT;
  2259					}
  2260					ex = &bpf_prog->aux->extable[excnt++];
  2261	
  2262					delta = _insn - (u8 *)&ex->insn;
  2263					if (!is_simm32(delta)) {
  2264						pr_err("extable->insn doesn't fit into 32-bit\n");
  2265						return -EFAULT;
  2266					}
  2267					/* switch ex to rw buffer for writes */
  2268					ex = (void *)rw_image + ((void *)ex - (void *)image);
  2269	
  2270					ex->insn = delta;
  2271	
  2272					ex->data = EX_TYPE_BPF;
  2273	
  2274					if (dst_reg > BPF_REG_9) {
  2275						pr_err("verifier error\n");
  2276						return -EFAULT;
  2277					}
  2278					/*
  2279					 * Compute size of x86 insn and its target dest x86 register.
  2280					 * ex_handler_bpf() will use lower 8 bits to adjust
  2281					 * pt_regs->ip to jump over this x86 instruction
  2282					 * and upper bits to figure out which pt_regs to zero out.
  2283					 * End result: x86 insn "mov rbx, qword ptr [rax+0x14]"
  2284					 * of 4 bytes will be ignored and rbx will be zero inited.
  2285					 */
  2286					ex->fixup = FIELD_PREP(FIXUP_INSN_LEN_MASK, prog - start_of_ldx) |
  2287						    FIELD_PREP(FIXUP_REG_MASK, reg2pt_regs[dst_reg]);
  2288				}
  2289				break;
  2290	
  2291			case BPF_STX | BPF_ATOMIC | BPF_B:
  2292			case BPF_STX | BPF_ATOMIC | BPF_H:
  2293				if (!bpf_atomic_is_load_store(insn)) {
  2294					pr_err("bpf_jit: 1- and 2-byte RMW atomics are not supported\n");
  2295					return -EFAULT;
  2296				}
  2297				fallthrough;
  2298			case BPF_STX | BPF_ATOMIC | BPF_W:
  2299			case BPF_STX | BPF_ATOMIC | BPF_DW:
  2300				if (insn->imm == (BPF_AND | BPF_FETCH) ||
  2301				    insn->imm == (BPF_OR | BPF_FETCH) ||
  2302				    insn->imm == (BPF_XOR | BPF_FETCH)) {
  2303					bool is64 = BPF_SIZE(insn->code) == BPF_DW;
  2304					u32 real_src_reg = src_reg;
  2305					u32 real_dst_reg = dst_reg;
  2306					u8 *branch_target;
  2307	
  2308					/*
  2309					 * Can't be implemented with a single x86 insn.
  2310					 * Need to do a CMPXCHG loop.
  2311					 */
  2312	
  2313					/* Will need RAX as a CMPXCHG operand so save R0 */
  2314					emit_mov_reg(&prog, true, BPF_REG_AX, BPF_REG_0);
  2315					if (src_reg == BPF_REG_0)
  2316						real_src_reg = BPF_REG_AX;
  2317					if (dst_reg == BPF_REG_0)
  2318						real_dst_reg = BPF_REG_AX;
  2319	
  2320					branch_target = prog;
  2321					/* Load old value */
  2322					emit_ldx(&prog, BPF_SIZE(insn->code),
  2323						 BPF_REG_0, real_dst_reg, insn->off);
  2324					/*
  2325					 * Perform the (commutative) operation locally,
  2326					 * put the result in the AUX_REG.
  2327					 */
  2328					emit_mov_reg(&prog, is64, AUX_REG, BPF_REG_0);
  2329					maybe_emit_mod(&prog, AUX_REG, real_src_reg, is64);
  2330					EMIT2(simple_alu_opcodes[BPF_OP(insn->imm)],
  2331					      add_2reg(0xC0, AUX_REG, real_src_reg));
  2332					/* Attempt to swap in new value */
  2333					err = emit_atomic_rmw(&prog, BPF_CMPXCHG,
  2334							      real_dst_reg, AUX_REG,
  2335							      insn->off,
  2336							      BPF_SIZE(insn->code));
  2337					if (WARN_ON(err))
  2338						return err;
  2339					/*
  2340					 * ZF tells us whether we won the race. If it's
  2341					 * cleared we need to try again.
  2342					 */
  2343					EMIT2(X86_JNE, -(prog - branch_target) - 2);
  2344					/* Return the pre-modification value */
  2345					emit_mov_reg(&prog, is64, real_src_reg, BPF_REG_0);
  2346					/* Restore R0 after clobbering RAX */
  2347					emit_mov_reg(&prog, true, BPF_REG_0, BPF_REG_AX);
  2348					break;
  2349				}
  2350	
  2351				if (bpf_atomic_is_load_store(insn))
  2352					err = emit_atomic_ld_st(&prog, insn->imm, dst_reg, src_reg,
  2353								insn->off, BPF_SIZE(insn->code));
  2354				else
  2355					err = emit_atomic_rmw(&prog, insn->imm, dst_reg, src_reg,
  2356							      insn->off, BPF_SIZE(insn->code));
  2357				if (err)
  2358					return err;
  2359				break;
  2360	
  2361			case BPF_STX | BPF_PROBE_ATOMIC | BPF_B:
  2362			case BPF_STX | BPF_PROBE_ATOMIC | BPF_H:
  2363				if (!bpf_atomic_is_load_store(insn)) {
  2364					pr_err("bpf_jit: 1- and 2-byte RMW atomics are not supported\n");
  2365					return -EFAULT;
  2366				}
  2367				fallthrough;
  2368			case BPF_STX | BPF_PROBE_ATOMIC | BPF_W:
  2369			case BPF_STX | BPF_PROBE_ATOMIC | BPF_DW:
  2370				start_of_ldx = prog;
  2371	
  2372				if (bpf_atomic_is_load_store(insn))
  2373					err = emit_atomic_ld_st_index(&prog, insn->imm,
  2374								      BPF_SIZE(insn->code), dst_reg,
  2375								      src_reg, X86_REG_R12, insn->off);
  2376				else
  2377					err = emit_atomic_rmw_index(&prog, insn->imm, BPF_SIZE(insn->code),
  2378								    dst_reg, src_reg, X86_REG_R12,
  2379								    insn->off);
  2380				if (err)
  2381					return err;
  2382				goto populate_extable;
  2383	
  2384				/* call */
  2385			case BPF_JMP | BPF_CALL: {
  2386				u8 *ip = image + addrs[i - 1];
  2387	
  2388				func = (u8 *) __bpf_call_base + imm32;
  2389				if (src_reg == BPF_PSEUDO_CALL && tail_call_reachable) {
  2390					LOAD_TAIL_CALL_CNT_PTR(stack_depth);
  2391					ip += 7;
  2392				}
  2393				if (!imm32)
  2394					return -EINVAL;
  2395				if (priv_frame_ptr) {
  2396					push_r9(&prog);
  2397					ip += 2;
  2398				}
  2399				ip += x86_call_depth_emit_accounting(&prog, func, ip);
  2400				if (emit_call(&prog, func, ip))
  2401					return -EINVAL;
  2402				if (priv_frame_ptr)
  2403					pop_r9(&prog);
  2404				break;
  2405			}
  2406	
  2407			case BPF_JMP | BPF_TAIL_CALL:
  2408				if (imm32)
  2409					emit_bpf_tail_call_direct(bpf_prog,
  2410								  &bpf_prog->aux->poke_tab[imm32 - 1],
  2411								  &prog, image + addrs[i - 1],
  2412								  callee_regs_used,
  2413								  stack_depth,
  2414								  ctx);
  2415				else
  2416					emit_bpf_tail_call_indirect(bpf_prog,
  2417								    &prog,
  2418								    callee_regs_used,
  2419								    stack_depth,
  2420								    image + addrs[i - 1],
  2421								    ctx);
  2422				break;
  2423	
  2424				/* cond jump */
  2425			case BPF_JMP | BPF_JEQ | BPF_X:
  2426			case BPF_JMP | BPF_JNE | BPF_X:
  2427			case BPF_JMP | BPF_JGT | BPF_X:
  2428			case BPF_JMP | BPF_JLT | BPF_X:
  2429			case BPF_JMP | BPF_JGE | BPF_X:
  2430			case BPF_JMP | BPF_JLE | BPF_X:
  2431			case BPF_JMP | BPF_JSGT | BPF_X:
  2432			case BPF_JMP | BPF_JSLT | BPF_X:
  2433			case BPF_JMP | BPF_JSGE | BPF_X:
  2434			case BPF_JMP | BPF_JSLE | BPF_X:
  2435			case BPF_JMP32 | BPF_JEQ | BPF_X:
  2436			case BPF_JMP32 | BPF_JNE | BPF_X:
  2437			case BPF_JMP32 | BPF_JGT | BPF_X:
  2438			case BPF_JMP32 | BPF_JLT | BPF_X:
  2439			case BPF_JMP32 | BPF_JGE | BPF_X:
  2440			case BPF_JMP32 | BPF_JLE | BPF_X:
  2441			case BPF_JMP32 | BPF_JSGT | BPF_X:
  2442			case BPF_JMP32 | BPF_JSLT | BPF_X:
  2443			case BPF_JMP32 | BPF_JSGE | BPF_X:
  2444			case BPF_JMP32 | BPF_JSLE | BPF_X:
  2445				/* cmp dst_reg, src_reg */
  2446				maybe_emit_mod(&prog, dst_reg, src_reg,
  2447					       BPF_CLASS(insn->code) == BPF_JMP);
  2448				EMIT2(0x39, add_2reg(0xC0, dst_reg, src_reg));
  2449				goto emit_cond_jmp;
  2450	
  2451			case BPF_JMP | BPF_JSET | BPF_X:
  2452			case BPF_JMP32 | BPF_JSET | BPF_X:
  2453				/* test dst_reg, src_reg */
  2454				maybe_emit_mod(&prog, dst_reg, src_reg,
  2455					       BPF_CLASS(insn->code) == BPF_JMP);
  2456				EMIT2(0x85, add_2reg(0xC0, dst_reg, src_reg));
  2457				goto emit_cond_jmp;
  2458	
  2459			case BPF_JMP | BPF_JSET | BPF_K:
  2460			case BPF_JMP32 | BPF_JSET | BPF_K:
  2461				/* test dst_reg, imm32 */
  2462				maybe_emit_1mod(&prog, dst_reg,
  2463						BPF_CLASS(insn->code) == BPF_JMP);
  2464				EMIT2_off32(0xF7, add_1reg(0xC0, dst_reg), imm32);
  2465				goto emit_cond_jmp;
  2466	
  2467			case BPF_JMP | BPF_JEQ | BPF_K:
  2468			case BPF_JMP | BPF_JNE | BPF_K:
  2469			case BPF_JMP | BPF_JGT | BPF_K:
  2470			case BPF_JMP | BPF_JLT | BPF_K:
  2471			case BPF_JMP | BPF_JGE | BPF_K:
  2472			case BPF_JMP | BPF_JLE | BPF_K:
  2473			case BPF_JMP | BPF_JSGT | BPF_K:
  2474			case BPF_JMP | BPF_JSLT | BPF_K:
  2475			case BPF_JMP | BPF_JSGE | BPF_K:
  2476			case BPF_JMP | BPF_JSLE | BPF_K:
  2477			case BPF_JMP32 | BPF_JEQ | BPF_K:
  2478			case BPF_JMP32 | BPF_JNE | BPF_K:
  2479			case BPF_JMP32 | BPF_JGT | BPF_K:
  2480			case BPF_JMP32 | BPF_JLT | BPF_K:
  2481			case BPF_JMP32 | BPF_JGE | BPF_K:
  2482			case BPF_JMP32 | BPF_JLE | BPF_K:
  2483			case BPF_JMP32 | BPF_JSGT | BPF_K:
  2484			case BPF_JMP32 | BPF_JSLT | BPF_K:
  2485			case BPF_JMP32 | BPF_JSGE | BPF_K:
  2486			case BPF_JMP32 | BPF_JSLE | BPF_K:
  2487				/* test dst_reg, dst_reg to save one extra byte */
  2488				if (imm32 == 0) {
  2489					maybe_emit_mod(&prog, dst_reg, dst_reg,
  2490						       BPF_CLASS(insn->code) == BPF_JMP);
  2491					EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg));
  2492					goto emit_cond_jmp;
  2493				}
  2494	
  2495				/* cmp dst_reg, imm8/32 */
  2496				maybe_emit_1mod(&prog, dst_reg,
  2497						BPF_CLASS(insn->code) == BPF_JMP);
  2498	
  2499				if (is_imm8(imm32))
  2500					EMIT3(0x83, add_1reg(0xF8, dst_reg), imm32);
  2501				else
  2502					EMIT2_off32(0x81, add_1reg(0xF8, dst_reg), imm32);
  2503	
  2504	emit_cond_jmp:		/* Convert BPF opcode to x86 */
  2505				switch (BPF_OP(insn->code)) {
  2506				case BPF_JEQ:
  2507					jmp_cond = X86_JE;
  2508					break;
  2509				case BPF_JSET:
  2510				case BPF_JNE:
  2511					jmp_cond = X86_JNE;
  2512					break;
  2513				case BPF_JGT:
  2514					/* GT is unsigned '>', JA in x86 */
  2515					jmp_cond = X86_JA;
  2516					break;
  2517				case BPF_JLT:
  2518					/* LT is unsigned '<', JB in x86 */
  2519					jmp_cond = X86_JB;
  2520					break;
  2521				case BPF_JGE:
  2522					/* GE is unsigned '>=', JAE in x86 */
  2523					jmp_cond = X86_JAE;
  2524					break;
  2525				case BPF_JLE:
  2526					/* LE is unsigned '<=', JBE in x86 */
  2527					jmp_cond = X86_JBE;
  2528					break;
  2529				case BPF_JSGT:
  2530					/* Signed '>', GT in x86 */
  2531					jmp_cond = X86_JG;
  2532					break;
  2533				case BPF_JSLT:
  2534					/* Signed '<', LT in x86 */
  2535					jmp_cond = X86_JL;
  2536					break;
  2537				case BPF_JSGE:
  2538					/* Signed '>=', GE in x86 */
  2539					jmp_cond = X86_JGE;
  2540					break;
  2541				case BPF_JSLE:
  2542					/* Signed '<=', LE in x86 */
  2543					jmp_cond = X86_JLE;
  2544					break;
  2545				default: /* to silence GCC warning */
  2546					return -EFAULT;
  2547				}
  2548				jmp_offset = addrs[i + insn->off] - addrs[i];
  2549				if (is_imm8_jmp_offset(jmp_offset)) {
  2550					if (jmp_padding) {
  2551						/* To keep the jmp_offset valid, the extra bytes are
  2552						 * padded before the jump insn, so we subtract the
  2553						 * 2 bytes of jmp_cond insn from INSN_SZ_DIFF.
  2554						 *
  2555						 * If the previous pass already emits an imm8
  2556						 * jmp_cond, then this BPF insn won't shrink, so
  2557						 * "nops" is 0.
  2558						 *
  2559						 * On the other hand, if the previous pass emits an
  2560						 * imm32 jmp_cond, the extra 4 bytes(*) is padded to
  2561						 * keep the image from shrinking further.
  2562						 *
  2563						 * (*) imm32 jmp_cond is 6 bytes, and imm8 jmp_cond
  2564						 *     is 2 bytes, so the size difference is 4 bytes.
  2565						 */
  2566						nops = INSN_SZ_DIFF - 2;
  2567						if (nops != 0 && nops != 4) {
  2568							pr_err("unexpected jmp_cond padding: %d bytes\n",
  2569							       nops);
  2570							return -EFAULT;
  2571						}
  2572						emit_nops(&prog, nops);
  2573					}
  2574					EMIT2(jmp_cond, jmp_offset);
  2575				} else if (is_simm32(jmp_offset)) {
  2576					EMIT2_off32(0x0F, jmp_cond + 0x10, jmp_offset);
  2577				} else {
  2578					pr_err("cond_jmp gen bug %llx\n", jmp_offset);
  2579					return -EFAULT;
  2580				}
  2581	
  2582				break;
  2583	
  2584			case BPF_JMP | BPF_JA:
  2585			case BPF_JMP32 | BPF_JA:
  2586				if (BPF_CLASS(insn->code) == BPF_JMP) {
  2587					if (insn->off == -1)
  2588						/* -1 jmp instructions will always jump
  2589						 * backwards two bytes. Explicitly handling
  2590						 * this case avoids wasting too many passes
  2591						 * when there are long sequences of replaced
  2592						 * dead code.
  2593						 */
  2594						jmp_offset = -2;
  2595					else
  2596						jmp_offset = addrs[i + insn->off] - addrs[i];
  2597				} else {
  2598					if (insn->imm == -1)
  2599						jmp_offset = -2;
  2600					else
  2601						jmp_offset = addrs[i + insn->imm] - addrs[i];
  2602				}
  2603	
  2604				if (!jmp_offset) {
  2605					/*
  2606					 * If jmp_padding is enabled, the extra nops will
  2607					 * be inserted. Otherwise, optimize out nop jumps.
  2608					 */
  2609					if (jmp_padding) {
  2610						/* There are 3 possible conditions.
  2611						 * (1) This BPF_JA is already optimized out in
  2612						 *     the previous run, so there is no need
  2613						 *     to pad any extra byte (0 byte).
  2614						 * (2) The previous pass emits an imm8 jmp,
  2615						 *     so we pad 2 bytes to match the previous
  2616						 *     insn size.
  2617						 * (3) Similarly, the previous pass emits an
  2618						 *     imm32 jmp, and 5 bytes is padded.
  2619						 */
  2620						nops = INSN_SZ_DIFF;
  2621						if (nops != 0 && nops != 2 && nops != 5) {
  2622							pr_err("unexpected nop jump padding: %d bytes\n",
  2623							       nops);
  2624							return -EFAULT;
  2625						}
  2626						emit_nops(&prog, nops);
  2627					}
  2628					break;
  2629				}
  2630	emit_jmp:
  2631				if (is_imm8_jmp_offset(jmp_offset)) {
  2632					if (jmp_padding) {
  2633						/* To avoid breaking jmp_offset, the extra bytes
  2634						 * are padded before the actual jmp insn, so
  2635						 * 2 bytes is subtracted from INSN_SZ_DIFF.
  2636						 *
  2637						 * If the previous pass already emits an imm8
  2638						 * jmp, there is nothing to pad (0 byte).
  2639						 *
  2640						 * If it emits an imm32 jmp (5 bytes) previously
  2641						 * and now an imm8 jmp (2 bytes), then we pad
  2642						 * (5 - 2 = 3) bytes to stop the image from
  2643						 * shrinking further.
  2644						 */
  2645						nops = INSN_SZ_DIFF - 2;
  2646						if (nops != 0 && nops != 3) {
  2647							pr_err("unexpected jump padding: %d bytes\n",
  2648							       nops);
  2649							return -EFAULT;
  2650						}
  2651						emit_nops(&prog, INSN_SZ_DIFF - 2);
  2652					}
  2653					EMIT2(0xEB, jmp_offset);
  2654				} else if (is_simm32(jmp_offset)) {
  2655					EMIT1_off32(0xE9, jmp_offset);
  2656				} else {
  2657					pr_err("jmp gen bug %llx\n", jmp_offset);
  2658					return -EFAULT;
  2659				}
  2660				break;
  2661	
  2662			case BPF_JMP | BPF_EXIT:
  2663				if (seen_exit) {
  2664					jmp_offset = ctx->cleanup_addr - addrs[i];
  2665					goto emit_jmp;
  2666				}
  2667				seen_exit = true;
  2668				/* Update cleanup_addr */
  2669				ctx->cleanup_addr = proglen;
  2670				if (bpf_prog_was_classic(bpf_prog) &&
  2671				    !capable(CAP_SYS_ADMIN)) {
  2672					u8 *ip = image + addrs[i - 1];
  2673	
  2674					if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog))
  2675						return -EINVAL;
  2676				}
  2677				if (bpf_prog->aux->exception_boundary) {
  2678					pop_callee_regs(&prog, all_callee_regs_used);
  2679					pop_r12(&prog);
  2680				} else {
  2681					pop_callee_regs(&prog, callee_regs_used);
  2682					if (arena_vm_start)
  2683						pop_r12(&prog);
  2684				}
  2685				EMIT1(0xC9);         /* leave */
  2686				emit_return(&prog, image + addrs[i - 1] + (prog - temp));
  2687				break;
  2688	
  2689			default:
  2690				/*
  2691				 * By design x86-64 JIT should support all BPF instructions.
  2692				 * This error will be seen if new instruction was added
  2693				 * to the interpreter, but not to the JIT, or if there is
  2694				 * junk in bpf_prog.
  2695				 */
  2696				pr_err("bpf_jit: unknown opcode %02x\n", insn->code);
  2697				return -EINVAL;
  2698			}
  2699	
  2700			ilen = prog - temp;
  2701			if (ilen > BPF_MAX_INSN_SIZE) {
  2702				pr_err("bpf_jit: fatal insn size error\n");
  2703				return -EFAULT;
  2704			}
  2705	
  2706			if (image) {
  2707				/*
  2708				 * When populating the image, assert that:
  2709				 *
  2710				 *  i) We do not write beyond the allocated space, and
  2711				 * ii) addrs[i] did not change from the prior run, in order
  2712				 *     to validate assumptions made for computing branch
  2713				 *     displacements.
  2714				 */
  2715				if (unlikely(proglen + ilen > oldproglen ||
  2716					     proglen + ilen != addrs[i])) {
  2717					pr_err("bpf_jit: fatal error\n");
  2718					return -EFAULT;
  2719				}
  2720				memcpy(rw_image + proglen, temp, ilen);
  2721	
  2722				/*
  2723				 * Instruction arrays need to know how xlated code
  2724				 * maps to jitted code
  2725				 */
> 2726				bpf_prog_update_insn_ptr(bpf_prog, abs_xlated_off, proglen,
  2727							 image + proglen);
  2728			}
  2729			proglen += ilen;
  2730			addrs[i] = proglen;
  2731			prog = temp;
  2732		}
  2733	
  2734		if (image && excnt != bpf_prog->aux->num_exentries) {
  2735			pr_err("extable is not populated\n");
  2736			return -EFAULT;
  2737		}
  2738		return proglen;
  2739	}
  2740	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2 bpf-next 12/13] bpftool: Recognize insn_array map type
  2025-09-13 19:39 ` [PATCH v2 bpf-next 12/13] bpftool: Recognize insn_array map type Anton Protopopov
@ 2025-09-16 20:33   ` Quentin Monnet
  2025-09-18  8:11     ` Anton Protopopov
  0 siblings, 1 reply; 26+ messages in thread
From: Quentin Monnet @ 2025-09-16 20:33 UTC (permalink / raw)
  To: Anton Protopopov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Anton Protopopov, Daniel Borkmann, Eduard Zingerman,
	Yonghong Song

2025-09-13 19:39 UTC+0000 ~ Anton Protopopov <a.s.protopopov@gmail.com>
> Teach bpftool to recognize instruction array map type.
> 
> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> ---
>  tools/bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
>  tools/bpf/bpftool/map.c                         | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst
> index 252e4c538edb..3377d4a01c62 100644
> --- a/tools/bpf/bpftool/Documentation/bpftool-map.rst
> +++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst
> @@ -55,7 +55,7 @@ MAP COMMANDS
>  |     | **devmap** | **devmap_hash** | **sockmap** | **cpumap** | **xskmap** | **sockhash**
>  |     | **cgroup_storage** | **reuseport_sockarray** | **percpu_cgroup_storage**
>  |     | **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** | **inode_storage**
> -|     | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** | **arena** }
> +|     | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** | **arena** | **insn_array** }


Thanks Anton!
That's a long line. As you'll likely respin your series, could you wrap
and start a new line, please?


>  
>  DESCRIPTION
>  ===========
> diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
> index c9de44a45778..79b90f274bef 100644
> --- a/tools/bpf/bpftool/map.c
> +++ b/tools/bpf/bpftool/map.c
> @@ -1477,7 +1477,7 @@ static int do_help(int argc, char **argv)
>  		"                 devmap | devmap_hash | sockmap | cpumap | xskmap | sockhash |\n"
>  		"                 cgroup_storage | reuseport_sockarray | percpu_cgroup_storage |\n"
>  		"                 queue | stack | sk_storage | struct_ops | ringbuf | inode_storage |\n"
> -		"                 task_storage | bloom_filter | user_ringbuf | cgrp_storage | arena }\n"
> +		"                 task_storage | bloom_filter | user_ringbuf | cgrp_storage | arena | insn_array }\n"


Same here. Other than these:

Acked-by: Quentin Monnet <qmo@kernel.org>


* Re: [PATCH v2 bpf-next 12/13] bpftool: Recognize insn_array map type
  2025-09-16 20:33   ` Quentin Monnet
@ 2025-09-18  8:11     ` Anton Protopopov
  0 siblings, 0 replies; 26+ messages in thread
From: Anton Protopopov @ 2025-09-18  8:11 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Yonghong Song

On 25/09/16 09:33PM, Quentin Monnet wrote:
> 2025-09-13 19:39 UTC+0000 ~ Anton Protopopov <a.s.protopopov@gmail.com>
> > Teach bpftool to recognize instruction array map type.
> > 
> > Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
> > ---
> >  tools/bpf/bpftool/Documentation/bpftool-map.rst | 2 +-
> >  tools/bpf/bpftool/map.c                         | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/bpf/bpftool/Documentation/bpftool-map.rst b/tools/bpf/bpftool/Documentation/bpftool-map.rst
> > index 252e4c538edb..3377d4a01c62 100644
> > --- a/tools/bpf/bpftool/Documentation/bpftool-map.rst
> > +++ b/tools/bpf/bpftool/Documentation/bpftool-map.rst
> > @@ -55,7 +55,7 @@ MAP COMMANDS
> >  |     | **devmap** | **devmap_hash** | **sockmap** | **cpumap** | **xskmap** | **sockhash**
> >  |     | **cgroup_storage** | **reuseport_sockarray** | **percpu_cgroup_storage**
> >  |     | **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** | **inode_storage**
> > -|     | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** | **arena** }
> > +|     | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** | **arena** | **insn_array** }
> 
> 
> Thanks Anton!
> That's a long line. As you'll likely respin your series, could you wrap
> and start a new line, please?

Thanks, fixed! (I will resend the series as v3 now due to the kbuild-bot issue.)

> 
> >  
> >  DESCRIPTION
> >  ===========
> > diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
> > index c9de44a45778..79b90f274bef 100644
> > --- a/tools/bpf/bpftool/map.c
> > +++ b/tools/bpf/bpftool/map.c
> > @@ -1477,7 +1477,7 @@ static int do_help(int argc, char **argv)
> >  		"                 devmap | devmap_hash | sockmap | cpumap | xskmap | sockhash |\n"
> >  		"                 cgroup_storage | reuseport_sockarray | percpu_cgroup_storage |\n"
> >  		"                 queue | stack | sk_storage | struct_ops | ringbuf | inode_storage |\n"
> > -		"                 task_storage | bloom_filter | user_ringbuf | cgrp_storage | arena }\n"
> > +		"                 task_storage | bloom_filter | user_ringbuf | cgrp_storage | arena | insn_array }\n"
> 
> 
> Same here. Other than these:
> 
> Acked-by: Quentin Monnet <qmo@kernel.org>


* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-13 19:39 ` [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array Anton Protopopov
  2025-09-15  4:09   ` kernel test robot
@ 2025-09-20  0:30   ` Alexei Starovoitov
  2025-09-22 10:38     ` Anton Protopopov
  1 sibling, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2025-09-20  0:30 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On Sat, Sep 13, 2025 at 12:33 PM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
> --- /dev/null
> +++ b/kernel/bpf/bpf_insn_array.c
> @@ -0,0 +1,336 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +

add copyright?

> +#include <linux/bpf.h>
> +#include <linux/sort.h>
> +
> +#define MAX_INSN_ARRAY_ENTRIES 256
> +
> +struct bpf_insn_array {
> +       struct bpf_map map;
> +       struct mutex state_mutex;
> +       int state;
> +       long *ips;
> +       DECLARE_FLEX_ARRAY(struct bpf_insn_ptr, ptrs);
> +};
> +
> +enum {
> +       INSN_ARRAY_STATE_FREE = 0,
> +       INSN_ARRAY_STATE_INIT,
> +       INSN_ARRAY_STATE_READY,
> +};
> +
> +#define cast_insn_array(MAP_PTR) \
> +       container_of(MAP_PTR, struct bpf_insn_array, map)

container_of((MAP_PTR), struct bpf_insn_array, map)
checkpatch will be happier.

> +
> +#define INSN_DELETED ((u32)-1)
> +
> +static inline u32 insn_array_alloc_size(u32 max_entries)
> +{
> +       const u32 base_size = sizeof(struct bpf_insn_array);
> +       const u32 entry_size = sizeof(struct bpf_insn_ptr);
> +
> +       return base_size + entry_size * max_entries;
> +}
> +
> +static int insn_array_alloc_check(union bpf_attr *attr)
> +{
> +       if (attr->max_entries == 0 ||
> +           attr->key_size != 4 ||
> +           attr->value_size != 8 ||
> +           attr->map_flags != 0)
> +               return -EINVAL;

Use single line or two, instead of 4.

> +
> +       if (attr->max_entries > MAX_INSN_ARRAY_ENTRIES)
> +               return -E2BIG;
> +
> +       return 0;
> +}
> +
> +static void insn_array_free(struct bpf_map *map)
> +{
> +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> +
> +       kfree(insn_array->ips);
> +       bpf_map_area_free(insn_array);
> +}
> +
> +static struct bpf_map *insn_array_alloc(union bpf_attr *attr)
> +{
> +       u64 size = insn_array_alloc_size(attr->max_entries);
> +       struct bpf_insn_array *insn_array;
> +
> +       insn_array = bpf_map_area_alloc(size, NUMA_NO_NODE);
> +       if (!insn_array)
> +               return ERR_PTR(-ENOMEM);
> +
> +       insn_array->ips = kcalloc(attr->max_entries, sizeof(long), GFP_KERNEL);
> +       if (!insn_array->ips) {
> +               insn_array_free(&insn_array->map);
> +               return ERR_PTR(-ENOMEM);
> +       }
> +
> +       bpf_map_init_from_attr(&insn_array->map, attr);
> +
> +       mutex_init(&insn_array->state_mutex);
> +       insn_array->state = INSN_ARRAY_STATE_FREE;
> +
> +       return &insn_array->map;
> +}
> +
> +static int insn_array_get_next_key(struct bpf_map *map, void *key, void *next_key)
> +{
> +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> +       u32 index = key ? *(u32 *)key : U32_MAX;
> +       u32 *next = (u32 *)next_key;
> +
> +       if (index >= insn_array->map.max_entries) {
> +               *next = 0;
> +               return 0;
> +       }
> +
> +       if (index == insn_array->map.max_entries - 1)
> +               return -ENOENT;
> +
> +       *next = index + 1;
> +       return 0;
> +}

Full copy paste of array_map_get_next_key() is a bit too much.
Pls refactor array_map_get_next_key() to avoid casting
to struct bpf_array, then such a helper can work for both maps.
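A minimal userspace sketch of the refactor requested above, assuming the shared helper only needs `map->max_entries`. The `struct bpf_map` below is a one-field stand-in for the kernel structure, and `bpf_array_get_next_key()` is a hypothetical name chosen for illustration, not an existing kernel function:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct bpf_map; only the field used here. */
struct bpf_map {
	uint32_t max_entries;
};

/*
 * Hypothetical shared helper: it reads only map->max_entries, so it
 * needs no cast to struct bpf_array or struct bpf_insn_array.
 */
static int bpf_array_get_next_key(struct bpf_map *map, void *key, void *next_key)
{
	uint32_t index = key ? *(uint32_t *)key : UINT32_MAX;
	uint32_t *next = next_key;

	/* NULL or out-of-range key: restart iteration from index 0. */
	if (index >= map->max_entries) {
		*next = 0;
		return 0;
	}

	/* The last valid index has no successor. */
	if (index == map->max_entries - 1)
		return -ENOENT;

	*next = index + 1;
	return 0;
}
```

Because the helper touches nothing beyond the embedded `struct bpf_map`, both map types could point their `.map_get_next_key` callback at it.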

> +
> +static void *insn_array_lookup_elem(struct bpf_map *map, void *key)
> +{
> +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> +       u32 index = *(u32 *)key;
> +
> +       if (unlikely(index >= insn_array->map.max_entries))
> +               return NULL;
> +
> +       return &insn_array->ptrs[index].user_value;
> +}
> +
> +static long insn_array_update_elem(struct bpf_map *map, void *key, void *value, u64 map_flags)
> +{
> +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> +       u32 index = *(u32 *)key;
> +       struct bpf_insn_array_value val = {};
> +       int err = 0;
> +
> +       if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST))
> +               return -EINVAL;

copy paste gone wrong. BPF_F_LOCK is not supported here.
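A minimal sketch of the flag check with BPF_F_LOCK dropped, assuming the map should accept only BPF_ANY and BPF_EXIST. The flag values mirror the UAPI definitions; `insn_array_check_update_flags()` is a hypothetical helper written for illustration and is not part of the patch:

```c
#include <errno.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/bpf.h. */
#define BPF_ANY     0
#define BPF_NOEXIST 1
#define BPF_EXIST   2
#define BPF_F_LOCK  4

/* Hypothetical helper: validate update flags for a map without BPF_F_LOCK. */
static int insn_array_check_update_flags(uint64_t map_flags)
{
	/* Anything above BPF_EXIST, including BPF_F_LOCK, is rejected. */
	if (map_flags > BPF_EXIST)
		return -EINVAL;

	/* Every slot of an array-like map already exists. */
	if (map_flags & BPF_NOEXIST)
		return -EEXIST;

	return 0;
}
```

With the mask removed, a caller passing BPF_F_LOCK gets -EINVAL instead of having the flag silently ignored.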

> +
> +       if (unlikely(index >= insn_array->map.max_entries))
> +               return -E2BIG;
> +
> +       if (unlikely(map_flags & BPF_NOEXIST))
> +               return -EEXIST;
> +
> +       /* No updates for maps in use */
> +       if (!mutex_trylock(&insn_array->state_mutex))
> +               return -EBUSY;

trylock ?!

If I'm reading it correctly
check_map_func_compatibility() prevents usage of this helper
from the prog, so this is syscall only,
but trylock?!

> +
> +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> +               err = -EBUSY;
> +               goto unlock;
> +       }
> +
> +       copy_map_value(map, &val, value);
> +       if (val.jitted_off || val.xlated_off == INSN_DELETED) {
> +               err = -EINVAL;
> +               goto unlock;
> +       }
> +
> +       insn_array->ptrs[index].orig_xlated_off = val.xlated_off;
> +       insn_array->ptrs[index].user_value.xlated_off = val.xlated_off;
> +
> +unlock:
> +       mutex_unlock(&insn_array->state_mutex);
> +       return err;
> +}
> +
> +static long insn_array_delete_elem(struct bpf_map *map, void *key)
> +{
> +       return -EINVAL;
> +}
> +
> +static int insn_array_check_btf(const struct bpf_map *map,
> +                             const struct btf *btf,
> +                             const struct btf_type *key_type,
> +                             const struct btf_type *value_type)
> +{
> +       if (!btf_type_is_i32(key_type))
> +               return -EINVAL;
> +
> +       if (!btf_type_is_i64(value_type))
> +               return -EINVAL;
> +
> +       return 0;
> +}
> +
> +static u64 insn_array_mem_usage(const struct bpf_map *map)
> +{
> +       u64 extra_size = 0;
> +
> +       extra_size += sizeof(long) * map->max_entries; /* insn_array->ips */
> +
> +       return insn_array_alloc_size(map->max_entries) + extra_size;
> +}
> +
> +BTF_ID_LIST_SINGLE(insn_array_btf_ids, struct, bpf_insn_array)
> +
> +const struct bpf_map_ops insn_array_map_ops = {
> +       .map_alloc_check = insn_array_alloc_check,
> +       .map_alloc = insn_array_alloc,
> +       .map_free = insn_array_free,
> +       .map_get_next_key = insn_array_get_next_key,
> +       .map_lookup_elem = insn_array_lookup_elem,
> +       .map_update_elem = insn_array_update_elem,
> +       .map_delete_elem = insn_array_delete_elem,
> +       .map_check_btf = insn_array_check_btf,
> +       .map_mem_usage = insn_array_mem_usage,
> +       .map_btf_id = &insn_array_btf_ids[0],
> +};
> +
> +static bool is_insn_array(const struct bpf_map *map)
> +{
> +       return map->map_type == BPF_MAP_TYPE_INSN_ARRAY;
> +}
> +
> +static inline bool valid_offsets(const struct bpf_insn_array *insn_array,
> +                                const struct bpf_prog *prog)
> +{
> +       u32 off;
> +       int i;
> +
> +       for (i = 0; i < insn_array->map.max_entries; i++) {
> +               off = insn_array->ptrs[i].orig_xlated_off;
> +
> +               if (off >= prog->len)
> +                       return false;
> +
> +               if (off > 0) {
> +                       if (prog->insnsi[off-1].code == (BPF_LD | BPF_DW | BPF_IMM))
> +                               return false;
> +               }
> +       }
> +
> +       return true;
> +}
> +
> +int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
> +{
> +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> +       int i;
> +
> +       if (!valid_offsets(insn_array, prog))
> +               return -EINVAL;
> +
> +       /*
> +        * There can be only one program using the map
> +        */
> +       mutex_lock(&insn_array->state_mutex);
> +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> +               mutex_unlock(&insn_array->state_mutex);
> +               return -EBUSY;
> +       }
> +       insn_array->state = INSN_ARRAY_STATE_INIT;
> +       mutex_unlock(&insn_array->state_mutex);

Only the verifier calls these helpers, no?
Why all the mutexes here and below?
All these mutexes are a big red flag to me.
I will stop commenting here.


* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-20  0:30   ` Alexei Starovoitov
@ 2025-09-22 10:38     ` Anton Protopopov
  2025-09-22 16:16       ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: Anton Protopopov @ 2025-09-22 10:38 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On 25/09/19 05:30PM, Alexei Starovoitov wrote:
> On Sat, Sep 13, 2025 at 12:33 PM Anton Protopopov
> <a.s.protopopov@gmail.com> wrote:
> > --- /dev/null
> > +++ b/kernel/bpf/bpf_insn_array.c
> > @@ -0,0 +1,336 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +
> 
> add copyright?

Yes, thanks!

> > +#include <linux/bpf.h>
> > +#include <linux/sort.h>
> > +
> > +#define MAX_INSN_ARRAY_ENTRIES 256
> > +
> > +struct bpf_insn_array {
> > +       struct bpf_map map;
> > +       struct mutex state_mutex;
> > +       int state;
> > +       long *ips;
> > +       DECLARE_FLEX_ARRAY(struct bpf_insn_ptr, ptrs);
> > +};
> > +
> > +enum {
> > +       INSN_ARRAY_STATE_FREE = 0,
> > +       INSN_ARRAY_STATE_INIT,
> > +       INSN_ARRAY_STATE_READY,
> > +};
> > +
> > +#define cast_insn_array(MAP_PTR) \
> > +       container_of(MAP_PTR, struct bpf_insn_array, map)
> 
> container_of((MAP_PTR), struct bpf_insn_array, map)
> checkpatch will be happier.

Thanks, fixed

> > +
> > +#define INSN_DELETED ((u32)-1)
> > +
> > +static inline u32 insn_array_alloc_size(u32 max_entries)
> > +{
> > +       const u32 base_size = sizeof(struct bpf_insn_array);
> > +       const u32 entry_size = sizeof(struct bpf_insn_ptr);
> > +
> > +       return base_size + entry_size * max_entries;
> > +}
> > +
> > +static int insn_array_alloc_check(union bpf_attr *attr)
> > +{
> > +       if (attr->max_entries == 0 ||
> > +           attr->key_size != 4 ||
> > +           attr->value_size != 8 ||
> > +           attr->map_flags != 0)
> > +               return -EINVAL;
> 
> Use single line or two, instead of 4.

Done

> > +
> > +       if (attr->max_entries > MAX_INSN_ARRAY_ENTRIES)
> > +               return -E2BIG;
> > +
> > +       return 0;
> > +}
> > +
> > +static void insn_array_free(struct bpf_map *map)
> > +{
> > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > +
> > +       kfree(insn_array->ips);
> > +       bpf_map_area_free(insn_array);
> > +}
> > +
> > +static struct bpf_map *insn_array_alloc(union bpf_attr *attr)
> > +{
> > +       u64 size = insn_array_alloc_size(attr->max_entries);
> > +       struct bpf_insn_array *insn_array;
> > +
> > +       insn_array = bpf_map_area_alloc(size, NUMA_NO_NODE);
> > +       if (!insn_array)
> > +               return ERR_PTR(-ENOMEM);
> > +
> > +       insn_array->ips = kcalloc(attr->max_entries, sizeof(long), GFP_KERNEL);
> > +       if (!insn_array->ips) {
> > +               insn_array_free(&insn_array->map);
> > +               return ERR_PTR(-ENOMEM);
> > +       }
> > +
> > +       bpf_map_init_from_attr(&insn_array->map, attr);
> > +
> > +       mutex_init(&insn_array->state_mutex);
> > +       insn_array->state = INSN_ARRAY_STATE_FREE;
> > +
> > +       return &insn_array->map;
> > +}
> > +
> > +static int insn_array_get_next_key(struct bpf_map *map, void *key, void *next_key)
> > +{
> > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > +       u32 index = key ? *(u32 *)key : U32_MAX;
> > +       u32 *next = (u32 *)next_key;
> > +
> > +       if (index >= insn_array->map.max_entries) {
> > +               *next = 0;
> > +               return 0;
> > +       }
> > +
> > +       if (index == insn_array->map.max_entries - 1)
> > +               return -ENOENT;
> > +
> > +       *next = index + 1;
> > +       return 0;
> > +}
> 
> Full copy paste of array_map_get_next_key() is a bit too much.
> Pls refactor array_map_get_next_key() to avoid casting
> to struct bpf_array, then such a helper can work for both maps.

Ok, thanks, will do.

> > +
> > +static void *insn_array_lookup_elem(struct bpf_map *map, void *key)
> > +{
> > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > +       u32 index = *(u32 *)key;
> > +
> > +       if (unlikely(index >= insn_array->map.max_entries))
> > +               return NULL;
> > +
> > +       return &insn_array->ptrs[index].user_value;
> > +}
> > +
> > +static long insn_array_update_elem(struct bpf_map *map, void *key, void *value, u64 map_flags)
> > +{
> > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > +       u32 index = *(u32 *)key;
> > +       struct bpf_insn_array_value val = {};
> > +       int err = 0;
> > +
> > +       if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST))
> > +               return -EINVAL;
> 
> copy paste gone wrong. BPF_F_LOCK is not supported here.

thanks, removed

> > +
> > +       if (unlikely(index >= insn_array->map.max_entries))
> > +               return -E2BIG;
> > +
> > +       if (unlikely(map_flags & BPF_NOEXIST))
> > +               return -EEXIST;
> > +
> > +       /* No updates for maps in use */
> > +       if (!mutex_trylock(&insn_array->state_mutex))
> > +               return -EBUSY;
> 
> trylock ?!
> 
> If I'm reading it correctly
> check_map_func_compatibility() prevents usage of this helper
> from the prog, so this is syscall only,
> but trylock?!

See the comment below.

> > +
> > +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> > +               err = -EBUSY;
> > +               goto unlock;
> > +       }
> > +
> > +       copy_map_value(map, &val, value);
> > +       if (val.jitted_off || val.xlated_off == INSN_DELETED) {
> > +               err = -EINVAL;
> > +               goto unlock;
> > +       }
> > +
> > +       insn_array->ptrs[index].orig_xlated_off = val.xlated_off;
> > +       insn_array->ptrs[index].user_value.xlated_off = val.xlated_off;
> > +
> > +unlock:
> > +       mutex_unlock(&insn_array->state_mutex);
> > +       return err;
> > +}
> > +
> > +static long insn_array_delete_elem(struct bpf_map *map, void *key)
> > +{
> > +       return -EINVAL;
> > +}
> > +
> > +static int insn_array_check_btf(const struct bpf_map *map,
> > +                             const struct btf *btf,
> > +                             const struct btf_type *key_type,
> > +                             const struct btf_type *value_type)
> > +{
> > +       if (!btf_type_is_i32(key_type))
> > +               return -EINVAL;
> > +
> > +       if (!btf_type_is_i64(value_type))
> > +               return -EINVAL;
> > +
> > +       return 0;
> > +}
> > +
> > +static u64 insn_array_mem_usage(const struct bpf_map *map)
> > +{
> > +       u64 extra_size = 0;
> > +
> > +       extra_size += sizeof(long) * map->max_entries; /* insn_array->ips */
> > +
> > +       return insn_array_alloc_size(map->max_entries) + extra_size;
> > +}
> > +
> > +BTF_ID_LIST_SINGLE(insn_array_btf_ids, struct, bpf_insn_array)
> > +
> > +const struct bpf_map_ops insn_array_map_ops = {
> > +       .map_alloc_check = insn_array_alloc_check,
> > +       .map_alloc = insn_array_alloc,
> > +       .map_free = insn_array_free,
> > +       .map_get_next_key = insn_array_get_next_key,
> > +       .map_lookup_elem = insn_array_lookup_elem,
> > +       .map_update_elem = insn_array_update_elem,
> > +       .map_delete_elem = insn_array_delete_elem,
> > +       .map_check_btf = insn_array_check_btf,
> > +       .map_mem_usage = insn_array_mem_usage,
> > +       .map_btf_id = &insn_array_btf_ids[0],
> > +};
> > +
> > +static bool is_insn_array(const struct bpf_map *map)
> > +{
> > +       return map->map_type == BPF_MAP_TYPE_INSN_ARRAY;
> > +}
> > +
> > +static inline bool valid_offsets(const struct bpf_insn_array *insn_array,
> > +                                const struct bpf_prog *prog)
> > +{
> > +       u32 off;
> > +       int i;
> > +
> > +       for (i = 0; i < insn_array->map.max_entries; i++) {
> > +               off = insn_array->ptrs[i].orig_xlated_off;
> > +
> > +               if (off >= prog->len)
> > +                       return false;
> > +
> > +               if (off > 0) {
> > +                       if (prog->insnsi[off-1].code == (BPF_LD | BPF_DW | BPF_IMM))
> > +                               return false;
> > +               }
> > +       }
> > +
> > +       return true;
> > +}
> > +
> > +int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
> > +{
> > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > +       int i;
> > +
> > +       if (!valid_offsets(insn_array, prog))
> > +               return -EINVAL;
> > +
> > +       /*
> > +        * There can be only one program using the map
> > +        */
> > +       mutex_lock(&insn_array->state_mutex);
> > +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> > +               mutex_unlock(&insn_array->state_mutex);
> > +               return -EBUSY;
> > +       }
> > +       insn_array->state = INSN_ARRAY_STATE_INIT;
> > +       mutex_unlock(&insn_array->state_mutex);
> 
> only verifier calls this helpers, no?
> Why all the mutexes here and below ?
> All the mutexes is a big red flag to me.
> Will stop any further comments here.

Mutex came here from the future patch for static keys.
I will see how to rewrite this with just an atomic state.
(Try lock came from fixing some robot report which I struggle to find now...)


* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-22 10:38     ` Anton Protopopov
@ 2025-09-22 16:16       ` Alexei Starovoitov
  2025-09-22 17:37         ` Anton Protopopov
  0 siblings, 1 reply; 26+ messages in thread
From: Alexei Starovoitov @ 2025-09-22 16:16 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On Mon, Sep 22, 2025 at 3:32 AM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
> > > +int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
> > > +{
> > > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > > +       int i;
> > > +
> > > +       if (!valid_offsets(insn_array, prog))
> > > +               return -EINVAL;
> > > +
> > > +       /*
> > > +        * There can be only one program using the map
> > > +        */
> > > +       mutex_lock(&insn_array->state_mutex);
> > > +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> > > +               mutex_unlock(&insn_array->state_mutex);
> > > +               return -EBUSY;
> > > +       }
> > > +       insn_array->state = INSN_ARRAY_STATE_INIT;
> > > +       mutex_unlock(&insn_array->state_mutex);
> >
> > only verifier calls this helpers, no?
> > Why all the mutexes here and below ?
> > All the mutexes is a big red flag to me.
> > Will stop any further comments here.
>
> Mutex came here from the future patch for static keys.
> I will see how to rewrite this with just an atomic state.

I don't follow. Who will be calling them other than the verifier?
Some kfunc? I couldn't find that in the patch set.
If so, add synchronization logic in the patch set that
actually needs it. This one does not. So don't add
any mutex or atomics here.


* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-22 16:16       ` Alexei Starovoitov
@ 2025-09-22 17:37         ` Anton Protopopov
  2025-09-22 17:57           ` Alexei Starovoitov
  0 siblings, 1 reply; 26+ messages in thread
From: Anton Protopopov @ 2025-09-22 17:37 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On 25/09/22 09:16AM, Alexei Starovoitov wrote:
> On Mon, Sep 22, 2025 at 3:32 AM Anton Protopopov
> <a.s.protopopov@gmail.com> wrote:
> > > > +int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
> > > > +{
> > > > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > > > +       int i;
> > > > +
> > > > +       if (!valid_offsets(insn_array, prog))
> > > > +               return -EINVAL;
> > > > +
> > > > +       /*
> > > > +        * There can be only one program using the map
> > > > +        */
> > > > +       mutex_lock(&insn_array->state_mutex);
> > > > +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> > > > +               mutex_unlock(&insn_array->state_mutex);
> > > > +               return -EBUSY;
> > > > +       }
> > > > +       insn_array->state = INSN_ARRAY_STATE_INIT;
> > > > +       mutex_unlock(&insn_array->state_mutex);
> > >
> > > only verifier calls this helpers, no?
> > > Why all the mutexes here and below ?
> > > All the mutexes is a big red flag to me.
> > > Will stop any further comments here.
> >
> > Mutex came here from the future patch for static keys.
> > I will see how to rewrite this with just an atomic state.
> 
> I don't follow. Who will be calling them other than the verifier?
> Some kfunc? I couldn't find that in the patch set.
> If so, add synchronization logic in the patch set that
> > actually needs it. This one does not. So don't add
> any mutex or atomics here.

The usage of this map is as follows:

  1. A user creates it and fills in the values using the map_update_element (syscall)
  2. Then the program is loaded

The map <-> program is 1:1 relation, so I want to prevent users from

  1. Updating the map after the program started loading
  2. Allowing two programs to use the same map (while, say, loading simultaneously)

At the same time I want map to be reusable for the same program for the case
when the program failed to load and is reloaded with the log buffer.
So there should be some synchronisation mechanism.

(In future patchset, the bpf(STATIC_KEY_UPDATE) syscall needs to execute. It
needs to be sure that the map was successfully loaded with the program. But
you're right that this doesn't make sense to leak part of this patch into this
patchset.)

Does this make sense?
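
To make the three requirements above concrete (no updates once loading has started, one program per map, map reusable after a failed load), here is a minimal user-space model written with C11 atomics. It is only a sketch: the struct, the state names and the model_* helpers are all hypothetical and are not the kernel's struct bpf_insn_array or its API.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_MAX_ENTRIES 4

enum model_state {
	MODEL_FREE,	/* no program attached, updates allowed */
	MODEL_INIT,	/* a program load is in progress */
	MODEL_READY,	/* a program was loaded successfully */
};

/* Hypothetical model of the map, not the kernel structure */
struct insn_array_model {
	_Atomic int state;
	uint32_t xlated_off[MODEL_MAX_ENTRIES];
};

/* Claim the map for one program load; fails if it is already claimed */
static int model_claim(struct insn_array_model *m)
{
	int expected = MODEL_FREE;

	if (!atomic_compare_exchange_strong(&m->state, &expected, MODEL_INIT))
		return -EBUSY;
	return 0;
}

/* Finish a load: on failure the map goes back to FREE and can be reused */
static void model_finish(struct insn_array_model *m, int load_err)
{
	/* Only a caller that claimed the map may finish the load */
	assert(atomic_load(&m->state) == MODEL_INIT);
	atomic_store(&m->state, load_err ? MODEL_FREE : MODEL_READY);
}

/*
 * Syscall-side update. The state check and the write are not one atomic
 * step here, so this models the rule only, not a complete locking scheme.
 */
static int model_update(struct insn_array_model *m, uint32_t index, uint32_t off)
{
	if (index >= MODEL_MAX_ENTRIES)
		return -E2BIG;
	if (atomic_load(&m->state) != MODEL_FREE)
		return -EBUSY;
	m->xlated_off[index] = off;
	return 0;
}
```

In this model a second model_claim() and any model_update() return -EBUSY while a load is in flight, and model_finish() with a non-zero error makes both possible again.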


* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-22 17:37         ` Anton Protopopov
@ 2025-09-22 17:57           ` Alexei Starovoitov
  2025-09-22 19:23             ` Anton Protopopov
  2025-09-23  9:55             ` Anton Protopopov
  0 siblings, 2 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2025-09-22 17:57 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On Mon, Sep 22, 2025 at 10:31 AM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
>
> On 25/09/22 09:16AM, Alexei Starovoitov wrote:
> > On Mon, Sep 22, 2025 at 3:32 AM Anton Protopopov
> > <a.s.protopopov@gmail.com> wrote:
> > > > > +int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
> > > > > +{
> > > > > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > > > > +       int i;
> > > > > +
> > > > > +       if (!valid_offsets(insn_array, prog))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       /*
> > > > > +        * There can be only one program using the map
> > > > > +        */
> > > > > +       mutex_lock(&insn_array->state_mutex);
> > > > > +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> > > > > +               mutex_unlock(&insn_array->state_mutex);
> > > > > +               return -EBUSY;
> > > > > +       }
> > > > > +       insn_array->state = INSN_ARRAY_STATE_INIT;
> > > > > +       mutex_unlock(&insn_array->state_mutex);
> > > >
> > > > only verifier calls this helpers, no?
> > > > Why all the mutexes here and below ?
> > > > All the mutexes is a big red flag to me.
> > > > Will stop any further comments here.
> > >
> > > Mutex came here from the future patch for static keys.
> > > I will see how to rewrite this with just an atomic state.
> >
> > I don't follow. Who will be calling them other than the verifier?
> > Some kfunc? I couldn't find that in the patch set.
> > If so, add synchronization logic in the patch set that
> > > actually needs it. This one does not. So don't add
> > any mutex or atomics here.
>
> The usage of this map is as follows:
>
>   1. A user creates it and fills in the values using the map_update_element (syscall)
>   2. Then the program is loaded
>
> The map <-> program is 1:1 relation, so I want to prevent users from
>
>   1. Updating the map after the program started loading
>   2. Allowing two programs to use the same map (while, say, loading simultaneously)

Then the user space should freeze the map after updating and
before loading.
As far as 1-1 relation, we just landed exclusive map support
that ties a map to one specific program.
This mechanism can be used or 1-1 can be established by the kernel
internally.

> At the same time I want map to be reusable for the same program for the case
> when the program failed to load and is reloaded with the log buffer.
> So there should be some synchronisation mechanism.
>
> (In future patchset, the bpf(STATIC_KEY_UPDATE) syscall needs to execute. It
> needs to be sure that the map was successfully loaded with the program. But
> you're right that this doesn't make sense to leak part of this patch into this
> patchset.)

Even when that bit will be available it won't be modifying the map.
At best it will flip flag or bit whether the branch is nop or jmp.
I still don't see a need for mutexes.


* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-22 17:57           ` Alexei Starovoitov
@ 2025-09-22 19:23             ` Anton Protopopov
  2025-09-22 20:24               ` Alexei Starovoitov
  2025-09-23  9:55             ` Anton Protopopov
  1 sibling, 1 reply; 26+ messages in thread
From: Anton Protopopov @ 2025-09-22 19:23 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On 25/09/22 10:57AM, Alexei Starovoitov wrote:
> On Mon, Sep 22, 2025 at 10:31 AM Anton Protopopov
> <a.s.protopopov@gmail.com> wrote:
> >
> > On 25/09/22 09:16AM, Alexei Starovoitov wrote:
> > > On Mon, Sep 22, 2025 at 3:32 AM Anton Protopopov
> > > <a.s.protopopov@gmail.com> wrote:
> > > > > > +int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
> > > > > > +{
> > > > > > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > > > > > +       int i;
> > > > > > +
> > > > > > +       if (!valid_offsets(insn_array, prog))
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       /*
> > > > > > +        * There can be only one program using the map
> > > > > > +        */
> > > > > > +       mutex_lock(&insn_array->state_mutex);
> > > > > > +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> > > > > > +               mutex_unlock(&insn_array->state_mutex);
> > > > > > +               return -EBUSY;
> > > > > > +       }
> > > > > > +       insn_array->state = INSN_ARRAY_STATE_INIT;
> > > > > > +       mutex_unlock(&insn_array->state_mutex);
> > > > >
> > > > > only verifier calls this helpers, no?
> > > > > Why all the mutexes here and below ?
> > > > > All the mutexes is a big red flag to me.
> > > > > Will stop any further comments here.
> > > >
> > > > Mutex came here from the future patch for static keys.
> > > > I will see how to rewrite this with just an atomic state.
> > >
> > > I don't follow. Who will be calling them other than the verifier?
> > > Some kfunc? I couldn't find that in the patch set.
> > > If so, add synchronization logic in the patch set that
> > > > actually needs it. This one does not. So don't add
> > > any mutex or atomics here.
> >
> > The usage of this map is as follows:
> >
> >   1. A user creates it and fills in the values using the map_update_element (syscall)
> >   2. Then the program is loaded
> >
> > The map <-> program is 1:1 relation, so I want to prevent users from
> >
> >   1. Updating the map after the program started loading
> >   2. Allowing two programs to use the same map (while, say, loading simultaneously)
> 
> Then the user space should freeze the map after updating and
> before loading.
> As far as 1-1 relation, we just landed exclusive map support
> that ties a map to one specific program.
> This mechanism can be used or 1-1 can be established by the kernel
> internally.

I actually did it via frozen first, and then removed it after Andrii's
comments. Will bring it back and remove all the mutexes.

> > At the same time I want map to be reusable for the same program for the case
> > when the program failed to load and is reloaded with the log buffer.
> > So there should be some synchronisation mechanism.
> >
> > (In future patchset, the bpf(STATIC_KEY_UPDATE) syscall needs to execute. It
> > needs to be sure that the map was successfully loaded with the program. But
> > you're right that this doesn't make sense to leak part of this patch into this
> > patchset.)
> 
> Even when that bit will be available it won't be modifying the map.
> At best it will flip flag or bit whether the branch is nop or jmp.
> I still don't see a need for mutexes.

ok


* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-22 19:23             ` Anton Protopopov
@ 2025-09-22 20:24               ` Alexei Starovoitov
  0 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2025-09-22 20:24 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On Mon, Sep 22, 2025 at 12:17 PM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
>
> On 25/09/22 10:57AM, Alexei Starovoitov wrote:
> > On Mon, Sep 22, 2025 at 10:31 AM Anton Protopopov
> > <a.s.protopopov@gmail.com> wrote:
> > >
> > > On 25/09/22 09:16AM, Alexei Starovoitov wrote:
> > > > On Mon, Sep 22, 2025 at 3:32 AM Anton Protopopov
> > > > <a.s.protopopov@gmail.com> wrote:
> > > > > > > +int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
> > > > > > > +{
> > > > > > > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > > > > > > +       int i;
> > > > > > > +
> > > > > > > +       if (!valid_offsets(insn_array, prog))
> > > > > > > +               return -EINVAL;
> > > > > > > +
> > > > > > > +       /*
> > > > > > > +        * There can be only one program using the map
> > > > > > > +        */
> > > > > > > +       mutex_lock(&insn_array->state_mutex);
> > > > > > > +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> > > > > > > +               mutex_unlock(&insn_array->state_mutex);
> > > > > > > +               return -EBUSY;
> > > > > > > +       }
> > > > > > > +       insn_array->state = INSN_ARRAY_STATE_INIT;
> > > > > > > +       mutex_unlock(&insn_array->state_mutex);
> > > > > >
> > > > > > only verifier calls this helpers, no?
> > > > > > Why all the mutexes here and below ?
> > > > > > All the mutexes is a big red flag to me.
> > > > > > Will stop any further comments here.
> > > > >
> > > > > Mutex came here from the future patch for static keys.
> > > > > I will see how to rewrite this with just an atomic state.
> > > >
> > > > I don't follow. Who will be calling them other than the verifier?
> > > > Some kfunc? I couldn't find that in the patch set.
> > > > If so, add synchronization logic in the patch set that
> > > > > actually needs it. This one does not. So don't add
> > > > any mutex or atomics here.
> > >
> > > The usage of this map is as follows:
> > >
> > >   1. A user creates it and fills in the values using the map_update_element (syscall)
> > >   2. Then the program is loaded
> > >
> > > The map <-> program is 1:1 relation, so I want to prevent users from
> > >
> > >   1. Updating the map after the program started loading
> > >   2. Allowing two programs to use the same map (while, say, loading simultaneously)
> >
> > Then the user space should freeze the map after updating and
> > before loading.
> > As far as 1-1 relation, we just landed exclusive map support
> > that ties a map to one specific program.
> > This mechanism can be used or 1-1 can be established by the kernel
> > internally.
>
> I actually did it via frozen first, and then removed it after Andrii's
> comments. Will bring it back and remove all the mutexes.

What was Andrii's concern with freeze ?
It seems like a good fit to me. User space updates and freezes,
because it shouldn't be updating it anymore. Normal jmp tables
in ELF are readonly too.

> > > At the same time I want map to be reusable for the same program for the case
> > > when the program failed to load and is reloaded with the log buffer.
> > > So there should be some synchronisation mechanism.
> > >
> > > (In future patchset, the bpf(STATIC_KEY_UPDATE) syscall needs to execute. It
> > > needs to be sure that the map was successfully loaded with the program. But
> > > you're right that this doesn't make sense to leak part of this patch into this
> > > patchset.)
> >
> > Even when that bit will be available it won't be modifying the map.
> > At best it will flip flag or bit whether the branch is nop or jmp.
> > I still don't see a need for mutexes.
>
> ok


* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-22 17:57           ` Alexei Starovoitov
  2025-09-22 19:23             ` Anton Protopopov
@ 2025-09-23  9:55             ` Anton Protopopov
  2025-09-23 15:14               ` Alexei Starovoitov
  1 sibling, 1 reply; 26+ messages in thread
From: Anton Protopopov @ 2025-09-23  9:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On 25/09/22 10:57AM, Alexei Starovoitov wrote:
> On Mon, Sep 22, 2025 at 10:31 AM Anton Protopopov
> <a.s.protopopov@gmail.com> wrote:
> >
> > On 25/09/22 09:16AM, Alexei Starovoitov wrote:
> > > On Mon, Sep 22, 2025 at 3:32 AM Anton Protopopov
> > > <a.s.protopopov@gmail.com> wrote:
> > > > > > +int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog)
> > > > > > +{
> > > > > > +       struct bpf_insn_array *insn_array = cast_insn_array(map);
> > > > > > +       int i;
> > > > > > +
> > > > > > +       if (!valid_offsets(insn_array, prog))
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       /*
> > > > > > +        * There can be only one program using the map
> > > > > > +        */
> > > > > > +       mutex_lock(&insn_array->state_mutex);
> > > > > > +       if (insn_array->state != INSN_ARRAY_STATE_FREE) {
> > > > > > +               mutex_unlock(&insn_array->state_mutex);
> > > > > > +               return -EBUSY;
> > > > > > +       }
> > > > > > +       insn_array->state = INSN_ARRAY_STATE_INIT;
> > > > > > +       mutex_unlock(&insn_array->state_mutex);
> > > > >
> > > > > only verifier calls this helpers, no?
> > > > > Why all the mutexes here and below ?
> > > > > All the mutexes is a big red flag to me.
> > > > > Will stop any further comments here.
> > > >
> > > > Mutex came here from the future patch for static keys.
> > > > I will see how to rewrite this with just an atomic state.
> > >
> > > I don't follow. Who will be calling them other than the verifier?
> > > Some kfunc? I couldn't find that in the patch set.
> > > If so, add synchronization logic in the patch set that
> > > > actually needs it. This one does not. So don't add
> > > any mutex or atomics here.
> >
> > The usage of this map is as follows:
> >
> >   1. A user creates it and fills in the values using the map_update_element (syscall)
> >   2. Then the program is loaded
> >
> > The map <-> program is 1:1 relation, so I want to prevent users from
> >
> >   1. Updating the map after the program started loading
> >   2. Allowing two programs to use the same map (while, say, loading simultaneously)
> 
> Then the user space should freeze the map after updating and
> before loading.
> As far as 1-1 relation, we just landed exclusive map support
> that ties a map to one specific program.

AFAICS, this api is not applicable here, as it says "this map can
only be used with the program with sha256 hash X", but nothing
prevents users from loading, say, 2 same programs with the same map.

Are you ok with just this for 1:1 correspondence:

    if (atomic64_fetch_add_unless(&insn_array->used, 1, 1))
        return -EBUSY;

> This mechanism can be used or 1-1 can be established by the kernel
> internally.
> 
> > At the same time I want map to be reusable for the same program for the case
> > when the program failed to load and is reloaded with the log buffer.
> > So there should be some synchronisation mechanism.
> >
> > (In future patchset, the bpf(STATIC_KEY_UPDATE) syscall needs to execute. It
> > needs to be sure that the map was successfully loaded with the program. But
> > you're right that this doesn't make sense to leak part of this patch into this
> > patchset.)
> 
> Even when that bit will be available it won't be modifying the map.
> At best it will flip flag or bit whether the branch is nop or jmp.
> I still don't see a need for mutexes.


* Re: [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array
  2025-09-23  9:55             ` Anton Protopopov
@ 2025-09-23 15:14               ` Alexei Starovoitov
  0 siblings, 0 replies; 26+ messages in thread
From: Alexei Starovoitov @ 2025-09-23 15:14 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Anton Protopopov,
	Daniel Borkmann, Eduard Zingerman, Quentin Monnet, Yonghong Song

On Tue, Sep 23, 2025 at 2:49 AM Anton Protopopov
<a.s.protopopov@gmail.com> wrote:
>
> Are you ok with just this for 1:1 correspondence:
>
>     if (atomic64_fetch_add_unless(&insn_array->used, 1, 1))
>         return -EBUSY;

Like that, but more canonical form:

if (atomic_xchg(&insn_array->used, 1))
  return -EBUSY;
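
The two forms behave the same for a 0/1 flag, which a user-space sketch with C11 atomics can show. claim_xchg() and fetch_add_unless() below are hypothetical stand-ins for the kernel's atomic_xchg() and atomic64_fetch_add_unless(), written only to compare their return values; they are not the kernel implementations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for atomic_xchg(&used, 1): store 1, return the previous value */
static int claim_xchg(atomic_int *used)
{
	assert(used != NULL);
	return atomic_exchange(used, 1);
}

/*
 * Stand-in for atomic64_fetch_add_unless(v, a, u):
 * add a to *v unless *v == u, and return the value seen before.
 */
static long fetch_add_unless(atomic_long *v, long a, long u)
{
	long old = atomic_load(v);

	while (old != u &&
	       !atomic_compare_exchange_weak(v, &old, old + a))
		; /* a failed CAS refreshed old, so just retry */
	return old;
}
```

In both cases the first caller sees 0 and wins, every later caller sees 1 and would return -EBUSY, and the flag stays at 1. The exchange needs no compare loop, which is one reason to prefer it for a plain "claimed once" flag.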


end of thread, other threads:[~2025-09-23 15:14 UTC | newest]

Thread overview: 26+ messages
2025-09-13 19:39 [PATCH v2 bpf-next 00/13] BPF indirect jumps Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 01/13] bpf: fix the return value of push_stack Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 02/13] bpf: save the start of functions in bpf_prog_aux Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 03/13] bpf, x86: add new map type: instructions array Anton Protopopov
2025-09-15  4:09   ` kernel test robot
2025-09-20  0:30   ` Alexei Starovoitov
2025-09-22 10:38     ` Anton Protopopov
2025-09-22 16:16       ` Alexei Starovoitov
2025-09-22 17:37         ` Anton Protopopov
2025-09-22 17:57           ` Alexei Starovoitov
2025-09-22 19:23             ` Anton Protopopov
2025-09-22 20:24               ` Alexei Starovoitov
2025-09-23  9:55             ` Anton Protopopov
2025-09-23 15:14               ` Alexei Starovoitov
2025-09-13 19:39 ` [PATCH v2 bpf-next 04/13] selftests/bpf: add selftests for new insn_array map Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 05/13] bpf: support instructions arrays with constants blinding Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 06/13] selftests/bpf: test instructions arrays with blinding Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 07/13] bpf, x86: allow indirect jumps to r8...r15 Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 08/13] bpf, x86: add support for indirect jumps Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 09/13] bpf: disasm: add support for BPF_JMP|BPF_JA|BPF_X Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 10/13] libbpf: fix formatting of bpf_object__append_subprog_code Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 11/13] libbpf: support llvm-generated indirect jumps Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 12/13] bpftool: Recognize insn_array map type Anton Protopopov
2025-09-16 20:33   ` Quentin Monnet
2025-09-18  8:11     ` Anton Protopopov
2025-09-13 19:39 ` [PATCH v2 bpf-next 13/13] selftests/bpf: add selftests for indirect jumps Anton Protopopov
