public inbox for linux-arm-kernel@lists.infradead.org
* [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets
@ 2026-01-14  9:39 Xu Kuohai
  2026-01-14  9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14  9:39 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
	Anton Protopopov

From: Xu Kuohai <xukuohai@huawei.com>

On x86 CPUs with CET/IBT and arm64 CPUs with BTI, missing landing pad
instructions at indirect jump targets trigger a kernel panic. So emit ENDBR
instructions at indirect jump targets on x86 and BTI instructions on arm64.
Indirect jump targets are identified based on the insn_aux_data created by
the verifier.

Patch 1 fixes an off-by-one error that causes the last ENDBR/BTI instruction to be
omitted.

Patch 2 introduces a helper to determine whether an instruction is an indirect jump target.

Patches 3 and 4 emit ENDBR and BTI instructions for indirect jump targets on x86 and
arm64, respectively.
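
The overall idea can be sketched in userspace as follows (illustrative
names, not the kernel's actual structures): the verifier marks each
indirect jump target in a per-instruction aux record, and the JIT checks
the mark before emitting the instruction body.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative per-instruction aux record: the verifier sets
 * indirect_target on every instruction reachable via gotox.
 */
struct insn_aux {
	unsigned int indirect_target:1;
};

/* JIT-side check: emit an ENDBR/BTI landing pad only when the
 * instruction at idx was marked by the verifier. A NULL aux array
 * means the program has no indirect jump targets at all.
 */
static bool needs_landing_pad(const struct insn_aux *aux, size_t idx)
{
	return aux && aux[idx].indirect_target;
}
```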

v4:
- Switch to the approach proposed by Eduard, using insn_aux_data to identify indirect
  jump targets, and emit ENDBR on x86

v3: https://lore.kernel.org/bpf/20251227081033.240336-1-xukuohai@huaweicloud.com/
- Get rid of unnecessary enum definition (Yonghong Song, Anton Protopopov)

v2: https://lore.kernel.org/bpf/20251223085447.139301-1-xukuohai@huaweicloud.com/
- Exclude instruction arrays not used for indirect jumps (Anton Protopopov)

v1: https://lore.kernel.org/bpf/20251127140318.3944249-1-xukuohai@huaweicloud.com/

Xu Kuohai (4):
  bpf: Fix an off-by-one error in check_indirect_jump
  bpf: Add helper to detect indirect jump targets
  bpf, x86: Emit ENDBR for indirect jump targets
  bpf, arm64: Emit BTI for indirect jump target

 arch/arm64/net/bpf_jit_comp.c |  3 ++
 arch/x86/net/bpf_jit_comp.c   | 15 ++++++----
 include/linux/bpf.h           |  2 ++
 include/linux/bpf_verifier.h  | 10 ++++---
 kernel/bpf/core.c             | 51 ++++++++++++++++++++++++++++++---
 kernel/bpf/verifier.c         | 53 +++++++++++++++++++++++++++++++++--
 6 files changed, 119 insertions(+), 15 deletions(-)

-- 
2.47.3



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump
  2026-01-14  9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
@ 2026-01-14  9:39 ` Xu Kuohai
  2026-01-14 10:29   ` Anton Protopopov
  2026-01-14  9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14  9:39 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
	Anton Protopopov

From: Xu Kuohai <xukuohai@huawei.com>

Fix an off-by-one error in check_indirect_jump() that skips the last
element returned by copy_insn_array_uniq().

Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 kernel/bpf/verifier.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index faa1ecc1fe9d..22605d9e0ffa 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20336,7 +20336,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
 		return -EINVAL;
 	}
 
-	for (i = 0; i < n - 1; i++) {
+	for (i = 0; i < n; i++) {
 		other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
 					  env->insn_idx, env->cur_state->speculative);
 		if (IS_ERR(other_branch))
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
  2026-01-14  9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
  2026-01-14  9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
@ 2026-01-14  9:39 ` Xu Kuohai
  2026-01-14 11:00   ` Anton Protopopov
  2026-01-14 20:46   ` Eduard Zingerman
  2026-01-14  9:39 ` [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for " Xu Kuohai
  2026-01-14  9:39 ` [PATCH bpf-next v4 4/4] bpf, arm64: Emit BTI for indirect jump target Xu Kuohai
  3 siblings, 2 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14  9:39 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
	Anton Protopopov

From: Xu Kuohai <xukuohai@huawei.com>

Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
instruction is an indirect jump target. This helper will be used by
follow-up patches to decide where to emit landing pad instructions for
indirect jumps.

Add a new flag to struct bpf_insn_aux_data to mark instructions that are
indirect jump targets. The BPF verifier sets this flag, and the helper
checks it to determine whether an instruction is an indirect jump target.

Since bpf_insn_aux_data is only available before the JIT stage, add a new
field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
array, making it accessible to the JIT.

For programs with multiple subprogs, each subprog uses its own private
copy of insn_aux_data, since subprogs may insert additional instructions
during the JIT and need to update the array. For programs without subprogs,
the verifier's insn_aux_data array is used directly to avoid unnecessary
copying.

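When constant blinding rewrites one instruction at 'off' into 'cnt'
instructions, the aux array must shift so the target's mark stays on the
first rewritten instruction. A minimal userspace model of that shift
(plain ints instead of bpf_insn_aux_data; 'len' is the length after
patching, which already counts the cnt - 1 new entries):

```c
#include <assert.h>
#include <string.h>

/* [0, off] stays in place, [off + 1, old end) shifts to
 * [off + cnt, len), and the cnt - 1 new slots are zeroed, so the
 * mark on the jump target at 'off' is not displaced.
 */
static void shift_aux(int *aux, int len, int off, int cnt)
{
	memmove(aux + off + cnt, aux + off + 1,
		(len - off - cnt) * sizeof(*aux));
	memset(aux + off + 1, 0, (cnt - 1) * sizeof(*aux));
}
```
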
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 include/linux/bpf.h          |  2 ++
 include/linux/bpf_verifier.h | 10 ++++---
 kernel/bpf/core.c            | 51 +++++++++++++++++++++++++++++++++---
 kernel/bpf/verifier.c        | 51 +++++++++++++++++++++++++++++++++++-
 4 files changed, 105 insertions(+), 9 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5936f8e2996f..e7d7e705327e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1533,6 +1533,7 @@ bool bpf_has_frame_pointer(unsigned long ip);
 int bpf_jit_charge_modmem(u32 size);
 void bpf_jit_uncharge_modmem(u32 size);
 bool bpf_prog_has_trampoline(const struct bpf_prog *prog);
+bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx);
 #else
 static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
 					   struct bpf_trampoline *tr,
@@ -1760,6 +1761,7 @@ struct bpf_prog_aux {
 	struct bpf_stream stream[2];
 	struct mutex st_ops_assoc_mutex;
 	struct bpf_map __rcu *st_ops_assoc;
+	struct bpf_insn_aux_data *insn_aux;
 };
 
 #define BPF_NR_CONTEXTS        4       /* normal, softirq, hardirq, NMI */
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 130bcbd66f60..758086b384df 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -574,16 +574,18 @@ struct bpf_insn_aux_data {
 
 	/* below fields are initialized once */
 	unsigned int orig_idx; /* original instruction index */
-	bool jmp_point;
-	bool prune_point;
+	u32 jmp_point:1;
+	u32 prune_point:1;
 	/* ensure we check state equivalence and save state checkpoint and
 	 * this instruction, regardless of any heuristics
 	 */
-	bool force_checkpoint;
+	u32 force_checkpoint:1;
 	/* true if instruction is a call to a helper function that
 	 * accepts callback function as a parameter.
 	 */
-	bool calls_callback;
+	u32 calls_callback:1;
+	/* true if the instruction is an indirect jump target */
+	u32 indirect_target:1;
 	/*
 	 * CFG strongly connected component this instruction belongs to,
 	 * zero if it is a singleton SCC.
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index e0b8a8a5aaa9..bb870936e74b 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1486,6 +1486,35 @@ static void adjust_insn_arrays(struct bpf_prog *prog, u32 off, u32 len)
 #endif
 }
 
+static int adjust_insn_aux(struct bpf_prog *prog, int off, int cnt)
+{
+	size_t size;
+	struct bpf_insn_aux_data *new_aux;
+
+	if (cnt == 1)
+		return 0;
+
+	/* prog->len already accounts for the cnt - 1 newly inserted instructions */
+	size = array_size(prog->len, sizeof(struct bpf_insn_aux_data));
+	new_aux = vrealloc(prog->aux->insn_aux, size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+	if (!new_aux)
+		return -ENOMEM;
+
+	/* follow the same behavior as adjust_insn_array(): leave [0, off] unchanged and shift
+	 * [off + 1, end) to [off + cnt, end). Otherwise, the JIT would emit landing pads at
+	 * wrong locations, as the actual indirect jump target remains at off.
+	 */
+	size = array_size(prog->len - off - cnt, sizeof(struct bpf_insn_aux_data));
+	memmove(new_aux + off + cnt, new_aux + off + 1, size);
+
+	size = array_size(cnt - 1, sizeof(struct bpf_insn_aux_data));
+	memset(new_aux + off + 1, 0, size);
+
+	prog->aux->insn_aux = new_aux;
+
+	return 0;
+}
+
 struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
 {
 	struct bpf_insn insn_buff[16], aux[2];
@@ -1541,6 +1570,11 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
 		clone = tmp;
 		insn_delta = rewritten - 1;
 
+		if (adjust_insn_aux(clone, i, rewritten)) {
+			bpf_jit_prog_release_other(prog, clone);
+			return ERR_PTR(-ENOMEM);
+		}
+
 		/* Instructions arrays must be updated using absolute xlated offsets */
 		adjust_insn_arrays(clone, prog->aux->subprog_start + i, rewritten);
 
@@ -1553,6 +1587,11 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
 	clone->blinded = 1;
 	return clone;
 }
+
+bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx)
+{
+	return prog->aux->insn_aux && prog->aux->insn_aux[idx].indirect_target;
+}
 #endif /* CONFIG_BPF_JIT */
 
 /* Base function for offset calculation. Needs to go into .text section,
@@ -2540,24 +2579,24 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 	if (!bpf_prog_is_offloaded(fp->aux)) {
 		*err = bpf_prog_alloc_jited_linfo(fp);
 		if (*err)
-			return fp;
+			goto free_insn_aux;
 
 		fp = bpf_int_jit_compile(fp);
 		bpf_prog_jit_attempt_done(fp);
 		if (!fp->jited && jit_needed) {
 			*err = -ENOTSUPP;
-			return fp;
+			goto free_insn_aux;
 		}
 	} else {
 		*err = bpf_prog_offload_compile(fp);
 		if (*err)
-			return fp;
+			goto free_insn_aux;
 	}
 
 finalize:
 	*err = bpf_prog_lock_ro(fp);
 	if (*err)
-		return fp;
+		goto free_insn_aux;
 
 	/* The tail call compatibility check can only be done at
 	 * this late stage as we need to determine, if we deal
@@ -2566,6 +2605,10 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 	 */
 	*err = bpf_check_tail_call(fp);
 
+free_insn_aux:
+	vfree(fp->aux->insn_aux);
+	fp->aux->insn_aux = NULL;
+
 	return fp;
 }
 EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 22605d9e0ffa..f2fe6baeceb9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3852,6 +3852,11 @@ static bool is_jmp_point(struct bpf_verifier_env *env, int insn_idx)
 	return env->insn_aux_data[insn_idx].jmp_point;
 }
 
+static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
+{
+	env->insn_aux_data[idx].indirect_target = true;
+}
+
 #define LR_FRAMENO_BITS	3
 #define LR_SPI_BITS	6
 #define LR_ENTRY_BITS	(LR_SPI_BITS + LR_FRAMENO_BITS + 1)
@@ -20337,6 +20342,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
 	}
 
 	for (i = 0; i < n; i++) {
+		mark_indirect_target(env, env->gotox_tmp_buf->items[i]);
 		other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
 					  env->insn_idx, env->cur_state->speculative);
 		if (IS_ERR(other_branch))
@@ -21243,6 +21249,37 @@ static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env)
 	}
 }
 
+static int clone_insn_aux_data(struct bpf_prog *prog, struct bpf_verifier_env *env, u32 off)
+{
+	u32 i;
+	size_t size;
+	bool has_indirect_target = false;
+	struct bpf_insn_aux_data *insn_aux;
+
+	for (i = 0; i < prog->len; i++) {
+		if (env->insn_aux_data[off + i].indirect_target) {
+			has_indirect_target = true;
+			break;
+		}
+	}
+
+	/* insn_aux is copied into bpf_prog so the JIT can check whether an instruction is an
+	 * indirect jump target. If no indirect jump targets exist, copying is unnecessary.
+	 */
+	if (!has_indirect_target)
+		return 0;
+
+	size = array_size(sizeof(struct bpf_insn_aux_data), prog->len);
+	insn_aux = vzalloc(size);
+	if (!insn_aux)
+		return -ENOMEM;
+
+	memcpy(insn_aux, env->insn_aux_data + off, size);
+	prog->aux->insn_aux = insn_aux;
+
+	return 0;
+}
+
 /* single env->prog->insni[off] instruction was replaced with the range
  * insni[off, off + cnt).  Adjust corresponding insn_aux_data by copying
  * [0, off) and [off, end) to new locations, so the patched range stays zero
@@ -22239,6 +22276,10 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		if (!i)
 			func[i]->aux->exception_boundary = env->seen_exception;
 
+		err = clone_insn_aux_data(func[i], env, subprog_start);
+		if (err < 0)
+			goto out_free;
+
 		/*
 		 * To properly pass the absolute subprog start to jit
 		 * all instruction adjustments should be accumulated
@@ -22306,6 +22347,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	for (i = 0; i < env->subprog_cnt; i++) {
 		func[i]->aux->used_maps = NULL;
 		func[i]->aux->used_map_cnt = 0;
+		vfree(func[i]->aux->insn_aux);
+		func[i]->aux->insn_aux = NULL;
 	}
 
 	/* finally lock prog and jit images for all functions and
@@ -22367,6 +22410,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	for (i = 0; i < env->subprog_cnt; i++) {
 		if (!func[i])
 			continue;
+		vfree(func[i]->aux->insn_aux);
 		func[i]->aux->poke_tab = NULL;
 		bpf_jit_free(func[i]);
 	}
@@ -25350,6 +25394,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	env->verification_time = ktime_get_ns() - start_time;
 	print_verification_stats(env);
 	env->prog->aux->verified_insns = env->insn_processed;
+	env->prog->aux->insn_aux = env->insn_aux_data;
 
 	/* preserve original error even if log finalization is successful */
 	err = bpf_vlog_finalize(&env->log, &log_true_size);
@@ -25428,7 +25473,11 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (!is_priv)
 		mutex_unlock(&bpf_verifier_lock);
 	clear_insn_aux_data(env, 0, env->prog->len);
-	vfree(env->insn_aux_data);
+	/* on success, insn_aux_data will be freed by bpf_prog_select_runtime */
+	if (ret) {
+		vfree(env->insn_aux_data);
+		env->prog->aux->insn_aux = NULL;
+	}
 err_free_env:
 	bpf_stack_liveness_free(env);
 	kvfree(env->cfg.insn_postorder);
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for indirect jump targets
  2026-01-14  9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
  2026-01-14  9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
  2026-01-14  9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
@ 2026-01-14  9:39 ` Xu Kuohai
  2026-01-14 16:46   ` kernel test robot
  2026-01-14  9:39 ` [PATCH bpf-next v4 4/4] bpf, arm64: Emit BTI for indirect jump target Xu Kuohai
  3 siblings, 1 reply; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14  9:39 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
	Anton Protopopov

From: Xu Kuohai <xukuohai@huawei.com>

On CPUs that support CET/IBT, the indirect jump selftest triggers
a kernel panic because the indirect jump targets lack ENDBR
instructions.

To fix it, emit an ENDBR instruction at each indirect jump target. Since
the ENDBR instruction shifts the positions of the original JITed
instructions, fix the instruction address calculations wherever the
addresses are used.
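
The fix-up can be illustrated with a small model (hypothetical,
simplified): addrs[i - 1] records where the current instruction was
expected to start within the image, and (prog - temp) is the number of
bytes already emitted for this instruction, e.g. 4 when an endbr64 was
emitted first, so the real instruction pointer is the sum of all three.

```c
#include <assert.h>

/* 'emitted' models (prog - temp): bytes already written for insn i
 * before its address is taken -- 4 after a 4-byte endbr64, else 0.
 */
static unsigned long insn_ip(unsigned long image, const int *addrs,
			     int i, int emitted)
{
	return image + addrs[i - 1] + emitted;
}
```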

For reference, below is a sample panic log.

 Missing ENDBR: bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
 ------------[ cut here ]------------
 kernel BUG at arch/x86/kernel/cet.c:133!
 Oops: invalid opcode: 0000 [#1] SMP NOPTI

 ...

  ? 0xffffffffc00fb258
  ? bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
  bpf_prog_test_run_syscall+0x110/0x2f0
  ? fdget+0xba/0xe0
  __sys_bpf+0xe4b/0x2590
  ? __kmalloc_node_track_caller_noprof+0x1c7/0x680
  ? bpf_prog_test_run_syscall+0x215/0x2f0
  __x64_sys_bpf+0x21/0x30
  do_syscall_64+0x85/0x620
  ? bpf_prog_test_run_syscall+0x1e2/0x2f0

Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 arch/x86/net/bpf_jit_comp.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index e3b1c4b1d550..ef79baac42d7 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1733,6 +1733,9 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 				dst_reg = X86_REG_R9;
 		}
 
+		if (bpf_insn_is_indirect_target(bpf_prog, i - 1))
+			EMIT_ENDBR();
+
 		switch (insn->code) {
 			/* ALU */
 		case BPF_ALU | BPF_ADD | BPF_X:
@@ -2439,7 +2442,7 @@ st:			if (is_imm8(insn->off))
 
 			/* call */
 		case BPF_JMP | BPF_CALL: {
-			u8 *ip = image + addrs[i - 1];
+			u8 *ip = image + addrs[i - 1] + (prog - temp);
 
 			func = (u8 *) __bpf_call_base + imm32;
 			if (src_reg == BPF_PSEUDO_CALL && tail_call_reachable) {
@@ -2464,7 +2467,8 @@ st:			if (is_imm8(insn->off))
 			if (imm32)
 				emit_bpf_tail_call_direct(bpf_prog,
 							  &bpf_prog->aux->poke_tab[imm32 - 1],
-							  &prog, image + addrs[i - 1],
+							  &prog,
+							  image + addrs[i - 1] + (prog - temp),
 							  callee_regs_used,
 							  stack_depth,
 							  ctx);
@@ -2473,7 +2477,7 @@ st:			if (is_imm8(insn->off))
 							    &prog,
 							    callee_regs_used,
 							    stack_depth,
-							    image + addrs[i - 1],
+							    image + addrs[i - 1] + (prog - temp),
 							    ctx);
 			break;
 
@@ -2638,7 +2642,8 @@ st:			if (is_imm8(insn->off))
 			break;
 
 		case BPF_JMP | BPF_JA | BPF_X:
-			emit_indirect_jump(&prog, insn->dst_reg, image + addrs[i - 1]);
+			emit_indirect_jump(&prog, insn->dst_reg,
+					   image + addrs[i - 1] + (prog - temp));
 			break;
 		case BPF_JMP | BPF_JA:
 		case BPF_JMP32 | BPF_JA:
@@ -2728,7 +2733,7 @@ st:			if (is_imm8(insn->off))
 			ctx->cleanup_addr = proglen;
 			if (bpf_prog_was_classic(bpf_prog) &&
 			    !ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)) {
-				u8 *ip = image + addrs[i - 1];
+				u8 *ip = image + addrs[i - 1] + (prog - temp);
 
 				if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog))
 					return -EINVAL;
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v4 4/4] bpf, arm64: Emit BTI for indirect jump target
  2026-01-14  9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
                   ` (2 preceding siblings ...)
  2026-01-14  9:39 ` [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for " Xu Kuohai
@ 2026-01-14  9:39 ` Xu Kuohai
  3 siblings, 0 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14  9:39 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
	Anton Protopopov

From: Xu Kuohai <xukuohai@huawei.com>

On CPUs that support BTI, the indirect jump selftest triggers a kernel
panic because there are no BTI instructions at the indirect jump targets.

Fix it by emitting a BTI instruction for each indirect jump target.
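
For context, BTI is encoded in the A64 HINT space; a quick sketch of the
encoding (this mirrors the architecture's HINT #imm scheme, not the
kernel's emit_bti() helper):

```c
#include <assert.h>

/* A64 HINT #imm encoding: 0xd503201f | (imm << 5). The BTI variants
 * live in this space: BTI = #32, BTI c = #34, BTI j = #36,
 * BTI jc = #38. BTI j accepts BR-style indirect jumps, which is why
 * the JIT emits it at gotox targets.
 */
static unsigned int a64_hint(unsigned int imm)
{
	return 0xd503201fu | (imm << 5);
}
```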

For reference, below is a sample panic log.

Internal error: Oops - BTI: 0000000036000003 [#1]  SMP
...
Call trace:
 bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x54/0xf8 (P)
 bpf_prog_run_pin_on_cpu+0x140/0x468
 bpf_prog_test_run_syscall+0x280/0x3b8
 bpf_prog_test_run+0x22c/0x2c0

Fixes: f4a66cf1cb14 ("bpf: arm64: Add support for indirect jumps")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 arch/arm64/net/bpf_jit_comp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 0c4d44bcfbf4..370ae0751b9e 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1231,6 +1231,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 	int ret;
 	bool sign_extend;
 
+	if (bpf_insn_is_indirect_target(ctx->prog, i))
+		emit_bti(A64_BTI_J, ctx);
+
 	switch (code) {
 	/* dst = src */
 	case BPF_ALU | BPF_MOV | BPF_X:
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump
  2026-01-14  9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
@ 2026-01-14 10:29   ` Anton Protopopov
  2026-01-15  7:31     ` Xu Kuohai
  0 siblings, 1 reply; 15+ messages in thread
From: Anton Protopopov @ 2026-01-14 10:29 UTC (permalink / raw)
  To: Xu Kuohai
  Cc: bpf, linux-kernel, linux-arm-kernel, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Yonghong Song, Puranjay Mohan

On 26/01/14 05:39PM, Xu Kuohai wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
> 
> Fix an off-by-one error in check_indirect_jump() that skips the last
> element returned by copy_insn_array_uniq().
> 
> Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
>  kernel/bpf/verifier.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index faa1ecc1fe9d..22605d9e0ffa 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -20336,7 +20336,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
>  		return -EINVAL;
>  	}
>  
> -	for (i = 0; i < n - 1; i++) {
> +	for (i = 0; i < n; i++) {
>  		other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
>  					  env->insn_idx, env->cur_state->speculative);
>  		if (IS_ERR(other_branch))
> -- 
> 2.47.3

Nack, the last state doesn't require a push_stack() call; it is
verified directly below this loop. Instead of this patch, just
add another call to mark_indirect_target().


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
  2026-01-14  9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
@ 2026-01-14 11:00   ` Anton Protopopov
  2026-01-15  7:37     ` Xu Kuohai
  2026-01-14 20:46   ` Eduard Zingerman
  1 sibling, 1 reply; 15+ messages in thread
From: Anton Protopopov @ 2026-01-14 11:00 UTC (permalink / raw)
  To: Xu Kuohai
  Cc: bpf, linux-kernel, linux-arm-kernel, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Yonghong Song, Puranjay Mohan

On 26/01/14 05:39PM, Xu Kuohai wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
> 
> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
> instruction is an indirect jump target. This helper will be used by
> follow-up patches to decide where to emit indirect landing pad instructions.
> 
> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
> indirect jump targets. The BPF verifier sets this flag, and the helper
> checks it to determine whether an instruction is an indirect jump target.
> 
> Since bpf_insn_aux_data is only available before JIT stage, add a new
> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
> array, making it accessible to the JIT.
> 
> For programs with multiple subprogs, each subprog uses its own private
> copy of insn_aux_data, since subprogs may insert additional instructions
> during JIT and need to update the array. For non-subprog, the verifier's
> insn_aux_data array is used directly to avoid unnecessary copying.
> 
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
>  include/linux/bpf.h          |  2 ++
>  include/linux/bpf_verifier.h | 10 ++++---
>  kernel/bpf/core.c            | 51 +++++++++++++++++++++++++++++++++---
>  kernel/bpf/verifier.c        | 51 +++++++++++++++++++++++++++++++++++-
>  4 files changed, 105 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 5936f8e2996f..e7d7e705327e 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1533,6 +1533,7 @@ bool bpf_has_frame_pointer(unsigned long ip);
>  int bpf_jit_charge_modmem(u32 size);
>  void bpf_jit_uncharge_modmem(u32 size);
>  bool bpf_prog_has_trampoline(const struct bpf_prog *prog);
> +bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx);
>  #else
>  static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
>  					   struct bpf_trampoline *tr,
> @@ -1760,6 +1761,7 @@ struct bpf_prog_aux {
>  	struct bpf_stream stream[2];
>  	struct mutex st_ops_assoc_mutex;
>  	struct bpf_map __rcu *st_ops_assoc;
> +	struct bpf_insn_aux_data *insn_aux;
>  };
>  
>  #define BPF_NR_CONTEXTS        4       /* normal, softirq, hardirq, NMI */
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 130bcbd66f60..758086b384df 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -574,16 +574,18 @@ struct bpf_insn_aux_data {
>  
>  	/* below fields are initialized once */
>  	unsigned int orig_idx; /* original instruction index */
> -	bool jmp_point;
> -	bool prune_point;
> +	u32 jmp_point:1;
> +	u32 prune_point:1;
>  	/* ensure we check state equivalence and save state checkpoint and
>  	 * this instruction, regardless of any heuristics
>  	 */
> -	bool force_checkpoint;
> +	u32 force_checkpoint:1;
>  	/* true if instruction is a call to a helper function that
>  	 * accepts callback function as a parameter.
>  	 */
> -	bool calls_callback;
> +	u32 calls_callback:1;
> +	/* true if the instruction is an indirect jump target */
> +	u32 indirect_target:1;
>  	/*
>  	 * CFG strongly connected component this instruction belongs to,
>  	 * zero if it is a singleton SCC.
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index e0b8a8a5aaa9..bb870936e74b 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -1486,6 +1486,35 @@ static void adjust_insn_arrays(struct bpf_prog *prog, u32 off, u32 len)
>  #endif
>  }
>  
> +static int adjust_insn_aux(struct bpf_prog *prog, int off, int cnt)
> +{
> +	size_t size;
> +	struct bpf_insn_aux_data *new_aux;
> +
> +	if (cnt == 1)
> +		return 0;
> +
> +	/* prog->len already accounts for the cnt - 1 newly inserted instructions */
> +	size = array_size(prog->len, sizeof(struct bpf_insn_aux_data));
> +	new_aux = vrealloc(prog->aux->insn_aux, size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +	if (!new_aux)
> +		return -ENOMEM;
> +
> +	/* follow the same behavior as adjust_insn_array(): leave [0, off] unchanged and shift
> +	 * [off + 1, end) to [off + cnt, end). Otherwise, the JIT would emit landing pads at
> +	 * wrong locations, as the actual indirect jump target remains at off.
> +	 */
> +	size = array_size(prog->len - off - cnt, sizeof(struct bpf_insn_aux_data));
> +	memmove(new_aux + off + cnt, new_aux + off + 1, size);
> +
> +	size = array_size(cnt - 1, sizeof(struct bpf_insn_aux_data));
> +	memset(new_aux + off + 1, 0, size);
> +
> +	prog->aux->insn_aux = new_aux;
> +
> +	return 0;
> +}
> +
>  struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
>  {
>  	struct bpf_insn insn_buff[16], aux[2];
> @@ -1541,6 +1570,11 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
>  		clone = tmp;
>  		insn_delta = rewritten - 1;
>  
> +		if (adjust_insn_aux(clone, i, rewritten)) {
> +			bpf_jit_prog_release_other(prog, clone);
> +			return ERR_PTR(-ENOMEM);
> +		}
> +
>  		/* Instructions arrays must be updated using absolute xlated offsets */
>  		adjust_insn_arrays(clone, prog->aux->subprog_start + i, rewritten);
>  
> @@ -1553,6 +1587,11 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
>  	clone->blinded = 1;
>  	return clone;
>  }
> +
> +bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx)
> +{
> +	return prog->aux->insn_aux && prog->aux->insn_aux[idx].indirect_target;

Is there a case when insn_aux is NULL?

> +}
>  #endif /* CONFIG_BPF_JIT */
>  
>  /* Base function for offset calculation. Needs to go into .text section,
> @@ -2540,24 +2579,24 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>  	if (!bpf_prog_is_offloaded(fp->aux)) {
>  		*err = bpf_prog_alloc_jited_linfo(fp);
>  		if (*err)
> -			return fp;
> +			goto free_insn_aux;
>  
>  		fp = bpf_int_jit_compile(fp);
>  		bpf_prog_jit_attempt_done(fp);
>  		if (!fp->jited && jit_needed) {
>  			*err = -ENOTSUPP;
> -			return fp;
> +			goto free_insn_aux;
>  		}
>  	} else {
>  		*err = bpf_prog_offload_compile(fp);
>  		if (*err)
> -			return fp;
> +			goto free_insn_aux;
>  	}
>  
>  finalize:
>  	*err = bpf_prog_lock_ro(fp);
>  	if (*err)
> -		return fp;
> +		goto free_insn_aux;
>  
>  	/* The tail call compatibility check can only be done at
>  	 * this late stage as we need to determine, if we deal
> @@ -2566,6 +2605,10 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>  	 */
>  	*err = bpf_check_tail_call(fp);
>  
> +free_insn_aux:
> +	vfree(fp->aux->insn_aux);
> +	fp->aux->insn_aux = NULL;
> +
>  	return fp;
>  }
>  EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 22605d9e0ffa..f2fe6baeceb9 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3852,6 +3852,11 @@ static bool is_jmp_point(struct bpf_verifier_env *env, int insn_idx)
>  	return env->insn_aux_data[insn_idx].jmp_point;
>  }
>  
> +static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
> +{
> +	env->insn_aux_data[idx].indirect_target = true;
> +}
> +
>  #define LR_FRAMENO_BITS	3
>  #define LR_SPI_BITS	6
>  #define LR_ENTRY_BITS	(LR_SPI_BITS + LR_FRAMENO_BITS + 1)
> @@ -20337,6 +20342,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
>  	}
>  
>  	for (i = 0; i < n; i++) {

^ n -> n-1

> +		mark_indirect_target(env, env->gotox_tmp_buf->items[i]);
>  		other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
>  					  env->insn_idx, env->cur_state->speculative);
>  		if (IS_ERR(other_branch))
> @@ -21243,6 +21249,37 @@ static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env)
>  	}

mark_indirect_target(n-1)

>  }
>  
> +static int clone_insn_aux_data(struct bpf_prog *prog, struct bpf_verifier_env *env, u32 off)
> +{
> +	u32 i;
> +	size_t size;
> +	bool has_indirect_target = false;
> +	struct bpf_insn_aux_data *insn_aux;
> +
> +	for (i = 0; i < prog->len; i++) {
> +		if (env->insn_aux_data[off + i].indirect_target) {
> +			has_indirect_target = true;
> +			break;
> +		}
> +	}
> +
> +	/* insn_aux is copied into bpf_prog so the JIT can check whether an instruction is an
> +	 * indirect jump target. If no indirect jump targets exist, copying is unnecessary.
> +	 */
> +	if (!has_indirect_target)
> +		return 0;
> +
> +	size = array_size(sizeof(struct bpf_insn_aux_data), prog->len);
> +	insn_aux = vzalloc(size);
> +	if (!insn_aux)
> +		return -ENOMEM;
> +
> +	memcpy(insn_aux, env->insn_aux_data + off, size);
> +	prog->aux->insn_aux = insn_aux;
> +
> +	return 0;
> +}
> +
>  /* single env->prog->insni[off] instruction was replaced with the range
>   * insni[off, off + cnt).  Adjust corresponding insn_aux_data by copying
>   * [0, off) and [off, end) to new locations, so the patched range stays zero
> @@ -22239,6 +22276,10 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>  		if (!i)
>  			func[i]->aux->exception_boundary = env->seen_exception;
>  
> +		err = clone_insn_aux_data(func[i], env, subprog_start);
> +		if (err < 0)
> +			goto out_free;
> +
>  		/*
>  		 * To properly pass the absolute subprog start to jit
>  		 * all instruction adjustments should be accumulated
> @@ -22306,6 +22347,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>  	for (i = 0; i < env->subprog_cnt; i++) {
>  		func[i]->aux->used_maps = NULL;
>  		func[i]->aux->used_map_cnt = 0;
> +		vfree(func[i]->aux->insn_aux);
> +		func[i]->aux->insn_aux = NULL;
>  	}
>  
>  	/* finally lock prog and jit images for all functions and
> @@ -22367,6 +22410,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>  	for (i = 0; i < env->subprog_cnt; i++) {
>  		if (!func[i])
>  			continue;
> +		vfree(func[i]->aux->insn_aux);
>  		func[i]->aux->poke_tab = NULL;
>  		bpf_jit_free(func[i]);
>  	}
> @@ -25350,6 +25394,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>  	env->verification_time = ktime_get_ns() - start_time;
>  	print_verification_stats(env);
>  	env->prog->aux->verified_insns = env->insn_processed;
> +	env->prog->aux->insn_aux = env->insn_aux_data;
>  
>  	/* preserve original error even if log finalization is successful */
>  	err = bpf_vlog_finalize(&env->log, &log_true_size);
> @@ -25428,7 +25473,11 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>  	if (!is_priv)
>  		mutex_unlock(&bpf_verifier_lock);
>  	clear_insn_aux_data(env, 0, env->prog->len);
> -	vfree(env->insn_aux_data);
> +	/* on success, insn_aux_data will be freed by bpf_prog_select_runtime */
> +	if (ret) {
> +		vfree(env->insn_aux_data);
> +		env->prog->aux->insn_aux = NULL;
> +	}
>  err_free_env:
>  	bpf_stack_liveness_free(env);
>  	kvfree(env->cfg.insn_postorder);
> -- 
> 2.47.3
> 

LGTM, just in case, could you please tell how you have tested
this patchset exactly?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for indirect jump targets
  2026-01-14  9:39 ` [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for " Xu Kuohai
@ 2026-01-14 16:46   ` kernel test robot
  0 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2026-01-14 16:46 UTC (permalink / raw)
  To: Xu Kuohai, bpf, linux-kernel, linux-arm-kernel
  Cc: oe-kbuild-all, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Yonghong Song, Puranjay Mohan, Anton Protopopov

Hi Xu,

kernel test robot noticed the following build warnings:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Xu-Kuohai/bpf-Fix-an-off-by-one-error-in-check_indirect_jump/20260114-172632
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20260114093914.2403982-4-xukuohai%40huaweicloud.com
patch subject: [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for indirect jump targets
config: x86_64-buildonly-randconfig-002-20260114 (https://download.01.org/0day-ci/archive/20260115/202601150016.x24DRk9R-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260115/202601150016.x24DRk9R-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202601150016.x24DRk9R-lkp@intel.com/

All warnings (new ones prefixed by >>):

   arch/x86/net/bpf_jit_comp.c: In function 'do_jit':
>> arch/x86/net/bpf_jit_comp.c:1737:37: warning: suggest braces around empty body in an 'if' statement [-Wempty-body]
    1737 |                         EMIT_ENDBR();
         |                                     ^


vim +/if +1737 arch/x86/net/bpf_jit_comp.c

  1650	
  1651	static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image,
  1652			  int oldproglen, struct jit_context *ctx, bool jmp_padding)
  1653	{
  1654		bool tail_call_reachable = bpf_prog->aux->tail_call_reachable;
  1655		struct bpf_insn *insn = bpf_prog->insnsi;
  1656		bool callee_regs_used[4] = {};
  1657		int insn_cnt = bpf_prog->len;
  1658		bool seen_exit = false;
  1659		u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
  1660		void __percpu *priv_frame_ptr = NULL;
  1661		u64 arena_vm_start, user_vm_start;
  1662		void __percpu *priv_stack_ptr;
  1663		int i, excnt = 0;
  1664		int ilen, proglen = 0;
  1665		u8 *prog = temp;
  1666		u32 stack_depth;
  1667		int err;
  1668	
  1669		stack_depth = bpf_prog->aux->stack_depth;
  1670		priv_stack_ptr = bpf_prog->aux->priv_stack_ptr;
  1671		if (priv_stack_ptr) {
  1672			priv_frame_ptr = priv_stack_ptr + PRIV_STACK_GUARD_SZ + round_up(stack_depth, 8);
  1673			stack_depth = 0;
  1674		}
  1675	
  1676		arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
  1677		user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);
  1678	
  1679		detect_reg_usage(insn, insn_cnt, callee_regs_used);
  1680	
  1681		emit_prologue(&prog, image, stack_depth,
  1682			      bpf_prog_was_classic(bpf_prog), tail_call_reachable,
  1683			      bpf_is_subprog(bpf_prog), bpf_prog->aux->exception_cb);
  1684	
  1685		bpf_prog->aux->ksym.fp_start = prog - temp;
  1686	
  1687		/* Exception callback will clobber callee regs for its own use, and
  1688		 * restore the original callee regs from main prog's stack frame.
  1689		 */
  1690		if (bpf_prog->aux->exception_boundary) {
  1691			/* We also need to save r12, which is not mapped to any BPF
  1692			 * register, as we throw after entry into the kernel, which may
  1693			 * overwrite r12.
  1694			 */
  1695			push_r12(&prog);
  1696			push_callee_regs(&prog, all_callee_regs_used);
  1697		} else {
  1698			if (arena_vm_start)
  1699				push_r12(&prog);
  1700			push_callee_regs(&prog, callee_regs_used);
  1701		}
  1702		if (arena_vm_start)
  1703			emit_mov_imm64(&prog, X86_REG_R12,
  1704				       arena_vm_start >> 32, (u32) arena_vm_start);
  1705	
  1706		if (priv_frame_ptr)
  1707			emit_priv_frame_ptr(&prog, priv_frame_ptr);
  1708	
  1709		ilen = prog - temp;
  1710		if (rw_image)
  1711			memcpy(rw_image + proglen, temp, ilen);
  1712		proglen += ilen;
  1713		addrs[0] = proglen;
  1714		prog = temp;
  1715	
  1716		for (i = 1; i <= insn_cnt; i++, insn++) {
  1717			const s32 imm32 = insn->imm;
  1718			u32 dst_reg = insn->dst_reg;
  1719			u32 src_reg = insn->src_reg;
  1720			u8 b2 = 0, b3 = 0;
  1721			u8 *start_of_ldx;
  1722			s64 jmp_offset;
  1723			s16 insn_off;
  1724			u8 jmp_cond;
  1725			u8 *func;
  1726			int nops;
  1727	
  1728			if (priv_frame_ptr) {
  1729				if (src_reg == BPF_REG_FP)
  1730					src_reg = X86_REG_R9;
  1731	
  1732				if (dst_reg == BPF_REG_FP)
  1733					dst_reg = X86_REG_R9;
  1734			}
  1735	
  1736			if (bpf_insn_is_indirect_target(bpf_prog, i - 1))
> 1737				EMIT_ENDBR();
  1738	
  1739			switch (insn->code) {
  1740				/* ALU */
  1741			case BPF_ALU | BPF_ADD | BPF_X:
  1742			case BPF_ALU | BPF_SUB | BPF_X:
  1743			case BPF_ALU | BPF_AND | BPF_X:
  1744			case BPF_ALU | BPF_OR | BPF_X:
  1745			case BPF_ALU | BPF_XOR | BPF_X:
  1746			case BPF_ALU64 | BPF_ADD | BPF_X:
  1747			case BPF_ALU64 | BPF_SUB | BPF_X:
  1748			case BPF_ALU64 | BPF_AND | BPF_X:
  1749			case BPF_ALU64 | BPF_OR | BPF_X:
  1750			case BPF_ALU64 | BPF_XOR | BPF_X:
  1751				maybe_emit_mod(&prog, dst_reg, src_reg,
  1752					       BPF_CLASS(insn->code) == BPF_ALU64);
  1753				b2 = simple_alu_opcodes[BPF_OP(insn->code)];
  1754				EMIT2(b2, add_2reg(0xC0, dst_reg, src_reg));
  1755				break;
  1756	
  1757			case BPF_ALU64 | BPF_MOV | BPF_X:
  1758				if (insn_is_cast_user(insn)) {
  1759					if (dst_reg != src_reg)
  1760						/* 32-bit mov */
  1761						emit_mov_reg(&prog, false, dst_reg, src_reg);
  1762					/* shl dst_reg, 32 */
  1763					maybe_emit_1mod(&prog, dst_reg, true);
  1764					EMIT3(0xC1, add_1reg(0xE0, dst_reg), 32);
  1765	
  1766					/* or dst_reg, user_vm_start */
  1767					maybe_emit_1mod(&prog, dst_reg, true);
  1768					if (is_axreg(dst_reg))
  1769						EMIT1_off32(0x0D,  user_vm_start >> 32);
  1770					else
  1771						EMIT2_off32(0x81, add_1reg(0xC8, dst_reg),  user_vm_start >> 32);
  1772	
  1773					/* rol dst_reg, 32 */
  1774					maybe_emit_1mod(&prog, dst_reg, true);
  1775					EMIT3(0xC1, add_1reg(0xC0, dst_reg), 32);
  1776	
  1777					/* xor r11, r11 */
  1778					EMIT3(0x4D, 0x31, 0xDB);
  1779	
  1780					/* test dst_reg32, dst_reg32; check if lower 32-bit are zero */
  1781					maybe_emit_mod(&prog, dst_reg, dst_reg, false);
  1782					EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg));
  1783	
  1784					/* cmove r11, dst_reg; if so, set dst_reg to zero */
  1785					/* WARNING: Intel swapped src/dst register encoding in CMOVcc !!! */
  1786					maybe_emit_mod(&prog, AUX_REG, dst_reg, true);
  1787					EMIT3(0x0F, 0x44, add_2reg(0xC0, AUX_REG, dst_reg));
  1788					break;
  1789				} else if (insn_is_mov_percpu_addr(insn)) {
  1790					/* mov <dst>, <src> (if necessary) */
  1791					EMIT_mov(dst_reg, src_reg);
  1792	#ifdef CONFIG_SMP
  1793					/* add <dst>, gs:[<off>] */
  1794					EMIT2(0x65, add_1mod(0x48, dst_reg));
  1795					EMIT3(0x03, add_2reg(0x04, 0, dst_reg), 0x25);
  1796					EMIT((u32)(unsigned long)&this_cpu_off, 4);
  1797	#endif
  1798					break;
  1799				}
  1800				fallthrough;
  1801			case BPF_ALU | BPF_MOV | BPF_X:
  1802				if (insn->off == 0)
  1803					emit_mov_reg(&prog,
  1804						     BPF_CLASS(insn->code) == BPF_ALU64,
  1805						     dst_reg, src_reg);
  1806				else
  1807					emit_movsx_reg(&prog, insn->off,
  1808						       BPF_CLASS(insn->code) == BPF_ALU64,
  1809						       dst_reg, src_reg);
  1810				break;
  1811	
  1812				/* neg dst */
  1813			case BPF_ALU | BPF_NEG:
  1814			case BPF_ALU64 | BPF_NEG:
  1815				maybe_emit_1mod(&prog, dst_reg,
  1816						BPF_CLASS(insn->code) == BPF_ALU64);
  1817				EMIT2(0xF7, add_1reg(0xD8, dst_reg));
  1818				break;
  1819	
  1820			case BPF_ALU | BPF_ADD | BPF_K:
  1821			case BPF_ALU | BPF_SUB | BPF_K:
  1822			case BPF_ALU | BPF_AND | BPF_K:
  1823			case BPF_ALU | BPF_OR | BPF_K:
  1824			case BPF_ALU | BPF_XOR | BPF_K:
  1825			case BPF_ALU64 | BPF_ADD | BPF_K:
  1826			case BPF_ALU64 | BPF_SUB | BPF_K:
  1827			case BPF_ALU64 | BPF_AND | BPF_K:
  1828			case BPF_ALU64 | BPF_OR | BPF_K:
  1829			case BPF_ALU64 | BPF_XOR | BPF_K:
  1830				maybe_emit_1mod(&prog, dst_reg,
  1831						BPF_CLASS(insn->code) == BPF_ALU64);
  1832	
  1833				/*
  1834				 * b3 holds 'normal' opcode, b2 short form only valid
  1835				 * in case dst is eax/rax.
  1836				 */
  1837				switch (BPF_OP(insn->code)) {
  1838				case BPF_ADD:
  1839					b3 = 0xC0;
  1840					b2 = 0x05;
  1841					break;
  1842				case BPF_SUB:
  1843					b3 = 0xE8;
  1844					b2 = 0x2D;
  1845					break;
  1846				case BPF_AND:
  1847					b3 = 0xE0;
  1848					b2 = 0x25;
  1849					break;
  1850				case BPF_OR:
  1851					b3 = 0xC8;
  1852					b2 = 0x0D;
  1853					break;
  1854				case BPF_XOR:
  1855					b3 = 0xF0;
  1856					b2 = 0x35;
  1857					break;
  1858				}
  1859	
  1860				if (is_imm8(imm32))
  1861					EMIT3(0x83, add_1reg(b3, dst_reg), imm32);
  1862				else if (is_axreg(dst_reg))
  1863					EMIT1_off32(b2, imm32);
  1864				else
  1865					EMIT2_off32(0x81, add_1reg(b3, dst_reg), imm32);
  1866				break;
  1867	
  1868			case BPF_ALU64 | BPF_MOV | BPF_K:
  1869			case BPF_ALU | BPF_MOV | BPF_K:
  1870				emit_mov_imm32(&prog, BPF_CLASS(insn->code) == BPF_ALU64,
  1871					       dst_reg, imm32);
  1872				break;
  1873	
  1874			case BPF_LD | BPF_IMM | BPF_DW:
  1875				emit_mov_imm64(&prog, dst_reg, insn[1].imm, insn[0].imm);
  1876				insn++;
  1877				i++;
  1878				break;
  1879	
  1880				/* dst %= src, dst /= src, dst %= imm32, dst /= imm32 */
  1881			case BPF_ALU | BPF_MOD | BPF_X:
  1882			case BPF_ALU | BPF_DIV | BPF_X:
  1883			case BPF_ALU | BPF_MOD | BPF_K:
  1884			case BPF_ALU | BPF_DIV | BPF_K:
  1885			case BPF_ALU64 | BPF_MOD | BPF_X:
  1886			case BPF_ALU64 | BPF_DIV | BPF_X:
  1887			case BPF_ALU64 | BPF_MOD | BPF_K:
  1888			case BPF_ALU64 | BPF_DIV | BPF_K: {
  1889				bool is64 = BPF_CLASS(insn->code) == BPF_ALU64;
  1890	
  1891				if (dst_reg != BPF_REG_0)
  1892					EMIT1(0x50); /* push rax */
  1893				if (dst_reg != BPF_REG_3)
  1894					EMIT1(0x52); /* push rdx */
  1895	
  1896				if (BPF_SRC(insn->code) == BPF_X) {
  1897					if (src_reg == BPF_REG_0 ||
  1898					    src_reg == BPF_REG_3) {
  1899						/* mov r11, src_reg */
  1900						EMIT_mov(AUX_REG, src_reg);
  1901						src_reg = AUX_REG;
  1902					}
  1903				} else {
  1904					/* mov r11, imm32 */
  1905					EMIT3_off32(0x49, 0xC7, 0xC3, imm32);
  1906					src_reg = AUX_REG;
  1907				}
  1908	
  1909				if (dst_reg != BPF_REG_0)
  1910					/* mov rax, dst_reg */
  1911					emit_mov_reg(&prog, is64, BPF_REG_0, dst_reg);
  1912	
  1913				if (insn->off == 0) {
  1914					/*
  1915					 * xor edx, edx
  1916					 * equivalent to 'xor rdx, rdx', but one byte less
  1917					 */
  1918					EMIT2(0x31, 0xd2);
  1919	
  1920					/* div src_reg */
  1921					maybe_emit_1mod(&prog, src_reg, is64);
  1922					EMIT2(0xF7, add_1reg(0xF0, src_reg));
  1923				} else {
  1924					if (BPF_CLASS(insn->code) == BPF_ALU)
  1925						EMIT1(0x99); /* cdq */
  1926					else
  1927						EMIT2(0x48, 0x99); /* cqo */
  1928	
  1929					/* idiv src_reg */
  1930					maybe_emit_1mod(&prog, src_reg, is64);
  1931					EMIT2(0xF7, add_1reg(0xF8, src_reg));
  1932				}
  1933	
  1934				if (BPF_OP(insn->code) == BPF_MOD &&
  1935				    dst_reg != BPF_REG_3)
  1936					/* mov dst_reg, rdx */
  1937					emit_mov_reg(&prog, is64, dst_reg, BPF_REG_3);
  1938				else if (BPF_OP(insn->code) == BPF_DIV &&
  1939					 dst_reg != BPF_REG_0)
  1940					/* mov dst_reg, rax */
  1941					emit_mov_reg(&prog, is64, dst_reg, BPF_REG_0);
  1942	
  1943				if (dst_reg != BPF_REG_3)
  1944					EMIT1(0x5A); /* pop rdx */
  1945				if (dst_reg != BPF_REG_0)
  1946					EMIT1(0x58); /* pop rax */
  1947				break;
  1948			}
  1949	
  1950			case BPF_ALU | BPF_MUL | BPF_K:
  1951			case BPF_ALU64 | BPF_MUL | BPF_K:
  1952				maybe_emit_mod(&prog, dst_reg, dst_reg,
  1953					       BPF_CLASS(insn->code) == BPF_ALU64);
  1954	
  1955				if (is_imm8(imm32))
  1956					/* imul dst_reg, dst_reg, imm8 */
  1957					EMIT3(0x6B, add_2reg(0xC0, dst_reg, dst_reg),
  1958					      imm32);
  1959				else
  1960					/* imul dst_reg, dst_reg, imm32 */
  1961					EMIT2_off32(0x69,
  1962						    add_2reg(0xC0, dst_reg, dst_reg),
  1963						    imm32);
  1964				break;
  1965	
  1966			case BPF_ALU | BPF_MUL | BPF_X:
  1967			case BPF_ALU64 | BPF_MUL | BPF_X:
  1968				maybe_emit_mod(&prog, src_reg, dst_reg,
  1969					       BPF_CLASS(insn->code) == BPF_ALU64);
  1970	
  1971				/* imul dst_reg, src_reg */
  1972				EMIT3(0x0F, 0xAF, add_2reg(0xC0, src_reg, dst_reg));
  1973				break;
  1974	
  1975				/* Shifts */
  1976			case BPF_ALU | BPF_LSH | BPF_K:
  1977			case BPF_ALU | BPF_RSH | BPF_K:
  1978			case BPF_ALU | BPF_ARSH | BPF_K:
  1979			case BPF_ALU64 | BPF_LSH | BPF_K:
  1980			case BPF_ALU64 | BPF_RSH | BPF_K:
  1981			case BPF_ALU64 | BPF_ARSH | BPF_K:
  1982				maybe_emit_1mod(&prog, dst_reg,
  1983						BPF_CLASS(insn->code) == BPF_ALU64);
  1984	
  1985				b3 = simple_alu_opcodes[BPF_OP(insn->code)];
  1986				if (imm32 == 1)
  1987					EMIT2(0xD1, add_1reg(b3, dst_reg));
  1988				else
  1989					EMIT3(0xC1, add_1reg(b3, dst_reg), imm32);
  1990				break;
  1991	
  1992			case BPF_ALU | BPF_LSH | BPF_X:
  1993			case BPF_ALU | BPF_RSH | BPF_X:
  1994			case BPF_ALU | BPF_ARSH | BPF_X:
  1995			case BPF_ALU64 | BPF_LSH | BPF_X:
  1996			case BPF_ALU64 | BPF_RSH | BPF_X:
  1997			case BPF_ALU64 | BPF_ARSH | BPF_X:
  1998				/* BMI2 shifts aren't better when shift count is already in rcx */
  1999				if (boot_cpu_has(X86_FEATURE_BMI2) && src_reg != BPF_REG_4) {
  2000					/* shrx/sarx/shlx dst_reg, dst_reg, src_reg */
  2001					bool w = (BPF_CLASS(insn->code) == BPF_ALU64);
  2002					u8 op;
  2003	
  2004					switch (BPF_OP(insn->code)) {
  2005					case BPF_LSH:
  2006						op = 1; /* prefix 0x66 */
  2007						break;
  2008					case BPF_RSH:
  2009						op = 3; /* prefix 0xf2 */
  2010						break;
  2011					case BPF_ARSH:
  2012						op = 2; /* prefix 0xf3 */
  2013						break;
  2014					}
  2015	
  2016					emit_shiftx(&prog, dst_reg, src_reg, w, op);
  2017	
  2018					break;
  2019				}
  2020	
  2021				if (src_reg != BPF_REG_4) { /* common case */
  2022					/* Check for bad case when dst_reg == rcx */
  2023					if (dst_reg == BPF_REG_4) {
  2024						/* mov r11, dst_reg */
  2025						EMIT_mov(AUX_REG, dst_reg);
  2026						dst_reg = AUX_REG;
  2027					} else {
  2028						EMIT1(0x51); /* push rcx */
  2029					}
  2030					/* mov rcx, src_reg */
  2031					EMIT_mov(BPF_REG_4, src_reg);
  2032				}
  2033	
  2034				/* shl %rax, %cl | shr %rax, %cl | sar %rax, %cl */
  2035				maybe_emit_1mod(&prog, dst_reg,
  2036						BPF_CLASS(insn->code) == BPF_ALU64);
  2037	
  2038				b3 = simple_alu_opcodes[BPF_OP(insn->code)];
  2039				EMIT2(0xD3, add_1reg(b3, dst_reg));
  2040	
  2041				if (src_reg != BPF_REG_4) {
  2042					if (insn->dst_reg == BPF_REG_4)
  2043						/* mov dst_reg, r11 */
  2044						EMIT_mov(insn->dst_reg, AUX_REG);
  2045					else
  2046						EMIT1(0x59); /* pop rcx */
  2047				}
  2048	
  2049				break;
  2050	
  2051			case BPF_ALU | BPF_END | BPF_FROM_BE:
  2052			case BPF_ALU64 | BPF_END | BPF_FROM_LE:
  2053				switch (imm32) {
  2054				case 16:
  2055					/* Emit 'ror %ax, 8' to swap lower 2 bytes */
  2056					EMIT1(0x66);
  2057					if (is_ereg(dst_reg))
  2058						EMIT1(0x41);
  2059					EMIT3(0xC1, add_1reg(0xC8, dst_reg), 8);
  2060	
  2061					/* Emit 'movzwl eax, ax' */
  2062					if (is_ereg(dst_reg))
  2063						EMIT3(0x45, 0x0F, 0xB7);
  2064					else
  2065						EMIT2(0x0F, 0xB7);
  2066					EMIT1(add_2reg(0xC0, dst_reg, dst_reg));
  2067					break;
  2068				case 32:
  2069					/* Emit 'bswap eax' to swap lower 4 bytes */
  2070					if (is_ereg(dst_reg))
  2071						EMIT2(0x41, 0x0F);
  2072					else
  2073						EMIT1(0x0F);
  2074					EMIT1(add_1reg(0xC8, dst_reg));
  2075					break;
  2076				case 64:
  2077					/* Emit 'bswap rax' to swap 8 bytes */
  2078					EMIT3(add_1mod(0x48, dst_reg), 0x0F,
  2079					      add_1reg(0xC8, dst_reg));
  2080					break;
  2081				}
  2082				break;
  2083	
  2084			case BPF_ALU | BPF_END | BPF_FROM_LE:
  2085				switch (imm32) {
  2086				case 16:
  2087					/*
  2088					 * Emit 'movzwl eax, ax' to zero extend 16-bit
  2089					 * into 64 bit
  2090					 */
  2091					if (is_ereg(dst_reg))
  2092						EMIT3(0x45, 0x0F, 0xB7);
  2093					else
  2094						EMIT2(0x0F, 0xB7);
  2095					EMIT1(add_2reg(0xC0, dst_reg, dst_reg));
  2096					break;
  2097				case 32:
  2098					/* Emit 'mov eax, eax' to clear upper 32-bits */
  2099					if (is_ereg(dst_reg))
  2100						EMIT1(0x45);
  2101					EMIT2(0x89, add_2reg(0xC0, dst_reg, dst_reg));
  2102					break;
  2103				case 64:
  2104					/* nop */
  2105					break;
  2106				}
  2107				break;
  2108	
  2109				/* speculation barrier */
  2110			case BPF_ST | BPF_NOSPEC:
  2111				EMIT_LFENCE();
  2112				break;
  2113	
  2114				/* ST: *(u8*)(dst_reg + off) = imm */
  2115			case BPF_ST | BPF_MEM | BPF_B:
  2116				if (is_ereg(dst_reg))
  2117					EMIT2(0x41, 0xC6);
  2118				else
  2119					EMIT1(0xC6);
  2120				goto st;
  2121			case BPF_ST | BPF_MEM | BPF_H:
  2122				if (is_ereg(dst_reg))
  2123					EMIT3(0x66, 0x41, 0xC7);
  2124				else
  2125					EMIT2(0x66, 0xC7);
  2126				goto st;
  2127			case BPF_ST | BPF_MEM | BPF_W:
  2128				if (is_ereg(dst_reg))
  2129					EMIT2(0x41, 0xC7);
  2130				else
  2131					EMIT1(0xC7);
  2132				goto st;
  2133			case BPF_ST | BPF_MEM | BPF_DW:
  2134				EMIT2(add_1mod(0x48, dst_reg), 0xC7);
  2135	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
  2026-01-14  9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
  2026-01-14 11:00   ` Anton Protopopov
@ 2026-01-14 20:46   ` Eduard Zingerman
  2026-01-15  7:47     ` Xu Kuohai
  1 sibling, 1 reply; 15+ messages in thread
From: Eduard Zingerman @ 2026-01-14 20:46 UTC (permalink / raw)
  To: Xu Kuohai, bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Yonghong Song, Puranjay Mohan, Anton Protopopov

On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
> 
> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
> instruction is an indirect jump target. This helper will be used by
> follow-up patches to decide where to emit indirect landing pad instructions.
> 
> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
> indirect jump targets. The BPF verifier sets this flag, and the helper
> checks it to determine whether an instruction is an indirect jump target.
> 
> Since bpf_insn_aux_data is only available before JIT stage, add a new
> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
> array, making it accessible to the JIT.
> 
> For programs with multiple subprogs, each subprog uses its own private
> copy of insn_aux_data, since subprogs may insert additional instructions
> during JIT and need to update the array. For non-subprog, the verifier's
> insn_aux_data array is used directly to avoid unnecessary copying.
> 
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---

Hm, I've missed the fact insn_aux_data is not currently available to jit.
Is it really necessary to copy this array for each subprogram?
Given that we still want to free insn_aux_data after program load,
I'd expect that it should be possible just to pass a pointer with an
offset pointing to a start of specific subprogram. Wdyt?

[...]


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump
  2026-01-14 10:29   ` Anton Protopopov
@ 2026-01-15  7:31     ` Xu Kuohai
  0 siblings, 0 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-15  7:31 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, linux-kernel, linux-arm-kernel, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Yonghong Song, Puranjay Mohan

On 1/14/2026 6:29 PM, Anton Protopopov wrote:
> On 26/01/14 05:39PM, Xu Kuohai wrote:
>> From: Xu Kuohai <xukuohai@huawei.com>
>>
>> Fix an off-by-one error in check_indirect_jump() that skips the last
>> element returned by copy_insn_array_uniq().
>>
>> Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>> ---
>>   kernel/bpf/verifier.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index faa1ecc1fe9d..22605d9e0ffa 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -20336,7 +20336,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
>>   		return -EINVAL;
>>   	}
>>   
>> -	for (i = 0; i < n - 1; i++) {
>> +	for (i = 0; i < n; i++) {
>>   		other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
>>   					  env->insn_idx, env->cur_state->speculative);
>>   		if (IS_ERR(other_branch))
>> -- 
>> 2.47.3
> 
> Nack, the last state doesn't require a push_stack() call, it is
> verified directly under this loop. Instead of this patch, just
> add another call to mark_indirect_target().

Ok, I see. Thanks for the explanation.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
  2026-01-14 11:00   ` Anton Protopopov
@ 2026-01-15  7:37     ` Xu Kuohai
  0 siblings, 0 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-15  7:37 UTC (permalink / raw)
  To: Anton Protopopov
  Cc: bpf, linux-kernel, linux-arm-kernel, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Yonghong Song, Puranjay Mohan

On 1/14/2026 7:00 PM, Anton Protopopov wrote:

[...]

>> +
>> +bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx)
>> +{
>> +	return prog->aux->insn_aux && prog->aux->insn_aux[idx].indirect_target;
> 
> Is there a case when insn_aux is NULL?
>

It is NULL when there are no indirect jump targets in the bpf prog; see the
has_indirect_target check in clone_insn_aux_data.


>> +}
>>   #endif /* CONFIG_BPF_JIT */
>>   
>>   /* Base function for offset calculation. Needs to go into .text section,
>> @@ -2540,24 +2579,24 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>>   	if (!bpf_prog_is_offloaded(fp->aux)) {
>>   		*err = bpf_prog_alloc_jited_linfo(fp);
>>   		if (*err)
>> -			return fp;
>> +			goto free_insn_aux;
>>   
>>   		fp = bpf_int_jit_compile(fp);
>>   		bpf_prog_jit_attempt_done(fp);
>>   		if (!fp->jited && jit_needed) {
>>   			*err = -ENOTSUPP;
>> -			return fp;
>> +			goto free_insn_aux;
>>   		}
>>   	} else {
>>   		*err = bpf_prog_offload_compile(fp);
>>   		if (*err)
>> -			return fp;
>> +			goto free_insn_aux;
>>   	}
>>   
>>   finalize:
>>   	*err = bpf_prog_lock_ro(fp);
>>   	if (*err)
>> -		return fp;
>> +		goto free_insn_aux;
>>   
>>   	/* The tail call compatibility check can only be done at
>>   	 * this late stage as we need to determine, if we deal
>> @@ -2566,6 +2605,10 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>>   	 */
>>   	*err = bpf_check_tail_call(fp);
>>   
>> +free_insn_aux:
>> +	vfree(fp->aux->insn_aux);
>> +	fp->aux->insn_aux = NULL;
>> +
>>   	return fp;
>>   }
>>   EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 22605d9e0ffa..f2fe6baeceb9 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -3852,6 +3852,11 @@ static bool is_jmp_point(struct bpf_verifier_env *env, int insn_idx)
>>   	return env->insn_aux_data[insn_idx].jmp_point;
>>   }
>>   
>> +static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
>> +{
>> +	env->insn_aux_data[idx].indirect_target = true;
>> +}
>> +
>>   #define LR_FRAMENO_BITS	3
>>   #define LR_SPI_BITS	6
>>   #define LR_ENTRY_BITS	(LR_SPI_BITS + LR_FRAMENO_BITS + 1)
>> @@ -20337,6 +20342,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
>>   	}
>>   
>>   	for (i = 0; i < n; i++) {
> 
> ^ n -> n-1
>

ACK

>> +		mark_indirect_target(env, env->gotox_tmp_buf->items[i]);
>>   		other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
>>   					  env->insn_idx, env->cur_state->speculative);
>>   		if (IS_ERR(other_branch))
>> @@ -21243,6 +21249,37 @@ static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env)
>>   	}
> 
> mark_indirect_target(n-1)
> 
>>   }
>>   
>> +static int clone_insn_aux_data(struct bpf_prog *prog, struct bpf_verifier_env *env, u32 off)
>> +{
>> +	u32 i;
>> +	size_t size;
>> +	bool has_indirect_target = false;
>> +	struct bpf_insn_aux_data *insn_aux;
>> +
>> +	for (i = 0; i < prog->len; i++) {
>> +		if (env->insn_aux_data[off + i].indirect_target) {
>> +			has_indirect_target = true;
>> +			break;
>> +		}
>> +	}
>> +
>> +	/* insn_aux is copied into bpf_prog so the JIT can check whether an instruction is an
>> +	 * indirect jump target. If no indirect jump targets exist, copying is unnecessary.
>> +	 */
>> +	if (!has_indirect_target)
>> +		return 0;
>> +
>> +	size = array_size(sizeof(struct bpf_insn_aux_data), prog->len);
>> +	insn_aux = vzalloc(size);
>> +	if (!insn_aux)
>> +		return -ENOMEM;
>> +
>> +	memcpy(insn_aux, env->insn_aux_data + off, size);
>> +	prog->aux->insn_aux = insn_aux;
>> +
>> +	return 0;
>> +}
>> +
>>   /* single env->prog->insni[off] instruction was replaced with the range
>>    * insni[off, off + cnt).  Adjust corresponding insn_aux_data by copying
>>    * [0, off) and [off, end) to new locations, so the patched range stays zero
>> @@ -22239,6 +22276,10 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>   		if (!i)
>>   			func[i]->aux->exception_boundary = env->seen_exception;
>>   
>> +		err = clone_insn_aux_data(func[i], env, subprog_start);
>> +		if (err < 0)
>> +			goto out_free;
>> +
>>   		/*
>>   		 * To properly pass the absolute subprog start to jit
>>   		 * all instruction adjustments should be accumulated
>> @@ -22306,6 +22347,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>   	for (i = 0; i < env->subprog_cnt; i++) {
>>   		func[i]->aux->used_maps = NULL;
>>   		func[i]->aux->used_map_cnt = 0;
>> +		vfree(func[i]->aux->insn_aux);
>> +		func[i]->aux->insn_aux = NULL;
>>   	}
>>   
>>   	/* finally lock prog and jit images for all functions and
>> @@ -22367,6 +22410,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>   	for (i = 0; i < env->subprog_cnt; i++) {
>>   		if (!func[i])
>>   			continue;
>> +		vfree(func[i]->aux->insn_aux);
>>   		func[i]->aux->poke_tab = NULL;
>>   		bpf_jit_free(func[i]);
>>   	}
>> @@ -25350,6 +25394,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>>   	env->verification_time = ktime_get_ns() - start_time;
>>   	print_verification_stats(env);
>>   	env->prog->aux->verified_insns = env->insn_processed;
>> +	env->prog->aux->insn_aux = env->insn_aux_data;
>>   
>>   	/* preserve original error even if log finalization is successful */
>>   	err = bpf_vlog_finalize(&env->log, &log_true_size);
>> @@ -25428,7 +25473,11 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>>   	if (!is_priv)
>>   		mutex_unlock(&bpf_verifier_lock);
>>   	clear_insn_aux_data(env, 0, env->prog->len);
>> -	vfree(env->insn_aux_data);
>> +	/* on success, insn_aux_data will be freed by bpf_prog_select_runtime */
>> +	if (ret) {
>> +		vfree(env->insn_aux_data);
>> +		env->prog->aux->insn_aux = NULL;
>> +	}
>>   err_free_env:
>>   	bpf_stack_liveness_free(env);
>>   	kvfree(env->cfg.insn_postorder);
>> -- 
>> 2.47.3
>>
> 
> LGTM, just in case, could you please tell how you have tested
> this patchset exactly?

I ran test_progs-cpuv4 on machines supporting x86 CET/IBT and arm64 BTI. I tested in three
environments: an arm64 physical machine with BTI support (CPU: Hisilicon KP920B), an arm64
QEMU VM using cpu=max for BTI support, and a Bochs VM with model=arrow_lake for x86 CET/IBT
support.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
  2026-01-14 20:46   ` Eduard Zingerman
@ 2026-01-15  7:47     ` Xu Kuohai
  2026-01-18 17:20       ` Alexei Starovoitov
  0 siblings, 1 reply; 15+ messages in thread
From: Xu Kuohai @ 2026-01-15  7:47 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Yonghong Song, Puranjay Mohan, Anton Protopopov

On 1/15/2026 4:46 AM, Eduard Zingerman wrote:
> On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
>> From: Xu Kuohai <xukuohai@huawei.com>
>>
>> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
>> instruction is an indirect jump target. This helper will be used by
>> follow-up patches to decide where to emit indirect landing pad instructions.
>>
>> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
>> indirect jump targets. The BPF verifier sets this flag, and the helper
>> checks it to determine whether an instruction is an indirect jump target.
>>
>> Since bpf_insn_aux_data is only available before JIT stage, add a new
>> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
>> array, making it accessible to the JIT.
>>
>> For programs with multiple subprogs, each subprog uses its own private
>> copy of insn_aux_data, since subprogs may insert additional instructions
>> during JIT and need to update the array. For non-subprog, the verifier's
>> insn_aux_data array is used directly to avoid unnecessary copying.
>>
>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>> ---
> 
> Hm, I've missed the fact insn_aux_data is not currently available to jit.
> Is it really necessary to copy this array for each subprogram?
> Given that we still want to free insn_aux_data after program load,
> I'd expect that it should be possible just to pass a pointer with an
> offset pointing to a start of specific subprogram. Wdyt?
>

I think it requires an additional field in struct bpf_prog to record the length
of the global insn_aux_data array. If a subprog inserts new instructions during
JIT (e.g., due to constant blinding), all entries in the array, including those
of the subsequent subprogs, would need to be adjusted. With per-subprog copying,
only the local insn_aux_data needs to be updated, reducing the amount of copying.

However, if you prefer a global array, I’m happy to switch to it.

> [...]




* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
  2026-01-15  7:47     ` Xu Kuohai
@ 2026-01-18 17:20       ` Alexei Starovoitov
  2026-01-18 23:22         ` Kumar Kartikeya Dwivedi
  2026-01-19  2:35         ` Xu Kuohai
  0 siblings, 2 replies; 15+ messages in thread
From: Alexei Starovoitov @ 2026-01-18 17:20 UTC (permalink / raw)
  To: Xu Kuohai, Kumar Kartikeya Dwivedi
  Cc: Eduard Zingerman, bpf, LKML, linux-arm-kernel, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Yonghong Song,
	Puranjay Mohan, Anton Protopopov

On Wed, Jan 14, 2026 at 11:47 PM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>
> On 1/15/2026 4:46 AM, Eduard Zingerman wrote:
> > On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
> >> From: Xu Kuohai <xukuohai@huawei.com>
> >>
> >> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
> >> instruction is an indirect jump target. This helper will be used by
> >> follow-up patches to decide where to emit indirect landing pad instructions.
> >>
> >> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
> >> indirect jump targets. The BPF verifier sets this flag, and the helper
> >> checks it to determine whether an instruction is an indirect jump target.
> >>
> >> Since bpf_insn_aux_data is only available before JIT stage, add a new
> >> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
> >> array, making it accessible to the JIT.
> >>
> >> For programs with multiple subprogs, each subprog uses its own private
> >> copy of insn_aux_data, since subprogs may insert additional instructions
> >> during JIT and need to update the array. For non-subprog, the verifier's
> >> insn_aux_data array is used directly to avoid unnecessary copying.
> >>
> >> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> >> ---
> >
> > Hm, I've missed the fact insn_aux_data is not currently available to jit.
> > Is it really necessary to copy this array for each subprogram?
> > Given that we still want to free insn_aux_data after program load,
> > I'd expect that it should be possible just to pass a pointer with an
> > offset pointing to a start of specific subprogram. Wdyt?
> >
>
> I think it requires an additional field in struct bpf_prog to record the length
> of the global insn_aux_data array. If a subprog inserts new instructions during
> JIT (e.g., due to constant blinding), all entries in the array, including those
> of the subsequent subprogs, would need to be adjusted. With per-subprog copying,
> only the local insn_aux_data needs to be updated, reducing the amount of copying.
>
> However, if you prefer a global array, I’m happy to switch to it.

iirc we struggled with lack of env/insn_aux in JIT earlier.

func[i]->aux->used_maps = env->used_maps;
is one such example.

Let's move bpf_prog_select_runtime() into bpf_check() and
consistently pass 'env' into bpf_int_jit_compile() while
env is still valid.
Close to jit_subprogs().
Or remove bpf_prog_select_runtime() and make jit_subprogs()
do the whole thing. tbd.

This way we can remove used_maps workaround and don't need to do
this insn_aux copy.
Errors during JIT can be printed into the verifier log too.

Kumar,
what do you think about it from modularization pov ?



* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
  2026-01-18 17:20       ` Alexei Starovoitov
@ 2026-01-18 23:22         ` Kumar Kartikeya Dwivedi
  2026-01-19  2:35         ` Xu Kuohai
  1 sibling, 0 replies; 15+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-01-18 23:22 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Xu Kuohai, Eduard Zingerman, bpf, LKML, linux-arm-kernel,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Yonghong Song, Puranjay Mohan, Anton Protopopov

On Sun, 18 Jan 2026 at 18:20, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jan 14, 2026 at 11:47 PM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
> >
> > On 1/15/2026 4:46 AM, Eduard Zingerman wrote:
> > > On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
> > >> From: Xu Kuohai <xukuohai@huawei.com>
> > >>
> > >> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
> > >> instruction is an indirect jump target. This helper will be used by
> > >> follow-up patches to decide where to emit indirect landing pad instructions.
> > >>
> > >> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
> > >> indirect jump targets. The BPF verifier sets this flag, and the helper
> > >> checks it to determine whether an instruction is an indirect jump target.
> > >>
> > >> Since bpf_insn_aux_data is only available before JIT stage, add a new
> > >> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
> > >> array, making it accessible to the JIT.
> > >>
> > >> For programs with multiple subprogs, each subprog uses its own private
> > >> copy of insn_aux_data, since subprogs may insert additional instructions
> > >> during JIT and need to update the array. For non-subprog, the verifier's
> > >> insn_aux_data array is used directly to avoid unnecessary copying.
> > >>
> > >> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> > >> ---
> > >
> > > Hm, I've missed the fact insn_aux_data is not currently available to jit.
> > > Is it really necessary to copy this array for each subprogram?
> > > Given that we still want to free insn_aux_data after program load,
> > > I'd expect that it should be possible just to pass a pointer with an
> > > offset pointing to a start of specific subprogram. Wdyt?
> > >
> >
> > I think it requires an additional field in struct bpf_prog to record the length
> > of the global insn_aux_data array. If a subprog inserts new instructions during
> > JIT (e.g., due to constant blinding), all entries in the array, including those
> > of the subsequent subprogs, would need to be adjusted. With per-subprog copying,
> > only the local insn_aux_data needs to be updated, reducing the amount of copying.
> >
> > However, if you prefer a global array, I’m happy to switch to it.
>
> iirc we struggled with lack of env/insn_aux in JIT earlier.
>
> func[i]->aux->used_maps = env->used_maps;
> is one such example.
>
> Let's move bpf_prog_select_runtime() into bpf_check() and
> consistently pass 'env' into bpf_int_jit_compile() while
> env is still valid.
> Close to jit_subprogs().
> Or remove bpf_prog_select_runtime() and make jit_subprogs()
> do the whole thing. tbd.
>
> This way we can remove used_maps workaround and don't need to do
> this insn_aux copy.
> Errors during JIT can be printed into the verifier log too.
>
> Kumar,
> what do you think about it from modularization pov ?

Makes sense to do it; I don't think it would cause any problems for
modularization.



* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
  2026-01-18 17:20       ` Alexei Starovoitov
  2026-01-18 23:22         ` Kumar Kartikeya Dwivedi
@ 2026-01-19  2:35         ` Xu Kuohai
  1 sibling, 0 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-19  2:35 UTC (permalink / raw)
  To: Alexei Starovoitov, Kumar Kartikeya Dwivedi
  Cc: Eduard Zingerman, bpf, LKML, linux-arm-kernel, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Yonghong Song,
	Puranjay Mohan, Anton Protopopov

On 1/19/2026 1:20 AM, Alexei Starovoitov wrote:
> On Wed, Jan 14, 2026 at 11:47 PM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>>
>> On 1/15/2026 4:46 AM, Eduard Zingerman wrote:
>>> On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
>>>> From: Xu Kuohai <xukuohai@huawei.com>
>>>>
>>>> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
>>>> instruction is an indirect jump target. This helper will be used by
>>>> follow-up patches to decide where to emit indirect landing pad instructions.
>>>>
>>>> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
>>>> indirect jump targets. The BPF verifier sets this flag, and the helper
>>>> checks it to determine whether an instruction is an indirect jump target.
>>>>
>>>> Since bpf_insn_aux_data is only available before JIT stage, add a new
>>>> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
>>>> array, making it accessible to the JIT.
>>>>
>>>> For programs with multiple subprogs, each subprog uses its own private
>>>> copy of insn_aux_data, since subprogs may insert additional instructions
>>>> during JIT and need to update the array. For non-subprog, the verifier's
>>>> insn_aux_data array is used directly to avoid unnecessary copying.
>>>>
>>>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>>>> ---
>>>
>>> Hm, I've missed the fact insn_aux_data is not currently available to jit.
>>> Is it really necessary to copy this array for each subprogram?
>>> Given that we still want to free insn_aux_data after program load,
>>> I'd expect that it should be possible just to pass a pointer with an
>>> offset pointing to a start of specific subprogram. Wdyt?
>>>
>>
>> I think it requires an additional field in struct bpf_prog to record the length
>> of the global insn_aux_data array. If a subprog inserts new instructions during
>> JIT (e.g., due to constant blinding), all entries in the array, including those
>> of the subsequent subprogs, would need to be adjusted. With per-subprog copying,
>> only the local insn_aux_data needs to be updated, reducing the amount of copying.
>>
>> However, if you prefer a global array, I’m happy to switch to it.
> 
> iirc we struggled with lack of env/insn_aux in JIT earlier.
> 
> func[i]->aux->used_maps = env->used_maps;
> is one such example.
> 
> Let's move bpf_prog_select_runtime() into bpf_check() and
> consistently pass 'env' into bpf_int_jit_compile() while
> env is still valid.
> Close to jit_subprogs().
> Or remove bpf_prog_select_runtime() and make jit_subprogs()
> do the whole thing. tbd.
> 
> This way we can remove used_maps workaround and don't need to do
> this insn_aux copy.
> Errors during JIT can be printed into the verifier log too.
>

Sounds great. Using jit_subprogs for the whole thing seems cleaner. I'll
try this approach first.

> Kumar,
> what do you think about it from modularization pov ?




end of thread, other threads:[~2026-01-19  2:35 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2026-01-14  9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
2026-01-14  9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
2026-01-14 10:29   ` Anton Protopopov
2026-01-15  7:31     ` Xu Kuohai
2026-01-14  9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
2026-01-14 11:00   ` Anton Protopopov
2026-01-15  7:37     ` Xu Kuohai
2026-01-14 20:46   ` Eduard Zingerman
2026-01-15  7:47     ` Xu Kuohai
2026-01-18 17:20       ` Alexei Starovoitov
2026-01-18 23:22         ` Kumar Kartikeya Dwivedi
2026-01-19  2:35         ` Xu Kuohai
2026-01-14  9:39 ` [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for " Xu Kuohai
2026-01-14 16:46   ` kernel test robot
2026-01-14  9:39 ` [PATCH bpf-next v4 4/4] bpf, arm64: Emit BTI for indirect jump target Xu Kuohai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox