* [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets
@ 2026-01-14 9:39 Xu Kuohai
2026-01-14 9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14 9:39 UTC (permalink / raw)
To: bpf, linux-kernel, linux-arm-kernel
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
Anton Protopopov
From: Xu Kuohai <xukuohai@huawei.com>
On x86 CPUs with CET/IBT and arm64 CPUs with BTI, missing landing pad instructions
at indirect jump targets trigger a kernel panic. So emit ENDBR instructions for
indirect jump targets on x86 and BTI instructions on arm64. Indirect jump targets
are identified from the insn_aux_data created by the verifier.
Patch 1 fixes an off-by-one error that causes the last ENDBR/BTI instruction to be
omitted.
Patch 2 introduces a helper to determine whether an instruction is an indirect jump target.
Patches 3 and 4 emit ENDBR and BTI instructions for indirect jump targets on x86 and
arm64, respectively.
v4:
- Switch to the approach proposed by Eduard, using insn_aux_data to identify indirect
jump targets, and emit ENDBR on x86
v3: https://lore.kernel.org/bpf/20251227081033.240336-1-xukuohai@huaweicloud.com/
- Get rid of unnecessary enum definition (Yonghong Song, Anton Protopopov)
v2: https://lore.kernel.org/bpf/20251223085447.139301-1-xukuohai@huaweicloud.com/
- Exclude instruction arrays not used for indirect jumps (Anton Protopopov)
v1: https://lore.kernel.org/bpf/20251127140318.3944249-1-xukuohai@huaweicloud.com/
Xu Kuohai (4):
bpf: Fix an off-by-one error in check_indirect_jump
bpf: Add helper to detect indirect jump targets
bpf, x86: Emit ENDBR for indirect jump targets
bpf, arm64: Emit BTI for indirect jump target
arch/arm64/net/bpf_jit_comp.c | 3 ++
arch/x86/net/bpf_jit_comp.c | 15 ++++++----
include/linux/bpf.h | 2 ++
include/linux/bpf_verifier.h | 10 ++++---
kernel/bpf/core.c | 51 ++++++++++++++++++++++++++++++---
kernel/bpf/verifier.c | 53 +++++++++++++++++++++++++++++++++--
6 files changed, 119 insertions(+), 15 deletions(-)
--
2.47.3
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump
2026-01-14 9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
@ 2026-01-14 9:39 ` Xu Kuohai
2026-01-14 10:29 ` Anton Protopopov
2026-01-14 9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14 9:39 UTC (permalink / raw)
To: bpf, linux-kernel, linux-arm-kernel
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
Anton Protopopov
From: Xu Kuohai <xukuohai@huawei.com>
Fix an off-by-one error in check_indirect_jump() that skips the last
element returned by copy_insn_array_uniq().
Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
kernel/bpf/verifier.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index faa1ecc1fe9d..22605d9e0ffa 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20336,7 +20336,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
return -EINVAL;
}
- for (i = 0; i < n - 1; i++) {
+ for (i = 0; i < n; i++) {
other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
env->insn_idx, env->cur_state->speculative);
if (IS_ERR(other_branch))
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
2026-01-14 9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
2026-01-14 9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
@ 2026-01-14 9:39 ` Xu Kuohai
2026-01-14 11:00 ` Anton Protopopov
2026-01-14 20:46 ` Eduard Zingerman
2026-01-14 9:39 ` [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for " Xu Kuohai
2026-01-14 9:39 ` [PATCH bpf-next v4 4/4] bpf, arm64: Emit BTI for indirect jump target Xu Kuohai
3 siblings, 2 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14 9:39 UTC (permalink / raw)
To: bpf, linux-kernel, linux-arm-kernel
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
Anton Protopopov
From: Xu Kuohai <xukuohai@huawei.com>
Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
instruction is an indirect jump target. This helper will be used by
follow-up patches to decide where to emit indirect landing pad instructions.
Add a new flag to struct bpf_insn_aux_data to mark instructions that are
indirect jump targets. The BPF verifier sets this flag, and the helper
checks it to determine whether an instruction is an indirect jump target.
Since bpf_insn_aux_data is only available before the JIT stage, add a new
field to struct bpf_prog_aux that stores a pointer to the bpf_insn_aux_data
array, making it accessible to the JIT.
For programs with multiple subprogs, each subprog uses its own private
copy of insn_aux_data, since subprogs may insert additional instructions
during the JIT and need to update the array. For programs without subprogs,
the verifier's insn_aux_data array is used directly to avoid an unnecessary copy.
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
include/linux/bpf.h | 2 ++
include/linux/bpf_verifier.h | 10 ++++---
kernel/bpf/core.c | 51 +++++++++++++++++++++++++++++++++---
kernel/bpf/verifier.c | 51 +++++++++++++++++++++++++++++++++++-
4 files changed, 105 insertions(+), 9 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5936f8e2996f..e7d7e705327e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1533,6 +1533,7 @@ bool bpf_has_frame_pointer(unsigned long ip);
int bpf_jit_charge_modmem(u32 size);
void bpf_jit_uncharge_modmem(u32 size);
bool bpf_prog_has_trampoline(const struct bpf_prog *prog);
+bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx);
#else
static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
struct bpf_trampoline *tr,
@@ -1760,6 +1761,7 @@ struct bpf_prog_aux {
struct bpf_stream stream[2];
struct mutex st_ops_assoc_mutex;
struct bpf_map __rcu *st_ops_assoc;
+ struct bpf_insn_aux_data *insn_aux;
};
#define BPF_NR_CONTEXTS 4 /* normal, softirq, hardirq, NMI */
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 130bcbd66f60..758086b384df 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -574,16 +574,18 @@ struct bpf_insn_aux_data {
/* below fields are initialized once */
unsigned int orig_idx; /* original instruction index */
- bool jmp_point;
- bool prune_point;
+ u32 jmp_point:1;
+ u32 prune_point:1;
/* ensure we check state equivalence and save state checkpoint and
* this instruction, regardless of any heuristics
*/
- bool force_checkpoint;
+ u32 force_checkpoint:1;
/* true if instruction is a call to a helper function that
* accepts callback function as a parameter.
*/
- bool calls_callback;
+ u32 calls_callback:1;
+ /* true if the instruction is an indirect jump target */
+ u32 indirect_target:1;
/*
* CFG strongly connected component this instruction belongs to,
* zero if it is a singleton SCC.
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index e0b8a8a5aaa9..bb870936e74b 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1486,6 +1486,35 @@ static void adjust_insn_arrays(struct bpf_prog *prog, u32 off, u32 len)
#endif
}
+static int adjust_insn_aux(struct bpf_prog *prog, int off, int cnt)
+{
+ size_t size;
+ struct bpf_insn_aux_data *new_aux;
+
+ if (cnt == 1)
+ return 0;
+
+ /* prog->len already accounts for the cnt - 1 newly inserted instructions */
+ size = array_size(prog->len, sizeof(struct bpf_insn_aux_data));
+ new_aux = vrealloc(prog->aux->insn_aux, size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+ if (!new_aux)
+ return -ENOMEM;
+
+ /* follow the same behavior as adjust_insn_array(): leave [0, off] unchanged and shift
+ * [off + 1, end) to [off + cnt, end). Otherwise, the JIT would emit landing pads at
+ * wrong locations, as the actual indirect jump target remains at off.
+ */
+ size = array_size(prog->len - off - cnt, sizeof(struct bpf_insn_aux_data));
+ memmove(new_aux + off + cnt, new_aux + off + 1, size);
+
+ size = array_size(cnt - 1, sizeof(struct bpf_insn_aux_data));
+ memset(new_aux + off + 1, 0, size);
+
+ prog->aux->insn_aux = new_aux;
+
+ return 0;
+}
+
struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
{
struct bpf_insn insn_buff[16], aux[2];
@@ -1541,6 +1570,11 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
clone = tmp;
insn_delta = rewritten - 1;
+ if (adjust_insn_aux(clone, i, rewritten)) {
+ bpf_jit_prog_release_other(prog, clone);
+ return ERR_PTR(-ENOMEM);
+ }
+
/* Instructions arrays must be updated using absolute xlated offsets */
adjust_insn_arrays(clone, prog->aux->subprog_start + i, rewritten);
@@ -1553,6 +1587,11 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
clone->blinded = 1;
return clone;
}
+
+bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx)
+{
+ return prog->aux->insn_aux && prog->aux->insn_aux[idx].indirect_target;
+}
#endif /* CONFIG_BPF_JIT */
/* Base function for offset calculation. Needs to go into .text section,
@@ -2540,24 +2579,24 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
if (!bpf_prog_is_offloaded(fp->aux)) {
*err = bpf_prog_alloc_jited_linfo(fp);
if (*err)
- return fp;
+ goto free_insn_aux;
fp = bpf_int_jit_compile(fp);
bpf_prog_jit_attempt_done(fp);
if (!fp->jited && jit_needed) {
*err = -ENOTSUPP;
- return fp;
+ goto free_insn_aux;
}
} else {
*err = bpf_prog_offload_compile(fp);
if (*err)
- return fp;
+ goto free_insn_aux;
}
finalize:
*err = bpf_prog_lock_ro(fp);
if (*err)
- return fp;
+ goto free_insn_aux;
/* The tail call compatibility check can only be done at
* this late stage as we need to determine, if we deal
@@ -2566,6 +2605,10 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
*/
*err = bpf_check_tail_call(fp);
+free_insn_aux:
+ vfree(fp->aux->insn_aux);
+ fp->aux->insn_aux = NULL;
+
return fp;
}
EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 22605d9e0ffa..f2fe6baeceb9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3852,6 +3852,11 @@ static bool is_jmp_point(struct bpf_verifier_env *env, int insn_idx)
return env->insn_aux_data[insn_idx].jmp_point;
}
+static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
+{
+ env->insn_aux_data[idx].indirect_target = true;
+}
+
#define LR_FRAMENO_BITS 3
#define LR_SPI_BITS 6
#define LR_ENTRY_BITS (LR_SPI_BITS + LR_FRAMENO_BITS + 1)
@@ -20337,6 +20342,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
}
for (i = 0; i < n; i++) {
+ mark_indirect_target(env, env->gotox_tmp_buf->items[i]);
other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
env->insn_idx, env->cur_state->speculative);
if (IS_ERR(other_branch))
@@ -21243,6 +21249,37 @@ static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env)
}
}
+static int clone_insn_aux_data(struct bpf_prog *prog, struct bpf_verifier_env *env, u32 off)
+{
+ u32 i;
+ size_t size;
+ bool has_indirect_target = false;
+ struct bpf_insn_aux_data *insn_aux;
+
+ for (i = 0; i < prog->len; i++) {
+ if (env->insn_aux_data[off + i].indirect_target) {
+ has_indirect_target = true;
+ break;
+ }
+ }
+
+ /* insn_aux is copied into bpf_prog so the JIT can check whether an instruction is an
+ * indirect jump target. If no indirect jump targets exist, copying is unnecessary.
+ */
+ if (!has_indirect_target)
+ return 0;
+
+ size = array_size(sizeof(struct bpf_insn_aux_data), prog->len);
+ insn_aux = vzalloc(size);
+ if (!insn_aux)
+ return -ENOMEM;
+
+ memcpy(insn_aux, env->insn_aux_data + off, size);
+ prog->aux->insn_aux = insn_aux;
+
+ return 0;
+}
+
/* single env->prog->insni[off] instruction was replaced with the range
* insni[off, off + cnt). Adjust corresponding insn_aux_data by copying
* [0, off) and [off, end) to new locations, so the patched range stays zero
@@ -22239,6 +22276,10 @@ static int jit_subprogs(struct bpf_verifier_env *env)
if (!i)
func[i]->aux->exception_boundary = env->seen_exception;
+ err = clone_insn_aux_data(func[i], env, subprog_start);
+ if (err < 0)
+ goto out_free;
+
/*
* To properly pass the absolute subprog start to jit
* all instruction adjustments should be accumulated
@@ -22306,6 +22347,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
for (i = 0; i < env->subprog_cnt; i++) {
func[i]->aux->used_maps = NULL;
func[i]->aux->used_map_cnt = 0;
+ vfree(func[i]->aux->insn_aux);
+ func[i]->aux->insn_aux = NULL;
}
/* finally lock prog and jit images for all functions and
@@ -22367,6 +22410,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
for (i = 0; i < env->subprog_cnt; i++) {
if (!func[i])
continue;
+ vfree(func[i]->aux->insn_aux);
func[i]->aux->poke_tab = NULL;
bpf_jit_free(func[i]);
}
@@ -25350,6 +25394,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
env->verification_time = ktime_get_ns() - start_time;
print_verification_stats(env);
env->prog->aux->verified_insns = env->insn_processed;
+ env->prog->aux->insn_aux = env->insn_aux_data;
/* preserve original error even if log finalization is successful */
err = bpf_vlog_finalize(&env->log, &log_true_size);
@@ -25428,7 +25473,11 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
if (!is_priv)
mutex_unlock(&bpf_verifier_lock);
clear_insn_aux_data(env, 0, env->prog->len);
- vfree(env->insn_aux_data);
+ /* on success, insn_aux_data will be freed by bpf_prog_select_runtime */
+ if (ret) {
+ vfree(env->insn_aux_data);
+ env->prog->aux->insn_aux = NULL;
+ }
err_free_env:
bpf_stack_liveness_free(env);
kvfree(env->cfg.insn_postorder);
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for indirect jump targets
2026-01-14 9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
2026-01-14 9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
2026-01-14 9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
@ 2026-01-14 9:39 ` Xu Kuohai
2026-01-14 16:46 ` kernel test robot
2026-01-14 9:39 ` [PATCH bpf-next v4 4/4] bpf, arm64: Emit BTI for indirect jump target Xu Kuohai
3 siblings, 1 reply; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14 9:39 UTC (permalink / raw)
To: bpf, linux-kernel, linux-arm-kernel
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
Anton Protopopov
From: Xu Kuohai <xukuohai@huawei.com>
On CPUs that support CET/IBT, the indirect jump selftest triggers
a kernel panic because the indirect jump targets lack ENDBR
instructions.
To fix this, emit an ENDBR instruction at each indirect jump target. Since
the ENDBR instruction shifts the positions of the original JITed instructions,
fix the instruction address calculation wherever these addresses are used.
For reference, below is a sample panic log.
Missing ENDBR: bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
------------[ cut here ]------------
kernel BUG at arch/x86/kernel/cet.c:133!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
...
? 0xffffffffc00fb258
? bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
bpf_prog_test_run_syscall+0x110/0x2f0
? fdget+0xba/0xe0
__sys_bpf+0xe4b/0x2590
? __kmalloc_node_track_caller_noprof+0x1c7/0x680
? bpf_prog_test_run_syscall+0x215/0x2f0
__x64_sys_bpf+0x21/0x30
do_syscall_64+0x85/0x620
? bpf_prog_test_run_syscall+0x1e2/0x2f0
Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
arch/x86/net/bpf_jit_comp.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index e3b1c4b1d550..ef79baac42d7 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1733,6 +1733,9 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
dst_reg = X86_REG_R9;
}
+ if (bpf_insn_is_indirect_target(bpf_prog, i - 1))
+ EMIT_ENDBR();
+
switch (insn->code) {
/* ALU */
case BPF_ALU | BPF_ADD | BPF_X:
@@ -2439,7 +2442,7 @@ st: if (is_imm8(insn->off))
/* call */
case BPF_JMP | BPF_CALL: {
- u8 *ip = image + addrs[i - 1];
+ u8 *ip = image + addrs[i - 1] + (prog - temp);
func = (u8 *) __bpf_call_base + imm32;
if (src_reg == BPF_PSEUDO_CALL && tail_call_reachable) {
@@ -2464,7 +2467,8 @@ st: if (is_imm8(insn->off))
if (imm32)
emit_bpf_tail_call_direct(bpf_prog,
&bpf_prog->aux->poke_tab[imm32 - 1],
- &prog, image + addrs[i - 1],
+ &prog,
+ image + addrs[i - 1] + (prog - temp),
callee_regs_used,
stack_depth,
ctx);
@@ -2473,7 +2477,7 @@ st: if (is_imm8(insn->off))
&prog,
callee_regs_used,
stack_depth,
- image + addrs[i - 1],
+ image + addrs[i - 1] + (prog - temp),
ctx);
break;
@@ -2638,7 +2642,8 @@ st: if (is_imm8(insn->off))
break;
case BPF_JMP | BPF_JA | BPF_X:
- emit_indirect_jump(&prog, insn->dst_reg, image + addrs[i - 1]);
+ emit_indirect_jump(&prog, insn->dst_reg,
+ image + addrs[i - 1] + (prog - temp));
break;
case BPF_JMP | BPF_JA:
case BPF_JMP32 | BPF_JA:
@@ -2728,7 +2733,7 @@ st: if (is_imm8(insn->off))
ctx->cleanup_addr = proglen;
if (bpf_prog_was_classic(bpf_prog) &&
!ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)) {
- u8 *ip = image + addrs[i - 1];
+ u8 *ip = image + addrs[i - 1] + (prog - temp);
if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog))
return -EINVAL;
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH bpf-next v4 4/4] bpf, arm64: Emit BTI for indirect jump target
2026-01-14 9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
` (2 preceding siblings ...)
2026-01-14 9:39 ` [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for " Xu Kuohai
@ 2026-01-14 9:39 ` Xu Kuohai
3 siblings, 0 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-14 9:39 UTC (permalink / raw)
To: bpf, linux-kernel, linux-arm-kernel
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Yonghong Song, Puranjay Mohan,
Anton Protopopov
From: Xu Kuohai <xukuohai@huawei.com>
On CPUs that support BTI, the indirect jump selftest triggers a kernel
panic because there are no BTI instructions at the indirect jump targets.
Fix this by emitting a BTI instruction at each indirect jump target.
For reference, below is a sample panic log.
Internal error: Oops - BTI: 0000000036000003 [#1] SMP
...
Call trace:
bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x54/0xf8 (P)
bpf_prog_run_pin_on_cpu+0x140/0x468
bpf_prog_test_run_syscall+0x280/0x3b8
bpf_prog_test_run+0x22c/0x2c0
Fixes: f4a66cf1cb14 ("bpf: arm64: Add support for indirect jumps")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
arch/arm64/net/bpf_jit_comp.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 0c4d44bcfbf4..370ae0751b9e 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1231,6 +1231,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
int ret;
bool sign_extend;
+ if (bpf_insn_is_indirect_target(ctx->prog, i))
+ emit_bti(A64_BTI_J, ctx);
+
switch (code) {
/* dst = src */
case BPF_ALU | BPF_MOV | BPF_X:
--
2.47.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump
2026-01-14 9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
@ 2026-01-14 10:29 ` Anton Protopopov
2026-01-15 7:31 ` Xu Kuohai
0 siblings, 1 reply; 15+ messages in thread
From: Anton Protopopov @ 2026-01-14 10:29 UTC (permalink / raw)
To: Xu Kuohai
Cc: bpf, linux-kernel, linux-arm-kernel, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, Puranjay Mohan
On 26/01/14 05:39PM, Xu Kuohai wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
>
> Fix an off-by-one error in check_indirect_jump() that skips the last
> element returned by copy_insn_array_uniq().
>
> Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
> kernel/bpf/verifier.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index faa1ecc1fe9d..22605d9e0ffa 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -20336,7 +20336,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
> return -EINVAL;
> }
>
> - for (i = 0; i < n - 1; i++) {
> + for (i = 0; i < n; i++) {
> other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
> env->insn_idx, env->cur_state->speculative);
> if (IS_ERR(other_branch))
> --
> 2.47.3
Nack: the last state doesn't require a push_stack() call, as it is
verified directly after this loop. Instead of this patch, just
add another call to mark_indirect_target().
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
2026-01-14 9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
@ 2026-01-14 11:00 ` Anton Protopopov
2026-01-15 7:37 ` Xu Kuohai
2026-01-14 20:46 ` Eduard Zingerman
1 sibling, 1 reply; 15+ messages in thread
From: Anton Protopopov @ 2026-01-14 11:00 UTC (permalink / raw)
To: Xu Kuohai
Cc: bpf, linux-kernel, linux-arm-kernel, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, Puranjay Mohan
On 26/01/14 05:39PM, Xu Kuohai wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
>
> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
> instruction is an indirect jump target. This helper will be used by
> follow-up patches to decide where to emit indirect landing pad instructions.
>
> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
> indirect jump targets. The BPF verifier sets this flag, and the helper
> checks it to determine whether an instruction is an indirect jump target.
>
> Since bpf_insn_aux_data is only available before JIT stage, add a new
> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
> array, making it accessible to the JIT.
>
> For programs with multiple subprogs, each subprog uses its own private
> copy of insn_aux_data, since subprogs may insert additional instructions
> during JIT and need to update the array. For non-subprog, the verifier's
> insn_aux_data array is used directly to avoid unnecessary copying.
>
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
> include/linux/bpf.h | 2 ++
> include/linux/bpf_verifier.h | 10 ++++---
> kernel/bpf/core.c | 51 +++++++++++++++++++++++++++++++++---
> kernel/bpf/verifier.c | 51 +++++++++++++++++++++++++++++++++++-
> 4 files changed, 105 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 5936f8e2996f..e7d7e705327e 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1533,6 +1533,7 @@ bool bpf_has_frame_pointer(unsigned long ip);
> int bpf_jit_charge_modmem(u32 size);
> void bpf_jit_uncharge_modmem(u32 size);
> bool bpf_prog_has_trampoline(const struct bpf_prog *prog);
> +bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx);
> #else
> static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
> struct bpf_trampoline *tr,
> @@ -1760,6 +1761,7 @@ struct bpf_prog_aux {
> struct bpf_stream stream[2];
> struct mutex st_ops_assoc_mutex;
> struct bpf_map __rcu *st_ops_assoc;
> + struct bpf_insn_aux_data *insn_aux;
> };
>
> #define BPF_NR_CONTEXTS 4 /* normal, softirq, hardirq, NMI */
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 130bcbd66f60..758086b384df 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -574,16 +574,18 @@ struct bpf_insn_aux_data {
>
> /* below fields are initialized once */
> unsigned int orig_idx; /* original instruction index */
> - bool jmp_point;
> - bool prune_point;
> + u32 jmp_point:1;
> + u32 prune_point:1;
> /* ensure we check state equivalence and save state checkpoint and
> * this instruction, regardless of any heuristics
> */
> - bool force_checkpoint;
> + u32 force_checkpoint:1;
> /* true if instruction is a call to a helper function that
> * accepts callback function as a parameter.
> */
> - bool calls_callback;
> + u32 calls_callback:1;
> + /* true if the instruction is an indirect jump target */
> + u32 indirect_target:1;
> /*
> * CFG strongly connected component this instruction belongs to,
> * zero if it is a singleton SCC.
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index e0b8a8a5aaa9..bb870936e74b 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -1486,6 +1486,35 @@ static void adjust_insn_arrays(struct bpf_prog *prog, u32 off, u32 len)
> #endif
> }
>
> +static int adjust_insn_aux(struct bpf_prog *prog, int off, int cnt)
> +{
> + size_t size;
> + struct bpf_insn_aux_data *new_aux;
> +
> + if (cnt == 1)
> + return 0;
> +
> + /* prog->len already accounts for the cnt - 1 newly inserted instructions */
> + size = array_size(prog->len, sizeof(struct bpf_insn_aux_data));
> + new_aux = vrealloc(prog->aux->insn_aux, size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> + if (!new_aux)
> + return -ENOMEM;
> +
> + /* follow the same behavior as adjust_insn_array(): leave [0, off] unchanged and shift
> + * [off + 1, end) to [off + cnt, end). Otherwise, the JIT would emit landing pads at
> + * wrong locations, as the actual indirect jump target remains at off.
> + */
> + size = array_size(prog->len - off - cnt, sizeof(struct bpf_insn_aux_data));
> + memmove(new_aux + off + cnt, new_aux + off + 1, size);
> +
> + size = array_size(cnt - 1, sizeof(struct bpf_insn_aux_data));
> + memset(new_aux + off + 1, 0, size);
> +
> + prog->aux->insn_aux = new_aux;
> +
> + return 0;
> +}
> +
> struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
> {
> struct bpf_insn insn_buff[16], aux[2];
> @@ -1541,6 +1570,11 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
> clone = tmp;
> insn_delta = rewritten - 1;
>
> + if (adjust_insn_aux(clone, i, rewritten)) {
> + bpf_jit_prog_release_other(prog, clone);
> + return ERR_PTR(-ENOMEM);
> + }
> +
> /* Instructions arrays must be updated using absolute xlated offsets */
> adjust_insn_arrays(clone, prog->aux->subprog_start + i, rewritten);
>
> @@ -1553,6 +1587,11 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
> clone->blinded = 1;
> return clone;
> }
> +
> +bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx)
> +{
> + return prog->aux->insn_aux && prog->aux->insn_aux[idx].indirect_target;
Is there a case when insn_aux is NULL?
> +}
> #endif /* CONFIG_BPF_JIT */
>
> /* Base function for offset calculation. Needs to go into .text section,
> @@ -2540,24 +2579,24 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
> if (!bpf_prog_is_offloaded(fp->aux)) {
> *err = bpf_prog_alloc_jited_linfo(fp);
> if (*err)
> - return fp;
> + goto free_insn_aux;
>
> fp = bpf_int_jit_compile(fp);
> bpf_prog_jit_attempt_done(fp);
> if (!fp->jited && jit_needed) {
> *err = -ENOTSUPP;
> - return fp;
> + goto free_insn_aux;
> }
> } else {
> *err = bpf_prog_offload_compile(fp);
> if (*err)
> - return fp;
> + goto free_insn_aux;
> }
>
> finalize:
> *err = bpf_prog_lock_ro(fp);
> if (*err)
> - return fp;
> + goto free_insn_aux;
>
> /* The tail call compatibility check can only be done at
> * this late stage as we need to determine, if we deal
> @@ -2566,6 +2605,10 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
> */
> *err = bpf_check_tail_call(fp);
>
> +free_insn_aux:
> + vfree(fp->aux->insn_aux);
> + fp->aux->insn_aux = NULL;
> +
> return fp;
> }
> EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 22605d9e0ffa..f2fe6baeceb9 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3852,6 +3852,11 @@ static bool is_jmp_point(struct bpf_verifier_env *env, int insn_idx)
> return env->insn_aux_data[insn_idx].jmp_point;
> }
>
> +static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
> +{
> + env->insn_aux_data[idx].indirect_target = true;
> +}
> +
> #define LR_FRAMENO_BITS 3
> #define LR_SPI_BITS 6
> #define LR_ENTRY_BITS (LR_SPI_BITS + LR_FRAMENO_BITS + 1)
> @@ -20337,6 +20342,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
> }
>
> for (i = 0; i < n; i++) {
^ n -> n-1
> + mark_indirect_target(env, env->gotox_tmp_buf->items[i]);
> other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
> env->insn_idx, env->cur_state->speculative);
> if (IS_ERR(other_branch))
> @@ -21243,6 +21249,37 @@ static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env)
> }
mark_indirect_target(n-1)
> }
>
> +static int clone_insn_aux_data(struct bpf_prog *prog, struct bpf_verifier_env *env, u32 off)
> +{
> + u32 i;
> + size_t size;
> + bool has_indirect_target = false;
> + struct bpf_insn_aux_data *insn_aux;
> +
> + for (i = 0; i < prog->len; i++) {
> + if (env->insn_aux_data[off + i].indirect_target) {
> + has_indirect_target = true;
> + break;
> + }
> + }
> +
> + /* insn_aux is copied into bpf_prog so the JIT can check whether an instruction is an
> + * indirect jump target. If no indirect jump targets exist, copying is unnecessary.
> + */
> + if (!has_indirect_target)
> + return 0;
> +
> + size = array_size(sizeof(struct bpf_insn_aux_data), prog->len);
> + insn_aux = vzalloc(size);
> + if (!insn_aux)
> + return -ENOMEM;
> +
> + memcpy(insn_aux, env->insn_aux_data + off, size);
> + prog->aux->insn_aux = insn_aux;
> +
> + return 0;
> +}
> +
> /* single env->prog->insni[off] instruction was replaced with the range
> * insni[off, off + cnt). Adjust corresponding insn_aux_data by copying
> * [0, off) and [off, end) to new locations, so the patched range stays zero
> @@ -22239,6 +22276,10 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> if (!i)
> func[i]->aux->exception_boundary = env->seen_exception;
>
> + err = clone_insn_aux_data(func[i], env, subprog_start);
> + if (err < 0)
> + goto out_free;
> +
> /*
> * To properly pass the absolute subprog start to jit
> * all instruction adjustments should be accumulated
> @@ -22306,6 +22347,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> for (i = 0; i < env->subprog_cnt; i++) {
> func[i]->aux->used_maps = NULL;
> func[i]->aux->used_map_cnt = 0;
> + vfree(func[i]->aux->insn_aux);
> + func[i]->aux->insn_aux = NULL;
> }
>
> /* finally lock prog and jit images for all functions and
> @@ -22367,6 +22410,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> for (i = 0; i < env->subprog_cnt; i++) {
> if (!func[i])
> continue;
> + vfree(func[i]->aux->insn_aux);
> func[i]->aux->poke_tab = NULL;
> bpf_jit_free(func[i]);
> }
> @@ -25350,6 +25394,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
> env->verification_time = ktime_get_ns() - start_time;
> print_verification_stats(env);
> env->prog->aux->verified_insns = env->insn_processed;
> + env->prog->aux->insn_aux = env->insn_aux_data;
>
> /* preserve original error even if log finalization is successful */
> err = bpf_vlog_finalize(&env->log, &log_true_size);
> @@ -25428,7 +25473,11 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
> if (!is_priv)
> mutex_unlock(&bpf_verifier_lock);
> clear_insn_aux_data(env, 0, env->prog->len);
> - vfree(env->insn_aux_data);
> + /* on success, insn_aux_data will be freed by bpf_prog_select_runtime */
> + if (ret) {
> + vfree(env->insn_aux_data);
> + env->prog->aux->insn_aux = NULL;
> + }
> err_free_env:
> bpf_stack_liveness_free(env);
> kvfree(env->cfg.insn_postorder);
> --
> 2.47.3
>
LGTM, just in case, could you please tell how you have tested
this patchset exactly?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for indirect jump targets
2026-01-14 9:39 ` [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for " Xu Kuohai
@ 2026-01-14 16:46 ` kernel test robot
0 siblings, 0 replies; 15+ messages in thread
From: kernel test robot @ 2026-01-14 16:46 UTC (permalink / raw)
To: Xu Kuohai, bpf, linux-kernel, linux-arm-kernel
Cc: oe-kbuild-all, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Yonghong Song, Puranjay Mohan, Anton Protopopov
Hi Xu,
kernel test robot noticed the following build warnings:
[auto build test WARNING on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/Xu-Kuohai/bpf-Fix-an-off-by-one-error-in-check_indirect_jump/20260114-172632
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20260114093914.2403982-4-xukuohai%40huaweicloud.com
patch subject: [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for indirect jump targets
config: x86_64-buildonly-randconfig-002-20260114 (https://download.01.org/0day-ci/archive/20260115/202601150016.x24DRk9R-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260115/202601150016.x24DRk9R-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202601150016.x24DRk9R-lkp@intel.com/
All warnings (new ones prefixed by >>):
arch/x86/net/bpf_jit_comp.c: In function 'do_jit':
>> arch/x86/net/bpf_jit_comp.c:1737:37: warning: suggest braces around empty body in an 'if' statement [-Wempty-body]
1737 | EMIT_ENDBR();
| ^
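The warning above is the classic -Wempty-body pattern: when the relevant config option (here presumably CONFIG_X86_KERNEL_IBT) is disabled, EMIT_ENDBR() expands to nothing, so `if (cond) EMIT_ENDBR();` leaves an `if` with an empty body. A minimal reduction, with illustrative names standing in for the JIT internals, is:

```c
#include <assert.h>

/* Hypothetical reduction of the -Wempty-body warning: a macro like
 *     #define EMIT_ENDBR()
 * expands to nothing, so "if (cond) EMIT_ENDBR();" becomes an if with
 * an empty body. The conventional fix is an empty do/while (0), which
 * is a real statement and also binds safely inside if/else chains.
 */
#define EMIT_ENDBR() do { } while (0)

static int emitted;

/* illustrative stand-in for bpf_insn_is_indirect_target() */
static int is_indirect_target(int idx)
{
	return idx == 3;
}

static void jit_insn(int idx)
{
	if (is_indirect_target(idx))
		EMIT_ENDBR();	/* no -Wempty-body warning here */
	else
		emitted++;	/* dangling else still binds correctly */
}

int count_non_targets(int n)
{
	int i;

	emitted = 0;
	for (i = 0; i < n; i++)
		jit_insn(i);
	return emitted;
}
```
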
vim +/if +1737 arch/x86/net/bpf_jit_comp.c
1650
1651 static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image,
1652 int oldproglen, struct jit_context *ctx, bool jmp_padding)
1653 {
1654 bool tail_call_reachable = bpf_prog->aux->tail_call_reachable;
1655 struct bpf_insn *insn = bpf_prog->insnsi;
1656 bool callee_regs_used[4] = {};
1657 int insn_cnt = bpf_prog->len;
1658 bool seen_exit = false;
1659 u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY];
1660 void __percpu *priv_frame_ptr = NULL;
1661 u64 arena_vm_start, user_vm_start;
1662 void __percpu *priv_stack_ptr;
1663 int i, excnt = 0;
1664 int ilen, proglen = 0;
1665 u8 *prog = temp;
1666 u32 stack_depth;
1667 int err;
1668
1669 stack_depth = bpf_prog->aux->stack_depth;
1670 priv_stack_ptr = bpf_prog->aux->priv_stack_ptr;
1671 if (priv_stack_ptr) {
1672 priv_frame_ptr = priv_stack_ptr + PRIV_STACK_GUARD_SZ + round_up(stack_depth, 8);
1673 stack_depth = 0;
1674 }
1675
1676 arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena);
1677 user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena);
1678
1679 detect_reg_usage(insn, insn_cnt, callee_regs_used);
1680
1681 emit_prologue(&prog, image, stack_depth,
1682 bpf_prog_was_classic(bpf_prog), tail_call_reachable,
1683 bpf_is_subprog(bpf_prog), bpf_prog->aux->exception_cb);
1684
1685 bpf_prog->aux->ksym.fp_start = prog - temp;
1686
1687 /* Exception callback will clobber callee regs for its own use, and
1688 * restore the original callee regs from main prog's stack frame.
1689 */
1690 if (bpf_prog->aux->exception_boundary) {
1691 /* We also need to save r12, which is not mapped to any BPF
1692 * register, as we throw after entry into the kernel, which may
1693 * overwrite r12.
1694 */
1695 push_r12(&prog);
1696 push_callee_regs(&prog, all_callee_regs_used);
1697 } else {
1698 if (arena_vm_start)
1699 push_r12(&prog);
1700 push_callee_regs(&prog, callee_regs_used);
1701 }
1702 if (arena_vm_start)
1703 emit_mov_imm64(&prog, X86_REG_R12,
1704 arena_vm_start >> 32, (u32) arena_vm_start);
1705
1706 if (priv_frame_ptr)
1707 emit_priv_frame_ptr(&prog, priv_frame_ptr);
1708
1709 ilen = prog - temp;
1710 if (rw_image)
1711 memcpy(rw_image + proglen, temp, ilen);
1712 proglen += ilen;
1713 addrs[0] = proglen;
1714 prog = temp;
1715
1716 for (i = 1; i <= insn_cnt; i++, insn++) {
1717 const s32 imm32 = insn->imm;
1718 u32 dst_reg = insn->dst_reg;
1719 u32 src_reg = insn->src_reg;
1720 u8 b2 = 0, b3 = 0;
1721 u8 *start_of_ldx;
1722 s64 jmp_offset;
1723 s16 insn_off;
1724 u8 jmp_cond;
1725 u8 *func;
1726 int nops;
1727
1728 if (priv_frame_ptr) {
1729 if (src_reg == BPF_REG_FP)
1730 src_reg = X86_REG_R9;
1731
1732 if (dst_reg == BPF_REG_FP)
1733 dst_reg = X86_REG_R9;
1734 }
1735
1736 if (bpf_insn_is_indirect_target(bpf_prog, i - 1))
> 1737 EMIT_ENDBR();
1738
1739 switch (insn->code) {
1740 /* ALU */
1741 case BPF_ALU | BPF_ADD | BPF_X:
1742 case BPF_ALU | BPF_SUB | BPF_X:
1743 case BPF_ALU | BPF_AND | BPF_X:
1744 case BPF_ALU | BPF_OR | BPF_X:
1745 case BPF_ALU | BPF_XOR | BPF_X:
1746 case BPF_ALU64 | BPF_ADD | BPF_X:
1747 case BPF_ALU64 | BPF_SUB | BPF_X:
1748 case BPF_ALU64 | BPF_AND | BPF_X:
1749 case BPF_ALU64 | BPF_OR | BPF_X:
1750 case BPF_ALU64 | BPF_XOR | BPF_X:
1751 maybe_emit_mod(&prog, dst_reg, src_reg,
1752 BPF_CLASS(insn->code) == BPF_ALU64);
1753 b2 = simple_alu_opcodes[BPF_OP(insn->code)];
1754 EMIT2(b2, add_2reg(0xC0, dst_reg, src_reg));
1755 break;
1756
1757 case BPF_ALU64 | BPF_MOV | BPF_X:
1758 if (insn_is_cast_user(insn)) {
1759 if (dst_reg != src_reg)
1760 /* 32-bit mov */
1761 emit_mov_reg(&prog, false, dst_reg, src_reg);
1762 /* shl dst_reg, 32 */
1763 maybe_emit_1mod(&prog, dst_reg, true);
1764 EMIT3(0xC1, add_1reg(0xE0, dst_reg), 32);
1765
1766 /* or dst_reg, user_vm_start */
1767 maybe_emit_1mod(&prog, dst_reg, true);
1768 if (is_axreg(dst_reg))
1769 EMIT1_off32(0x0D, user_vm_start >> 32);
1770 else
1771 EMIT2_off32(0x81, add_1reg(0xC8, dst_reg), user_vm_start >> 32);
1772
1773 /* rol dst_reg, 32 */
1774 maybe_emit_1mod(&prog, dst_reg, true);
1775 EMIT3(0xC1, add_1reg(0xC0, dst_reg), 32);
1776
1777 /* xor r11, r11 */
1778 EMIT3(0x4D, 0x31, 0xDB);
1779
1780 /* test dst_reg32, dst_reg32; check if lower 32-bit are zero */
1781 maybe_emit_mod(&prog, dst_reg, dst_reg, false);
1782 EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg));
1783
1784 /* cmove r11, dst_reg; if so, set dst_reg to zero */
1785 /* WARNING: Intel swapped src/dst register encoding in CMOVcc !!! */
1786 maybe_emit_mod(&prog, AUX_REG, dst_reg, true);
1787 EMIT3(0x0F, 0x44, add_2reg(0xC0, AUX_REG, dst_reg));
1788 break;
1789 } else if (insn_is_mov_percpu_addr(insn)) {
1790 /* mov <dst>, <src> (if necessary) */
1791 EMIT_mov(dst_reg, src_reg);
1792 #ifdef CONFIG_SMP
1793 /* add <dst>, gs:[<off>] */
1794 EMIT2(0x65, add_1mod(0x48, dst_reg));
1795 EMIT3(0x03, add_2reg(0x04, 0, dst_reg), 0x25);
1796 EMIT((u32)(unsigned long)&this_cpu_off, 4);
1797 #endif
1798 break;
1799 }
1800 fallthrough;
1801 case BPF_ALU | BPF_MOV | BPF_X:
1802 if (insn->off == 0)
1803 emit_mov_reg(&prog,
1804 BPF_CLASS(insn->code) == BPF_ALU64,
1805 dst_reg, src_reg);
1806 else
1807 emit_movsx_reg(&prog, insn->off,
1808 BPF_CLASS(insn->code) == BPF_ALU64,
1809 dst_reg, src_reg);
1810 break;
1811
1812 /* neg dst */
1813 case BPF_ALU | BPF_NEG:
1814 case BPF_ALU64 | BPF_NEG:
1815 maybe_emit_1mod(&prog, dst_reg,
1816 BPF_CLASS(insn->code) == BPF_ALU64);
1817 EMIT2(0xF7, add_1reg(0xD8, dst_reg));
1818 break;
1819
1820 case BPF_ALU | BPF_ADD | BPF_K:
1821 case BPF_ALU | BPF_SUB | BPF_K:
1822 case BPF_ALU | BPF_AND | BPF_K:
1823 case BPF_ALU | BPF_OR | BPF_K:
1824 case BPF_ALU | BPF_XOR | BPF_K:
1825 case BPF_ALU64 | BPF_ADD | BPF_K:
1826 case BPF_ALU64 | BPF_SUB | BPF_K:
1827 case BPF_ALU64 | BPF_AND | BPF_K:
1828 case BPF_ALU64 | BPF_OR | BPF_K:
1829 case BPF_ALU64 | BPF_XOR | BPF_K:
1830 maybe_emit_1mod(&prog, dst_reg,
1831 BPF_CLASS(insn->code) == BPF_ALU64);
1832
1833 /*
1834 * b3 holds 'normal' opcode, b2 short form only valid
1835 * in case dst is eax/rax.
1836 */
1837 switch (BPF_OP(insn->code)) {
1838 case BPF_ADD:
1839 b3 = 0xC0;
1840 b2 = 0x05;
1841 break;
1842 case BPF_SUB:
1843 b3 = 0xE8;
1844 b2 = 0x2D;
1845 break;
1846 case BPF_AND:
1847 b3 = 0xE0;
1848 b2 = 0x25;
1849 break;
1850 case BPF_OR:
1851 b3 = 0xC8;
1852 b2 = 0x0D;
1853 break;
1854 case BPF_XOR:
1855 b3 = 0xF0;
1856 b2 = 0x35;
1857 break;
1858 }
1859
1860 if (is_imm8(imm32))
1861 EMIT3(0x83, add_1reg(b3, dst_reg), imm32);
1862 else if (is_axreg(dst_reg))
1863 EMIT1_off32(b2, imm32);
1864 else
1865 EMIT2_off32(0x81, add_1reg(b3, dst_reg), imm32);
1866 break;
1867
1868 case BPF_ALU64 | BPF_MOV | BPF_K:
1869 case BPF_ALU | BPF_MOV | BPF_K:
1870 emit_mov_imm32(&prog, BPF_CLASS(insn->code) == BPF_ALU64,
1871 dst_reg, imm32);
1872 break;
1873
1874 case BPF_LD | BPF_IMM | BPF_DW:
1875 emit_mov_imm64(&prog, dst_reg, insn[1].imm, insn[0].imm);
1876 insn++;
1877 i++;
1878 break;
1879
1880 /* dst %= src, dst /= src, dst %= imm32, dst /= imm32 */
1881 case BPF_ALU | BPF_MOD | BPF_X:
1882 case BPF_ALU | BPF_DIV | BPF_X:
1883 case BPF_ALU | BPF_MOD | BPF_K:
1884 case BPF_ALU | BPF_DIV | BPF_K:
1885 case BPF_ALU64 | BPF_MOD | BPF_X:
1886 case BPF_ALU64 | BPF_DIV | BPF_X:
1887 case BPF_ALU64 | BPF_MOD | BPF_K:
1888 case BPF_ALU64 | BPF_DIV | BPF_K: {
1889 bool is64 = BPF_CLASS(insn->code) == BPF_ALU64;
1890
1891 if (dst_reg != BPF_REG_0)
1892 EMIT1(0x50); /* push rax */
1893 if (dst_reg != BPF_REG_3)
1894 EMIT1(0x52); /* push rdx */
1895
1896 if (BPF_SRC(insn->code) == BPF_X) {
1897 if (src_reg == BPF_REG_0 ||
1898 src_reg == BPF_REG_3) {
1899 /* mov r11, src_reg */
1900 EMIT_mov(AUX_REG, src_reg);
1901 src_reg = AUX_REG;
1902 }
1903 } else {
1904 /* mov r11, imm32 */
1905 EMIT3_off32(0x49, 0xC7, 0xC3, imm32);
1906 src_reg = AUX_REG;
1907 }
1908
1909 if (dst_reg != BPF_REG_0)
1910 /* mov rax, dst_reg */
1911 emit_mov_reg(&prog, is64, BPF_REG_0, dst_reg);
1912
1913 if (insn->off == 0) {
1914 /*
1915 * xor edx, edx
1916 * equivalent to 'xor rdx, rdx', but one byte less
1917 */
1918 EMIT2(0x31, 0xd2);
1919
1920 /* div src_reg */
1921 maybe_emit_1mod(&prog, src_reg, is64);
1922 EMIT2(0xF7, add_1reg(0xF0, src_reg));
1923 } else {
1924 if (BPF_CLASS(insn->code) == BPF_ALU)
1925 EMIT1(0x99); /* cdq */
1926 else
1927 EMIT2(0x48, 0x99); /* cqo */
1928
1929 /* idiv src_reg */
1930 maybe_emit_1mod(&prog, src_reg, is64);
1931 EMIT2(0xF7, add_1reg(0xF8, src_reg));
1932 }
1933
1934 if (BPF_OP(insn->code) == BPF_MOD &&
1935 dst_reg != BPF_REG_3)
1936 /* mov dst_reg, rdx */
1937 emit_mov_reg(&prog, is64, dst_reg, BPF_REG_3);
1938 else if (BPF_OP(insn->code) == BPF_DIV &&
1939 dst_reg != BPF_REG_0)
1940 /* mov dst_reg, rax */
1941 emit_mov_reg(&prog, is64, dst_reg, BPF_REG_0);
1942
1943 if (dst_reg != BPF_REG_3)
1944 EMIT1(0x5A); /* pop rdx */
1945 if (dst_reg != BPF_REG_0)
1946 EMIT1(0x58); /* pop rax */
1947 break;
1948 }
1949
1950 case BPF_ALU | BPF_MUL | BPF_K:
1951 case BPF_ALU64 | BPF_MUL | BPF_K:
1952 maybe_emit_mod(&prog, dst_reg, dst_reg,
1953 BPF_CLASS(insn->code) == BPF_ALU64);
1954
1955 if (is_imm8(imm32))
1956 /* imul dst_reg, dst_reg, imm8 */
1957 EMIT3(0x6B, add_2reg(0xC0, dst_reg, dst_reg),
1958 imm32);
1959 else
1960 /* imul dst_reg, dst_reg, imm32 */
1961 EMIT2_off32(0x69,
1962 add_2reg(0xC0, dst_reg, dst_reg),
1963 imm32);
1964 break;
1965
1966 case BPF_ALU | BPF_MUL | BPF_X:
1967 case BPF_ALU64 | BPF_MUL | BPF_X:
1968 maybe_emit_mod(&prog, src_reg, dst_reg,
1969 BPF_CLASS(insn->code) == BPF_ALU64);
1970
1971 /* imul dst_reg, src_reg */
1972 EMIT3(0x0F, 0xAF, add_2reg(0xC0, src_reg, dst_reg));
1973 break;
1974
1975 /* Shifts */
1976 case BPF_ALU | BPF_LSH | BPF_K:
1977 case BPF_ALU | BPF_RSH | BPF_K:
1978 case BPF_ALU | BPF_ARSH | BPF_K:
1979 case BPF_ALU64 | BPF_LSH | BPF_K:
1980 case BPF_ALU64 | BPF_RSH | BPF_K:
1981 case BPF_ALU64 | BPF_ARSH | BPF_K:
1982 maybe_emit_1mod(&prog, dst_reg,
1983 BPF_CLASS(insn->code) == BPF_ALU64);
1984
1985 b3 = simple_alu_opcodes[BPF_OP(insn->code)];
1986 if (imm32 == 1)
1987 EMIT2(0xD1, add_1reg(b3, dst_reg));
1988 else
1989 EMIT3(0xC1, add_1reg(b3, dst_reg), imm32);
1990 break;
1991
1992 case BPF_ALU | BPF_LSH | BPF_X:
1993 case BPF_ALU | BPF_RSH | BPF_X:
1994 case BPF_ALU | BPF_ARSH | BPF_X:
1995 case BPF_ALU64 | BPF_LSH | BPF_X:
1996 case BPF_ALU64 | BPF_RSH | BPF_X:
1997 case BPF_ALU64 | BPF_ARSH | BPF_X:
1998 /* BMI2 shifts aren't better when shift count is already in rcx */
1999 if (boot_cpu_has(X86_FEATURE_BMI2) && src_reg != BPF_REG_4) {
2000 /* shrx/sarx/shlx dst_reg, dst_reg, src_reg */
2001 bool w = (BPF_CLASS(insn->code) == BPF_ALU64);
2002 u8 op;
2003
2004 switch (BPF_OP(insn->code)) {
2005 case BPF_LSH:
2006 op = 1; /* prefix 0x66 */
2007 break;
2008 case BPF_RSH:
2009 op = 3; /* prefix 0xf2 */
2010 break;
2011 case BPF_ARSH:
2012 op = 2; /* prefix 0xf3 */
2013 break;
2014 }
2015
2016 emit_shiftx(&prog, dst_reg, src_reg, w, op);
2017
2018 break;
2019 }
2020
2021 if (src_reg != BPF_REG_4) { /* common case */
2022 /* Check for bad case when dst_reg == rcx */
2023 if (dst_reg == BPF_REG_4) {
2024 /* mov r11, dst_reg */
2025 EMIT_mov(AUX_REG, dst_reg);
2026 dst_reg = AUX_REG;
2027 } else {
2028 EMIT1(0x51); /* push rcx */
2029 }
2030 /* mov rcx, src_reg */
2031 EMIT_mov(BPF_REG_4, src_reg);
2032 }
2033
2034 /* shl %rax, %cl | shr %rax, %cl | sar %rax, %cl */
2035 maybe_emit_1mod(&prog, dst_reg,
2036 BPF_CLASS(insn->code) == BPF_ALU64);
2037
2038 b3 = simple_alu_opcodes[BPF_OP(insn->code)];
2039 EMIT2(0xD3, add_1reg(b3, dst_reg));
2040
2041 if (src_reg != BPF_REG_4) {
2042 if (insn->dst_reg == BPF_REG_4)
2043 /* mov dst_reg, r11 */
2044 EMIT_mov(insn->dst_reg, AUX_REG);
2045 else
2046 EMIT1(0x59); /* pop rcx */
2047 }
2048
2049 break;
2050
2051 case BPF_ALU | BPF_END | BPF_FROM_BE:
2052 case BPF_ALU64 | BPF_END | BPF_FROM_LE:
2053 switch (imm32) {
2054 case 16:
2055 /* Emit 'ror %ax, 8' to swap lower 2 bytes */
2056 EMIT1(0x66);
2057 if (is_ereg(dst_reg))
2058 EMIT1(0x41);
2059 EMIT3(0xC1, add_1reg(0xC8, dst_reg), 8);
2060
2061 /* Emit 'movzwl eax, ax' */
2062 if (is_ereg(dst_reg))
2063 EMIT3(0x45, 0x0F, 0xB7);
2064 else
2065 EMIT2(0x0F, 0xB7);
2066 EMIT1(add_2reg(0xC0, dst_reg, dst_reg));
2067 break;
2068 case 32:
2069 /* Emit 'bswap eax' to swap lower 4 bytes */
2070 if (is_ereg(dst_reg))
2071 EMIT2(0x41, 0x0F);
2072 else
2073 EMIT1(0x0F);
2074 EMIT1(add_1reg(0xC8, dst_reg));
2075 break;
2076 case 64:
2077 /* Emit 'bswap rax' to swap 8 bytes */
2078 EMIT3(add_1mod(0x48, dst_reg), 0x0F,
2079 add_1reg(0xC8, dst_reg));
2080 break;
2081 }
2082 break;
2083
2084 case BPF_ALU | BPF_END | BPF_FROM_LE:
2085 switch (imm32) {
2086 case 16:
2087 /*
2088 * Emit 'movzwl eax, ax' to zero extend 16-bit
2089 * into 64 bit
2090 */
2091 if (is_ereg(dst_reg))
2092 EMIT3(0x45, 0x0F, 0xB7);
2093 else
2094 EMIT2(0x0F, 0xB7);
2095 EMIT1(add_2reg(0xC0, dst_reg, dst_reg));
2096 break;
2097 case 32:
2098 /* Emit 'mov eax, eax' to clear upper 32-bits */
2099 if (is_ereg(dst_reg))
2100 EMIT1(0x45);
2101 EMIT2(0x89, add_2reg(0xC0, dst_reg, dst_reg));
2102 break;
2103 case 64:
2104 /* nop */
2105 break;
2106 }
2107 break;
2108
2109 /* speculation barrier */
2110 case BPF_ST | BPF_NOSPEC:
2111 EMIT_LFENCE();
2112 break;
2113
2114 /* ST: *(u8*)(dst_reg + off) = imm */
2115 case BPF_ST | BPF_MEM | BPF_B:
2116 if (is_ereg(dst_reg))
2117 EMIT2(0x41, 0xC6);
2118 else
2119 EMIT1(0xC6);
2120 goto st;
2121 case BPF_ST | BPF_MEM | BPF_H:
2122 if (is_ereg(dst_reg))
2123 EMIT3(0x66, 0x41, 0xC7);
2124 else
2125 EMIT2(0x66, 0xC7);
2126 goto st;
2127 case BPF_ST | BPF_MEM | BPF_W:
2128 if (is_ereg(dst_reg))
2129 EMIT2(0x41, 0xC7);
2130 else
2131 EMIT1(0xC7);
2132 goto st;
2133 case BPF_ST | BPF_MEM | BPF_DW:
2134 EMIT2(add_1mod(0x48, dst_reg), 0xC7);
2135
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
2026-01-14 9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
2026-01-14 11:00 ` Anton Protopopov
@ 2026-01-14 20:46 ` Eduard Zingerman
2026-01-15 7:47 ` Xu Kuohai
1 sibling, 1 reply; 15+ messages in thread
From: Eduard Zingerman @ 2026-01-14 20:46 UTC (permalink / raw)
To: Xu Kuohai, bpf, linux-kernel, linux-arm-kernel
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Yonghong Song, Puranjay Mohan, Anton Protopopov
On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
>
> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
> instruction is an indirect jump target. This helper will be used by
> follow-up patches to decide where to emit indirect landing pad instructions.
>
> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
> indirect jump targets. The BPF verifier sets this flag, and the helper
> checks it to determine whether an instruction is an indirect jump target.
>
> Since bpf_insn_aux_data is only available before JIT stage, add a new
> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
> array, making it accessible to the JIT.
>
> For programs with multiple subprogs, each subprog uses its own private
> copy of insn_aux_data, since subprogs may insert additional instructions
> during JIT and need to update the array. For non-subprog, the verifier's
> insn_aux_data array is used directly to avoid unnecessary copying.
>
> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> ---
Hm, I've missed the fact insn_aux_data is not currently available to jit.
Is it really necessary to copy this array for each subprogram?
Given that we still want to free insn_aux_data after program load,
I'd expect that it should be possible just to pass a pointer with an
offset pointing to a start of specific subprogram. Wdyt?
[...]
* Re: [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump
2026-01-14 10:29 ` Anton Protopopov
@ 2026-01-15 7:31 ` Xu Kuohai
0 siblings, 0 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-15 7:31 UTC (permalink / raw)
To: Anton Protopopov
Cc: bpf, linux-kernel, linux-arm-kernel, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, Puranjay Mohan
On 1/14/2026 6:29 PM, Anton Protopopov wrote:
> On 26/01/14 05:39PM, Xu Kuohai wrote:
>> From: Xu Kuohai <xukuohai@huawei.com>
>>
>> Fix an off-by-one error in check_indirect_jump() that skips the last
>> element returned by copy_insn_array_uniq().
>>
>> Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>> ---
>> kernel/bpf/verifier.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index faa1ecc1fe9d..22605d9e0ffa 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -20336,7 +20336,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
>> return -EINVAL;
>> }
>>
>> - for (i = 0; i < n - 1; i++) {
>> + for (i = 0; i < n; i++) {
>> other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
>> env->insn_idx, env->cur_state->speculative);
>> if (IS_ERR(other_branch))
>> --
>> 2.47.3
>
> Nack, the last state doesn't require a push_stack() call, it is
> verified directly under this loop. Instead of this patch, just
> add another call to mark_indirect_target().
Ok, I see. Thanks for the explanation.
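The pattern Anton describes can be sketched as follows (illustrative names only, not the kernel API): for an indirect jump with n possible targets, the first n-1 states are pushed for later exploration while the current state simply continues at the last target, which is why the loop bound of n-1 is intentional rather than an off-by-one.

```c
#include <assert.h>

#define MAX_TARGETS 8

/* toy stand-in for the verifier's exploration state */
struct sketch_env {
	int pushed[MAX_TARGETS];	/* branches queued for later */
	int npushed;
	int cur_insn;			/* where verification continues */
};

static void push_state(struct sketch_env *env, int target)
{
	env->pushed[env->npushed++] = target;
}

/* returns the instruction index the current state continues at */
int explore_indirect_jump(struct sketch_env *env, const int *targets, int n)
{
	int i;

	/* n-1 targets become pushed states ... */
	for (i = 0; i < n - 1; i++)
		push_state(env, targets[i]);
	/* ... and the last one is verified directly, no push needed */
	env->cur_insn = targets[n - 1];
	return env->cur_insn;
}
```

Per Anton's follow-up, the real fix is then to mark the last target as a landing-pad site as well, not to push an extra state for it.
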
* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
2026-01-14 11:00 ` Anton Protopopov
@ 2026-01-15 7:37 ` Xu Kuohai
0 siblings, 0 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-15 7:37 UTC (permalink / raw)
To: Anton Protopopov
Cc: bpf, linux-kernel, linux-arm-kernel, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, Puranjay Mohan
On 1/14/2026 7:00 PM, Anton Protopopov wrote:
[...]
>> +
>> +bool bpf_insn_is_indirect_target(const struct bpf_prog *prog, int idx)
>> +{
>> + return prog->aux->insn_aux && prog->aux->insn_aux[idx].indirect_target;
>
> Is there a case when insn_aux is NULL?
>
It is NULL when there are no indirect jump targets in the bpf prog; see the
has_indirect_target test in clone_insn_aux_data.
>> +}
>> #endif /* CONFIG_BPF_JIT */
>>
>> /* Base function for offset calculation. Needs to go into .text section,
>> @@ -2540,24 +2579,24 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>> if (!bpf_prog_is_offloaded(fp->aux)) {
>> *err = bpf_prog_alloc_jited_linfo(fp);
>> if (*err)
>> - return fp;
>> + goto free_insn_aux;
>>
>> fp = bpf_int_jit_compile(fp);
>> bpf_prog_jit_attempt_done(fp);
>> if (!fp->jited && jit_needed) {
>> *err = -ENOTSUPP;
>> - return fp;
>> + goto free_insn_aux;
>> }
>> } else {
>> *err = bpf_prog_offload_compile(fp);
>> if (*err)
>> - return fp;
>> + goto free_insn_aux;
>> }
>>
>> finalize:
>> *err = bpf_prog_lock_ro(fp);
>> if (*err)
>> - return fp;
>> + goto free_insn_aux;
>>
>> /* The tail call compatibility check can only be done at
>> * this late stage as we need to determine, if we deal
>> @@ -2566,6 +2605,10 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>> */
>> *err = bpf_check_tail_call(fp);
>>
>> +free_insn_aux:
>> + vfree(fp->aux->insn_aux);
>> + fp->aux->insn_aux = NULL;
>> +
>> return fp;
>> }
>> EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 22605d9e0ffa..f2fe6baeceb9 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -3852,6 +3852,11 @@ static bool is_jmp_point(struct bpf_verifier_env *env, int insn_idx)
>> return env->insn_aux_data[insn_idx].jmp_point;
>> }
>>
>> +static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
>> +{
>> + env->insn_aux_data[idx].indirect_target = true;
>> +}
>> +
>> #define LR_FRAMENO_BITS 3
>> #define LR_SPI_BITS 6
>> #define LR_ENTRY_BITS (LR_SPI_BITS + LR_FRAMENO_BITS + 1)
>> @@ -20337,6 +20342,7 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
>> }
>>
>> for (i = 0; i < n; i++) {
>
> ^ n -> n-1
>
ACK
>> + mark_indirect_target(env, env->gotox_tmp_buf->items[i]);
>> other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
>> env->insn_idx, env->cur_state->speculative);
>> if (IS_ERR(other_branch))
>> @@ -21243,6 +21249,37 @@ static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env)
>> }
>
> mark_indirect_target(n-1)
>
>> }
>>
>> +static int clone_insn_aux_data(struct bpf_prog *prog, struct bpf_verifier_env *env, u32 off)
>> +{
>> + u32 i;
>> + size_t size;
>> + bool has_indirect_target = false;
>> + struct bpf_insn_aux_data *insn_aux;
>> +
>> + for (i = 0; i < prog->len; i++) {
>> + if (env->insn_aux_data[off + i].indirect_target) {
>> + has_indirect_target = true;
>> + break;
>> + }
>> + }
>> +
>> + /* insn_aux is copied into bpf_prog so the JIT can check whether an instruction is an
>> + * indirect jump target. If no indirect jump targets exist, copying is unnecessary.
>> + */
>> + if (!has_indirect_target)
>> + return 0;
>> +
>> + size = array_size(sizeof(struct bpf_insn_aux_data), prog->len);
>> + insn_aux = vzalloc(size);
>> + if (!insn_aux)
>> + return -ENOMEM;
>> +
>> + memcpy(insn_aux, env->insn_aux_data + off, size);
>> + prog->aux->insn_aux = insn_aux;
>> +
>> + return 0;
>> +}
>> +
>> /* single env->prog->insni[off] instruction was replaced with the range
>> * insni[off, off + cnt). Adjust corresponding insn_aux_data by copying
>> * [0, off) and [off, end) to new locations, so the patched range stays zero
>> @@ -22239,6 +22276,10 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>> if (!i)
>> func[i]->aux->exception_boundary = env->seen_exception;
>>
>> + err = clone_insn_aux_data(func[i], env, subprog_start);
>> + if (err < 0)
>> + goto out_free;
>> +
>> /*
>> * To properly pass the absolute subprog start to jit
>> * all instruction adjustments should be accumulated
>> @@ -22306,6 +22347,8 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>> for (i = 0; i < env->subprog_cnt; i++) {
>> func[i]->aux->used_maps = NULL;
>> func[i]->aux->used_map_cnt = 0;
>> + vfree(func[i]->aux->insn_aux);
>> + func[i]->aux->insn_aux = NULL;
>> }
>>
>> /* finally lock prog and jit images for all functions and
>> @@ -22367,6 +22410,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>> for (i = 0; i < env->subprog_cnt; i++) {
>> if (!func[i])
>> continue;
>> + vfree(func[i]->aux->insn_aux);
>> func[i]->aux->poke_tab = NULL;
>> bpf_jit_free(func[i]);
>> }
>> @@ -25350,6 +25394,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>> env->verification_time = ktime_get_ns() - start_time;
>> print_verification_stats(env);
>> env->prog->aux->verified_insns = env->insn_processed;
>> + env->prog->aux->insn_aux = env->insn_aux_data;
>>
>> /* preserve original error even if log finalization is successful */
>> err = bpf_vlog_finalize(&env->log, &log_true_size);
>> @@ -25428,7 +25473,11 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>> if (!is_priv)
>> mutex_unlock(&bpf_verifier_lock);
>> clear_insn_aux_data(env, 0, env->prog->len);
>> - vfree(env->insn_aux_data);
>> + /* on success, insn_aux_data will be freed by bpf_prog_select_runtime */
>> + if (ret) {
>> + vfree(env->insn_aux_data);
>> + env->prog->aux->insn_aux = NULL;
>> + }
>> err_free_env:
>> bpf_stack_liveness_free(env);
>> kvfree(env->cfg.insn_postorder);
>> --
>> 2.47.3
>>
>
> LGTM, just in case, could you please tell how you have tested
> this patchset exactly?
I ran test_progs-cpuv4 on machines supporting x86 CET/IBT and arm64 BTI. I tested in three
environments: an arm64 physical machine with BTI support (CPU: Hisilicon KP920B), an arm64
QEMU VM using cpu=max for BTI support, and a Bochs VM with model=arrow_lake for x86 CET/IBT
support.
* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
2026-01-14 20:46 ` Eduard Zingerman
@ 2026-01-15 7:47 ` Xu Kuohai
2026-01-18 17:20 ` Alexei Starovoitov
0 siblings, 1 reply; 15+ messages in thread
From: Xu Kuohai @ 2026-01-15 7:47 UTC (permalink / raw)
To: Eduard Zingerman, bpf, linux-kernel, linux-arm-kernel
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Yonghong Song, Puranjay Mohan, Anton Protopopov
On 1/15/2026 4:46 AM, Eduard Zingerman wrote:
> On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
>> From: Xu Kuohai <xukuohai@huawei.com>
>>
>> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
>> instruction is an indirect jump target. This helper will be used by
>> follow-up patches to decide where to emit indirect landing pad instructions.
>>
>> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
>> indirect jump targets. The BPF verifier sets this flag, and the helper
>> checks it to determine whether an instruction is an indirect jump target.
>>
>> Since bpf_insn_aux_data is only available before JIT stage, add a new
>> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
>> array, making it accessible to the JIT.
>>
>> For programs with multiple subprogs, each subprog uses its own private
>> copy of insn_aux_data, since subprogs may insert additional instructions
>> during JIT and need to update the array. For non-subprog, the verifier's
>> insn_aux_data array is used directly to avoid unnecessary copying.
>>
>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>> ---
>
> Hm, I've missed the fact insn_aux_data is not currently available to jit.
> Is it really necessary to copy this array for each subprogram?
> Given that we still want to free insn_aux_data after program load,
> I'd expect that it should be possible just to pass a pointer with an
> offset pointing to a start of specific subprogram. Wdyt?
>
I think it requires an additional field in struct bpf_prog to record the length
of the global insn_aux_data array. If a subprog inserts new instructions during
JIT (e.g., due to constant blinding), all entries in the array, including those
of the subsequent subprogs, would need to be adjusted. With per-subprog copying,
only the local insn_aux_data needs to be updated, which reduces the adjustment work.
However, if you prefer a global array, I’m happy to switch to it.
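The pointer-with-offset scheme Eduard suggests can be sketched as below. This is a hypothetical illustration, not kernel code: each subprog keeps a view into the verifier's single insn_aux_data array at its own start offset, instead of owning a private copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* minimal stand-in for struct bpf_insn_aux_data */
struct insn_aux {
	bool indirect_target;
};

/* a subprog's window into the one shared array */
struct subprog_view {
	const struct insn_aux *aux;	/* points at the shared array */
	size_t start;			/* subprog's first insn index */
	size_t len;			/* subprog length in insns */
};

/* lookup with a subprog-relative index; bounds-checked */
bool view_is_indirect_target(const struct subprog_view *v, size_t idx)
{
	return idx < v->len && v->aux[v->start + idx].indirect_target;
}
```

The trade-off Xu raises still applies: if instruction patching during JIT shifts offsets, every later subprog's `start` (or the shared array itself) must be adjusted, whereas per-subprog copies localize that bookkeeping.
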
> [...]
* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
2026-01-15 7:47 ` Xu Kuohai
@ 2026-01-18 17:20 ` Alexei Starovoitov
2026-01-18 23:22 ` Kumar Kartikeya Dwivedi
2026-01-19 2:35 ` Xu Kuohai
0 siblings, 2 replies; 15+ messages in thread
From: Alexei Starovoitov @ 2026-01-18 17:20 UTC (permalink / raw)
To: Xu Kuohai, Kumar Kartikeya Dwivedi
Cc: Eduard Zingerman, bpf, LKML, linux-arm-kernel, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Yonghong Song,
Puranjay Mohan, Anton Protopopov
On Wed, Jan 14, 2026 at 11:47 PM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>
> On 1/15/2026 4:46 AM, Eduard Zingerman wrote:
> > On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
> >> From: Xu Kuohai <xukuohai@huawei.com>
> >>
> >> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
> >> instruction is an indirect jump target. This helper will be used by
> >> follow-up patches to decide where to emit indirect landing pad instructions.
> >>
> >> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
> >> indirect jump targets. The BPF verifier sets this flag, and the helper
> >> checks it to determine whether an instruction is an indirect jump target.
> >>
> >> Since bpf_insn_aux_data is only available before JIT stage, add a new
> >> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
> >> array, making it accessible to the JIT.
> >>
> >> For programs with multiple subprogs, each subprog uses its own private
> >> copy of insn_aux_data, since subprogs may insert additional instructions
> >> during JIT and need to update the array. For non-subprog, the verifier's
> >> insn_aux_data array is used directly to avoid unnecessary copying.
> >>
> >> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> >> ---
> >
> > Hm, I've missed the fact insn_aux_data is not currently available to jit.
> > Is it really necessary to copy this array for each subprogram?
> > Given that we still want to free insn_aux_data after program load,
> > I'd expect that it should be possible just to pass a pointer with an
> > offset pointing to a start of specific subprogram. Wdyt?
> >
>
> I think it requires an additional field in struct bpf_prog to record the length
> of the global insn_aux_data array. If a subprog inserts new instructions during
> JIT (e.g., due to constant blinding), all entries in the array, including those
> of the subsequent subprogs, would need to be adjusted. With per-subprog copying,
> only the local insn_aux_data needs to be updated, reducing the amount of copying.
>
> However, if you prefer a global array, I’m happy to switch to it.
iirc we struggled with lack of env/insn_aux in JIT earlier.
func[i]->aux->used_maps = env->used_maps;
is one such example.
Let's move bpf_prog_select_runtime() into bpf_check() and
consistently pass 'env' into bpf_int_jit_compile() while
env is still valid.
Close to jit_subprogs().
Or remove bpf_prog_select_runtime() and make jit_subprogs()
do the whole thing. tbd.
This way we can remove used_maps workaround and don't need to do
this insn_aux copy.
Errors during JIT can be printed into the verifier log too.
Kumar,
what do you think about it from modularization pov ?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
2026-01-18 17:20 ` Alexei Starovoitov
@ 2026-01-18 23:22 ` Kumar Kartikeya Dwivedi
2026-01-19 2:35 ` Xu Kuohai
1 sibling, 0 replies; 15+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-01-18 23:22 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Xu Kuohai, Eduard Zingerman, bpf, LKML, linux-arm-kernel,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Yonghong Song, Puranjay Mohan, Anton Protopopov
On Sun, 18 Jan 2026 at 18:20, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jan 14, 2026 at 11:47 PM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
> >
> > On 1/15/2026 4:46 AM, Eduard Zingerman wrote:
> > > On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
> > >> From: Xu Kuohai <xukuohai@huawei.com>
> > >>
> > >> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
> > >> instruction is an indirect jump target. This helper will be used by
> > >> follow-up patches to decide where to emit indirect landing pad instructions.
> > >>
> > >> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
> > >> indirect jump targets. The BPF verifier sets this flag, and the helper
> > >> checks it to determine whether an instruction is an indirect jump target.
> > >>
> > >> Since bpf_insn_aux_data is only available before the JIT stage, add a new
> > >> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
> > >> array, making it accessible to the JIT.
> > >>
> > >> For programs with multiple subprogs, each subprog uses its own private
> > >> copy of insn_aux_data, since subprogs may insert additional instructions
> > >> during JIT and need to update the array. For programs without subprogs, the
> > >> verifier's insn_aux_data array is used directly to avoid unnecessary copying.
> > >>
> > >> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
> > >> ---
> > >
> > > Hm, I've missed the fact insn_aux_data is not currently available to jit.
> > > Is it really necessary to copy this array for each subprogram?
> > > Given that we still want to free insn_aux_data after program load,
> > > I'd expect that it should be possible just to pass a pointer with an
> > > offset pointing to a start of specific subprogram. Wdyt?
> > >
> >
> > I think it requires an additional field in struct bpf_prog to record the length
> > of the global insn_aux_data array. If a subprog inserts new instructions during
> > JIT (e.g., due to constant blinding), all entries in the array, including those
> > of the subsequent subprogs, would need to be adjusted. With per-subprog copying,
> > only the local insn_aux_data needs to be updated, reducing the amount of copying.
> >
> > However, if you prefer a global array, I’m happy to switch to it.
>
> iirc we struggled with lack of env/insn_aux in JIT earlier.
>
> func[i]->aux->used_maps = env->used_maps;
> is one such example.
>
> Let's move bpf_prog_select_runtime() into bpf_check() and
> consistently pass 'env' into bpf_int_jit_compile() while
> env is still valid.
> Close to jit_subprogs().
> Or remove bpf_prog_select_runtime() and make jit_subprogs()
> do the whole thing. tbd.
>
> This way we can remove used_maps workaround and don't need to do
> this insn_aux copy.
> Errors during JIT can be printed into the verifier log too.
>
> Kumar,
> what do you think about it from modularization pov ?
Makes sense to do it, I don't think it would cause any problems for
modularization.
* Re: [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets
2026-01-18 17:20 ` Alexei Starovoitov
2026-01-18 23:22 ` Kumar Kartikeya Dwivedi
@ 2026-01-19 2:35 ` Xu Kuohai
1 sibling, 0 replies; 15+ messages in thread
From: Xu Kuohai @ 2026-01-19 2:35 UTC (permalink / raw)
To: Alexei Starovoitov, Kumar Kartikeya Dwivedi
Cc: Eduard Zingerman, bpf, LKML, linux-arm-kernel, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Yonghong Song,
Puranjay Mohan, Anton Protopopov
On 1/19/2026 1:20 AM, Alexei Starovoitov wrote:
> On Wed, Jan 14, 2026 at 11:47 PM Xu Kuohai <xukuohai@huaweicloud.com> wrote:
>>
>> On 1/15/2026 4:46 AM, Eduard Zingerman wrote:
>>> On Wed, 2026-01-14 at 17:39 +0800, Xu Kuohai wrote:
>>>> From: Xu Kuohai <xukuohai@huawei.com>
>>>>
>>>> Introduce helper bpf_insn_is_indirect_target to determine whether a BPF
>>>> instruction is an indirect jump target. This helper will be used by
>>>> follow-up patches to decide where to emit indirect landing pad instructions.
>>>>
>>>> Add a new flag to struct bpf_insn_aux_data to mark instructions that are
>>>> indirect jump targets. The BPF verifier sets this flag, and the helper
>>>> checks it to determine whether an instruction is an indirect jump target.
>>>>
>>>> Since bpf_insn_aux_data is only available before the JIT stage, add a new
>>>> field to struct bpf_prog_aux to store a pointer to the bpf_insn_aux_data
>>>> array, making it accessible to the JIT.
>>>>
>>>> For programs with multiple subprogs, each subprog uses its own private
>>>> copy of insn_aux_data, since subprogs may insert additional instructions
>>>> during JIT and need to update the array. For programs without subprogs, the
>>>> verifier's insn_aux_data array is used directly to avoid unnecessary copying.
>>>>
>>>> Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
>>>> ---
>>>
>>> Hm, I've missed the fact insn_aux_data is not currently available to jit.
>>> Is it really necessary to copy this array for each subprogram?
>>> Given that we still want to free insn_aux_data after program load,
>>> I'd expect that it should be possible just to pass a pointer with an
>>> offset pointing to a start of specific subprogram. Wdyt?
>>>
>>
>> I think it requires an additional field in struct bpf_prog to record the length
>> of the global insn_aux_data array. If a subprog inserts new instructions during
>> JIT (e.g., due to constant blinding), all entries in the array, including those
>> of the subsequent subprogs, would need to be adjusted. With per-subprog copying,
>> only the local insn_aux_data needs to be updated, reducing the amount of copying.
>>
>> However, if you prefer a global array, I’m happy to switch to it.
>
> iirc we struggled with lack of env/insn_aux in JIT earlier.
>
> func[i]->aux->used_maps = env->used_maps;
> is one such example.
>
> Let's move bpf_prog_select_runtime() into bpf_check() and
> consistently pass 'env' into bpf_int_jit_compile() while
> env is still valid.
> Close to jit_subprogs().
> Or remove bpf_prog_select_runtime() and make jit_subprogs()
> do the whole thing. tbd.
>
> This way we can remove used_maps workaround and don't need to do
> this insn_aux copy.
> Errors during JIT can be printed into the verifier log too.
>
Sounds great. Using jit_subprogs for the whole thing seems cleaner. I'll
try this approach first.
> Kumar,
> what do you think about it from modularization pov ?
end of thread, other threads: [~2026-01-19 2:35 UTC | newest]
Thread overview: 15+ messages
2026-01-14 9:39 [PATCH bpf-next v4 0/4] emit ENDBR/BTI instructions for indirect jump targets Xu Kuohai
2026-01-14 9:39 ` [PATCH bpf-next v4 1/4] bpf: Fix an off-by-one error in check_indirect_jump Xu Kuohai
2026-01-14 10:29 ` Anton Protopopov
2026-01-15 7:31 ` Xu Kuohai
2026-01-14 9:39 ` [PATCH bpf-next v4 2/4] bpf: Add helper to detect indirect jump targets Xu Kuohai
2026-01-14 11:00 ` Anton Protopopov
2026-01-15 7:37 ` Xu Kuohai
2026-01-14 20:46 ` Eduard Zingerman
2026-01-15 7:47 ` Xu Kuohai
2026-01-18 17:20 ` Alexei Starovoitov
2026-01-18 23:22 ` Kumar Kartikeya Dwivedi
2026-01-19 2:35 ` Xu Kuohai
2026-01-14 9:39 ` [PATCH bpf-next v4 3/4] bpf, x86: Emit ENDBR for " Xu Kuohai
2026-01-14 16:46 ` kernel test robot
2026-01-14 9:39 ` [PATCH bpf-next v4 4/4] bpf, arm64: Emit BTI for indirect jump target Xu Kuohai