Linux-ARM-Kernel Archive on lore.kernel.org

* [PATCH bpf v15 1/5] bpf: Move constants blinding out of arch-specific JITs
From: Xu Kuohai @ 2026-04-16  6:43 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Yonghong Song, Puranjay Mohan, Anton Protopopov,
	Alexis Lothoré, Shahab Vahedi, Russell King, Tiezhu Yang,
	Hengqi Chen, Johan Almbladh, Paul Burton, Hari Bathini,
	Christophe Leroy, Naveen N Rao, Luke Nelson, Xi Wang,
	Björn Töpel, Pu Lehui, Ilya Leoshkevich, Heiko Carstens,
	Vasily Gorbik, David S . Miller, Wang YanQing
In-Reply-To: <20260416064341.151802-1-xukuohai@huaweicloud.com>

From: Xu Kuohai <xukuohai@huawei.com>

During the JIT stage, constants blinding rewrites instructions but only
rewrites the private instruction copy of the JITed subprog, leaving the
global env->prog->insnsi and env->insn_aux_data untouched. This causes a
mismatch between subprog instructions and the global state, making it
difficult to use the global data in the JIT.
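
As a rough illustration of why blinding shifts instruction indices, the
sketch below expands one "move immediate" into three instructions so
that the raw immediate never appears in the output. The toy_* names, the
instruction layout and the scratch register number are assumptions made
for this example; they are not the kernel's struct bpf_insn or its
blinding helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified instruction; not the kernel's struct bpf_insn. */
enum toy_op { TOY_MOV_IMM, TOY_XOR_IMM, TOY_MOV_REG };
enum { TOY_REG_AX = 11 };	/* scratch register number, assumed here */

struct toy_insn {
	enum toy_op op;
	int dst;
	int src;
	uint32_t imm;
};

/*
 * Rewrite "dst = imm" as:
 *   AX  = imm ^ rnd
 *   AX ^= rnd
 *   dst = AX
 * Returns the number of instructions written to out[], which is what
 * makes every later instruction index move by two.
 */
static size_t toy_blind_mov_imm(const struct toy_insn *in, uint32_t rnd,
				struct toy_insn out[3])
{
	out[0] = (struct toy_insn){ .op = TOY_MOV_IMM, .dst = TOY_REG_AX,
				    .imm = in->imm ^ rnd };
	out[1] = (struct toy_insn){ .op = TOY_XOR_IMM, .dst = TOY_REG_AX,
				    .imm = rnd };
	out[2] = (struct toy_insn){ .op = TOY_MOV_REG, .dst = in->dst,
				    .src = TOY_REG_AX };
	return 3;
}
```

Any per-instruction array that is indexed by the original instruction
numbers no longer lines up with the program once such an expansion has
been applied to a private copy only.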

To avoid this mismatch, and given that all arch-specific JITs already
support constants blinding, move it to the generic verifier code and
rewrite the global env->prog->insnsi instead, adjusting the global
state accordingly, as other rewrites in the verifier do.

This removes the constants blinding calls in each JIT, which are largely
duplicated code across architectures.

Since constants blinding is only required for JIT, and there are two
JIT entry functions, jit_subprogs() for BPF programs with multiple
subprogs and bpf_prog_select_runtime() for programs with no subprogs,
move the constants blinding invocation into these two functions.

In the verifier path, bpf_patch_insn_data() is used to keep global
verifier auxiliary data in sync with patched instructions. A key
question is whether this global auxiliary data should be restored
on the failure path.

Besides instructions, bpf_patch_insn_data() adjusts:
  - prog->aux->poke_tab
  - env->insn_array_maps
  - env->subprog_info
  - env->insn_aux_data

For prog->aux->poke_tab, it is only used by JIT or only meaningful after
JIT succeeds, so it does not need to be restored on the failure path.

For env->insn_array_maps, when JIT fails, programs using insn arrays
are rejected by bpf_insn_array_ready() due to missing JIT addresses.
Hence, env->insn_array_maps is only meaningful for JIT and does not need
to be restored.

For env->subprog_info, if jit_subprogs() fails and
CONFIG_BPF_JIT_ALWAYS_ON is not enabled, the kernel falls back to the
interpreter. In this case, env->subprog_info is used to determine
subprogram stack depth, so it must be restored on failure.
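
A minimal sketch of the kind of adjustment that makes a restore
necessary, assuming a simplified subprogram table: when a patch at
instruction index off grows the program by delta instructions, every
subprogram that starts after off moves by delta. The toy_* names and
fields are hypothetical and the kernel's own helper differs in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry of env->subprog_info. */
struct toy_subprog {
	unsigned int start;		/* index of the first instruction */
	unsigned int stack_depth;	/* untouched by the shift */
};

/*
 * Shift the start of every subprogram located after the patched
 * instruction. Subprograms starting at or before off keep their start.
 */
static void toy_shift_subprog_starts(struct toy_subprog *sp, size_t cnt,
				     unsigned int off, unsigned int delta)
{
	for (size_t i = 0; i < cnt; i++) {
		if (sp[i].start > off)
			sp[i].start += delta;
	}
}
```

If the JIT then fails, the shifted starts describe the blinded program
rather than the original one that the interpreter would run, which is
why the original values have to be put back.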

For env->insn_aux_data, it is freed by clear_insn_aux_data() at the
end of bpf_check(). Before freeing, clear_insn_aux_data() loops over
env->insn_aux_data to release jump targets recorded in it. The loop
uses env->prog->len as the array length, but this length no longer
matches the actual size of the adjusted env->insn_aux_data array after
constants blinding.

To address this, a simple approach would be to keep insn_aux_data as
adjusted after failure, since it will be freed shortly, and record its
actual size for the loop in clear_insn_aux_data(). But since
clear_insn_aux_data() uses the same index to loop over both
env->prog->insnsi and env->insn_aux_data, that approach would yield an
incorrect index into the insnsi array. So an alternative approach is
adopted: clone the original env->insn_aux_data before blinding and
restore it after failure, similar to env->prog.
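
The clone-and-restore idea can be sketched in plain C as follows. This
is an illustration under stated assumptions, not the code in this patch:
struct toy_aux, struct toy_env and the toy_* helpers are hypothetical,
and userspace malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for per-instruction verifier data. */
struct toy_aux {
	int seen;
};

struct toy_env {
	struct toy_aux *aux;	/* one entry per instruction */
	size_t len;		/* number of instructions */
};

/* Snapshot the aux array; returns NULL on allocation failure. */
static struct toy_aux *toy_aux_clone(const struct toy_env *env)
{
	struct toy_aux *copy = malloc(env->len * sizeof(*copy));

	if (copy)
		memcpy(copy, env->aux, env->len * sizeof(*copy));
	return copy;
}

/*
 * Mimic a patch that inserts delta zeroed entries after index off
 * (off must be < env->len). Returns 0 on success, -1 on failure.
 */
static int toy_patch(struct toy_env *env, size_t off, size_t delta)
{
	struct toy_aux *grown = calloc(env->len + delta, sizeof(*grown));

	if (!grown)
		return -1;
	memcpy(grown, env->aux, (off + 1) * sizeof(*grown));
	memcpy(grown + off + 1 + delta, env->aux + off + 1,
	       (env->len - off - 1) * sizeof(*grown));
	free(env->aux);
	env->aux = grown;
	env->len += delta;
	return 0;
}

/* Failure path: drop the adjusted array and put the snapshot back. */
static void toy_aux_restore(struct toy_env *env, struct toy_aux *saved,
			    size_t saved_len)
{
	free(env->aux);
	env->aux = saved;
	env->len = saved_len;
}
```

After the restore, the array and its length agree again, so a cleanup
loop that walks both arrays with a single index stays within bounds.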

For classic BPF programs, constants blinding works as before since it
is still invoked from bpf_prog_select_runtime().

Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> # powerpc jit
Reviewed-by: Pu Lehui <pulehui@huawei.com> # riscv jit
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # loongarch jit
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 arch/arc/net/bpf_jit_core.c      |  39 +++------
 arch/arm/net/bpf_jit_32.c        |  41 ++-------
 arch/arm64/net/bpf_jit_comp.c    |  72 +++++----------
 arch/loongarch/net/bpf_jit.c     |  59 ++++---------
 arch/mips/net/bpf_jit_comp.c     |  20 +----
 arch/parisc/net/bpf_jit_core.c   |  73 ++++++----------
 arch/powerpc/net/bpf_jit_comp.c  |  72 ++++++---------
 arch/riscv/net/bpf_jit_core.c    |  61 +++++--------
 arch/s390/net/bpf_jit_comp.c     |  59 +++++--------
 arch/sparc/net/bpf_jit_comp_64.c |  61 +++++--------
 arch/x86/net/bpf_jit_comp.c      |  43 ++-------
 arch/x86/net/bpf_jit_comp32.c    |  33 +------
 include/linux/filter.h           |  33 ++++++-
 kernel/bpf/core.c                |  69 +++++++++++++--
 kernel/bpf/fixups.c              | 146 ++++++++++++++++++++++++++-----
 15 files changed, 403 insertions(+), 478 deletions(-)

diff --git a/arch/arc/net/bpf_jit_core.c b/arch/arc/net/bpf_jit_core.c
index 1421eeced0f5..973ceae48675 100644
--- a/arch/arc/net/bpf_jit_core.c
+++ b/arch/arc/net/bpf_jit_core.c
@@ -79,7 +79,6 @@ struct arc_jit_data {
  * The JIT pertinent context that is used by different functions.
  *
  * prog:		The current eBPF program being handled.
- * orig_prog:		The original eBPF program before any possible change.
  * jit:			The JIT buffer and its length.
  * bpf_header:		The JITed program header. "jit.buf" points inside it.
  * emit:		If set, opcodes are written to memory; else, a dry-run.
@@ -94,12 +93,10 @@ struct arc_jit_data {
  * need_extra_pass:	A forecast if an "extra_pass" will occur.
  * is_extra_pass:	Indicates if the current pass is an extra pass.
  * user_bpf_prog:	True, if VM opcodes come from a real program.
- * blinded:		True if "constant blinding" step returned a new "prog".
  * success:		Indicates if the whole JIT went OK.
  */
 struct jit_context {
 	struct bpf_prog			*prog;
-	struct bpf_prog			*orig_prog;
 	struct jit_buffer		jit;
 	struct bpf_binary_header	*bpf_header;
 	bool				emit;
@@ -114,7 +111,6 @@ struct jit_context {
 	bool				need_extra_pass;
 	bool				is_extra_pass;
 	bool				user_bpf_prog;
-	bool				blinded;
 	bool				success;
 };
 
@@ -161,13 +157,7 @@ static int jit_ctx_init(struct jit_context *ctx, struct bpf_prog *prog)
 {
 	memset(ctx, 0, sizeof(*ctx));
 
-	ctx->orig_prog = prog;
-
-	/* If constant blinding was requested but failed, scram. */
-	ctx->prog = bpf_jit_blind_constants(prog);
-	if (IS_ERR(ctx->prog))
-		return PTR_ERR(ctx->prog);
-	ctx->blinded = (ctx->prog != ctx->orig_prog);
+	ctx->prog = prog;
 
 	/* If the verifier doesn't zero-extend, then we have to do it. */
 	ctx->do_zext = !ctx->prog->aux->verifier_zext;
@@ -214,14 +204,6 @@ static inline void maybe_free(struct jit_context *ctx, void **mem)
  */
 static void jit_ctx_cleanup(struct jit_context *ctx)
 {
-	if (ctx->blinded) {
-		/* if all went well, release the orig_prog. */
-		if (ctx->success)
-			bpf_jit_prog_release_other(ctx->prog, ctx->orig_prog);
-		else
-			bpf_jit_prog_release_other(ctx->orig_prog, ctx->prog);
-	}
-
 	maybe_free(ctx, (void **)&ctx->bpf2insn);
 	maybe_free(ctx, (void **)&ctx->jit_data);
 
@@ -229,12 +211,19 @@ static void jit_ctx_cleanup(struct jit_context *ctx)
 		ctx->bpf2insn_valid = false;
 
 	/* Freeing "bpf_header" is enough. "jit.buf" is a sub-array of it. */
-	if (!ctx->success && ctx->bpf_header) {
-		bpf_jit_binary_free(ctx->bpf_header);
-		ctx->bpf_header = NULL;
-		ctx->jit.buf    = NULL;
-		ctx->jit.index  = 0;
-		ctx->jit.len    = 0;
+	if (!ctx->success) {
+		if (ctx->bpf_header) {
+			bpf_jit_binary_free(ctx->bpf_header);
+			ctx->bpf_header = NULL;
+			ctx->jit.buf    = NULL;
+			ctx->jit.index  = 0;
+			ctx->jit.len    = 0;
+		}
+		if (ctx->is_extra_pass) {
+			ctx->prog->bpf_func = NULL;
+			ctx->prog->jited = 0;
+			ctx->prog->jited_len = 0;
+		}
 	}
 
 	ctx->emit = false;
diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index deeb8f292454..e6b1bb2de627 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -2144,9 +2144,7 @@ bool bpf_jit_needs_zext(void)
 
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
-	struct bpf_prog *tmp, *orig_prog = prog;
 	struct bpf_binary_header *header;
-	bool tmp_blinded = false;
 	struct jit_ctx ctx;
 	unsigned int tmp_idx;
 	unsigned int image_size;
@@ -2156,20 +2154,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	 * the interpreter.
 	 */
 	if (!prog->jit_requested)
-		return orig_prog;
-
-	/* If constant blinding was enabled and we failed during blinding
-	 * then we must fall back to the interpreter. Otherwise, we save
-	 * the new JITed code.
-	 */
-	tmp = bpf_jit_blind_constants(prog);
-
-	if (IS_ERR(tmp))
-		return orig_prog;
-	if (tmp != prog) {
-		tmp_blinded = true;
-		prog = tmp;
-	}
+		return prog;
 
 	memset(&ctx, 0, sizeof(ctx));
 	ctx.prog = prog;
@@ -2179,10 +2164,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	 * we must fall back to the interpreter
 	 */
 	ctx.offsets = kcalloc(prog->len, sizeof(int), GFP_KERNEL);
-	if (ctx.offsets == NULL) {
-		prog = orig_prog;
-		goto out;
-	}
+	if (ctx.offsets == NULL)
+		return prog;
 
 	/* 1) fake pass to find in the length of the JITed code,
 	 * to compute ctx->offsets and other context variables
@@ -2194,10 +2177,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	 * being successful in the second pass, so just fall back
 	 * to the interpreter.
 	 */
-	if (build_body(&ctx)) {
-		prog = orig_prog;
+	if (build_body(&ctx))
 		goto out_off;
-	}
 
 	tmp_idx = ctx.idx;
 	build_prologue(&ctx);
@@ -2213,10 +2194,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	ctx.idx += ctx.imm_count;
 	if (ctx.imm_count) {
 		ctx.imms = kcalloc(ctx.imm_count, sizeof(u32), GFP_KERNEL);
-		if (ctx.imms == NULL) {
-			prog = orig_prog;
+		if (ctx.imms == NULL)
 			goto out_off;
-		}
 	}
 #else
 	/* there's nothing about the epilogue on ARMv7 */
@@ -2238,10 +2217,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	/* Not able to allocate memory for the structure then
 	 * we must fall back to the interpretation
 	 */
-	if (header == NULL) {
-		prog = orig_prog;
+	if (header == NULL)
 		goto out_imms;
-	}
 
 	/* 2.) Actual pass to generate final JIT code */
 	ctx.target = (u32 *) image_ptr;
@@ -2278,16 +2255,12 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 #endif
 out_off:
 	kfree(ctx.offsets);
-out:
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(prog, prog == orig_prog ?
-					   tmp : orig_prog);
+
 	return prog;
 
 out_free:
 	image_ptr = NULL;
 	bpf_jit_binary_free(header);
-	prog = orig_prog;
 	goto out_imms;
 }
 
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 524b67c0867e..d310d1c35192 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -2003,14 +2003,12 @@ struct arm64_jit_data {
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
 	int image_size, prog_size, extable_size, extable_align, extable_offset;
-	struct bpf_prog *tmp, *orig_prog = prog;
 	struct bpf_binary_header *header;
 	struct bpf_binary_header *ro_header = NULL;
 	struct arm64_jit_data *jit_data;
 	void __percpu *priv_stack_ptr = NULL;
 	bool was_classic = bpf_prog_was_classic(prog);
 	int priv_stack_alloc_sz;
-	bool tmp_blinded = false;
 	bool extra_pass = false;
 	struct jit_ctx ctx;
 	u8 *image_ptr;
@@ -2019,26 +2017,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	int exentry_idx;
 
 	if (!prog->jit_requested)
-		return orig_prog;
-
-	tmp = bpf_jit_blind_constants(prog);
-	/* If blinding was requested and we failed during blinding,
-	 * we must fall back to the interpreter.
-	 */
-	if (IS_ERR(tmp))
-		return orig_prog;
-	if (tmp != prog) {
-		tmp_blinded = true;
-		prog = tmp;
-	}
+		return prog;
 
 	jit_data = prog->aux->jit_data;
 	if (!jit_data) {
 		jit_data = kzalloc_obj(*jit_data);
-		if (!jit_data) {
-			prog = orig_prog;
-			goto out;
-		}
+		if (!jit_data)
+			return prog;
 		prog->aux->jit_data = jit_data;
 	}
 	priv_stack_ptr = prog->aux->priv_stack_ptr;
@@ -2050,10 +2035,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		priv_stack_alloc_sz = round_up(prog->aux->stack_depth, 16) +
 				      2 * PRIV_STACK_GUARD_SZ;
 		priv_stack_ptr = __alloc_percpu_gfp(priv_stack_alloc_sz, 16, GFP_KERNEL);
-		if (!priv_stack_ptr) {
-			prog = orig_prog;
+		if (!priv_stack_ptr)
 			goto out_priv_stack;
-		}
 
 		priv_stack_init_guard(priv_stack_ptr, priv_stack_alloc_sz);
 		prog->aux->priv_stack_ptr = priv_stack_ptr;
@@ -2073,10 +2056,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	ctx.prog = prog;
 
 	ctx.offset = kvzalloc_objs(int, prog->len + 1);
-	if (ctx.offset == NULL) {
-		prog = orig_prog;
+	if (ctx.offset == NULL)
 		goto out_off;
-	}
 
 	ctx.user_vm_start = bpf_arena_get_user_vm_start(prog->aux->arena);
 	ctx.arena_vm_start = bpf_arena_get_kern_vm_start(prog->aux->arena);
@@ -2089,15 +2070,11 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	 * BPF line info needs ctx->offset[i] to be the offset of
 	 * instruction[i] in jited image, so build prologue first.
 	 */
-	if (build_prologue(&ctx, was_classic)) {
-		prog = orig_prog;
+	if (build_prologue(&ctx, was_classic))
 		goto out_off;
-	}
 
-	if (build_body(&ctx, extra_pass)) {
-		prog = orig_prog;
+	if (build_body(&ctx, extra_pass))
 		goto out_off;
-	}
 
 	ctx.epilogue_offset = ctx.idx;
 	build_epilogue(&ctx, was_classic);
@@ -2115,10 +2092,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	ro_header = bpf_jit_binary_pack_alloc(image_size, &ro_image_ptr,
 					      sizeof(u64), &header, &image_ptr,
 					      jit_fill_hole);
-	if (!ro_header) {
-		prog = orig_prog;
+	if (!ro_header)
 		goto out_off;
-	}
 
 	/* Pass 2: Determine jited position and result for each instruction */
 
@@ -2146,10 +2121,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	/* Dont write body instructions to memory for now */
 	ctx.write = false;
 
-	if (build_body(&ctx, extra_pass)) {
-		prog = orig_prog;
+	if (build_body(&ctx, extra_pass))
 		goto out_free_hdr;
-	}
 
 	ctx.epilogue_offset = ctx.idx;
 	ctx.exentry_idx = exentry_idx;
@@ -2158,19 +2131,15 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 
 	/* Pass 3: Adjust jump offset and write final image */
 	if (build_body(&ctx, extra_pass) ||
-		WARN_ON_ONCE(ctx.idx != ctx.epilogue_offset)) {
-		prog = orig_prog;
+		WARN_ON_ONCE(ctx.idx != ctx.epilogue_offset))
 		goto out_free_hdr;
-	}
 
 	build_epilogue(&ctx, was_classic);
 	build_plt(&ctx);
 
 	/* Extra pass to validate JITed code. */
-	if (validate_ctx(&ctx)) {
-		prog = orig_prog;
+	if (validate_ctx(&ctx))
 		goto out_free_hdr;
-	}
 
 	/* update the real prog size */
 	prog_size = sizeof(u32) * ctx.idx;
@@ -2187,16 +2156,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		if (extra_pass && ctx.idx > jit_data->ctx.idx) {
 			pr_err_once("multi-func JIT bug %d > %d\n",
 				    ctx.idx, jit_data->ctx.idx);
-			prog->bpf_func = NULL;
-			prog->jited = 0;
-			prog->jited_len = 0;
 			goto out_free_hdr;
 		}
 		if (WARN_ON(bpf_jit_binary_pack_finalize(ro_header, header))) {
-			/* ro_header has been freed */
+			/* ro_header and header have been freed */
 			ro_header = NULL;
-			prog = orig_prog;
-			goto out_off;
+			header = NULL;
+			goto out_free_hdr;
 		}
 	} else {
 		jit_data->ctx = ctx;
@@ -2233,13 +2199,15 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		kfree(jit_data);
 		prog->aux->jit_data = NULL;
 	}
-out:
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(prog, prog == orig_prog ?
-					   tmp : orig_prog);
+
 	return prog;
 
 out_free_hdr:
+	if (extra_pass) {
+		prog->bpf_func = NULL;
+		prog->jited = 0;
+		prog->jited_len = 0;
+	}
 	if (header) {
 		bpf_arch_text_copy(&ro_header->size, &header->size,
 				   sizeof(header->size));
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 9cb796e16379..fcc8c0c29fb0 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -1922,43 +1922,26 @@ int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
 
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
-	bool tmp_blinded = false, extra_pass = false;
+	bool extra_pass = false;
 	u8 *image_ptr, *ro_image_ptr;
 	int image_size, prog_size, extable_size;
 	struct jit_ctx ctx;
 	struct jit_data *jit_data;
 	struct bpf_binary_header *header;
 	struct bpf_binary_header *ro_header;
-	struct bpf_prog *tmp, *orig_prog = prog;
 
 	/*
 	 * If BPF JIT was not enabled then we must fall back to
 	 * the interpreter.
 	 */
 	if (!prog->jit_requested)
-		return orig_prog;
-
-	tmp = bpf_jit_blind_constants(prog);
-	/*
-	 * If blinding was requested and we failed during blinding,
-	 * we must fall back to the interpreter. Otherwise, we save
-	 * the new JITed code.
-	 */
-	if (IS_ERR(tmp))
-		return orig_prog;
-
-	if (tmp != prog) {
-		tmp_blinded = true;
-		prog = tmp;
-	}
+		return prog;
 
 	jit_data = prog->aux->jit_data;
 	if (!jit_data) {
 		jit_data = kzalloc_obj(*jit_data);
-		if (!jit_data) {
-			prog = orig_prog;
-			goto out;
-		}
+		if (!jit_data)
+			return prog;
 		prog->aux->jit_data = jit_data;
 	}
 	if (jit_data->ctx.offset) {
@@ -1978,17 +1961,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	ctx.user_vm_start = bpf_arena_get_user_vm_start(prog->aux->arena);
 
 	ctx.offset = kvcalloc(prog->len + 1, sizeof(u32), GFP_KERNEL);
-	if (ctx.offset == NULL) {
-		prog = orig_prog;
+	if (ctx.offset == NULL)
 		goto out_offset;
-	}
 
 	/* 1. Initial fake pass to compute ctx->idx and set ctx->flags */
 	build_prologue(&ctx);
-	if (build_body(&ctx, extra_pass)) {
-		prog = orig_prog;
+	if (build_body(&ctx, extra_pass))
 		goto out_offset;
-	}
 	ctx.epilogue_offset = ctx.idx;
 	build_epilogue(&ctx);
 
@@ -2004,10 +1983,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	/* Now we know the size of the structure to make */
 	ro_header = bpf_jit_binary_pack_alloc(image_size, &ro_image_ptr, sizeof(u32),
 					      &header, &image_ptr, jit_fill_hole);
-	if (!ro_header) {
-		prog = orig_prog;
+	if (!ro_header)
 		goto out_offset;
-	}
 
 	/* 2. Now, the actual pass to generate final JIT code */
 	/*
@@ -2027,17 +2004,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	ctx.num_exentries = 0;
 
 	build_prologue(&ctx);
-	if (build_body(&ctx, extra_pass)) {
-		prog = orig_prog;
+	if (build_body(&ctx, extra_pass))
 		goto out_free;
-	}
 	build_epilogue(&ctx);
 
 	/* 3. Extra pass to validate JITed code */
-	if (validate_ctx(&ctx)) {
-		prog = orig_prog;
+	if (validate_ctx(&ctx))
 		goto out_free;
-	}
 
 	/* And we're done */
 	if (bpf_jit_enable > 1)
@@ -2050,9 +2023,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 			goto out_free;
 		}
 		if (WARN_ON(bpf_jit_binary_pack_finalize(ro_header, header))) {
-			/* ro_header has been freed */
+			/* ro_header and header have been freed */
 			ro_header = NULL;
-			prog = orig_prog;
+			header = NULL;
 			goto out_free;
 		}
 		/*
@@ -2084,13 +2057,15 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		prog->aux->jit_data = NULL;
 	}
 
-out:
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(prog, prog == orig_prog ? tmp : orig_prog);
-
 	return prog;
 
 out_free:
+	if (extra_pass) {
+		prog->bpf_func = NULL;
+		prog->jited = 0;
+		prog->jited_len = 0;
+	}
+
 	if (header) {
 		bpf_arch_text_copy(&ro_header->size, &header->size, sizeof(header->size));
 		bpf_jit_binary_pack_free(ro_header, header);
diff --git a/arch/mips/net/bpf_jit_comp.c b/arch/mips/net/bpf_jit_comp.c
index e355dfca4400..d2b6c955f18e 100644
--- a/arch/mips/net/bpf_jit_comp.c
+++ b/arch/mips/net/bpf_jit_comp.c
@@ -911,10 +911,8 @@ bool bpf_jit_needs_zext(void)
 
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
-	struct bpf_prog *tmp, *orig_prog = prog;
 	struct bpf_binary_header *header = NULL;
 	struct jit_context ctx;
-	bool tmp_blinded = false;
 	unsigned int tmp_idx;
 	unsigned int image_size;
 	u8 *image_ptr;
@@ -925,19 +923,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	 * the interpreter.
 	 */
 	if (!prog->jit_requested)
-		return orig_prog;
-	/*
-	 * If constant blinding was enabled and we failed during blinding
-	 * then we must fall back to the interpreter. Otherwise, we save
-	 * the new JITed code.
-	 */
-	tmp = bpf_jit_blind_constants(prog);
-	if (IS_ERR(tmp))
-		return orig_prog;
-	if (tmp != prog) {
-		tmp_blinded = true;
-		prog = tmp;
-	}
+		return prog;
 
 	memset(&ctx, 0, sizeof(ctx));
 	ctx.program = prog;
@@ -1025,14 +1011,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	prog->jited_len = image_size;
 
 out:
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(prog, prog == orig_prog ?
-					   tmp : orig_prog);
 	kfree(ctx.descriptors);
 	return prog;
 
 out_err:
-	prog = orig_prog;
 	if (header)
 		bpf_jit_binary_free(header);
 	goto out;
diff --git a/arch/parisc/net/bpf_jit_core.c b/arch/parisc/net/bpf_jit_core.c
index a5eb6b51e27a..35dca372b5df 100644
--- a/arch/parisc/net/bpf_jit_core.c
+++ b/arch/parisc/net/bpf_jit_core.c
@@ -44,30 +44,19 @@ bool bpf_jit_needs_zext(void)
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
 	unsigned int prog_size = 0, extable_size = 0;
-	bool tmp_blinded = false, extra_pass = false;
-	struct bpf_prog *tmp, *orig_prog = prog;
+	bool extra_pass = false;
 	int pass = 0, prev_ninsns = 0, prologue_len, i;
 	struct hppa_jit_data *jit_data;
 	struct hppa_jit_context *ctx;
 
 	if (!prog->jit_requested)
-		return orig_prog;
-
-	tmp = bpf_jit_blind_constants(prog);
-	if (IS_ERR(tmp))
-		return orig_prog;
-	if (tmp != prog) {
-		tmp_blinded = true;
-		prog = tmp;
-	}
+		return prog;
 
 	jit_data = prog->aux->jit_data;
 	if (!jit_data) {
 		jit_data = kzalloc_obj(*jit_data);
-		if (!jit_data) {
-			prog = orig_prog;
-			goto out;
-		}
+		if (!jit_data)
+			return prog;
 		prog->aux->jit_data = jit_data;
 	}
 
@@ -81,10 +70,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 
 	ctx->prog = prog;
 	ctx->offset = kzalloc_objs(int, prog->len);
-	if (!ctx->offset) {
-		prog = orig_prog;
-		goto out_offset;
-	}
+	if (!ctx->offset)
+		goto out_err;
 	for (i = 0; i < prog->len; i++) {
 		prev_ninsns += 20;
 		ctx->offset[i] = prev_ninsns;
@@ -93,10 +80,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	for (i = 0; i < NR_JIT_ITERATIONS; i++) {
 		pass++;
 		ctx->ninsns = 0;
-		if (build_body(ctx, extra_pass, ctx->offset)) {
-			prog = orig_prog;
-			goto out_offset;
-		}
+		if (build_body(ctx, extra_pass, ctx->offset))
+			goto out_err;
 		ctx->body_len = ctx->ninsns;
 		bpf_jit_build_prologue(ctx);
 		ctx->prologue_len = ctx->ninsns - ctx->body_len;
@@ -116,10 +101,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 						     &jit_data->image,
 						     sizeof(long),
 						     bpf_fill_ill_insns);
-			if (!jit_data->header) {
-				prog = orig_prog;
-				goto out_offset;
-			}
+			if (!jit_data->header)
+				goto out_err;
 
 			ctx->insns = (u32 *)jit_data->image;
 			/*
@@ -134,8 +117,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		pr_err("bpf-jit: image did not converge in <%d passes!\n", i);
 		if (jit_data->header)
 			bpf_jit_binary_free(jit_data->header);
-		prog = orig_prog;
-		goto out_offset;
+		goto out_err;
 	}
 
 	if (extable_size)
@@ -148,8 +130,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	bpf_jit_build_prologue(ctx);
 	if (build_body(ctx, extra_pass, NULL)) {
 		bpf_jit_binary_free(jit_data->header);
-		prog = orig_prog;
-		goto out_offset;
+		goto out_err;
 	}
 	bpf_jit_build_epilogue(ctx);
 
@@ -160,20 +141,19 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 			{ extern int machine_restart(char *); machine_restart(""); }
 	}
 
+	if (!prog->is_func || extra_pass) {
+		if (bpf_jit_binary_lock_ro(jit_data->header)) {
+			bpf_jit_binary_free(jit_data->header);
+			goto out_err;
+		}
+		bpf_flush_icache(jit_data->header, ctx->insns + ctx->ninsns);
+	}
+
 	prog->bpf_func = (void *)ctx->insns;
 	prog->jited = 1;
 	prog->jited_len = prog_size;
 
-	bpf_flush_icache(jit_data->header, ctx->insns + ctx->ninsns);
-
 	if (!prog->is_func || extra_pass) {
-		if (bpf_jit_binary_lock_ro(jit_data->header)) {
-			bpf_jit_binary_free(jit_data->header);
-			prog->bpf_func = NULL;
-			prog->jited = 0;
-			prog->jited_len = 0;
-			goto out_offset;
-		}
 		prologue_len = ctx->epilogue_offset - ctx->body_len;
 		for (i = 0; i < prog->len; i++)
 			ctx->offset[i] += prologue_len;
@@ -183,14 +163,19 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		kfree(jit_data);
 		prog->aux->jit_data = NULL;
 	}
-out:
+
 	if (HPPA_JIT_REBOOT)
 		{ extern int machine_restart(char *); machine_restart(""); }
 
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(prog, prog == orig_prog ?
-					   tmp : orig_prog);
 	return prog;
+
+out_err:
+	if (extra_pass) {
+		prog->bpf_func = NULL;
+		prog->jited = 0;
+		prog->jited_len = 0;
+	}
+	goto out_offset;
 }
 
 u64 hppa_div64(u64 div, u64 divisor)
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 50103b3794fb..2bae4699e78f 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -177,9 +177,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	void __percpu *priv_stack_ptr = NULL;
 	struct bpf_binary_header *fhdr = NULL;
 	struct bpf_binary_header *hdr = NULL;
-	struct bpf_prog *org_fp = fp;
-	struct bpf_prog *tmp_fp = NULL;
-	bool bpf_blinded = false;
 	bool extra_pass = false;
 	u8 *fimage = NULL;
 	u32 *fcode_base = NULL;
@@ -187,24 +184,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	u32 fixup_len;
 
 	if (!fp->jit_requested)
-		return org_fp;
-
-	tmp_fp = bpf_jit_blind_constants(org_fp);
-	if (IS_ERR(tmp_fp))
-		return org_fp;
-
-	if (tmp_fp != org_fp) {
-		bpf_blinded = true;
-		fp = tmp_fp;
-	}
+		return fp;
 
 	jit_data = fp->aux->jit_data;
 	if (!jit_data) {
 		jit_data = kzalloc_obj(*jit_data);
-		if (!jit_data) {
-			fp = org_fp;
-			goto out;
-		}
+		if (!jit_data)
+			return fp;
 		fp->aux->jit_data = jit_data;
 	}
 
@@ -219,10 +205,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 		priv_stack_alloc_size = round_up(fp->aux->stack_depth, 16) +
 							2 * PRIV_STACK_GUARD_SZ;
 		priv_stack_ptr = __alloc_percpu_gfp(priv_stack_alloc_size, 16, GFP_KERNEL);
-		if (!priv_stack_ptr) {
-			fp = org_fp;
+		if (!priv_stack_ptr)
 			goto out_priv_stack;
-		}
 
 		priv_stack_init_guard(priv_stack_ptr, priv_stack_alloc_size);
 		fp->aux->priv_stack_ptr = priv_stack_ptr;
@@ -249,10 +233,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	}
 
 	addrs = kcalloc(flen + 1, sizeof(*addrs), GFP_KERNEL);
-	if (addrs == NULL) {
-		fp = org_fp;
-		goto out_addrs;
-	}
+	if (addrs == NULL)
+		goto out_err;
 
 	memset(&cgctx, 0, sizeof(struct codegen_context));
 	bpf_jit_init_reg_mapping(&cgctx);
@@ -279,11 +261,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	}
 
 	/* Scouting faux-generate pass 0 */
-	if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false)) {
+	if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false))
 		/* We hit something illegal or unsupported. */
-		fp = org_fp;
-		goto out_addrs;
-	}
+		goto out_err;
 
 	/*
 	 * If we have seen a tail call, we need a second pass.
@@ -294,10 +274,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	 */
 	if (cgctx.seen & SEEN_TAILCALL || !is_offset_in_branch_range((long)cgctx.idx * 4)) {
 		cgctx.idx = 0;
-		if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false)) {
-			fp = org_fp;
-			goto out_addrs;
-		}
+		if (bpf_jit_build_body(fp, NULL, NULL, &cgctx, addrs, 0, false))
+			goto out_err;
 	}
 
 	bpf_jit_realloc_regs(&cgctx);
@@ -318,10 +296,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 
 	fhdr = bpf_jit_binary_pack_alloc(alloclen, &fimage, 4, &hdr, &image,
 					      bpf_jit_fill_ill_insns);
-	if (!fhdr) {
-		fp = org_fp;
-		goto out_addrs;
-	}
+	if (!fhdr)
+		goto out_err;
 
 	if (extable_len)
 		fp->aux->extable = (void *)fimage + FUNCTION_DESCR_SIZE + proglen + fixup_len;
@@ -340,8 +316,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 				       extra_pass)) {
 			bpf_arch_text_copy(&fhdr->size, &hdr->size, sizeof(hdr->size));
 			bpf_jit_binary_pack_free(fhdr, hdr);
-			fp = org_fp;
-			goto out_addrs;
+			goto out_err;
 		}
 		bpf_jit_build_epilogue(code_base, &cgctx);
 
@@ -363,15 +338,16 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	((u64 *)image)[1] = local_paca->kernel_toc;
 #endif
 
+	if (!fp->is_func || extra_pass) {
+		if (bpf_jit_binary_pack_finalize(fhdr, hdr))
+			goto out_err;
+	}
+
 	fp->bpf_func = (void *)fimage;
 	fp->jited = 1;
 	fp->jited_len = cgctx.idx * 4 + FUNCTION_DESCR_SIZE;
 
 	if (!fp->is_func || extra_pass) {
-		if (bpf_jit_binary_pack_finalize(fhdr, hdr)) {
-			fp = org_fp;
-			goto out_addrs;
-		}
 		bpf_prog_fill_jited_linfo(fp, addrs);
 		/*
 		 * On ABI V1, executable code starts after the function
@@ -398,11 +374,15 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 		jit_data->hdr = hdr;
 	}
 
-out:
-	if (bpf_blinded)
-		bpf_jit_prog_release_other(fp, fp == org_fp ? tmp_fp : org_fp);
-
 	return fp;
+
+out_err:
+	if (extra_pass) {
+		fp->bpf_func = NULL;
+		fp->jited = 0;
+		fp->jited_len = 0;
+	}
+	goto out_addrs;
 }
 
 /*
diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c
index f7fd4afc3ca3..36f0aea8096d 100644
--- a/arch/riscv/net/bpf_jit_core.c
+++ b/arch/riscv/net/bpf_jit_core.c
@@ -44,29 +44,19 @@ bool bpf_jit_needs_zext(void)
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
 	unsigned int prog_size = 0, extable_size = 0;
-	bool tmp_blinded = false, extra_pass = false;
-	struct bpf_prog *tmp, *orig_prog = prog;
+	bool extra_pass = false;
 	int pass = 0, prev_ninsns = 0, i;
 	struct rv_jit_data *jit_data;
 	struct rv_jit_context *ctx;
 
 	if (!prog->jit_requested)
-		return orig_prog;
-
-	tmp = bpf_jit_blind_constants(prog);
-	if (IS_ERR(tmp))
-		return orig_prog;
-	if (tmp != prog) {
-		tmp_blinded = true;
-		prog = tmp;
-	}
+		return prog;
 
 	jit_data = prog->aux->jit_data;
 	if (!jit_data) {
 		jit_data = kzalloc_obj(*jit_data);
 		if (!jit_data) {
-			prog = orig_prog;
-			goto out;
+			return prog;
 		}
 		prog->aux->jit_data = jit_data;
 	}
@@ -83,15 +73,11 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	ctx->user_vm_start = bpf_arena_get_user_vm_start(prog->aux->arena);
 	ctx->prog = prog;
 	ctx->offset = kzalloc_objs(int, prog->len);
-	if (!ctx->offset) {
-		prog = orig_prog;
+	if (!ctx->offset)
 		goto out_offset;
-	}
 
-	if (build_body(ctx, extra_pass, NULL)) {
-		prog = orig_prog;
+	if (build_body(ctx, extra_pass, NULL))
 		goto out_offset;
-	}
 
 	for (i = 0; i < prog->len; i++) {
 		prev_ninsns += 32;
@@ -105,10 +91,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		bpf_jit_build_prologue(ctx, bpf_is_subprog(prog));
 		ctx->prologue_len = ctx->ninsns;
 
-		if (build_body(ctx, extra_pass, ctx->offset)) {
-			prog = orig_prog;
+		if (build_body(ctx, extra_pass, ctx->offset))
 			goto out_offset;
-		}
 
 		ctx->epilogue_offset = ctx->ninsns;
 		bpf_jit_build_epilogue(ctx);
@@ -126,10 +110,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 							  &jit_data->ro_image, sizeof(u32),
 							  &jit_data->header, &jit_data->image,
 							  bpf_fill_ill_insns);
-			if (!jit_data->ro_header) {
-				prog = orig_prog;
+			if (!jit_data->ro_header)
 				goto out_offset;
-			}
 
 			/*
 			 * Use the image(RW) for writing the JITed instructions. But also save
@@ -150,7 +132,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 
 	if (i == NR_JIT_ITERATIONS) {
 		pr_err("bpf-jit: image did not converge in <%d passes!\n", i);
-		prog = orig_prog;
 		goto out_free_hdr;
 	}
 
@@ -163,26 +144,27 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	ctx->nexentries = 0;
 
 	bpf_jit_build_prologue(ctx, bpf_is_subprog(prog));
-	if (build_body(ctx, extra_pass, NULL)) {
-		prog = orig_prog;
+	if (build_body(ctx, extra_pass, NULL))
 		goto out_free_hdr;
-	}
 	bpf_jit_build_epilogue(ctx);
 
 	if (bpf_jit_enable > 1)
 		bpf_jit_dump(prog->len, prog_size, pass, ctx->insns);
 
-	prog->bpf_func = (void *)ctx->ro_insns + cfi_get_offset();
-	prog->jited = 1;
-	prog->jited_len = prog_size - cfi_get_offset();
-
 	if (!prog->is_func || extra_pass) {
 		if (WARN_ON(bpf_jit_binary_pack_finalize(jit_data->ro_header, jit_data->header))) {
 			/* ro_header has been freed */
 			jit_data->ro_header = NULL;
-			prog = orig_prog;
-			goto out_offset;
+			jit_data->header = NULL;
+			goto out_free_hdr;
 		}
+	}
+
+	prog->bpf_func = (void *)ctx->ro_insns + cfi_get_offset();
+	prog->jited = 1;
+	prog->jited_len = prog_size - cfi_get_offset();
+
+	if (!prog->is_func || extra_pass) {
 		for (i = 0; i < prog->len; i++)
 			ctx->offset[i] = ninsns_rvoff(ctx->offset[i]);
 		bpf_prog_fill_jited_linfo(prog, ctx->offset);
@@ -191,14 +173,15 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		kfree(jit_data);
 		prog->aux->jit_data = NULL;
 	}
-out:
 
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(prog, prog == orig_prog ?
-					   tmp : orig_prog);
 	return prog;
 
 out_free_hdr:
+	if (extra_pass) {
+		prog->bpf_func = NULL;
+		prog->jited = 0;
+		prog->jited_len = 0;
+	}
 	if (jit_data->header) {
 		bpf_arch_text_copy(&jit_data->ro_header->size, &jit_data->header->size,
 				   sizeof(jit_data->header->size));
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index d08d159b6319..2dfc279b1be2 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -2314,36 +2314,20 @@ static struct bpf_binary_header *bpf_jit_alloc(struct bpf_jit *jit,
  */
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 {
-	struct bpf_prog *tmp, *orig_fp = fp;
 	struct bpf_binary_header *header;
 	struct s390_jit_data *jit_data;
-	bool tmp_blinded = false;
 	bool extra_pass = false;
 	struct bpf_jit jit;
 	int pass;
 
 	if (!fp->jit_requested)
-		return orig_fp;
-
-	tmp = bpf_jit_blind_constants(fp);
-	/*
-	 * If blinding was requested and we failed during blinding,
-	 * we must fall back to the interpreter.
-	 */
-	if (IS_ERR(tmp))
-		return orig_fp;
-	if (tmp != fp) {
-		tmp_blinded = true;
-		fp = tmp;
-	}
+		return fp;
 
 	jit_data = fp->aux->jit_data;
 	if (!jit_data) {
 		jit_data = kzalloc_obj(*jit_data);
-		if (!jit_data) {
-			fp = orig_fp;
-			goto out;
-		}
+		if (!jit_data)
+			return fp;
 		fp->aux->jit_data = jit_data;
 	}
 	if (jit_data->ctx.addrs) {
@@ -2356,34 +2340,27 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 
 	memset(&jit, 0, sizeof(jit));
 	jit.addrs = kvcalloc(fp->len + 1, sizeof(*jit.addrs), GFP_KERNEL);
-	if (jit.addrs == NULL) {
-		fp = orig_fp;
-		goto free_addrs;
-	}
+	if (jit.addrs == NULL)
+		goto out_err;
 	/*
 	 * Three initial passes:
 	 *   - 1/2: Determine clobbered registers
 	 *   - 3:   Calculate program size and addrs array
 	 */
 	for (pass = 1; pass <= 3; pass++) {
-		if (bpf_jit_prog(&jit, fp, extra_pass)) {
-			fp = orig_fp;
-			goto free_addrs;
-		}
+		if (bpf_jit_prog(&jit, fp, extra_pass))
+			goto out_err;
 	}
 	/*
 	 * Final pass: Allocate and generate program
 	 */
 	header = bpf_jit_alloc(&jit, fp);
-	if (!header) {
-		fp = orig_fp;
-		goto free_addrs;
-	}
+	if (!header)
+		goto out_err;
 skip_init_ctx:
 	if (bpf_jit_prog(&jit, fp, extra_pass)) {
 		bpf_jit_binary_free(header);
-		fp = orig_fp;
-		goto free_addrs;
+		goto out_err;
 	}
 	if (bpf_jit_enable > 1) {
 		bpf_jit_dump(fp->len, jit.size, pass, jit.prg_buf);
@@ -2392,8 +2369,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 	if (!fp->is_func || extra_pass) {
 		if (bpf_jit_binary_lock_ro(header)) {
 			bpf_jit_binary_free(header);
-			fp = orig_fp;
-			goto free_addrs;
+			goto out_err;
 		}
 	} else {
 		jit_data->header = header;
@@ -2411,11 +2387,16 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
 		kfree(jit_data);
 		fp->aux->jit_data = NULL;
 	}
-out:
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(fp, fp == orig_fp ?
-					   tmp : orig_fp);
+
 	return fp;
+
+out_err:
+	if (extra_pass) {
+		fp->bpf_func = NULL;
+		fp->jited = 0;
+		fp->jited_len = 0;
+	}
+	goto free_addrs;
 }
 
 bool bpf_jit_supports_kfunc_call(void)
diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c
index b23d1c645ae5..e83e29137566 100644
--- a/arch/sparc/net/bpf_jit_comp_64.c
+++ b/arch/sparc/net/bpf_jit_comp_64.c
@@ -1479,37 +1479,22 @@ struct sparc64_jit_data {
 
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
-	struct bpf_prog *tmp, *orig_prog = prog;
 	struct sparc64_jit_data *jit_data;
 	struct bpf_binary_header *header;
 	u32 prev_image_size, image_size;
-	bool tmp_blinded = false;
 	bool extra_pass = false;
 	struct jit_ctx ctx;
 	u8 *image_ptr;
 	int pass, i;
 
 	if (!prog->jit_requested)
-		return orig_prog;
-
-	tmp = bpf_jit_blind_constants(prog);
-	/* If blinding was requested and we failed during blinding,
-	 * we must fall back to the interpreter.
-	 */
-	if (IS_ERR(tmp))
-		return orig_prog;
-	if (tmp != prog) {
-		tmp_blinded = true;
-		prog = tmp;
-	}
+		return prog;
 
 	jit_data = prog->aux->jit_data;
 	if (!jit_data) {
 		jit_data = kzalloc_obj(*jit_data);
-		if (!jit_data) {
-			prog = orig_prog;
-			goto out;
-		}
+		if (!jit_data)
+			return prog;
 		prog->aux->jit_data = jit_data;
 	}
 	if (jit_data->ctx.offset) {
@@ -1527,10 +1512,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	ctx.prog = prog;
 
 	ctx.offset = kmalloc_array(prog->len, sizeof(unsigned int), GFP_KERNEL);
-	if (ctx.offset == NULL) {
-		prog = orig_prog;
-		goto out_off;
-	}
+	if (ctx.offset == NULL)
+		goto out_err;
 
 	/* Longest sequence emitted is for bswap32, 12 instructions.  Pre-cook
 	 * the offset array so that we converge faster.
@@ -1543,10 +1526,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		ctx.idx = 0;
 
 		build_prologue(&ctx);
-		if (build_body(&ctx)) {
-			prog = orig_prog;
-			goto out_off;
-		}
+		if (build_body(&ctx))
+			goto out_err;
 		build_epilogue(&ctx);
 
 		if (bpf_jit_enable > 1)
@@ -1569,10 +1550,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	image_size = sizeof(u32) * ctx.idx;
 	header = bpf_jit_binary_alloc(image_size, &image_ptr,
 				      sizeof(u32), jit_fill_hole);
-	if (header == NULL) {
-		prog = orig_prog;
-		goto out_off;
-	}
+	if (header == NULL)
+		goto out_err;
 
 	ctx.image = (u32 *)image_ptr;
 skip_init_ctx:
@@ -1582,8 +1561,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 
 	if (build_body(&ctx)) {
 		bpf_jit_binary_free(header);
-		prog = orig_prog;
-		goto out_off;
+		goto out_err;
 	}
 
 	build_epilogue(&ctx);
@@ -1592,8 +1570,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		pr_err("bpf_jit: Failed to converge, prev_size=%u size=%d\n",
 		       prev_image_size, ctx.idx * 4);
 		bpf_jit_binary_free(header);
-		prog = orig_prog;
-		goto out_off;
+		goto out_err;
 	}
 
 	if (bpf_jit_enable > 1)
@@ -1604,8 +1581,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	if (!prog->is_func || extra_pass) {
 		if (bpf_jit_binary_lock_ro(header)) {
 			bpf_jit_binary_free(header);
-			prog = orig_prog;
-			goto out_off;
+			goto out_err;
 		}
 	} else {
 		jit_data->ctx = ctx;
@@ -1624,9 +1600,14 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		kfree(jit_data);
 		prog->aux->jit_data = NULL;
 	}
-out:
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(prog, prog == orig_prog ?
-					   tmp : orig_prog);
+
 	return prog;
+
+out_err:
+	if (extra_pass) {
+		prog->bpf_func = NULL;
+		prog->jited = 0;
+		prog->jited_len = 0;
+	}
+	goto out_off;
 }
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index e9b78040d703..77d00a8dec87 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -3717,13 +3717,11 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
 	struct bpf_binary_header *rw_header = NULL;
 	struct bpf_binary_header *header = NULL;
-	struct bpf_prog *tmp, *orig_prog = prog;
 	void __percpu *priv_stack_ptr = NULL;
 	struct x64_jit_data *jit_data;
 	int priv_stack_alloc_sz;
 	int proglen, oldproglen = 0;
 	struct jit_context ctx = {};
-	bool tmp_blinded = false;
 	bool extra_pass = false;
 	bool padding = false;
 	u8 *rw_image = NULL;
@@ -3733,27 +3731,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 	int i;
 
 	if (!prog->jit_requested)
-		return orig_prog;
-
-	tmp = bpf_jit_blind_constants(prog);
-	/*
-	 * If blinding was requested and we failed during blinding,
-	 * we must fall back to the interpreter.
-	 */
-	if (IS_ERR(tmp))
-		return orig_prog;
-	if (tmp != prog) {
-		tmp_blinded = true;
-		prog = tmp;
-	}
+		return prog;
 
 	jit_data = prog->aux->jit_data;
 	if (!jit_data) {
 		jit_data = kzalloc_obj(*jit_data);
-		if (!jit_data) {
-			prog = orig_prog;
-			goto out;
-		}
+		if (!jit_data)
+			return prog;
 		prog->aux->jit_data = jit_data;
 	}
 	priv_stack_ptr = prog->aux->priv_stack_ptr;
@@ -3765,10 +3749,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		priv_stack_alloc_sz = round_up(prog->aux->stack_depth, 8) +
 				      2 * PRIV_STACK_GUARD_SZ;
 		priv_stack_ptr = __alloc_percpu_gfp(priv_stack_alloc_sz, 8, GFP_KERNEL);
-		if (!priv_stack_ptr) {
-			prog = orig_prog;
+		if (!priv_stack_ptr)
 			goto out_priv_stack;
-		}
 
 		priv_stack_init_guard(priv_stack_ptr, priv_stack_alloc_sz);
 		prog->aux->priv_stack_ptr = priv_stack_ptr;
@@ -3786,10 +3768,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		goto skip_init_addrs;
 	}
 	addrs = kvmalloc_objs(*addrs, prog->len + 1);
-	if (!addrs) {
-		prog = orig_prog;
+	if (!addrs)
 		goto out_addrs;
-	}
 
 	/*
 	 * Before first pass, make a rough estimation of addrs[]
@@ -3820,8 +3800,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 						   sizeof(rw_header->size));
 				bpf_jit_binary_pack_free(header, rw_header);
 			}
-			/* Fall back to interpreter mode */
-			prog = orig_prog;
 			if (extra_pass) {
 				prog->bpf_func = NULL;
 				prog->jited = 0;
@@ -3852,10 +3830,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 			header = bpf_jit_binary_pack_alloc(roundup(proglen, align) + extable_size,
 							   &image, align, &rw_header, &rw_image,
 							   jit_fill_hole);
-			if (!header) {
-				prog = orig_prog;
+			if (!header)
 				goto out_addrs;
-			}
 			prog->aux->extable = (void *) image + roundup(proglen, align);
 		}
 		oldproglen = proglen;
@@ -3908,8 +3884,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		prog->bpf_func = (void *)image + cfi_get_offset();
 		prog->jited = 1;
 		prog->jited_len = proglen - cfi_get_offset();
-	} else {
-		prog = orig_prog;
 	}
 
 	if (!image || !prog->is_func || extra_pass) {
@@ -3925,10 +3899,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		kfree(jit_data);
 		prog->aux->jit_data = NULL;
 	}
-out:
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(prog, prog == orig_prog ?
-					   tmp : orig_prog);
+
 	return prog;
 }
 
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index dda423025c3d..5f259577614a 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -2521,35 +2521,19 @@ bool bpf_jit_needs_zext(void)
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 {
 	struct bpf_binary_header *header = NULL;
-	struct bpf_prog *tmp, *orig_prog = prog;
 	int proglen, oldproglen = 0;
 	struct jit_context ctx = {};
-	bool tmp_blinded = false;
 	u8 *image = NULL;
 	int *addrs;
 	int pass;
 	int i;
 
 	if (!prog->jit_requested)
-		return orig_prog;
-
-	tmp = bpf_jit_blind_constants(prog);
-	/*
-	 * If blinding was requested and we failed during blinding,
-	 * we must fall back to the interpreter.
-	 */
-	if (IS_ERR(tmp))
-		return orig_prog;
-	if (tmp != prog) {
-		tmp_blinded = true;
-		prog = tmp;
-	}
+		return prog;
 
 	addrs = kmalloc_objs(*addrs, prog->len);
-	if (!addrs) {
-		prog = orig_prog;
-		goto out;
-	}
+	if (!addrs)
+		return prog;
 
 	/*
 	 * Before first pass, make a rough estimation of addrs[]
@@ -2574,7 +2558,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 			image = NULL;
 			if (header)
 				bpf_jit_binary_free(header);
-			prog = orig_prog;
 			goto out_addrs;
 		}
 		if (image) {
@@ -2588,10 +2571,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		if (proglen == oldproglen) {
 			header = bpf_jit_binary_alloc(proglen, &image,
 						      1, jit_fill_hole);
-			if (!header) {
-				prog = orig_prog;
+			if (!header)
 				goto out_addrs;
-			}
 		}
 		oldproglen = proglen;
 		cond_resched();
@@ -2604,16 +2585,10 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		prog->bpf_func = (void *)image;
 		prog->jited = 1;
 		prog->jited_len = proglen;
-	} else {
-		prog = orig_prog;
 	}
 
 out_addrs:
 	kfree(addrs);
-out:
-	if (tmp_blinded)
-		bpf_jit_prog_release_other(prog, prog == orig_prog ?
-					   tmp : orig_prog);
 	return prog;
 }
 
diff --git a/include/linux/filter.h b/include/linux/filter.h
index f552170eacf4..9fa4d4090093 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1184,6 +1184,18 @@ static inline bool bpf_dump_raw_ok(const struct cred *cred)
 
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
 				       const struct bpf_insn *patch, u32 len);
+
+#ifdef CONFIG_BPF_SYSCALL
+struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 off,
+				     const struct bpf_insn *patch, u32 len);
+#else
+static inline struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 off,
+						   const struct bpf_insn *patch, u32 len)
+{
+	return ERR_PTR(-ENOTSUPP);
+}
+#endif /* CONFIG_BPF_SYSCALL */
+
 int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt);
 
 static inline bool xdp_return_frame_no_direct(void)
@@ -1310,9 +1322,14 @@ int bpf_jit_get_func_addr(const struct bpf_prog *prog,
 
 const char *bpf_jit_get_prog_name(struct bpf_prog *prog);
 
-struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *fp);
+struct bpf_prog *bpf_jit_blind_constants(struct bpf_verifier_env *env, struct bpf_prog *prog);
 void bpf_jit_prog_release_other(struct bpf_prog *fp, struct bpf_prog *fp_other);
 
+static inline bool bpf_prog_need_blind(const struct bpf_prog *prog)
+{
+	return prog->blinding_requested && !prog->blinded;
+}
+
 static inline void bpf_jit_dump(unsigned int flen, unsigned int proglen,
 				u32 pass, void *image)
 {
@@ -1451,6 +1468,20 @@ static inline void bpf_prog_kallsyms_del(struct bpf_prog *fp)
 {
 }
 
+static inline bool bpf_prog_need_blind(const struct bpf_prog *prog)
+{
+	return false;
+}
+
+static inline
+struct bpf_prog *bpf_jit_blind_constants(struct bpf_verifier_env *env, struct bpf_prog *prog)
+{
+	return prog;
+}
+
+static inline void bpf_jit_prog_release_other(struct bpf_prog *fp, struct bpf_prog *fp_other)
+{
+}
 #endif /* CONFIG_BPF_JIT */
 
 void bpf_prog_kallsyms_del_all(struct bpf_prog *fp);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 066b86e7233c..fc9fb3c07866 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1508,7 +1508,11 @@ static void adjust_insn_arrays(struct bpf_prog *prog, u32 off, u32 len)
 #endif
 }
 
-struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
+/*
+ * Now this function is used only to blind the main prog and must be invoked only when
+ * bpf_prog_need_blind() returns true.
+ */
+struct bpf_prog *bpf_jit_blind_constants(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	struct bpf_insn insn_buff[16], aux[2];
 	struct bpf_prog *clone, *tmp;
@@ -1516,13 +1520,17 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
 	struct bpf_insn *insn;
 	int i, rewritten;
 
-	if (!prog->blinding_requested || prog->blinded)
-		return prog;
+	if (WARN_ON_ONCE(env && env->prog != prog))
+		return ERR_PTR(-EINVAL);
 
 	clone = bpf_prog_clone_create(prog, GFP_USER);
 	if (!clone)
 		return ERR_PTR(-ENOMEM);
 
+	/* make sure bpf_patch_insn_data() patches the correct prog */
+	if (env)
+		env->prog = clone;
+
 	insn_cnt = clone->len;
 	insn = clone->insnsi;
 
@@ -1550,21 +1558,35 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog)
 		if (!rewritten)
 			continue;
 
-		tmp = bpf_patch_insn_single(clone, i, insn_buff, rewritten);
-		if (IS_ERR(tmp)) {
+		if (env)
+			tmp = bpf_patch_insn_data(env, i, insn_buff, rewritten);
+		else
+			tmp = bpf_patch_insn_single(clone, i, insn_buff, rewritten);
+
+		if (IS_ERR_OR_NULL(tmp)) {
+			if (env)
+				/* restore the original prog */
+				env->prog = prog;
 			/* Patching may have repointed aux->prog during
 			 * realloc from the original one, so we need to
 			 * fix it up here on error.
 			 */
 			bpf_jit_prog_release_other(prog, clone);
-			return tmp;
+			return IS_ERR(tmp) ? tmp : ERR_PTR(-ENOMEM);
 		}
 
 		clone = tmp;
 		insn_delta = rewritten - 1;
 
-		/* Instructions arrays must be updated using absolute xlated offsets */
-		adjust_insn_arrays(clone, prog->aux->subprog_start + i, rewritten);
+		if (env)
+			env->prog = clone;
+		else
+			/*
+			 * Instructions arrays must be updated using absolute xlated offsets.
+			 * The arrays have already been adjusted by bpf_patch_insn_data() when
+			 * env is not NULL.
+			 */
+			adjust_insn_arrays(clone, i, rewritten);
 
 		/* Walk new program and skip insns we just inserted. */
 		insn = clone->insnsi + i + insn_delta;
@@ -2533,6 +2555,35 @@ static bool bpf_prog_select_interpreter(struct bpf_prog *fp)
 	return select_interpreter;
 }
 
+static struct bpf_prog *bpf_prog_jit_compile(struct bpf_prog *prog)
+{
+#ifdef CONFIG_BPF_JIT
+	struct bpf_prog *orig_prog;
+
+	if (!bpf_prog_need_blind(prog))
+		return bpf_int_jit_compile(prog);
+
+	orig_prog = prog;
+	prog = bpf_jit_blind_constants(NULL, prog);
+	/*
+	 * If blinding was requested and we failed during blinding, we must fall
+	 * back to the interpreter.
+	 */
+	if (IS_ERR(prog))
+		return orig_prog;
+
+	prog = bpf_int_jit_compile(prog);
+	if (prog->jited) {
+		bpf_jit_prog_release_other(prog, orig_prog);
+		return prog;
+	}
+
+	bpf_jit_prog_release_other(orig_prog, prog);
+	prog = orig_prog;
+#endif
+	return prog;
+}
+
 /**
  *	bpf_prog_select_runtime - select exec runtime for BPF program
  *	@fp: bpf_prog populated with BPF program
@@ -2572,7 +2623,7 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 		if (*err)
 			return fp;
 
-		fp = bpf_int_jit_compile(fp);
+		fp = bpf_prog_jit_compile(fp);
 		bpf_prog_jit_attempt_done(fp);
 		if (!fp->jited && jit_needed) {
 			*err = -ENOTSUPP;
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index dd00a680e4ea..721b830b5ef2 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -232,8 +232,8 @@ static void adjust_poke_descs(struct bpf_prog *prog, u32 off, u32 len)
 	}
 }
 
-static struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 off,
-					    const struct bpf_insn *patch, u32 len)
+struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 off,
+				     const struct bpf_insn *patch, u32 len)
 {
 	struct bpf_prog *new_prog;
 	struct bpf_insn_aux_data *new_data = NULL;
@@ -973,7 +973,47 @@ int bpf_convert_ctx_accesses(struct bpf_verifier_env *env)
 	return 0;
 }
 
-int bpf_jit_subprogs(struct bpf_verifier_env *env)
+static u32 *bpf_dup_subprog_starts(struct bpf_verifier_env *env)
+{
+	u32 *starts = NULL;
+
+	starts = kvmalloc_objs(u32, env->subprog_cnt, GFP_KERNEL_ACCOUNT);
+	if (starts) {
+		for (int i = 0; i < env->subprog_cnt; i++)
+			starts[i] = env->subprog_info[i].start;
+	}
+	return starts;
+}
+
+static void bpf_restore_subprog_starts(struct bpf_verifier_env *env, u32 *orig_starts)
+{
+	for (int i = 0; i < env->subprog_cnt; i++)
+		env->subprog_info[i].start = orig_starts[i];
+	/* restore the start of fake 'exit' subprog as well */
+	env->subprog_info[env->subprog_cnt].start = env->prog->len;
+}
+
+static struct bpf_insn_aux_data *bpf_dup_insn_aux_data(struct bpf_verifier_env *env)
+{
+	size_t size;
+	void *new_aux;
+
+	size = array_size(sizeof(struct bpf_insn_aux_data), env->prog->len);
+	new_aux = __vmalloc(size, GFP_KERNEL_ACCOUNT);
+	if (new_aux)
+		memcpy(new_aux, env->insn_aux_data, size);
+	return new_aux;
+}
+
+static void bpf_restore_insn_aux_data(struct bpf_verifier_env *env,
+				      struct bpf_insn_aux_data *orig_insn_aux)
+{
+	/* the expanded elements are zero-filled, so no special handling is required */
+	vfree(env->insn_aux_data);
+	env->insn_aux_data = orig_insn_aux;
+}
+
+static int jit_subprogs(struct bpf_verifier_env *env)
 {
 	struct bpf_prog *prog = env->prog, **func, *tmp;
 	int i, j, subprog_start, subprog_end = 0, len, subprog;
@@ -981,10 +1021,6 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env)
 	struct bpf_insn *insn;
 	void *old_bpf_func;
 	int err, num_exentries;
-	int old_len, subprog_start_adjustment = 0;
-
-	if (env->subprog_cnt <= 1)
-		return 0;
 
 	for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) {
 		if (!bpf_pseudo_func(insn) && !bpf_pseudo_call(insn))
@@ -1053,10 +1089,11 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env)
 			goto out_free;
 		func[i]->is_func = 1;
 		func[i]->sleepable = prog->sleepable;
+		func[i]->blinded = prog->blinded;
 		func[i]->aux->func_idx = i;
 		/* Below members will be freed only at prog->aux */
 		func[i]->aux->btf = prog->aux->btf;
-		func[i]->aux->subprog_start = subprog_start + subprog_start_adjustment;
+		func[i]->aux->subprog_start = subprog_start;
 		func[i]->aux->func_info = prog->aux->func_info;
 		func[i]->aux->func_info_cnt = prog->aux->func_info_cnt;
 		func[i]->aux->poke_tab = prog->aux->poke_tab;
@@ -1113,15 +1150,7 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->token = prog->aux->token;
 		if (!i)
 			func[i]->aux->exception_boundary = env->seen_exception;
-
-		/*
-		 * To properly pass the absolute subprog start to jit
-		 * all instruction adjustments should be accumulated
-		 */
-		old_len = func[i]->len;
 		func[i] = bpf_int_jit_compile(func[i]);
-		subprog_start_adjustment += func[i]->len - old_len;
-
 		if (!func[i]->jited) {
 			err = -ENOTSUPP;
 			goto out_free;
@@ -1247,16 +1276,87 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env)
 	}
 	kfree(func);
 out_undo_insn:
+	bpf_prog_jit_attempt_done(prog);
+	return err;
+}
+
+int bpf_jit_subprogs(struct bpf_verifier_env *env)
+{
+	int err, i;
+	bool blinded = false;
+	struct bpf_insn *insn;
+	struct bpf_prog *prog, *orig_prog;
+	struct bpf_insn_aux_data *orig_insn_aux;
+	u32 *orig_subprog_starts;
+
+	if (env->subprog_cnt <= 1)
+		return 0;
+
+	prog = orig_prog = env->prog;
+	if (bpf_prog_need_blind(prog)) {
+		orig_insn_aux = bpf_dup_insn_aux_data(env);
+		if (!orig_insn_aux) {
+			err = -ENOMEM;
+			goto out_cleanup;
+		}
+		orig_subprog_starts = bpf_dup_subprog_starts(env);
+		if (!orig_subprog_starts) {
+			vfree(orig_insn_aux);
+			err = -ENOMEM;
+			goto out_cleanup;
+		}
+		prog = bpf_jit_blind_constants(env, prog);
+		if (IS_ERR(prog)) {
+			err = -ENOMEM;
+			prog = orig_prog;
+			goto out_restore;
+		}
+		blinded = true;
+	}
+
+	err = jit_subprogs(env);
+	if (err)
+		goto out_jit_err;
+
+	if (blinded) {
+		bpf_jit_prog_release_other(prog, orig_prog);
+		kvfree(orig_subprog_starts);
+		vfree(orig_insn_aux);
+	}
+
+	return 0;
+
+out_jit_err:
+	if (blinded) {
+		bpf_jit_prog_release_other(orig_prog, prog);
+		/* roll back to the clean original prog */
+		prog = env->prog = orig_prog;
+		goto out_restore;
+	} else {
+		if (err != -EFAULT) {
+			/*
+			 * We will fall back to interpreter mode when err is not -EFAULT, before
+			 * that, insn->off and insn->imm should be restored to their original
+			 * values since they were modified by jit_subprogs.
+			 */
+			for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) {
+				if (!bpf_pseudo_call(insn))
+					continue;
+				insn->off = 0;
+				insn->imm = env->insn_aux_data[i].call_imm;
+			}
+		}
+		goto out_cleanup;
+	}
+
+out_restore:
+	bpf_restore_subprog_starts(env, orig_subprog_starts);
+	bpf_restore_insn_aux_data(env, orig_insn_aux);
+	kvfree(orig_subprog_starts);
+out_cleanup:
 	/* cleanup main prog to be interpreted */
 	prog->jit_requested = 0;
 	prog->blinding_requested = 0;
-	for (i = 0, insn = prog->insnsi; i < prog->len; i++, insn++) {
-		if (!bpf_pseudo_call(insn))
-			continue;
-		insn->off = 0;
-		insn->imm = env->insn_aux_data[i].call_imm;
-	}
-	bpf_prog_jit_attempt_done(prog);
 	return err;
 }
 
-- 
2.43.0




* [PATCH bpf v15 5/5] bpf, arm64: Emit BTI for indirect jump target
From: Xu Kuohai @ 2026-04-16  6:43 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Yonghong Song, Puranjay Mohan, Anton Protopopov,
	Alexis Lothoré, Shahab Vahedi, Russell King, Tiezhu Yang,
	Hengqi Chen, Johan Almbladh, Paul Burton, Hari Bathini,
	Christophe Leroy, Naveen N Rao, Luke Nelson, Xi Wang,
	Björn Töpel, Pu Lehui, Ilya Leoshkevich, Heiko Carstens,
	Vasily Gorbik, David S . Miller, Wang YanQing
In-Reply-To: <20260416064341.151802-1-xukuohai@huaweicloud.com>

From: Xu Kuohai <xukuohai@huawei.com>

On CPUs that support BTI, the indirect jump selftest triggers a kernel
panic because there are no BTI instructions at the indirect jump targets.

Fix it by emitting a BTI instruction for each indirect jump target.

For reference, below is a sample panic log.

Internal error: Oops - BTI: 0000000036000003 [#1]  SMP
...
Call trace:
 bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x54/0xf8 (P)
 bpf_prog_run_pin_on_cpu+0x140/0x468
 bpf_prog_test_run_syscall+0x280/0x3b8
 bpf_prog_test_run+0x22c/0x2c0

Fixes: f4a66cf1cb14 ("bpf: arm64: Add support for indirect jumps")
Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> # v12
Acked-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
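Note for readers (not part of the commit message): the fix can be
illustrated with a small stand-alone sketch. It assumes a toy context and
a plain boolean array marking indirect jump targets. The names toy_ctx,
toy_emit() and toy_build_body() are hypothetical and are not the kernel's
struct jit_ctx, emit_bti() or build_body(). Only the two instruction
encodings (BTI j and NOP) are real AArch64 values. As in the patch, the
recorded offset of an instruction points at the BTI landing pad, so an
indirect branch to that offset lands on "BTI j" first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* AArch64 encoding of "BTI j" (a HINT-space instruction). */
#define A64_BTI_J_ENC 0xd503249fu
/* AArch64 NOP, used here as a stand-in for one translated instruction. */
#define A64_NOP_ENC   0xd503201fu

/* Hypothetical, simplified JIT context; not the kernel's struct jit_ctx. */
struct toy_ctx {
	uint32_t *image; /* output buffer, or NULL for a sizing-only pass */
	size_t cap;      /* capacity of image in 32-bit words */
	size_t idx;      /* number of 32-bit instructions emitted so far */
};

/* Append one instruction; in a sizing pass only the counter advances. */
static void toy_emit(struct toy_ctx *ctx, uint32_t insn)
{
	if (ctx->image) {
		assert(ctx->idx < ctx->cap);
		ctx->image[ctx->idx] = insn;
	}
	ctx->idx++;
}

/*
 * Translate n source instructions. offset[i] is recorded before the
 * optional landing pad, so a jump to instruction i hits "BTI j" first.
 */
static void toy_build_body(struct toy_ctx *ctx, const bool *is_indirect_target,
			   size_t n, size_t *offset)
{
	for (size_t i = 0; i < n; i++) {
		offset[i] = ctx->idx;
		if (is_indirect_target[i])
			toy_emit(ctx, A64_BTI_J_ENC);
		toy_emit(ctx, A64_NOP_ENC);
	}
}
```

Running the sizing pass before the writing pass mirrors the usual
multi-pass JIT structure: the landing pads change the image size, so they
have to be emitted consistently in every pass.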
 arch/arm64/net/bpf_jit_comp.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index bd8757952507..0816c40fc7af 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1197,8 +1197,8 @@ static int add_exception_handler(const struct bpf_insn *insn,
  * >0 - successfully JITed a 16-byte eBPF instruction.
  * <0 - failed to JIT.
  */
-static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
-		      bool extra_pass)
+static int build_insn(const struct bpf_verifier_env *env, const struct bpf_insn *insn,
+		      struct jit_ctx *ctx, bool extra_pass)
 {
 	const u8 code = insn->code;
 	u8 dst = bpf2a64[insn->dst_reg];
@@ -1223,6 +1223,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 	int ret;
 	bool sign_extend;
 
+	if (bpf_insn_is_indirect_target(env, ctx->prog, i))
+		emit_bti(A64_BTI_J, ctx);
+
 	switch (code) {
 	/* dst = src */
 	case BPF_ALU | BPF_MOV | BPF_X:
@@ -1898,7 +1901,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 	return 0;
 }
 
-static int build_body(struct jit_ctx *ctx, bool extra_pass)
+static int build_body(struct bpf_verifier_env *env, struct jit_ctx *ctx, bool extra_pass)
 {
 	const struct bpf_prog *prog = ctx->prog;
 	int i;
@@ -1917,7 +1920,7 @@ static int build_body(struct jit_ctx *ctx, bool extra_pass)
 		int ret;
 
 		ctx->offset[i] = ctx->idx;
-		ret = build_insn(insn, ctx, extra_pass);
+		ret = build_insn(env, insn, ctx, extra_pass);
 		if (ret > 0) {
 			i++;
 			ctx->offset[i] = ctx->idx;
@@ -2073,7 +2076,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
 	if (build_prologue(&ctx, was_classic))
 		goto out_off;
 
-	if (build_body(&ctx, extra_pass))
+	if (build_body(env, &ctx, extra_pass))
 		goto out_off;
 
 	ctx.epilogue_offset = ctx.idx;
@@ -2121,7 +2124,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
 	/* Dont write body instructions to memory for now */
 	ctx.write = false;
 
-	if (build_body(&ctx, extra_pass))
+	if (build_body(env, &ctx, extra_pass))
 		goto out_free_hdr;
 
 	ctx.epilogue_offset = ctx.idx;
@@ -2130,7 +2133,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
 	ctx.write = true;
 
 	/* Pass 3: Adjust jump offset and write final image */
-	if (build_body(&ctx, extra_pass) ||
+	if (build_body(env, &ctx, extra_pass) ||
 		WARN_ON_ONCE(ctx.idx != ctx.epilogue_offset))
 		goto out_free_hdr;
 
-- 
2.43.0



^ permalink raw reply related

* [PATCH bpf v15 2/5] bpf: Pass bpf_verifier_env to JIT
From: Xu Kuohai @ 2026-04-16  6:43 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Yonghong Song, Puranjay Mohan, Anton Protopopov,
	Alexis Lothoré, Shahab Vahedi, Russell King, Tiezhu Yang,
	Hengqi Chen, Johan Almbladh, Paul Burton, Hari Bathini,
	Christophe Leroy, Naveen N Rao, Luke Nelson, Xi Wang,
	Björn Töpel, Pu Lehui, Ilya Leoshkevich, Heiko Carstens,
	Vasily Gorbik, David S . Miller, Wang YanQing
In-Reply-To: <20260416064341.151802-1-xukuohai@huaweicloud.com>

From: Xu Kuohai <xukuohai@huawei.com>

Pass bpf_verifier_env to bpf_int_jit_compile(). The follow-up patch will
use env->insn_aux_data in the JIT stage to detect indirect jump targets.

Since bpf_prog_select_runtime() can be called by cbpf and lib/test_bpf.c
code without a verifier, introduce the helper __bpf_prog_select_runtime()
to accept the env parameter.

Remove the call to bpf_prog_select_runtime() in bpf_prog_load(), and
call __bpf_prog_select_runtime() from the verifier instead, passing env.
The original bpf_prog_select_runtime() is preserved for cbpf and
lib/test_bpf.c, where env is NULL.

All constants blinding calls are now in the verifier, except for the
cbpf and lib/test_bpf.c cases. In the normal (verifier) case, the
instruction arrays are adjusted by bpf_patch_insn_data(), so there is no
need to call adjust_insn_arrays() in bpf_jit_blind_constants(). Remove it.

Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> # v12
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # v14
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
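Note for readers (not part of the commit message): the split between the
env-aware helper and the preserved entry point can be sketched as below.
This is a minimal stand-alone model. All toy_* names and struct fields
are hypothetical, and the assumption that the preserved entry point
simply forwards a NULL env is an illustration of the description above,
not a copy of the kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct bpf_verifier_env and struct bpf_prog. */
struct toy_env {
	int placeholder;
};

struct toy_prog {
	int jited;   /* set once the toy JIT has run */
	int saw_env; /* records whether the JIT received a non-NULL env */
};

/* Toy JIT entry point: takes env first, like the new signature. */
static struct toy_prog *toy_jit_compile(struct toy_env *env,
					struct toy_prog *prog)
{
	/* A JIT must tolerate env == NULL (no verifier state available). */
	prog->saw_env = (env != NULL);
	prog->jited = 1;
	return prog;
}

/* Env-aware helper, modelled on the role of __bpf_prog_select_runtime(). */
static struct toy_prog *toy_select_runtime_env(struct toy_env *env,
					       struct toy_prog *prog, int *err)
{
	*err = 0;
	return toy_jit_compile(env, prog);
}

/* Preserved entry point for callers without a verifier: env is NULL. */
static struct toy_prog *toy_select_runtime(struct toy_prog *prog, int *err)
{
	return toy_select_runtime_env(NULL, prog, err);
}
```

Keeping the old entry point as a thin wrapper means callers without a
verifier need no changes, while the verifier path can hand its env to the
JIT.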
 arch/arc/net/bpf_jit_core.c      |  2 +-
 arch/arm/net/bpf_jit_32.c        |  2 +-
 arch/arm64/net/bpf_jit_comp.c    |  2 +-
 arch/loongarch/net/bpf_jit.c     |  2 +-
 arch/mips/net/bpf_jit_comp.c     |  2 +-
 arch/parisc/net/bpf_jit_core.c   |  2 +-
 arch/powerpc/net/bpf_jit_comp.c  |  2 +-
 arch/riscv/net/bpf_jit_core.c    |  2 +-
 arch/s390/net/bpf_jit_comp.c     |  2 +-
 arch/sparc/net/bpf_jit_comp_64.c |  2 +-
 arch/x86/net/bpf_jit_comp.c      |  2 +-
 arch/x86/net/bpf_jit_comp32.c    |  2 +-
 include/linux/filter.h           | 17 ++++++-
 kernel/bpf/core.c                | 86 ++++++++++++++++----------------
 kernel/bpf/fixups.c              | 10 ++--
 kernel/bpf/syscall.c             |  4 --
 kernel/bpf/verifier.c            | 14 +++---
 17 files changed, 84 insertions(+), 71 deletions(-)

diff --git a/arch/arc/net/bpf_jit_core.c b/arch/arc/net/bpf_jit_core.c
index 973ceae48675..639a2736f029 100644
--- a/arch/arc/net/bpf_jit_core.c
+++ b/arch/arc/net/bpf_jit_core.c
@@ -1400,7 +1400,7 @@ static struct bpf_prog *do_extra_pass(struct bpf_prog *prog)
  * (re)locations involved that their addresses are not known
  * during the first run.
  */
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	vm_dump(prog);
 
diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index e6b1bb2de627..1628b6fc70a4 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -2142,7 +2142,7 @@ bool bpf_jit_needs_zext(void)
 	return true;
 }
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	struct bpf_binary_header *header;
 	struct jit_ctx ctx;
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index d310d1c35192..bd8757952507 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -2000,7 +2000,7 @@ struct arm64_jit_data {
 	struct jit_ctx ctx;
 };
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	int image_size, prog_size, extable_size, extable_align, extable_offset;
 	struct bpf_binary_header *header;
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index fcc8c0c29fb0..5149ce4cef7e 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -1920,7 +1920,7 @@ int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
 	return ret < 0 ? ret : ret * LOONGARCH_INSN_SIZE;
 }
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	bool extra_pass = false;
 	u8 *image_ptr, *ro_image_ptr;
diff --git a/arch/mips/net/bpf_jit_comp.c b/arch/mips/net/bpf_jit_comp.c
index d2b6c955f18e..6ee4abe6a1f7 100644
--- a/arch/mips/net/bpf_jit_comp.c
+++ b/arch/mips/net/bpf_jit_comp.c
@@ -909,7 +909,7 @@ bool bpf_jit_needs_zext(void)
 	return true;
 }
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	struct bpf_binary_header *header = NULL;
 	struct jit_context ctx;
diff --git a/arch/parisc/net/bpf_jit_core.c b/arch/parisc/net/bpf_jit_core.c
index 35dca372b5df..172770132440 100644
--- a/arch/parisc/net/bpf_jit_core.c
+++ b/arch/parisc/net/bpf_jit_core.c
@@ -41,7 +41,7 @@ bool bpf_jit_needs_zext(void)
 	return true;
 }
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	unsigned int prog_size = 0, extable_size = 0;
 	bool extra_pass = false;
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 2bae4699e78f..53ab97ad6074 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -162,7 +162,7 @@ static void priv_stack_check_guard(void __percpu *priv_stack_ptr, int alloc_size
 	}
 }
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *fp)
 {
 	u32 proglen;
 	u32 alloclen;
diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c
index 36f0aea8096d..4365d07aaf54 100644
--- a/arch/riscv/net/bpf_jit_core.c
+++ b/arch/riscv/net/bpf_jit_core.c
@@ -41,7 +41,7 @@ bool bpf_jit_needs_zext(void)
 	return true;
 }
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	unsigned int prog_size = 0, extable_size = 0;
 	bool extra_pass = false;
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 2dfc279b1be2..94128fe6be23 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -2312,7 +2312,7 @@ static struct bpf_binary_header *bpf_jit_alloc(struct bpf_jit *jit,
 /*
  * Compile eBPF program "fp"
  */
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *fp)
 {
 	struct bpf_binary_header *header;
 	struct s390_jit_data *jit_data;
diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c
index e83e29137566..2fa0e9375127 100644
--- a/arch/sparc/net/bpf_jit_comp_64.c
+++ b/arch/sparc/net/bpf_jit_comp_64.c
@@ -1477,7 +1477,7 @@ struct sparc64_jit_data {
 	struct jit_ctx ctx;
 };
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	struct sparc64_jit_data *jit_data;
 	struct bpf_binary_header *header;
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 77d00a8dec87..72d9a5faa230 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -3713,7 +3713,7 @@ struct x64_jit_data {
 #define MAX_PASSES 20
 #define PADDING_PASSES (MAX_PASSES - 5)
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	struct bpf_binary_header *rw_header = NULL;
 	struct bpf_binary_header *header = NULL;
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 5f259577614a..852baf2e4db4 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -2518,7 +2518,7 @@ bool bpf_jit_needs_zext(void)
 	return true;
 }
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	struct bpf_binary_header *header = NULL;
 	int proglen, oldproglen = 0;
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 9fa4d4090093..1ec6d5ba64cc 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1108,6 +1108,8 @@ sk_filter_reason(struct sock *sk, struct sk_buff *skb)
 	return sk_filter_trim_cap(sk, skb, 1);
 }
 
+struct bpf_prog *__bpf_prog_select_runtime(struct bpf_verifier_env *env, struct bpf_prog *fp,
+					   int *err);
 struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err);
 void bpf_prog_free(struct bpf_prog *fp);
 
@@ -1153,7 +1155,7 @@ u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
 	((u64 (*)(u64, u64, u64, u64, u64, const struct bpf_insn *)) \
 	 (void *)__bpf_call_base)
 
-struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
+struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog);
 void bpf_jit_compile(struct bpf_prog *prog);
 bool bpf_jit_needs_zext(void);
 bool bpf_jit_inlines_helper_call(s32 imm);
@@ -1188,12 +1190,25 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
 #ifdef CONFIG_BPF_SYSCALL
 struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 off,
 				     const struct bpf_insn *patch, u32 len);
+struct bpf_insn_aux_data *bpf_dup_insn_aux_data(struct bpf_verifier_env *env);
+void bpf_restore_insn_aux_data(struct bpf_verifier_env *env,
+			       struct bpf_insn_aux_data *orig_insn_aux);
 #else
 static inline struct bpf_prog *bpf_patch_insn_data(struct bpf_verifier_env *env, u32 off,
 						   const struct bpf_insn *patch, u32 len)
 {
 	return ERR_PTR(-ENOTSUPP);
 }
+
+static inline struct bpf_insn_aux_data *bpf_dup_insn_aux_data(struct bpf_verifier_env *env)
+{
+	return NULL;
+}
+
+static inline void bpf_restore_insn_aux_data(struct bpf_verifier_env *env,
+					     struct bpf_insn_aux_data *orig_insn_aux)
+{
+}
 #endif /* CONFIG_BPF_SYSCALL */
 
 int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt);
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index fc9fb3c07866..79361aa11757 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1491,23 +1491,6 @@ void bpf_jit_prog_release_other(struct bpf_prog *fp, struct bpf_prog *fp_other)
 	bpf_prog_clone_free(fp_other);
 }
 
-static void adjust_insn_arrays(struct bpf_prog *prog, u32 off, u32 len)
-{
-#ifdef CONFIG_BPF_SYSCALL
-	struct bpf_map *map;
-	int i;
-
-	if (len <= 1)
-		return;
-
-	for (i = 0; i < prog->aux->used_map_cnt; i++) {
-		map = prog->aux->used_maps[i];
-		if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY)
-			bpf_insn_array_adjust(map, off, len);
-	}
-#endif
-}
-
 /*
  * Now this function is used only to blind the main prog and must be invoked only when
  * bpf_prog_need_blind() returns true.
@@ -1580,13 +1563,6 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_verifier_env *env, struct bp
 
 		if (env)
 			env->prog = clone;
-		else
-			/*
-			 * Instructions arrays must be updated using absolute xlated offsets.
-			 * The arrays have already been adjusted by bpf_patch_insn_data() when
-			 * env is not NULL.
-			 */
-			adjust_insn_arrays(clone, i, rewritten);
 
 		/* Walk new program and skip insns we just inserted. */
 		insn = clone->insnsi + i + insn_delta;
@@ -2555,47 +2531,55 @@ static bool bpf_prog_select_interpreter(struct bpf_prog *fp)
 	return select_interpreter;
 }
 
-static struct bpf_prog *bpf_prog_jit_compile(struct bpf_prog *prog)
+static struct bpf_prog *bpf_prog_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 #ifdef CONFIG_BPF_JIT
 	struct bpf_prog *orig_prog;
+	struct bpf_insn_aux_data *orig_insn_aux;
 
 	if (!bpf_prog_need_blind(prog))
-		return bpf_int_jit_compile(prog);
+		return bpf_int_jit_compile(env, prog);
+
+	if (env) {
+		/*
+		 * If env is not NULL, we are called from the end of bpf_check(), at this
+		 * point, only insn_aux_data is used after failure, so it should be restored
+		 * on failure.
+		 */
+		orig_insn_aux = bpf_dup_insn_aux_data(env);
+		if (!orig_insn_aux)
+			return prog;
+	}
 
 	orig_prog = prog;
-	prog = bpf_jit_blind_constants(NULL, prog);
+	prog = bpf_jit_blind_constants(env, prog);
 	/*
 	 * If blinding was requested and we failed during blinding, we must fall
 	 * back to the interpreter.
 	 */
 	if (IS_ERR(prog))
-		return orig_prog;
+		goto out_restore;
 
-	prog = bpf_int_jit_compile(prog);
+	prog = bpf_int_jit_compile(env, prog);
 	if (prog->jited) {
 		bpf_jit_prog_release_other(prog, orig_prog);
+		if (env)
+			vfree(orig_insn_aux);
 		return prog;
 	}
 
 	bpf_jit_prog_release_other(orig_prog, prog);
+
+out_restore:
 	prog = orig_prog;
+	if (env)
+		bpf_restore_insn_aux_data(env, orig_insn_aux);
 #endif
 	return prog;
 }
 
-/**
- *	bpf_prog_select_runtime - select exec runtime for BPF program
- *	@fp: bpf_prog populated with BPF program
- *	@err: pointer to error variable
- *
- * Try to JIT eBPF program, if JIT is not available, use interpreter.
- * The BPF program will be executed via bpf_prog_run() function.
- *
- * Return: the &fp argument along with &err set to 0 for success or
- * a negative errno code on failure
- */
-struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
+struct bpf_prog *__bpf_prog_select_runtime(struct bpf_verifier_env *env, struct bpf_prog *fp,
+					   int *err)
 {
 	/* In case of BPF to BPF calls, verifier did all the prep
 	 * work with regards to JITing, etc.
@@ -2623,7 +2607,7 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 		if (*err)
 			return fp;
 
-		fp = bpf_prog_jit_compile(fp);
+		fp = bpf_prog_jit_compile(env, fp);
 		bpf_prog_jit_attempt_done(fp);
 		if (!fp->jited && jit_needed) {
 			*err = -ENOTSUPP;
@@ -2649,6 +2633,22 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 
 	return fp;
 }
+
+/**
+ *	bpf_prog_select_runtime - select exec runtime for BPF program
+ *	@fp: bpf_prog populated with BPF program
+ *	@err: pointer to error variable
+ *
+ * Try to JIT eBPF program, if JIT is not available, use interpreter.
+ * The BPF program will be executed via bpf_prog_run() function.
+ *
+ * Return: the &fp argument along with &err set to 0 for success or
+ * a negative errno code on failure
+ */
+struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
+{
+	return __bpf_prog_select_runtime(NULL, fp, err);
+}
 EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
 
 static unsigned int __bpf_prog_ret1(const void *ctx,
@@ -3136,7 +3136,7 @@ const struct bpf_func_proto bpf_tail_call_proto = {
  * It is encouraged to implement bpf_int_jit_compile() instead, so that
  * eBPF and implicitly also cBPF can get JITed!
  */
-struct bpf_prog * __weak bpf_int_jit_compile(struct bpf_prog *prog)
+struct bpf_prog * __weak bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_prog *prog)
 {
 	return prog;
 }
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index 721b830b5ef2..6c86980cc9e8 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -993,7 +993,7 @@ static void bpf_restore_subprog_starts(struct bpf_verifier_env *env, u32 *orig_s
 	env->subprog_info[env->subprog_cnt].start = env->prog->len;
 }
 
-static struct bpf_insn_aux_data *bpf_dup_insn_aux_data(struct bpf_verifier_env *env)
+struct bpf_insn_aux_data *bpf_dup_insn_aux_data(struct bpf_verifier_env *env)
 {
 	size_t size;
 	void *new_aux;
@@ -1005,8 +1005,8 @@ static struct bpf_insn_aux_data *bpf_dup_insn_aux_data(struct bpf_verifier_env *
 	return new_aux;
 }
 
-static void bpf_restore_insn_aux_data(struct bpf_verifier_env *env,
-				      struct bpf_insn_aux_data *orig_insn_aux)
+void bpf_restore_insn_aux_data(struct bpf_verifier_env *env,
+			       struct bpf_insn_aux_data *orig_insn_aux)
 {
 	/* the expanded elements are zero-filled, so no special handling is required */
 	vfree(env->insn_aux_data);
@@ -1150,7 +1150,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 		func[i]->aux->token = prog->aux->token;
 		if (!i)
 			func[i]->aux->exception_boundary = env->seen_exception;
-		func[i] = bpf_int_jit_compile(func[i]);
+		func[i] = bpf_int_jit_compile(env, func[i]);
 		if (!func[i]->jited) {
 			err = -ENOTSUPP;
 			goto out_free;
@@ -1194,7 +1194,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	}
 	for (i = 0; i < env->subprog_cnt; i++) {
 		old_bpf_func = func[i]->bpf_func;
-		tmp = bpf_int_jit_compile(func[i]);
+		tmp = bpf_int_jit_compile(env, func[i]);
 		if (tmp != func[i] || func[i]->bpf_func != old_bpf_func) {
 			verbose(env, "JIT doesn't support bpf-to-bpf calls\n");
 			err = -ENOTSUPP;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b73b25c63073..a3c0214ca934 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3083,10 +3083,6 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
 	if (err < 0)
 		goto free_used_maps;
 
-	prog = bpf_prog_select_runtime(prog, &err);
-	if (err < 0)
-		goto free_used_maps;
-
 	err = bpf_prog_mark_insn_arrays_ready(prog);
 	if (err < 0)
 		goto free_used_maps;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9e4980128151..e804e0da3500 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20155,6 +20155,14 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 
 	adjust_btf_func(env);
 
+	/* extension progs temporarily inherit the attach_type of their targets
+	   for verification purposes, so set it back to zero before returning
+	 */
+	if (env->prog->type == BPF_PROG_TYPE_EXT)
+		env->prog->expected_attach_type = 0;
+
+	env->prog = __bpf_prog_select_runtime(env, env->prog, &ret);
+
 err_release_maps:
 	if (ret)
 		release_insn_arrays(env);
@@ -20166,12 +20174,6 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (!env->prog->aux->used_btfs)
 		release_btfs(env);
 
-	/* extension progs temporarily inherit the attach_type of their targets
-	   for verification purposes, so set it back to zero before returning
-	 */
-	if (env->prog->type == BPF_PROG_TYPE_EXT)
-		env->prog->expected_attach_type = 0;
-
 	*prog = env->prog;
 
 	module_put(env->attach_btf_mod);
-- 
2.43.0



^ permalink raw reply related

* [PATCH bpf v15 0/5] emit ENDBR/BTI instructions for indirect
From: Xu Kuohai @ 2026-04-16  6:43 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Yonghong Song, Puranjay Mohan, Anton Protopopov,
	Alexis Lothoré, Shahab Vahedi, Russell King, Tiezhu Yang,
	Hengqi Chen, Johan Almbladh, Paul Burton, Hari Bathini,
	Christophe Leroy, Naveen N Rao, Luke Nelson, Xi Wang,
	Björn Töpel, Pu Lehui, Ilya Leoshkevich, Heiko Carstens,
	Vasily Gorbik, David S . Miller, Wang YanQing

On architectures with CFI protection enabled that require landing pad
instructions at indirect jump targets, such as x86 with CET/IBT enabled
and arm64 with BTI enabled, the kernel panics when an indirect jump lands
on a target without a landing pad. Therefore, the JIT must emit landing
pad instructions for indirect jump targets.

The verifier already recognizes which instructions are indirect jump
targets during the verification phase. So we can store this information
in env->insn_aux_data and pass it to the JIT as a new parameter, allowing
the JIT to consult env->insn_aux_data to determine which instructions are
indirect jump targets.

During JIT, constants blinding is performed. It rewrites the private copy
of instructions for the JITed program, but it does not adjust the global
env->insn_aux_data array. As a result, after constants blinding, the
instruction indexes used by JIT may no longer match the indexes in
env->insn_aux_data, so the JIT cannot use env->insn_aux_data directly.
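
As a rough illustration of the index mismatch described above, here is a
small userspace sketch. It is not kernel code: toy_shifted_index() and its
parameters are hypothetical names, and it only models the case where one
instruction is expanded into several, as a blinding rewrite would do.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model, not kernel code: returns the index at which an original
 * instruction ends up after the instruction at patch_off has been
 * expanded into patch_len instructions.
 */
static size_t toy_shifted_index(size_t orig_idx, size_t patch_off,
				size_t patch_len)
{
	assert(patch_len >= 1);
	if (orig_idx <= patch_off)
		return orig_idx;		/* at or before the patch: unchanged */
	return orig_idx + (patch_len - 1);	/* after the patch: pushed back */
}
```

With the instruction at index 1 expanded into three, the original
instruction 4 sits at index 6 in the rewritten copy, while an unadjusted
aux array would still describe it at index 4.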

To avoid this mismatch, and given that all existing arch-specific JITs
already implement constants blinding with largely duplicated code, move
constants blinding from JIT to generic code.

v15:
- Rebase and target bpf tree
- Restore subprog_start of the fake 'exit' subprog on failure
- Fix wrong function name used in comment

v14: https://lore.kernel.org/all/cover.1776062885.git.xukuohai@hotmail.com/
- Rebase
- Fix comment style
- Fix incorrect variable and function name used in commit message

v13: https://lore.kernel.org/bpf/20260411133847.1042658-1-xukuohai@huaweicloud.com
- Use vmalloc to allocate memory for insn_aux_data copies to match with vfree
- Do not free the copied memory of insn_aux_data when restoring from failure
- Code cleanup

v12: https://lore.kernel.org/bpf/20260403132811.753894-1-xukuohai@huaweicloud.com
- Restore env->insn_aux_data on JIT failure
- Fix incorrect error code sign (-EFAULT vs EFAULT)
- Fix incorrect prog used in the restore path

v11: https://lore.kernel.org/bpf/20260403090915.473493-1-xukuohai@huaweicloud.com
- Restore env->subprog_info after jit_subprogs() fails
- Clear prog->jit_requested and prog->blinding_requested on failure
- Use the actual env->insn_aux_data size in clear_insn_aux_data() on failure 

v10: https://lore.kernel.org/bpf/20260324122052.342751-1-xukuohai@huaweicloud.com
- Fix the incorrect call_imm restore in jit_subprogs 
- Define a dummy void version of bpf_jit_prog_release_other and
  bpf_patch_insn_data when the corresponding config is not set
- Remove the unnecessary #ifdef in x86_64 JIT (Leon Hwang)

v9: https://lore.kernel.org/bpf/20260312170255.3427799-1-xukuohai@huaweicloud.com
- Make constant blinding available for classic bpf (Eduard)
- Clear prog->bpf_func, prog->jited ... on the error path of extra pass (Eduard)
- Fix spelling errors and remove unused parameter (Anton Protopopov)

v8: https://lore.kernel.org/bpf/20260309140044.2652538-1-xukuohai@huaweicloud.com
- Define void bpf_jit_blind_constants() function when CONFIG_BPF_JIT is not set 
- Move indirect_target fixup for insn patching from bpf_jit_blind_constants()
  to adjust_insn_aux_data()

v7: https://lore.kernel.org/bpf/20260307103949.2340104-1-xukuohai@huaweicloud.com
- Move constants blinding logic back to bpf/core.c
- Compute ip address before switch statement in x86 JIT
- Clear JIT state from error path on arm64 and loongarch 

v6: https://lore.kernel.org/bpf/20260306102329.2056216-1-xukuohai@huaweicloud.com
- Move constants blinding from JIT to verifier
- Move call to bpf_prog_select_runtime from bpf_prog_load to verifier

v5: https://lore.kernel.org/bpf/20260302102726.1126019-1-xukuohai@huaweicloud.com
- Switch to pass env to JIT directly to get rid of copying private insn_aux_data for
  each prog

v4: https://lore.kernel.org/all/20260114093914.2403982-1-xukuohai@huaweicloud.com
- Switch to the approach proposed by Eduard, using insn_aux_data to identify indirect
  jump targets, and emit ENDBR on x86

v3: https://lore.kernel.org/bpf/20251227081033.240336-1-xukuohai@huaweicloud.com
- Get rid of unnecessary enum definition (Yonghong Song, Anton Protopopov)

v2: https://lore.kernel.org/bpf/20251223085447.139301-1-xukuohai@huaweicloud.com
- Exclude instruction arrays not used for indirect jumps (Anton Protopopov)

v1: https://lore.kernel.org/bpf/20251127140318.3944249-1-xukuohai@huaweicloud.com

Xu Kuohai (5):
  bpf: Move constants blinding out of arch-specific JITs
  bpf: Pass bpf_verifier_env to JIT
  bpf: Add helper to detect indirect jump targets
  bpf, x86: Emit ENDBR for indirect jump targets
  bpf, arm64: Emit BTI for indirect jump target

 arch/arc/net/bpf_jit_core.c      |  41 +++-----
 arch/arm/net/bpf_jit_32.c        |  43 ++------
 arch/arm64/net/bpf_jit_comp.c    |  87 ++++++-----------
 arch/loongarch/net/bpf_jit.c     |  61 ++++--------
 arch/mips/net/bpf_jit_comp.c     |  22 +----
 arch/parisc/net/bpf_jit_core.c   |  75 ++++++--------
 arch/powerpc/net/bpf_jit_comp.c  |  74 ++++++--------
 arch/riscv/net/bpf_jit_core.c    |  63 +++++-------
 arch/s390/net/bpf_jit_comp.c     |  61 ++++--------
 arch/sparc/net/bpf_jit_comp_64.c |  63 +++++-------
 arch/x86/net/bpf_jit_comp.c      |  73 +++++---------
 arch/x86/net/bpf_jit_comp32.c    |  35 +------
 include/linux/bpf.h              |   2 +
 include/linux/bpf_verifier.h     |   9 +-
 include/linux/filter.h           |  50 +++++++++-
 kernel/bpf/core.c                | 138 ++++++++++++++++++--------
 kernel/bpf/fixups.c              | 162 ++++++++++++++++++++++++++-----
 kernel/bpf/syscall.c             |   4 -
 kernel/bpf/verifier.c            |  21 ++--
 19 files changed, 529 insertions(+), 555 deletions(-)

-- 
2.43.0



^ permalink raw reply

* [PATCH bpf v15 3/5] bpf: Add helper to detect indirect jump targets
From: Xu Kuohai @ 2026-04-16  6:43 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Yonghong Song, Puranjay Mohan, Anton Protopopov,
	Alexis Lothoré, Shahab Vahedi, Russell King, Tiezhu Yang,
	Hengqi Chen, Johan Almbladh, Paul Burton, Hari Bathini,
	Christophe Leroy, Naveen N Rao, Luke Nelson, Xi Wang,
	Björn Töpel, Pu Lehui, Ilya Leoshkevich, Heiko Carstens,
	Vasily Gorbik, David S . Miller, Wang YanQing
In-Reply-To: <20260416064341.151802-1-xukuohai@huaweicloud.com>

From: Xu Kuohai <xukuohai@huawei.com>

Introduce the helper bpf_insn_is_indirect_target() to check whether a BPF
instruction is an indirect jump target.

Since the verifier knows which instructions are indirect jump targets,
add a new flag indirect_target to struct bpf_insn_aux_data to mark
them. The verifier sets this flag when verifying an indirect jump target
instruction, and the helper checks the flag to determine whether an
instruction is an indirect jump target.
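
For readers who want to see the flag handling in isolation, the following
is a userspace sketch under simplifying assumptions. struct toy_aux,
toy_adjust_aux() and toy_is_indirect_target() are hypothetical stand-ins
that only loosely model how the aux array is adjusted when one instruction
is patched into several, and how the flag is looked up with a
subprog-relative index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the one flag of interest. */
struct toy_aux {
	bool indirect_target;
};

/*
 * Toy model: aux[] holds old_len valid entries and has room for
 * old_len + cnt - 1. The instruction at off is replaced by cnt
 * instructions.
 */
static void toy_adjust_aux(struct toy_aux *aux, size_t old_len,
			   size_t off, size_t cnt)
{
	assert(cnt >= 1 && off < old_len);
	if (cnt == 1)
		return;
	/* Shift the old entry at off, and everything after it, by cnt - 1. */
	memmove(aux + off + cnt - 1, aux + off,
		(old_len - off) * sizeof(*aux));
	/* Zero the freshly opened slots [off, off + cnt - 1). */
	memset(aux + off, 0, (cnt - 1) * sizeof(*aux));
	/* A jump still lands on the first new insn, so move the flag there. */
	if (aux[off + cnt - 1].indirect_target) {
		aux[off].indirect_target = true;
		aux[off + cnt - 1].indirect_target = false;
	}
}

/* Lookup by subprog-local index plus subprog start; no env means false. */
static bool toy_is_indirect_target(const struct toy_aux *aux,
				   size_t subprog_start, size_t insn_idx)
{
	return aux ? aux[subprog_start + insn_idx].indirect_target : false;
}
```

Starting from four entries with the flag set on entries 1 and 3, patching
entry 1 into three instructions leaves the flag on entry 1 (the first new
instruction) and on entry 5 (the old entry 3).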

Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> # v12
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 include/linux/bpf.h          |  2 ++
 include/linux/bpf_verifier.h |  9 +++++----
 kernel/bpf/core.c            |  9 +++++++++
 kernel/bpf/fixups.c          | 12 ++++++++++++
 kernel/bpf/verifier.c        |  7 +++++++
 5 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 0136a108d083..b4b703c90ca9 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1541,6 +1541,8 @@ bool bpf_has_frame_pointer(unsigned long ip);
 int bpf_jit_charge_modmem(u32 size);
 void bpf_jit_uncharge_modmem(u32 size);
 bool bpf_prog_has_trampoline(const struct bpf_prog *prog);
+bool bpf_insn_is_indirect_target(const struct bpf_verifier_env *env, const struct bpf_prog *prog,
+				 int insn_idx);
 #else
 static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link,
 					   struct bpf_trampoline *tr,
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 53e8664cb566..b148f816f25b 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -630,16 +630,17 @@ struct bpf_insn_aux_data {
 
 	/* below fields are initialized once */
 	unsigned int orig_idx; /* original instruction index */
-	bool jmp_point;
-	bool prune_point;
+	u32 jmp_point:1;
+	u32 prune_point:1;
 	/* ensure we check state equivalence and save state checkpoint and
 	 * this instruction, regardless of any heuristics
 	 */
-	bool force_checkpoint;
+	u32 force_checkpoint:1;
 	/* true if instruction is a call to a helper function that
 	 * accepts callback function as a parameter.
 	 */
-	bool calls_callback;
+	u32 calls_callback:1;
+	u32 indirect_target:1; /* if it is an indirect jump target */
 	/*
 	 * CFG strongly connected component this instruction belongs to,
 	 * zero if it is a singleton SCC.
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 79361aa11757..8b018ff48875 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1573,6 +1573,15 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_verifier_env *env, struct bp
 	clone->blinded = 1;
 	return clone;
 }
+
+bool bpf_insn_is_indirect_target(const struct bpf_verifier_env *env, const struct bpf_prog *prog,
+				 int insn_idx)
+{
+	if (!env)
+		return false;
+	insn_idx += prog->aux->subprog_start;
+	return env->insn_aux_data[insn_idx].indirect_target;
+}
 #endif /* CONFIG_BPF_JIT */
 
 /* Base function for offset calculation. Needs to go into .text section,
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index 6c86980cc9e8..fba9e8c00878 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -183,6 +183,18 @@ static void adjust_insn_aux_data(struct bpf_verifier_env *env,
 		data[i].seen = old_seen;
 		data[i].zext_dst = insn_has_def32(insn + i);
 	}
+
+	/*
+	 * The indirect_target flag of the original instruction was moved to the last of the
+	 * new instructions by the above memmove and memset, but the indirect jump target is
+	 * actually the first instruction, so move it back. This also matches with the behavior
+	 * of bpf_insn_array_adjust(), which preserves xlated_off to point to the first new
+	 * instruction.
+	 */
+	if (data[off + cnt - 1].indirect_target) {
+		data[off].indirect_target = 1;
+		data[off + cnt - 1].indirect_target = 0;
+	}
 }
 
 static void adjust_subprog_starts(struct bpf_verifier_env *env, u32 off, u32 len)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e804e0da3500..1e36b9e91277 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3497,6 +3497,11 @@ static int insn_stack_access_flags(int frameno, int spi)
 	return INSN_F_STACK_ACCESS | (spi << INSN_F_SPI_SHIFT) | frameno;
 }
 
+static void mark_indirect_target(struct bpf_verifier_env *env, int idx)
+{
+	env->insn_aux_data[idx].indirect_target = true;
+}
+
 #define LR_FRAMENO_BITS	3
 #define LR_SPI_BITS	6
 #define LR_ENTRY_BITS	(LR_SPI_BITS + LR_FRAMENO_BITS + 1)
@@ -17545,12 +17550,14 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
 	}
 
 	for (i = 0; i < n - 1; i++) {
+		mark_indirect_target(env, env->gotox_tmp_buf->items[i]);
 		other_branch = push_stack(env, env->gotox_tmp_buf->items[i],
 					  env->insn_idx, env->cur_state->speculative);
 		if (IS_ERR(other_branch))
 			return PTR_ERR(other_branch);
 	}
 	env->insn_idx = env->gotox_tmp_buf->items[n-1];
+	mark_indirect_target(env, env->insn_idx);
 	return INSN_IDX_UPDATED;
 }
 
-- 
2.43.0



^ permalink raw reply related

* [PATCH bpf v15 4/5] bpf, x86: Emit ENDBR for indirect jump targets
From: Xu Kuohai @ 2026-04-16  6:43 UTC (permalink / raw)
  To: bpf, linux-kernel, linux-arm-kernel
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Yonghong Song, Puranjay Mohan, Anton Protopopov,
	Alexis Lothoré, Shahab Vahedi, Russell King, Tiezhu Yang,
	Hengqi Chen, Johan Almbladh, Paul Burton, Hari Bathini,
	Christophe Leroy, Naveen N Rao, Luke Nelson, Xi Wang,
	Björn Töpel, Pu Lehui, Ilya Leoshkevich, Heiko Carstens,
	Vasily Gorbik, David S . Miller, Wang YanQing
In-Reply-To: <20260416064341.151802-1-xukuohai@huaweicloud.com>

From: Xu Kuohai <xukuohai@huawei.com>

On CPUs that support CET/IBT, the indirect jump selftest triggers
a kernel panic because the indirect jump targets lack ENDBR
instructions.

To fix it, emit an ENDBR instruction at each indirect jump target. Since
the ENDBR instruction shifts the position of the original JITed
instructions, fix the instruction address calculation wherever the
addresses are used.
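
The address fix can be pictured with a small userspace sketch.
toy_body_offset() and TOY_ENDBR_LEN are hypothetical names; the only facts
relied on are that ENDBR is emitted as 4 bytes and that the address used
by an instruction must account for bytes already emitted in front of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENDBR_LEN 4	/* ENDBR is emitted as a 4-byte instruction */

/*
 * Toy model: insn_start is the offset in the image at which the JITed
 * code for one BPF instruction begins. If that instruction is an indirect
 * jump target, an ENDBR comes first, so the instruction's own code starts
 * TOY_ENDBR_LEN bytes later.
 */
static size_t toy_body_offset(size_t insn_start, bool indirect_target)
{
	size_t emitted = indirect_target ? TOY_ENDBR_LEN : 0;

	assert(insn_start <= SIZE_MAX - TOY_ENDBR_LEN);
	return insn_start + emitted;
}
```

For an instruction whose JITed code starts at offset 0x97, the body starts
at 0x9b when a landing pad is emitted and stays at 0x97 otherwise.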

For reference, below is a sample panic log.

 Missing ENDBR: bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
 ------------[ cut here ]------------
 kernel BUG at arch/x86/kernel/cet.c:133!
 Oops: invalid opcode: 0000 [#1] SMP NOPTI

 ...

  ? 0xffffffffc00fb258
  ? bpf_prog_2e5f1c71c13ac3e0_big_jump_table+0x97/0xe1
  bpf_prog_test_run_syscall+0x110/0x2f0
  ? fdget+0xba/0xe0
  __sys_bpf+0xe4b/0x2590
  ? __kmalloc_node_track_caller_noprof+0x1c7/0x680
  ? bpf_prog_test_run_syscall+0x215/0x2f0
  __x64_sys_bpf+0x21/0x30
  do_syscall_64+0x85/0x620
  ? bpf_prog_test_run_syscall+0x1e2/0x2f0

Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> # v12
Acked-by: Leon Hwang <leon.hwang@linux.dev>
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
---
 arch/x86/net/bpf_jit_comp.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 72d9a5faa230..ea9e707e8abf 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -58,8 +58,8 @@ static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
 #define EMIT_ENDBR()		EMIT(gen_endbr(), 4)
 #define EMIT_ENDBR_POISON()	EMIT(gen_endbr_poison(), 4)
 #else
-#define EMIT_ENDBR()
-#define EMIT_ENDBR_POISON()
+#define EMIT_ENDBR()		do { } while (0)
+#define EMIT_ENDBR_POISON()	do { } while (0)
 #endif
 
 static bool is_imm8(int value)
@@ -1649,8 +1649,8 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
 	return 0;
 }
 
-static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image,
-		  int oldproglen, struct jit_context *ctx, bool jmp_padding)
+static int do_jit(struct bpf_verifier_env *env, struct bpf_prog *bpf_prog, int *addrs, u8 *image,
+		  u8 *rw_image, int oldproglen, struct jit_context *ctx, bool jmp_padding)
 {
 	bool tail_call_reachable = bpf_prog->aux->tail_call_reachable;
 	struct bpf_insn *insn = bpf_prog->insnsi;
@@ -1663,7 +1663,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 	void __percpu *priv_stack_ptr;
 	int i, excnt = 0;
 	int ilen, proglen = 0;
-	u8 *prog = temp;
+	u8 *ip, *prog = temp;
 	u32 stack_depth;
 	int err;
 
@@ -1734,6 +1734,11 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image
 				dst_reg = X86_REG_R9;
 		}
 
+		if (bpf_insn_is_indirect_target(env, bpf_prog, i - 1))
+			EMIT_ENDBR();
+
+		ip = image + addrs[i - 1] + (prog - temp);
+
 		switch (insn->code) {
 			/* ALU */
 		case BPF_ALU | BPF_ADD | BPF_X:
@@ -2440,8 +2445,6 @@ st:			if (is_imm8(insn->off))
 
 			/* call */
 		case BPF_JMP | BPF_CALL: {
-			u8 *ip = image + addrs[i - 1];
-
 			func = (u8 *) __bpf_call_base + imm32;
 			if (src_reg == BPF_PSEUDO_CALL && tail_call_reachable) {
 				LOAD_TAIL_CALL_CNT_PTR(stack_depth);
@@ -2465,7 +2468,8 @@ st:			if (is_imm8(insn->off))
 			if (imm32)
 				emit_bpf_tail_call_direct(bpf_prog,
 							  &bpf_prog->aux->poke_tab[imm32 - 1],
-							  &prog, image + addrs[i - 1],
+							  &prog,
+							  ip,
 							  callee_regs_used,
 							  stack_depth,
 							  ctx);
@@ -2474,7 +2478,7 @@ st:			if (is_imm8(insn->off))
 							    &prog,
 							    callee_regs_used,
 							    stack_depth,
-							    image + addrs[i - 1],
+							    ip,
 							    ctx);
 			break;
 
@@ -2639,7 +2643,7 @@ st:			if (is_imm8(insn->off))
 			break;
 
 		case BPF_JMP | BPF_JA | BPF_X:
-			emit_indirect_jump(&prog, insn->dst_reg, image + addrs[i - 1]);
+			emit_indirect_jump(&prog, insn->dst_reg, ip);
 			break;
 		case BPF_JMP | BPF_JA:
 		case BPF_JMP32 | BPF_JA:
@@ -2729,8 +2733,6 @@ st:			if (is_imm8(insn->off))
 			ctx->cleanup_addr = proglen;
 			if (bpf_prog_was_classic(bpf_prog) &&
 			    !ns_capable_noaudit(&init_user_ns, CAP_SYS_ADMIN)) {
-				u8 *ip = image + addrs[i - 1];
-
 				if (emit_spectre_bhb_barrier(&prog, ip, bpf_prog))
 					return -EINVAL;
 			}
@@ -3791,7 +3793,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_verifier_env *env, struct bpf_pr
 	for (pass = 0; pass < MAX_PASSES || image; pass++) {
 		if (!padding && pass >= PADDING_PASSES)
 			padding = true;
-		proglen = do_jit(prog, addrs, image, rw_image, oldproglen, &ctx, padding);
+		proglen = do_jit(env, prog, addrs, image, rw_image, oldproglen, &ctx, padding);
 		if (proglen <= 0) {
 out_image:
 			image = NULL;
-- 
2.43.0
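
For reference, a minimal sketch of the address computation the hunk above
hoists out of the individual cases. It assumes, as the surrounding do_jit()
code suggests, that addrs[i - 1] is the image offset where instruction i
starts and that temp is a per-instruction scratch buffer, so (prog - temp)
counts the bytes already emitted for this instruction (for example a 4-byte
ENDBR). The helper name next_emit_ip() is hypothetical and is not part of the
patch.

```c
#include <stddef.h>

typedef unsigned char u8;

/*
 * Hypothetical helper, not part of the patch: address at which the next
 * byte of instruction i will land in the final image.
 *
 * image: start of the final JIT image
 * addrs: addrs[i - 1] is assumed to be the image offset where insn i begins
 * temp:  start of the per-instruction scratch buffer
 * prog:  current write position inside that scratch buffer
 */
static u8 *next_emit_ip(u8 *image, const int *addrs, int i,
			const u8 *temp, const u8 *prog)
{
	/* Bytes already emitted for insn i shift the address forward. */
	return image + addrs[i - 1] + (prog - temp);
}
```

With nothing emitted yet this reduces to the old image + addrs[i - 1]; after a
4-byte ENDBR it points 4 bytes further, which is the case the old expression
did not cover.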




* Re: [PATCH v5 04/12] coresight: etm4x: exclude ss_status from drvdata->config
From: Yeoreum Yun @ 2026-04-16  6:54 UTC (permalink / raw)
  To: Jie Gan
  Cc: coresight, linux-arm-kernel, linux-kernel, suzuki.poulose,
	mike.leach, james.clark, alexander.shishkin, leo.yan
In-Reply-To: <778a826b-918d-4f7c-95a9-1cdb013618d8@oss.qualcomm.com>

Hi Jie,

>
>
> On 4/16/2026 12:55 AM, Yeoreum Yun wrote:
> > The purpose of TRCSSCSRn register is to show status of
> > the corresponding Single-shot Comparator Control and input supports.
> > That means writable field's purpose for reset or restore from idle status
> > not for configuration.
> >
> > Therefore, exclude ss_status from drvdata->config, move it to etm4x_caps
> > and rename it to ss_smp.
> >
> > This includes remove TRCSSCRn from configurable item and
> > remove saving in etm4_disable_hw().
> >
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > ---
> >   .../hwtracing/coresight/coresight-etm4x-cfg.c |  1 -
> >   .../coresight/coresight-etm4x-core.c          | 19 ++++++-------------
> >   .../coresight/coresight-etm4x-sysfs.c         |  7 ++-----
> >   drivers/hwtracing/coresight/coresight-etm4x.h |  7 ++++++-
> >   4 files changed, 14 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-etm4x-cfg.c b/drivers/hwtracing/coresight/coresight-etm4x-cfg.c
> > index c302072b293a..d14d7c8a23e5 100644
> > --- a/drivers/hwtracing/coresight/coresight-etm4x-cfg.c
> > +++ b/drivers/hwtracing/coresight/coresight-etm4x-cfg.c
> > @@ -86,7 +86,6 @@ static int etm4_cfg_map_reg_offset(struct etmv4_drvdata *drvdata,
> >   		off_mask =  (offset & GENMASK(11, 5));
> >   		do {
> >   			CHECKREGIDX(TRCSSCCRn(0), ss_ctrl, idx, off_mask);
> > -			CHECKREGIDX(TRCSSCSRn(0), ss_status, idx, off_mask);
> >   			CHECKREGIDX(TRCSSPCICRn(0), ss_pe_cmp, idx, off_mask);
> >   		} while (0);
> >   	} else if ((offset >= TRCCIDCVRn(0)) && (offset <= TRCVMIDCVRn(7))) {
> > diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > index b2b092a76eb5..f55338a4989d 100644
> > --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > @@ -91,7 +91,7 @@ static bool etm4x_sspcicrn_present(struct etmv4_drvdata *drvdata, int n)
> >   	const struct etmv4_caps *caps = &drvdata->caps;
> >   	return (n < caps->nr_ss_cmp) && caps->nr_pe_cmp &&
> > -	       (drvdata->config.ss_status[n] & TRCSSCSRn_PC);
> > +	       (caps->ss_cmp[n] & TRCSSCSRn_PC);
> >   }
> >   u64 etm4x_sysreg_read(u32 offset, bool _relaxed, bool _64bit)
> > @@ -573,11 +573,9 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
> >   		etm4x_relaxed_write32(csa, config->res_ctrl[i], TRCRSCTLRn(i));
> >   	for (i = 0; i < caps->nr_ss_cmp; i++) {
> > -		/* always clear status bit on restart if using single-shot */
> > -		if (config->ss_ctrl[i] || config->ss_pe_cmp[i])
> > -			config->ss_status[i] &= ~TRCSSCSRn_STATUS;
> >   		etm4x_relaxed_write32(csa, config->ss_ctrl[i], TRCSSCCRn(i));
> > -		etm4x_relaxed_write32(csa, config->ss_status[i], TRCSSCSRn(i));
> > +		/* always clear status and pending bits on restart if using single-shot */
> > +		etm4x_relaxed_write32(csa, 0x0, TRCSSCSRn(i));
> >   		if (etm4x_sspcicrn_present(drvdata, i))
> >   			etm4x_relaxed_write32(csa, config->ss_pe_cmp[i], TRCSSPCICRn(i));
> >   	}
> > @@ -1055,12 +1053,6 @@ static void etm4_disable_hw(struct etmv4_drvdata *drvdata)
> >   	etm4_disable_trace_unit(drvdata);
> > -	/* read the status of the single shot comparators */
> > -	for (i = 0; i < caps->nr_ss_cmp; i++) {
> > -		config->ss_status[i] =
> > -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> > -	}
> > -
> >   	/* read back the current counter values */
> >   	for (i = 0; i < caps->nr_cntr; i++) {
> >   		config->cntr_val[i] =
> > @@ -1503,8 +1495,9 @@ static void etm4_init_arch_data(void *info)
> >   	 */
> >   	caps->nr_ss_cmp = FIELD_GET(TRCIDR4_NUMSSCC_MASK, etmidr4);
> >   	for (i = 0; i < caps->nr_ss_cmp; i++) {
> > -		drvdata->config.ss_status[i] =
> > -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> > +		caps->ss_cmp[i] = etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> > +		caps->ss_cmp[i] &= (TRCSSCSRn_PC | TRCSSCSRn_DV |
> > +				    TRCSSCSRn_DA | TRCSSCSRn_INST);
>
> Just re-go through this patch and had a question here:
>
> I’m not sure whether this new change should be documented in the ABI, given
> that the TRCSSCSRn_STATUS bit is masked. In my opinion, this change breaks
> the existing ABI description.
>
> Description from the ABI document:
>
> What:           /sys/bus/coresight/devices/etm<N>/sshot_status
> Date:           December 2019
> KernelVersion:  5.5
> Contact:        Mathieu Poirier <mathieu.poirier@linaro.org>
> Description:    (Read) Print the current value of the selected single
>                 shot status register.

But, as I mentioned in another thread:
  - https://lore.kernel.org/all/ad5yV2FoNbGGLE6R@e129823.arm.com/

Until now, sysfs hasn't shown the *current value* of the single-shot
status, since config->ss_status is only updated when a sysfs session is
enabled or disabled. And I think that once the session is disabled, the
other status bits (currently the STATUS and PENDING bits) don't have any
meaning.

I think it's enough to change the doc's Description for this.

Any thoughts?

--
Sincerely,
Yeoreum Yun



* Re: [PATCH v5 07/12] coresight: etm4x: fix inconsistencies with sysfs configuration
From: Yeoreum Yun @ 2026-04-16  6:49 UTC (permalink / raw)
  To: Jie Gan
  Cc: coresight, linux-arm-kernel, linux-kernel, suzuki.poulose,
	mike.leach, james.clark, alexander.shishkin, leo.yan
In-Reply-To: <b9528488-b0c7-410b-b91b-b05c21fd0c08@oss.qualcomm.com>

Hi Jie,
>
>
> On 4/16/2026 12:55 AM, Yeoreum Yun wrote:
> > The current ETM4x configuration via sysfs can lead to
> > several inconsistencies:
> >
> >    - If the configuration is modified via sysfs while a perf session is
> >      active, the running configuration may differ before a sched-out and
> >      after a subsequent sched-in.
> >
> >    - If a perf session and a sysfs session enable tracing concurrently,
> >      the configuration from configfs may become corrupted.
> >
> >    - There is a risk of corrupting drvdata->config if a perf session enables
> >      tracing while cscfg_csdev_disable_active_config() is being handled in
> >      etm4_disable_sysfs().
> >
> > To resolve these issues, separate the configuration into:
> >
> >    - active_config: the configuration applied to the current session
> >    - config: the configuration set via sysfs
> >
> > Additionally:
> >
> >    - Apply the configuration from configfs after taking the appropriate mode.
> >
> >    - Since active_config and related fields are accessed only by the local CPU
> >      in etm4_enable/disable_sysfs_smp_call() (similar to perf enable/disable),
> >      remove the lock/unlock from the sysfs enable/disable path and
> >      startup/dying_cpu except when to access config fields.
> >
> > Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> > ---
>
> <...>
>
> > @@ -618,23 +624,45 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
> >   static void etm4_enable_sysfs_smp_call(void *info)
> >   {
> >   	struct etm4_enable_arg *arg = info;
> > +	struct etmv4_drvdata *drvdata;
> >   	struct coresight_device *csdev;
> >   	if (WARN_ON(!arg))
> >   		return;
> > -	csdev = arg->drvdata->csdev;
> > +	drvdata = arg->drvdata;
> > +	csdev = drvdata->csdev;
> >   	if (!coresight_take_mode(csdev, CS_MODE_SYSFS)) {
> >   		/* Someone is already using the tracer */
> >   		arg->rc = -EBUSY;
> >   		return;
> >   	}
> > -	arg->rc = etm4_enable_hw(arg->drvdata);
> > +	drvdata->active_config = arg->config;
> > -	/* The tracer didn't start */
> > +	if (arg->cfg_hash) {
> > +		arg->rc = cscfg_csdev_enable_active_config(csdev,
> > +							   arg->cfg_hash,
> > +							   arg->preset);
> > +		if (arg->rc)
> > +			goto err;
> > +	}
> > +
> > +	drvdata->trcid = arg->trace_id;
> > +
> > +	/* Tracer will never be paused in sysfs mode */
> > +	drvdata->paused = false;
> > +
> > +	arg->rc = etm4_enable_hw(drvdata);
> >   	if (arg->rc)
> > -		coresight_set_mode(csdev, CS_MODE_DISABLED);
>
> The active config needs to be disabled in the error path:
> cscfg_csdev_disable_active_config(drvdata->csdev);

You're right. I missed it. Thanks!

[...]

--
Sincerely,
Yeoreum Yun



* Re: [PATCH] arm64: cpufeature: Fix GCIE field ordering in ftr_id_aa64pfr2
From: Marc Zyngier @ 2026-04-16  6:42 UTC (permalink / raw)
  To: Mukesh Ojha; +Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel
In-Reply-To: <20260415200031.1885440-1-mukesh.ojha@oss.qualcomm.com>

On Wed, 15 Apr 2026 21:00:31 +0100,
Mukesh Ojha <mukesh.ojha@oss.qualcomm.com> wrote:
> 
> The ftr_id_aa64pfr2[] array must be sorted in descending order of
> shift value so that the overlap validation in init_cpu_features()
> works correctly. The GCIE field (bits 15:12, shift=12) was placed
> last in the array, after MTEFAR (bits 11:8, shift=8) and
> MTESTOREONLY (bits 7:4, shift=4), causing a spurious warning at
> boot:
> 
> [    0.000000] SYS_ID_AA64PFR2_EL1 has feature overlap at shift 12
> [    0.000000] WARNING: arch/arm64/kernel/cpufeature.c:989 at init_cpu_features+0x144/0x3d0, CPU#0:
> swapper/0
> ..
> 
> [    0.000000] pc : init_cpu_features+0x144/0x3d0
> [    0.000000] lr : init_cpu_features+0x144/0x3d0
> [    0.000000] sp : ffffc08678f03dc0
> 
> ...
> [    0.000000] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffc08678f14000
> [    0.000000] Call trace:
> [    0.000000]  init_cpu_features+0x144/0x3d0 (P)
> [    0.000000]  cpuinfo_store_boot_cpu+0x4c/0x5c
> [    0.000000]  smp_prepare_boot_cpu+0x28/0x38
> [    0.000000]  start_kernel+0x1d4/0x848
> [    0.000000]  __primary_switched+0x88/0x90
> 
> This is because the overlap check computes (shift + width) > prev_shift,
> i.e. (12 + 4) > 8, which triggers since GCIE occupies bits above MTEFAR
> but was listed after it.
> 
> Fix the ordering to match the register layout: FPMR(35:32), GCIE(15:12),
> MTEFAR(11:8), MTESTOREONLY(7:4).
> 
> Fixes: 899ff451fcee ("KVM: arm64: Advertise ID_AA64PFR2_EL1.GCIE")
> Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>

This was present in next-20260403, identified the following day, a fix
proposed in [1], and the fixed commit appeared in next-20260407 (which
was the subsequent -next build) as 7e629348df81b.

May I humbly suggest that you check the latest -next branch before
spending time on this sort of thing? Two weeks is a pretty long
time...

Thanks,

	M.

[1] https://lore.kernel.org/all/874ilqcu3c.wl-maz@kernel.org/
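
For readers following along, the overlap check described in the quoted commit
message can be sketched as a standalone model. This is only an illustration
under stated assumptions: struct ftr_field and count_overlaps() are
hypothetical names, every field is assumed to be 4 bits wide, and the two
tables simply encode the orderings described in the quoted text.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one feature-field descriptor. */
struct ftr_field {
	const char *name;
	unsigned int shift;
	unsigned int width;	/* width == 0 terminates the table */
};

/*
 * Count the entries that would trip the overlap check: a field whose top
 * bit (shift + width) reaches above the shift of the entry listed before it.
 * A table sorted by descending shift with no overlapping fields yields 0.
 */
static unsigned int count_overlaps(const struct ftr_field *f)
{
	unsigned int j, hits = 0;

	if (f[0].width == 0)
		return 0;	/* empty table */

	for (j = 1; f[j].width != 0; j++) {
		if (f[j].shift + f[j].width > f[j - 1].shift)
			hits++;
	}
	return hits;
}

/* Ordering described as broken: GCIE listed last. */
static const struct ftr_field misordered[] = {
	{ "FPMR", 32, 4 },
	{ "MTEFAR", 8, 4 },
	{ "MTESTOREONLY", 4, 4 },
	{ "GCIE", 12, 4 },
	{ NULL, 0, 0 },
};

/* Ordering after the fix: descending shift. */
static const struct ftr_field sorted[] = {
	{ "FPMR", 32, 4 },
	{ "GCIE", 12, 4 },
	{ "MTEFAR", 8, 4 },
	{ "MTESTOREONLY", 4, 4 },
	{ NULL, 0, 0 },
};
```

In this model the misordered table reports one hit, at the GCIE entry, and
the sorted table reports none.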

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH 0/7] TQMLX2160A-MBLS2160A DT fixes/updates
From: Alexander Stein @ 2026-04-16  6:39 UTC (permalink / raw)
  To: Frank Li, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Shawn Guo
  Cc: linux-arm-kernel, linux, devicetree, linux-kernel, Nora Schiffer
In-Reply-To: <20260331141915.2918927-1-alexander.stein@ew.tq-group.com>

Hi Frank,

Am Dienstag, 31. März 2026, 16:19:01 CEST schrieb Alexander Stein:
> Hi,
> 
> this series adds small fixes and improvements for TQMLX2160A DTs.
> The DT overlays address specific hardware behaviour when serdes is configured
> differently.

Any feedback here?

Thanks and best regards
Alexander

> 
> Best regards,
> Alexander
> 
> Alexander Stein (1):
>   arm64: dts: fsl-lx2160a-tqmlx2160a: Remove deprecated properties
> 
> Nora Schiffer (6):
>   arm64: dts: fsl-lx2160a-tqmlx2160a: fix LED polarity
>   arm64: dts: fsl-lx2160a-tqmlx2160a-mblx2160a: use DPMAC 17 and 18 for
>     SGMII in SERDES2 configs 7 and 11
>   arm64: dts: fsl-lx2160a-tqmlx2160a: add aliases for all 18 DPMAC
>     instances
>   arm64: dts: fsl-lx2160a-tqmlx2160a-mbls2160a: add various GPIO hogs
>   arm64: dts: fsl-lx2160a-tqmlx2160a-mbls2160a: enable pcs_mdio17 and
>     pcs_mdio18 in appropriate overlays
>   arm64: dts: fsl-lx2160a-tqmlx2160a-mbls2160a: specify Ethernet PHY
>     reset GPIOs
> 
>  .../fsl-lx2160a-tqmlx2160a-mblx2160a.dts      | 306 +++++++++++++++++-
>  ...l-lx2160a-tqmlx2160a-mblx2160a_x_11_x.dtso |  20 ++
>  ...sl-lx2160a-tqmlx2160a-mblx2160a_x_7_x.dtso |  20 ++
>  .../dts/freescale/fsl-lx2160a-tqmlx2160a.dtsi |  23 +-
>  4 files changed, 357 insertions(+), 12 deletions(-)
> 
> 


-- 
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
http://www.tq-group.com/





* Re: [PATCH] arm_pmu: acpi: fix reference leak on failed device registration
From: Guangshuo Li @ 2026-04-16  6:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Mark Rutland, Will Deacon, Anshuman Khandual, linux-arm-kernel,
	linux-perf-users, linux-kernel, stable
In-Reply-To: <2026041603-guts-crested-ef76@gregkh>

Hi Mark, Greg,

Thanks for the feedback.

On Thu, 16 Apr 2026 at 12:41, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Wed, Apr 15, 2026 at 07:19:06PM +0100, Mark Rutland wrote:
> > Hi,
> >
> > Thanks for the patch, but from a quick skim, I don't think this is the right
> > fix.
> >
> > Greg, I think we might want to rework the core API here; question for
> > you at the end.
> >
> > On Thu, Apr 16, 2026 at 01:41:59AM +0800, Guangshuo Li wrote:
> > > When platform_device_register() fails in arm_acpi_register_pmu_device(),
> > > the embedded struct device in pdev has already been initialized by
> > > device_initialize(), but the failure path only unregisters the GSI and
> > > does not drop the device reference for the current platform device:
> > >
> > >   arm_acpi_register_pmu_device()
> > >     -> platform_device_register(pdev)
> > >        -> device_initialize(&pdev->dev)
> > >        -> setup_pdev_dma_masks(pdev)
> > >        -> platform_device_add(pdev)
> > >
> > > This leads to a reference leak when platform_device_register() fails.
> >
> > AFAICT you're saying that the reference was taken *within*
> > platform_device_register(), and then platform_device_register() itself
> > has failed. I think it's surprising that platform_device_register()
> > doesn't clean that up itself in the case of an error.
> >
> > There are *tonnes* of calls to platform_device_register() throughout the
> > kernel that don't even bother to check the return value, and many that
> > just pass the return onto a caller that can't possibly know to call
> > platform_device_put().
> >
> > Code in the same file as platform_device_register() expects it to clean up
> > after itself, e.g.
> >
> > | int platform_add_devices(struct platform_device **devs, int num)
> > | {
> > |         int i, ret = 0;
> > |
> > |         for (i = 0; i < num; i++) {
> > |                 ret = platform_device_register(devs[i]);
> > |                 if (ret) {
> > |                         while (--i >= 0)
> > |                                 platform_device_unregister(devs[i]);
> > |                         break;
> > |                 }
> > |         }
> > |
> > |         return ret;
> > | }
> >
> > That's been there since the initial git commit, and back then,
> > platform_device_register() didn't mention that callers needed to perform
> > any cleanup.
> >
> > I see a comment was added to platform_device_register() in commit:
> >
> >   67e532a42cf4 ("driver core: platform: document registration-failure requirement")
> >
> > ... and that copied the comment added for device_register() in commit:
> >
> >   5739411acbaa ("Driver core: Clarify device cleanup.")
> >
> > ... but the potential brokenness is so widespread, and the behaviour is
> > so surprising, that I'd argue the real bug is that device_register()
> > doesn't clean up in case of error. I don't think it's worth changing
> > this single instance given the prevalence and churn fixing all of that
> > would involve.
> >
> > I think it would be far better to fix the core driver API such that when
> > those functions return an error, they've already cleaned up for
> > themselves.
> >
> > Greg, am I missing some functional reason why we can't rework
> > device_register() and friends to handle cleanup themselves? I appreciate
> > that'll involve churn for some callers, but AFAICT the majority of
> > callers don't have the required cleanup.
>
> Yes, we should fix the platform core code here, this should not be
> required to do everywhere as obviously we all got it wrong.
>
> Guangshuo, can you submit a patch to do that instead and ask for all of
> your other patches to not be applied as well?
>
> thanks,
>
> greg k-h

I agree that fixing this in the platform core makes more sense than
handling it in individual callers.

I'll look into the core code and send a patch for that instead. I'll
also ask for my other related patches not to be applied.

Thanks,
Guangshuo
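
To make the two contracts discussed above concrete, here is a small userspace
model. It is a sketch only: struct model_dev and the model_*() functions are
hypothetical and do not exist in the kernel; they just mimic a reference that
is taken during initialization and has to be dropped by someone when
registration fails.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in the kernel. */
struct model_dev {
	int refs;
	bool released;
};

/* Initialization takes the first reference. */
static void model_init(struct model_dev *d)
{
	d->refs = 1;
	d->released = false;
}

/* Dropping the last reference releases the object. */
static void model_put(struct model_dev *d)
{
	if (--d->refs == 0)
		d->released = true;
}

/*
 * Contract as documented today: on failure the reference taken during
 * initialization is left with the caller, who must call model_put().
 */
static int model_register(struct model_dev *d, bool fail)
{
	model_init(d);
	return fail ? -ENOMEM : 0;
}

/*
 * Contract proposed in the thread: the registration function drops the
 * reference itself before returning an error.
 */
static int model_register_selfclean(struct model_dev *d, bool fail)
{
	int ret = model_register(d, fail);

	if (ret)
		model_put(d);
	return ret;
}
```

With the first variant, a caller that ignores or merely forwards the error
leaves the object unreleased; with the second, no caller-side cleanup is
needed after a failure.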



* Re: [PATCH v8 next 00/10] arm_mpam: Introduce Narrow-PARTID feature
From: Shaopeng Tan (Fujitsu) @ 2026-04-16  6:29 UTC (permalink / raw)
  To: ben.horgan@arm.com, Dave.Martin@arm.com, james.morse@arm.com,
	reinette.chatre@intel.com, fenghuay@nvidia.com, tglx@kernel.org,
	will@kernel.org, hpa@zytor.com, bp@alien8.de, babu.moger@amd.com,
	dave.hansen@linux.intel.com, mingo@redhat.com,
	tony.luck@intel.com, gshan@redhat.com, catalin.marinas@arm.com
  Cc: linux-arm-kernel@lists.infradead.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, wangkefeng.wang@huawei.com
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>

Hello Zeng Heng,

Could you tell me which branch this patch series is based on?

Best regards,
Shaopeng TAN


* Re: [PATCH v3 3/3] dt-bindings: i3c: Add AST2600 I3C global registers
From: Krzysztof Kozlowski @ 2026-04-16  6:21 UTC (permalink / raw)
  To: Dawid Glazik
  Cc: Alexandre Belloni, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Joel Stanley, Andrew Jeffery, linux-aspeed, linux-i3c, devicetree,
	linux-arm-kernel, Frank Li, Maciej Lawniczak
In-Reply-To: <cb0d16bf-988d-403e-8a8e-c85bf2a208d0@linux.intel.com>

On 15/04/2026 20:21, Dawid Glazik wrote:
> On 4/9/2026 9:30 AM, Krzysztof Kozlowski wrote:
>> On 09/04/2026 09:28, Krzysztof Kozlowski wrote:
>>> On Wed, Apr 08, 2026 at 10:34:35PM +0200, Dawid Glazik wrote:
>>>> Introduce the device-tree bindings for I3C global registers found on
>>>> AST2600 SoCs.
>>>>
>>>> Signed-off-by: Dawid Glazik <dawid.glazik@linux.intel.com>
>>>> ---
>>>> I wasn't sure if I should add newline at the end of the
>>>> file or not so I took
>>>> https://github.com/torvalds/linux/tree/master/Documentation/devicetree/bindings/i3c
>>>> as an example.
>>>
>>> Answer is: you cannot have patch warnings.
>>>
>>> Documentation/devicetree/bindings/i3c does not have patch warning, does
>>> it?
>>
>> And if you tested this code with standard tools, you would see that...
>>
>> Best regards,
>> Krzysztof
> 
> Thank you for the review and feedback. This is my first contribution to 
> Linux kernel so I'm still learning the process and toolchain. I 
> apologize for the rookie mistakes. I will address all the issues you've 
> pointed out and resubmit the series.


So get the patch reviewed by your Intel colleagues, who can tell you which
tools you must run and which warnings are acceptable (a patch warning is
never acceptable).

Best regards,
Krzysztof



* Re: [PATCH net-next 5/6] net: stmmac: move PHY handling out of __stmmac_open()/release()
From: Alexander Stein @ 2026-04-16  6:20 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, linux-arm-kernel,
	linux-stm32, Maxime Coquelin, netdev, Paolo Abeni
In-Reply-To: <ad-LtOBrKREM1tCk@shell.armlinux.org.uk>

Am Mittwoch, 15. April 2026, 14:59:32 CEST schrieb Russell King (Oracle):
> On Wed, Apr 15, 2026 at 08:08:40AM +0200, Alexander Stein wrote:
> > Hi,
> > 
> > Am Dienstag, 23. September 2025, 13:26:19 CEST schrieb Russell King (Oracle):
> > > Move the PHY attachment/detachment from the network driver out of
> > > __stmmac_open() and __stmmac_release() into stmmac_open() and
> > > stmmac_release() where these actions will only happen when the
> > > interface is administratively brought up or down. It does not make
> > > sense to detach and re-attach the PHY during a change of MTU.
> > 
> > Sorry for coming up now. But I recently noticed this commit breaks changing
> > the MTU on i.MX8MP. Once I simply change the MTU I run into some DMA error:
> > $ ip link set dev end1 mtu 1400
> > imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-0
> > imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-1
> > imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-2
> > imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-3
> > imx-dwmac 30bf0000.ethernet end1: Register MEM_TYPE_PAGE_POOL RxQ-4
> > imx-dwmac 30bf0000.ethernet end1: Link is Down
> > imx-dwmac 30bf0000.ethernet end1: Failed to reset the dma
> > imx-dwmac 30bf0000.ethernet end1: stmmac_hw_setup: DMA engine initialization failed
> 
> This basically means that a clock is missing. Please provide more
> information:
> 
> - what kernel version are you using?

Currently I am using v6.18.22.
$ ethtool -i end1
driver: st_gmac
version: 6.18.22
firmware-version: 
expansion-rom-version: 
bus-info: 30bf0000.ethernet
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

> - has EEE been negotiated?

No. It is marked as not supported:

$ ethtool --show-eee end1
EEE settings for end1:
        EEE status: not supported

> - does the problem persist when EEE is disabled?

As EEE is not supported the problem occurs even with EEE disabled.

> - which PHY is attached to stmmac?

It is a TI DP83867.

imx-dwmac 30bf0000.ethernet eth1: PHY [stmmac-1:03] driver [TI DP83867] (irq=136)

> - which PHY interface mode is being used to connect the PHY to stmmac?

For this interface
> phy-mode = "rgmii-id";
is set.

In case it is helpful, my platform is arch/arm64/boot/dts/freescale/imx8mp-tqma8mpql-mba8mpxl.dts
Thanks for assisting. If there are further questions, don't hesitate to ask.

Thanks and best regards
Alexander
-- 
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
http://www.tq-group.com/






* Re: [EXT] Re: [PATCH v2] pmdomain: imx: Make IMX8M/IMX9 BLK_CTRL tristate
From: Daniel Baluta @ 2026-04-16  6:04 UTC (permalink / raw)
  To: Zhipeng Wang, Marco Felsch
  Cc: ulfh@kernel.org, Frank Li, s.hauer@pengutronix.de,
	imx@lists.linux.dev, linux-pm@vger.kernel.org, Xuegang Liu,
	Jindong Yue, linux-kernel@vger.kernel.org, kernel@pengutronix.de,
	festevam@gmail.com, linux-arm-kernel@lists.infradead.org
In-Reply-To: <AMBPR04MB123344C1F802528A10CEA7FADEB252@AMBPR04MB12334.eurprd04.prod.outlook.com>

On 4/14/26 04:59, Zhipeng Wang wrote:
>  > On 26-04-13, Zhipeng Wang wrote:
>>> Convert IMX8M_BLK_CTRL and IMX9_BLK_CTRL from bool to tristate to
>>> allow building as loadable modules.
>> Out of curiosity, why do you want to have a PM driver to be buildable as
>> module?
>>
>> Regards,
>>   Marco
>>
> Hi Marco,
>
> Thank you for your question.
>
> The primary motivation is to support Google's GKI (Generic Kernel Image)
> requirement for Android devices.
>
> GKI separates the kernel into two parts:
> 1. A unified kernel image (GKI) that is common across all Android devices
> 2. Vendor-specific drivers that must be built as loadable modules
>
> Under the GKI architecture, SoC-specific drivers like IMX8M/IMX9 BLK_CTRL
> cannot be built into the core kernel image. Instead, they must be loadable
> modules that vendors can ship separately. This allows:
>
> - A single kernel binary to support multiple hardware platforms
> - Vendors to update their drivers independently without rebuilding the entire kernel
> - Better compliance with Android's kernel update and security policies
>
Can you please add the lines below to the commit message?
> For i.MX8M/i.MX9 devices running Android with GKI kernels, the BLK_CTRL
> drivers need to be loaded as modules during boot. Without tristate support,
> these devices cannot properly initialize their power domains, making them
> non-functional under GKI.





* Re: [PATCH v2 2/3] dt-bindings: gpio: Add EIO GPIO compatible to gpio-zynq
From: Michal Simek @ 2026-04-16  5:58 UTC (permalink / raw)
  To: Conor Dooley, Shubhrajyoti Datta
  Cc: linux-kernel, git, shubhrajyoti.datta, Srinivas Neeli,
	Linus Walleij, Bartosz Golaszewski, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-gpio, devicetree,
	linux-arm-kernel
In-Reply-To: <20260415-rectal-visible-a8ccb534a176@spud>

On 4/15/26 17:01, Conor Dooley wrote:
> On Wed, Apr 15, 2026 at 04:26:27PM +0530, Shubhrajyoti Datta wrote:
>> EIO (Extended IO) is a GPIO block found on xa2ve3288 silicon..
> 
> 
> Why does the compatible have a "1.0" when it is in silicon?

Sorry, I'm not following what the problem is. Yes, this is a hard block in
silicon, and it is silicon v1.

> Why doesn't the compatible contain "xa2ve3288"?

This unit can be used on different silicons too.

> Why is this device not compatible with existing ones, since
> gpio-lines-names appears to be the sole difference?

There is no way to detect the GPIO width.
soc_device_match() could, to some extent, be used to detect which silicon it
runs on, but on this particular one you have 3 GPIO controllers described by
this binding (pmc, versal and eio).

Thanks,
Michal


* Re: [PATCH v3] pmdomain: imx: Make IMX8M/IMX9 BLK_CTRL tristate
From: Daniel Baluta @ 2026-04-16  6:01 UTC (permalink / raw)
  To: Zhipeng Wang, ulfh, Frank.Li, s.hauer
  Cc: kernel, festevam, linux-pm, imx, linux-arm-kernel, linux-kernel,
	xuegang.liu, jindong.yue
In-Reply-To: <20260416015605.3536244-1-zhipeng.wang_1@nxp.com>

On 4/16/26 04:56, Zhipeng Wang wrote:
> Convert IMX8M_BLK_CTRL and IMX9_BLK_CTRL from bool to tristate
> to allow building as loadable modules.
>
> Add prompt strings to make these options visible and configurable
> in menuconfig, keeping them enabled by default on appropriate platforms.
>
> Also remove the IMX_GPCV2_PM_DOMAINS dependency from IMX9_BLK_CTRL.
> This dependency was incorrect from the beginning because i.MX93 uses a
> different power domain architecture compared to i.MX8M series:
>
> - i.MX8M uses GPCv2 (General Power Controller v2) for power domain
>   management, hence IMX8M_BLK_CTRL correctly depends on it.
>
> - i.MX93 uses BLK_CTRL directly without GPCv2. The hardware doesn't
>   have GPCv2 at all.
>
> Signed-off-by: Zhipeng Wang <zhipeng.wang_1@nxp.com>
> Reviewed-by: Frank Li <Frank.Li@nxp.com>
>
> ---

Please always add a change log here to help reviewers.

Changes since v2:

* fixed typo reported by Frank





* Re: [PATCH v8 08/10] ASoC: mediatek: mt8196: add platform driver
From: Cyril Chao (钞悦) @ 2026-04-16  5:53 UTC (permalink / raw)
  To: broonie@kernel.org
  Cc: linux-kernel@vger.kernel.org, linux-mediatek@lists.infradead.org,
	devicetree@vger.kernel.org, Darren Ye (叶飞),
	linux-sound@vger.kernel.org, conor+dt@kernel.org, tiwai@suse.com,
	robh@kernel.org, lgirdwood@gmail.com,
	linux-arm-kernel@lists.infradead.org,
	Project_Global_Chrome_Upstream_Group, matthias.bgg@gmail.com,
	krzk+dt@kernel.org, perex@perex.cz, AngeloGioacchino Del Regno
In-Reply-To: <892468cc-7eb4-411e-b91b-f14789d8da0c@sirena.org.uk>

Thank you for your assistance in reviewing. Could you please also
review the modifications in the diff? If everything is okay, I will
include them in v9 in the next update.


diff --git a/sound/soc/mediatek/mt8196/mt8196-afe-pcm.c b/sound/soc/mediatek/mt8196/mt8196-afe-pcm.c
index 3d3174cd8efb..ff7aa89e4779 100644
--- a/sound/soc/mediatek/mt8196/mt8196-afe-pcm.c
+++ b/sound/soc/mediatek/mt8196/mt8196-afe-pcm.c
@@ -90,9 +90,20 @@ static int mt8196_set_cm(struct mtk_base_afe *afe, int id,
 	struct mt8196_afe_private *afe_priv = afe->platform_priv;
 	unsigned int rate = afe_priv->cm_rate[id];
 	unsigned int rate_val = mt8196_rate_transform(afe->dev, rate);
-	unsigned int update_val = update ? ((((26000000 / rate) - 10) / (ch / 2)) - 1) : 0x64;
+	unsigned int ch_pair = ch / 2;
+	unsigned int update_val;
 	int reg = AFE_CM0_CON0 + 0x10 * id;
 
+	if (update) {
+		if (ch_pair == 0) {
+			dev_err(afe->dev, "CM%d: invalid channel count %u\n", id, ch);
+			return -EINVAL;
+		}
+		update_val = (26000000 / rate - 10) / ch_pair - 1;
+	} else {
+		update_val = 0x64;
+	}
+
 	dev_dbg(afe->dev, "CM%d, rate %d, update %d, swap %d, ch %d\n",
 		id, rate, update, swap, ch);
 
@@ -471,6 +482,7 @@ static int ul_cm0_event(struct snd_soc_dapm_widget *w,
 	struct mtk_base_afe *afe = snd_soc_component_get_drvdata(cmpnt);
 	struct mt8196_afe_private *afe_priv = afe->platform_priv;
 	unsigned int channels = afe_priv->cm_channels;
+	int ret;
 
 	dev_dbg(afe->dev, "event 0x%x, name %s, channels %u\n",
 		event, w->name, channels);
@@ -478,7 +490,9 @@ static int ul_cm0_event(struct snd_soc_dapm_widget *w,
 	switch (event) {
 	case SND_SOC_DAPM_PRE_PMU:
 		mt8196_enable_cm_bypass(afe, CM0, false);
-		mt8196_set_cm(afe, CM0, true, false, channels);
+		ret = mt8196_set_cm(afe, CM0, true, false, channels);
+		if (ret)
+			return ret;
 		regmap_update_bits(afe->regmap, AUDIO_TOP_CON0,
 				   PDN_CM0_MASK_SFT, 0 << PDN_CM0_SFT);
 		break;
@@ -502,6 +516,7 @@ static int ul_cm1_event(struct snd_soc_dapm_widget *w,
 	struct mtk_base_afe *afe = snd_soc_component_get_drvdata(cmpnt);
 	struct mt8196_afe_private *afe_priv = afe->platform_priv;
 	unsigned int channels = afe_priv->cm_channels;
+	int ret;
 
 	dev_dbg(afe->dev, "event 0x%x, name %s, channels %u\n",
 		event, w->name, channels);
@@ -509,7 +524,9 @@ static int ul_cm1_event(struct snd_soc_dapm_widget *w,
 	switch (event) {
 	case SND_SOC_DAPM_PRE_PMU:
 		mt8196_enable_cm_bypass(afe, CM1, false);
-		mt8196_set_cm(afe, CM1, true, false, channels);
+		ret = mt8196_set_cm(afe, CM1, true, false, channels);
+		if (ret)
+			return ret;
 		regmap_update_bits(afe->regmap, AUDIO_TOP_CON0,
 				   PDN_CM1_MASK_SFT, 0 << PDN_CM1_SFT);
 		break;
@@ -533,6 +550,7 @@ static int ul_cm2_event(struct snd_soc_dapm_widget *w,
 	struct mtk_base_afe *afe = snd_soc_component_get_drvdata(cmpnt);
 	struct mt8196_afe_private *afe_priv = afe->platform_priv;
 	unsigned int channels = afe_priv->cm_channels;
+	int ret;
 
 	dev_dbg(afe->dev, "event 0x%x, name %s, channels %u\n",
 		event, w->name, channels);
@@ -540,7 +558,9 @@ static int ul_cm2_event(struct snd_soc_dapm_widget *w,
 	switch (event) {
 	case SND_SOC_DAPM_PRE_PMU:
 		mt8196_enable_cm_bypass(afe, CM2, false);
-		mt8196_set_cm(afe, CM2, true, false, channels);
+		ret = mt8196_set_cm(afe, CM2, true, false, channels);
+		if (ret)
+			return ret;
 		regmap_update_bits(afe->regmap, AUDIO_TOP_CON0,
 				   PDN_CM2_MASK_SFT, 0 << PDN_CM2_SFT);
 		break;

Best Regards
Cyril Chao


On Fri, 2026-04-03 at 15:07 +0100, Mark Brown wrote:
> On Tue, Mar 24, 2026 at 09:56:49AM +0800, Cyril Chao wrote:
> 
> > +static int mt8196_set_cm(struct mtk_base_afe *afe, int id,
> > +			 bool update, bool swap, unsigned int ch)
> > +{
> > +	struct mt8196_afe_private *afe_priv = afe->platform_priv;
> > +	unsigned int rate = afe_priv->cm_rate[id];
> > +	unsigned int rate_val = mt8196_rate_transform(afe->dev, rate);
> > +	unsigned int update_val = update ? ((((26000000 / rate) - 10) / (ch / 2)) - 1) : 0x64;
> > +	int reg = AFE_CM0_CON0 + 0x10 * id;
> 
> The driver looks like it supports mono, so won't this trigger a divide
> by zero?
> 
> Also, please write normal conditional statements; it's much more
> legible.
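
The guard discussed above can be sketched in isolation. This is a minimal userspace illustration, not the driver code: cm_update_val() and CM_REF_CLK_HZ are hypothetical names, ch_pair = ch / 2 is assumed from the original expression, and the rate and underflow checks are extra guards that do not appear in the patch.

```c
#include <errno.h>

/* Hypothetical name for the 26 MHz constant used in the patch. */
#define CM_REF_CLK_HZ 26000000u

/*
 * Compute the CM update value, or return -EINVAL when the inputs would
 * divide by zero or make the unsigned arithmetic wrap around.
 * ch_pair = ch / 2 is an assumption taken from the original expression.
 */
static int cm_update_val(unsigned int rate, unsigned int ch, unsigned int *out)
{
	unsigned int ch_pair = ch / 2;
	unsigned int cycles;

	/* Mono (ch == 1) gives ch_pair == 0, the case raised in the review. */
	if (rate == 0 || ch_pair == 0)
		return -EINVAL;

	/* Extra guard, not in the patch: keep (cycles - 10) / ch_pair >= 1. */
	cycles = CM_REF_CLK_HZ / rate;
	if (cycles < 10 + ch_pair)
		return -EINVAL;

	*out = (cycles - 10) / ch_pair - 1;
	return 0;
}
```

With a 48 kHz rate this yields 530 for two channels and 264 for four, while a mono request is rejected instead of dividing by zero.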

^ permalink raw reply related

* Re: [PATCH v5 04/12] coresight: etm4x: exclude ss_status from drvdata->config
From: Jie Gan @ 2026-04-16  5:42 UTC (permalink / raw)
  To: Yeoreum Yun, coresight, linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mike.leach, james.clark, alexander.shishkin,
	leo.yan
In-Reply-To: <20260415165528.3369607-5-yeoreum.yun@arm.com>



On 4/16/2026 12:55 AM, Yeoreum Yun wrote:
> The purpose of the TRCSSCSRn register is to show the status of the
> corresponding Single-shot Comparator Control and the inputs it supports.
> That means its writable fields exist to reset or restore the status after
> idle, not to hold configuration.
> 
> Therefore, exclude ss_status from drvdata->config, move it to etmv4_caps
> and rename it to ss_cmp.
> 
> This includes removing TRCSSCSRn from the configurable items and
> removing the save in etm4_disable_hw().
> 
> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---
>   .../hwtracing/coresight/coresight-etm4x-cfg.c |  1 -
>   .../coresight/coresight-etm4x-core.c          | 19 ++++++-------------
>   .../coresight/coresight-etm4x-sysfs.c         |  7 ++-----
>   drivers/hwtracing/coresight/coresight-etm4x.h |  7 ++++++-
>   4 files changed, 14 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-cfg.c b/drivers/hwtracing/coresight/coresight-etm4x-cfg.c
> index c302072b293a..d14d7c8a23e5 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-cfg.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-cfg.c
> @@ -86,7 +86,6 @@ static int etm4_cfg_map_reg_offset(struct etmv4_drvdata *drvdata,
>   		off_mask =  (offset & GENMASK(11, 5));
>   		do {
>   			CHECKREGIDX(TRCSSCCRn(0), ss_ctrl, idx, off_mask);
> -			CHECKREGIDX(TRCSSCSRn(0), ss_status, idx, off_mask);
>   			CHECKREGIDX(TRCSSPCICRn(0), ss_pe_cmp, idx, off_mask);
>   		} while (0);
>   	} else if ((offset >= TRCCIDCVRn(0)) && (offset <= TRCVMIDCVRn(7))) {
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index b2b092a76eb5..f55338a4989d 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -91,7 +91,7 @@ static bool etm4x_sspcicrn_present(struct etmv4_drvdata *drvdata, int n)
>   	const struct etmv4_caps *caps = &drvdata->caps;
>   
>   	return (n < caps->nr_ss_cmp) && caps->nr_pe_cmp &&
> -	       (drvdata->config.ss_status[n] & TRCSSCSRn_PC);
> +	       (caps->ss_cmp[n] & TRCSSCSRn_PC);
>   }
>   
>   u64 etm4x_sysreg_read(u32 offset, bool _relaxed, bool _64bit)
> @@ -573,11 +573,9 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
>   		etm4x_relaxed_write32(csa, config->res_ctrl[i], TRCRSCTLRn(i));
>   
>   	for (i = 0; i < caps->nr_ss_cmp; i++) {
> -		/* always clear status bit on restart if using single-shot */
> -		if (config->ss_ctrl[i] || config->ss_pe_cmp[i])
> -			config->ss_status[i] &= ~TRCSSCSRn_STATUS;
>   		etm4x_relaxed_write32(csa, config->ss_ctrl[i], TRCSSCCRn(i));
> -		etm4x_relaxed_write32(csa, config->ss_status[i], TRCSSCSRn(i));
> +		/* always clear status and pending bits on restart if using single-shot */
> +		etm4x_relaxed_write32(csa, 0x0, TRCSSCSRn(i));
>   		if (etm4x_sspcicrn_present(drvdata, i))
>   			etm4x_relaxed_write32(csa, config->ss_pe_cmp[i], TRCSSPCICRn(i));
>   	}
> @@ -1055,12 +1053,6 @@ static void etm4_disable_hw(struct etmv4_drvdata *drvdata)
>   
>   	etm4_disable_trace_unit(drvdata);
>   
> -	/* read the status of the single shot comparators */
> -	for (i = 0; i < caps->nr_ss_cmp; i++) {
> -		config->ss_status[i] =
> -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> -	}
> -
>   	/* read back the current counter values */
>   	for (i = 0; i < caps->nr_cntr; i++) {
>   		config->cntr_val[i] =
> @@ -1503,8 +1495,9 @@ static void etm4_init_arch_data(void *info)
>   	 */
>   	caps->nr_ss_cmp = FIELD_GET(TRCIDR4_NUMSSCC_MASK, etmidr4);
>   	for (i = 0; i < caps->nr_ss_cmp; i++) {
> -		drvdata->config.ss_status[i] =
> -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> +		caps->ss_cmp[i] = etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> +		caps->ss_cmp[i] &= (TRCSSCSRn_PC | TRCSSCSRn_DV |
> +				    TRCSSCSRn_DA | TRCSSCSRn_INST);

I just went through this patch again and have a question here:

I’m not sure whether this new change should be documented in the ABI, 
given that the TRCSSCSRn_STATUS bit is masked. In my opinion, this 
change breaks the existing ABI description.

Description from the ABI document:

What:           /sys/bus/coresight/devices/etm<N>/sshot_status
Date:           December 2019
KernelVersion:  5.5
Contact:        Mathieu Poirier <mathieu.poirier@linaro.org>
Description:    (Read) Print the current value of the selected single
                 shot status register.

Thanks,
Jie
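
To make the concern concrete, here is a small userspace sketch of the mask applied in the quoted hunk. The bit positions are the ones defined in the patch; BIT() is redefined locally and ss_cmp_from_raw() is a hypothetical helper, not a function in the driver.

```c
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Bit positions as defined in the quoted patch. */
#define TRCSSCSRn_STATUS	BIT(31)
#define TRCSSCSRn_PENDING	BIT(30)
#define TRCSSCSRn_PC		BIT(3)
#define TRCSSCSRn_DV		BIT(2)
#define TRCSSCSRn_DA		BIT(1)
#define TRCSSCSRn_INST		BIT(0)

/* Hypothetical helper mirroring the mask used in etm4_init_arch_data(). */
static uint32_t ss_cmp_from_raw(uint32_t raw)
{
	/* Only the capability bits survive; STATUS and PENDING are dropped. */
	return raw & (TRCSSCSRn_PC | TRCSSCSRn_DV |
		      TRCSSCSRn_DA | TRCSSCSRn_INST);
}
```

Under this mask a raw register value of 0x8000000f is stored as 0xf, so a reader of sshot_status that is fed from this cached value would never see bit 31 set.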

>   	}
>   	/* NUMCIDC, bits[27:24] number of Context ID comparators for tracing */
>   	caps->numcidc = FIELD_GET(TRCIDR4_NUMCIDC_MASK, etmidr4);
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
> index 8bd28e71d4c9..5e26c2ec8f7b 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
> @@ -1829,8 +1829,6 @@ static ssize_t sshot_ctrl_store(struct device *dev,
>   	raw_spin_lock(&drvdata->spinlock);
>   	idx = config->ss_idx;
>   	config->ss_ctrl[idx] = FIELD_PREP(TRCSSCCRn_SAC_ARC_RST_MASK, val);
> -	/* must clear bit 31 in related status register on programming */
> -	config->ss_status[idx] &= ~TRCSSCSRn_STATUS;
>   	raw_spin_unlock(&drvdata->spinlock);
>   	return size;
>   }
> @@ -1841,10 +1839,11 @@ static ssize_t sshot_status_show(struct device *dev,
>   {
>   	unsigned long val;
>   	struct etmv4_drvdata *drvdata = dev_get_drvdata(dev->parent);
> +	const struct etmv4_caps *caps = &drvdata->caps;
>   	struct etmv4_config *config = &drvdata->config;
>   
>   	raw_spin_lock(&drvdata->spinlock);
> -	val = config->ss_status[config->ss_idx];
> +	val = caps->ss_cmp[config->ss_idx];
>   	raw_spin_unlock(&drvdata->spinlock);
>   	return scnprintf(buf, PAGE_SIZE, "%#lx\n", val);
>   }
> @@ -1879,8 +1878,6 @@ static ssize_t sshot_pe_ctrl_store(struct device *dev,
>   	raw_spin_lock(&drvdata->spinlock);
>   	idx = config->ss_idx;
>   	config->ss_pe_cmp[idx] = FIELD_PREP(TRCSSPCICRn_PC_MASK, val);
> -	/* must clear bit 31 in related status register on programming */
> -	config->ss_status[idx] &= ~TRCSSCSRn_STATUS;
>   	raw_spin_unlock(&drvdata->spinlock);
>   	return size;
>   }
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
> index 8168676f2945..db56c4414873 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
> @@ -213,6 +213,7 @@
>   #define TRCACATRn_EXLEVEL_MASK			GENMASK(14, 8)
>   
>   #define TRCSSCSRn_STATUS			BIT(31)
> +#define TRCSSCSRn_PENDING			BIT(30)
>   #define TRCSSCCRn_SAC_ARC_RST_MASK		GENMASK(24, 0)
>   
>   #define TRCSSPCICRn_PC_MASK			GENMASK(7, 0)
> @@ -729,6 +730,9 @@ static inline u32 etm4_res_sel_pair(u8 res_sel_idx)
>   #define ETM_DEFAULT_ADDR_COMP		0
>   
>   #define TRCSSCSRn_PC			BIT(3)
> +#define TRCSSCSRn_DV			BIT(2)
> +#define TRCSSCSRn_DA			BIT(1)
> +#define TRCSSCSRn_INST			BIT(0)
>   
>   /* PowerDown Control Register bits */
>   #define TRCPDCR_PU			BIT(3)
> @@ -861,6 +865,7 @@ enum etm_impdef_type {
>    * @lpoverride:	If the implementation can support low-power state over.
>    * @skip_power_up: Indicates if an implementation can skip powering up
>    *		   the trace unit.
> + * @ss_cmp:	Indicates supported single-shot comparators.
>    */
>   struct etmv4_caps {
>   	u8	nr_pe;
> @@ -899,6 +904,7 @@ struct etmv4_caps {
>   	bool	atbtrig : 1;
>   	bool	lpoverride : 1;
>   	bool	skip_power_up : 1;
> +	u32	ss_cmp[ETM_MAX_SS_CMP];
>   };
>   
>   /**
> @@ -977,7 +983,6 @@ struct etmv4_config {
>   	u32				res_ctrl[ETM_MAX_RES_SEL]; /* TRCRSCTLRn */
>   	u8				ss_idx;
>   	u32				ss_ctrl[ETM_MAX_SS_CMP];
> -	u32				ss_status[ETM_MAX_SS_CMP];
>   	u32				ss_pe_cmp[ETM_MAX_SS_CMP];
>   	u8				addr_idx;
>   	u64				addr_val[ETM_MAX_SINGLE_ADDR_CMP];



^ permalink raw reply

* RE: [PATCH v4 3/9] media: chips-media: wave6: Add Wave6 VPU interface
From: Nas Chung @ 2026-04-16  5:25 UTC (permalink / raw)
  To: Nicolas Dufresne, mchehab@kernel.org, hverkuil@xs4all.nl,
	robh@kernel.org, krzk+dt@kernel.org, conor+dt@kernel.org,
	shawnguo@kernel.org, s.hauer@pengutronix.de
  Cc: linux-media@vger.kernel.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-imx@nxp.com,
	linux-arm-kernel@lists.infradead.org, jackson.lee, lafley.kim,
	marek.vasut@mailbox.org, Ming Qian
In-Reply-To: <7306c00b626f4030d92b908022b9a39669b07bb7.camel@ndufresne.ca>

Hi, Nicolas.

Sorry, I just realized that I never replied to your earlier email.

>-----Original Message-----
>From: Nicolas Dufresne <nicolas@ndufresne.ca>
>Sent: Thursday, December 11, 2025 4:54 AM
>To: Nas Chung <nas.chung@chipsnmedia.com>; mchehab@kernel.org;
>hverkuil@xs4all.nl; robh@kernel.org; krzk+dt@kernel.org;
>conor+dt@kernel.org; shawnguo@kernel.org; s.hauer@pengutronix.de
>Cc: linux-media@vger.kernel.org; devicetree@vger.kernel.org; linux-
>kernel@vger.kernel.org; linux-imx@nxp.com; linux-arm-
>kernel@lists.infradead.org; jackson.lee <jackson.lee@chipsnmedia.com>;
>lafley.kim <lafley.kim@chipsnmedia.com>; marek.vasut@mailbox.org; Ming Qian
><ming.qian@oss.nxp.com>
>Subject: Re: [PATCH v4 3/9] media: chips-media: wave6: Add Wave6 VPU
>interface
>
>Hi,
>
>Le mercredi 22 octobre 2025 à 16:47 +0900, Nas Chung a écrit :
>> Add an interface layer to manage hardware register configuration
>> and communication with the Chips&Media Wave6 video codec IP.
>>
>> The interface provides low-level helper functions used by the
>> Wave6 core driver to implement video encoding and decoding operations.
>> It handles command submission to the firmware via MMIO registers,
>> and waits for a response by polling the firmware busy flag.
>>
>> Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
>> Tested-by: Ming Qian <ming.qian@oss.nxp.com>
>> Tested-by: Marek Vasut <marek.vasut@mailbox.org>
>> ---

[...]

>
>[...]
>
>stopping there for now. I feel like we made a big mistake in wave5 by allowing a
>heavy abstraction; it's a lot harder to fix and it served no purpose since you
>went for a fresh driver for wave6. I think it's proper to ask for a slimmer
>interface.
>
>The V4L2 API is the front-end, and where all the validation should take place.
>The HW interface should simply manage the HW in a readable and non-redundant
>way. In V4L2, strides and buffer size are part of the try/s/g_fmt API, so these
>should not be duplicated here and they should clearly use the common code.

I agree that the HW interface should be slimmer and should not duplicate
validation handled in the V4L2 layer.

>
>I know it's painful to hear, but you will be removing 50% of the code, which
>long term will be a massive win on maintenance.

I am reworking the series to address your earlier feedback as well, and I will
include that in the next patch version.

Thanks again for your feedback.

Thanks.
Nas.

>
>regards,
>Nicolas


^ permalink raw reply

* Re: [PATCH] arm_pmu: acpi: fix reference leak on failed device registration
From: Greg Kroah-Hartman @ 2026-04-16  4:40 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Guangshuo Li, Will Deacon, Anshuman Khandual, linux-arm-kernel,
	linux-perf-users, linux-kernel, stable
In-Reply-To: <ad_WmuauLJ3xDKqh@J2N7QTR9R3>

On Wed, Apr 15, 2026 at 07:19:06PM +0100, Mark Rutland wrote:
> Hi,
> 
> Thanks for the patch, but from a quick skim, I don't think this is the right
> fix.
> 
> Greg, I think we might want to rework the core API here; question for
> you at the end.
> 
> On Thu, Apr 16, 2026 at 01:41:59AM +0800, Guangshuo Li wrote:
> > When platform_device_register() fails in arm_acpi_register_pmu_device(),
> > the embedded struct device in pdev has already been initialized by
> > device_initialize(), but the failure path only unregisters the GSI and
> > does not drop the device reference for the current platform device:
> > 
> >   arm_acpi_register_pmu_device()
> >     -> platform_device_register(pdev)
> >        -> device_initialize(&pdev->dev)
> >        -> setup_pdev_dma_masks(pdev)
> >        -> platform_device_add(pdev)
> > 
> > This leads to a reference leak when platform_device_register() fails.
> 
> AFAICT you're saying that the reference was taken *within*
> platform_device_register(), and then platform_device_register() itself
> has failed. I think it's surprising that platform_device_register()
> doesn't clean that up itself in the case of an error.
> 
> There are *tonnes* of calls to platform_device_register() throughout the
> kernel that don't even bother to check the return value, and many that
> just pass the return onto a caller that can't possibly know to call
> platform_device_put().
> 
> Code in the same file as platform_device_register() expects it to clean up
> after itself, e.g.
> 
> | int platform_add_devices(struct platform_device **devs, int num) 
> | {
> |         int i, ret = 0; 
> | 
> |         for (i = 0; i < num; i++) {
> |                 ret = platform_device_register(devs[i]);
> |                 if (ret) {
> |                         while (--i >= 0)
> |                                 platform_device_unregister(devs[i]);
> |                         break;
> |                 }    
> |         }    
> | 
> |         return ret; 
> | }
> 
> That's been there since the initial git commit, and back then,
> platform_device_register() didn't mention that callers needed to perform
> any cleanup.
> 
> I see a comment was added to platform_device_register() in commit:
> 
>   67e532a42cf4 ("driver core: platform: document registration-failure requirement")
> 
> ... and that copied the comment added for device_register() in commit:
> 
>   5739411acbaa ("Driver core: Clarify device cleanup.")
> 
> ... but the potential brokenness is so widespread, and the behaviour is
> so surprising, that I'd argue the real bug is that device_register()
> doesn't clean up in case of error. I don't think it's worth changing
> this single instance given the prevalence and churn fixing all of that
> would involve.
> 
> I think it would be far better to fix the core driver API such that when
> those functions return an error, they've already cleaned up for
> themselves.
> 
> Greg, am I missing some functional reason why we can't rework
> device_register() and friends to handle cleanup themselves? I appreciate
> that'll involve churn for some callers, but AFAICT the majority of
> callers don't have the required cleanup.

Yes, we should fix the platform core code here, this should not be
required to do everywhere as obviously we all got it wrong.

Guangshuo, can you submit a patch to do that instead and ask for all of
your other patches to not be applied as well?

thanks,

greg k-h
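
The difference between the two contracts discussed in this thread can be shown with a toy userspace model. Everything below is hypothetical (struct toy_device and the toy_* functions are stand-ins, not driver-core APIs); it only illustrates where the reference is dropped when the add step fails.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a device with an embedded reference count. */
struct toy_device {
	int refcount;
	bool fail_add;	/* test knob: make the "add" step fail */
};

static void toy_device_initialize(struct toy_device *dev)
{
	dev->refcount = 1;	/* initialization takes the first reference */
}

static void toy_device_put(struct toy_device *dev)
{
	dev->refcount--;
}

static int toy_device_add(struct toy_device *dev)
{
	return dev->fail_add ? -ENODEV : 0;
}

/* Contract as described in the thread: the caller must put on failure. */
static int toy_register_caller_cleans(struct toy_device *dev)
{
	toy_device_initialize(dev);
	return toy_device_add(dev);
}

/* Suggested direction: registration drops its own reference on failure. */
static int toy_register_self_cleaning(struct toy_device *dev)
{
	int ret;

	toy_device_initialize(dev);
	ret = toy_device_add(dev);
	if (ret)
		toy_device_put(dev);
	return ret;
}
```

In this model a caller that ignores the error from toy_register_caller_cleans() is left with a refcount of 1, whereas the self-cleaning variant returns with the count already back at 0.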


^ permalink raw reply

* Re: [PATCH v5 07/12] coresight: etm4x: fix inconsistencies with sysfs configuration
From: Jie Gan @ 2026-04-16  4:35 UTC (permalink / raw)
  To: Yeoreum Yun, coresight, linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mike.leach, james.clark, alexander.shishkin,
	leo.yan
In-Reply-To: <20260415165528.3369607-8-yeoreum.yun@arm.com>



On 4/16/2026 12:55 AM, Yeoreum Yun wrote:
> The current ETM4x configuration via sysfs can lead to
> several inconsistencies:
> 
>    - If the configuration is modified via sysfs while a perf session is
>      active, the running configuration may differ before a sched-out and
>      after a subsequent sched-in.
> 
>    - If a perf session and a sysfs session enable tracing concurrently,
>      the configuration from configfs may become corrupted.
> 
>    - There is a risk of corrupting drvdata->config if a perf session enables
>      tracing while cscfg_csdev_disable_active_config() is being handled in
>      etm4_disable_sysfs().
> 
> To resolve these issues, separate the configuration into:
> 
>    - active_config: the configuration applied to the current session
>    - config: the configuration set via sysfs
> 
> Additionally:
> 
>    - Apply the configuration from configfs after taking the appropriate mode.
> 
>    - Since active_config and related fields are accessed only by the local CPU
>      in etm4_enable/disable_sysfs_smp_call() (similar to perf enable/disable),
>      remove the lock/unlock from the sysfs enable/disable path and
>      startup/dying_cpu paths, except when accessing config fields.
> 
> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com>
> ---

<...>

> @@ -618,23 +624,45 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
>   static void etm4_enable_sysfs_smp_call(void *info)
>   {
>   	struct etm4_enable_arg *arg = info;
> +	struct etmv4_drvdata *drvdata;
>   	struct coresight_device *csdev;
>   
>   	if (WARN_ON(!arg))
>   		return;
>   
> -	csdev = arg->drvdata->csdev;
> +	drvdata = arg->drvdata;
> +	csdev = drvdata->csdev;
>   	if (!coresight_take_mode(csdev, CS_MODE_SYSFS)) {
>   		/* Someone is already using the tracer */
>   		arg->rc = -EBUSY;
>   		return;
>   	}
>   
> -	arg->rc = etm4_enable_hw(arg->drvdata);
> +	drvdata->active_config = arg->config;
>   
> -	/* The tracer didn't start */
> +	if (arg->cfg_hash) {
> +		arg->rc = cscfg_csdev_enable_active_config(csdev,
> +							   arg->cfg_hash,
> +							   arg->preset);
> +		if (arg->rc)
> +			goto err;
> +	}
> +
> +	drvdata->trcid = arg->trace_id;
> +
> +	/* Tracer will never be paused in sysfs mode */
> +	drvdata->paused = false;
> +
> +	arg->rc = etm4_enable_hw(drvdata);
>   	if (arg->rc)
> -		coresight_set_mode(csdev, CS_MODE_DISABLED);

This needs to disable the active config in the error path:
cscfg_csdev_disable_active_config(drvdata->csdev);

Thanks,
Jie
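
One way to structure that unwinding is with two error labels, so each step undoes only what was already set up. The sketch below is a userspace model with hypothetical names (struct toy_tracer, toy_enable); the flag assignments stand in for cscfg_csdev_disable_active_config() and coresight_set_mode(..., CS_MODE_DISABLED).

```c
#include <errno.h>
#include <stdbool.h>

/* Toy model of the state touched by the sysfs enable path. */
struct toy_tracer {
	bool mode_taken;
	bool cfg_active;
	bool fail_cfg;	/* test knob: config activation fails */
	bool fail_hw;	/* test knob: hardware enable fails */
};

static int toy_enable(struct toy_tracer *t, unsigned long cfg_hash)
{
	int rc;

	if (t->mode_taken)
		return -EBUSY;	/* someone is already using the tracer */
	t->mode_taken = true;

	if (cfg_hash) {
		if (t->fail_cfg) {
			rc = -EINVAL;
			goto err_mode;
		}
		t->cfg_active = true;
	}

	if (t->fail_hw) {
		rc = -EIO;
		goto err_cfg;
	}
	return 0;

err_cfg:
	/* stands in for cscfg_csdev_disable_active_config() */
	t->cfg_active = false;
err_mode:
	/* stands in for coresight_set_mode(csdev, CS_MODE_DISABLED) */
	t->mode_taken = false;
	return rc;
}
```

With this layout a hardware enable failure leaves neither the mode nor the active config behind, which is the state the comment above asks for.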

> +		goto err;
> +
> +	drvdata->sticky_enable = true;
> +
> +	return;
> +err:
> +	/* The tracer didn't start */
> +	coresight_set_mode(csdev, CS_MODE_DISABLED);
>   }
>   
>   /*
> @@ -672,7 +700,7 @@ static int etm4_config_timestamp_event(struct etmv4_drvdata *drvdata,
>   	int ctridx;
>   	int rselector;
>   	const struct etmv4_caps *caps = &drvdata->caps;
> -	struct etmv4_config *config = &drvdata->config;
> +	struct etmv4_config *config = &drvdata->active_config;
>   
>   	/* No point in trying if we don't have at least one counter */
>   	if (!caps->nr_cntr)
> @@ -756,7 +784,7 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
>   	int ret = 0;
>   	struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>   	const struct etmv4_caps *caps = &drvdata->caps;
> -	struct etmv4_config *config = &drvdata->config;
> +	struct etmv4_config *config = &drvdata->active_config;
>   	struct perf_event_attr max_timestamp = {
>   		.ATTR_CFG_FLD_timestamp_CFG = U64_MAX,
>   	};
> @@ -918,40 +946,29 @@ static int etm4_enable_sysfs(struct coresight_device *csdev, struct coresight_pa
>   
>   	/* enable any config activated by configfs */
>   	cscfg_config_sysfs_get_active_cfg(&cfg_hash, &preset);
> -	if (cfg_hash) {
> -		ret = cscfg_csdev_enable_active_config(csdev, cfg_hash, preset);
> -		if (ret) {
> -			etm4_release_trace_id(drvdata);
> -			return ret;
> -		}
> -	}
> -
> -	raw_spin_lock(&drvdata->spinlock);
> -
> -	drvdata->trcid = path->trace_id;
> -
> -	/* Tracer will never be paused in sysfs mode */
> -	drvdata->paused = false;
>   
>   	/*
>   	 * Executing etm4_enable_hw on the cpu whose ETM is being enabled
>   	 * ensures that register writes occur when cpu is powered.
>   	 */
>   	arg.drvdata = drvdata;
> +	arg.cfg_hash = cfg_hash;
> +	arg.preset = preset;
> +	arg.trace_id = path->trace_id;
> +
> +	raw_spin_lock(&drvdata->spinlock);
> +	arg.config = drvdata->config;
> +	raw_spin_unlock(&drvdata->spinlock);
> +
>   	ret = smp_call_function_single(drvdata->cpu,
>   				       etm4_enable_sysfs_smp_call, &arg, 1);
>   	if (!ret)
>   		ret = arg.rc;
>   	if (!ret)
> -		drvdata->sticky_enable = true;
> -
> -	if (ret)
> +		dev_dbg(&csdev->dev, "ETM tracing enabled\n");
> +	else
>   		etm4_release_trace_id(drvdata);
>   
> -	raw_spin_unlock(&drvdata->spinlock);
> -
> -	if (!ret)
> -		dev_dbg(&csdev->dev, "ETM tracing enabled\n");
>   	return ret;
>   }
>   
> @@ -1038,7 +1055,7 @@ static void etm4_disable_hw(struct etmv4_drvdata *drvdata)
>   {
>   	u32 control;
>   	const struct etmv4_caps *caps = &drvdata->caps;
> -	struct etmv4_config *config = &drvdata->config;
> +	struct etmv4_config *config = &drvdata->active_config;
>   	struct coresight_device *csdev = drvdata->csdev;
>   	struct csdev_access *csa = &csdev->access;
>   	int i;
> @@ -1074,6 +1091,8 @@ static void etm4_disable_sysfs_smp_call(void *info)
>   
>   	etm4_disable_hw(drvdata);
>   
> +	cscfg_csdev_disable_active_config(drvdata->csdev);
> +
>   	coresight_set_mode(drvdata->csdev, CS_MODE_DISABLED);
>   }
>   
> @@ -1124,7 +1143,6 @@ static void etm4_disable_sysfs(struct coresight_device *csdev)
>   	 * DYING hotplug callback is serviced by the ETM driver.
>   	 */
>   	cpus_read_lock();
> -	raw_spin_lock(&drvdata->spinlock);
>   
>   	/*
>   	 * Executing etm4_disable_hw on the cpu whose ETM is being disabled
> @@ -1133,10 +1151,6 @@ static void etm4_disable_sysfs(struct coresight_device *csdev)
>   	smp_call_function_single(drvdata->cpu, etm4_disable_sysfs_smp_call,
>   				 drvdata, 1);
>   
> -	raw_spin_unlock(&drvdata->spinlock);
> -
> -	cscfg_csdev_disable_active_config(csdev);
> -
>   	cpus_read_unlock();
>   
>   	/*
> @@ -1379,6 +1393,7 @@ static void etm4_init_arch_data(void *info)
>   	struct etm4_init_arg *init_arg = info;
>   	struct etmv4_drvdata *drvdata;
>   	struct etmv4_caps *caps;
> +	struct etmv4_config *config;
>   	struct csdev_access *csa;
>   	struct device *dev = init_arg->dev;
>   	int i;
> @@ -1386,6 +1401,7 @@ static void etm4_init_arch_data(void *info)
>   	drvdata = dev_get_drvdata(init_arg->dev);
>   	caps = &drvdata->caps;
>   	csa = init_arg->csa;
> +	config = &drvdata->active_config;
>   
>   	/*
>   	 * If we are unable to detect the access mechanism,
> @@ -1446,7 +1462,7 @@ static void etm4_init_arch_data(void *info)
>   
>   	/* EXLEVEL_S, bits[19:16] Secure state instruction tracing */
>   	caps->s_ex_level = FIELD_GET(TRCIDR3_EXLEVEL_S_MASK, etmidr3);
> -	drvdata->config.s_ex_level = caps->s_ex_level;
> +	config->s_ex_level = caps->s_ex_level;
>   	/* EXLEVEL_NS, bits[23:20] Non-secure state instruction tracing */
>   	caps->ns_ex_level = FIELD_GET(TRCIDR3_EXLEVEL_NS_MASK, etmidr3);
>   	/*
> @@ -1692,7 +1708,7 @@ static void etm4_set_default(struct etmv4_config *config)
>   static int etm4_get_next_comparator(struct etmv4_drvdata *drvdata, u32 type)
>   {
>   	int nr_comparator, index = 0;
> -	struct etmv4_config *config = &drvdata->config;
> +	struct etmv4_config *config = &drvdata->active_config;
>   
>   	/*
>   	 * nr_addr_cmp holds the number of comparator _pair_, so time 2
> @@ -1733,7 +1749,7 @@ static int etm4_set_event_filters(struct etmv4_drvdata *drvdata,
>   {
>   	int i, comparator, ret = 0;
>   	u64 address;
> -	struct etmv4_config *config = &drvdata->config;
> +	struct etmv4_config *config = &drvdata->active_config;
>   	struct etm_filters *filters = event->hw.addr_filters;
>   
>   	if (!filters)
> @@ -1851,13 +1867,11 @@ static int etm4_starting_cpu(unsigned int cpu)
>   	if (!etmdrvdata[cpu])
>   		return 0;
>   
> -	raw_spin_lock(&etmdrvdata[cpu]->spinlock);
>   	if (!etmdrvdata[cpu]->os_unlock)
>   		etm4_os_unlock(etmdrvdata[cpu]);
>   
>   	if (coresight_get_mode(etmdrvdata[cpu]->csdev))
>   		etm4_enable_hw(etmdrvdata[cpu]);
> -	raw_spin_unlock(&etmdrvdata[cpu]->spinlock);
>   	return 0;
>   }
>   
> @@ -1866,10 +1880,8 @@ static int etm4_dying_cpu(unsigned int cpu)
>   	if (!etmdrvdata[cpu])
>   		return 0;
>   
> -	raw_spin_lock(&etmdrvdata[cpu]->spinlock);
>   	if (coresight_get_mode(etmdrvdata[cpu]->csdev))
>   		etm4_disable_hw(etmdrvdata[cpu]);
> -	raw_spin_unlock(&etmdrvdata[cpu]->spinlock);
>   	return 0;
>   }
>   
> @@ -2255,7 +2267,8 @@ static int etm4_add_coresight_dev(struct etm4_init_arg *init_arg)
>   	if (!desc.name)
>   		return -ENOMEM;
>   
> -	etm4_set_default(&drvdata->config);
> +	etm4_set_default(&drvdata->active_config);
> +	drvdata->config = drvdata->active_config;
>   
>   	pdata = coresight_get_platform_data(dev);
>   	if (IS_ERR(pdata))
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
> index cbd8890d166a..9b50aaa368cf 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
> @@ -1069,6 +1069,7 @@ struct etmv4_save_state {
>    *		allows tracing at all ELs. We don't want to compute this
>    *		at runtime, due to the additional setting of TRFCR_CX when
>    *		in EL2. Otherwise, 0.
> + * @active_config:	structure holding current applied configuration parameters.
>    * @config:	structure holding configuration parameters.
>    * @save_state:	State to be preserved across power loss
>    * @paused:	Indicates if the trace unit is paused.
> @@ -1089,6 +1090,7 @@ struct etmv4_drvdata {
>   	bool				os_unlock : 1;
>   	bool				paused : 1;
>   	u64				trfcr;
> +	struct etmv4_config		active_config;
>   	struct etmv4_config		config;
>   	struct etmv4_save_state		*save_state;
>   	DECLARE_BITMAP(arch_features, ETM4_IMPDEF_FEATURE_MAX);



^ permalink raw reply

* Re: [PATCH v3 1/6] soc: mediatek: mtk-devapc: refine devapc interrupt handler
From: CK Hu (胡俊光) @ 2026-04-16  3:45 UTC (permalink / raw)
  To: robh@kernel.org, Xiaoshun Xu (徐晓顺),
	krzk+dt@kernel.org, conor+dt@kernel.org, matthias.bgg@gmail.com,
	AngeloGioacchino Del Regno
  Cc: linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Sirius Wang (王皓昱),
	Project_Global_Chrome_Upstream_Group,
	Vince-WL Liu (劉文龍)
In-Reply-To: <20260416031231.2932493-2-xiaoshun.xu@mediatek.com>

On Thu, 2026-04-16 at 11:12 +0800, Xiaoshun Xu wrote:
> Because the violation IRQ handler uses a while loop, the system might
> remain in the interrupt handler indefinitely. Limit the handler to
> processing at most 20 violations per interrupt, which is enough for
> debugging, and then exit the loop.
> 
> Signed-off-by: Xiaoshun Xu <xiaoshun.xu@mediatek.com>
> ---
>  drivers/soc/mediatek/mtk-devapc.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/soc/mediatek/mtk-devapc.c b/drivers/soc/mediatek/mtk-devapc.c
> index f54c966138b5..c9e1401315ad 100644
> --- a/drivers/soc/mediatek/mtk-devapc.c
> +++ b/drivers/soc/mediatek/mtk-devapc.c
> @@ -12,6 +12,7 @@
>  #include <linux/of_irq.h>
>  #include <linux/of_address.h>
>  
> +#define MAX_VIO_NUM 20
>  #define VIO_MOD_TO_REG_IND(m)	((m) / 32)
>  #define VIO_MOD_TO_REG_OFF(m)	((m) % 32)
>  
> @@ -188,13 +189,18 @@ static void devapc_extract_vio_dbg(struct mtk_devapc_context *ctx)
>   */
>  static irqreturn_t devapc_violation_irq(int irq_number, void *data)
>  {
> +	u32 vio_num = 0;
>  	struct mtk_devapc_context *ctx = data;
>  
> -	while (devapc_sync_vio_dbg(ctx))
> +	mask_module_irq(ctx, true);

Masking the IRQ is not related to this patch; this patch is about the infinite loop.
Please move the IRQ masking into a separate patch and describe why it is needed.

Regards,
CK
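
For reference, the bounded-loop part on its own can be modelled as follows. This is a userspace sketch with hypothetical names (struct toy_devapc, toy_sync_vio, toy_drain_violations); only MAX_VIO_NUM and the loop shape come from the patch.

```c
#define MAX_VIO_NUM 20

/* Toy model: "pending" counts violations still latched in hardware. */
struct toy_devapc {
	unsigned int pending;
	unsigned int handled;
};

/* Stand-in for devapc_sync_vio_dbg(): nonzero while a violation is latched. */
static int toy_sync_vio(struct toy_devapc *ctx)
{
	if (!ctx->pending)
		return 0;
	ctx->pending--;
	return 1;
}

/* Handle at most MAX_VIO_NUM violations, then return to the caller. */
static unsigned int toy_drain_violations(struct toy_devapc *ctx)
{
	unsigned int n;

	for (n = 0; n < MAX_VIO_NUM && toy_sync_vio(ctx); n++)
		ctx->handled++;	/* stands in for devapc_extract_vio_dbg() */

	return n;
}
```

Because the counter is checked before the sync call, a source that keeps raising violations costs at most 20 iterations per interrupt, and no violation is consumed without being handled.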

> +
> +	for (vio_num = 0; (vio_num < MAX_VIO_NUM) && (devapc_sync_vio_dbg(ctx)); ++vio_num)
>  		devapc_extract_vio_dbg(ctx);
>  
>  	clear_vio_status(ctx);
>  
> +	mask_module_irq(ctx, false);
> +
>  	return IRQ_HANDLED;
>  }
>  


^ permalink raw reply

* Re: [PATCH v13 3/3] of: Respect #{iommu,msi}-cells in maps
From: Vijayanand Jitta @ 2026-04-16  3:26 UTC (permalink / raw)
  To: Nipun Gupta, Nikhil Agarwal, Joerg Roedel, Will Deacon,
	Robin Murphy, Marc Zyngier, Lorenzo Pieralisi, Thomas Gleixner,
	Saravana Kannan, Richard Zhu, Lucas Stach,
	Krzysztof Wilczyński, Manivannan Sadhasivam, Bjorn Helgaas,
	Frank Li, Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam,
	Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko,
	Dmitry Baryshkov, Konrad Dybcio, Bjorn Andersson, Rob Herring,
	Conor Dooley, Krzysztof Kozlowski, Prakash Gupta, Vikash Garodia
  Cc: linux-kernel, iommu, linux-arm-kernel, devicetree, linux-pci, imx,
	xen-devel, linux-arm-msm, Charan Teja Kalla
In-Reply-To: <20260408-parse_iommu_cells-v13-3-fa921e92661b@oss.qualcomm.com>



On 4/8/2026 3:33 PM, Vijayanand Jitta wrote:
> From: Robin Murphy <robin.murphy@arm.com>
> 
> So far our parsing of {iommu,msi}-map properties has always blindly
> assumed that the output specifiers will always have exactly 1 cell.
> This typically does happen to be the case, but is not actually enforced
> (and the PCI msi-map binding even explicitly states support for 0 or 1
> cells) - as a result we've now ended up with dodgy DTs out in the field
> which depend on this behaviour to map a 1-cell specifier for a 2-cell
> provider, despite that being bogus per the bindings themselves.
> 
> Since there is some potential use in being able to map at least single
> input IDs to multi-cell output specifiers (and properly support 0-cell
> outputs as well), add support for properly parsing and using the target
> nodes' #cells values, albeit with the unfortunate complication of still
> having to work around expectations of the old behaviour too.
> 
> Since there are multi-cell output specifiers, the callers of of_map_id()
> may need to get the exact cell output value for further processing.
> Update of_map_id() to set args_count in the output to reflect the actual
> number of output specifier cells.
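
The legacy-map detection described above can be tried out in userspace. The sketch below is an illustration only: toy_check_bad_map() is a hypothetical port of the pattern check, it assumes four cells per legacy entry as in the example in the patch comment, it takes the length in cells, it uses host-order values instead of __be32, and the empty-map check is an added assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if every 4-cell entry targets the same phandle and ends in 1.
 * "len" is the number of cells, not bytes. Cells are host-order here.
 */
static bool toy_check_bad_map(const uint32_t *map, size_t len)
{
	/* Added assumption: an empty map is never treated as a legacy map. */
	if (len == 0 || len % 4)
		return false;

	uint32_t phandle = map[1];

	for (size_t i = 0; i < len; i += 4) {
		if (map[i + 1] != phandle || map[i + 3] != 1)
			return false;
	}
	return true;
}
```

A map such as <0x0000 P 0x0000 1>, <0x0100 P 0x0100 1> matches the pattern, while a map whose entries target different nodes, or whose length is not a multiple of four cells, does not.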
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Charan Teja Kalla <charan.kalla@oss.qualcomm.com>
> Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>
> ---
>  drivers/of/base.c  | 157 +++++++++++++++++++++++++++++++++++++++++------------
>  include/linux/of.h |   6 +-
>  2 files changed, 125 insertions(+), 38 deletions(-)
> 
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index b3d002015192..2554e4f1a181 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -2096,18 +2096,48 @@ int of_find_last_cache_level(unsigned int cpu)
>  	return cache_level;
>  }
>  
> +/*
> + * Some DTs have an iommu-map targeting a 2-cell IOMMU node while
> + * specifying only 1 cell. Fortunately they all consist of value '1'
> + * as the 2nd cell entry with the same target, so check for that pattern.
> + *
> + * Example:
> + *	IOMMU node:
> + *		#iommu-cells = <2>;
> + *
> + *	Device node:
> + *		iommu-map = <0x0000 &smmu 0x0000 0x1>,
> + *			    <0x0100 &smmu 0x0100 0x1>;
> + */
> +static bool of_check_bad_map(const __be32 *map, int len)
> +{
> +	__be32 phandle = map[1];
> +
> +	if (len % 4)
> +		return false;
> +	for (int i = 0; i < len; i += 4) {
> +		if (map[i + 1] != phandle || map[i + 3] != cpu_to_be32(1))
> +			return false;
> +	}
> +	return true;
> +}
> +
>  /**
>   * of_map_id - Translate an ID through a downstream mapping.
>   * @np: root complex device node.
>   * @id: device ID to map.
>   * @map_name: property name of the map to use.
> + * @cells_name: property name of target specifier cells.
>   * @map_mask_name: optional property name of the mask to use.
>   * @filter_np: optional device node to filter matches by, or NULL to match any.
>   *	If non-NULL, only map entries targeting this node will be matched.
>   * @arg: pointer to a &struct of_phandle_args for the result. On success,
> - *	@arg->args[0] will contain the translated ID. If a map entry was
> - *	matched, @arg->np will be set to the target node with a reference
> - *	held that the caller must release with of_node_put().
> + *	@arg->args_count will be set to the number of output specifier cells
> + *	as defined by @cells_name in the target node, and
> + *	@arg->args[0..args_count-1] will contain the translated output
> + *	specifier values. If a map entry was matched, @arg->np will be set
> + *	to the target node with a reference held that the caller must release
> + *	with of_node_put().
>   *
>   * Given a device ID, look up the appropriate implementation-defined
>   * platform ID and/or the target device which receives transactions on that
> @@ -2116,17 +2146,19 @@ int of_find_last_cache_level(unsigned int cpu)
>   * Return: 0 on success or a standard error code on failure.
>   */
>  int of_map_id(const struct device_node *np, u32 id,
> -	       const char *map_name, const char *map_mask_name,
> +	       const char *map_name, const char *cells_name,
> +	       const char *map_mask_name,
>  	       const struct device_node *filter_np, struct of_phandle_args *arg)
>  {
>  	u32 map_mask, masked_id;
> -	int map_len;
> +	int map_bytes, map_len, offset = 0;
> +	bool bad_map = false;
>  	const __be32 *map = NULL;
>  
>  	if (!np || !map_name || !arg)
>  		return -EINVAL;
>  
> -	map = of_get_property(np, map_name, &map_len);
> +	map = of_get_property(np, map_name, &map_bytes);
>  	if (!map) {
>  		if (filter_np)
>  			return -ENODEV;
> @@ -2136,11 +2168,9 @@ int of_map_id(const struct device_node *np, u32 id,
>  		return 0;
>  	}
>  
> -	if (!map_len || map_len % (4 * sizeof(*map))) {
> -		pr_err("%pOF: Error: Bad %s length: %d\n", np,
> -			map_name, map_len);
> -		return -EINVAL;
> -	}
> +	if (map_bytes % sizeof(*map))
> +		goto err_map_len;
> +	map_len = map_bytes / sizeof(*map);
>  
>  	/* The default is to select all bits. */
>  	map_mask = 0xffffffff;
> @@ -2153,39 +2183,84 @@ int of_map_id(const struct device_node *np, u32 id,
>  		of_property_read_u32(np, map_mask_name, &map_mask);
>  
>  	masked_id = map_mask & id;
> -	for ( ; map_len > 0; map_len -= 4 * sizeof(*map), map += 4) {
> +
> +	while (offset < map_len) {
>  		struct device_node *phandle_node;
> -		u32 id_base = be32_to_cpup(map + 0);
> -		u32 phandle = be32_to_cpup(map + 1);
> -		u32 out_base = be32_to_cpup(map + 2);
> -		u32 id_len = be32_to_cpup(map + 3);
> +		u32 id_base, phandle, id_len, id_off, cells = 0;
> +		const __be32 *out_base;
> +
> +		if (map_len - offset < 2)
> +			goto err_map_len;
> +
> +		id_base = be32_to_cpup(map + offset);
>  
>  		if (id_base & ~map_mask) {
> -			pr_err("%pOF: Invalid %s translation - %s-mask (0x%x) ignores id-base (0x%x)\n",
> -				np, map_name, map_name,
> -				map_mask, id_base);
> +			pr_err("%pOF: Invalid %s translation - %s (0x%x) ignores id-base (0x%x)\n",
> +			       np, map_name, map_mask_name, map_mask, id_base);
>  			return -EFAULT;
>  		}
>  
> -		if (masked_id < id_base || masked_id >= id_base + id_len)
> -			continue;
> -
> +		phandle = be32_to_cpup(map + offset + 1);
>  		phandle_node = of_find_node_by_phandle(phandle);
>  		if (!phandle_node)
>  			return -ENODEV;
>  
> +		if (bad_map) {
> +			cells = 1;
> +		} else if (of_property_read_u32(phandle_node, cells_name, &cells)) {
> +			pr_err("%pOF: missing %s property\n", phandle_node, cells_name);
> +			of_node_put(phandle_node);
> +			return -EINVAL;
> +		}
> +
> +		if (map_len - offset < 3 + cells) {
> +			of_node_put(phandle_node);
> +			goto err_map_len;
> +		}
> +
> +		if (offset == 0 && cells == 2) {
> +			bad_map = of_check_bad_map(map, map_len);
> +			if (bad_map) {
> +				pr_warn_once("%pOF: %s mismatches target %s, assuming extra cell of 0\n",
> +					     np, map_name, cells_name);
> +				cells = 1;
> +			}
> +		}
> +
> +		out_base = map + offset + 2;
> +		offset += 3 + cells;
> +
> +		id_len = be32_to_cpup(map + offset - 1);
> +		if (id_len > 1 && cells > 1) {
> +			/*
> +			 * With 1 output cell we reasonably assume its value
> +			 * has a linear relationship to the input; with more,
> +			 * we'd need help from the provider to know what to do.
> +			 */
> +			pr_err("%pOF: Unsupported %s - cannot handle %d-ID range with %d-cell output specifier\n",
> +			       np, map_name, id_len, cells);
> +			of_node_put(phandle_node);
> +			return -EINVAL;
> +		}
> +		id_off = masked_id - id_base;
> +		if (masked_id < id_base || id_off >= id_len) {
> +			of_node_put(phandle_node);
> +			continue;
> +		}
> +
>  		if (filter_np && filter_np != phandle_node) {
>  			of_node_put(phandle_node);
>  			continue;
>  		}
>  
>  		arg->np = phandle_node;
> -		arg->args[0] = masked_id - id_base + out_base;
> -		arg->args_count = 1;
> +		for (int i = 0; i < cells; i++)
> +			arg->args[i] = id_off + be32_to_cpu(out_base[i]);
> +		arg->args_count = cells;
>  
>  		pr_debug("%pOF: %s, using mask %08x, id-base: %08x, out-base: %08x, length: %08x, id: %08x -> %08x\n",
> -			np, map_name, map_mask, id_base, out_base,
> -			id_len, id, masked_id - id_base + out_base);
> +			np, map_name, map_mask, id_base, be32_to_cpup(out_base),
> +			id_len, id, id_off + be32_to_cpup(out_base));
>  		return 0;
>  	}
>  
> @@ -2196,6 +2271,10 @@ int of_map_id(const struct device_node *np, u32 id,
>  	arg->args[0] = id;
>  	arg->args_count = 1;
>  	return 0;
> +
> +err_map_len:
> +	pr_err("%pOF: Error: Bad %s length: %d\n", np, map_name, map_bytes);
> +	return -EINVAL;
>  }
>  EXPORT_SYMBOL_GPL(of_map_id);
>  
> @@ -2205,18 +2284,21 @@ EXPORT_SYMBOL_GPL(of_map_id);
>   * @id: Requester ID of the device (e.g. PCI RID/BDF or a platform
>   *      stream/device ID) used as the lookup key in the iommu-map table.
>   * @arg: pointer to a &struct of_phandle_args for the result. On success,
> - *	@arg->args[0] contains the translated ID. If a map entry was matched,
> - *	@arg->np holds a reference to the target node that the caller must
> - *	release with of_node_put().
> + *	@arg->args_count will be set to the number of output specifier cells
> + *	and @arg->args[0..args_count-1] will contain the translated output
> + *	specifier values. If a map entry was matched, @arg->np holds a
> + *	reference to the target node that the caller must release with
> + *	of_node_put().
>   *
> - * Convenience wrapper around of_map_id() using "iommu-map" and "iommu-map-mask".
> + * Convenience wrapper around of_map_id() using "iommu-map", "#iommu-cells",
> + * and "iommu-map-mask".
>   *
>   * Return: 0 on success or a standard error code on failure.
>   */
>  int of_map_iommu_id(const struct device_node *np, u32 id,
>  		    struct of_phandle_args *arg)
>  {
> -	return of_map_id(np, id, "iommu-map", "iommu-map-mask", NULL, arg);
> +	return of_map_id(np, id, "iommu-map", "#iommu-cells", "iommu-map-mask", NULL, arg);
>  }
>  EXPORT_SYMBOL_GPL(of_map_iommu_id);
>  
> @@ -2229,17 +2311,20 @@ EXPORT_SYMBOL_GPL(of_map_iommu_id);
>   *	to match any. If non-NULL, only map entries targeting this node will
>   *	be matched.
>   * @arg: pointer to a &struct of_phandle_args for the result. On success,
> - *	@arg->args[0] contains the translated ID. If a map entry was matched,
> - *	@arg->np holds a reference to the target node that the caller must
> - *	release with of_node_put().
> + *	@arg->args_count will be set to the number of output specifier cells
> + *	and @arg->args[0..args_count-1] will contain the translated output
> + *	specifier values. If a map entry was matched, @arg->np holds a
> + *	reference to the target node that the caller must release with
> + *	of_node_put().
>   *
> - * Convenience wrapper around of_map_id() using "msi-map" and "msi-map-mask".
> + * Convenience wrapper around of_map_id() using "msi-map", "#msi-cells",
> + * and "msi-map-mask".
>   *
>   * Return: 0 on success or a standard error code on failure.
>   */
>  int of_map_msi_id(const struct device_node *np, u32 id,
>  		  const struct device_node *filter_np, struct of_phandle_args *arg)
>  {
> -	return of_map_id(np, id, "msi-map", "msi-map-mask", filter_np, arg);
> +	return of_map_id(np, id, "msi-map", "#msi-cells", "msi-map-mask", filter_np, arg);
>  }
>  EXPORT_SYMBOL_GPL(of_map_msi_id);
> diff --git a/include/linux/of.h b/include/linux/of.h
> index 8548cd9eb4f1..51ac8539f2c3 100644
> --- a/include/linux/of.h
> +++ b/include/linux/of.h
> @@ -462,7 +462,8 @@ const char *of_prop_next_string(const struct property *prop, const char *cur);
>  bool of_console_check(const struct device_node *dn, char *name, int index);
>  
>  int of_map_id(const struct device_node *np, u32 id,
> -	       const char *map_name, const char *map_mask_name,
> +	       const char *map_name, const char *cells_name,
> +	       const char *map_mask_name,
>  	       const struct device_node *filter_np, struct of_phandle_args *arg);
>  
>  int of_map_iommu_id(const struct device_node *np, u32 id,
> @@ -934,7 +935,8 @@ static inline void of_property_clear_flag(struct property *p, unsigned long flag
>  }
>  
>  static inline int of_map_id(const struct device_node *np, u32 id,
> -			     const char *map_name, const char *map_mask_name,
> +			     const char *map_name, const char *cells_name,
> +			     const char *map_mask_name,
>  			     const struct device_node *filter_np,
>  			     struct of_phandle_args *arg)
>  {
> 
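
For anyone skimming the thread, here is a minimal userspace sketch of the variable-width map walk the quoted patch describes, where each entry is <id_base phandle out[cells] id_len>. It is an illustration under assumptions, not the kernel code: sketch_args, sketch_cells_of() and sketch_map_id() are hypothetical names, the phandle-to-cells table is made up, values are host-endian, the map mask and the legacy 1-cell workaround are left out, and a lookup with no match is reported as an error rather than passed through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ARGS 16

/* Hypothetical stand-in for struct of_phandle_args (no node reference). */
struct sketch_args {
	uint32_t phandle;
	int args_count;
	uint32_t args[SKETCH_MAX_ARGS];
};

/* Hypothetical stand-in for reading the target node's #cells property. */
static int sketch_cells_of(uint32_t phandle, uint32_t *cells)
{
	switch (phandle) {
	case 1: *cells = 1; return 0;	/* assumed 1-cell provider */
	case 2: *cells = 2; return 0;	/* assumed 2-cell provider */
	default: return -1;		/* unknown target */
	}
}

/*
 * Walk a map whose entries are <id_base phandle out[cells] id_len>.
 * Returns 0 on a match, -1 on a malformed map or when nothing matches.
 */
static int sketch_map_id(const uint32_t *map, size_t map_len, uint32_t id,
			 struct sketch_args *out)
{
	size_t offset = 0;

	assert(map && out);

	while (offset < map_len) {
		uint32_t id_base, phandle, id_len, id_off, cells;
		const uint32_t *out_base;

		/* Need at least id_base and phandle to size the entry. */
		if (map_len - offset < 2)
			return -1;
		id_base = map[offset];
		phandle = map[offset + 1];
		if (sketch_cells_of(phandle, &cells) || cells > SKETCH_MAX_ARGS)
			return -1;
		if (map_len - offset < 3 + cells)
			return -1;

		out_base = map + offset + 2;
		offset += 3 + cells;		/* step to the next entry */
		id_len = map[offset - 1];	/* last cell of this entry */

		/* A linear ID range is only assumed for 1-cell outputs. */
		if (id_len > 1 && cells > 1)
			return -1;

		id_off = id - id_base;
		if (id < id_base || id_off >= id_len)
			continue;

		out->phandle = phandle;
		for (uint32_t i = 0; i < cells; i++)
			out->args[i] = id_off + out_base[i];
		out->args_count = (int)cells;
		return 0;
	}
	return -1;
}
```

With a map of { 0x100, 1, 0x8000, 0x10, 0x200, 2, 0xA, 0xB, 1 }, ID 0x105 falls in the first (1-cell) entry and ID 0x200 in the second (2-cell) entry, so the caller has to look at args_count to know how many output cells are valid.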

Gentle ping.

Thanks,
Vijay



* [PATCH v3 5/6] soc: mediatek: mtk-devapc: Add support for MT8196 DEVAPC
From: Xiaoshun Xu @ 2026-04-16  3:12 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Matthias Brugger,
	AngeloGioacchino Del Regno, Xiaoshun Xu
  Cc: devicetree, linux-kernel, linux-arm-kernel, linux-mediatek,
	Sirius Wang, Vince-wl Liu, Project_Global_Chrome_Upstream_Group
In-Reply-To: <20260416031231.2932493-1-xiaoshun.xu@mediatek.com>

Add support for the MT8196 DEVAPC. Its debug registers use the
version 3 layout, so reuse devapc_regs_ofs_ver3 and add the
"mediatek,mt8196-devapc" compatible.

Signed-off-by: Xiaoshun Xu <xiaoshun.xu@mediatek.com>
---
 drivers/soc/mediatek/mtk-devapc.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/soc/mediatek/mtk-devapc.c b/drivers/soc/mediatek/mtk-devapc.c
index 824b49613c5a..0f828028bdb4 100644
--- a/drivers/soc/mediatek/mtk-devapc.c
+++ b/drivers/soc/mediatek/mtk-devapc.c
@@ -324,6 +324,11 @@ static const struct mtk_devapc_data devapc_mt8189 = {
 	.regs_ofs = &devapc_regs_ofs_ver3,
 };
 
+static const struct mtk_devapc_data devapc_mt8196 = {
+	.version = 3,
+	.regs_ofs = &devapc_regs_ofs_ver3,
+};
+
 static const struct of_device_id mtk_devapc_dt_match[] = {
 	{
 		.compatible = "mediatek,mt6779-devapc",
@@ -334,6 +339,9 @@ static const struct of_device_id mtk_devapc_dt_match[] = {
 	}, {
 		.compatible = "mediatek,mt8189-devapc",
 		.data = &devapc_mt8189,
+	}, {
+		.compatible = "mediatek,mt8196-devapc",
+		.data = &devapc_mt8196,
 	}, {
 	},
 };
-- 
2.45.2



