LinuxPPC-Dev Archive on lore.kernel.org
From: Hari Bathini <hbathini@linux.ibm.com>
To: adubey@linux.ibm.com, bpf@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org, maddy@linux.ibm.com,
	ast@kernel.org, andrii@kernel.org, daniel@iogearbox.net,
	shuah@kernel.org, linux-kselftest@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH v7 2/7] powerpc/bpf: Move out dummy_tramp_addr after Long branch stub
Date: Sat, 13 Jun 2026 18:07:59 +0530
Message-ID: <c1a8cf30-a229-48b7-93a5-02d7fc288fc8@linux.ibm.com>
In-Reply-To: <20260611153826.31187-3-adubey@linux.ibm.com>



On 11/06/26 9:08 pm, adubey@linux.ibm.com wrote:
> From: Abhishek Dubey <adubey@linux.ibm.com>
> 
> Move the long branch address field to the bottom of the long
> branch stub. This allows uninterrupted disassembly until the
> last 8 bytes. The last bytes exclusion is logically necessary to
> prevent disassembly failure, otherwise the actual program layout
> is never altered. Hence no effect on overall program size.
> Also, align dummy_tramp_addr field with 8-byte boundary.
> 
> Following is disassembler output for test program with moved down
> dummy_tramp_addr field:
> .....
> .....
> pc:68    left:44     a6 03 08 7c  :  mtlr 0
> pc:72    left:40     bc ff ff 4b  :  b .-68
> pc:76    left:36     a6 02 68 7d  :  mflr 11
> pc:80    left:32     05 00 9f 42  :  bcl 20, 31, .+4
> pc:84    left:28     a6 02 88 7d  :  mflr 12
> pc:88    left:24     14 00 8c e9  :  ld 12, 20(12)
> pc:92    left:20     a6 03 89 7d  :  mtctr 12
> pc:96    left:16     a6 03 68 7d  :  mtlr 11
> pc:100   left:12     20 04 80 4e  :  bctr
> pc:104   left:8      c0 34 1d 00  :
> 
> Failure log:
> Can't disasm instruction at offset 104: c0 34 1d 00 00 00 00 c0
> Disassembly logic can truncate at 104, ignoring last 8 bytes.
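
For reference, the displacement of 20 in "ld 12, 20(12)" can be checked
against the listing above. "bcl 20, 31, .+4" at pc 80 leaves the address
of the next instruction (pc 84) in LR, and the address field sits at
pc 104. A minimal sketch of that arithmetic follows; the helper names
are hypothetical and are not part of the patch:

```c
#include <stdint.h>

enum { INSN_SZ = 4 };	/* every powerpc instruction here is 4 bytes */

/* "bcl 20,31,$+4" sets LR to the address of the instruction after it. */
static int32_t lr_after_bcl(int32_t bcl_pc)
{
	return bcl_pc + INSN_SZ;
}

/*
 * Displacement to use in "ld r12, disp(r12)" once r12 holds LR:
 * distance from the captured LR to the address field.
 */
static int32_t ld_displacement(int32_t bcl_pc, int32_t field_pc)
{
	return field_pc - lr_after_bcl(bcl_pc);
}
```

With the listing's values, ld_displacement(80, 104) gives 20. For the
old layout on ppc64 (field 12 bytes before the bcl), ld_displacement(12, 0)
gives -16, matching the earlier "ld r12, -16(r12)".
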
> 
> Update the dummy_tramp_addr field offset calculation from the end
> of the program to reflect its new location, for bpf_arch_text_poke()
> to update the actual trampoline's address in this field.
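
To illustrate the offset change described above, here is a rough sketch
of where the field sits relative to the end of the program in the old
and new layouts. It assumes ppc64 (SZL == 8) and the 7-instruction long
branch stub shown in this patch; the function names are made up for
illustration and do not exist in the kernel:

```c
#include <stddef.h>

#define SZL			8	/* assumed: sizeof(unsigned long) on ppc64 */
#define LONG_BRANCH_INSNS	7	/* mflr, bcl, mflr, ld, mtctr, mtlr, bctr */

/* Old layout: the address field precedes the 7 stub instructions. */
static size_t old_field_offset(size_t prog_len)
{
	return prog_len - LONG_BRANCH_INSNS * 4 - SZL;
}

/* New layout: the address field is the last SZL bytes of the program. */
static size_t new_field_offset(size_t prog_len)
{
	return prog_len - SZL;
}
```

For the 112-byte test program in the listing, new_field_offset(112) is
104, which is where the disassembler stops, and old_field_offset(112)
is 76, which is where the long branch stub now begins.
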
> 
> All BPF trampoline selftests continue to pass with this patch applied.
> 
> Signed-off-by: Abhishek Dubey <adubey@linux.ibm.com>
> ---
>   arch/powerpc/net/bpf_jit.h        |  3 +-
>   arch/powerpc/net/bpf_jit_comp.c   | 51 ++++++++++++++++---------------
>   arch/powerpc/net/bpf_jit_comp64.c |  3 +-
>   3 files changed, 31 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> index 71e6e7d01057..6632de9871dd 100644
> --- a/arch/powerpc/net/bpf_jit.h
> +++ b/arch/powerpc/net/bpf_jit.h
> @@ -217,7 +217,8 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx);
>   void bpf_jit_build_epilogue(u32 *image, u32 *fimage, struct codegen_context *ctx);
>   void bpf_jit_build_fentry_stubs(u32 *image, u32 *fimage, struct codegen_context *ctx);
>   void bpf_jit_realloc_regs(struct codegen_context *ctx);
> -int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg, long exit_addr);

> +int bpf_jit_emit_exit_insn(u32 *image, u32 *fimage, struct codegen_context *ctx, int tmp_reg,
> +										long exit_addr);

Yes, this does not compile on ppc32 without the corresponding
change to the ppc32 JIT.

>   void prepare_for_fsession_fentry(u32 *image, struct codegen_context *ctx, int cookie_cnt,
>   								int cookie_off, int retval_off);
>   void store_func_meta(u32 *image, struct codegen_context *ctx, u64 func_meta, int func_meta_off);
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> index 79288ff789b5..ebee23d33396 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -52,9 +52,10 @@ asm (
>   void bpf_jit_build_fentry_stubs(u32 *image, u32 *fimage, struct codegen_context *ctx)
>   {
>   	int ool_stub_idx, long_branch_stub_idx;
> -	int ool_instrs;
> +	int stubs_instrs;
>   
>   	/*
> +	 * The dummy_tramp_addr field is placed at bottom of Long branch stub.
>   	 * In the final pass, align the mis-aligned dummy_tramp_addr field
>   	 * in the fimage. The alignment NOP must appear before OOL stub,
>   	 * to make ool_stub_idx & long_branch_stub_idx constant from end.
> @@ -62,13 +63,10 @@ void bpf_jit_build_fentry_stubs(u32 *image, u32 *fimage, struct codegen_context
>   	 * dummy_tramp_addr must be 8-byte aligned for load-register
>   	 * compatibility. The fimage can be non 8-byte aligned, so final
>   	 * alignment depends on start of fimage and the stub's instruction
> -	 * count offset. The OOL stub has 4 instructions (with
> -	 * CONFIG_PPC_FTRACE_OUT_OF_LINE) or 3 instructions (without)
> -	 * before dummy_tramp_addr.
> -	 *
> -	 * Emit a NOP here if (ctx->idx + ool_instrs) is odd, so that
> -	 * dummy_tramp_addr lands at an even instruction offset (== 8-byte
> -	 * aligned from an 8-byte aligned base).
> +	 * count. The stubs block has 11 instructions (with
> +	 * CONFIG_PPC_FTRACE_OUT_OF_LINE) or 10 instructions (without)
> +	 * before dummy_tramp_addr field. Emit a NOP if the address of
> +	 * dummy_tramp_addr is non aligned.
>   	 *
>   	 * In pass=0 when image==NULL, conservatively account for space
>   	 * required to accommodate alignment NOP. In case final pass skips
> @@ -76,8 +74,8 @@ void bpf_jit_build_fentry_stubs(u32 *image, u32 *fimage, struct codegen_context
>   	 * jited_len signifies correct program size.
>   	 */
>   
> -	ool_instrs = IS_ENABLED(CONFIG_PPC_FTRACE_OUT_OF_LINE) ? 4*4 : 3*4;
> -	if (!image || !IS_ALIGNED((unsigned long)fimage + ctx->idx*4 + ool_instrs, SZL))
> +	stubs_instrs = IS_ENABLED(CONFIG_PPC_FTRACE_OUT_OF_LINE) ? 11*4 : 10*4;

The value is a size in bytes, not an instruction count, so this
should be stubs_sz instead of stubs_instrs. So:

     stubs_sz = IS_ENABLED(CONFIG_PPC_FTRACE_OUT_OF_LINE) ? 44 : 40;
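
As a standalone illustration of the check with a byte-sized variable,
something along these lines. This is only a userspace sketch: the
function name and the boolean parameter standing in for
IS_ENABLED(CONFIG_PPC_FTRACE_OUT_OF_LINE) are assumptions, SZL is taken
as 8, and the "!image" pass-0 case from the patch is left out:

```c
#include <stdbool.h>
#include <stdint.h>

#define SZL	8u	/* assumed: sizeof(unsigned long) on ppc64 */

/*
 * Returns true when an alignment NOP is needed so that the address
 * field following the stubs lands on an SZL-byte boundary.
 * fimage: final image base address, idx: current instruction index.
 */
static bool needs_align_nop(uintptr_t fimage, unsigned int idx, bool ftrace_ool)
{
	/* Bytes of stub code emitted before dummy_tramp_addr. */
	unsigned int stubs_sz = ftrace_ool ? 44 : 40;

	return ((fimage + idx * 4u + stubs_sz) % SZL) != 0;
}
```

With an 8-byte aligned base and idx 0, the 40-byte case needs no NOP
while the 44-byte case does.
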


> +	if (!image || !IS_ALIGNED((unsigned long)fimage + ctx->idx*4 + stubs_instrs, SZL))
>   		EMIT(PPC_RAW_NOP());
>   
>   	/*
> @@ -98,35 +96,37 @@ void bpf_jit_build_fentry_stubs(u32 *image, u32 *fimage, struct codegen_context
>   
>   	/*
>   	 * Long branch stub:
> -	 *	.long	<dummy_tramp_addr>  // 8-byte aligned
>   	 *	mflr	r11
>   	 *	bcl	20,31,$+4
> -	 *	mflr	r12
> -	 *	ld	r12, -8-SZL(r12)
> +	 *	mflr	r12	// lr/r12 stores pc of current(this) inst.
> +	 *	ld	r12, 20(r12) // offset(dummy_tramp_addr) from prev inst. is 20
>   	 *	mtctr	r12
> -	 *	mtlr	r11 // needed to retain ftrace ABI
> +	 *	mtlr	r11	// needed to retain ftrace ABI
>   	 *	bctr
> +	 *	.long	<dummy_tramp_addr>  // 8-byte aligned
>   	 */
> -	if (image)
> -		*((unsigned long *)&image[ctx->idx]) = (unsigned long)dummy_tramp;
> -
> -	ctx->idx += SZL / 4;
>   	long_branch_stub_idx = ctx->idx;
>   	EMIT(PPC_RAW_MFLR(_R11));
>   	EMIT(PPC_RAW_BCL4());
>   	EMIT(PPC_RAW_MFLR(_R12));
> -	EMIT(PPC_RAW_LL(_R12, _R12, -8-SZL));
> +	EMIT(PPC_RAW_LL(_R12, _R12, 20));
>   	EMIT(PPC_RAW_MTCTR(_R12));
>   	EMIT(PPC_RAW_MTLR(_R11));
>   	EMIT(PPC_RAW_BCTR());
>   
> +	if (image)
> +		*((unsigned long *)&image[ctx->idx]) = (unsigned long)dummy_tramp;
> +
> +	ctx->idx += SZL / 4;
> +
>   	if (!bpf_jit_ool_stub) {
>   		bpf_jit_ool_stub = (ctx->idx - ool_stub_idx) * 4;
>   		bpf_jit_long_branch_stub = (ctx->idx - long_branch_stub_idx) * 4;
>   	}
>   }
>   
> -int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg, long exit_addr)
> +int bpf_jit_emit_exit_insn(u32 *image, u32 *fimage, struct codegen_context *ctx,
> +							int tmp_reg, long exit_addr)
>   {
>   	if (!exit_addr || is_offset_in_branch_range(exit_addr - (ctx->idx * 4))) {
>   		PPC_JMP(exit_addr);
> @@ -136,7 +136,7 @@ int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg,
>   		PPC_JMP(ctx->alt_exit_addr);
>   	} else {
>   		ctx->alt_exit_addr = ctx->idx * 4;
> -		bpf_jit_build_epilogue(image, NULL, ctx);
> +		bpf_jit_build_epilogue(image, fimage, ctx);
>   	}
>   
>   	return 0;
> @@ -1289,6 +1289,7 @@ static void do_isync(void *info __maybe_unused)
>    * bpf_func:
>    *	[nop|b]	ool_stub
>    * 2. Out-of-line stub:
> + *	nop	// optional nop for alignment
>    * ool_stub:
>    *	mflr	r0
>    *	[b|bl]	<bpf_prog>/<long_branch_stub>
> @@ -1296,14 +1297,14 @@ static void do_isync(void *info __maybe_unused)
>    *	b	bpf_func + 4
>    * 3. Long branch stub:
>    * long_branch_stub:
> - *	.long	<branch_addr>/<dummy_tramp>
>    *	mflr	r11
>    *	bcl	20,31,$+4
>    *	mflr	r12
> - *	ld	r12, -16(r12)
> + *	ld	r12, 20(r12)
>    *	mtctr	r12
>    *	mtlr	r11 // needed to retain ftrace ABI
>    *	bctr
> + *	.long	<branch_addr>/<dummy_tramp>
>    *
>    * dummy_tramp is used to reduce synchronization requirements.
>    *
> @@ -1405,10 +1406,12 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
>   	 * 1. Update the address in the long branch stub:
>   	 * If new_addr is out of range, we will have to use the long branch stub, so patch new_addr
>   	 * here. Otherwise, revert to dummy_tramp, but only if we had patched old_addr here.
> +	 *
> +	 * dummy_tramp_addr moved to bottom of long branch stub.
>   	 */
>   	if ((new_addr && !is_offset_in_branch_range(new_addr - ip)) ||
>   	    (old_addr && !is_offset_in_branch_range(old_addr - ip)))
> -		ret = patch_ulong((void *)(bpf_func_end - bpf_jit_long_branch_stub - SZL),
> +		ret = patch_ulong((void *)(bpf_func_end - SZL), /* SZL: dummy_tramp_addr offset */
>   				  (new_addr && !is_offset_in_branch_range(new_addr - ip)) ?
>   				  (unsigned long)new_addr : (unsigned long)dummy_tramp);
>   	if (ret)
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 885dc8cf55a2..eaf816a07f14 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -1726,7 +1726,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
>   			 * we'll just fall through to the epilogue.
>   			 */
>   			if (i != flen - 1) {
> -				ret = bpf_jit_emit_exit_insn(image, ctx, tmp1_reg, exit_addr);
> +				ret = bpf_jit_emit_exit_insn(image, fimage, ctx,
> +								tmp1_reg, exit_addr);
>   				if (ret)
>   					return ret;
>   			}

- Hari



Thread overview: 16+ messages
2026-06-11 15:38 [PATCH v7 0/7] powerpc/bpf: Add support for verifier selftest adubey
2026-06-11 15:38 ` [PATCH v7 1/7] powerpc/bpf: fix alignment of long branch trampoline address adubey
2026-06-13 12:34   ` Hari Bathini
2026-06-11 15:38 ` [PATCH v7 2/7] powerpc/bpf: Move out dummy_tramp_addr after Long branch stub adubey
2026-06-11 12:18   ` bot+bpf-ci
2026-06-13 12:37   ` Hari Bathini [this message]
2026-06-11 15:38 ` [PATCH v7 3/7] selftest/bpf: Fixing powerpc JIT disassembly failure adubey
2026-06-13 12:40   ` Hari Bathini
2026-06-11 15:38 ` [PATCH v7 4/7] selftest/bpf: Enable verifier selftest for powerpc64 adubey
2026-06-13 12:48   ` Hari Bathini
2026-06-11 15:38 ` [PATCH v7 5/7] powerpc64/bpf: fix compare instruction emitted for tailcall adubey
2026-06-13 12:41   ` Hari Bathini
2026-06-11 15:38 ` [PATCH v7 6/7] selftest/bpf: Add tailcall verifier selftest for powerpc64 adubey
2026-06-13 12:42   ` Hari Bathini
2026-06-11 15:38 ` [PATCH v7 7/7] powerpc/bpf: fix buffer overflow in JIT for large BPF programs adubey
2026-06-13 12:44   ` Hari Bathini
