public inbox for bpf@vger.kernel.org
* [PATCH RESEND bpf-next v2 0/3] Use bpf_prog_pack for RV64 bpf trampoline
@ 2024-06-22  3:04 Pu Lehui
  2024-06-22  3:04 ` [PATCH RESEND bpf-next v2 1/3] bpf: Use precise image size for struct_ops trampoline Pu Lehui
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Pu Lehui @ 2024-06-22  3:04 UTC (permalink / raw)
  To: bpf, linux-riscv, netdev
  Cc: Björn Töpel, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Puranjay Mohan, Palmer Dabbelt, Pu Lehui

bpf_prog_pack aggregates bpf programs into huge pages to relieve iTLB
pressure on the system. The same can be applied to the bpf trampoline,
as Song has already implemented it in core and x86 [0]. This series
uses bpf_prog_pack for the RV64 bpf trampoline. Since Song and Puranjay
have done a lot of work on bpf_prog_pack for RV64, implementing this is
straightforward. One thing to note is that emit_call on RV64 generates
the maximum number of instructions during the dry run, but during real
patching it may be optimized to 1 instruction depending on the distance
to the target. This is not a problem, as the result can only be smaller
than the dry-run size and so does not overflow the allocated RO image.
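
The sizing argument can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: call_insns(),
fits_21b() and real_pass_insns() are hypothetical names, and the
"2 instructions for the long form, 1 for the short form" counts are
assumed for illustration. It is not the kernel's emit_call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does a byte offset fit a signed 21-bit immediate? */
static bool fits_21b(int64_t off)
{
	return off >= -(1LL << 20) && off < (1LL << 20);
}

/*
 * Hypothetical model of how many 32-bit instructions a call takes.
 * have_image == false models the dry run: the distance to the target
 * is unknown, so the long form (assumed to be 2 insns) is counted.
 */
static int call_insns(bool have_image, int64_t off)
{
	if (have_image && off != 0 && fits_21b(off))
		return 1;	/* short form, target is in range */
	return 2;		/* long form, also used for sizing */
}

/* Real pass: must never need more room than the dry run reserved. */
static int real_pass_insns(int64_t off)
{
	int n = call_insns(true, off);

	assert(n <= call_insns(false, off));
	return n;
}
```

In this model the dry run always reserves the long form, so the real
pass can shrink the call but never grow it.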

Tests for the regular trampoline and the struct_ops trampoline pass,
and "test_verifier" reports no failures.

Link: https://lore.kernel.org/all/20231206224054.492250-1-song@kernel.org [0]

v2:
- Emit max number of insns for the "im" addr during dry run to solve OOB issue. (Song)

v1: https://lore.kernel.org/all/20240123103241.2282122-1-pulehui@huaweicloud.com/

Pu Lehui (3):
  bpf: Use precise image size for struct_ops trampoline
  riscv, bpf: Fix out-of-bounds issue when preparing trampoline image
  riscv, bpf: Use bpf_prog_pack for RV64 bpf trampoline

 arch/riscv/net/bpf_jit_comp64.c | 57 +++++++++++++++++++++++----------
 kernel/bpf/bpf_struct_ops.c     |  2 +-
 2 files changed, 41 insertions(+), 18 deletions(-)

-- 
2.34.1



* [PATCH RESEND bpf-next v2 1/3] bpf: Use precise image size for struct_ops trampoline
  2024-06-22  3:04 [PATCH RESEND bpf-next v2 0/3] Use bpf_prog_pack for RV64 bpf trampoline Pu Lehui
@ 2024-06-22  3:04 ` Pu Lehui
  2024-06-22  3:04 ` [PATCH RESEND bpf-next v2 2/3] riscv, bpf: Fix out-of-bounds issue when preparing trampoline image Pu Lehui
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Pu Lehui @ 2024-06-22  3:04 UTC (permalink / raw)
  To: bpf, linux-riscv, netdev
  Cc: Björn Töpel, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Puranjay Mohan, Palmer Dabbelt, Pu Lehui

From: Pu Lehui <pulehui@huawei.com>

For a trampoline using bpf_prog_pack, we need to allocate a rw_image
buffer of size (image_end - image). For a regular trampoline, we use
the precise image size returned by arch_bpf_trampoline_size to allocate
rw_image. But for a struct_ops trampoline, rw_image ends up being
allocated with a size close to PAGE_SIZE. We do not need to allocate
that much, as the patch size is usually much smaller than PAGE_SIZE.
Let's use the precise image size for it too.
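
The difference between the two bounds can be sketched in userspace.
All names here (MODEL_PAGE_SIZE, model_page, rw_image_size() and the
two wrappers) are hypothetical, and a 4 KiB page is assumed. The only
point is that the scratch buffer size follows from
(image_end - image), so a tighter image_end gives a smaller buffer.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Stand-in for the page that holds the struct_ops trampolines. */
static char model_page[MODEL_PAGE_SIZE];

/* Scratch buffer size an arch would derive from (image_end - image). */
static size_t rw_image_size(const char *image, const char *image_end)
{
	assert(image_end >= image);
	return (size_t)(image_end - image);
}

/* Old bound: everything from image_off up to the end of the page. */
static size_t rw_size_page_bound(size_t image_off)
{
	return rw_image_size(model_page + image_off,
			     model_page + MODEL_PAGE_SIZE);
}

/* New bound: exactly the size reported by the sizing pass. */
static size_t rw_size_precise(size_t image_off, size_t size)
{
	return rw_image_size(model_page + image_off,
			     model_page + image_off + size);
}
```

With image_off = 256 and a 120-byte trampoline, the old bound asks for
3840 bytes of scratch space while the precise bound asks for 120.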

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Song Liu <song@kernel.org>
Tested-by: Björn Töpel <bjorn@rivosinc.com> #riscv
---
 kernel/bpf/bpf_struct_ops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index a2cf31b14be4..0d515ec57aa5 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -573,7 +573,7 @@ int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks,
 	}
 
 	size = arch_prepare_bpf_trampoline(NULL, image + image_off,
-					   image + PAGE_SIZE,
+					   image + image_off + size,
 					   model, flags, tlinks, stub_func);
 	if (size <= 0) {
 		if (image != *_image)
-- 
2.34.1



* [PATCH RESEND bpf-next v2 2/3] riscv, bpf: Fix out-of-bounds issue when preparing trampoline image
  2024-06-22  3:04 [PATCH RESEND bpf-next v2 0/3] Use bpf_prog_pack for RV64 bpf trampoline Pu Lehui
  2024-06-22  3:04 ` [PATCH RESEND bpf-next v2 1/3] bpf: Use precise image size for struct_ops trampoline Pu Lehui
@ 2024-06-22  3:04 ` Pu Lehui
  2024-06-22  3:04 ` [PATCH RESEND bpf-next v2 3/3] riscv, bpf: Use bpf_prog_pack for RV64 bpf trampoline Pu Lehui
  2024-07-01 15:20 ` [PATCH RESEND bpf-next v2 0/3] " patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Pu Lehui @ 2024-06-22  3:04 UTC (permalink / raw)
  To: bpf, linux-riscv, netdev
  Cc: Björn Töpel, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Puranjay Mohan, Palmer Dabbelt, Pu Lehui

From: Pu Lehui <pulehui@huawei.com>

We get the size of the trampoline image during the dry run phase and
allocate memory based on that size. The allocated image is then
populated with instructions during the real patch phase. But after
commit 26ef208c209a ("bpf: Use arch_bpf_trampoline_size"), the `im`
argument differs between the dry run and the real patch phase. This may
cause emit_imm on RV64 to generate a different number of instructions
when loading the `im` address, potentially causing out-of-bounds
issues. Let's emit the maximum number of instructions for the `im`
address during the dry run to fix this problem.
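
Why the instruction count depends on the value can be seen with a
simplified userspace model. It assumes a lui/addiw pair for a 32-bit
chunk followed by slli (plus addi when the low bits are non-zero) for
each further 12 bits, and it ignores compressed instructions. The
names imm_insns(), fits_32b(), sizing_imm_insns() and
MODEL_MAX_COUNT_IMM are hypothetical; the constant mirrors
RV_MAX_COUNT_IMM from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as RV_MAX_COUNT_IMM in this patch. */
#define MODEL_MAX_COUNT_IMM 0x7FFF7FF7FF7FF7FFLL

static bool fits_32b(int64_t v)
{
	return v >= INT32_MIN && v <= INT32_MAX;
}

/* Simplified count of 32-bit instructions needed to load val. */
static int imm_insns(int64_t val)
{
	/* Low 12 bits, sign-extended like an addi immediate. */
	int64_t lower = ((val & 0xfff) ^ 0x800) - 0x800;
	/* Remaining bits, compensated for the sign of lower. */
	int64_t upper = (val - lower) / 4096;

	if (fits_32b(val))
		return upper ? 2 : 1;	/* lui + addiw, or a single li */

	/* Trailing zero bits fold into the shift amount. */
	while (!(upper & 1))
		upper /= 2;

	/* Load the upper part, shift it, then add the low bits if any. */
	return imm_insns(upper) + 1 + (lower ? 1 : 0);
}

/* Dry run (no image yet) sizes with the worst-case placeholder. */
static int sizing_imm_insns(bool have_image, int64_t addr)
{
	int n = imm_insns(have_image ? addr : MODEL_MAX_COUNT_IMM);

	assert(n <= imm_insns(MODEL_MAX_COUNT_IMM));
	return n;
}
```

In this model the placeholder costs 8 instructions, which is the
largest count the model can produce, so a dry run sized with it cannot
be exceeded by whatever address is loaded in the real pass.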

Fixes: 26ef208c209a ("bpf: Use arch_bpf_trampoline_size")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
---
 arch/riscv/net/bpf_jit_comp64.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index d5cebb0b0afe..e6d690657f3e 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -16,6 +16,8 @@
 #include "bpf_jit.h"
 
 #define RV_FENTRY_NINSNS 2
+/* imm that allows emit_imm to emit max count insns */
+#define RV_MAX_COUNT_IMM 0x7FFF7FF7FF7FF7FF
 
 #define RV_REG_TCC RV_REG_A6
 #define RV_REG_TCC_SAVED RV_REG_S6 /* Store A6 in S6 if program do calls */
@@ -916,7 +918,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 		orig_call += RV_FENTRY_NINSNS * 4;
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
-		emit_imm(RV_REG_A0, (const s64)im, ctx);
+		emit_imm(RV_REG_A0, ctx->insns ? (const s64)im : RV_MAX_COUNT_IMM, ctx);
 		ret = emit_call((const u64)__bpf_tramp_enter, true, ctx);
 		if (ret)
 			return ret;
@@ -977,7 +979,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		im->ip_epilogue = ctx->insns + ctx->ninsns;
-		emit_imm(RV_REG_A0, (const s64)im, ctx);
+		emit_imm(RV_REG_A0, ctx->insns ? (const s64)im : RV_MAX_COUNT_IMM, ctx);
 		ret = emit_call((const u64)__bpf_tramp_exit, true, ctx);
 		if (ret)
 			goto out;
@@ -1046,6 +1048,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
 {
 	int ret;
 	struct rv_jit_context ctx;
+	u32 size = image_end - image;
 
 	ctx.ninsns = 0;
 	/*
@@ -1059,11 +1062,16 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
 	ctx.ro_insns = image;
 	ret = __arch_prepare_bpf_trampoline(im, m, tlinks, func_addr, flags, &ctx);
 	if (ret < 0)
-		return ret;
+		goto out;
 
-	bpf_flush_icache(ctx.insns, ctx.insns + ctx.ninsns);
+	if (WARN_ON(size < ninsns_rvoff(ctx.ninsns))) {
+		ret = -E2BIG;
+		goto out;
+	}
 
-	return ninsns_rvoff(ret);
+	bpf_flush_icache(image, image_end);
+out:
+	return ret < 0 ? ret : size;
 }
 
 int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
-- 
2.34.1



* [PATCH RESEND bpf-next v2 3/3] riscv, bpf: Use bpf_prog_pack for RV64 bpf trampoline
  2024-06-22  3:04 [PATCH RESEND bpf-next v2 0/3] Use bpf_prog_pack for RV64 bpf trampoline Pu Lehui
  2024-06-22  3:04 ` [PATCH RESEND bpf-next v2 1/3] bpf: Use precise image size for struct_ops trampoline Pu Lehui
  2024-06-22  3:04 ` [PATCH RESEND bpf-next v2 2/3] riscv, bpf: Fix out-of-bounds issue when preparing trampoline image Pu Lehui
@ 2024-06-22  3:04 ` Pu Lehui
  2024-07-01 15:20 ` [PATCH RESEND bpf-next v2 0/3] " patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: Pu Lehui @ 2024-06-22  3:04 UTC (permalink / raw)
  To: bpf, linux-riscv, netdev
  Cc: Björn Töpel, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
	Hao Luo, Jiri Olsa, Puranjay Mohan, Palmer Dabbelt, Pu Lehui

From: Pu Lehui <pulehui@huawei.com>

bpf_prog_pack aggregates bpf programs into huge pages to relieve iTLB
pressure on the system. The same can be applied to the bpf trampoline,
as Song has already implemented it in core and x86 [0]. This patch
uses bpf_prog_pack for the RV64 bpf trampoline. Since Song and Puranjay
have done a lot of work on bpf_prog_pack for RV64, implementing this is
straightforward.
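
The general shape of the change (emit into a writable scratch buffer,
record addresses relative to the final read-only image, then copy) can
be sketched in userspace. Everything here is hypothetical: struct
model_ctx, model_emit(), model_prepare() and model_ro_image are
illustration-only names, malloc/memcpy stand in for the kernel's
allocation and text-copy helpers, and the "epilogue" index is an
arbitrary marker.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced JIT context: only what the sketch needs. */
struct model_ctx {
	uint32_t *insns;	/* writable scratch buffer */
	uint32_t *ro_insns;	/* final, read-only location */
	size_t ninsns;		/* instructions emitted so far */
};

/* Stand-in for the final image; really read-only in the kernel. */
static uint32_t model_ro_image[8];

static void model_emit(struct model_ctx *ctx, uint32_t insn)
{
	ctx->insns[ctx->ninsns++] = insn;
}

/*
 * Emit n_insns nops into a scratch buffer, then copy them to ro_image.
 * Returns the address recorded for the marker, or NULL on failure.
 */
static const uint32_t *model_prepare(uint32_t *ro_image, size_t n_insns,
				     size_t epilogue_idx)
{
	struct model_ctx ctx = { .ro_insns = ro_image };
	const uint32_t *ip_epilogue = NULL;

	assert(epilogue_idx < n_insns);
	ctx.insns = malloc(n_insns * sizeof(*ctx.insns));
	if (!ctx.insns)
		return NULL;

	while (ctx.ninsns < n_insns) {
		/* Recorded addresses must point into the final image. */
		if (ctx.ninsns == epilogue_idx)
			ip_epilogue = ctx.ro_insns + ctx.ninsns;
		model_emit(&ctx, 0x00000013);	/* nop: addi x0, x0, 0 */
	}

	/* Stand-in for the privileged text copy. */
	memcpy(ro_image, ctx.insns, n_insns * sizeof(*ctx.insns));
	free(ctx.insns);	/* scratch is dead once copied */
	return ip_epilogue;
}
```

Had the marker been computed from ctx.insns instead, it would point
into the scratch buffer, which is freed before the function returns.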

Link: https://lore.kernel.org/all/20231206224054.492250-1-song@kernel.org [0]
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com> #riscv
---
 arch/riscv/net/bpf_jit_comp64.c | 43 ++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 14 deletions(-)

diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index e6d690657f3e..351e1484205e 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -957,7 +957,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 			goto out;
 		emit_sd(RV_REG_FP, -retval_off, RV_REG_A0, ctx);
 		emit_sd(RV_REG_FP, -(retval_off - 8), regmap[BPF_REG_0], ctx);
-		im->ip_after_call = ctx->insns + ctx->ninsns;
+		im->ip_after_call = ctx->ro_insns + ctx->ninsns;
 		/* 2 nops reserved for auipc+jalr pair */
 		emit(rv_nop(), ctx);
 		emit(rv_nop(), ctx);
@@ -978,7 +978,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 	}
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
-		im->ip_epilogue = ctx->insns + ctx->ninsns;
+		im->ip_epilogue = ctx->ro_insns + ctx->ninsns;
 		emit_imm(RV_REG_A0, ctx->insns ? (const s64)im : RV_MAX_COUNT_IMM, ctx);
 		ret = emit_call((const u64)__bpf_tramp_exit, true, ctx);
 		if (ret)
@@ -1041,25 +1041,33 @@ int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
 	return ret < 0 ? ret : ninsns_rvoff(ctx.ninsns);
 }
 
-int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
-				void *image_end, const struct btf_func_model *m,
+void *arch_alloc_bpf_trampoline(unsigned int size)
+{
+	return bpf_prog_pack_alloc(size, bpf_fill_ill_insns);
+}
+
+void arch_free_bpf_trampoline(void *image, unsigned int size)
+{
+	bpf_prog_pack_free(image, size);
+}
+
+int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
+				void *ro_image_end, const struct btf_func_model *m,
 				u32 flags, struct bpf_tramp_links *tlinks,
 				void *func_addr)
 {
 	int ret;
+	void *image, *res;
 	struct rv_jit_context ctx;
-	u32 size = image_end - image;
+	u32 size = ro_image_end - ro_image;
+
+	image = kvmalloc(size, GFP_KERNEL);
+	if (!image)
+		return -ENOMEM;
 
 	ctx.ninsns = 0;
-	/*
-	 * The bpf_int_jit_compile() uses a RW buffer (ctx.insns) to write the
-	 * JITed instructions and later copies it to a RX region (ctx.ro_insns).
-	 * It also uses ctx.ro_insns to calculate offsets for jumps etc. As the
-	 * trampoline image uses the same memory area for writing and execution,
-	 * both ctx.insns and ctx.ro_insns can be set to image.
-	 */
 	ctx.insns = image;
-	ctx.ro_insns = image;
+	ctx.ro_insns = ro_image;
 	ret = __arch_prepare_bpf_trampoline(im, m, tlinks, func_addr, flags, &ctx);
 	if (ret < 0)
 		goto out;
@@ -1069,8 +1077,15 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
 		goto out;
 	}
 
-	bpf_flush_icache(image, image_end);
+	res = bpf_arch_text_copy(ro_image, image, size);
+	if (IS_ERR(res)) {
+		ret = PTR_ERR(res);
+		goto out;
+	}
+
+	bpf_flush_icache(ro_image, ro_image_end);
 out:
+	kvfree(image);
 	return ret < 0 ? ret : size;
 }
 
-- 
2.34.1



* Re: [PATCH RESEND bpf-next v2 0/3] Use bpf_prog_pack for RV64 bpf trampoline
  2024-06-22  3:04 [PATCH RESEND bpf-next v2 0/3] Use bpf_prog_pack for RV64 bpf trampoline Pu Lehui
                   ` (2 preceding siblings ...)
  2024-06-22  3:04 ` [PATCH RESEND bpf-next v2 3/3] riscv, bpf: Use bpf_prog_pack for RV64 bpf trampoline Pu Lehui
@ 2024-07-01 15:20 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-07-01 15:20 UTC (permalink / raw)
  To: Pu Lehui
  Cc: bpf, linux-riscv, netdev, bjorn, ast, daniel, andrii, martin.lau,
	eddyz87, song, yhs, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	puranjay, palmer, pulehui

Hello:

This series was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Sat, 22 Jun 2024 03:04:34 +0000 you wrote:
> bpf_prog_pack aggregates bpf programs into huge pages to relieve iTLB
> pressure on the system. The same can be applied to the bpf trampoline,
> as Song has already implemented it in core and x86 [0]. This series
> uses bpf_prog_pack for the RV64 bpf trampoline. Since Song and Puranjay
> have done a lot of work on bpf_prog_pack for RV64, implementing this is
> straightforward. One thing to note is that emit_call on RV64 generates
> the maximum number of instructions during the dry run, but during real
> patching it may be optimized to 1 instruction depending on the distance
> to the target. This is not a problem, as the result can only be smaller
> than the dry-run size and so does not overflow the allocated RO image.
> 
> [...]

Here is the summary with links:
  - [RESEND,bpf-next,v2,1/3] bpf: Use precise image size for struct_ops trampoline
    https://git.kernel.org/bpf/bpf-next/c/d1a426171d76
  - [RESEND,bpf-next,v2,2/3] riscv, bpf: Fix out-of-bounds issue when preparing trampoline image
    https://git.kernel.org/bpf/bpf-next/c/9f1e16fb1fc9
  - [RESEND,bpf-next,v2,3/3] riscv, bpf: Use bpf_prog_pack for RV64 bpf trampoline
    https://git.kernel.org/bpf/bpf-next/c/2382a405c581

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



