* [PATCH v3 0/5] Support trampoline for LoongArch
@ 2025-07-09 5:50 Chenghao Duan
2025-07-09 5:50 ` [PATCH v3 1/5] LoongArch: Add the function to generate the beq and bne assembly instructions Chenghao Duan
` (5 more replies)
0 siblings, 6 replies; 24+ messages in thread
From: Chenghao Duan @ 2025-07-09 5:50 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran
v3:
1. Patch 0003 adds EXECMEM_BPF memory type to the execmem subsystem.
2. Align the size calculated by arch_bpf_trampoline_size to page
boundaries.
3. Add the flush icache operation to larch_insn_text_copy.
4. Unify the implementation of bpf_arch_xxx into the patch
"0004-LoongArch-BPF-Add-bpf_arch_xxxxx-support-for-Loong.patch".
5. Change the patch order. Move the patch
"0002-LoongArch-BPF-Update-the-code-to-rename-validate_.patch" before
"0005-LoongArch-BPF-Add-bpf-trampoline-support-for-Loon.patch".
-----------------------------------------------------------------------
Historical Version:
v2:
1. Change the fixmap in the instruction copy function to set_memory_xxx.
2. Change the implementation method of the following code.
- arch_alloc_bpf_trampoline
- arch_free_bpf_trampoline
Use the BPF core's allocation and free functions.
- bpf_arch_text_invalidate
Operate with the function larch_insn_text_copy that carries
memory attribute modifications.
3. Correct the incorrect code formatting.
URL for version v2:
https://lore.kernel.org/all/20250618105048.1510560-1-duanchenghao@kylinos.cn/
---------
v1:
Support trampoline for LoongArch. The following feature tests have been
completed:
1. fentry
2. fexit
3. fmod_ret
TODO: The support for the struct_ops feature will be provided in
subsequent patches.
URL for version v1:
https://lore.kernel.org/all/20250611035952.111182-1-duanchenghao@kylinos.cn/
-----------------------------------------------------------------------
Chenghao Duan (5):
LoongArch: Add the function to generate the beq and bne assembly
instructions.
LoongArch: BPF: Update the code to rename validate_code to
validate_ctx.
LoongArch: BPF: Add EXECMEM_BPF memory to execmem subsystem
LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
LoongArch: BPF: Add bpf trampoline support for Loongarch
arch/loongarch/include/asm/inst.h | 3 +
arch/loongarch/kernel/inst.c | 60 ++++
arch/loongarch/mm/init.c | 6 +
arch/loongarch/net/bpf_jit.c | 491 +++++++++++++++++++++++++++++-
arch/loongarch/net/bpf_jit.h | 6 +
5 files changed, 565 insertions(+), 1 deletion(-)
--
2.43.0
* [PATCH v3 1/5] LoongArch: Add the function to generate the beq and bne assembly instructions.
2025-07-09 5:50 [PATCH v3 0/5] Support trampoline for LoongArch Chenghao Duan
@ 2025-07-09 5:50 ` Chenghao Duan
2025-07-16 11:33 ` Hengqi Chen
2025-07-09 5:50 ` [PATCH v3 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
` (4 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: Chenghao Duan @ 2025-07-09 5:50 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran, Youling Tang
Add branch jump function:
larch_insn_gen_beq
larch_insn_gen_bne
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Co-developed-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
arch/loongarch/include/asm/inst.h | 2 ++
arch/loongarch/kernel/inst.c | 28 ++++++++++++++++++++++++++++
2 files changed, 30 insertions(+)
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index 3089785ca..2ae96a35d 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -511,6 +511,8 @@ u32 larch_insn_gen_lu12iw(enum loongarch_gpr rd, int imm);
u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm);
u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
+u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
+u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
static inline bool signed_imm_check(long val, unsigned int bit)
{
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index 14d7d700b..674e3b322 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -336,3 +336,31 @@ u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
return insn.word;
}
+
+u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
+{
+ union loongarch_instruction insn;
+
+ if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
+ pr_warn("The generated beq instruction is out of range.\n");
+ return INSN_BREAK;
+ }
+
+ emit_beq(&insn, rj, rd, imm >> 2);
+
+ return insn.word;
+}
+
+u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
+{
+ union loongarch_instruction insn;
+
+ if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
+ pr_warn("The generated bne instruction is out of range.\n");
+ return INSN_BREAK;
+ }
+
+ emit_bne(&insn, rj, rd, imm >> 2);
+
+ return insn.word;
+}
--
2.43.0
* [PATCH v3 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx.
2025-07-09 5:50 [PATCH v3 0/5] Support trampoline for LoongArch Chenghao Duan
2025-07-09 5:50 ` [PATCH v3 1/5] LoongArch: Add the function to generate the beq and bne assembly instructions Chenghao Duan
@ 2025-07-09 5:50 ` Chenghao Duan
2025-07-16 11:55 ` Hengqi Chen
2025-07-09 5:50 ` [PATCH v3 3/5] LoongArch: BPF: Add EXECMEM_BPF memory to execmem subsystem Chenghao Duan
` (3 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: Chenghao Duan @ 2025-07-09 5:50 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran
Update the code to rename validate_code to validate_ctx.
validate_code is used to check the validity of code.
validate_ctx is used to check both code validity and table entry
correctness.
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
arch/loongarch/net/bpf_jit.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index fa1500d4a..7032f11d3 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -1180,6 +1180,14 @@ static int validate_code(struct jit_ctx *ctx)
return -1;
}
+ return 0;
+}
+
+static int validate_ctx(struct jit_ctx *ctx)
+{
+ if (validate_code(ctx))
+ return -1;
+
if (WARN_ON_ONCE(ctx->num_exentries != ctx->prog->aux->num_exentries))
return -1;
@@ -1288,7 +1296,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
build_epilogue(&ctx);
/* 3. Extra pass to validate JITed code */
- if (validate_code(&ctx)) {
+ if (validate_ctx(&ctx)) {
bpf_jit_binary_free(header);
prog = orig_prog;
goto out_offset;
--
2.43.0
* [PATCH v3 3/5] LoongArch: BPF: Add EXECMEM_BPF memory to execmem subsystem
2025-07-09 5:50 [PATCH v3 0/5] Support trampoline for LoongArch Chenghao Duan
2025-07-09 5:50 ` [PATCH v3 1/5] LoongArch: Add the function to generate the beq and bne assembly instructions Chenghao Duan
2025-07-09 5:50 ` [PATCH v3 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
@ 2025-07-09 5:50 ` Chenghao Duan
2025-07-09 15:23 ` Huacai Chen
2025-07-09 5:50 ` [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
` (2 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: Chenghao Duan @ 2025-07-09 5:50 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran
The bpf_jit_alloc_exec function serves as the core mechanism for BPF
memory allocation, invoking execmem_alloc(EXECMEM_BPF, size) to
allocate memory. This change explicitly designates the allocation space
for EXECMEM_BPF.
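For reference, the generic allocator this relies on looks roughly like
the following (a sketch of the default bpf_jit_alloc_exec() in
kernel/bpf/core.c, assumed unchanged by this series):

	void *bpf_jit_alloc_exec(unsigned long size)
	{
		return execmem_alloc(EXECMEM_BPF, size);
	}

With that default in place, an architecture only has to describe its
EXECMEM_BPF range, which is what the hunk below does.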
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
arch/loongarch/mm/init.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
index c3e4586a7..07cedd9ee 100644
--- a/arch/loongarch/mm/init.c
+++ b/arch/loongarch/mm/init.c
@@ -239,6 +239,12 @@ struct execmem_info __init *execmem_arch_setup(void)
.pgprot = PAGE_KERNEL,
.alignment = 1,
},
+ [EXECMEM_BPF] = {
+ .start = VMALLOC_START,
+ .end = VMALLOC_END,
+ .pgprot = PAGE_KERNEL,
+ .alignment = PAGE_SIZE,
+ },
},
};
--
2.43.0
* [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-09 5:50 [PATCH v3 0/5] Support trampoline for LoongArch Chenghao Duan
` (2 preceding siblings ...)
2025-07-09 5:50 ` [PATCH v3 3/5] LoongArch: BPF: Add EXECMEM_BPF memory to execmem subsystem Chenghao Duan
@ 2025-07-09 5:50 ` Chenghao Duan
2025-07-16 12:21 ` Hengqi Chen
` (2 more replies)
2025-07-09 5:50 ` [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
2025-07-10 7:29 ` [PATCH v3 0/5] Support trampoline for LoongArch Huacai Chen
5 siblings, 3 replies; 24+ messages in thread
From: Chenghao Duan @ 2025-07-09 5:50 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran
Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
bpf_arch_text_invalidate on the LoongArch architecture.
On LoongArch, since symbol addresses in the direct mapping
region cannot be reached via relative jump instructions from the paged
mapping region, we use the move_imm+jirl instruction pair as absolute
jump instructions. These require 2-5 instructions, so we reserve 5 NOP
instructions in the program as placeholders for function jumps.
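For reference, the worst case of that sequence is a 4-instruction
immediate load plus the jump, roughly as follows (a sketch; the scratch
register matches this series' use of T1, and a call writes RA or T0 as
the link register instead of ZERO):

	lu12i.w  $t1, imm[31:12]
	ori      $t1, $t1, imm[11:0]
	lu32i.d  $t1, imm[51:32]
	lu52i.d  $t1, $t1, imm[63:52]
	jirl     $zero, $t1, 0

The shortest form is 2 instructions (a 1-instruction immediate load
plus jirl); any unused slots remain NOPs.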
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
arch/loongarch/include/asm/inst.h | 1 +
arch/loongarch/kernel/inst.c | 32 +++++++++++
arch/loongarch/net/bpf_jit.c | 90 +++++++++++++++++++++++++++++++
3 files changed, 123 insertions(+)
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index 2ae96a35d..88bb73e46 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
int larch_insn_read(void *addr, u32 *insnp);
int larch_insn_write(void *addr, u32 insn);
int larch_insn_patch_text(void *addr, u32 insn);
+int larch_insn_text_copy(void *dst, void *src, size_t len);
u32 larch_insn_gen_nop(void);
u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index 674e3b322..8d6594968 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -4,6 +4,7 @@
*/
#include <linux/sizes.h>
#include <linux/uaccess.h>
+#include <linux/set_memory.h>
#include <asm/cacheflush.h>
#include <asm/inst.h>
@@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
return ret;
}
+int larch_insn_text_copy(void *dst, void *src, size_t len)
+{
+ unsigned long flags;
+ size_t wlen = 0;
+ size_t size;
+ void *ptr;
+ int ret = 0;
+
+ set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
+ raw_spin_lock_irqsave(&patch_lock, flags);
+ while (wlen < len) {
+ ptr = dst + wlen;
+ size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
+ len - wlen);
+
+ ret = copy_to_kernel_nofault(ptr, src + wlen, size);
+ if (ret) {
+ pr_err("%s: operation failed\n", __func__);
+ break;
+ }
+ wlen += size;
+ }
+ raw_spin_unlock_irqrestore(&patch_lock, flags);
+ set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
+
+ if (!ret)
+ flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
+
+ return ret;
+}
+
u32 larch_insn_gen_nop(void)
{
return INSN_NOP;
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 7032f11d3..9cb01f0b0 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -4,6 +4,7 @@
*
* Copyright (C) 2022 Loongson Technology Corporation Limited
*/
+#include <linux/memory.h>
#include "bpf_jit.h"
#define REG_TCC LOONGARCH_GPR_A6
@@ -1367,3 +1368,92 @@ bool bpf_jit_supports_subprog_tailcalls(void)
{
return true;
}
+
+static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 ip, u64 target)
+{
+ s64 offset = (s64)(target - ip);
+
+ if (offset && (offset >= -SZ_128M && offset < SZ_128M)) {
+ emit_insn(ctx, bl, offset >> 2);
+ } else {
+ move_imm(ctx, LOONGARCH_GPR_T1, target, false);
+ emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
+ }
+
+ return 0;
+}
+
+static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
+{
+ struct jit_ctx ctx;
+
+ ctx.idx = 0;
+ ctx.image = (union loongarch_instruction *)insns;
+
+ if (!target) {
+ emit_insn((&ctx), nop);
+ emit_insn((&ctx), nop);
+ return 0;
+ }
+
+ return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
+ (unsigned long)ip, (unsigned long)target);
+}
+
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
+ void *old_addr, void *new_addr)
+{
+ u32 old_insns[5] = {[0 ... 4] = INSN_NOP};
+ u32 new_insns[5] = {[0 ... 4] = INSN_NOP};
+ bool is_call = poke_type == BPF_MOD_CALL;
+ int ret;
+
+ if (!is_kernel_text((unsigned long)ip) &&
+ !is_bpf_text_address((unsigned long)ip))
+ return -ENOTSUPP;
+
+ ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
+ if (ret)
+ return ret;
+
+ if (memcmp(ip, old_insns, 5 * 4))
+ return -EFAULT;
+
+ ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
+ if (ret)
+ return ret;
+
+ mutex_lock(&text_mutex);
+ if (memcmp(ip, new_insns, 5 * 4))
+ ret = larch_insn_text_copy(ip, new_insns, 5 * 4);
+ mutex_unlock(&text_mutex);
+ return ret;
+}
+
+int bpf_arch_text_invalidate(void *dst, size_t len)
+{
+ int i;
+ int ret = 0;
+ u32 *inst;
+
+ inst = kvmalloc(len, GFP_KERNEL);
+ if (!inst)
+ return -ENOMEM;
+
+ for (i = 0; i < (len/sizeof(u32)); i++)
+ inst[i] = INSN_BREAK;
+
+ if (larch_insn_text_copy(dst, inst, len))
+ ret = -EINVAL;
+
+ kvfree(inst);
+ return ret;
+}
+
+void *bpf_arch_text_copy(void *dst, void *src, size_t len)
+{
+ if (larch_insn_text_copy(dst, src, len))
+ return ERR_PTR(-EINVAL);
+
+ return dst;
+}
--
2.43.0
* [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-09 5:50 [PATCH v3 0/5] Support trampoline for LoongArch Chenghao Duan
` (3 preceding siblings ...)
2025-07-09 5:50 ` [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
@ 2025-07-09 5:50 ` Chenghao Duan
2025-07-09 17:19 ` kernel test robot
2025-07-16 12:32 ` Hengqi Chen
2025-07-10 7:29 ` [PATCH v3 0/5] Support trampoline for LoongArch Huacai Chen
5 siblings, 2 replies; 24+ messages in thread
From: Chenghao Duan @ 2025-07-09 5:50 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran
BPF trampoline is critical infrastructure of the BPF subsystem, acting
as a mediator between kernel functions and BPF programs. Numerous
important features, such as using BPF programs for zero-overhead
kernel introspection, rely on this key component.
The related tests have passed, including the following:
1. fentry
2. fmod_ret
3. fexit
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
arch/loongarch/net/bpf_jit.c | 391 +++++++++++++++++++++++++++++++++++
arch/loongarch/net/bpf_jit.h | 6 +
2 files changed, 397 insertions(+)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 9cb01f0b0..6820558af 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -7,6 +7,10 @@
#include <linux/memory.h>
#include "bpf_jit.h"
+#define LOONGARCH_MAX_REG_ARGS 8
+#define LOONGARCH_FENTRY_NINSNS 2
+#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
+
#define REG_TCC LOONGARCH_GPR_A6
#define TCC_SAVED LOONGARCH_GPR_S5
@@ -1400,6 +1404,16 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
(unsigned long)ip, (unsigned long)target);
}
+static int emit_call(struct jit_ctx *ctx, u64 addr)
+{
+ u64 ip;
+
+ if (addr && ctx->image && ctx->ro_image)
+ ip = (u64)(ctx->image + ctx->idx);
+
+ return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, ip, addr);
+}
+
int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
void *old_addr, void *new_addr)
{
@@ -1457,3 +1471,380 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
return dst;
}
+
+static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
+{
+ int i;
+
+ for (i = 0; i < nargs; i++) {
+ emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
+ args_off -= 8;
+ }
+}
+
+static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
+{
+ int i;
+
+ for (i = 0; i < nargs; i++) {
+ emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
+ args_off -= 8;
+ }
+}
+
+static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
+ int args_off, int retval_off,
+ int run_ctx_off, bool save_ret)
+{
+ int ret;
+ u32 *branch;
+ struct bpf_prog *p = l->link.prog;
+ int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
+
+ if (l->cookie) {
+ move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
+ emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
+ } else {
+ emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
+ -run_ctx_off + cookie_off);
+ }
+
+ /* arg1: prog */
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
+ /* arg2: &run_ctx */
+ emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
+ ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
+ if (ret)
+ return ret;
+
+ /* store prog start time */
+ move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
+
+ /* if (__bpf_prog_enter(prog) == 0)
+ * goto skip_exec_of_prog;
+ *
+ */
+ branch = (u32 *)ctx->image + ctx->idx;
+ /* nop reserved for conditional jump */
+ emit_insn(ctx, nop);
+
+ /* arg1: &args_off */
+ emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
+ if (!p->jited)
+ move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
+ ret = emit_call(ctx, (const u64)p->bpf_func);
+ if (ret)
+ return ret;
+
+ if (save_ret) {
+ emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
+ }
+
+ /* update branch with beqz */
+ if (ctx->image) {
+ int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
+ *branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
+ }
+
+ /* arg1: prog */
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
+ /* arg2: prog start time */
+ move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
+ /* arg3: &run_ctx */
+ emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
+ ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
+
+ return ret;
+}
+
+static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
+ int args_off, int retval_off, int run_ctx_off, u32 **branches)
+{
+ int i;
+
+ emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
+ for (i = 0; i < tl->nr_links; i++) {
+ invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
+ run_ctx_off, true);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
+ branches[i] = (u32 *)ctx->image + ctx->idx;
+ emit_insn(ctx, nop);
+ }
+}
+
+u64 bpf_jit_alloc_exec_limit(void)
+{
+ return VMALLOC_END - VMALLOC_START;
+}
+
+void *arch_alloc_bpf_trampoline(unsigned int size)
+{
+ return bpf_prog_pack_alloc(size, jit_fill_hole);
+}
+
+void arch_free_bpf_trampoline(void *image, unsigned int size)
+{
+ bpf_prog_pack_free(image, size);
+}
+
+static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
+ const struct btf_func_model *m,
+ struct bpf_tramp_links *tlinks,
+ void *func_addr, u32 flags)
+{
+ int i;
+ int stack_size = 0, nargs = 0;
+ int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
+ struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
+ struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
+ struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+ int ret, save_ret;
+ void *orig_call = func_addr;
+ u32 **branches = NULL;
+
+ if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
+ return -ENOTSUPP;
+
+ /*
+ * FP + 8 [ RA to parent func ] return address to parent
+ * function
+ * FP + 0 [ FP of parent func ] frame pointer of parent
+ * function
+ * FP - 8 [ T0 to traced func ] return address of traced
+ * function
+ * FP - 16 [ FP of traced func ] frame pointer of traced
+ * function
+ *
+ * FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
+ * BPF_TRAMP_F_RET_FENTRY_RET
+ * [ argN ]
+ * [ ... ]
+ * FP - args_off [ arg1 ]
+ *
+ * FP - nargs_off [ regs count ]
+ *
+ * FP - ip_off [ traced func ] BPF_TRAMP_F_IP_ARG
+ *
+ * FP - run_ctx_off [ bpf_tramp_run_ctx ]
+ *
+ * FP - sreg_off [ callee saved reg ]
+ *
+ */
+
+ if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
+ return -ENOTSUPP;
+
+ if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
+ return -ENOTSUPP;
+
+ stack_size = 0;
+
+ /* room of trampoline frame to store return address and frame pointer */
+ stack_size += 16;
+
+ save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
+ if (save_ret) {
+ /* Save BPF R0 and A0 */
+ stack_size += 16;
+ retval_off = stack_size;
+ }
+
+ /* room of trampoline frame to store args */
+ nargs = m->nr_args;
+ stack_size += nargs * 8;
+ args_off = stack_size;
+
+ /* room of trampoline frame to store args number */
+ stack_size += 8;
+ nargs_off = stack_size;
+
+ /* room of trampoline frame to store ip address */
+ if (flags & BPF_TRAMP_F_IP_ARG) {
+ stack_size += 8;
+ ip_off = stack_size;
+ }
+
+ /* room of trampoline frame to store struct bpf_tramp_run_ctx */
+ stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
+ run_ctx_off = stack_size;
+
+ stack_size += 8;
+ sreg_off = stack_size;
+
+ stack_size = round_up(stack_size, 16);
+
+ /* For the trampoline called from function entry */
+ /* RA and FP for parent function*/
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
+ emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
+
+ /* RA and FP for traced function*/
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
+ emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
+
+ /* callee saved register S1 to pass start time */
+ emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
+
+ /* store ip address of the traced function */
+ if (flags & BPF_TRAMP_F_IP_ARG) {
+ move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
+ emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
+ }
+
+ /* store nargs number*/
+ move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
+ emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
+
+ store_args(ctx, nargs, args_off);
+
+ /* To traced function */
+ orig_call += LOONGARCH_FENTRY_NBYTES;
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
+ ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
+ if (ret)
+ return ret;
+ }
+
+ for (i = 0; i < fentry->nr_links; i++) {
+ ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
+ run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
+ if (ret)
+ return ret;
+ }
+ if (fmod_ret->nr_links) {
+ branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
+ if (!branches)
+ return -ENOMEM;
+
+ invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
+ run_ctx_off, branches);
+ }
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ restore_args(ctx, m->nr_args, args_off);
+ ret = emit_call(ctx, (const u64)orig_call);
+ if (ret)
+ goto out;
+ emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
+ im->ip_after_call = ctx->ro_image + ctx->idx;
+ /* Reserve space for the move_imm + jirl instruction */
+ emit_insn(ctx, nop);
+ emit_insn(ctx, nop);
+ emit_insn(ctx, nop);
+ emit_insn(ctx, nop);
+ emit_insn(ctx, nop);
+ }
+
+ for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
+ int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
+ *branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
+ }
+
+ for (i = 0; i < fexit->nr_links; i++) {
+ ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
+ run_ctx_off, false);
+ if (ret)
+ goto out;
+ }
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ im->ip_epilogue = ctx->ro_image + ctx->idx;
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
+ ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
+ if (ret)
+ goto out;
+ }
+
+ if (flags & BPF_TRAMP_F_RESTORE_REGS)
+ restore_args(ctx, m->nr_args, args_off);
+
+ if (save_ret) {
+ emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
+ }
+
+ emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
+
+ /* trampoline called from function entry */
+ emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
+
+ emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
+
+ if (flags & BPF_TRAMP_F_SKIP_FRAME)
+ /* return to parent function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
+ else
+ /* return to traced function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
+
+ ret = ctx->idx;
+out:
+ kfree(branches);
+
+ return ret;
+}
+
+int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
+ void *ro_image_end, const struct btf_func_model *m,
+ u32 flags, struct bpf_tramp_links *tlinks,
+ void *func_addr)
+{
+ int ret;
+ void *image, *tmp;
+ u32 size = ro_image_end - ro_image;
+
+ image = kvmalloc(size, GFP_KERNEL);
+ if (!image)
+ return -ENOMEM;
+
+ struct jit_ctx ctx = {
+ .image = (union loongarch_instruction *)image,
+ .ro_image = (union loongarch_instruction *)ro_image,
+ .idx = 0,
+ };
+
+ jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
+ ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
+ if (ret > 0 && validate_code(&ctx) < 0) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ tmp = bpf_arch_text_copy(ro_image, image, size);
+ if (IS_ERR(tmp)) {
+ ret = PTR_ERR(tmp);
+ goto out;
+ }
+
+ bpf_flush_icache(ro_image, ro_image_end);
+out:
+ kvfree(image);
+ return ret < 0 ? ret : size;
+}
+
+int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
+ struct bpf_tramp_links *tlinks, void *func_addr)
+{
+ struct bpf_tramp_image im;
+ struct jit_ctx ctx;
+ int ret;
+
+ ctx.image = NULL;
+ ctx.idx = 0;
+
+ ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
+
+ /* Page align */
+ return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
+}
diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
index f9c569f53..5697158fd 100644
--- a/arch/loongarch/net/bpf_jit.h
+++ b/arch/loongarch/net/bpf_jit.h
@@ -18,6 +18,7 @@ struct jit_ctx {
u32 *offset;
int num_exentries;
union loongarch_instruction *image;
+ union loongarch_instruction *ro_image;
u32 stack_size;
};
@@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
return -EINVAL;
}
+
+static inline void bpf_flush_icache(void *start, void *end)
+{
+ flush_icache_range((unsigned long)start, (unsigned long)end);
+}
--
2.43.0
* Re: [PATCH v3 3/5] LoongArch: BPF: Add EXECMEM_BPF memory to execmem subsystem
2025-07-09 5:50 ` [PATCH v3 3/5] LoongArch: BPF: Add EXECMEM_BPF memory to execmem subsystem Chenghao Duan
@ 2025-07-09 15:23 ` Huacai Chen
2025-07-10 7:23 ` Chenghao Duan
0 siblings, 1 reply; 24+ messages in thread
From: Huacai Chen @ 2025-07-09 15:23 UTC
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
Hi, Chenghao,
On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> The bpf_jit_alloc_exec function serves as the core mechanism for BPF
> memory allocation, invoking execmem_alloc(EXECMEM_BPF, size) to
> allocate memory. This change explicitly designates the allocation space
> for EXECMEM_BPF.
Without this patch, BPF JIT memory is allocated from the MODULES
region; with this patch, it will be allocated from the VMALLOC region.
However, BPF JIT is similar to modules in that the targets of direct
branch instructions are limited in range, so it should also be
allocated from the MODULES region.
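For comparison, the existing EXECMEM_DEFAULT entry in
execmem_arch_setup() (assumed from arch/loongarch/mm/init.c) already
covers the modules range, and the execmem core uses it as the fallback
for any type without an entry of its own:

	[EXECMEM_DEFAULT] = {
		.start = MODULES_VADDR,
		.end = MODULES_END,
		.pgprot = PAGE_KERNEL,
		.alignment = 1,
	},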
So, it is better to drop this patch.
Huacai
>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/mm/init.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
> index c3e4586a7..07cedd9ee 100644
> --- a/arch/loongarch/mm/init.c
> +++ b/arch/loongarch/mm/init.c
> @@ -239,6 +239,12 @@ struct execmem_info __init *execmem_arch_setup(void)
> .pgprot = PAGE_KERNEL,
> .alignment = 1,
> },
> + [EXECMEM_BPF] = {
> + .start = VMALLOC_START,
> + .end = VMALLOC_END,
> + .pgprot = PAGE_KERNEL,
> + .alignment = PAGE_SIZE,
> + },
> },
> };
>
> --
> 2.43.0
>
>
* Re: [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-09 5:50 ` [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
@ 2025-07-09 17:19 ` kernel test robot
2025-07-16 12:32 ` Hengqi Chen
1 sibling, 0 replies; 24+ messages in thread
From: kernel test robot @ 2025-07-09 17:19 UTC
To: Chenghao Duan, ast, daniel, andrii, yangtiezhu, hengqi.chen,
chenhuacai
Cc: oe-kbuild-all, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, kpsingh, sdf, haoluo, jolsa, kernel, linux-kernel,
loongarch, bpf, guodongtai, duanchenghao, youling.tang,
jianghaoran
Hi Chenghao,
kernel test robot noticed the following build warnings:
[auto build test WARNING on bpf-next/net]
[also build test WARNING on bpf-next/master bpf/master linus/master v6.16-rc5 next-20250709]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Chenghao-Duan/LoongArch-Add-the-function-to-generate-the-beq-and-bne-assembly-instructions/20250709-135350
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git net
patch link: https://lore.kernel.org/r/20250709055029.723243-6-duanchenghao%40kylinos.cn
patch subject: [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
config: loongarch-allyesconfig (https://download.01.org/0day-ci/archive/20250710/202507100034.wXofj6VX-lkp@intel.com/config)
compiler: clang version 21.0.0git (https://github.com/llvm/llvm-project 01c97b4953e87ae455bd4c41e3de3f0f0f29c61c)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250710/202507100034.wXofj6VX-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> arch/loongarch/net/bpf_jit.c:1411:6: warning: variable 'ip' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
1411 | if (addr && ctx->image && ctx->ro_image)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/loongarch/net/bpf_jit.c:1414:51: note: uninitialized use occurs here
1414 | return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, ip, addr);
| ^~
arch/loongarch/net/bpf_jit.c:1411:2: note: remove the 'if' if its condition is always true
1411 | if (addr && ctx->image && ctx->ro_image)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1412 | ip = (u64)(ctx->image + ctx->idx);
>> arch/loongarch/net/bpf_jit.c:1411:6: warning: variable 'ip' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
1411 | if (addr && ctx->image && ctx->ro_image)
| ^~~~~~~~~~~~~~~~~~
arch/loongarch/net/bpf_jit.c:1414:51: note: uninitialized use occurs here
1414 | return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, ip, addr);
| ^~
arch/loongarch/net/bpf_jit.c:1411:6: note: remove the '&&' if its condition is always true
1411 | if (addr && ctx->image && ctx->ro_image)
| ^~~~~~~~~~~~~~~~~~~~~
>> arch/loongarch/net/bpf_jit.c:1411:6: warning: variable 'ip' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
1411 | if (addr && ctx->image && ctx->ro_image)
| ^~~~
arch/loongarch/net/bpf_jit.c:1414:51: note: uninitialized use occurs here
1414 | return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, ip, addr);
| ^~
arch/loongarch/net/bpf_jit.c:1411:6: note: remove the '&&' if its condition is always true
1411 | if (addr && ctx->image && ctx->ro_image)
| ^~~~~~~
arch/loongarch/net/bpf_jit.c:1409:8: note: initialize the variable 'ip' to silence this warning
1409 | u64 ip;
| ^
| = 0
3 warnings generated.
vim +1411 arch/loongarch/net/bpf_jit.c
1406
1407 static int emit_call(struct jit_ctx *ctx, u64 addr)
1408 {
1409 u64 ip;
1410
> 1411 if (addr && ctx->image && ctx->ro_image)
1412 ip = (u64)(ctx->image + ctx->idx);
1413
1414 return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, ip, addr);
1415 }
1416
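A minimal sketch of one possible fix, following the note above
(initialize ip so the sizing pass, where ctx->image is NULL, takes the
absolute-jump path in emit_jump_and_link()):

	static int emit_call(struct jit_ctx *ctx, u64 addr)
	{
		u64 ip = 0;	/* no image yet: force the absolute-jump path */

		if (addr && ctx->image && ctx->ro_image)
			ip = (u64)(ctx->image + ctx->idx);

		return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, ip, addr);
	}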
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v3 3/5] LoongArch: BPF: Add EXECMEM_BPF memory to execmem subsystem
2025-07-09 15:23 ` Huacai Chen
@ 2025-07-10 7:23 ` Chenghao Duan
0 siblings, 0 replies; 24+ messages in thread
From: Chenghao Duan @ 2025-07-10 7:23 UTC
To: Huacai Chen
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Wed, Jul 09, 2025 at 11:23:12PM +0800, Huacai Chen wrote:
> Hi, Chenghao,
>
> On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > The bpf_jit_alloc_exec function serves as the core mechanism for BPF
> > memory allocation, invoking execmem_alloc(EXECMEM_BPF, size) to
> > allocate memory. This change explicitly designates the allocation space
> > for EXECMEM_BPF.
> Without this patch, BPF JIT memory is allocated from the MODULES
> region; with this patch, it will be allocated from the VMALLOC region.
> However, BPF JIT is similar to modules in that the targets of direct
> branch instructions are limited in range, so it should also be
> allocated from the MODULES region.
>
> So, it is better to drop this patch.
>
>
> Huacai
Dear Chen,
I understand your technical considerations. Whether we keep or remove
the current patch has no impact on the trampoline, so we can drop this
patch.
Chenghao
>
> >
> > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > ---
> > arch/loongarch/mm/init.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
> > index c3e4586a7..07cedd9ee 100644
> > --- a/arch/loongarch/mm/init.c
> > +++ b/arch/loongarch/mm/init.c
> > @@ -239,6 +239,12 @@ struct execmem_info __init *execmem_arch_setup(void)
> > .pgprot = PAGE_KERNEL,
> > .alignment = 1,
> > },
> > + [EXECMEM_BPF] = {
> > + .start = VMALLOC_START,
> > + .end = VMALLOC_END,
> > + .pgprot = PAGE_KERNEL,
> > + .alignment = PAGE_SIZE,
> > + },
> > },
> > };
> >
> > --
> > 2.43.0
> >
> >
* Re: [PATCH v3 0/5] Support trampoline for LoongArch
2025-07-09 5:50 [PATCH v3 0/5] Support trampoline for LoongArch Chenghao Duan
` (4 preceding siblings ...)
2025-07-09 5:50 ` [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
@ 2025-07-10 7:29 ` Huacai Chen
2025-07-14 8:55 ` Tiezhu Yang
5 siblings, 1 reply; 24+ messages in thread
From: Huacai Chen @ 2025-07-10 7:29 UTC
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
Hi, Tiezhu and Hengqi,
Could you please take some time to review this series? I hope it can be
merged in 6.17.
Huacai
On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> v3:
> 1. Patch 0003 adds EXECMEM_BPF memory type to the execmem subsystem.
>
> 2. Align the size calculated by arch_bpf_trampoline_size to page
> boundaries.
>
> 3. Add the flush icache operation to larch_insn_text_copy.
>
> 4. Unify the implementation of bpf_arch_xxx into the patch
> "0004-LoongArch-BPF-Add-bpf_arch_xxxxx-support-for-Loong.patch".
>
> 5. Change the patch order. Move the patch
> "0002-LoongArch-BPF-Update-the-code-to-rename-validate_.patch" before
> "0005-LoongArch-BPF-Add-bpf-trampoline-support-for-Loon.patch".
>
> -----------------------------------------------------------------------
> Historical Version:
> v2:
> 1. Change the fixmap in the instruction copy function to set_memory_xxx.
>
> 2. Change the implementation method of the following code.
> - arch_alloc_bpf_trampoline
> - arch_free_bpf_trampoline
> Use the BPF core's allocation and free functions.
>
> - bpf_arch_text_invalidate
> Operate with the function larch_insn_text_copy that carries
> memory attribute modifications.
>
> 3. Correct the incorrect code formatting.
>
> URL for version v2:
> https://lore.kernel.org/all/20250618105048.1510560-1-duanchenghao@kylinos.cn/
> ---------
> v1:
> Support trampoline for LoongArch. The following feature tests have been
> completed:
> 1. fentry
> 2. fexit
> 3. fmod_ret
>
> TODO: The support for the struct_ops feature will be provided in
> subsequent patches.
>
> URL for version v1:
> https://lore.kernel.org/all/20250611035952.111182-1-duanchenghao@kylinos.cn/
> -----------------------------------------------------------------------
>
> Chenghao Duan (5):
> LoongArch: Add the function to generate the beq and bne assembly
> instructions.
> LoongArch: BPF: Update the code to rename validate_code to
> validate_ctx.
> LoongArch: BPF: Add EXECMEM_BPF memory to execmem subsystem
> LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
> LoongArch: BPF: Add bpf trampoline support for Loongarch
>
> arch/loongarch/include/asm/inst.h | 3 +
> arch/loongarch/kernel/inst.c | 60 ++++
> arch/loongarch/mm/init.c | 6 +
> arch/loongarch/net/bpf_jit.c | 491 +++++++++++++++++++++++++++++-
> arch/loongarch/net/bpf_jit.h | 6 +
> 5 files changed, 565 insertions(+), 1 deletion(-)
>
> --
> 2.43.0
>
>
* Re: [PATCH v3 0/5] Support trampoline for LoongArch
2025-07-10 7:29 ` [PATCH v3 0/5] Support trampoline for LoongArch Huacai Chen
@ 2025-07-14 8:55 ` Tiezhu Yang
0 siblings, 0 replies; 24+ messages in thread
From: Tiezhu Yang @ 2025-07-14 8:55 UTC
To: Huacai Chen, Chenghao Duan
Cc: ast, daniel, andrii, hengqi.chen, martin.lau, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On 2025/7/10 3:29 PM, Huacai Chen wrote:
> Hi, Tiezhu and Hengqi,
>
> Could you please pay some time to review this series? I hope it can be
> merged to 6.17.
With patches #1, #2, #4 and #5, the following related test cases
passed on LoongArch:
sudo ./test_progs -a fentry_test/fentry
sudo ./test_progs -a fexit_test/fexit
sudo ./test_progs -a fentry_fexit
sudo ./test_progs -a modify_return
sudo ./test_progs -a fexit_sleep
sudo ./test_progs -a test_overhead
sudo ./test_progs -a trampoline_count
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Thanks,
Tiezhu
* Re: [PATCH v3 1/5] LoongArch: Add the function to generate the beq and bne assembly instructions.
2025-07-09 5:50 ` [PATCH v3 1/5] LoongArch: Add the function to generate the beq and bne assembly instructions Chenghao Duan
@ 2025-07-16 11:33 ` Hengqi Chen
0 siblings, 0 replies; 24+ messages in thread
From: Hengqi Chen @ 2025-07-16 11:33 UTC
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, Youling Tang
On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Add branch jump function:
> larch_insn_gen_beq
> larch_insn_gen_bne
>
Please drop the period from the subject line.
The commit message is kind of vague...
Maybe:
LoongArch: Add larch_insn_gen_{beq,bne} helpers
Add larch_insn_gen_beq() and larch_insn_gen_bne() helpers
which will be used in BPF trampoline implementation.
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Co-developed-by: Youling Tang <tangyouling@kylinos.cn>
> Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/include/asm/inst.h | 2 ++
> arch/loongarch/kernel/inst.c | 28 ++++++++++++++++++++++++++++
> 2 files changed, 30 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 3089785ca..2ae96a35d 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -511,6 +511,8 @@ u32 larch_insn_gen_lu12iw(enum loongarch_gpr rd, int imm);
> u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm);
> u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
> u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
> +u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
> +u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
>
> static inline bool signed_imm_check(long val, unsigned int bit)
> {
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index 14d7d700b..674e3b322 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -336,3 +336,31 @@ u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
>
> return insn.word;
> }
> +
> +u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
> +{
> + union loongarch_instruction insn;
> +
> + if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
> + pr_warn("The generated beq instruction is out of range.\n");
> + return INSN_BREAK;
> + }
> +
> + emit_beq(&insn, rj, rd, imm >> 2);
> +
> + return insn.word;
> +}
> +
> +u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
> +{
> + union loongarch_instruction insn;
> +
> + if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
> + pr_warn("The generated bne instruction is out of range.\n");
> + return INSN_BREAK;
> + }
> +
> + emit_bne(&insn, rj, rd, imm >> 2);
> +
> + return insn.word;
> +}
> --
> 2.43.0
>
* Re: [PATCH v3 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx.
2025-07-09 5:50 ` [PATCH v3 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
@ 2025-07-16 11:55 ` Hengqi Chen
2025-07-17 9:46 ` Chenghao Duan
0 siblings, 1 reply; 24+ messages in thread
From: Hengqi Chen @ 2025-07-16 11:55 UTC
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Update the code to rename validate_code to validate_ctx.
> validate_code is used to check the validity of code.
> validate_ctx is used to check both code validity and table entry
> correctness.
>
The commit message is awkward to read.
Please describe the purpose of this change.
* Rename the existing validate_code() to validate_ctx()
* Factor out the code validation handling into a new helper validate_code()
The new validate_code() will be used in subsequent changes.
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/net/bpf_jit.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index fa1500d4a..7032f11d3 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -1180,6 +1180,14 @@ static int validate_code(struct jit_ctx *ctx)
> return -1;
> }
>
> + return 0;
> +}
> +
> +static int validate_ctx(struct jit_ctx *ctx)
> +{
> + if (validate_code(ctx))
> + return -1;
> +
> if (WARN_ON_ONCE(ctx->num_exentries != ctx->prog->aux->num_exentries))
> return -1;
>
> @@ -1288,7 +1296,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
> build_epilogue(&ctx);
>
> /* 3. Extra pass to validate JITed code */
> - if (validate_code(&ctx)) {
> + if (validate_ctx(&ctx)) {
> bpf_jit_binary_free(header);
> prog = orig_prog;
> goto out_offset;
> --
> 2.43.0
>
* Re: [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-09 5:50 ` [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
@ 2025-07-16 12:21 ` Hengqi Chen
2025-07-17 9:27 ` Chenghao Duan
2025-07-16 18:41 ` Vincent Li
2025-07-18 23:08 ` Vincent Li
2 siblings, 1 reply; 24+ messages in thread
From: Hengqi Chen @ 2025-07-16 12:21 UTC
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> bpf_arch_text_invalidate on the LoongArch architecture.
>
> On LoongArch, since symbol addresses in the direct mapping
> region cannot be reached via relative jump instructions from the paged
> mapping region, we use the move_imm+jirl instruction pair as absolute
> jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> instructions in the program as placeholders for function jumps.
>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/include/asm/inst.h | 1 +
> arch/loongarch/kernel/inst.c | 32 +++++++++++
> arch/loongarch/net/bpf_jit.c | 90 +++++++++++++++++++++++++++++++
> 3 files changed, 123 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 2ae96a35d..88bb73e46 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> int larch_insn_read(void *addr, u32 *insnp);
> int larch_insn_write(void *addr, u32 insn);
> int larch_insn_patch_text(void *addr, u32 insn);
> +int larch_insn_text_copy(void *dst, void *src, size_t len);
>
> u32 larch_insn_gen_nop(void);
> u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index 674e3b322..8d6594968 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -4,6 +4,7 @@
> */
> #include <linux/sizes.h>
> #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
>
> #include <asm/cacheflush.h>
> #include <asm/inst.h>
> @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> return ret;
> }
>
> +int larch_insn_text_copy(void *dst, void *src, size_t len)
> +{
> + unsigned long flags;
> + size_t wlen = 0;
> + size_t size;
> + void *ptr;
> + int ret = 0;
> +
> + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> + raw_spin_lock_irqsave(&patch_lock, flags);
> + while (wlen < len) {
> + ptr = dst + wlen;
> + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> + len - wlen);
> +
> + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> + if (ret) {
> + pr_err("%s: operation failed\n", __func__);
> + break;
> + }
> + wlen += size;
> + }
Again, why do you do copy_to_kernel_nofault() in a loop?
This larch_insn_text_copy() could be part of the first patch, like
larch_insn_gen_{beq,bne}. WDYT?
> + raw_spin_unlock_irqrestore(&patch_lock, flags);
> + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> +
> + if (!ret)
> + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> +
> + return ret;
> +}
> +
> u32 larch_insn_gen_nop(void)
> {
> return INSN_NOP;
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 7032f11d3..9cb01f0b0 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -4,6 +4,7 @@
> *
> * Copyright (C) 2022 Loongson Technology Corporation Limited
> */
> +#include <linux/memory.h>
> #include "bpf_jit.h"
>
> #define REG_TCC LOONGARCH_GPR_A6
> @@ -1367,3 +1368,92 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> {
> return true;
> }
> +
> +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 ip, u64 target)
> +{
> + s64 offset = (s64)(target - ip);
> +
> + if (offset && (offset >= -SZ_128M && offset < SZ_128M)) {
> + emit_insn(ctx, bl, offset >> 2);
> + } else {
> + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> + }
> +
> + return 0;
> +}
> +
> +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> +{
> + struct jit_ctx ctx;
> +
> + ctx.idx = 0;
> + ctx.image = (union loongarch_instruction *)insns;
> +
> + if (!target) {
> + emit_insn((&ctx), nop);
> + emit_insn((&ctx), nop);
> + return 0;
> + }
> +
> + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> + (unsigned long)ip, (unsigned long)target);
> +}
> +
> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> + void *old_addr, void *new_addr)
> +{
> + u32 old_insns[5] = {[0 ... 4] = INSN_NOP};
> + u32 new_insns[5] = {[0 ... 4] = INSN_NOP};
> + bool is_call = poke_type == BPF_MOD_CALL;
> + int ret;
> +
> + if (!is_kernel_text((unsigned long)ip) &&
> + !is_bpf_text_address((unsigned long)ip))
> + return -ENOTSUPP;
> +
> + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> + if (ret)
> + return ret;
> +
> + if (memcmp(ip, old_insns, 5 * 4))
> + return -EFAULT;
> +
> + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&text_mutex);
> + if (memcmp(ip, new_insns, 5 * 4))
> + ret = larch_insn_text_copy(ip, new_insns, 5 * 4);
> + mutex_unlock(&text_mutex);
> + return ret;
> +}
> +
> +int bpf_arch_text_invalidate(void *dst, size_t len)
> +{
> + int i;
> + int ret = 0;
> + u32 *inst;
> +
> + inst = kvmalloc(len, GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + for (i = 0; i < (len/sizeof(u32)); i++)
> + inst[i] = INSN_BREAK;
> +
> + if (larch_insn_text_copy(dst, inst, len))
> + ret = -EINVAL;
> +
> + kvfree(inst);
> + return ret;
> +}
> +
> +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> +{
> + if (larch_insn_text_copy(dst, src, len))
> + return ERR_PTR(-EINVAL);
> +
> + return dst;
> +}
> --
> 2.43.0
>
* Re: [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-09 5:50 ` [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
2025-07-09 17:19 ` kernel test robot
@ 2025-07-16 12:32 ` Hengqi Chen
2025-07-17 9:43 ` Chenghao Duan
1 sibling, 1 reply; 24+ messages in thread
From: Hengqi Chen @ 2025-07-16 12:32 UTC
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Wed, Jul 9, 2025 at 1:51 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> BPF trampoline is critical infrastructure of the BPF subsystem, acting
> as a mediator between kernel functions and BPF programs. Numerous
> important features, such as using BPF programs for zero-overhead
> kernel introspection, rely on this key component.
>
> The related tests have passed, including the following:
> 1. fentry
> 2. fmod_ret
> 3. fexit
>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/net/bpf_jit.c | 391 +++++++++++++++++++++++++++++++++++
> arch/loongarch/net/bpf_jit.h | 6 +
> 2 files changed, 397 insertions(+)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 9cb01f0b0..6820558af 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -7,6 +7,10 @@
> #include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_MAX_REG_ARGS 8
> +#define LOONGARCH_FENTRY_NINSNS 2
> +#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -1400,6 +1404,16 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> (unsigned long)ip, (unsigned long)target);
> }
>
> +static int emit_call(struct jit_ctx *ctx, u64 addr)
> +{
> + u64 ip;
> +
> + if (addr && ctx->image && ctx->ro_image)
> + ip = (u64)(ctx->image + ctx->idx);
> +
> + return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, ip, addr);
> +}
> +
> int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> void *old_addr, void *new_addr)
> {
> @@ -1457,3 +1471,380 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
>
> return dst;
> }
> +
> +static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> + int i;
> +
> + for (i = 0; i < nargs; i++) {
> + emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> + args_off -= 8;
> + }
> +}
> +
> +static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> + int i;
> +
> + for (i = 0; i < nargs; i++) {
> + emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> + args_off -= 8;
> + }
> +}
> +
> +static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
> + int args_off, int retval_off,
> + int run_ctx_off, bool save_ret)
> +{
> + int ret;
> + u32 *branch;
> + struct bpf_prog *p = l->link.prog;
> + int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
> +
> + if (l->cookie) {
> + move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
> + } else {
> + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
> + -run_ctx_off + cookie_off);
> + }
> +
> + /* arg1: prog */
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> + /* arg2: &run_ctx */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
> + ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
> + if (ret)
> + return ret;
> +
> + /* store prog start time */
> + move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
> +
> + /* if (__bpf_prog_enter(prog) == 0)
> + * goto skip_exec_of_prog;
> + *
> + */
> + branch = (u32 *)ctx->image + ctx->idx;
> + /* nop reserved for conditional jump */
> + emit_insn(ctx, nop);
> +
> + /* arg1: &args_off */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
> + if (!p->jited)
> + move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
> + ret = emit_call(ctx, (const u64)p->bpf_func);
> + if (ret)
> + return ret;
> +
> + if (save_ret) {
> + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + }
> +
> + /* update branch with beqz */
> + if (ctx->image) {
> + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
> + *branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
> + }
> +
> + /* arg1: prog */
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> + /* arg2: prog start time */
> + move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
> + /* arg3: &run_ctx */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
> + ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
> +
> + return ret;
> +}
> +
> +static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
> + int args_off, int retval_off, int run_ctx_off, u32 **branches)
> +{
> + int i;
> +
> + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
> + for (i = 0; i < tl->nr_links; i++) {
> + invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
> + run_ctx_off, true);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
> + branches[i] = (u32 *)ctx->image + ctx->idx;
> + emit_insn(ctx, nop);
> + }
> +}
> +
> +u64 bpf_jit_alloc_exec_limit(void)
> +{
> + return VMALLOC_END - VMALLOC_START;
> +}
> +
> +void *arch_alloc_bpf_trampoline(unsigned int size)
> +{
> + return bpf_prog_pack_alloc(size, jit_fill_hole);
> +}
> +
> +void arch_free_bpf_trampoline(void *image, unsigned int size)
> +{
> + bpf_prog_pack_free(image, size);
> +}
> +
> +static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
> + const struct btf_func_model *m,
> + struct bpf_tramp_links *tlinks,
> + void *func_addr, u32 flags)
> +{
> + int i;
> + int stack_size = 0, nargs = 0;
> + int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
> + struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> + struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> + struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> + int ret, save_ret;
> + void *orig_call = func_addr;
> + u32 **branches = NULL;
> +
> + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> + return -ENOTSUPP;
> +
> + /*
> + * FP + 8 [ RA to parent func ] return address to parent
> + * function
> + * FP + 0 [ FP of parent func ] frame pointer of parent
> + * function
> + * FP - 8 [ T0 to traced func ] return address of traced
> + * function
> + * FP - 16 [ FP of traced func ] frame pointer of traced
> + * function
> + *
> + * FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
> + * BPF_TRAMP_F_RET_FENTRY_RET
> + * [ argN ]
> + * [ ... ]
> + * FP - args_off [ arg1 ]
> + *
> + * FP - nargs_off [ regs count ]
> + *
> + * FP - ip_off [ traced func ] BPF_TRAMP_F_IP_ARG
> + *
> + * FP - run_ctx_off [ bpf_tramp_run_ctx ]
> + *
> + * FP - sreg_off [ callee saved reg ]
> + *
> + */
> +
> + if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
> + return -ENOTSUPP;
> +
> + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> + return -ENOTSUPP;
> +
> + stack_size = 0;
> +
> + /* room of trampoline frame to store return address and frame pointer */
> + stack_size += 16;
> +
> + save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
> + if (save_ret) {
> + /* Save BPF R0 and A0 */
> + stack_size += 16;
> + retval_off = stack_size;
> + }
> +
> + /* room of trampoline frame to store args */
> + nargs = m->nr_args;
> + stack_size += nargs * 8;
> + args_off = stack_size;
> +
> + /* room of trampoline frame to store args number */
> + stack_size += 8;
> + nargs_off = stack_size;
> +
> + /* room of trampoline frame to store ip address */
> + if (flags & BPF_TRAMP_F_IP_ARG) {
> + stack_size += 8;
> + ip_off = stack_size;
> + }
> +
> + /* room of trampoline frame to store struct bpf_tramp_run_ctx */
> + stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
> + run_ctx_off = stack_size;
> +
> + stack_size += 8;
> + sreg_off = stack_size;
> +
> + stack_size = round_up(stack_size, 16);
> +
> + /* For the trampoline called from function entry */
> + /* RA and FP for parent function*/
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> +
> + /* RA and FP for traced function*/
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> + emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> +
> + /* callee saved register S1 to pass start time */
> + emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> + /* store ip address of the traced function */
> + if (flags & BPF_TRAMP_F_IP_ARG) {
> + move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
> + }
> +
> + /* store nargs number*/
> + move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
> +
> + store_args(ctx, nargs, args_off);
> +
> + /* To traced function */
> + orig_call += LOONGARCH_FENTRY_NBYTES;
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> + ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
> + if (ret)
> + return ret;
> + }
> +
> + for (i = 0; i < fentry->nr_links; i++) {
> + ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
> + run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
> + if (ret)
> + return ret;
> + }
> + if (fmod_ret->nr_links) {
> + branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
> + if (!branches)
> + return -ENOMEM;
> +
> + invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
> + run_ctx_off, branches);
> + }
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + restore_args(ctx, m->nr_args, args_off);
> + ret = emit_call(ctx, (const u64)orig_call);
> + if (ret)
> + goto out;
> + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + im->ip_after_call = ctx->ro_image + ctx->idx;
> + /* Reserve space for the move_imm + jirl instruction */
> + emit_insn(ctx, nop);
> + emit_insn(ctx, nop);
> + emit_insn(ctx, nop);
> + emit_insn(ctx, nop);
> + emit_insn(ctx, nop);
> + }
> +
> + for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
> + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
> + *branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
> + }
> +
> + for (i = 0; i < fexit->nr_links; i++) {
> + ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
> + run_ctx_off, false);
> + if (ret)
> + goto out;
> + }
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + im->ip_epilogue = ctx->ro_image + ctx->idx;
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> + ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
> + if (ret)
> + goto out;
> + }
> +
> + if (flags & BPF_TRAMP_F_RESTORE_REGS)
> + restore_args(ctx, m->nr_args, args_off);
> +
> + if (save_ret) {
> + emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + }
> +
> + emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> + /* trampoline called from function entry */
> + emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> +
> + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> +
> + if (flags & BPF_TRAMP_F_SKIP_FRAME)
> + /* return to parent function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> + else
> + /* return to traced function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> +
> + ret = ctx->idx;
> +out:
> + kfree(branches);
> +
> + return ret;
> +}
> +
> +int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
> + void *ro_image_end, const struct btf_func_model *m,
> + u32 flags, struct bpf_tramp_links *tlinks,
> + void *func_addr)
> +{
> + int ret;
> + void *image, *tmp;
> + u32 size = ro_image_end - ro_image;
> +
> + image = kvmalloc(size, GFP_KERNEL);
> + if (!image)
> + return -ENOMEM;
> +
> + struct jit_ctx ctx = {
> + .image = (union loongarch_instruction *)image,
> + .ro_image = (union loongarch_instruction *)ro_image,
> + .idx = 0,
> + };
> +
Declare ctx at function entry ?
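Something like this, i.e. keep the declaration at the top of the
function and only fill in .image once the allocation has succeeded
(an untested sketch, not a tested patch):

        int ret;
        void *image, *tmp;
        u32 size = ro_image_end - ro_image;
        struct jit_ctx ctx = {
                .image = NULL,
                .ro_image = (union loongarch_instruction *)ro_image,
                .idx = 0,
        };

        image = kvmalloc(size, GFP_KERNEL);
        if (!image)
                return -ENOMEM;

        /* the buffer address is only known now, so assign it here */
        ctx.image = (union loongarch_instruction *)image;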
> + jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
> + ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
> + if (ret > 0 && validate_code(&ctx) < 0) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + tmp = bpf_arch_text_copy(ro_image, image, size);
> + if (IS_ERR(tmp)) {
> + ret = PTR_ERR(tmp);
> + goto out;
> + }
> +
> + bpf_flush_icache(ro_image, ro_image_end);
> +out:
> + kvfree(image);
> + return ret < 0 ? ret : size;
> +}
> +
> +int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
> + struct bpf_tramp_links *tlinks, void *func_addr)
> +{
> + struct bpf_tramp_image im;
> + struct jit_ctx ctx;
> + int ret;
> +
> + ctx.image = NULL;
> + ctx.idx = 0;
> +
> + ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
> +
> + /* Page align */
> + return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
> +}
> diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
> index f9c569f53..5697158fd 100644
> --- a/arch/loongarch/net/bpf_jit.h
> +++ b/arch/loongarch/net/bpf_jit.h
> @@ -18,6 +18,7 @@ struct jit_ctx {
> u32 *offset;
> int num_exentries;
> union loongarch_instruction *image;
> + union loongarch_instruction *ro_image;
> u32 stack_size;
> };
>
> @@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
>
> return -EINVAL;
> }
> +
> +static inline void bpf_flush_icache(void *start, void *end)
> +{
> + flush_icache_range((unsigned long)start, (unsigned long)end);
> +}
> --
> 2.43.0
>
* Re: [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-09 5:50 ` [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
2025-07-16 12:21 ` Hengqi Chen
@ 2025-07-16 18:41 ` Vincent Li
2025-07-18 23:08 ` Vincent Li
2 siblings, 0 replies; 24+ messages in thread
From: Vincent Li @ 2025-07-16 18:41 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai,
martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, youling.tang, jianghaoran
On Tue, Jul 8, 2025 at 11:02 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> bpf_arch_text_invalidate on the LoongArch architecture.
>
> On LoongArch, since symbol addresses in the direct mapping
> region cannot be reached via relative jump instructions from the paged
> mapping region, we use the move_imm+jirl instruction pair as absolute
> jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> instructions in the program as placeholders for function jumps.
>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/include/asm/inst.h | 1 +
> arch/loongarch/kernel/inst.c | 32 +++++++++++
> arch/loongarch/net/bpf_jit.c | 90 +++++++++++++++++++++++++++++++
> 3 files changed, 123 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 2ae96a35d..88bb73e46 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> int larch_insn_read(void *addr, u32 *insnp);
> int larch_insn_write(void *addr, u32 insn);
> int larch_insn_patch_text(void *addr, u32 insn);
> +int larch_insn_text_copy(void *dst, void *src, size_t len);
>
> u32 larch_insn_gen_nop(void);
> u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index 674e3b322..8d6594968 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -4,6 +4,7 @@
> */
> #include <linux/sizes.h>
> #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
>
> #include <asm/cacheflush.h>
> #include <asm/inst.h>
> @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> return ret;
> }
>
> +int larch_insn_text_copy(void *dst, void *src, size_t len)
> +{
> + unsigned long flags;
> + size_t wlen = 0;
> + size_t size;
> + void *ptr;
> + int ret = 0;
> +
> + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> + raw_spin_lock_irqsave(&patch_lock, flags);
> + while (wlen < len) {
> + ptr = dst + wlen;
> + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> + len - wlen);
> +
> + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> + if (ret) {
> + pr_err("%s: operation failed\n", __func__);
> + break;
> + }
> + wlen += size;
> + }
> + raw_spin_unlock_irqrestore(&patch_lock, flags);
> + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> +
> + if (!ret)
> + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> +
> + return ret;
> +}
> +
> u32 larch_insn_gen_nop(void)
> {
> return INSN_NOP;
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 7032f11d3..9cb01f0b0 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -4,6 +4,7 @@
> *
> * Copyright (C) 2022 Loongson Technology Corporation Limited
> */
> +#include <linux/memory.h>
> #include "bpf_jit.h"
>
> #define REG_TCC LOONGARCH_GPR_A6
> @@ -1367,3 +1368,92 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> {
> return true;
> }
> +
> +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 ip, u64 target)
> +{
> + s64 offset = (s64)(target - ip);
> +
> + if (offset && (offset >= -SZ_128M && offset < SZ_128M)) {
> + emit_insn(ctx, bl, offset >> 2);
> + } else {
> + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> + }
> +
> + return 0;
> +}
> +
> +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> +{
> + struct jit_ctx ctx;
> +
> + ctx.idx = 0;
> + ctx.image = (union loongarch_instruction *)insns;
> +
> + if (!target) {
> + emit_insn((&ctx), nop);
> + emit_insn((&ctx), nop);
> + return 0;
> + }
> +
> + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> + (unsigned long)ip, (unsigned long)target);
> +}
> +
> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> + void *old_addr, void *new_addr)
> +{
> + u32 old_insns[5] = {[0 ... 4] = INSN_NOP};
> + u32 new_insns[5] = {[0 ... 4] = INSN_NOP};
> + bool is_call = poke_type == BPF_MOD_CALL;
> + int ret;
> +
> + if (!is_kernel_text((unsigned long)ip) &&
> + !is_bpf_text_address((unsigned long)ip))
> + return -ENOTSUPP;
> +
> + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> + if (ret)
> + return ret;
> +
> + if (memcmp(ip, old_insns, 5 * 4))
> + return -EFAULT;
> +
> + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&text_mutex);
> + if (memcmp(ip, new_insns, 5 * 4))
> + ret = larch_insn_text_copy(ip, new_insns, 5 * 4);
> + mutex_unlock(&text_mutex);
> + return ret;
> +}
> +
I recommend adding a comment for bpf_arch_text_poke() similar to the
one for bpf_arch_text_poke() in arch/arm64/net/bpf_jit_comp.c, so we
have a clear understanding of how it works, given we already have an
issue with it when running xdp-filter from xdp-tools, which I reported
in another email thread.
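For illustration, a sketch of the kind of comment I have in mind,
based only on what this patch itself does (the exact wording is of
course up to you, please double-check the details):

/*
 * On LoongArch, target addresses in the direct mapping region cannot
 * be reached with a relative branch from the paged mapping region, so
 * an absolute jump built from a move_imm + jirl pair (2-5
 * instructions) is used instead, and every poke site is 5
 * instructions wide, padded with NOPs:
 *
 *      nop
 *      nop
 *      nop
 *      nop
 *      nop
 *
 * bpf_arch_text_poke() regenerates the sequence expected for old_addr
 * and returns -EFAULT if the current site does not match, then, under
 * text_mutex, rewrites the site with the sequence for new_addr (plain
 * NOPs when new_addr is NULL).
 */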
> +int bpf_arch_text_invalidate(void *dst, size_t len)
> +{
> + int i;
> + int ret = 0;
> + u32 *inst;
> +
> + inst = kvmalloc(len, GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + for (i = 0; i < (len/sizeof(u32)); i++)
> + inst[i] = INSN_BREAK;
> +
> + if (larch_insn_text_copy(dst, inst, len))
> + ret = -EINVAL;
> +
> + kvfree(inst);
> + return ret;
> +}
> +
> +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> +{
> + if (larch_insn_text_copy(dst, src, len))
> + return ERR_PTR(-EINVAL);
> +
> + return dst;
> +}
> --
> 2.43.0
>
>
* Re: [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-16 12:21 ` Hengqi Chen
@ 2025-07-17 9:27 ` Chenghao Duan
2025-07-17 10:12 ` Hengqi Chen
0 siblings, 1 reply; 24+ messages in thread
From: Chenghao Duan @ 2025-07-17 9:27 UTC (permalink / raw)
To: Hengqi Chen
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Wed, Jul 16, 2025 at 08:21:59PM +0800, Hengqi Chen wrote:
> On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> > bpf_arch_text_invalidate on the LoongArch architecture.
> >
> > On LoongArch, since symbol addresses in the direct mapping
> > region cannot be reached via relative jump instructions from the paged
> > mapping region, we use the move_imm+jirl instruction pair as absolute
> > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > instructions in the program as placeholders for function jumps.
> >
> > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > ---
> > arch/loongarch/include/asm/inst.h | 1 +
> > arch/loongarch/kernel/inst.c | 32 +++++++++++
> > arch/loongarch/net/bpf_jit.c | 90 +++++++++++++++++++++++++++++++
> > 3 files changed, 123 insertions(+)
> >
> > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > index 2ae96a35d..88bb73e46 100644
> > --- a/arch/loongarch/include/asm/inst.h
> > +++ b/arch/loongarch/include/asm/inst.h
> > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > int larch_insn_read(void *addr, u32 *insnp);
> > int larch_insn_write(void *addr, u32 insn);
> > int larch_insn_patch_text(void *addr, u32 insn);
> > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> >
> > u32 larch_insn_gen_nop(void);
> > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > index 674e3b322..8d6594968 100644
> > --- a/arch/loongarch/kernel/inst.c
> > +++ b/arch/loongarch/kernel/inst.c
> > @@ -4,6 +4,7 @@
> > */
> > #include <linux/sizes.h>
> > #include <linux/uaccess.h>
> > +#include <linux/set_memory.h>
> >
> > #include <asm/cacheflush.h>
> > #include <asm/inst.h>
> > @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > return ret;
> > }
> >
> > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > +{
> > + unsigned long flags;
> > + size_t wlen = 0;
> > + size_t size;
> > + void *ptr;
> > + int ret = 0;
> > +
> > + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > + raw_spin_lock_irqsave(&patch_lock, flags);
> > + while (wlen < len) {
> > + ptr = dst + wlen;
> > + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> > + len - wlen);
> > +
> > + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> > + if (ret) {
> > + pr_err("%s: operation failed\n", __func__);
> > + break;
> > + }
> > + wlen += size;
> > + }
>
> Again, why do you do copy_to_kernel_nofault() in a loop ?
The while loop handles all sizes, including copies that cross page
boundaries. I referred to how ARM64 and RISC-V handle this; they use
loops as well.
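For example (hypothetical numbers, just to sketch the chunking
arithmetic in the loop above):

        /*
         * Suppose dst ends 16 bytes before a page boundary and len = 40:
         *
         *   pass 1: size = min(PAGE_SIZE - offset_in_page(ptr), 40)
         *                = min(16, 40) = 16
         *   pass 2: ptr is now page-aligned,
         *           size = min(PAGE_SIZE, 40 - 16) = 24
         *
         * so no single copy_to_kernel_nofault() call crosses a page
         * boundary.
         */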
> This larch_insn_text_copy() can be part of the first patch like
> larch_insn_gen_{beq,bne}. WDYT ?
From my perspective, it is acceptable to include both
larch_insn_text_copy and larch_insn_gen_{beq,bne} in the same patch,
or place them in the bpf_arch_xxxx patch. larch_insn_text_copy is
solely used for BPF; the application scope of larch_insn_gen_{beq,bne}
is not limited to BPF.
>
> > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > +
> > + if (!ret)
> > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > +
> > + return ret;
> > +}
> > +
> > u32 larch_insn_gen_nop(void)
> > {
> > return INSN_NOP;
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index 7032f11d3..9cb01f0b0 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -4,6 +4,7 @@
> > *
> > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > */
> > +#include <linux/memory.h>
> > #include "bpf_jit.h"
> >
> > #define REG_TCC LOONGARCH_GPR_A6
> > @@ -1367,3 +1368,92 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > {
> > return true;
> > }
> > +
> > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 ip, u64 target)
> > +{
> > + s64 offset = (s64)(target - ip);
> > +
> > + if (offset && (offset >= -SZ_128M && offset < SZ_128M)) {
> > + emit_insn(ctx, bl, offset >> 2);
> > + } else {
> > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > +{
> > + struct jit_ctx ctx;
> > +
> > + ctx.idx = 0;
> > + ctx.image = (union loongarch_instruction *)insns;
> > +
> > + if (!target) {
> > + emit_insn((&ctx), nop);
> > + emit_insn((&ctx), nop);
> > + return 0;
> > + }
> > +
> > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > + (unsigned long)ip, (unsigned long)target);
> > +}
> > +
> > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > + void *old_addr, void *new_addr)
> > +{
> > + u32 old_insns[5] = {[0 ... 4] = INSN_NOP};
> > + u32 new_insns[5] = {[0 ... 4] = INSN_NOP};
> > + bool is_call = poke_type == BPF_MOD_CALL;
> > + int ret;
> > +
> > + if (!is_kernel_text((unsigned long)ip) &&
> > + !is_bpf_text_address((unsigned long)ip))
> > + return -ENOTSUPP;
> > +
> > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > + if (ret)
> > + return ret;
> > +
> > + if (memcmp(ip, old_insns, 5 * 4))
> > + return -EFAULT;
> > +
> > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > + if (ret)
> > + return ret;
> > +
> > + mutex_lock(&text_mutex);
> > + if (memcmp(ip, new_insns, 5 * 4))
> > + ret = larch_insn_text_copy(ip, new_insns, 5 * 4);
> > + mutex_unlock(&text_mutex);
> > + return ret;
> > +}
> > +
> > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > +{
> > + int i;
> > + int ret = 0;
> > + u32 *inst;
> > +
> > + inst = kvmalloc(len, GFP_KERNEL);
> > + if (!inst)
> > + return -ENOMEM;
> > +
> > + for (i = 0; i < (len/sizeof(u32)); i++)
> > + inst[i] = INSN_BREAK;
> > +
> > + if (larch_insn_text_copy(dst, inst, len))
> > + ret = -EINVAL;
> > +
> > + kvfree(inst);
> > + return ret;
> > +}
> > +
> > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > +{
> > + if (larch_insn_text_copy(dst, src, len))
> > + return ERR_PTR(-EINVAL);
> > +
> > + return dst;
> > +}
> > --
> > 2.43.0
> >
* Re: [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-16 12:32 ` Hengqi Chen
@ 2025-07-17 9:43 ` Chenghao Duan
0 siblings, 0 replies; 24+ messages in thread
From: Chenghao Duan @ 2025-07-17 9:43 UTC (permalink / raw)
To: Hengqi Chen
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Wed, Jul 16, 2025 at 08:32:54PM +0800, Hengqi Chen wrote:
> On Wed, Jul 9, 2025 at 1:51 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > The BPF trampoline is critical infrastructure of the BPF subsystem,
> > acting as a mediator between kernel functions and BPF programs.
> > Numerous important features, such as using BPF programs for
> > zero-overhead kernel introspection, rely on this key component.
> >
> > The related tests have passed, including the following technical points:
> > 1. fentry
> > 2. fmod_ret
> > 3. fexit
> >
> > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > ---
> > arch/loongarch/net/bpf_jit.c | 391 +++++++++++++++++++++++++++++++++++
> > arch/loongarch/net/bpf_jit.h | 6 +
> > 2 files changed, 397 insertions(+)
> >
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index 9cb01f0b0..6820558af 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -7,6 +7,10 @@
> > #include <linux/memory.h>
> > #include "bpf_jit.h"
> >
> > +#define LOONGARCH_MAX_REG_ARGS 8
> > +#define LOONGARCH_FENTRY_NINSNS 2
> > +#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
> > +
> > #define REG_TCC LOONGARCH_GPR_A6
> > #define TCC_SAVED LOONGARCH_GPR_S5
> >
> > @@ -1400,6 +1404,16 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > (unsigned long)ip, (unsigned long)target);
> > }
> >
> > +static int emit_call(struct jit_ctx *ctx, u64 addr)
> > +{
> > + u64 ip;
> > +
> > + if (addr && ctx->image && ctx->ro_image)
> > + ip = (u64)(ctx->image + ctx->idx);
> > +
> > + return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, ip, addr);
> > +}
> > +
> > int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > void *old_addr, void *new_addr)
> > {
> > @@ -1457,3 +1471,380 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> >
> > return dst;
> > }
> > +
> > +static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
> > +{
> > + int i;
> > +
> > + for (i = 0; i < nargs; i++) {
> > + emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> > + args_off -= 8;
> > + }
> > +}
> > +
> > +static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
> > +{
> > + int i;
> > +
> > + for (i = 0; i < nargs; i++) {
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> > + args_off -= 8;
> > + }
> > +}
> > +
> > +static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
> > + int args_off, int retval_off,
> > + int run_ctx_off, bool save_ret)
> > +{
> > + int ret;
> > + u32 *branch;
> > + struct bpf_prog *p = l->link.prog;
> > + int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
> > +
> > + if (l->cookie) {
> > + move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
> > + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
> > + } else {
> > + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
> > + -run_ctx_off + cookie_off);
> > + }
> > +
> > + /* arg1: prog */
> > + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> > + /* arg2: &run_ctx */
> > + emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
> > + ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
> > + if (ret)
> > + return ret;
> > +
> > + /* store prog start time */
> > + move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
> > +
> > + /* if (__bpf_prog_enter(prog) == 0)
> > + * goto skip_exec_of_prog;
> > + *
> > + */
> > + branch = (u32 *)ctx->image + ctx->idx;
> > + /* nop reserved for conditional jump */
> > + emit_insn(ctx, nop);
> > +
> > + /* arg1: &args_off */
> > + emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
> > + if (!p->jited)
> > + move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
> > + ret = emit_call(ctx, (const u64)p->bpf_func);
> > + if (ret)
> > + return ret;
> > +
> > + if (save_ret) {
> > + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> > + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> > + }
> > +
> > + /* update branch with beqz */
> > + if (ctx->image) {
> > + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
> > + *branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
> > + }
> > +
> > + /* arg1: prog */
> > + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> > + /* arg2: prog start time */
> > + move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
> > + /* arg3: &run_ctx */
> > + emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
> > + ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
> > +
> > + return ret;
> > +}
> > +
> > +static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
> > + int args_off, int retval_off, int run_ctx_off, u32 **branches)
> > +{
> > + int i;
> > +
> > + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
> > + for (i = 0; i < tl->nr_links; i++) {
> > + invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
> > + run_ctx_off, true);
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
> > + branches[i] = (u32 *)ctx->image + ctx->idx;
> > + emit_insn(ctx, nop);
> > + }
> > +}
> > +
> > +u64 bpf_jit_alloc_exec_limit(void)
> > +{
> > + return VMALLOC_END - VMALLOC_START;
> > +}
> > +
> > +void *arch_alloc_bpf_trampoline(unsigned int size)
> > +{
> > + return bpf_prog_pack_alloc(size, jit_fill_hole);
> > +}
> > +
> > +void arch_free_bpf_trampoline(void *image, unsigned int size)
> > +{
> > + bpf_prog_pack_free(image, size);
> > +}
> > +
> > +static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
> > + const struct btf_func_model *m,
> > + struct bpf_tramp_links *tlinks,
> > + void *func_addr, u32 flags)
> > +{
> > + int i;
> > + int stack_size = 0, nargs = 0;
> > + int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
> > + struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> > + struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> > + struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> > + int ret, save_ret;
> > + void *orig_call = func_addr;
> > + u32 **branches = NULL;
> > +
> > + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> > + return -ENOTSUPP;
> > +
> > + /*
> > + * FP + 8 [ RA to parent func ] return address to parent
> > + * function
> > + * FP + 0 [ FP of parent func ] frame pointer of parent
> > + * function
> > + * FP - 8 [ T0 to traced func ] return address of traced
> > + * function
> > + * FP - 16 [ FP of traced func ] frame pointer of traced
> > + * function
> > + *
> > + * FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
> > + * BPF_TRAMP_F_RET_FENTRY_RET
> > + * [ argN ]
> > + * [ ... ]
> > + * FP - args_off [ arg1 ]
> > + *
> > + * FP - nargs_off [ regs count ]
> > + *
> > + * FP - ip_off [ traced func ] BPF_TRAMP_F_IP_ARG
> > + *
> > + * FP - run_ctx_off [ bpf_tramp_run_ctx ]
> > + *
> > + * FP - sreg_off [ callee saved reg ]
> > + *
> > + */
> > +
> > + if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
> > + return -ENOTSUPP;
> > +
> > + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> > + return -ENOTSUPP;
> > +
> > + stack_size = 0;
> > +
> > + /* room of trampoline frame to store return address and frame pointer */
> > + stack_size += 16;
> > +
> > + save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
> > + if (save_ret) {
> > + /* Save BPF R0 and A0 */
> > + stack_size += 16;
> > + retval_off = stack_size;
> > + }
> > +
> > + /* room of trampoline frame to store args */
> > + nargs = m->nr_args;
> > + stack_size += nargs * 8;
> > + args_off = stack_size;
> > +
> > + /* room of trampoline frame to store args number */
> > + stack_size += 8;
> > + nargs_off = stack_size;
> > +
> > + /* room of trampoline frame to store ip address */
> > + if (flags & BPF_TRAMP_F_IP_ARG) {
> > + stack_size += 8;
> > + ip_off = stack_size;
> > + }
> > +
> > + /* room of trampoline frame to store struct bpf_tramp_run_ctx */
> > + stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
> > + run_ctx_off = stack_size;
> > +
> > + stack_size += 8;
> > + sreg_off = stack_size;
> > +
> > + stack_size = round_up(stack_size, 16);
> > +
> > + /* For the trampoline called from function entry */
> > + /* RA and FP for parent function*/
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> > + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> > +
> > + /* RA and FP for traced function*/
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> > + emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> > +
> > + /* callee saved register S1 to pass start time */
> > + emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> > +
> > + /* store ip address of the traced function */
> > + if (flags & BPF_TRAMP_F_IP_ARG) {
> > + move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
> > + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
> > + }
> > +
> > + /* store nargs number*/
> > + move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
> > + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
> > +
> > + store_args(ctx, nargs, args_off);
> > +
> > + /* To traced function */
> > + orig_call += LOONGARCH_FENTRY_NBYTES;
> > + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> > + ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + for (i = 0; i < fentry->nr_links; i++) {
> > + ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
> > + run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
> > + if (ret)
> > + return ret;
> > + }
> > + if (fmod_ret->nr_links) {
> > + branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
> > + if (!branches)
> > + return -ENOMEM;
> > +
> > + invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
> > + run_ctx_off, branches);
> > + }
> > +
> > + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > + restore_args(ctx, m->nr_args, args_off);
> > + ret = emit_call(ctx, (const u64)orig_call);
> > + if (ret)
> > + goto out;
> > + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> > + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> > + im->ip_after_call = ctx->ro_image + ctx->idx;
> > + /* Reserve space for the move_imm + jirl instruction */
> > + emit_insn(ctx, nop);
> > + emit_insn(ctx, nop);
> > + emit_insn(ctx, nop);
> > + emit_insn(ctx, nop);
> > + emit_insn(ctx, nop);
> > + }
> > +
> > + for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
> > + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
> > + *branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
> > + }
> > +
> > + for (i = 0; i < fexit->nr_links; i++) {
> > + ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
> > + run_ctx_off, false);
> > + if (ret)
> > + goto out;
> > + }
> > +
> > + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > + im->ip_epilogue = ctx->ro_image + ctx->idx;
> > + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> > + ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
> > + if (ret)
> > + goto out;
> > + }
> > +
> > + if (flags & BPF_TRAMP_F_RESTORE_REGS)
> > + restore_args(ctx, m->nr_args, args_off);
> > +
> > + if (save_ret) {
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> > + emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> > + }
> > +
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> > +
> > + /* trampoline called from function entry */
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> > +
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> > +
> > + if (flags & BPF_TRAMP_F_SKIP_FRAME)
> > + /* return to parent function */
> > + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> > + else
> > + /* return to traced function */
> > + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> > +
> > + ret = ctx->idx;
> > +out:
> > + kfree(branches);
> > +
> > + return ret;
> > +}
> > +
> > +int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
> > + void *ro_image_end, const struct btf_func_model *m,
> > + u32 flags, struct bpf_tramp_links *tlinks,
> > + void *func_addr)
> > +{
> > + int ret;
> > + void *image, *tmp;
> > + u32 size = ro_image_end - ro_image;
> > +
> > + image = kvmalloc(size, GFP_KERNEL);
> > + if (!image)
> > + return -ENOMEM;
> > +
> > + struct jit_ctx ctx = {
> > + .image = (union loongarch_instruction *)image,
> > + .ro_image = (union loongarch_instruction *)ro_image,
> > + .idx = 0,
> > + };
> > +
>
> Declare ctx at function entry ?
Yes, since the image is allocated first, the ctx declaration ends up
where it is now.
Do you think there's anything wrong with doing it this way?
>
> > + jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
> > + ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
> > + if (ret > 0 && validate_code(&ctx) < 0) {
> > + ret = -EINVAL;
> > + goto out;
> > + }
> > +
> > + tmp = bpf_arch_text_copy(ro_image, image, size);
> > + if (IS_ERR(tmp)) {
> > + ret = PTR_ERR(tmp);
> > + goto out;
> > + }
> > +
> > + bpf_flush_icache(ro_image, ro_image_end);
> > +out:
> > + kvfree(image);
> > + return ret < 0 ? ret : size;
> > +}
> > +
> > +int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
> > + struct bpf_tramp_links *tlinks, void *func_addr)
> > +{
> > + struct bpf_tramp_image im;
> > + struct jit_ctx ctx;
> > + int ret;
> > +
> > + ctx.image = NULL;
> > + ctx.idx = 0;
> > +
> > + ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
> > +
> > + /* Page align */
> > + return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
> > +}
> > diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
> > index f9c569f53..5697158fd 100644
> > --- a/arch/loongarch/net/bpf_jit.h
> > +++ b/arch/loongarch/net/bpf_jit.h
> > @@ -18,6 +18,7 @@ struct jit_ctx {
> > u32 *offset;
> > int num_exentries;
> > union loongarch_instruction *image;
> > + union loongarch_instruction *ro_image;
> > u32 stack_size;
> > };
> >
> > @@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
> >
> > return -EINVAL;
> > }
> > +
> > +static inline void bpf_flush_icache(void *start, void *end)
> > +{
> > + flush_icache_range((unsigned long)start, (unsigned long)end);
> > +}
> > --
> > 2.43.0
> >
* Re: [PATCH v3 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx.
2025-07-16 11:55 ` Hengqi Chen
@ 2025-07-17 9:46 ` Chenghao Duan
0 siblings, 0 replies; 24+ messages in thread
From: Chenghao Duan @ 2025-07-17 9:46 UTC (permalink / raw)
To: Hengqi Chen
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Wed, Jul 16, 2025 at 07:55:46PM +0800, Hengqi Chen wrote:
> On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > Update the code to rename validate_code to validate_ctx.
> > validate_code is used to check the validity of code.
> > validate_ctx is used to check both code validity and table entry
> > correctness.
> >
>
> The commit message is awkward to read.
> Please describe the purpose of this change.
> * Rename the existing validate_code() to validate_ctx()
> * Factor out the code validation handling into a new helper validate_code()
>
> The new validate_code() will be used in subsequent changes.
>
Hi Hengqi,
Thank you very much for your suggestions. I will revise the commit
message in v4 following your advice.
Chenghao
> > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > ---
> > arch/loongarch/net/bpf_jit.c | 10 +++++++++-
> > 1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index fa1500d4a..7032f11d3 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -1180,6 +1180,14 @@ static int validate_code(struct jit_ctx *ctx)
> > return -1;
> > }
> >
> > + return 0;
> > +}
> > +
> > +static int validate_ctx(struct jit_ctx *ctx)
> > +{
> > + if (validate_code(ctx))
> > + return -1;
> > +
> > if (WARN_ON_ONCE(ctx->num_exentries != ctx->prog->aux->num_exentries))
> > return -1;
> >
> > @@ -1288,7 +1296,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
> > build_epilogue(&ctx);
> >
> > /* 3. Extra pass to validate JITed code */
> > - if (validate_code(&ctx)) {
> > + if (validate_ctx(&ctx)) {
> > bpf_jit_binary_free(header);
> > prog = orig_prog;
> > goto out_offset;
> > --
> > 2.43.0
> >
* Re: [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-17 9:27 ` Chenghao Duan
@ 2025-07-17 10:12 ` Hengqi Chen
2025-07-18 2:16 ` Chenghao Duan
0 siblings, 1 reply; 24+ messages in thread
From: Hengqi Chen @ 2025-07-17 10:12 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Thu, Jul 17, 2025 at 5:27 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> On Wed, Jul 16, 2025 at 08:21:59PM +0800, Hengqi Chen wrote:
> > On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> > > bpf_arch_text_invalidate on the LoongArch architecture.
> > >
> > > On LoongArch, since symbol addresses in the direct mapping
> > > region cannot be reached via relative jump instructions from the paged
> > > mapping region, we use the move_imm+jirl instruction pair as absolute
> > > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > > instructions in the program as placeholders for function jumps.
> > >
> > > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > > ---
> > > arch/loongarch/include/asm/inst.h | 1 +
> > > arch/loongarch/kernel/inst.c | 32 +++++++++++
> > > arch/loongarch/net/bpf_jit.c | 90 +++++++++++++++++++++++++++++++
> > > 3 files changed, 123 insertions(+)
> > >
> > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > > index 2ae96a35d..88bb73e46 100644
> > > --- a/arch/loongarch/include/asm/inst.h
> > > +++ b/arch/loongarch/include/asm/inst.h
> > > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > > int larch_insn_read(void *addr, u32 *insnp);
> > > int larch_insn_write(void *addr, u32 insn);
> > > int larch_insn_patch_text(void *addr, u32 insn);
> > > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> > >
> > > u32 larch_insn_gen_nop(void);
> > > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > index 674e3b322..8d6594968 100644
> > > --- a/arch/loongarch/kernel/inst.c
> > > +++ b/arch/loongarch/kernel/inst.c
> > > @@ -4,6 +4,7 @@
> > > */
> > > #include <linux/sizes.h>
> > > #include <linux/uaccess.h>
> > > +#include <linux/set_memory.h>
> > >
> > > #include <asm/cacheflush.h>
> > > #include <asm/inst.h>
> > > @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > return ret;
> > > }
> > >
> > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > +{
> > > + unsigned long flags;
> > > + size_t wlen = 0;
> > > + size_t size;
> > > + void *ptr;
> > > + int ret = 0;
> > > +
> > > + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > > + raw_spin_lock_irqsave(&patch_lock, flags);
> > > + while (wlen < len) {
> > > + ptr = dst + wlen;
> > > + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> > > + len - wlen);
> > > +
> > > + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> > > + if (ret) {
> > > + pr_err("%s: operation failed\n", __func__);
> > > + break;
> > > + }
> > > + wlen += size;
> > > + }
> >
> > Again, why do you do copy_to_kernel_nofault() in a loop ?
>
> The while loop handles all sizes, including copies that cross page
> boundaries. I referred to how ARM64 and RISC-V handle this; they use
> loops as well.
Any pointers ?
>
> > This larch_insn_text_copy() can be part of the first patch like
> > larch_insn_gen_{beq,bne}. WDYT ?
>
> From my perspective, it is acceptable to include both
> larch_insn_text_copy and larch_insn_gen_{beq,bne} in the same patch,
> or place them in the bpf_arch_xxxx patch. larch_insn_text_copy is
> solely used for BPF; the application scope of larch_insn_gen_{beq,bne}
> is not limited to BPF.
>
The implementation of larch_insn_text_copy() seems generic.
> >
> > > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > > +
> > > + if (!ret)
> > > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > > +
> > > + return ret;
> > > +}
> > > +
> > > u32 larch_insn_gen_nop(void)
> > > {
> > > return INSN_NOP;
> > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > index 7032f11d3..9cb01f0b0 100644
> > > --- a/arch/loongarch/net/bpf_jit.c
> > > +++ b/arch/loongarch/net/bpf_jit.c
> > > @@ -4,6 +4,7 @@
> > > *
> > > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > > */
> > > +#include <linux/memory.h>
> > > #include "bpf_jit.h"
> > >
> > > #define REG_TCC LOONGARCH_GPR_A6
> > > @@ -1367,3 +1368,92 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > > {
> > > return true;
> > > }
> > > +
> > > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 ip, u64 target)
> > > +{
> > > + s64 offset = (s64)(target - ip);
> > > +
> > > + if (offset && (offset >= -SZ_128M && offset < SZ_128M)) {
> > > + emit_insn(ctx, bl, offset >> 2);
> > > + } else {
> > > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > > +{
> > > + struct jit_ctx ctx;
> > > +
> > > + ctx.idx = 0;
> > > + ctx.image = (union loongarch_instruction *)insns;
> > > +
> > > + if (!target) {
> > > + emit_insn((&ctx), nop);
> > > + emit_insn((&ctx), nop);
> > > + return 0;
> > > + }
> > > +
> > > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > > + (unsigned long)ip, (unsigned long)target);
> > > +}
> > > +
> > > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > > + void *old_addr, void *new_addr)
> > > +{
> > > + u32 old_insns[5] = {[0 ... 4] = INSN_NOP};
> > > + u32 new_insns[5] = {[0 ... 4] = INSN_NOP};
> > > + bool is_call = poke_type == BPF_MOD_CALL;
> > > + int ret;
> > > +
> > > + if (!is_kernel_text((unsigned long)ip) &&
> > > + !is_bpf_text_address((unsigned long)ip))
> > > + return -ENOTSUPP;
> > > +
> > > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + if (memcmp(ip, old_insns, 5 * 4))
> > > + return -EFAULT;
> > > +
> > > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + mutex_lock(&text_mutex);
> > > + if (memcmp(ip, new_insns, 5 * 4))
> > > + ret = larch_insn_text_copy(ip, new_insns, 5 * 4);
> > > + mutex_unlock(&text_mutex);
> > > + return ret;
> > > +}
> > > +
> > > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > > +{
> > > + int i;
> > > + int ret = 0;
> > > + u32 *inst;
> > > +
> > > + inst = kvmalloc(len, GFP_KERNEL);
> > > + if (!inst)
> > > + return -ENOMEM;
> > > +
> > > + for (i = 0; i < (len/sizeof(u32)); i++)
> > > + inst[i] = INSN_BREAK;
> > > +
> > > + if (larch_insn_text_copy(dst, inst, len))
> > > + ret = -EINVAL;
> > > +
> > > + kvfree(inst);
> > > + return ret;
> > > +}
> > > +
> > > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > > +{
> > > + if (larch_insn_text_copy(dst, src, len))
> > > + return ERR_PTR(-EINVAL);
> > > +
> > > + return dst;
> > > +}
> > > --
> > > 2.43.0
> > >
* Re: [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-17 10:12 ` Hengqi Chen
@ 2025-07-18 2:16 ` Chenghao Duan
2025-07-21 1:38 ` Hengqi Chen
0 siblings, 1 reply; 24+ messages in thread
From: Chenghao Duan @ 2025-07-18 2:16 UTC (permalink / raw)
To: Hengqi Chen
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Thu, Jul 17, 2025 at 06:12:55PM +0800, Hengqi Chen wrote:
> On Thu, Jul 17, 2025 at 5:27 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > On Wed, Jul 16, 2025 at 08:21:59PM +0800, Hengqi Chen wrote:
> > > On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > >
> > > > Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> > > > bpf_arch_text_invalidate on the LoongArch architecture.
> > > >
> > > > On LoongArch, since symbol addresses in the direct mapping
> > > > region cannot be reached via relative jump instructions from the paged
> > > > mapping region, we use the move_imm+jirl instruction pair as absolute
> > > > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > > > instructions in the program as placeholders for function jumps.
> > > >
> > > > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > > > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > > > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > > > ---
> > > > arch/loongarch/include/asm/inst.h | 1 +
> > > > arch/loongarch/kernel/inst.c | 32 +++++++++++
> > > > arch/loongarch/net/bpf_jit.c | 90 +++++++++++++++++++++++++++++++
> > > > 3 files changed, 123 insertions(+)
> > > >
> > > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > > > index 2ae96a35d..88bb73e46 100644
> > > > --- a/arch/loongarch/include/asm/inst.h
> > > > +++ b/arch/loongarch/include/asm/inst.h
> > > > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > > > int larch_insn_read(void *addr, u32 *insnp);
> > > > int larch_insn_write(void *addr, u32 insn);
> > > > int larch_insn_patch_text(void *addr, u32 insn);
> > > > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> > > >
> > > > u32 larch_insn_gen_nop(void);
> > > > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > > index 674e3b322..8d6594968 100644
> > > > --- a/arch/loongarch/kernel/inst.c
> > > > +++ b/arch/loongarch/kernel/inst.c
> > > > @@ -4,6 +4,7 @@
> > > > */
> > > > #include <linux/sizes.h>
> > > > #include <linux/uaccess.h>
> > > > +#include <linux/set_memory.h>
> > > >
> > > > #include <asm/cacheflush.h>
> > > > #include <asm/inst.h>
> > > > @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > > return ret;
> > > > }
> > > >
> > > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > > +{
> > > > + unsigned long flags;
> > > > + size_t wlen = 0;
> > > > + size_t size;
> > > > + void *ptr;
> > > > + int ret = 0;
> > > > +
> > > > + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > > > + raw_spin_lock_irqsave(&patch_lock, flags);
> > > > + while (wlen < len) {
> > > > + ptr = dst + wlen;
> > > > + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> > > > + len - wlen);
> > > > +
> > > > + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> > > > + if (ret) {
> > > > + pr_err("%s: operation failed\n", __func__);
> > > > + break;
> > > > + }
> > > > + wlen += size;
> > > > + }
> > >
> > > Again, why do you do copy_to_kernel_nofault() in a loop ?
> >
> > The while loop handles all sizes, including copies that cross page
> > boundaries. I referred to how ARM64 and RISC-V handle this; they use
> > loops as well.
>
> Any pointers ?
I didn't understand what you meant.
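If you meant pointers to the arch code, what I had in mind is (as far
as I remember) the page-by-page copy helpers on the other
architectures:

        /*
         * arm64: aarch64_insn_copy() in arch/arm64/kernel/patching.c
         * riscv: patch_insn_write() in arch/riscv/kernel/patch.c
         *
         * Both split the copy so that no single write crosses a page
         * boundary, which is the pattern larch_insn_text_copy()
         * follows.
         */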
>
> >
> > > This larch_insn_text_copy() can be part of the first patch like
> > > larch_insn_gen_{beq,bne}. WDYT ?
> >
> > From my perspective, it is acceptable to include both
> > larch_insn_text_copy and larch_insn_gen_{beq,bne} in the same patch,
> > or place them in the bpf_arch_xxxx patch. larch_insn_text_copy is
> > solely used for BPF; the application scope of larch_insn_gen_{beq,bne}
> > is not limited to BPF.
> >
>
> The implementation of larch_insn_text_copy() seems generic.
Using larch_insn_text_copy() requires PAGE_SIZE alignment, since it
changes page permissions around the copy. Currently, only the size of
the trampoline is page-aligned.
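That alignment comes from patch 5/5: arch_bpf_trampoline_size() rounds
the trampoline size up to a page before it is allocated and copied:

        /* Page align */
        return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);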
>
> > >
> > > > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > > + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > > > +
> > > > + if (!ret)
> > > > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > > > +
> > > > + return ret;
> > > > +}
> > > > +
> > > > u32 larch_insn_gen_nop(void)
> > > > {
> > > > return INSN_NOP;
> > > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > > index 7032f11d3..9cb01f0b0 100644
> > > > --- a/arch/loongarch/net/bpf_jit.c
> > > > +++ b/arch/loongarch/net/bpf_jit.c
> > > > @@ -4,6 +4,7 @@
> > > > *
> > > > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > > > */
> > > > +#include <linux/memory.h>
> > > > #include "bpf_jit.h"
> > > >
> > > > #define REG_TCC LOONGARCH_GPR_A6
> > > > @@ -1367,3 +1368,92 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > > > {
> > > > return true;
> > > > }
> > > > +
> > > > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 ip, u64 target)
> > > > +{
> > > > +        s64 offset = (s64)(target - ip);
> > > > +
> > > > +        if (offset && (offset >= -SZ_128M && offset < SZ_128M)) {
> > > > +                emit_insn(ctx, bl, offset >> 2);
> > > > +        } else {
> > > > +                move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > > > +                emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > > > +        }
> > > > +
> > > > +        return 0;
> > > > +}
> > > > +
> > > > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > > > +{
> > > > +        struct jit_ctx ctx;
> > > > +
> > > > +        ctx.idx = 0;
> > > > +        ctx.image = (union loongarch_instruction *)insns;
> > > > +
> > > > +        if (!target) {
> > > > +                emit_insn((&ctx), nop);
> > > > +                emit_insn((&ctx), nop);
> > > > +                return 0;
> > > > +        }
> > > > +
> > > > +        return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > > > +                                  (unsigned long)ip, (unsigned long)target);
> > > > +}
> > > > +
> > > > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > > > +                       void *old_addr, void *new_addr)
> > > > +{
> > > > +        u32 old_insns[5] = {[0 ... 4] = INSN_NOP};
> > > > +        u32 new_insns[5] = {[0 ... 4] = INSN_NOP};
> > > > +        bool is_call = poke_type == BPF_MOD_CALL;
> > > > +        int ret;
> > > > +
> > > > +        if (!is_kernel_text((unsigned long)ip) &&
> > > > +            !is_bpf_text_address((unsigned long)ip))
> > > > +                return -ENOTSUPP;
> > > > +
> > > > +        ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > > > +        if (ret)
> > > > +                return ret;
> > > > +
> > > > +        if (memcmp(ip, old_insns, 5 * 4))
> > > > +                return -EFAULT;
> > > > +
> > > > +        ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > > > +        if (ret)
> > > > +                return ret;
> > > > +
> > > > +        mutex_lock(&text_mutex);
> > > > +        if (memcmp(ip, new_insns, 5 * 4))
> > > > +                ret = larch_insn_text_copy(ip, new_insns, 5 * 4);
> > > > +        mutex_unlock(&text_mutex);
> > > > +        return ret;
> > > > +}
> > > > +
> > > > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > > > +{
> > > > +        int i;
> > > > +        int ret = 0;
> > > > +        u32 *inst;
> > > > +
> > > > +        inst = kvmalloc(len, GFP_KERNEL);
> > > > +        if (!inst)
> > > > +                return -ENOMEM;
> > > > +
> > > > +        for (i = 0; i < (len/sizeof(u32)); i++)
> > > > +                inst[i] = INSN_BREAK;
> > > > +
> > > > +        if (larch_insn_text_copy(dst, inst, len))
> > > > +                ret = -EINVAL;
> > > > +
> > > > +        kvfree(inst);
> > > > +        return ret;
> > > > +}
> > > > +
> > > > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > > > +{
> > > > +        if (larch_insn_text_copy(dst, src, len))
> > > > +                return ERR_PTR(-EINVAL);
> > > > +
> > > > +        return dst;
> > > > +}
> > > > --
> > > > 2.43.0
> > > >
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-09 5:50 ` [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
2025-07-16 12:21 ` Hengqi Chen
2025-07-16 18:41 ` Vincent Li
@ 2025-07-18 23:08 ` Vincent Li
2 siblings, 0 replies; 24+ messages in thread
From: Vincent Li @ 2025-07-18 23:08 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai,
martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, youling.tang, jianghaoran
Hi Chenghao,
On Tue, Jul 8, 2025 at 11:02 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> bpf_arch_text_invalidate on the LoongArch architecture.
>
> On LoongArch, since symbol addresses in the direct mapping
> region cannot be reached via relative jump instructions from the paged
> mapping region, we use the move_imm+jirl instruction pair as absolute
> jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> instructions in the program as placeholders for function jumps.
>
When I compare your LoongArch implementation to riscv commit 596f2e6f9
("riscv, bpf: Add bpf_arch_text_poke support for RV64"), I noticed that
the riscv commit makes the change below to bpf_jit_build_prologue(),
which I think reserves 4 NOPs for the bpf2bpf (BPF freplace BPF) use
case, but your implementation does not make a similar change to the
LoongArch build_prologue().
@@ -1293,6 +1373,10 @@ void bpf_jit_build_prologue(struct rv_jit_context *ctx)
         store_offset = stack_adjust - 8;
+        /* reserve 4 nop insns */
+        for (i = 0; i < 4; i++)
+                emit(rv_nop(), ctx);
+
A later riscv commit 25ad10658d ("riscv, bpf: Adapt bpf trampoline to
optimized riscv ftrace framework") made further changes to
bpf_jit_build_prologue(), shown below.
@@ -1691,8 +1702,8 @@ void bpf_jit_build_prologue(struct rv_jit_context *ctx)
         store_offset = stack_adjust - 8;
-        /* reserve 4 nop insns */
-        for (i = 0; i < 4; i++)
+        /* nops reserved for auipc+jalr pair */
+        for (i = 0; i < RV_FENTRY_NINSNS; i++)
                 emit(rv_nop(), ctx);
I assume LoongArch has not adopted that ftrace framework, so the
LoongArch build_prologue() should reserve 5 NOPs too? And would that
resolve the xdp-tools use case and the fexit_bpf2bpf selftest issue?
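For illustration only, the LoongArch counterpart might look roughly like
the sketch below (emit_insn() and struct jit_ctx are the names used in
the quoted bpf_jit.c; LOONGARCH_FENTRY_NINSNS is a name I made up for
the 5-slot reservation, mirroring riscv's RV_FENTRY_NINSNS):

/* sketch, not the actual patch: reserve a 5-insn patch site at
 * function entry for the move_imm+jirl pair that bpf_arch_text_poke()
 * writes
 */
#define LOONGARCH_FENTRY_NINSNS 5

static void build_prologue(struct jit_ctx *ctx)
{
        int i;

        for (i = 0; i < LOONGARCH_FENTRY_NINSNS; i++)
                emit_insn(ctx, nop);

        /* ... existing stack frame setup continues here ... */
}

Whether the nops should go before or after the frame setup would of
course have to match what the trampoline attach code expects.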
> [...]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-18 2:16 ` Chenghao Duan
@ 2025-07-21 1:38 ` Hengqi Chen
2025-07-21 7:59 ` Chenghao Duan
0 siblings, 1 reply; 24+ messages in thread
From: Hengqi Chen @ 2025-07-21 1:38 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Fri, Jul 18, 2025 at 10:17 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> On Thu, Jul 17, 2025 at 06:12:55PM +0800, Hengqi Chen wrote:
> > On Thu, Jul 17, 2025 at 5:27 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > On Wed, Jul 16, 2025 at 08:21:59PM +0800, Hengqi Chen wrote:
> > > > On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > >
> > > > > Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> > > > > bpf_arch_text_invalidate on the LoongArch architecture.
> > > > >
> > > > > On LoongArch, since symbol addresses in the direct mapping
> > > > > region cannot be reached via relative jump instructions from the paged
> > > > > mapping region, we use the move_imm+jirl instruction pair as absolute
> > > > > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > > > > instructions in the program as placeholders for function jumps.
> > > > >
> > > > > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > > > > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > > > > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > > > > ---
> > > > > arch/loongarch/include/asm/inst.h | 1 +
> > > > > arch/loongarch/kernel/inst.c | 32 +++++++++++
> > > > > arch/loongarch/net/bpf_jit.c | 90 +++++++++++++++++++++++++++++++
> > > > > 3 files changed, 123 insertions(+)
> > > > >
> > > > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > > > > index 2ae96a35d..88bb73e46 100644
> > > > > --- a/arch/loongarch/include/asm/inst.h
> > > > > +++ b/arch/loongarch/include/asm/inst.h
> > > > > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > > > > int larch_insn_read(void *addr, u32 *insnp);
> > > > > int larch_insn_write(void *addr, u32 insn);
> > > > > int larch_insn_patch_text(void *addr, u32 insn);
> > > > > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> > > > >
> > > > > u32 larch_insn_gen_nop(void);
> > > > > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > > > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > > > index 674e3b322..8d6594968 100644
> > > > > --- a/arch/loongarch/kernel/inst.c
> > > > > +++ b/arch/loongarch/kernel/inst.c
> > > > > @@ -4,6 +4,7 @@
> > > > > */
> > > > > #include <linux/sizes.h>
> > > > > #include <linux/uaccess.h>
> > > > > +#include <linux/set_memory.h>
> > > > >
> > > > > #include <asm/cacheflush.h>
> > > > > #include <asm/inst.h>
> > > > > @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > > > return ret;
> > > > > }
> > > > >
> > > > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > > > +{
> > > > > +        unsigned long flags;
> > > > > +        size_t wlen = 0;
> > > > > +        size_t size;
> > > > > +        void *ptr;
> > > > > +        int ret = 0;
> > > > > +
> > > > > +        set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > > > > +        raw_spin_lock_irqsave(&patch_lock, flags);
> > > > > +        while (wlen < len) {
> > > > > +                ptr = dst + wlen;
> > > > > +                size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> > > > > +                             len - wlen);
> > > > > +
> > > > > +                ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> > > > > +                if (ret) {
> > > > > +                        pr_err("%s: operation failed\n", __func__);
> > > > > +                        break;
> > > > > +                }
> > > > > +                wlen += size;
> > > > > +        }
> > > >
> > > > Again, why do you do copy_to_kernel_nofault() in a loop?
> > >
> > > The while loop handles copies of arbitrary size. I referred to how the
> > > ARM64 and RISC-V64 implementations handle this; they use loops as well.
> >
> > > Any pointers?
>
> I didn't understand what you meant.
>
It's your responsibility to explain why we need a loop here, not mine.
I checked every call site of copy_to_kernel_nofault(); none of them uses a loop.
> >
> > >
> > > > This larch_insn_text_copy() can be part of the first patch like
> > > > larch_insn_gen_{beq,bne}. WDYT?
> > >
> > > From my perspective, it is acceptable to include both
> > > larch_insn_text_copy and larch_insn_gen_{beq,bne} in the same patch,
> > > or place them in the bpf_arch_xxxx patch. larch_insn_text_copy is
> > > solely used for BPF; the application scope of larch_insn_gen_{beq,bne}
> > > is not limited to BPF.
> > >
> >
> > The implementation of larch_insn_text_copy() seems generic.
>
> Using larch_insn_text_copy() requires PAGE_SIZE alignment. Currently,
> only the trampoline size is page-aligned.
>
Then clearly document it.
[...]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-21 1:38 ` Hengqi Chen
@ 2025-07-21 7:59 ` Chenghao Duan
0 siblings, 0 replies; 24+ messages in thread
From: Chenghao Duan @ 2025-07-21 7:59 UTC (permalink / raw)
To: Hengqi Chen
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran
On Mon, Jul 21, 2025 at 09:38:47AM +0800, Hengqi Chen wrote:
> On Fri, Jul 18, 2025 at 10:17 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > On Thu, Jul 17, 2025 at 06:12:55PM +0800, Hengqi Chen wrote:
> > > On Thu, Jul 17, 2025 at 5:27 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > >
> > > > On Wed, Jul 16, 2025 at 08:21:59PM +0800, Hengqi Chen wrote:
> > > > > On Wed, Jul 9, 2025 at 1:50 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > > >
> > > > > > Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> > > > > > bpf_arch_text_invalidate on the LoongArch architecture.
> > > > > >
> > > > > > On LoongArch, since symbol addresses in the direct mapping
> > > > > > region cannot be reached via relative jump instructions from the paged
> > > > > > mapping region, we use the move_imm+jirl instruction pair as absolute
> > > > > > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > > > > > instructions in the program as placeholders for function jumps.
> > > > > >
> > > > > > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > > > > > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > > > > > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > > > > > ---
> > > > > > arch/loongarch/include/asm/inst.h | 1 +
> > > > > > arch/loongarch/kernel/inst.c | 32 +++++++++++
> > > > > > arch/loongarch/net/bpf_jit.c | 90 +++++++++++++++++++++++++++++++
> > > > > > 3 files changed, 123 insertions(+)
> > > > > >
> > > > > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > > > > > index 2ae96a35d..88bb73e46 100644
> > > > > > --- a/arch/loongarch/include/asm/inst.h
> > > > > > +++ b/arch/loongarch/include/asm/inst.h
> > > > > > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > > > > > int larch_insn_read(void *addr, u32 *insnp);
> > > > > > int larch_insn_write(void *addr, u32 insn);
> > > > > > int larch_insn_patch_text(void *addr, u32 insn);
> > > > > > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> > > > > >
> > > > > > u32 larch_insn_gen_nop(void);
> > > > > > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > > > > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > > > > index 674e3b322..8d6594968 100644
> > > > > > --- a/arch/loongarch/kernel/inst.c
> > > > > > +++ b/arch/loongarch/kernel/inst.c
> > > > > > @@ -4,6 +4,7 @@
> > > > > > */
> > > > > > #include <linux/sizes.h>
> > > > > > #include <linux/uaccess.h>
> > > > > > +#include <linux/set_memory.h>
> > > > > >
> > > > > > #include <asm/cacheflush.h>
> > > > > > #include <asm/inst.h>
> > > > > > @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > > > > return ret;
> > > > > > }
> > > > > >
> > > > > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > > > > +{
> > > > > > +        unsigned long flags;
> > > > > > +        size_t wlen = 0;
> > > > > > +        size_t size;
> > > > > > +        void *ptr;
> > > > > > +        int ret = 0;
> > > > > > +
> > > > > > +        set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > > > > > +        raw_spin_lock_irqsave(&patch_lock, flags);
> > > > > > +        while (wlen < len) {
> > > > > > +                ptr = dst + wlen;
> > > > > > +                size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> > > > > > +                             len - wlen);
> > > > > > +
> > > > > > +                ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> > > > > > +                if (ret) {
> > > > > > +                        pr_err("%s: operation failed\n", __func__);
> > > > > > +                        break;
> > > > > > +                }
> > > > > > +                wlen += size;
> > > > > > +        }
> > > > >
> > > > > Again, why do you do copy_to_kernel_nofault() in a loop?
> > > >
> > > > The while loop handles copies of arbitrary size. I referred to how the
> > > > ARM64 and RISC-V64 implementations handle this; they use loops as well.
> > >
> > > > Any pointers?
> >
> > I didn't understand what you meant.
> >
>
> It's your responsibility to explain why we need a loop here, not mine.
> I checked every call site of copy_to_kernel_nofault(); none of them uses a loop.
>
I referred to the arm64 and RISC-V implementations. arm64 copies at most
PAGE_SIZE bytes per iteration, while RISC-V copies at most PAGE_SIZE * 2.
Both use loops because fixmap can only map PAGE_SIZE at a time.
arm64: arch/arm64/kernel/patching.c, __text_poke()
riscv: arch/riscv/kernel/patch.c, patch_insn_write()
Although the internal implementation of copy_to_kernel_nofault() also
uses a loop and copies with store-word instructions, I have retained the
approach of processing each page individually, because I am concerned
that misalignment might lead to other hidden issues. Current testing
shows that a single copy without the loop works fine, but I have kept
this method as a precaution.
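A worked example of the boundary case (my numbers, assuming the default
16 KiB pages on LoongArch): suppose dst sits 8 bytes before a page
boundary and len = 16.

/*
 * iter 1: ptr = dst
 *         size = min(PAGE_SIZE - offset_in_page(ptr), len - wlen)
 *              = min(8, 16) = 8          -> copy up to the page boundary
 * iter 2: ptr = dst + 8 (now page-aligned)
 *         size = min(16384 - 0, 8) = 8   -> copy into the next page
 *
 * So the loop splits a page-crossing write into page-local copies,
 * mirroring the fixmap-based patchers on arm64 and riscv.
 */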
> > >
> > > >
> > > > > This larch_insn_text_copy() can be part of the first patch like
> > > > > larch_insn_gen_{beq,bne}. WDYT?
> > > >
> > > > From my perspective, it is acceptable to include both
> > > > larch_insn_text_copy and larch_insn_gen_{beq,bne} in the same patch,
> > > > or place them in the bpf_arch_xxxx patch. larch_insn_text_copy is
> > > > solely used for BPF; the application scope of larch_insn_gen_{beq,bne}
> > > > is not limited to BPF.
> > > >
> > >
> > > The implementation of larch_insn_text_copy() seems generic.
> >
> > Using larch_insn_text_copy() requires PAGE_SIZE alignment. Currently,
> > only the trampoline size is page-aligned.
> >
>
> Then clearly document it.
Okay, I will add an explanation of this in the commit message of the next version.
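For what it's worth, something like the kernel-doc comment below would
capture it (a sketch only; the wording is mine, not from the patch):

/**
 * larch_insn_text_copy - patch kernel/BPF text with new instructions
 * @dst: destination address, must be page-aligned
 * @src: buffer holding the new instructions
 * @len: number of bytes to copy
 *
 * The pages covering [@dst, @dst + @len) are temporarily made writable
 * with set_memory_rw() and restored with set_memory_rox(), both of
 * which operate on whole pages, so the caller must own every page that
 * @len rounds up to. Currently the only user is the BPF trampoline,
 * whose size is rounded up to a page boundary.
 */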
[...]
^ permalink raw reply [flat|nested] 24+ messages in thread
Thread overview: 24+ messages
2025-07-09 5:50 [PATCH v3 0/5] Support trampoline for LoongArch Chenghao Duan
2025-07-09 5:50 ` [PATCH v3 1/5] LoongArch: Add the function to generate the beq and bne assembly instructions Chenghao Duan
2025-07-16 11:33 ` Hengqi Chen
2025-07-09 5:50 ` [PATCH v3 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
2025-07-16 11:55 ` Hengqi Chen
2025-07-17 9:46 ` Chenghao Duan
2025-07-09 5:50 ` [PATCH v3 3/5] LoongArch: BPF: Add EXECMEM_BPF memory to execmem subsystem Chenghao Duan
2025-07-09 15:23 ` Huacai Chen
2025-07-10 7:23 ` Chenghao Duan
2025-07-09 5:50 ` [PATCH v3 4/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
2025-07-16 12:21 ` Hengqi Chen
2025-07-17 9:27 ` Chenghao Duan
2025-07-17 10:12 ` Hengqi Chen
2025-07-18 2:16 ` Chenghao Duan
2025-07-21 1:38 ` Hengqi Chen
2025-07-21 7:59 ` Chenghao Duan
2025-07-16 18:41 ` Vincent Li
2025-07-18 23:08 ` Vincent Li
2025-07-09 5:50 ` [PATCH v3 5/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
2025-07-09 17:19 ` kernel test robot
2025-07-16 12:32 ` Hengqi Chen
2025-07-17 9:43 ` Chenghao Duan
2025-07-10 7:29 ` [PATCH v3 0/5] Support trampoline for LoongArch Huacai Chen
2025-07-14 8:55 ` Tiezhu Yang