* [PATCH v4 0/5] Support trampoline for LoongArch
@ 2025-07-24 14:19 Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers Chenghao Duan
` (6 more replies)
0 siblings, 7 replies; 22+ messages in thread
From: Chenghao Duan @ 2025-07-24 14:19 UTC (permalink / raw)
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li
v4:
1. Drop patch #3 of v3.
2. Add 5 NOP instructions in build_prologue() to reserve space for the
   move_imm + jirl instruction pair.
3. Differentiate between direct jumps and ftrace jumps to the trampoline
   (see the sketch below):
   - direct jumps skip 5 instructions.
   - ftrace jumps skip 2 instructions.
4. Remove the generation of BL jump instructions from emit_jump_and_link();
   after the trampoline ends, it jumps through the specified register.
   The BL instruction always writes PC+4 to r1 and does not allow
   specifying rd.
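For reference, this is the entry-point selection that patch 4 ends up
with (simplified sketch; the two constants expand to 2 and 5
instructions respectively):

	/* Skip the reserved jump slot at the traced function's entry */
	if (is_kernel_text((unsigned long)orig_call))
		orig_call += LOONGARCH_FENTRY_NBYTES;		/* ftrace: 2 insns */
	else if (is_bpf_text_address((unsigned long)orig_call))
		orig_call += LOONGARCH_BPF_FENTRY_NBYTES;	/* direct: 5 insns */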
-----------------------------------------------------------------------
Historical Version:
v3:
1. Patch 0003 adds EXECMEM_BPF memory type to the execmem subsystem.
2. Align the size calculated by arch_bpf_trampoline_size to page
boundaries.
3. Add the flush icache operation to larch_insn_text_copy.
4. Unify the implementation of bpf_arch_xxx into the patch
"0004-LoongArch-BPF-Add-bpf_arch_xxxxx-support-for-Loong.patch".
5. Change the patch order. Move the patch
"0002-LoongArch-BPF-Update-the-code-to-rename-validate_.patch" before
"0005-LoongArch-BPF-Add-bpf-trampoline-support-for-Loon.patch".
URL for version v3:
https://lore.kernel.org/all/20250709055029.723243-1-duanchenghao@kylinos.cn/
---------
v2:
1. Change the fixmap in the instruction copy function to set_memory_xxx.
2. Change the implementation method of the following code.
- arch_alloc_bpf_trampoline
- arch_free_bpf_trampoline
Use the BPF core's allocation and free functions.
- bpf_arch_text_invalidate
Operate with the function larch_insn_text_copy that carries
memory attribute modifications.
3. Correct the incorrect code formatting.
URL for version v2:
https://lore.kernel.org/all/20250618105048.1510560-1-duanchenghao@kylinos.cn/
---------
v1:
Support trampoline for LoongArch. The following feature tests have been
completed:
1. fentry
2. fexit
3. fmod_ret
TODO: The support for the struct_ops feature will be provided in
subsequent patches.
URL for version v1:
https://lore.kernel.org/all/20250611035952.111182-1-duanchenghao@kylinos.cn/
-----------------------------------------------------------------------
Chenghao Duan (4):
LoongArch: Add larch_insn_gen_{beq,bne} helpers
LoongArch: BPF: Update the code to rename validate_code to
validate_ctx
LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
LoongArch: BPF: Add bpf trampoline support for Loongarch
Tiezhu Yang (1):
LoongArch: BPF: Add struct ops support for trampoline
arch/loongarch/include/asm/inst.h | 3 +
arch/loongarch/kernel/inst.c | 60 ++++
arch/loongarch/net/bpf_jit.c | 521 +++++++++++++++++++++++++++++-
arch/loongarch/net/bpf_jit.h | 6 +
4 files changed, 589 insertions(+), 1 deletion(-)
--
2.25.1
* [PATCH v4 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers
2025-07-24 14:19 [PATCH v4 0/5] Support trampoline for LoongArch Chenghao Duan
@ 2025-07-24 14:19 ` Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
` (5 subsequent siblings)
6 siblings, 0 replies; 22+ messages in thread
From: Chenghao Duan @ 2025-07-24 14:19 UTC (permalink / raw)
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li, Youling Tang
Add larch_insn_gen_beq() and larch_insn_gen_bne() helpers, which will
be used in the BPF trampoline implementation.
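For illustration, a minimal usage sketch (the branch pointer and offset
are hypothetical stand-ins; patch 4 uses this pattern to patch a
previously emitted NOP once the branch target is known):

	u32 *branch;	/* points at a previously emitted NOP */
	int offset;	/* byte offset from the NOP to the branch target */

	/* returns INSN_BREAK if the offset is unencodable */
	*branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);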
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Co-developed-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
---
arch/loongarch/include/asm/inst.h | 2 ++
arch/loongarch/kernel/inst.c | 28 ++++++++++++++++++++++++++++
2 files changed, 30 insertions(+)
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index 3089785ca..2ae96a35d 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -511,6 +511,8 @@ u32 larch_insn_gen_lu12iw(enum loongarch_gpr rd, int imm);
u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm);
u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
+u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
+u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
static inline bool signed_imm_check(long val, unsigned int bit)
{
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index 14d7d700b..674e3b322 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -336,3 +336,31 @@ u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
return insn.word;
}
+
+u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
+{
+ union loongarch_instruction insn;
+
+ if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
+ pr_warn("The generated beq instruction is out of range.\n");
+ return INSN_BREAK;
+ }
+
+ emit_beq(&insn, rj, rd, imm >> 2);
+
+ return insn.word;
+}
+
+u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
+{
+ union loongarch_instruction insn;
+
+ if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
+ pr_warn("The generated bne instruction is out of range.\n");
+ return INSN_BREAK;
+ }
+
+ emit_bne(&insn, rj, rd, imm >> 2);
+
+ return insn.word;
+}
--
2.25.1
* [PATCH v4 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx
2025-07-24 14:19 [PATCH v4 0/5] Support trampoline for LoongArch Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers Chenghao Duan
@ 2025-07-24 14:19 ` Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
` (4 subsequent siblings)
6 siblings, 0 replies; 22+ messages in thread
From: Chenghao Duan @ 2025-07-24 14:19 UTC (permalink / raw)
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li
Rename the existing validate_code() to validate_ctx() and factor out
the instruction validation into a new helper validate_code():
* validate_code() checks the validity of the generated code.
* validate_ctx() checks both code validity and exception table entry
  correctness.
The new validate_code() will be used in subsequent changes.
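For context, the trampoline code in a later patch calls the new helper
directly on its own jit_ctx, roughly like this (sketch):

	ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
	if (ret > 0 && validate_code(&ctx) < 0)
		ret = -EINVAL;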
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
---
arch/loongarch/net/bpf_jit.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index fa1500d4a..7032f11d3 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -1180,6 +1180,14 @@ static int validate_code(struct jit_ctx *ctx)
return -1;
}
+ return 0;
+}
+
+static int validate_ctx(struct jit_ctx *ctx)
+{
+ if (validate_code(ctx))
+ return -1;
+
if (WARN_ON_ONCE(ctx->num_exentries != ctx->prog->aux->num_exentries))
return -1;
@@ -1288,7 +1296,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
build_epilogue(&ctx);
/* 3. Extra pass to validate JITed code */
- if (validate_code(&ctx)) {
+ if (validate_ctx(&ctx)) {
bpf_jit_binary_free(header);
prog = orig_prog;
goto out_offset;
--
2.25.1
* [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-24 14:19 [PATCH v4 0/5] Support trampoline for LoongArch Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
@ 2025-07-24 14:19 ` Chenghao Duan
2025-07-28 2:30 ` Huacai Chen
` (2 more replies)
2025-07-24 14:19 ` [PATCH v4 4/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
` (3 subsequent siblings)
6 siblings, 3 replies; 22+ messages in thread
From: Chenghao Duan @ 2025-07-24 14:19 UTC (permalink / raw)
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li
Implement bpf_arch_text_poke(), bpf_arch_text_copy() and
bpf_arch_text_invalidate() for the LoongArch architecture.
On LoongArch, symbol addresses in the direct mapping region cannot be
reached via relative jump instructions from the paged mapping region,
so we use a move_imm + jirl instruction pair as an absolute jump. The
pair takes 2-5 instructions, so we reserve 5 NOP instructions in the
program as placeholders for function jumps.
larch_insn_text_copy() is used solely by BPF and requires the copied
range to be page-size aligned. Currently, only the trampoline size is
page-aligned.
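For illustration only (not part of the patch): with a full 64-bit
immediate, the reserved 5-instruction slot is patched with a sequence
like the following; shorter immediates need fewer move_imm instructions
and the remaining slots stay NOPs:

	lu12i.w  $t1, imm[31:12]	# move_imm materializes the target
	ori      $t1, $t1, imm[11:0]
	lu32i.d  $t1, imm[51:32]
	lu52i.d  $t1, $t1, imm[63:52]
	jirl     $rd, $t1, 0		# absolute jump (rd = ra, t0 or zero)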
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
---
arch/loongarch/include/asm/inst.h | 1 +
arch/loongarch/kernel/inst.c | 32 ++++++++++
arch/loongarch/net/bpf_jit.c | 97 +++++++++++++++++++++++++++++++
3 files changed, 130 insertions(+)
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index 2ae96a35d..88bb73e46 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
int larch_insn_read(void *addr, u32 *insnp);
int larch_insn_write(void *addr, u32 insn);
int larch_insn_patch_text(void *addr, u32 insn);
+int larch_insn_text_copy(void *dst, void *src, size_t len);
u32 larch_insn_gen_nop(void);
u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index 674e3b322..8d6594968 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -4,6 +4,7 @@
*/
#include <linux/sizes.h>
#include <linux/uaccess.h>
+#include <linux/set_memory.h>
#include <asm/cacheflush.h>
#include <asm/inst.h>
@@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
return ret;
}
+int larch_insn_text_copy(void *dst, void *src, size_t len)
+{
+ unsigned long flags;
+ size_t wlen = 0;
+ size_t size;
+ void *ptr;
+ int ret = 0;
+
+ set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
+ raw_spin_lock_irqsave(&patch_lock, flags);
+ while (wlen < len) {
+ ptr = dst + wlen;
+ size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
+ len - wlen);
+
+ ret = copy_to_kernel_nofault(ptr, src + wlen, size);
+ if (ret) {
+ pr_err("%s: operation failed\n", __func__);
+ break;
+ }
+ wlen += size;
+ }
+ raw_spin_unlock_irqrestore(&patch_lock, flags);
+ set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
+
+ if (!ret)
+ flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
+
+ return ret;
+}
+
u32 larch_insn_gen_nop(void)
{
return INSN_NOP;
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 7032f11d3..86504e710 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -4,8 +4,12 @@
*
* Copyright (C) 2022 Loongson Technology Corporation Limited
*/
+#include <linux/memory.h>
#include "bpf_jit.h"
+#define LOONGARCH_LONG_JUMP_NINSNS 5
+#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
+
#define REG_TCC LOONGARCH_GPR_A6
#define TCC_SAVED LOONGARCH_GPR_S5
@@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
*/
static void build_prologue(struct jit_ctx *ctx)
{
+ int i;
int stack_adjust = 0, store_offset, bpf_stack_adjust;
bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
@@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
stack_adjust = round_up(stack_adjust, 16);
stack_adjust += bpf_stack_adjust;
+ /* Reserve space for the move_imm + jirl instruction */
+ for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
+ emit_insn(ctx, nop);
+
/*
* First instruction initializes the tail call count (TCC).
* On tail call we skip this instruction, and the TCC is
@@ -1367,3 +1376,91 @@ bool bpf_jit_supports_subprog_tailcalls(void)
{
return true;
}
+
+static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
+{
+ if (!target) {
+ pr_err("bpf_jit: jump target address is error\n");
+ return -EFAULT;
+ }
+
+ move_imm(ctx, LOONGARCH_GPR_T1, target, false);
+ emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
+
+ return 0;
+}
+
+static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
+{
+ struct jit_ctx ctx;
+
+ ctx.idx = 0;
+ ctx.image = (union loongarch_instruction *)insns;
+
+ if (!target) {
+ emit_insn((&ctx), nop);
+ emit_insn((&ctx), nop);
+ return 0;
+ }
+
+ return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
+ (unsigned long)target);
+}
+
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
+ void *old_addr, void *new_addr)
+{
+ u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
+ u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
+ bool is_call = poke_type == BPF_MOD_CALL;
+ int ret;
+
+ if (!is_kernel_text((unsigned long)ip) &&
+ !is_bpf_text_address((unsigned long)ip))
+ return -ENOTSUPP;
+
+ ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
+ if (ret)
+ return ret;
+
+ if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
+ return -EFAULT;
+
+ ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
+ if (ret)
+ return ret;
+
+ mutex_lock(&text_mutex);
+ if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
+ ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
+ mutex_unlock(&text_mutex);
+ return ret;
+}
+
+int bpf_arch_text_invalidate(void *dst, size_t len)
+{
+ int i;
+ int ret = 0;
+ u32 *inst;
+
+ inst = kvmalloc(len, GFP_KERNEL);
+ if (!inst)
+ return -ENOMEM;
+
+ for (i = 0; i < (len/sizeof(u32)); i++)
+ inst[i] = INSN_BREAK;
+
+ if (larch_insn_text_copy(dst, inst, len))
+ ret = -EINVAL;
+
+ kvfree(inst);
+ return ret;
+}
+
+void *bpf_arch_text_copy(void *dst, void *src, size_t len)
+{
+ if (larch_insn_text_copy(dst, src, len))
+ return ERR_PTR(-EINVAL);
+
+ return dst;
+}
--
2.25.1
* [PATCH v4 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-24 14:19 [PATCH v4 0/5] Support trampoline for LoongArch Chenghao Duan
` (2 preceding siblings ...)
2025-07-24 14:19 ` [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
@ 2025-07-24 14:19 ` Chenghao Duan
2025-07-28 2:03 ` Geliang Tang
2025-07-28 10:50 ` Hengqi Chen
2025-07-24 14:19 ` [PATCH v4 5/5] LoongArch: BPF: Add struct ops support for trampoline Chenghao Duan
` (2 subsequent siblings)
6 siblings, 2 replies; 22+ messages in thread
From: Chenghao Duan @ 2025-07-24 14:19 UTC (permalink / raw)
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li, kernel test robot
BPF trampoline is the critical infrastructure of the BPF subsystem,
acting as a mediator between kernel functions and BPF programs.
Numerous important features, such as using BPF programs for
zero-overhead kernel introspection, rely on this key component. A
simplified sketch of the generated control flow follows the feature
list below.
The related tests have passed, including the following features:
1. fentry
2. fmod_ret
3. fexit
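Conceptually, the generated trampoline behaves as follows (simplified
sketch; the real stack layout is documented in
__arch_prepare_bpf_trampoline() in this patch):

	traced_func: patched jump --> trampoline

	trampoline:
	  save RA/FP and the traced function's arguments
	  for each fentry prog:   enter; run prog(args); exit
	  for each fmod_ret prog: run prog; a nonzero return skips the original call
	  if BPF_TRAMP_F_CALL_ORIG: call the traced function body, save its return value
	  for each fexit prog:    run prog with the saved args and return value
	  restore registers; return to the caller or back into the traced function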
The following related testcases passed on LoongArch:
sudo ./test_progs -a fentry_test/fentry
sudo ./test_progs -a fexit_test/fexit
sudo ./test_progs -a fentry_fexit
sudo ./test_progs -a modify_return
sudo ./test_progs -a fexit_sleep
sudo ./test_progs -a test_overhead
sudo ./test_progs -a trampoline_count
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
---
arch/loongarch/net/bpf_jit.c | 391 +++++++++++++++++++++++++++++++++++
arch/loongarch/net/bpf_jit.h | 6 +
2 files changed, 397 insertions(+)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 86504e710..ac5ce3a28 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -7,9 +7,15 @@
#include <linux/memory.h>
#include "bpf_jit.h"
+#define LOONGARCH_MAX_REG_ARGS 8
+
#define LOONGARCH_LONG_JUMP_NINSNS 5
#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
+#define LOONGARCH_FENTRY_NINSNS 2
+#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
+#define LOONGARCH_BPF_FENTRY_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
+
#define REG_TCC LOONGARCH_GPR_A6
#define TCC_SAVED LOONGARCH_GPR_S5
@@ -1407,6 +1413,11 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
(unsigned long)target);
}
+static int emit_call(struct jit_ctx *ctx, u64 addr)
+{
+ return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, addr);
+}
+
int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
void *old_addr, void *new_addr)
{
@@ -1464,3 +1475,383 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
return dst;
}
+
+static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
+{
+ int i;
+
+ for (i = 0; i < nargs; i++) {
+ emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
+ args_off -= 8;
+ }
+}
+
+static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
+{
+ int i;
+
+ for (i = 0; i < nargs; i++) {
+ emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
+ args_off -= 8;
+ }
+}
+
+static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
+ int args_off, int retval_off,
+ int run_ctx_off, bool save_ret)
+{
+ int ret;
+ u32 *branch;
+ struct bpf_prog *p = l->link.prog;
+ int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
+
+ if (l->cookie) {
+ move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
+ emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
+ } else {
+ emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
+ -run_ctx_off + cookie_off);
+ }
+
+ /* arg1: prog */
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
+ /* arg2: &run_ctx */
+ emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
+ ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
+ if (ret)
+ return ret;
+
+ /* store prog start time */
+ move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
+
+ /*
+ * if (__bpf_prog_enter(prog) == 0)
+ *	goto skip_exec_of_prog;
+ */
+ branch = (u32 *)ctx->image + ctx->idx;
+ /* nop reserved for conditional jump */
+ emit_insn(ctx, nop);
+
+ /* arg1: &args_off */
+ emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
+ if (!p->jited)
+ move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
+ ret = emit_call(ctx, (const u64)p->bpf_func);
+ if (ret)
+ return ret;
+
+ if (save_ret) {
+ emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
+ }
+
+ /* update branch with beqz */
+ if (ctx->image) {
+ int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
+ *branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
+ }
+
+ /* arg1: prog */
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
+ /* arg2: prog start time */
+ move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
+ /* arg3: &run_ctx */
+ emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
+ ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
+
+ return ret;
+}
+
+static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
+ int args_off, int retval_off, int run_ctx_off, u32 **branches)
+{
+ int i;
+
+ emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
+ for (i = 0; i < tl->nr_links; i++) {
+ invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
+ run_ctx_off, true);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
+ branches[i] = (u32 *)ctx->image + ctx->idx;
+ emit_insn(ctx, nop);
+ }
+}
+
+u64 bpf_jit_alloc_exec_limit(void)
+{
+ return VMALLOC_END - VMALLOC_START;
+}
+
+void *arch_alloc_bpf_trampoline(unsigned int size)
+{
+ return bpf_prog_pack_alloc(size, jit_fill_hole);
+}
+
+void arch_free_bpf_trampoline(void *image, unsigned int size)
+{
+ bpf_prog_pack_free(image, size);
+}
+
+static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
+ const struct btf_func_model *m,
+ struct bpf_tramp_links *tlinks,
+ void *func_addr, u32 flags)
+{
+ int i;
+ int stack_size = 0, nargs = 0;
+ int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
+ struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
+ struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
+ struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+ int ret, save_ret;
+ void *orig_call = func_addr;
+ u32 **branches = NULL;
+
+ if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
+ return -ENOTSUPP;
+
+ /*
+ * FP + 8 [ RA to parent func ] return address to parent
+ * function
+ * FP + 0 [ FP of parent func ] frame pointer of parent
+ * function
+ * FP - 8 [ T0 to traced func ] return address of traced
+ * function
+ * FP - 16 [ FP of traced func ] frame pointer of traced
+ * function
+ *
+ * FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
+ * BPF_TRAMP_F_RET_FENTRY_RET
+ * [ argN ]
+ * [ ... ]
+ * FP - args_off [ arg1 ]
+ *
+ * FP - nargs_off [ regs count ]
+ *
+ * FP - ip_off [ traced func ] BPF_TRAMP_F_IP_ARG
+ *
+ * FP - run_ctx_off [ bpf_tramp_run_ctx ]
+ *
+ * FP - sreg_off [ callee saved reg ]
+ *
+ */
+
+ if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
+ return -ENOTSUPP;
+
+ stack_size = 0;
+
+ /* room of trampoline frame to store return address and frame pointer */
+ stack_size += 16;
+
+ save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
+ if (save_ret) {
+ /* Save BPF R0 and A0 */
+ stack_size += 16;
+ retval_off = stack_size;
+ }
+
+ /* room of trampoline frame to store args */
+ nargs = m->nr_args;
+ stack_size += nargs * 8;
+ args_off = stack_size;
+
+ /* room of trampoline frame to store args number */
+ stack_size += 8;
+ nargs_off = stack_size;
+
+ /* room of trampoline frame to store ip address */
+ if (flags & BPF_TRAMP_F_IP_ARG) {
+ stack_size += 8;
+ ip_off = stack_size;
+ }
+
+ /* room of trampoline frame to store struct bpf_tramp_run_ctx */
+ stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
+ run_ctx_off = stack_size;
+
+ stack_size += 8;
+ sreg_off = stack_size;
+
+ stack_size = round_up(stack_size, 16);
+
+ /* For the trampoline called from function entry */
+ /* RA and FP for parent function*/
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
+ emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
+
+ /* RA and FP for traced function*/
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
+ emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
+
+ /* callee saved register S1 to pass start time */
+ emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
+
+ /* store ip address of the traced function */
+ if (flags & BPF_TRAMP_F_IP_ARG) {
+ move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
+ emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
+ }
+
+ /* store nargs number*/
+ move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
+ emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
+
+ store_args(ctx, nargs, args_off);
+
+ /* To traced function */
+ /* Ftrace jump skips 2 NOP instructions */
+ if (is_kernel_text((unsigned long)orig_call))
+ orig_call += LOONGARCH_FENTRY_NBYTES;
+ /* Direct jump skips 5 NOP instructions */
+ else if (is_bpf_text_address((unsigned long)orig_call))
+ orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
+ ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
+ if (ret)
+ return ret;
+ }
+
+ for (i = 0; i < fentry->nr_links; i++) {
+ ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
+ run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
+ if (ret)
+ return ret;
+ }
+ if (fmod_ret->nr_links) {
+ branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
+ if (!branches)
+ return -ENOMEM;
+
+ invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
+ run_ctx_off, branches);
+ }
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ restore_args(ctx, m->nr_args, args_off);
+ ret = emit_call(ctx, (const u64)orig_call);
+ if (ret)
+ goto out;
+ emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
+ im->ip_after_call = ctx->ro_image + ctx->idx;
+ /* Reserve space for the move_imm + jirl instruction */
+ for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
+ emit_insn(ctx, nop);
+ }
+
+ for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
+ int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
+ *branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
+ }
+
+ for (i = 0; i < fexit->nr_links; i++) {
+ ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
+ run_ctx_off, false);
+ if (ret)
+ goto out;
+ }
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ im->ip_epilogue = ctx->ro_image + ctx->idx;
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
+ ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
+ if (ret)
+ goto out;
+ }
+
+ if (flags & BPF_TRAMP_F_RESTORE_REGS)
+ restore_args(ctx, m->nr_args, args_off);
+
+ if (save_ret) {
+ emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
+ }
+
+ emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
+
+ /* trampoline called from function entry */
+ emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
+
+ emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
+
+ if (flags & BPF_TRAMP_F_SKIP_FRAME)
+ /* return to parent function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
+ else
+ /* return to traced function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
+
+ ret = ctx->idx;
+out:
+ kfree(branches);
+
+ return ret;
+}
+
+int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
+ void *ro_image_end, const struct btf_func_model *m,
+ u32 flags, struct bpf_tramp_links *tlinks,
+ void *func_addr)
+{
+ int ret;
+ void *image, *tmp;
+ u32 size = ro_image_end - ro_image;
+
+ image = kvmalloc(size, GFP_KERNEL);
+ if (!image)
+ return -ENOMEM;
+
+ struct jit_ctx ctx = {
+ .image = (union loongarch_instruction *)image,
+ .ro_image = (union loongarch_instruction *)ro_image,
+ .idx = 0,
+ };
+
+ jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
+ ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
+ if (ret > 0 && validate_code(&ctx) < 0) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ tmp = bpf_arch_text_copy(ro_image, image, size);
+ if (IS_ERR(tmp)) {
+ ret = PTR_ERR(tmp);
+ goto out;
+ }
+
+ bpf_flush_icache(ro_image, ro_image_end);
+out:
+ kvfree(image);
+ return ret < 0 ? ret : size;
+}
+
+int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
+ struct bpf_tramp_links *tlinks, void *func_addr)
+{
+ struct bpf_tramp_image im;
+ struct jit_ctx ctx;
+ int ret;
+
+ ctx.image = NULL;
+ ctx.idx = 0;
+
+ ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
+
+ /* Page align */
+ return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
+}
diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
index f9c569f53..5697158fd 100644
--- a/arch/loongarch/net/bpf_jit.h
+++ b/arch/loongarch/net/bpf_jit.h
@@ -18,6 +18,7 @@ struct jit_ctx {
u32 *offset;
int num_exentries;
union loongarch_instruction *image;
+ union loongarch_instruction *ro_image;
u32 stack_size;
};
@@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
return -EINVAL;
}
+
+static inline void bpf_flush_icache(void *start, void *end)
+{
+ flush_icache_range((unsigned long)start, (unsigned long)end);
+}
--
2.25.1
* [PATCH v4 5/5] LoongArch: BPF: Add struct ops support for trampoline
2025-07-24 14:19 [PATCH v4 0/5] Support trampoline for LoongArch Chenghao Duan
` (3 preceding siblings ...)
2025-07-24 14:19 ` [PATCH v4 4/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
@ 2025-07-24 14:19 ` Chenghao Duan
2025-07-28 10:55 ` Hengqi Chen
2025-07-24 15:30 ` [PATCH v4 0/5] Support trampoline for LoongArch Vincent Li
2025-07-27 1:00 ` Geliang Tang
6 siblings, 1 reply; 22+ messages in thread
From: Chenghao Duan @ 2025-07-24 14:19 UTC (permalink / raw)
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li
From: Tiezhu Yang <yangtiezhu@loongson.cn>
Use the BPF_TRAMP_F_INDIRECT flag to detect struct_ops and emit a
proper prologue and epilogue for this case.
With this patch, all of the struct_ops related testcases (except
struct_ops_multi_pages) passed on LoongArch.
The struct_ops_multi_pages testcase fails because the actual
image_pages_cnt is 40, which is bigger than MAX_TRAMP_IMAGE_PAGES.
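In short, the two cases differ only in frame setup and teardown
(illustrative summary of the hunks below):

	!BPF_TRAMP_F_INDIRECT (entered from a traced function's fentry site):
	  save RA/FP of the parent, then T0/FP of the traced function;
	  return through RA (BPF_TRAMP_F_SKIP_FRAME) or T0 otherwise.

	BPF_TRAMP_F_INDIRECT (struct_ops, called like a normal function):
	  save only the trampoline's own RA/FP; always return through RA.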
Before:
$ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
...
WATCHDOG: test case struct_ops_module/struct_ops_load executes for 10 seconds...
After:
$ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
...
#15 bad_struct_ops:OK
...
#399 struct_ops_autocreate:OK
...
#400 struct_ops_kptr_return:OK
...
#401 struct_ops_maybe_null:OK
...
#402 struct_ops_module:OK
...
#404 struct_ops_no_cfi:OK
...
#405 struct_ops_private_stack:SKIP
...
#406 struct_ops_refcounted:OK
Summary: 8/25 PASSED, 3 SKIPPED, 0 FAILED
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
arch/loongarch/net/bpf_jit.c | 71 ++++++++++++++++++++++++------------
1 file changed, 47 insertions(+), 24 deletions(-)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index ac5ce3a28..6a84fb104 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -1603,6 +1603,7 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+ bool is_struct_ops = flags & BPF_TRAMP_F_INDIRECT;
int ret, save_ret;
void *orig_call = func_addr;
u32 **branches = NULL;
@@ -1678,18 +1679,31 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
stack_size = round_up(stack_size, 16);
- /* For the trampoline called from function entry */
- /* RA and FP for parent function*/
- emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
- emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
- emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
- emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
-
- /* RA and FP for traced function*/
- emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
- emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
- emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
- emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
+ if (!is_struct_ops) {
+ /*
+ * For the trampoline called from function entry,
+ * the frame of traced function and the frame of
+ * trampoline need to be considered.
+ */
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
+ emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
+
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
+ emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
+ } else {
+ /*
+ * For the trampoline called directly, just handle
+ * the frame of trampoline.
+ */
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
+ emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
+ }
/* callee saved register S1 to pass start time */
emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
@@ -1779,21 +1793,30 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
- /* trampoline called from function entry */
- emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
- emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
- emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
+ if (!is_struct_ops) {
+ /* trampoline called from function entry */
+ emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
+
+ emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
- emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
- emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
- emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
+ if (flags & BPF_TRAMP_F_SKIP_FRAME)
+ /* return to parent function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
+ else
+ /* return to traced function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
+ } else {
+ /* trampoline called directly */
+ emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
- if (flags & BPF_TRAMP_F_SKIP_FRAME)
- /* return to parent function */
emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
- else
- /* return to traced function */
- emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
+ }
ret = ctx->idx;
out:
--
2.25.1
* Re: [PATCH v4 0/5] Support trampoline for LoongArch
2025-07-24 14:19 [PATCH v4 0/5] Support trampoline for LoongArch Chenghao Duan
` (4 preceding siblings ...)
2025-07-24 14:19 ` [PATCH v4 5/5] LoongArch: BPF: Add struct ops support for trampoline Chenghao Duan
@ 2025-07-24 15:30 ` Vincent Li
2025-07-25 10:18 ` Chenghao Duan
2025-07-27 1:00 ` Geliang Tang
6 siblings, 1 reply; 22+ messages in thread
From: Vincent Li @ 2025-07-24 15:30 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai,
martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, youling.tang, jianghaoran
On Thu, Jul 24, 2025 at 7:19 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> [...]
Tested the whole patch series and it resolved the xdp-tool xdp-filter issue.
[root@fedora ~]# xdp-loader status
CURRENT XDP PROGRAM STATUS:
Interface        Prio  Program name     Mode  ID  Tag               Chain actions
--------------------------------------------------------------------------------------
lo                     xdp_dispatcher   skb   53  4d7e87c0d30db711
 =>              10    xdpfilt_alw_all        62  320c53c06933a8fa  XDP_PASS
dummy0                 <No XDP program loaded!>
sit0                   <No XDP program loaded!>
enp0s3f0               <No XDP program loaded!>
wlp3s0                 <No XDP program loaded!>
you can add Tested-by: Vincent Li <vincent.mc.li@gmail.com>
* Re: [PATCH v4 0/5] Support trampoline for LoongArch
2025-07-24 15:30 ` [PATCH v4 0/5] Support trampoline for LoongArch Vincent Li
@ 2025-07-25 10:18 ` Chenghao Duan
2025-07-26 19:14 ` Daniel Borkmann
0 siblings, 1 reply; 22+ messages in thread
From: Chenghao Duan @ 2025-07-25 10:18 UTC (permalink / raw)
To: Vincent Li
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai,
martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, youling.tang, jianghaoran
On Thu, Jul 24, 2025 at 08:30:35AM -0700, Vincent Li wrote:
> On Thu, Jul 24, 2025 at 7:19 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > [...]
>
> Tested the whole patch series and it resolved the xdp-tool xdp-filter issue.
> [...]
> you can add Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Hi Vincent,
Okay, thank you very much for your support. The existing patch has
included "Tested-by: Vincent Li <vincent.mc.li@gmail.com>".
Brs Chenghao
* Re: [PATCH v4 0/5] Support trampoline for LoongArch
2025-07-25 10:18 ` Chenghao Duan
@ 2025-07-26 19:14 ` Daniel Borkmann
0 siblings, 0 replies; 22+ messages in thread
From: Daniel Borkmann @ 2025-07-26 19:14 UTC (permalink / raw)
To: Chenghao Duan, Vincent Li
Cc: ast, andrii, yangtiezhu, hengqi.chen, chenhuacai, martin.lau,
eddyz87, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, kernel, linux-kernel, loongarch, bpf, guodongtai,
youling.tang, jianghaoran
On 7/25/25 12:18 PM, Chenghao Duan wrote:
> On Thu, Jul 24, 2025 at 08:30:35AM -0700, Vincent Li wrote:
>> On Thu, Jul 24, 2025 at 7:19 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>>> [...]
>>
>> Tested the whole patch series and it resolved the xdp-tool xdp-filter issue.
>> [...]
>> you can add Tested-by: Vincent Li <vincent.mc.li@gmail.com>
>
> Hi Vincent,
>
> Okay, thank you very much for your support. The existing patch has
> included "Tested-by: Vincent Li vincent.mc.li@gmail.com".
Huacai, I presume you'll route this series to Linus, correct?
Thanks,
Daniel
* Re: [PATCH v4 0/5] Support trampoline for LoongArch
2025-07-24 14:19 [PATCH v4 0/5] Support trampoline for LoongArch Chenghao Duan
` (5 preceding siblings ...)
2025-07-24 15:30 ` [PATCH v4 0/5] Support trampoline for LoongArch Vincent Li
@ 2025-07-27 1:00 ` Geliang Tang
2025-07-28 2:42 ` Huacai Chen
6 siblings, 1 reply; 22+ messages in thread
From: Geliang Tang @ 2025-07-27 1:00 UTC (permalink / raw)
To: Chenghao Duan, ast, daniel, andrii, yangtiezhu, hengqi.chen,
chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, youling.tang, jianghaoran, vincent.mc.li
Hi Chenghao, Huacai, Tiezhu,
I first discovered this LoongArch BPF trampoline issue when debugging
MPTCP BPF selftests on a LoongArch machine last June (see my commit
eef0532e900c "selftests/bpf: Null checks for links in bpf_tcp_ca"), and
reported it to Huacai. Tiezhu and I started implementing the BPF
trampoline last June. I also called on more Chinese kernel engineers to
participate in the development of the LoongArch BPF trampoline at the
openEuler Developer Day 2024 and CLSF 2024 conferences. Although this
work was finally handed over to Chenghao, it is also necessary to
mention me as the reporter and our early developers in the commit log.
Thanks,
-Geliang
On Thu, 2025-07-24 at 22:19 +0800, Chenghao Duan wrote:
> [...]
* Re: [PATCH v4 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-24 14:19 ` [PATCH v4 4/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
@ 2025-07-28 2:03 ` Geliang Tang
2025-07-28 10:50 ` Hengqi Chen
1 sibling, 0 replies; 22+ messages in thread
From: Geliang Tang @ 2025-07-28 2:03 UTC (permalink / raw)
To: Chenghao Duan, ast, daniel, andrii, yangtiezhu, hengqi.chen,
chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, youling.tang, jianghaoran, vincent.mc.li,
kernel test robot
Hi Chenghao, Huacai,
On Thu, 2025-07-24 at 22:19 +0800, Chenghao Duan wrote:
> BPF trampoline is the critical infrastructure of the BPF subsystem,
> acting as a mediator between kernel functions and BPF programs.
> Numerous important features, such as using BPF programs for
> zero-overhead kernel introspection, rely on this key component.
>
> The related tests have passed, including the following features:
> 1. fentry
> 2. fmod_ret
> 3. fexit
>
> The following related testcases passed on LoongArch:
> sudo ./test_progs -a fentry_test/fentry
> sudo ./test_progs -a fexit_test/fexit
> sudo ./test_progs -a fentry_fexit
> sudo ./test_progs -a modify_return
> sudo ./test_progs -a fexit_sleep
> sudo ./test_progs -a test_overhead
> sudo ./test_progs -a trampoline_count
Please add the following paragraph to the commit log:
'''
This issue was first reported by Geliang Tang in June 2024 while
debugging MPTCP BPF selftests on a LoongArch machine (see commit
eef0532e900c "selftests/bpf: Null checks for links in bpf_tcp_ca").
Geliang, Huacai, and Tiezhu then worked together to drive the
implementation of this feature, encouraging broader collaboration among
Chinese kernel engineers.
Reported-by: Geliang Tang <geliang@kernel.org>
'''
Thanks,
-Geliang
>
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> Tested-by: Vincent Li <vincent.mc.li@gmail.com>
> Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
> ---
> arch/loongarch/net/bpf_jit.c | 391 +++++++++++++++++++++++++++++++++++
> arch/loongarch/net/bpf_jit.h | 6 +
> 2 files changed, 397 insertions(+)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 86504e710..ac5ce3a28 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -7,9 +7,15 @@
> #include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_MAX_REG_ARGS 8
> +
> #define LOONGARCH_LONG_JUMP_NINSNS 5
> #define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
>
> +#define LOONGARCH_FENTRY_NINSNS 2
> +#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
> +#define LOONGARCH_BPF_FENTRY_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -1407,6 +1413,11 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> (unsigned long)target);
> }
>
> +static int emit_call(struct jit_ctx *ctx, u64 addr)
> +{
> + return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, addr);
> +}
> +
> int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> void *old_addr, void *new_addr)
> {
> @@ -1464,3 +1475,383 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
>
> return dst;
> }
> +
> +static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> +	int i;
> +
> +	for (i = 0; i < nargs; i++) {
> +		emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> +		args_off -= 8;
> +	}
> +}
> +
> +static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> +	int i;
> +
> +	for (i = 0; i < nargs; i++) {
> +		emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> +		args_off -= 8;
> +	}
> +}
> +
> +static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
> +			   int args_off, int retval_off,
> +			   int run_ctx_off, bool save_ret)
> +{
> +	int ret;
> +	u32 *branch;
> +	struct bpf_prog *p = l->link.prog;
> +	int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
> +
> +	if (l->cookie) {
> +		move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
> +		emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
> +	} else {
> +		emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
> +			  -run_ctx_off + cookie_off);
> +	}
> +
> +	/* arg1: prog */
> +	move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> +	/* arg2: &run_ctx */
> +	emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
> +	ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
> +	if (ret)
> +		return ret;
> +
> +	/* store prog start time */
> +	move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
> +
> +	/* if (__bpf_prog_enter(prog) == 0)
> +	 *	goto skip_exec_of_prog;
> +	 *
> +	 */
> +	branch = (u32 *)ctx->image + ctx->idx;
> +	/* nop reserved for conditional jump */
> +	emit_insn(ctx, nop);
> +
> +	/* arg1: &args_off */
> +	emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
> +	if (!p->jited)
> +		move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
> +	ret = emit_call(ctx, (const u64)p->bpf_func);
> +	if (ret)
> +		return ret;
> +
> +	if (save_ret) {
> +		emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> +		emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> +	}
> +
> +	/* update branch with beqz */
> +	if (ctx->image) {
> +		int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
> +		*branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
> +	}
> +
> +	/* arg1: prog */
> +	move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> +	/* arg2: prog start time */
> +	move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
> +	/* arg3: &run_ctx */
> +	emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
> +	ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
> +
> +	return ret;
> +}
> +
> +static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
> +			       int args_off, int retval_off, int run_ctx_off, u32 **branches)
> +{
> +	int i;
> +
> +	emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
> +	for (i = 0; i < tl->nr_links; i++) {
> +		invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
> +				run_ctx_off, true);
> +		emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
> +		branches[i] = (u32 *)ctx->image + ctx->idx;
> +		emit_insn(ctx, nop);
> +	}
> +}
> +
> +u64 bpf_jit_alloc_exec_limit(void)
> +{
> +	return VMALLOC_END - VMALLOC_START;
> +}
> +
> +void *arch_alloc_bpf_trampoline(unsigned int size)
> +{
> +	return bpf_prog_pack_alloc(size, jit_fill_hole);
> +}
> +
> +void arch_free_bpf_trampoline(void *image, unsigned int size)
> +{
> +	bpf_prog_pack_free(image, size);
> +}
> +
> +static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
> +					 const struct btf_func_model *m,
> +					 struct bpf_tramp_links *tlinks,
> +					 void *func_addr, u32 flags)
> +{
> +	int i;
> +	int stack_size = 0, nargs = 0;
> +	int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
> +	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> +	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> +	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> +	int ret, save_ret;
> +	void *orig_call = func_addr;
> +	u32 **branches = NULL;
> +
> +	if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> +		return -ENOTSUPP;
> +
> +	/*
> +	 * FP + 8	    [ RA to parent func	] return address to parent
> +	 *					  function
> +	 * FP + 0	    [ FP of parent func	] frame pointer of parent
> +	 *					  function
> +	 * FP - 8	    [ T0 to traced func	] return address of traced
> +	 *					  function
> +	 * FP - 16	    [ FP of traced func	] frame pointer of traced
> +	 *					  function
> +	 *
> +	 * FP - retval_off  [ return value	] BPF_TRAMP_F_CALL_ORIG or
> +	 *					  BPF_TRAMP_F_RET_FENTRY_RET
> +	 *		    [ argN		]
> +	 *		    [ ...		]
> +	 * FP - args_off    [ arg1		]
> +	 *
> +	 * FP - nargs_off   [ regs count	]
> +	 *
> +	 * FP - ip_off	    [ traced func	] BPF_TRAMP_F_IP_ARG
> +	 *
> +	 * FP - run_ctx_off [ bpf_tramp_run_ctx	]
> +	 *
> +	 * FP - sreg_off    [ callee saved reg	]
> +	 *
> +	 */
> +
> +	if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
> +		return -ENOTSUPP;
> +
> +	if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> +		return -ENOTSUPP;
> +
> +	stack_size = 0;
> +
> +	/* room of trampoline frame to store return address and frame pointer */
> +	stack_size += 16;
> +
> +	save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
> +	if (save_ret) {
> +		/* Save BPF R0 and A0 */
> +		stack_size += 16;
> +		retval_off = stack_size;
> +	}
> +
> +	/* room of trampoline frame to store args */
> +	nargs = m->nr_args;
> +	stack_size += nargs * 8;
> +	args_off = stack_size;
> +
> +	/* room of trampoline frame to store args number */
> +	stack_size += 8;
> +	nargs_off = stack_size;
> +
> +	/* room of trampoline frame to store ip address */
> +	if (flags & BPF_TRAMP_F_IP_ARG) {
> +		stack_size += 8;
> +		ip_off = stack_size;
> +	}
> +
> +	/* room of trampoline frame to store struct bpf_tramp_run_ctx */
> +	stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
> +	run_ctx_off = stack_size;
> +
> +	stack_size += 8;
> +	sreg_off = stack_size;
> +
> +	stack_size = round_up(stack_size, 16);
> +
> +	/* For the trampoline called from function entry */
> +	/* RA and FP for parent function */
> +	emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> +	emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> +	emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> +	emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> +
> +	/* RA and FP for traced function */
> +	emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> +	emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> +	emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> +	emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> +
> +	/* callee saved register S1 to pass start time */
> +	emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> +	/* store ip address of the traced function */
> +	if (flags & BPF_TRAMP_F_IP_ARG) {
> +		move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
> +		emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
> +	}
> +
> +	/* store nargs number */
> +	move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
> +	emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
> +
> +	store_args(ctx, nargs, args_off);
> +
> +	/* To traced function */
> +	/* Ftrace jump skips 2 NOP instructions */
> +	if (is_kernel_text((unsigned long)orig_call))
> +		orig_call += LOONGARCH_FENTRY_NBYTES;
> +	/* Direct jump skips 5 NOP instructions */
> +	else if (is_bpf_text_address((unsigned long)orig_call))
> +		orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
> +
> +	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> +		move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> +		ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	for (i = 0; i < fentry->nr_links; i++) {
> +		ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
> +				      run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
> +		if (ret)
> +			return ret;
> +	}
> +	if (fmod_ret->nr_links) {
> +		branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
> +		if (!branches)
> +			return -ENOMEM;
> +
> +		invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
> +				   run_ctx_off, branches);
> +	}
> +
> +	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> +		restore_args(ctx, m->nr_args, args_off);
> +		ret = emit_call(ctx, (const u64)orig_call);
> +		if (ret)
> +			goto out;
> +		emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> +		emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> +		im->ip_after_call = ctx->ro_image + ctx->idx;
> +		/* Reserve space for the move_imm + jirl instruction */
> +		for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> +			emit_insn(ctx, nop);
> +	}
> +
> +	for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
> +		int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
> +		*branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
> +	}
> +
> +	for (i = 0; i < fexit->nr_links; i++) {
> +		ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
> +				      run_ctx_off, false);
> +		if (ret)
> +			goto out;
> +	}
> +
> +	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> +		im->ip_epilogue = ctx->ro_image + ctx->idx;
> +		move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> +		ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
> +		if (ret)
> +			goto out;
> +	}
> +
> +	if (flags & BPF_TRAMP_F_RESTORE_REGS)
> +		restore_args(ctx, m->nr_args, args_off);
> +
> +	if (save_ret) {
> +		emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> +		emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> +	}
> +
> +	emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> +	/* trampoline called from function entry */
> +	emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> +	emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> +	emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> +
> +	emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> +	emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> +	emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> +
> +	if (flags & BPF_TRAMP_F_SKIP_FRAME)
> +		/* return to parent function */
> +		emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> +	else
> +		/* return to traced function */
> +		emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> +
> +	ret = ctx->idx;
> +out:
> +	kfree(branches);
> +
> +	return ret;
> +}
> +
> +int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
> +				void *ro_image_end, const struct btf_func_model *m,
> +				u32 flags, struct bpf_tramp_links *tlinks,
> +				void *func_addr)
> +{
> +	int ret;
> +	void *image, *tmp;
> +	u32 size = ro_image_end - ro_image;
> +
> +	image = kvmalloc(size, GFP_KERNEL);
> +	if (!image)
> +		return -ENOMEM;
> +
> +	struct jit_ctx ctx = {
> +		.image = (union loongarch_instruction *)image,
> +		.ro_image = (union loongarch_instruction *)ro_image,
> +		.idx = 0,
> +	};
> +
> +	jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
> +	ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
> +	if (ret > 0 && validate_code(&ctx) < 0) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	tmp = bpf_arch_text_copy(ro_image, image, size);
> +	if (IS_ERR(tmp)) {
> +		ret = PTR_ERR(tmp);
> +		goto out;
> +	}
> +
> +	bpf_flush_icache(ro_image, ro_image_end);
> +out:
> +	kvfree(image);
> +	return ret < 0 ? ret : size;
> +}
> +
> +int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
> +			     struct bpf_tramp_links *tlinks, void *func_addr)
> +{
> +	struct bpf_tramp_image im;
> +	struct jit_ctx ctx;
> +	int ret;
> +
> +	ctx.image = NULL;
> +	ctx.idx = 0;
> +
> +	ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
> +
> +	/* Page align */
> +	return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
> +}
> diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
> index f9c569f53..5697158fd 100644
> --- a/arch/loongarch/net/bpf_jit.h
> +++ b/arch/loongarch/net/bpf_jit.h
> @@ -18,6 +18,7 @@ struct jit_ctx {
>  	u32 *offset;
>  	int num_exentries;
>  	union loongarch_instruction *image;
> +	union loongarch_instruction *ro_image;
>  	u32 stack_size;
>  };
>
> @@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
>
>  	return -EINVAL;
>  }
> +
> +static inline void bpf_flush_icache(void *start, void *end)
> +{
> +	flush_icache_range((unsigned long)start, (unsigned long)end);
> +}
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-24 14:19 ` [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
@ 2025-07-28 2:30 ` Huacai Chen
2025-07-28 10:47 ` Hengqi Chen
2025-07-28 10:58 ` Hengqi Chen
2 siblings, 0 replies; 22+ messages in thread
From: Huacai Chen @ 2025-07-28 2:30 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li
Hi, Chenghao,
On Thu, Jul 24, 2025 at 10:21 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> bpf_arch_text_invalidate on the LoongArch architecture.
>
> On LoongArch, since symbol addresses in the direct mapping
> region cannot be reached via relative jump instructions from the paged
> mapping region, we use the move_imm+jirl instruction pair as absolute
> jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> instructions in the program as placeholders for function jumps.
>
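For reference, a sketch of the worst case that pair expands to for an
arbitrary 64-bit address (based on the usual LoongArch immediate-building
sequence; the exact instruction count depends on the immediate value):

	/*
	 * move_imm() may need up to four instructions to build a 64-bit
	 * address in t1, plus one jirl to transfer control through it:
	 *
	 *   lu12i.w  t1, addr[31:12]
	 *   ori      t1, t1, addr[11:0]
	 *   lu32i.d  t1, addr[51:32]
	 *   lu52i.d  t1, t1, addr[63:52]
	 *   jirl     rd, t1, 0
	 */

which is why 5 NOPs are reserved at each patch site.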
> larch_insn_text_copy is solely used for BPF. The use of
> larch_insn_text_copy() requires page_size alignment. Currently, only
> the size of the trampoline is page-aligned.
>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
> ---
> arch/loongarch/include/asm/inst.h | 1 +
> arch/loongarch/kernel/inst.c | 32 ++++++++++
> arch/loongarch/net/bpf_jit.c | 97 +++++++++++++++++++++++++++++++
> 3 files changed, 130 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 2ae96a35d..88bb73e46 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> int larch_insn_read(void *addr, u32 *insnp);
> int larch_insn_write(void *addr, u32 insn);
> int larch_insn_patch_text(void *addr, u32 insn);
> +int larch_insn_text_copy(void *dst, void *src, size_t len);
>
> u32 larch_insn_gen_nop(void);
> u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index 674e3b322..8d6594968 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -4,6 +4,7 @@
> */
> #include <linux/sizes.h>
> #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
>
> #include <asm/cacheflush.h>
> #include <asm/inst.h>
> @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> return ret;
> }
>
> +int larch_insn_text_copy(void *dst, void *src, size_t len)
> +{
> + unsigned long flags;
> + size_t wlen = 0;
> + size_t size;
> + void *ptr;
> + int ret = 0;
> +
> + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> + raw_spin_lock_irqsave(&patch_lock, flags);
> + while (wlen < len) {
> + ptr = dst + wlen;
> + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> + len - wlen);
> +
> + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> + if (ret) {
> + pr_err("%s: operation failed\n", __func__);
> + break;
> + }
> + wlen += size;
> + }
I had an off-list discussion with Hengqi, and now I understand his
questions on this. He said we don't need a loop; we can just copy the
whole thing in a single operation, and I think he is right. RISC-V
uses a loop because it uses fixmap, and with fixmap it can only
handle a single page at a time. We use set_memory_{rw,rox}, so we
have no such limitation.
> + raw_spin_unlock_irqrestore(&patch_lock, flags);
> + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> +
> + if (!ret)
> + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> +
> + return ret;
> +}
To save time, please test this method:
+int larch_insn_text_copy(void *dst, void *src, size_t len)
+{
+ int ret = 0;
+ unsigned long flags;
+
+ WARN_ON(!PAGE_ALIGNED(src)); //maybe this is unneeded
+ WARN_ON(!PAGE_ALIGNED(dst));
+ WARN_ON(!PAGE_ALIGNED(len));
+
+ set_memory_rw((unsigned long)dst, len / PAGE_SIZE);
+ raw_spin_lock_irqsave(&patch_lock, flags);
+
+ ret = copy_to_kernel_nofault(dst, src, len);
+ if (ret)
+ pr_err("%s: operation failed\n", __func__);
+
+ raw_spin_unlock_irqrestore(&patch_lock, flags);
+ set_memory_rox((unsigned long)dst, len / PAGE_SIZE);
+
+ if (!ret)
+ flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
+
+ return ret;
+}
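(As a side note, the three WARN_ON()s make explicit the assumption from
the commit message that only page-aligned, page-sized trampolines reach
this path.)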
Huacai
> +
> u32 larch_insn_gen_nop(void)
> {
> return INSN_NOP;
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 7032f11d3..86504e710 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -4,8 +4,12 @@
> *
> * Copyright (C) 2022 Loongson Technology Corporation Limited
> */
> +#include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_LONG_JUMP_NINSNS 5
> +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> */
> static void build_prologue(struct jit_ctx *ctx)
> {
> + int i;
> int stack_adjust = 0, store_offset, bpf_stack_adjust;
>
> bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> stack_adjust = round_up(stack_adjust, 16);
> stack_adjust += bpf_stack_adjust;
>
> + /* Reserve space for the move_imm + jirl instruction */
> + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> + emit_insn(ctx, nop);
> +
> /*
> * First instruction initializes the tail call count (TCC).
> * On tail call we skip this instruction, and the TCC is
> @@ -1367,3 +1376,91 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> {
> return true;
> }
> +
> +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> +{
> + if (!target) {
> + pr_err("bpf_jit: jump target address is error\n");
> + return -EFAULT;
> + }
> +
> + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> +
> + return 0;
> +}
> +
> +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> +{
> + struct jit_ctx ctx;
> +
> + ctx.idx = 0;
> + ctx.image = (union loongarch_instruction *)insns;
> +
> + if (!target) {
> + emit_insn((&ctx), nop);
> + emit_insn((&ctx), nop);
> + return 0;
> + }
> +
> + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> + (unsigned long)target);
> +}
> +
> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> + void *old_addr, void *new_addr)
> +{
> + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + bool is_call = poke_type == BPF_MOD_CALL;
> + int ret;
> +
> + if (!is_kernel_text((unsigned long)ip) &&
> + !is_bpf_text_address((unsigned long)ip))
> + return -ENOTSUPP;
> +
> + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> + if (ret)
> + return ret;
> +
> + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + return -EFAULT;
> +
> + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&text_mutex);
> + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> + mutex_unlock(&text_mutex);
> + return ret;
> +}
> +
> +int bpf_arch_text_invalidate(void *dst, size_t len)
> +{
> + int i;
> + int ret = 0;
> + u32 *inst;
> +
> + inst = kvmalloc(len, GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + for (i = 0; i < (len/sizeof(u32)); i++)
> + inst[i] = INSN_BREAK;
> +
> + if (larch_insn_text_copy(dst, inst, len))
> + ret = -EINVAL;
> +
> + kvfree(inst);
> + return ret;
> +}
> +
> +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> +{
> + if (larch_insn_text_copy(dst, src, len))
> + return ERR_PTR(-EINVAL);
> +
> + return dst;
> +}
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 0/5] Support trampoline for LoongArch
2025-07-27 1:00 ` Geliang Tang
@ 2025-07-28 2:42 ` Huacai Chen
0 siblings, 0 replies; 22+ messages in thread
From: Huacai Chen @ 2025-07-28 2:42 UTC (permalink / raw)
To: Geliang Tang
Cc: Chenghao Duan, ast, daniel, andrii, yangtiezhu, hengqi.chen,
martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, youling.tang, jianghaoran, vincent.mc.li
On Sun, Jul 27, 2025 at 9:00 AM Geliang Tang <geliang@kernel.org> wrote:
>
> Hi Chenghao, Huacai, Tiezhu,
>
> I first discovered this LoongArch BPF trampoline issue when debugging
> MPTCP BPF selftests on a LoongArch machine last June (see my commit
> eef0532e900c "selftests/bpf: Null checks for links in bpf_tcp_ca"), and
> reported it to Huacai. Tiezhu and I started implementing the BPF
> trampoline last June. I also called on more Chinese kernel engineers to
> participate in the development of the LoongArch BPF trampoline at the
> openEuler Developer Day 2024 and CLSF 2024 conferences. Although this
> work was finally handed over to Chenghao, it is also necessary to
> mention me as the reporter and our early developers in the commit log.
Thank you for reminding me. Since the 3rd patch needs to be fixed,
Chenghao can do that as soon as possible, then adjust the SoB tags
together in v5.
Huacai
>
> Thanks,
> -Geliang
>
> On Thu, 2025-07-24 at 22:19 +0800, Chenghao Duan wrote:
> > v4:
> > 1. Delete the #3 patch of version V3.
> >
> > 2. Add 5 NOP instructions in build_prologue().
> > Reserve space for the move_imm + jirl instruction.
> >
> > 3. Differentiate between direct jumps and ftrace jumps of trampoline:
> > direct jumps skip 5 instructions.
> > ftrace jumps skip 2 instructions.
> >
> > 4. Remove the generation of BL jump instructions in emit_jump_and_link().
> > After the trampoline ends, it will jump to the specified register.
> > The BL instruction writes PC+4 to r1 instead of allowing the
> > specification of rd.
> >
> > -----------------------------------------------------------------------
> > Historical Version:
> > v3:
> > 1. Patch 0003 adds EXECMEM_BPF memory type to the execmem subsystem.
> >
> > 2. Align the size calculated by arch_bpf_trampoline_size to page
> > boundaries.
> >
> > 3. Add the flush icache operation to larch_insn_text_copy.
> >
> > 4. Unify the implementation of bpf_arch_xxx into the patch
> > "0004-LoongArch-BPF-Add-bpf_arch_xxxxx-support-for-Loong.patch".
> >
> > 5. Change the patch order. Move the patch
> > "0002-LoongArch-BPF-Update-the-code-to-rename-validate_.patch" before
> > "0005-LoongArch-BPF-Add-bpf-trampoline-support-for-Loon.patch".
> >
> > URL for version v3:
> > https://lore.kernel.org/all/20250709055029.723243-1-duanchenghao@kylinos.cn/
> > ---------
> > v2:
> > 1. Change the fixmap in the instruction copy function to set_memory_xxx.
> >
> > 2. Change the implementation method of the following code.
> > - arch_alloc_bpf_trampoline
> > - arch_free_bpf_trampoline
> > Use the BPF core's allocation and free functions.
> >
> > - bpf_arch_text_invalidate
> > Operate with the function larch_insn_text_copy that carries
> > memory attribute modifications.
> >
> > 3. Correct the incorrect code formatting.
> >
> > URL for version v2:
> > https://lore.kernel.org/all/20250618105048.1510560-1-duanchenghao@kylinos.cn/
> > ---------
> > v1:
> > Support trampoline for LoongArch. The following feature tests have been
> > completed:
> > 1. fentry
> > 2. fexit
> > 3. fmod_ret
> >
> > TODO: The support for the struct_ops feature will be provided in
> > subsequent patches.
> >
> > URL for version v1:
> > https://lore.kernel.org/all/20250611035952.111182-1-duanchenghao@kylinos.cn/
> > -----------------------------------------------------------------------
> >
> > Chenghao Duan (4):
> > LoongArch: Add larch_insn_gen_{beq,bne} helpers
> > LoongArch: BPF: Update the code to rename validate_code to
> > validate_ctx
> > LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
> > LoongArch: BPF: Add bpf trampoline support for Loongarch
> >
> > Tiezhu Yang (1):
> > LoongArch: BPF: Add struct ops support for trampoline
> >
> > arch/loongarch/include/asm/inst.h | 3 +
> > arch/loongarch/kernel/inst.c | 60 ++++
> > arch/loongarch/net/bpf_jit.c | 521 +++++++++++++++++++++++++++++-
> > arch/loongarch/net/bpf_jit.h | 6 +
> > 4 files changed, 589 insertions(+), 1 deletion(-)
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-24 14:19 ` [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
2025-07-28 2:30 ` Huacai Chen
@ 2025-07-28 10:47 ` Hengqi Chen
2025-07-28 13:21 ` Chenghao Duan
2025-07-28 10:58 ` Hengqi Chen
2 siblings, 1 reply; 22+ messages in thread
From: Hengqi Chen @ 2025-07-28 10:47 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li
On Thu, Jul 24, 2025 at 10:21 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> bpf_arch_text_invalidate on the LoongArch architecture.
>
> On LoongArch, since symbol addresses in the direct mapping
> region cannot be reached via relative jump instructions from the paged
> mapping region, we use the move_imm+jirl instruction pair as absolute
> jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> instructions in the program as placeholders for function jumps.
>
> larch_insn_text_copy is solely used for BPF. The use of
> larch_insn_text_copy() requires page_size alignment. Currently, only
> the size of the trampoline is page-aligned.
>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
> ---
> arch/loongarch/include/asm/inst.h | 1 +
> arch/loongarch/kernel/inst.c | 32 ++++++++++
> arch/loongarch/net/bpf_jit.c | 97 +++++++++++++++++++++++++++++++
> 3 files changed, 130 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 2ae96a35d..88bb73e46 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> int larch_insn_read(void *addr, u32 *insnp);
> int larch_insn_write(void *addr, u32 insn);
> int larch_insn_patch_text(void *addr, u32 insn);
> +int larch_insn_text_copy(void *dst, void *src, size_t len);
>
> u32 larch_insn_gen_nop(void);
> u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index 674e3b322..8d6594968 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -4,6 +4,7 @@
> */
> #include <linux/sizes.h>
> #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
>
> #include <asm/cacheflush.h>
> #include <asm/inst.h>
> @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> return ret;
> }
>
> +int larch_insn_text_copy(void *dst, void *src, size_t len)
> +{
> + unsigned long flags;
> + size_t wlen = 0;
> + size_t size;
> + void *ptr;
> + int ret = 0;
> +
> + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> + raw_spin_lock_irqsave(&patch_lock, flags);
> + while (wlen < len) {
> + ptr = dst + wlen;
> + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> + len - wlen);
> +
> + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> + if (ret) {
> + pr_err("%s: operation failed\n", __func__);
> + break;
> + }
> + wlen += size;
> + }
> + raw_spin_unlock_irqrestore(&patch_lock, flags);
> + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> +
> + if (!ret)
> + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> +
> + return ret;
> +}
> +
> u32 larch_insn_gen_nop(void)
> {
> return INSN_NOP;
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 7032f11d3..86504e710 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -4,8 +4,12 @@
> *
> * Copyright (C) 2022 Loongson Technology Corporation Limited
> */
> +#include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_LONG_JUMP_NINSNS 5
> +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> */
> static void build_prologue(struct jit_ctx *ctx)
> {
> + int i;
> int stack_adjust = 0, store_offset, bpf_stack_adjust;
>
> bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> stack_adjust = round_up(stack_adjust, 16);
> stack_adjust += bpf_stack_adjust;
>
> + /* Reserve space for the move_imm + jirl instruction */
> + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> + emit_insn(ctx, nop);
> +
> /*
> * First instruction initializes the tail call count (TCC).
> * On tail call we skip this instruction, and the TCC is
> @@ -1367,3 +1376,91 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> {
> return true;
> }
> +
> +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> +{
> + if (!target) {
> + pr_err("bpf_jit: jump target address is error\n");
"is error"? Do you mean "is NULL"?
> + return -EFAULT;
> + }
> +
> + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> +
> + return 0;
> +}
> +
> +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> +{
> + struct jit_ctx ctx;
> +
> + ctx.idx = 0;
> + ctx.image = (union loongarch_instruction *)insns;
> +
> + if (!target) {
> + emit_insn((&ctx), nop);
> + emit_insn((&ctx), nop);
> + return 0;
> + }
> +
> + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> + (unsigned long)target);
> +}
> +
> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> + void *old_addr, void *new_addr)
> +{
> + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + bool is_call = poke_type == BPF_MOD_CALL;
> + int ret;
> +
> + if (!is_kernel_text((unsigned long)ip) &&
> + !is_bpf_text_address((unsigned long)ip))
> + return -ENOTSUPP;
> +
> + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> + if (ret)
> + return ret;
> +
> + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + return -EFAULT;
> +
> + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&text_mutex);
> + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> + mutex_unlock(&text_mutex);
> + return ret;
> +}
> +
> +int bpf_arch_text_invalidate(void *dst, size_t len)
> +{
> + int i;
> + int ret = 0;
> + u32 *inst;
> +
> + inst = kvmalloc(len, GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + for (i = 0; i < (len/sizeof(u32)); i++)
> + inst[i] = INSN_BREAK;
> +
> + if (larch_insn_text_copy(dst, inst, len))
Do we need text_mutex here and below for larch_insn_text_copy() ?
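A minimal sketch of what that would look like here (hypothetical; it
assumes text_mutex is the serialization convention, as in
bpf_arch_text_poke() above):

	/* Sketch: serialize the text write like bpf_arch_text_poke() does */
	mutex_lock(&text_mutex);
	if (larch_insn_text_copy(dst, inst, len))
		ret = -EINVAL;
	mutex_unlock(&text_mutex);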
> + ret = -EINVAL;
> +
> + kvfree(inst);
> + return ret;
> +}
> +
> +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> +{
> + if (larch_insn_text_copy(dst, src, len))
> + return ERR_PTR(-EINVAL);
> +
> + return dst;
> +}
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-24 14:19 ` [PATCH v4 4/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
2025-07-28 2:03 ` Geliang Tang
@ 2025-07-28 10:50 ` Hengqi Chen
1 sibling, 0 replies; 22+ messages in thread
From: Hengqi Chen @ 2025-07-28 10:50 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, kernel test robot
On Thu, Jul 24, 2025 at 10:21 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> BPF trampoline is the critical infrastructure of the BPF subsystem, acting
> as a mediator between kernel functions and BPF programs. Numerous important
> features, such as using BPF program for zero overhead kernel introspection,
> rely on this key component.
>
> The related tests have passed, including the following technical points:
> 1. fentry
> 2. fmod_ret
> 3. fexit
>
> The following related testcases passed on LoongArch:
> sudo ./test_progs -a fentry_test/fentry
> sudo ./test_progs -a fexit_test/fexit
> sudo ./test_progs -a fentry_fexit
> sudo ./test_progs -a modify_return
> sudo ./test_progs -a fexit_sleep
> sudo ./test_progs -a test_overhead
> sudo ./test_progs -a trampoline_count
>
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> Tested-by: Vincent Li <vincent.mc.li@gmail.com>
> Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
> ---
> arch/loongarch/net/bpf_jit.c | 391 +++++++++++++++++++++++++++++++++++
> arch/loongarch/net/bpf_jit.h | 6 +
> 2 files changed, 397 insertions(+)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 86504e710..ac5ce3a28 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -7,9 +7,15 @@
> #include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_MAX_REG_ARGS 8
> +
> #define LOONGARCH_LONG_JUMP_NINSNS 5
> #define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
>
> +#define LOONGARCH_FENTRY_NINSNS 2
> +#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
> +#define LOONGARCH_BPF_FENTRY_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -1407,6 +1413,11 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> (unsigned long)target);
> }
>
> +static int emit_call(struct jit_ctx *ctx, u64 addr)
> +{
> + return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, addr);
> +}
> +
> int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> void *old_addr, void *new_addr)
> {
> @@ -1464,3 +1475,383 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
>
> return dst;
> }
> +
> +static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> + int i;
> +
> + for (i = 0; i < nargs; i++) {
> + emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> + args_off -= 8;
> + }
> +}
> +
> +static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> + int i;
> +
> + for (i = 0; i < nargs; i++) {
> + emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> + args_off -= 8;
> + }
> +}
> +
> +static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
> + int args_off, int retval_off,
> + int run_ctx_off, bool save_ret)
> +{
> + int ret;
> + u32 *branch;
> + struct bpf_prog *p = l->link.prog;
> + int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
> +
> + if (l->cookie) {
> + move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
> + } else {
> + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
> + -run_ctx_off + cookie_off);
> + }
> +
> + /* arg1: prog */
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> + /* arg2: &run_ctx */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
> + ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
> + if (ret)
> + return ret;
> +
> + /* store prog start time */
> + move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
> +
> + /* if (__bpf_prog_enter(prog) == 0)
> + * goto skip_exec_of_prog;
> + *
> + */
> + branch = (u32 *)ctx->image + ctx->idx;
> + /* nop reserved for conditional jump */
> + emit_insn(ctx, nop);
> +
> + /* arg1: &args_off */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
> + if (!p->jited)
> + move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
> + ret = emit_call(ctx, (const u64)p->bpf_func);
> + if (ret)
> + return ret;
> +
> + if (save_ret) {
> + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + }
> +
> + /* update branch with beqz */
> + if (ctx->image) {
> + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
> + *branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
> + }
> +
> + /* arg1: prog */
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> + /* arg2: prog start time */
> + move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
> + /* arg3: &run_ctx */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
> + ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
> +
> + return ret;
> +}
> +
> +static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
> + int args_off, int retval_off, int run_ctx_off, u32 **branches)
> +{
> + int i;
> +
> + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
> + for (i = 0; i < tl->nr_links; i++) {
> + invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
> + run_ctx_off, true);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
> + branches[i] = (u32 *)ctx->image + ctx->idx;
> + emit_insn(ctx, nop);
> + }
> +}
> +
> +u64 bpf_jit_alloc_exec_limit(void)
> +{
> + return VMALLOC_END - VMALLOC_START;
> +}
> +
> +void *arch_alloc_bpf_trampoline(unsigned int size)
> +{
> + return bpf_prog_pack_alloc(size, jit_fill_hole);
> +}
> +
> +void arch_free_bpf_trampoline(void *image, unsigned int size)
> +{
> + bpf_prog_pack_free(image, size);
> +}
> +
> +static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
> + const struct btf_func_model *m,
> + struct bpf_tramp_links *tlinks,
> + void *func_addr, u32 flags)
> +{
> + int i;
> + int stack_size = 0, nargs = 0;
> + int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
> + struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> + struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> + struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> + int ret, save_ret;
> + void *orig_call = func_addr;
> + u32 **branches = NULL;
> +
> + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> + return -ENOTSUPP;
> +
> + /*
> + * FP + 8 [ RA to parent func ] return address to parent
> + * function
> + * FP + 0 [ FP of parent func ] frame pointer of parent
> + * function
> + * FP - 8 [ T0 to traced func ] return address of traced
> + * function
> + * FP - 16 [ FP of traced func ] frame pointer of traced
> + * function
> + *
> + * FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
> + * BPF_TRAMP_F_RET_FENTRY_RET
> + * [ argN ]
> + * [ ... ]
> + * FP - args_off [ arg1 ]
> + *
> + * FP - nargs_off [ regs count ]
> + *
> + * FP - ip_off [ traced func ] BPF_TRAMP_F_IP_ARG
> + *
> + * FP - run_ctx_off [ bpf_tramp_run_ctx ]
> + *
> + * FP - sreg_off [ callee saved reg ]
> + *
> + */
> +
> + if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
> + return -ENOTSUPP;
> +
> + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> + return -ENOTSUPP;
> +
> + stack_size = 0;
> +
> + /* room of trampoline frame to store return address and frame pointer */
> + stack_size += 16;
> +
> + save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
> + if (save_ret) {
> + /* Save BPF R0 and A0 */
> + stack_size += 16;
> + retval_off = stack_size;
> + }
> +
> + /* room of trampoline frame to store args */
> + nargs = m->nr_args;
> + stack_size += nargs * 8;
> + args_off = stack_size;
> +
> + /* room of trampoline frame to store args number */
> + stack_size += 8;
> + nargs_off = stack_size;
> +
> + /* room of trampoline frame to store ip address */
> + if (flags & BPF_TRAMP_F_IP_ARG) {
> + stack_size += 8;
> + ip_off = stack_size;
> + }
> +
> + /* room of trampoline frame to store struct bpf_tramp_run_ctx */
> + stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
> + run_ctx_off = stack_size;
> +
> + stack_size += 8;
> + sreg_off = stack_size;
> +
> + stack_size = round_up(stack_size, 16);
> +
> + /* For the trampoline called from function entry */
> + /* RA and FP for parent function*/
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> +
> + /* RA and FP for traced function*/
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> + emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> +
> + /* callee saved register S1 to pass start time */
> + emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> + /* store ip address of the traced function */
> + if (flags & BPF_TRAMP_F_IP_ARG) {
> + move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
> + }
> +
> + /* store nargs number*/
> + move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
> +
> + store_args(ctx, nargs, args_off);
> +
> + /* To traced function */
> + /* Ftrace jump skips 2 NOP instructions */
> + if (is_kernel_text((unsigned long)orig_call))
> + orig_call += LOONGARCH_FENTRY_NBYTES;
> + /* Direct jump skips 5 NOP instructions */
> + else if (is_bpf_text_address((unsigned long)orig_call))
> + orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> + ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
> + if (ret)
> + return ret;
> + }
> +
> + for (i = 0; i < fentry->nr_links; i++) {
> + ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
> + run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
> + if (ret)
> + return ret;
> + }
> + if (fmod_ret->nr_links) {
> + branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
> + if (!branches)
> + return -ENOMEM;
> +
> + invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
> + run_ctx_off, branches);
> + }
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + restore_args(ctx, m->nr_args, args_off);
> + ret = emit_call(ctx, (const u64)orig_call);
> + if (ret)
> + goto out;
> + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + im->ip_after_call = ctx->ro_image + ctx->idx;
> + /* Reserve space for the move_imm + jirl instruction */
> + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> + emit_insn(ctx, nop);
> + }
> +
> + for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
> + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
> + *branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
> + }
> +
> + for (i = 0; i < fexit->nr_links; i++) {
> + ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
> + run_ctx_off, false);
> + if (ret)
> + goto out;
> + }
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + im->ip_epilogue = ctx->ro_image + ctx->idx;
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> + ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
> + if (ret)
> + goto out;
> + }
> +
> + if (flags & BPF_TRAMP_F_RESTORE_REGS)
> + restore_args(ctx, m->nr_args, args_off);
> +
> + if (save_ret) {
> + emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + }
> +
> + emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> + /* trampoline called from function entry */
> + emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> +
> + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> +
> + if (flags & BPF_TRAMP_F_SKIP_FRAME)
> + /* return to parent function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> + else
> + /* return to traced function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> +
> + ret = ctx->idx;
> +out:
> + kfree(branches);
> +
> + return ret;
> +}
> +
> +int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
> + void *ro_image_end, const struct btf_func_model *m,
> + u32 flags, struct bpf_tramp_links *tlinks,
> + void *func_addr)
> +{
> + int ret;
> + void *image, *tmp;
> + u32 size = ro_image_end - ro_image;
> +
> + image = kvmalloc(size, GFP_KERNEL);
> + if (!image)
> + return -ENOMEM;
> +
> + struct jit_ctx ctx = {
> + .image = (union loongarch_instruction *)image,
> + .ro_image = (union loongarch_instruction *)ro_image,
> + .idx = 0,
> + };
Declare ctx at function entry, please.
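Roughly like this (a sketch of the suggested reshuffle, not the final
code):

	int ret;
	void *image, *tmp;
	u32 size = ro_image_end - ro_image;
	struct jit_ctx ctx = {
		.ro_image = (union loongarch_instruction *)ro_image,
		.idx = 0,
	};

	image = kvmalloc(size, GFP_KERNEL);
	if (!image)
		return -ENOMEM;

	/* fill in the writable buffer once it exists */
	ctx.image = (union loongarch_instruction *)image;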
> +
> + jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
> + ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
> + if (ret > 0 && validate_code(&ctx) < 0) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + tmp = bpf_arch_text_copy(ro_image, image, size);
> + if (IS_ERR(tmp)) {
> + ret = PTR_ERR(tmp);
> + goto out;
> + }
> +
> + bpf_flush_icache(ro_image, ro_image_end);
> +out:
> + kvfree(image);
> + return ret < 0 ? ret : size;
> +}
> +
> +int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
> + struct bpf_tramp_links *tlinks, void *func_addr)
> +{
> + struct bpf_tramp_image im;
> + struct jit_ctx ctx;
> + int ret;
> +
> + ctx.image = NULL;
> + ctx.idx = 0;
> +
> + ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
> +
> + /* Page align */
> + return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
> +}
> diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
> index f9c569f53..5697158fd 100644
> --- a/arch/loongarch/net/bpf_jit.h
> +++ b/arch/loongarch/net/bpf_jit.h
> @@ -18,6 +18,7 @@ struct jit_ctx {
> u32 *offset;
> int num_exentries;
> union loongarch_instruction *image;
> + union loongarch_instruction *ro_image;
> u32 stack_size;
> };
>
> @@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
>
> return -EINVAL;
> }
> +
> +static inline void bpf_flush_icache(void *start, void *end)
> +{
> + flush_icache_range((unsigned long)start, (unsigned long)end);
> +}
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 5/5] LoongArch: BPF: Add struct ops support for trampoline
2025-07-24 14:19 ` [PATCH v4 5/5] LoongArch: BPF: Add struct ops support for trampoline Chenghao Duan
@ 2025-07-28 10:55 ` Hengqi Chen
2025-07-28 13:34 ` Chenghao Duan
0 siblings, 1 reply; 22+ messages in thread
From: Hengqi Chen @ 2025-07-28 10:55 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li
On Thu, Jul 24, 2025 at 10:22 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> From: Tiezhu Yang <yangtiezhu@loongson.cn>
>
> Use BPF_TRAMP_F_INDIRECT flag to detect struct ops and emit proper
> prologue and epilogue for this case.
>
> With this patch, all of the struct_ops related testcases (except
> struct_ops_multi_pages) passed on LoongArch.
>
> The testcase struct_ops_multi_pages fails because the actual
> image_pages_cnt is 40, which is bigger than MAX_TRAMP_IMAGE_PAGES.
>
> Before:
>
> $ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
> ...
> WATCHDOG: test case struct_ops_module/struct_ops_load executes for 10 seconds...
>
> After:
>
> $ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
> ...
> #15 bad_struct_ops:OK
> ...
> #399 struct_ops_autocreate:OK
> ...
> #400 struct_ops_kptr_return:OK
> ...
> #401 struct_ops_maybe_null:OK
> ...
> #402 struct_ops_module:OK
> ...
> #404 struct_ops_no_cfi:OK
> ...
> #405 struct_ops_private_stack:SKIP
> ...
> #406 struct_ops_refcounted:OK
> Summary: 8/25 PASSED, 3 SKIPPED, 0 FAILED
>
> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> ---
> arch/loongarch/net/bpf_jit.c | 71 ++++++++++++++++++++++++------------
> 1 file changed, 47 insertions(+), 24 deletions(-)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index ac5ce3a28..6a84fb104 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -1603,6 +1603,7 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
> struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> + bool is_struct_ops = flags & BPF_TRAMP_F_INDIRECT;
> int ret, save_ret;
> void *orig_call = func_addr;
> u32 **branches = NULL;
> @@ -1678,18 +1679,31 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
>
> stack_size = round_up(stack_size, 16);
>
> - /* For the trampoline called from function entry */
> - /* RA and FP for parent function*/
> - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> - emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> - emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> - emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> -
> - /* RA and FP for traced function*/
> - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> - emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> - emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> - emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> + if (!is_struct_ops) {
> + /*
> + * For the trampoline called from function entry,
> + * the frame of traced function and the frame of
> + * trampoline need to be considered.
> + */
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> +
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> + emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> + } else {
> + /*
> + * For the trampoline called directly, just handle
> + * the frame of trampoline.
> + */
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> + }
>
The diff removes code added in patch 4/5; this should be squashed into
the trampoline patch if possible.
> /* callee saved register S1 to pass start time */
> emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> @@ -1779,21 +1793,30 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
>
> emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
>
> - /* trampoline called from function entry */
> - emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> - emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> + if (!is_struct_ops) {
> + /* trampoline called from function entry */
> + emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> +
> + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
>
> - emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> - emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> + if (flags & BPF_TRAMP_F_SKIP_FRAME)
> + /* return to parent function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> + else
> + /* return to traced function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> + } else {
> + /* trampoline called directly */
> + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
>
> - if (flags & BPF_TRAMP_F_SKIP_FRAME)
> - /* return to parent function */
> emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> - else
> - /* return to traced function */
> - emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> + }
>
> ret = ctx->idx;
> out:
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-24 14:19 ` [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
2025-07-28 2:30 ` Huacai Chen
2025-07-28 10:47 ` Hengqi Chen
@ 2025-07-28 10:58 ` Hengqi Chen
2025-07-28 12:59 ` Chenghao Duan
2 siblings, 1 reply; 22+ messages in thread
From: Hengqi Chen @ 2025-07-28 10:58 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li
On Thu, Jul 24, 2025 at 10:21 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> bpf_arch_text_invalidate on the LoongArch architecture.
>
> On LoongArch, since symbol addresses in the direct mapping
> region cannot be reached via relative jump instructions from the paged
> mapping region, we use the move_imm+jirl instruction pair as absolute
> jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> instructions in the program as placeholders for function jumps.
>
> larch_insn_text_copy is solely used for BPF. The use of
> larch_insn_text_copy() requires page_size alignment. Currently, only
> the size of the trampoline is page-aligned.
>
The subject line seems kind of casual, bpf_arch_xxxxx ?
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
I didn't leave a Reviewed-by tag last time, no ?
> ---
> arch/loongarch/include/asm/inst.h | 1 +
> arch/loongarch/kernel/inst.c | 32 ++++++++++
> arch/loongarch/net/bpf_jit.c | 97 +++++++++++++++++++++++++++++++
> 3 files changed, 130 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 2ae96a35d..88bb73e46 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> int larch_insn_read(void *addr, u32 *insnp);
> int larch_insn_write(void *addr, u32 insn);
> int larch_insn_patch_text(void *addr, u32 insn);
> +int larch_insn_text_copy(void *dst, void *src, size_t len);
>
> u32 larch_insn_gen_nop(void);
> u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index 674e3b322..8d6594968 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -4,6 +4,7 @@
> */
> #include <linux/sizes.h>
> #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
>
> #include <asm/cacheflush.h>
> #include <asm/inst.h>
> @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> return ret;
> }
>
> +int larch_insn_text_copy(void *dst, void *src, size_t len)
> +{
> + unsigned long flags;
> + size_t wlen = 0;
> + size_t size;
> + void *ptr;
> + int ret = 0;
> +
> + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> + raw_spin_lock_irqsave(&patch_lock, flags);
> + while (wlen < len) {
> + ptr = dst + wlen;
> + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> + len - wlen);
> +
> + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> + if (ret) {
> + pr_err("%s: operation failed\n", __func__);
> + break;
> + }
> + wlen += size;
> + }
> + raw_spin_unlock_irqrestore(&patch_lock, flags);
> + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> +
> + if (!ret)
> + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> +
> + return ret;
> +}
> +
> u32 larch_insn_gen_nop(void)
> {
> return INSN_NOP;
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 7032f11d3..86504e710 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -4,8 +4,12 @@
> *
> * Copyright (C) 2022 Loongson Technology Corporation Limited
> */
> +#include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_LONG_JUMP_NINSNS 5
> +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> */
> static void build_prologue(struct jit_ctx *ctx)
> {
> + int i;
> int stack_adjust = 0, store_offset, bpf_stack_adjust;
>
> bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> stack_adjust = round_up(stack_adjust, 16);
> stack_adjust += bpf_stack_adjust;
>
> + /* Reserve space for the move_imm + jirl instruction */
> + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> + emit_insn(ctx, nop);
> +
> /*
> * First instruction initializes the tail call count (TCC).
> * On tail call we skip this instruction, and the TCC is
> @@ -1367,3 +1376,91 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> {
> return true;
> }
> +
> +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> +{
> + if (!target) {
> + pr_err("bpf_jit: jump target address is error\n");
> + return -EFAULT;
> + }
> +
> + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> +
> + return 0;
> +}
> +
> +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> +{
> + struct jit_ctx ctx;
> +
> + ctx.idx = 0;
> + ctx.image = (union loongarch_instruction *)insns;
> +
> + if (!target) {
> + emit_insn((&ctx), nop);
> + emit_insn((&ctx), nop);
> + return 0;
> + }
> +
> + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> + (unsigned long)target);
> +}
> +
> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> + void *old_addr, void *new_addr)
> +{
> + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + bool is_call = poke_type == BPF_MOD_CALL;
> + int ret;
> +
> + if (!is_kernel_text((unsigned long)ip) &&
> + !is_bpf_text_address((unsigned long)ip))
> + return -ENOTSUPP;
> +
> + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> + if (ret)
> + return ret;
> +
> + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + return -EFAULT;
> +
> + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&text_mutex);
> + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> + mutex_unlock(&text_mutex);
> + return ret;
> +}
> +
> +int bpf_arch_text_invalidate(void *dst, size_t len)
> +{
> + int i;
> + int ret = 0;
> + u32 *inst;
> +
> + inst = kvmalloc(len, GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + for (i = 0; i < (len/sizeof(u32)); i++)
> + inst[i] = INSN_BREAK;
> +
> + if (larch_insn_text_copy(dst, inst, len))
> + ret = -EINVAL;
> +
> + kvfree(inst);
> + return ret;
> +}
> +
> +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> +{
> + if (larch_insn_text_copy(dst, src, len))
> + return ERR_PTR(-EINVAL);
> +
> + return dst;
> +}
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-28 10:58 ` Hengqi Chen
@ 2025-07-28 12:59 ` Chenghao Duan
0 siblings, 0 replies; 22+ messages in thread
From: Chenghao Duan @ 2025-07-28 12:59 UTC (permalink / raw)
To: Hengqi Chen
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li
On Mon, Jul 28, 2025 at 06:58:41PM +0800, Hengqi Chen wrote:
> On Thu, Jul 24, 2025 at 10:21 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> > bpf_arch_text_invalidate on the LoongArch architecture.
> >
> > On LoongArch, since symbol addresses in the direct mapping
> > region cannot be reached via relative jump instructions from the paged
> > mapping region, we use the move_imm+jirl instruction pair as absolute
> > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > instructions in the program as placeholders for function jumps.
> >
> > larch_insn_text_copy is solely used for BPF. The use of
> > larch_insn_text_copy() requires page_size alignment. Currently, only
> > the size of the trampoline is page-aligned.
> >
>
> The subject line seems kind of casual, bpf_arch_xxxxx ?
Here is the modified commit log. Please take a look.
LoongArch: BPF: Implement dynamic code modification support
This commit adds the necessary infrastructure for BPF dynamic code
modification on LoongArch architecture:
1. Implement bpf_arch_text_poke() for runtime instruction patching.
2. Add bpf_arch_text_copy() for instruction block copying.
3. Create bpf_arch_text_invalidate() for code invalidation.
On LoongArch, since symbol addresses in the direct mapping
region cannot be reached via relative jump instructions from the paged
mapping region, we use the move_imm+jirl instruction pair as absolute
jump instructions. These require 2-5 instructions, so we reserve 5 NOP
instructions in the program as placeholders for function jumps.
larch_insn_text_copy is solely used for BPF. The use of
larch_insn_text_copy() requires page_size alignment. Currently, only
the size of the trampoline is page-aligned.
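As an illustration of how such a poke site is used, here is a minimal
sketch (hypothetical helper names, error handling elided; not part of
the patch). The ip argument must point at the 5-NOP region reserved by
build_prologue(), and a NULL old_addr/new_addr stands for "the site
currently holds NOPs" / "restore the NOPs":

static int attach_tramp_sketch(void *ip, void *tramp)
{
	/* NULL old_addr: the site is expected to still hold the NOPs */
	return bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, tramp);
}

static int detach_tramp_sketch(void *ip, void *tramp)
{
	/* NULL new_addr: write the NOP placeholders back */
	return bpf_arch_text_poke(ip, BPF_MOD_CALL, tramp, NULL);
}

Because old_insns/new_insns are pre-filled with INSN_NOP, any of the
five slots not consumed by the move_imm expansion remains a NOP.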
>
> > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> > Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
>
> I didn't leave a Reviewed-by tag last time, no ?
I added this. If there are any mistakes, please correct them.
>
> > ---
> > arch/loongarch/include/asm/inst.h | 1 +
> > arch/loongarch/kernel/inst.c | 32 ++++++++++
> > arch/loongarch/net/bpf_jit.c | 97 +++++++++++++++++++++++++++++++
> > 3 files changed, 130 insertions(+)
> >
> > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > index 2ae96a35d..88bb73e46 100644
> > --- a/arch/loongarch/include/asm/inst.h
> > +++ b/arch/loongarch/include/asm/inst.h
> > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > int larch_insn_read(void *addr, u32 *insnp);
> > int larch_insn_write(void *addr, u32 insn);
> > int larch_insn_patch_text(void *addr, u32 insn);
> > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> >
> > u32 larch_insn_gen_nop(void);
> > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > index 674e3b322..8d6594968 100644
> > --- a/arch/loongarch/kernel/inst.c
> > +++ b/arch/loongarch/kernel/inst.c
> > @@ -4,6 +4,7 @@
> > */
> > #include <linux/sizes.h>
> > #include <linux/uaccess.h>
> > +#include <linux/set_memory.h>
> >
> > #include <asm/cacheflush.h>
> > #include <asm/inst.h>
> > @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > return ret;
> > }
> >
> > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > +{
> > + unsigned long flags;
> > + size_t wlen = 0;
> > + size_t size;
> > + void *ptr;
> > + int ret = 0;
> > +
> > + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > + raw_spin_lock_irqsave(&patch_lock, flags);
> > + while (wlen < len) {
> > + ptr = dst + wlen;
> > + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> > + len - wlen);
> > +
> > + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> > + if (ret) {
> > + pr_err("%s: operation failed\n", __func__);
> > + break;
> > + }
> > + wlen += size;
> > + }
> > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > +
> > + if (!ret)
> > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > +
> > + return ret;
> > +}
> > +
> > u32 larch_insn_gen_nop(void)
> > {
> > return INSN_NOP;
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index 7032f11d3..86504e710 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -4,8 +4,12 @@
> > *
> > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > */
> > +#include <linux/memory.h>
> > #include "bpf_jit.h"
> >
> > +#define LOONGARCH_LONG_JUMP_NINSNS 5
> > +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> > +
> > #define REG_TCC LOONGARCH_GPR_A6
> > #define TCC_SAVED LOONGARCH_GPR_S5
> >
> > @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> > */
> > static void build_prologue(struct jit_ctx *ctx)
> > {
> > + int i;
> > int stack_adjust = 0, store_offset, bpf_stack_adjust;
> >
> > bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> > @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> > stack_adjust = round_up(stack_adjust, 16);
> > stack_adjust += bpf_stack_adjust;
> >
> > + /* Reserve space for the move_imm + jirl instruction */
> > + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > + emit_insn(ctx, nop);
> > +
> > /*
> > * First instruction initializes the tail call count (TCC).
> > * On tail call we skip this instruction, and the TCC is
> > @@ -1367,3 +1376,91 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > {
> > return true;
> > }
> > +
> > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> > +{
> > + if (!target) {
> > + pr_err("bpf_jit: jump target address is error\n");
> > + return -EFAULT;
> > + }
> > +
> > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > +
> > + return 0;
> > +}
> > +
> > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > +{
> > + struct jit_ctx ctx;
> > +
> > + ctx.idx = 0;
> > + ctx.image = (union loongarch_instruction *)insns;
> > +
> > + if (!target) {
> > + emit_insn((&ctx), nop);
> > + emit_insn((&ctx), nop);
> > + return 0;
> > + }
> > +
> > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > + (unsigned long)target);
> > +}
> > +
> > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > + void *old_addr, void *new_addr)
> > +{
> > + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > + bool is_call = poke_type == BPF_MOD_CALL;
> > + int ret;
> > +
> > + if (!is_kernel_text((unsigned long)ip) &&
> > + !is_bpf_text_address((unsigned long)ip))
> > + return -ENOTSUPP;
> > +
> > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > + if (ret)
> > + return ret;
> > +
> > + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > + return -EFAULT;
> > +
> > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > + if (ret)
> > + return ret;
> > +
> > + mutex_lock(&text_mutex);
> > + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> > + mutex_unlock(&text_mutex);
> > + return ret;
> > +}
> > +
> > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > +{
> > + int i;
> > + int ret = 0;
> > + u32 *inst;
> > +
> > + inst = kvmalloc(len, GFP_KERNEL);
> > + if (!inst)
> > + return -ENOMEM;
> > +
> > + for (i = 0; i < (len/sizeof(u32)); i++)
> > + inst[i] = INSN_BREAK;
> > +
> > + if (larch_insn_text_copy(dst, inst, len))
> > + ret = -EINVAL;
> > +
> > + kvfree(inst);
> > + return ret;
> > +}
> > +
> > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > +{
> > + if (larch_insn_text_copy(dst, src, len))
> > + return ERR_PTR(-EINVAL);
> > +
> > + return dst;
> > +}
> > --
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-28 10:47 ` Hengqi Chen
@ 2025-07-28 13:21 ` Chenghao Duan
2025-07-29 11:56 ` Chenghao Duan
0 siblings, 1 reply; 22+ messages in thread
From: Chenghao Duan @ 2025-07-28 13:21 UTC (permalink / raw)
To: Hengqi Chen
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li
On Mon, Jul 28, 2025 at 06:47:03PM +0800, Hengqi Chen wrote:
> On Thu, Jul 24, 2025 at 10:21 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> > bpf_arch_text_invalidate on the LoongArch architecture.
> >
> > On LoongArch, since symbol addresses in the direct mapping
> > region cannot be reached via relative jump instructions from the paged
> > mapping region, we use the move_imm+jirl instruction pair as absolute
> > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > instructions in the program as placeholders for function jumps.
> >
> > larch_insn_text_copy is solely used for BPF. The use of
> > larch_insn_text_copy() requires page_size alignment. Currently, only
> > the size of the trampoline is page-aligned.
> >
> > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> > Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
> > ---
> > arch/loongarch/include/asm/inst.h | 1 +
> > arch/loongarch/kernel/inst.c | 32 ++++++++++
> > arch/loongarch/net/bpf_jit.c | 97 +++++++++++++++++++++++++++++++
> > 3 files changed, 130 insertions(+)
> >
> > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > index 2ae96a35d..88bb73e46 100644
> > --- a/arch/loongarch/include/asm/inst.h
> > +++ b/arch/loongarch/include/asm/inst.h
> > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > int larch_insn_read(void *addr, u32 *insnp);
> > int larch_insn_write(void *addr, u32 insn);
> > int larch_insn_patch_text(void *addr, u32 insn);
> > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> >
> > u32 larch_insn_gen_nop(void);
> > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > index 674e3b322..8d6594968 100644
> > --- a/arch/loongarch/kernel/inst.c
> > +++ b/arch/loongarch/kernel/inst.c
> > @@ -4,6 +4,7 @@
> > */
> > #include <linux/sizes.h>
> > #include <linux/uaccess.h>
> > +#include <linux/set_memory.h>
> >
> > #include <asm/cacheflush.h>
> > #include <asm/inst.h>
> > @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > return ret;
> > }
> >
> > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > +{
> > + unsigned long flags;
> > + size_t wlen = 0;
> > + size_t size;
> > + void *ptr;
> > + int ret = 0;
> > +
> > + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > + raw_spin_lock_irqsave(&patch_lock, flags);
> > + while (wlen < len) {
> > + ptr = dst + wlen;
> > + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> > + len - wlen);
> > +
> > + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> > + if (ret) {
> > + pr_err("%s: operation failed\n", __func__);
> > + break;
> > + }
> > + wlen += size;
> > + }
> > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > +
> > + if (!ret)
> > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > +
> > + return ret;
> > +}
> > +
> > u32 larch_insn_gen_nop(void)
> > {
> > return INSN_NOP;
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index 7032f11d3..86504e710 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -4,8 +4,12 @@
> > *
> > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > */
> > +#include <linux/memory.h>
> > #include "bpf_jit.h"
> >
> > +#define LOONGARCH_LONG_JUMP_NINSNS 5
> > +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> > +
> > #define REG_TCC LOONGARCH_GPR_A6
> > #define TCC_SAVED LOONGARCH_GPR_S5
> >
> > @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> > */
> > static void build_prologue(struct jit_ctx *ctx)
> > {
> > + int i;
> > int stack_adjust = 0, store_offset, bpf_stack_adjust;
> >
> > bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> > @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> > stack_adjust = round_up(stack_adjust, 16);
> > stack_adjust += bpf_stack_adjust;
> >
> > + /* Reserve space for the move_imm + jirl instruction */
> > + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > + emit_insn(ctx, nop);
> > +
> > /*
> > * First instruction initializes the tail call count (TCC).
> > * On tail call we skip this instruction, and the TCC is
> > @@ -1367,3 +1376,91 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > {
> > return true;
> > }
> > +
> > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> > +{
> > + if (!target) {
> > + pr_err("bpf_jit: jump target address is error\n");
>
> is error ? is NULL ?
What I mean is that this is an illegal target address.
>
> > + return -EFAULT;
> > + }
> > +
> > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > +
> > + return 0;
> > +}
> > +
> > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > +{
> > + struct jit_ctx ctx;
> > +
> > + ctx.idx = 0;
> > + ctx.image = (union loongarch_instruction *)insns;
> > +
> > + if (!target) {
> > + emit_insn((&ctx), nop);
> > + emit_insn((&ctx), nop);
> > + return 0;
> > + }
> > +
> > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > + (unsigned long)target);
> > +}
> > +
> > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > + void *old_addr, void *new_addr)
> > +{
> > + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > + bool is_call = poke_type == BPF_MOD_CALL;
> > + int ret;
> > +
> > + if (!is_kernel_text((unsigned long)ip) &&
> > + !is_bpf_text_address((unsigned long)ip))
> > + return -ENOTSUPP;
> > +
> > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > + if (ret)
> > + return ret;
> > +
> > + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > + return -EFAULT;
> > +
> > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > + if (ret)
> > + return ret;
> > +
> > + mutex_lock(&text_mutex);
> > + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> > + mutex_unlock(&text_mutex);
> > + return ret;
> > +}
> > +
> > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > +{
> > + int i;
> > + int ret = 0;
> > + u32 *inst;
> > +
> > + inst = kvmalloc(len, GFP_KERNEL);
> > + if (!inst)
> > + return -ENOMEM;
> > +
> > + for (i = 0; i < (len/sizeof(u32)); i++)
> > + inst[i] = INSN_BREAK;
> > +
> > + if (larch_insn_text_copy(dst, inst, len))
>
> Do we need text_mutex here and below for larch_insn_text_copy() ?
My use of text_mutex is modeled after the arm64 code, which also takes
text_mutex only in bpf_arch_text_poke, so I have done the same here.

In the next version of the code, I will try to take text_mutex in all
contexts where larch_insn_text_copy is used and test accordingly.
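For illustration, a minimal sketch of that change for bpf_arch_text_copy
(the actual next version may differ; text_mutex is available via
linux/memory.h, which bpf_jit.c already includes):

void *bpf_arch_text_copy(void *dst, void *src, size_t len)
{
	int ret;

	mutex_lock(&text_mutex);
	ret = larch_insn_text_copy(dst, src, len);
	mutex_unlock(&text_mutex);

	return ret ? ERR_PTR(-EINVAL) : dst;
}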
>
> > + ret = -EINVAL;
> > +
> > + kvfree(inst);
> > + return ret;
> > +}
> > +
> > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > +{
> > + if (larch_insn_text_copy(dst, src, len))
> > + return ERR_PTR(-EINVAL);
> > +
> > + return dst;
> > +}
> > --
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 5/5] LoongArch: BPF: Add struct ops support for trampoline
2025-07-28 10:55 ` Hengqi Chen
@ 2025-07-28 13:34 ` Chenghao Duan
2025-07-29 2:32 ` Huacai Chen
0 siblings, 1 reply; 22+ messages in thread
From: Chenghao Duan @ 2025-07-28 13:34 UTC (permalink / raw)
To: Hengqi Chen
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li
On Mon, Jul 28, 2025 at 06:55:52PM +0800, Hengqi Chen wrote:
> On Thu, Jul 24, 2025 at 10:22 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > From: Tiezhu Yang <yangtiezhu@loongson.cn>
> >
> > Use BPF_TRAMP_F_INDIRECT flag to detect struct ops and emit proper
> > prologue and epilogue for this case.
> >
> > With this patch, all of the struct_ops related testcases (except
> > struct_ops_multi_pages) passed on LoongArch.
> >
> > The testcase struct_ops_multi_pages fails because the actual
> > image_pages_cnt is 40, which is bigger than MAX_TRAMP_IMAGE_PAGES.
> >
> > Before:
> >
> > $ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
> > ...
> > WATCHDOG: test case struct_ops_module/struct_ops_load executes for 10 seconds...
> >
> > After:
> >
> > $ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
> > ...
> > #15 bad_struct_ops:OK
> > ...
> > #399 struct_ops_autocreate:OK
> > ...
> > #400 struct_ops_kptr_return:OK
> > ...
> > #401 struct_ops_maybe_null:OK
> > ...
> > #402 struct_ops_module:OK
> > ...
> > #404 struct_ops_no_cfi:OK
> > ...
> > #405 struct_ops_private_stack:SKIP
> > ...
> > #406 struct_ops_refcounted:OK
> > Summary: 8/25 PASSED, 3 SKIPPED, 0 FAILED
> >
> > Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> > ---
> > arch/loongarch/net/bpf_jit.c | 71 ++++++++++++++++++++++++------------
> > 1 file changed, 47 insertions(+), 24 deletions(-)
> >
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index ac5ce3a28..6a84fb104 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -1603,6 +1603,7 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
> > struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> > struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> > struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> > + bool is_struct_ops = flags & BPF_TRAMP_F_INDIRECT;
> > int ret, save_ret;
> > void *orig_call = func_addr;
> > u32 **branches = NULL;
> > @@ -1678,18 +1679,31 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
> >
> > stack_size = round_up(stack_size, 16);
> >
> > - /* For the trampoline called from function entry */
> > - /* RA and FP for parent function*/
> > - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> > - emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > - emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > - emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> > -
> > - /* RA and FP for traced function*/
> > - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> > - emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > - emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > - emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> > + if (!is_struct_ops) {
> > + /*
> > + * For the trampoline called from function entry,
> > + * the frame of traced function and the frame of
> > + * trampoline need to be considered.
> > + */
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> > + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> > +
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> > + emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> > + } else {
> > + /*
> > + * For the trampoline called directly, just handle
> > + * the frame of trampoline.
> > + */
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> > + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
> > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> > + }
> >
>
> The diff removes code added in patch 4/5; this should be squashed into
> the trampoline patch if possible.
This patch was provided by Tiezhu Yang, and there was a discussion about
it at the time.
https://lore.kernel.org/all/cd190c8a-a7b9-53de-d363-c3d695fe3191@loongson.cn/
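For reference, a sketch of the two prologue layouts implied by the
quoted diff (offsets relative to the trampoline's final FP; descriptive
only, not authoritative):

	!is_struct_ops (entered from the traced function's patched entry):

		outer frame (16 bytes):
			FP' -  8: RA of the parent of the traced function
			FP' - 16: FP of the parent
		trampoline frame (stack_size bytes):
			FP  -  8: T0, return address back into the traced function
			FP  - 16: saved FP' of the outer frame

	is_struct_ops (BPF_TRAMP_F_INDIRECT, called like a normal function):

		trampoline frame only (stack_size bytes):
			FP  -  8: RA of the caller
			FP  - 16: FP of the caller

The epilogue mirrors this: in the first case it returns either to the
parent (RA, when BPF_TRAMP_F_SKIP_FRAME is set) or back into the traced
function (T0); in the struct_ops case it always returns via RA.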
>
> > /* callee saved register S1 to pass start time */
> > emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> > @@ -1779,21 +1793,30 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
> >
> > emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> >
> > - /* trampoline called from function entry */
> > - emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > - emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> > + if (!is_struct_ops) {
> > + /* trampoline called from function entry */
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> > +
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> >
> > - emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > - emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> > + if (flags & BPF_TRAMP_F_SKIP_FRAME)
> > + /* return to parent function */
> > + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> > + else
> > + /* return to traced function */
> > + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> > + } else {
> > + /* trampoline called directly */
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> >
> > - if (flags & BPF_TRAMP_F_SKIP_FRAME)
> > - /* return to parent function */
> > emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> > - else
> > - /* return to traced function */
> > - emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> > + }
> >
> > ret = ctx->idx;
> > out:
> > --
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 5/5] LoongArch: BPF: Add struct ops support for trampoline
2025-07-28 13:34 ` Chenghao Duan
@ 2025-07-29 2:32 ` Huacai Chen
0 siblings, 0 replies; 22+ messages in thread
From: Huacai Chen @ 2025-07-29 2:32 UTC (permalink / raw)
To: Chenghao Duan
Cc: Hengqi Chen, ast, daniel, andrii, yangtiezhu, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li
On Mon, Jul 28, 2025 at 9:34 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> On Mon, Jul 28, 2025 at 06:55:52PM +0800, Hengqi Chen wrote:
> > On Thu, Jul 24, 2025 at 10:22 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > From: Tiezhu Yang <yangtiezhu@loongson.cn>
> > >
> > > Use BPF_TRAMP_F_INDIRECT flag to detect struct ops and emit proper
> > > prologue and epilogue for this case.
> > >
> > > With this patch, all of the struct_ops related testcases (except
> > > struct_ops_multi_pages) passed on LoongArch.
> > >
> > > The testcase struct_ops_multi_pages fails because the actual
> > > image_pages_cnt is 40, which is bigger than MAX_TRAMP_IMAGE_PAGES.
> > >
> > > Before:
> > >
> > > $ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
> > > ...
> > > WATCHDOG: test case struct_ops_module/struct_ops_load executes for 10 seconds...
> > >
> > > After:
> > >
> > > $ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
> > > ...
> > > #15 bad_struct_ops:OK
> > > ...
> > > #399 struct_ops_autocreate:OK
> > > ...
> > > #400 struct_ops_kptr_return:OK
> > > ...
> > > #401 struct_ops_maybe_null:OK
> > > ...
> > > #402 struct_ops_module:OK
> > > ...
> > > #404 struct_ops_no_cfi:OK
> > > ...
> > > #405 struct_ops_private_stack:SKIP
> > > ...
> > > #406 struct_ops_refcounted:OK
> > > Summary: 8/25 PASSED, 3 SKIPPED, 0 FAILED
> > >
> > > Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> > > ---
> > > arch/loongarch/net/bpf_jit.c | 71 ++++++++++++++++++++++++------------
> > > 1 file changed, 47 insertions(+), 24 deletions(-)
> > >
> > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > index ac5ce3a28..6a84fb104 100644
> > > --- a/arch/loongarch/net/bpf_jit.c
> > > +++ b/arch/loongarch/net/bpf_jit.c
> > > @@ -1603,6 +1603,7 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
> > > struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> > > struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> > > struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> > > + bool is_struct_ops = flags & BPF_TRAMP_F_INDIRECT;
> > > int ret, save_ret;
> > > void *orig_call = func_addr;
> > > u32 **branches = NULL;
> > > @@ -1678,18 +1679,31 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
> > >
> > > stack_size = round_up(stack_size, 16);
> > >
> > > - /* For the trampoline called from function entry */
> > > - /* RA and FP for parent function*/
> > > - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> > > - emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > > - emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > > - emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> > > -
> > > - /* RA and FP for traced function*/
> > > - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> > > - emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > > - emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > > - emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> > > + if (!is_struct_ops) {
> > > + /*
> > > + * For the trampoline called from function entry,
> > > + * the frame of traced function and the frame of
> > > + * trampoline need to be considered.
> > > + */
> > > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> > > + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> > > +
> > > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> > > + emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> > > + } else {
> > > + /*
> > > + * For the trampoline called directly, just handle
> > > + * the frame of trampoline.
> > > + */
> > > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> > > + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
> > > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> > > + }
> > >
> >
> > The diff removes code added in patch 4/5; this should be squashed into
> > the trampoline patch if possible.
>
> This patch was provided by Tiezhu Yang, and there was a discussion about
> it at the time.
> https://lore.kernel.org/all/cd190c8a-a7b9-53de-d363-c3d695fe3191@loongson.cn/
In my opinion, I also prefer to squash.
Huacai
>
> >
> > > /* callee saved register S1 to pass start time */
> > > emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> > > @@ -1779,21 +1793,30 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
> > >
> > > emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> > >
> > > - /* trampoline called from function entry */
> > > - emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > > - emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > > - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> > > + if (!is_struct_ops) {
> > > + /* trampoline called from function entry */
> > > + emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> > > +
> > > + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> > >
> > > - emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > > - emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > > - emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> > > + if (flags & BPF_TRAMP_F_SKIP_FRAME)
> > > + /* return to parent function */
> > > + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> > > + else
> > > + /* return to traced function */
> > > + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> > > + } else {
> > > + /* trampoline called directly */
> > > + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
> > > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> > >
> > > - if (flags & BPF_TRAMP_F_SKIP_FRAME)
> > > - /* return to parent function */
> > > emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> > > - else
> > > - /* return to traced function */
> > > - emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> > > + }
> > >
> > > ret = ctx->idx;
> > > out:
> > > --
> > > 2.25.1
> > >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch
2025-07-28 13:21 ` Chenghao Duan
@ 2025-07-29 11:56 ` Chenghao Duan
0 siblings, 0 replies; 22+ messages in thread
From: Chenghao Duan @ 2025-07-29 11:56 UTC (permalink / raw)
To: Hengqi Chen
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li
On Mon, Jul 28, 2025 at 09:21:52PM +0800, Chenghao Duan wrote:
> On Mon, Jul 28, 2025 at 06:47:03PM +0800, Hengqi Chen wrote:
> > On Thu, Jul 24, 2025 at 10:21 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > Implement the functions of bpf_arch_text_poke, bpf_arch_text_copy, and
> > > bpf_arch_text_invalidate on the LoongArch architecture.
> > >
> > > On LoongArch, since symbol addresses in the direct mapping
> > > region cannot be reached via relative jump instructions from the paged
> > > mapping region, we use the move_imm+jirl instruction pair as absolute
> > > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > > instructions in the program as placeholders for function jumps.
> > >
> > > larch_insn_text_copy is solely used for BPF. The use of
> > > larch_insn_text_copy() requires page_size alignment. Currently, only
> > > the size of the trampoline is page-aligned.
> > >
> > > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > > Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> > > Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
> > > ---
> > > arch/loongarch/include/asm/inst.h | 1 +
> > > arch/loongarch/kernel/inst.c | 32 ++++++++++
> > > arch/loongarch/net/bpf_jit.c | 97 +++++++++++++++++++++++++++++++
> > > 3 files changed, 130 insertions(+)
> > >
> > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > > index 2ae96a35d..88bb73e46 100644
> > > --- a/arch/loongarch/include/asm/inst.h
> > > +++ b/arch/loongarch/include/asm/inst.h
> > > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > > int larch_insn_read(void *addr, u32 *insnp);
> > > int larch_insn_write(void *addr, u32 insn);
> > > int larch_insn_patch_text(void *addr, u32 insn);
> > > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> > >
> > > u32 larch_insn_gen_nop(void);
> > > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > index 674e3b322..8d6594968 100644
> > > --- a/arch/loongarch/kernel/inst.c
> > > +++ b/arch/loongarch/kernel/inst.c
> > > @@ -4,6 +4,7 @@
> > > */
> > > #include <linux/sizes.h>
> > > #include <linux/uaccess.h>
> > > +#include <linux/set_memory.h>
> > >
> > > #include <asm/cacheflush.h>
> > > #include <asm/inst.h>
> > > @@ -218,6 +219,37 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > return ret;
> > > }
> > >
> > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > +{
> > > + unsigned long flags;
> > > + size_t wlen = 0;
> > > + size_t size;
> > > + void *ptr;
> > > + int ret = 0;
> > > +
> > > + set_memory_rw((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > > + raw_spin_lock_irqsave(&patch_lock, flags);
> > > + while (wlen < len) {
> > > + ptr = dst + wlen;
> > > + size = min_t(size_t, PAGE_SIZE - offset_in_page(ptr),
> > > + len - wlen);
> > > +
> > > + ret = copy_to_kernel_nofault(ptr, src + wlen, size);
> > > + if (ret) {
> > > + pr_err("%s: operation failed\n", __func__);
> > > + break;
> > > + }
> > > + wlen += size;
> > > + }
> > > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > + set_memory_rox((unsigned long)dst, round_up(len, PAGE_SIZE) / PAGE_SIZE);
> > > +
> > > + if (!ret)
> > > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > > +
> > > + return ret;
> > > +}
> > > +
> > > u32 larch_insn_gen_nop(void)
> > > {
> > > return INSN_NOP;
> > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > index 7032f11d3..86504e710 100644
> > > --- a/arch/loongarch/net/bpf_jit.c
> > > +++ b/arch/loongarch/net/bpf_jit.c
> > > @@ -4,8 +4,12 @@
> > > *
> > > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > > */
> > > +#include <linux/memory.h>
> > > #include "bpf_jit.h"
> > >
> > > +#define LOONGARCH_LONG_JUMP_NINSNS 5
> > > +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> > > +
> > > #define REG_TCC LOONGARCH_GPR_A6
> > > #define TCC_SAVED LOONGARCH_GPR_S5
> > >
> > > @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> > > */
> > > static void build_prologue(struct jit_ctx *ctx)
> > > {
> > > + int i;
> > > int stack_adjust = 0, store_offset, bpf_stack_adjust;
> > >
> > > bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> > > @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> > > stack_adjust = round_up(stack_adjust, 16);
> > > stack_adjust += bpf_stack_adjust;
> > >
> > > + /* Reserve space for the move_imm + jirl instruction */
> > > + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > > + emit_insn(ctx, nop);
> > > +
> > > /*
> > > * First instruction initializes the tail call count (TCC).
> > > * On tail call we skip this instruction, and the TCC is
> > > @@ -1367,3 +1376,91 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > > {
> > > return true;
> > > }
> > > +
> > > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> > > +{
> > > + if (!target) {
> > > + pr_err("bpf_jit: jump target address is error\n");
> >
> > is error ? is NULL ?
>
> What I mean is: This is an illegal target address.
>
> >
> > > + return -EFAULT;
> > > + }
> > > +
> > > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > > +{
> > > + struct jit_ctx ctx;
> > > +
> > > + ctx.idx = 0;
> > > + ctx.image = (union loongarch_instruction *)insns;
> > > +
> > > + if (!target) {
> > > + emit_insn((&ctx), nop);
> > > + emit_insn((&ctx), nop);
> > > + return 0;
> > > + }
> > > +
> > > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > > + (unsigned long)target);
> > > +}
> > > +
> > > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > > + void *old_addr, void *new_addr)
> > > +{
> > > + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > + bool is_call = poke_type == BPF_MOD_CALL;
> > > + int ret;
> > > +
> > > + if (!is_kernel_text((unsigned long)ip) &&
> > > + !is_bpf_text_address((unsigned long)ip))
> > > + return -ENOTSUPP;
> > > +
> > > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > + return -EFAULT;
> > > +
> > > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + mutex_lock(&text_mutex);
> > > + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> > > + mutex_unlock(&text_mutex);
> > > + return ret;
> > > +}
> > > +
> > > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > > +{
> > > + int i;
> > > + int ret = 0;
> > > + u32 *inst;
> > > +
> > > + inst = kvmalloc(len, GFP_KERNEL);
> > > + if (!inst)
> > > + return -ENOMEM;
> > > +
> > > + for (i = 0; i < (len/sizeof(u32)); i++)
> > > + inst[i] = INSN_BREAK;
> > > +
> > > + if (larch_insn_text_copy(dst, inst, len))
> >
> > Do we need text_mutex here and below for larch_insn_text_copy() ?
>
> My use of text_mutex is modeled after the arm64 code, which also takes
> text_mutex only in bpf_arch_text_poke, so I have done the same here.
>
> In the next version of the code, I will try to take text_mutex in all
> contexts where larch_insn_text_copy is used and test accordingly.
The tests passed after taking text_mutex in all contexts where
larch_insn_text_copy is used.
> >
> > > + ret = -EINVAL;
> > > +
> > > + kvfree(inst);
> > > + return ret;
> > > +}
> > > +
> > > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > > +{
> > > + if (larch_insn_text_copy(dst, src, len))
> > > + return ERR_PTR(-EINVAL);
> > > +
> > > + return dst;
> > > +}
> > > --
> > > 2.25.1
> > >
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads: [~2025-07-29 11:56 UTC | newest]
Thread overview: 22+ messages
2025-07-24 14:19 [PATCH v4 0/5] Support trampoline for LoongArch Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 3/5] LoongArch: BPF: Add bpf_arch_xxxxx support for Loongarch Chenghao Duan
2025-07-28 2:30 ` Huacai Chen
2025-07-28 10:47 ` Hengqi Chen
2025-07-28 13:21 ` Chenghao Duan
2025-07-29 11:56 ` Chenghao Duan
2025-07-28 10:58 ` Hengqi Chen
2025-07-28 12:59 ` Chenghao Duan
2025-07-24 14:19 ` [PATCH v4 4/5] LoongArch: BPF: Add bpf trampoline " Chenghao Duan
2025-07-28 2:03 ` Geliang Tang
2025-07-28 10:50 ` Hengqi Chen
2025-07-24 14:19 ` [PATCH v4 5/5] LoongArch: BPF: Add struct ops support for trampoline Chenghao Duan
2025-07-28 10:55 ` Hengqi Chen
2025-07-28 13:34 ` Chenghao Duan
2025-07-29 2:32 ` Huacai Chen
2025-07-24 15:30 ` [PATCH v4 0/5] Support trampoline for LoongArch Vincent Li
2025-07-25 10:18 ` Chenghao Duan
2025-07-26 19:14 ` Daniel Borkmann
2025-07-27 1:00 ` Geliang Tang
2025-07-28 2:42 ` Huacai Chen