* [PATCH v5 0/5] Support trampoline for LoongArch
@ 2025-07-30 13:12 Chenghao Duan
2025-07-30 13:12 ` [PATCH v5 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers Chenghao Duan
` (5 more replies)
0 siblings, 6 replies; 35+ messages in thread
From: Chenghao Duan @ 2025-07-30 13:12 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li, geliang
v5:
1. Remove the while loop from larch_insn_text_copy(); the loop inside
copy_to_kernel_nofault() already handles and copies all of the data.
2. Take text_mutex in every caller of larch_insn_text_copy(); the
relevant tests have passed.
-----------------------------------------------------------------------
Historical Version:
v4:
1. Delete the #3 patch of version V3.
2. Add 5 NOP instructions in build_prologue().
Reserve space for the move_imm + jirl instruction.
3. Differentiate between direct jumps and ftrace jumps of trampoline:
direct jumps skip 5 instructions.
ftrace jumps skip 2 instructions.
4. Remove the generation of BL jump instructions in emit_jump_and_link().
When the trampoline ends, it jumps to the address held in a specified
register, but the BL instruction always writes PC+4 to r1 and does not
allow rd to be specified.
URL for version v4:
https://lore.kernel.org/all/20250724141929.691853-1-duanchenghao@kylinos.cn/
---------
v3:
1. Patch 0003 adds EXECMEM_BPF memory type to the execmem subsystem.
2. Align the size calculated by arch_bpf_trampoline_size to page
boundaries.
3. Add the flush icache operation to larch_insn_text_copy.
4. Unify the implementation of bpf_arch_xxx into the patch
"0004-LoongArch-BPF-Add-bpf_arch_xxxxx-support-for-Loong.patch".
5. Change the patch order. Move the patch
"0002-LoongArch-BPF-Update-the-code-to-rename-validate_.patch" before
"0005-LoongArch-BPF-Add-bpf-trampoline-support-for-Loon.patch".
URL for version v3:
https://lore.kernel.org/all/20250709055029.723243-1-duanchenghao@kylinos.cn/
---------
v2:
1. Change the fixmap in the instruction copy function to set_memory_xxx.
2. Rework the implementation of the following functions:
- arch_alloc_bpf_trampoline
- arch_free_bpf_trampoline
Use the BPF core's allocation and free functions.
- bpf_arch_text_invalidate
Operate with the function larch_insn_text_copy that carries
memory attribute modifications.
3. Fix the code formatting.
URL for version v2:
https://lore.kernel.org/all/20250618105048.1510560-1-duanchenghao@kylinos.cn/
---------
v1:
Support trampoline for LoongArch. The following feature tests have been
completed:
1. fentry
2. fexit
3. fmod_ret
TODO: The support for the struct_ops feature will be provided in
subsequent patches.
URL for version v1:
https://lore.kernel.org/all/20250611035952.111182-1-duanchenghao@kylinos.cn/
-----------------------------------------------------------------------
Chenghao Duan (4):
LoongArch: Add larch_insn_gen_{beq,bne} helpers
LoongArch: BPF: Update the code to rename validate_code to
validate_ctx
LoongArch: BPF: Implement dynamic code modification support
LoongArch: BPF: Add bpf trampoline support for Loongarch
Tiezhu Yang (1):
LoongArch: BPF: Add struct ops support for trampoline
arch/loongarch/include/asm/inst.h | 3 +
arch/loongarch/kernel/inst.c | 54 +++
arch/loongarch/net/bpf_jit.c | 527 +++++++++++++++++++++++++++++-
arch/loongarch/net/bpf_jit.h | 6 +
4 files changed, 589 insertions(+), 1 deletion(-)
--
2.25.1
* [PATCH v5 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers
2025-07-30 13:12 [PATCH v5 0/5] Support trampoline for LoongArch Chenghao Duan
@ 2025-07-30 13:12 ` Chenghao Duan
2025-07-31 1:41 ` Hengqi Chen
2025-07-30 13:12 ` [PATCH v5 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
` (4 subsequent siblings)
5 siblings, 1 reply; 35+ messages in thread
From: Chenghao Duan @ 2025-07-30 13:12 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li, geliang, Youling Tang
Add the larch_insn_gen_beq() and larch_insn_gen_bne() helpers, which
will be used by the BPF trampoline implementation.
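The range check these helpers perform can be sketched in user space as
follows. This is an illustrative sketch, not the kernel code: the names
branch_offset_ok() and encode_beq_sketch() are hypothetical, and the bit
layout (6-bit opcode 0x16, 16-bit word offset, rj, rd) is an assumption
taken from the LoongArch reg2i16 instruction format.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SZ_128K    0x20000L /* same value as the kernel's SZ_128K */
#define SKETCH_BEQ_OPCODE 0x16u    /* assumed beq opcode, bits 31:26 */

/* A conditional branch offset must be 4-byte aligned and fit in 18 signed bits. */
static bool branch_offset_ok(long imm)
{
	return !(imm & 3) && imm >= -SKETCH_SZ_128K && imm < SKETCH_SZ_128K;
}

/* Assumed reg2i16 layout: opcode[31:26] | offs16[25:10] | rj[9:5] | rd[4:0]. */
static uint32_t encode_beq_sketch(unsigned int rj, unsigned int rd, long imm)
{
	/* The instruction stores a word offset, so divide the byte offset by 4. */
	uint32_t offs16 = (uint32_t)(imm / 4) & 0xffffu;

	return (SKETCH_BEQ_OPCODE << 26) | (offs16 << 10) |
	       ((rj & 31u) << 5) | (rd & 31u);
}
```

A caller would test branch_offset_ok() first and only encode when it
returns true, which mirrors how the helpers return INSN_BREAK for an
out-of-range offset.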
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Co-developed-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
arch/loongarch/include/asm/inst.h | 2 ++
arch/loongarch/kernel/inst.c | 28 ++++++++++++++++++++++++++++
2 files changed, 30 insertions(+)
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index 3089785ca..2ae96a35d 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -511,6 +511,8 @@ u32 larch_insn_gen_lu12iw(enum loongarch_gpr rd, int imm);
u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm);
u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
+u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
+u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
static inline bool signed_imm_check(long val, unsigned int bit)
{
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index 14d7d700b..674e3b322 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -336,3 +336,31 @@ u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
return insn.word;
}
+
+u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
+{
+ union loongarch_instruction insn;
+
+ if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
+ pr_warn("The generated beq instruction is out of range.\n");
+ return INSN_BREAK;
+ }
+
+ emit_beq(&insn, rj, rd, imm >> 2);
+
+ return insn.word;
+}
+
+u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
+{
+ union loongarch_instruction insn;
+
+ if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
+ pr_warn("The generated bne instruction is out of range.\n");
+ return INSN_BREAK;
+ }
+
+ emit_bne(&insn, rj, rd, imm >> 2);
+
+ return insn.word;
+}
--
2.25.1
* [PATCH v5 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx
2025-07-30 13:12 [PATCH v5 0/5] Support trampoline for LoongArch Chenghao Duan
2025-07-30 13:12 ` [PATCH v5 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers Chenghao Duan
@ 2025-07-30 13:12 ` Chenghao Duan
2025-07-31 1:44 ` Hengqi Chen
2025-07-30 13:12 ` [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support Chenghao Duan
` (3 subsequent siblings)
5 siblings, 1 reply; 35+ messages in thread
From: Chenghao Duan @ 2025-07-30 13:12 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li, geliang
Rename the existing validate_code() to validate_ctx(), and factor the
code validation handling out into a new helper, validate_code():
* validate_code() checks the validity of the code.
* validate_ctx() checks both code validity and table entry
correctness.
The new validate_code() will be used in subsequent changes.
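The split can be pictured with a small user-space model. This is only a
sketch under stated assumptions: struct sketch_ctx and both functions
are hypothetical stand-ins, and the idea that the code check scans for
a leftover break filler (assumed encoding 0x002a0000) is an assumption
about the part of validate_code() that the diff does not show.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INSN_BREAK 0x002a0000u /* assumed encoding of the break filler */

/* Hypothetical stand-in for the fields of struct jit_ctx that matter here. */
struct sketch_ctx {
	const uint32_t *image;  /* emitted instructions */
	size_t idx;             /* number of emitted instructions */
	int num_exentries;      /* exception table entries actually emitted */
	int expected_exentries; /* entries the program declared */
};

/* Code-only check: no emitted slot may still hold the break filler. */
static int sketch_validate_code(const struct sketch_ctx *ctx)
{
	for (size_t i = 0; i < ctx->idx; i++)
		if (ctx->image[i] == SKETCH_INSN_BREAK)
			return -1;
	return 0;
}

/* Full check: code validity plus exception-table consistency. */
static int sketch_validate_ctx(const struct sketch_ctx *ctx)
{
	if (sketch_validate_code(ctx))
		return -1;
	return ctx->num_exentries == ctx->expected_exentries ? 0 : -1;
}
```

A trampoline has no exception table, which is why a caller that only
needs the code check would use the narrower function.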
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
arch/loongarch/net/bpf_jit.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index fa1500d4a..7032f11d3 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -1180,6 +1180,14 @@ static int validate_code(struct jit_ctx *ctx)
return -1;
}
+ return 0;
+}
+
+static int validate_ctx(struct jit_ctx *ctx)
+{
+ if (validate_code(ctx))
+ return -1;
+
if (WARN_ON_ONCE(ctx->num_exentries != ctx->prog->aux->num_exentries))
return -1;
@@ -1288,7 +1296,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
build_epilogue(&ctx);
/* 3. Extra pass to validate JITed code */
- if (validate_code(&ctx)) {
+ if (validate_ctx(&ctx)) {
bpf_jit_binary_free(header);
prog = orig_prog;
goto out_offset;
--
2.25.1
* [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support
2025-07-30 13:12 [PATCH v5 0/5] Support trampoline for LoongArch Chenghao Duan
2025-07-30 13:12 ` [PATCH v5 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers Chenghao Duan
2025-07-30 13:12 ` [PATCH v5 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
@ 2025-07-30 13:12 ` Chenghao Duan
2025-08-04 2:02 ` Hengqi Chen
2025-08-04 2:24 ` Hengqi Chen
2025-07-30 13:12 ` [PATCH v5 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch Chenghao Duan
` (2 subsequent siblings)
5 siblings, 2 replies; 35+ messages in thread
From: Chenghao Duan @ 2025-07-30 13:12 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li, geliang
This commit adds support for BPF dynamic code modification on the
LoongArch architecture:
1. Implement bpf_arch_text_poke() for runtime instruction patching.
2. Add bpf_arch_text_copy() for instruction block copying.
3. Create bpf_arch_text_invalidate() for code invalidation.
On LoongArch, symbol addresses in the direct mapping region cannot be
reached via relative jump instructions from the paged mapping region,
so we use the move_imm + jirl instruction pair as an absolute jump.
The pair takes 2 to 5 instructions, so we reserve 5 NOP instructions
in the program as placeholders for function jumps.
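The compare-then-replace protocol applied to the reserved 5-instruction
slot can be sketched in user space as follows. This is a simplified
model, not the kernel implementation: sketch_gen() is a hypothetical
stand-in that writes a placeholder pattern instead of real
move_imm + jirl encodings, and the NOP value 0x03400000 is an assumed
encoding.

```c
#include <stdint.h>
#include <string.h>

#define SLOT_NINSNS 5           /* mirrors LOONGARCH_LONG_JUMP_NINSNS */
#define SKETCH_NOP  0x03400000u /* assumed NOP encoding */

/*
 * Hypothetical stand-in for gen_jump_or_nops(): all NOPs for a zero target,
 * otherwise a recognisable placeholder. It does NOT emit real instructions.
 */
static void sketch_gen(uint32_t slot[SLOT_NINSNS], uint64_t target)
{
	for (int i = 0; i < SLOT_NINSNS; i++)
		slot[i] = SKETCH_NOP;
	if (target) {
		slot[0] = (uint32_t)target;
		slot[1] = (uint32_t)(target >> 32);
	}
}

/* Returns 0 on success, -1 if the slot does not hold the expected old content. */
static int sketch_text_poke(uint32_t slot[SLOT_NINSNS],
			    uint64_t old_target, uint64_t new_target)
{
	uint32_t old_insns[SLOT_NINSNS], new_insns[SLOT_NINSNS];

	sketch_gen(old_insns, old_target);
	if (memcmp(slot, old_insns, sizeof(old_insns)))
		return -1; /* the slot is not in the state the caller expects */

	sketch_gen(new_insns, new_target);
	/* Skip the write when the slot already matches the new content. */
	if (memcmp(slot, new_insns, sizeof(new_insns)))
		memcpy(slot, new_insns, sizeof(new_insns));
	return 0;
}
```

In the kernel the final write goes through larch_insn_text_copy() with
text_mutex held; the plain memcpy() here only stands in for that step.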
larch_insn_text_copy() is used only by BPF. It requires page-size
alignment; currently, only the size of the trampoline is
page-aligned.
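Because the permission changes operate on whole pages, even a short
copy toggles every page that the destination range touches. The
arithmetic can be sketched as follows; the 16 KiB page size and the
names struct page_span and sketch_page_span() are assumptions made for
illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x4000UL /* assumption: 16 KiB pages */

struct page_span {
	uintptr_t start; /* page-aligned start of the affected range */
	size_t npages;   /* number of pages touched by [dst, dst + len) */
};

static struct page_span sketch_page_span(uintptr_t dst, size_t len)
{
	/* Round the start down and the end up to page boundaries. */
	uintptr_t start = dst & ~(SKETCH_PAGE_SIZE - 1);
	uintptr_t end = (dst + len + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
	struct page_span span = { start, (end - start) / SKETCH_PAGE_SIZE };

	return span;
}
```

A 32-byte copy that straddles a page boundary therefore yields two
pages, both of which become writable for the duration of the copy.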
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
arch/loongarch/include/asm/inst.h | 1 +
arch/loongarch/kernel/inst.c | 27 ++++++++
arch/loongarch/net/bpf_jit.c | 104 ++++++++++++++++++++++++++++++
3 files changed, 132 insertions(+)
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index 2ae96a35d..88bb73e46 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
int larch_insn_read(void *addr, u32 *insnp);
int larch_insn_write(void *addr, u32 insn);
int larch_insn_patch_text(void *addr, u32 insn);
+int larch_insn_text_copy(void *dst, void *src, size_t len);
u32 larch_insn_gen_nop(void);
u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index 674e3b322..7df63a950 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -4,6 +4,7 @@
*/
#include <linux/sizes.h>
#include <linux/uaccess.h>
+#include <linux/set_memory.h>
#include <asm/cacheflush.h>
#include <asm/inst.h>
@@ -218,6 +219,32 @@ int larch_insn_patch_text(void *addr, u32 insn)
return ret;
}
+int larch_insn_text_copy(void *dst, void *src, size_t len)
+{
+ int ret;
+ unsigned long flags;
+ unsigned long dst_start, dst_end, dst_len;
+
+ dst_start = round_down((unsigned long)dst, PAGE_SIZE);
+ dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
+ dst_len = dst_end - dst_start;
+
+ set_memory_rw(dst_start, dst_len / PAGE_SIZE);
+ raw_spin_lock_irqsave(&patch_lock, flags);
+
+ ret = copy_to_kernel_nofault(dst, src, len);
+ if (ret)
+ pr_err("%s: operation failed\n", __func__);
+
+ raw_spin_unlock_irqrestore(&patch_lock, flags);
+ set_memory_rox(dst_start, dst_len / PAGE_SIZE);
+
+ if (!ret)
+ flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
+
+ return ret;
+}
+
u32 larch_insn_gen_nop(void)
{
return INSN_NOP;
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 7032f11d3..5e6ae7e0e 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -4,8 +4,12 @@
*
* Copyright (C) 2022 Loongson Technology Corporation Limited
*/
+#include <linux/memory.h>
#include "bpf_jit.h"
+#define LOONGARCH_LONG_JUMP_NINSNS 5
+#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
+
#define REG_TCC LOONGARCH_GPR_A6
#define TCC_SAVED LOONGARCH_GPR_S5
@@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
*/
static void build_prologue(struct jit_ctx *ctx)
{
+ int i;
int stack_adjust = 0, store_offset, bpf_stack_adjust;
bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
@@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
stack_adjust = round_up(stack_adjust, 16);
stack_adjust += bpf_stack_adjust;
+ /* Reserve space for the move_imm + jirl instruction */
+ for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
+ emit_insn(ctx, nop);
+
/*
* First instruction initializes the tail call count (TCC).
* On tail call we skip this instruction, and the TCC is
@@ -1367,3 +1376,98 @@ bool bpf_jit_supports_subprog_tailcalls(void)
{
return true;
}
+
+static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
+{
+ if (!target) {
+ pr_err("bpf_jit: jump target address is error\n");
+ return -EFAULT;
+ }
+
+ move_imm(ctx, LOONGARCH_GPR_T1, target, false);
+ emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
+
+ return 0;
+}
+
+static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
+{
+ struct jit_ctx ctx;
+
+ ctx.idx = 0;
+ ctx.image = (union loongarch_instruction *)insns;
+
+ if (!target) {
+ emit_insn((&ctx), nop);
+ emit_insn((&ctx), nop);
+ return 0;
+ }
+
+ return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
+ (unsigned long)target);
+}
+
+int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
+ void *old_addr, void *new_addr)
+{
+ u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
+ u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
+ bool is_call = poke_type == BPF_MOD_CALL;
+ int ret;
+
+ if (!is_kernel_text((unsigned long)ip) &&
+ !is_bpf_text_address((unsigned long)ip))
+ return -ENOTSUPP;
+
+ ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
+ if (ret)
+ return ret;
+
+ if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
+ return -EFAULT;
+
+ ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
+ if (ret)
+ return ret;
+
+ mutex_lock(&text_mutex);
+ if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
+ ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
+ mutex_unlock(&text_mutex);
+ return ret;
+}
+
+int bpf_arch_text_invalidate(void *dst, size_t len)
+{
+ int i;
+ int ret = 0;
+ u32 *inst;
+
+ inst = kvmalloc(len, GFP_KERNEL);
+ if (!inst)
+ return -ENOMEM;
+
+ for (i = 0; i < (len/sizeof(u32)); i++)
+ inst[i] = INSN_BREAK;
+
+ mutex_lock(&text_mutex);
+ if (larch_insn_text_copy(dst, inst, len))
+ ret = -EINVAL;
+ mutex_unlock(&text_mutex);
+
+ kvfree(inst);
+ return ret;
+}
+
+void *bpf_arch_text_copy(void *dst, void *src, size_t len)
+{
+ int ret;
+
+ mutex_lock(&text_mutex);
+ ret = larch_insn_text_copy(dst, src, len);
+ mutex_unlock(&text_mutex);
+ if (ret)
+ return ERR_PTR(-EINVAL);
+
+ return dst;
+}
--
2.25.1
* [PATCH v5 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-30 13:12 [PATCH v5 0/5] Support trampoline for LoongArch Chenghao Duan
` (2 preceding siblings ...)
2025-07-30 13:12 ` [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support Chenghao Duan
@ 2025-07-30 13:12 ` Chenghao Duan
2025-07-31 2:17 ` Chenghao Duan
2025-08-03 14:17 ` Huacai Chen
2025-07-30 13:12 ` [PATCH v5 5/5] LoongArch: BPF: Add struct ops support for trampoline Chenghao Duan
2025-08-01 5:21 ` [PATCH v5 0/5] Support trampoline for LoongArch Vincent Li
5 siblings, 2 replies; 35+ messages in thread
From: Chenghao Duan @ 2025-07-30 13:12 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li, geliang, kernel test robot
The BPF trampoline is critical infrastructure of the BPF subsystem, acting
as a mediator between kernel functions and BPF programs. Numerous important
features, such as using BPF programs for zero-overhead kernel introspection,
rely on this key component.
The related tests have passed, covering the following features:
1. fentry
2. fmod_ret
3. fexit
The following related testcases passed on LoongArch:
sudo ./test_progs -a fentry_test/fentry
sudo ./test_progs -a fexit_test/fexit
sudo ./test_progs -a fentry_fexit
sudo ./test_progs -a modify_return
sudo ./test_progs -a fexit_sleep
sudo ./test_progs -a test_overhead
sudo ./test_progs -a trampoline_count
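The trampoline in this patch addresses every saved item at a negative
offset from FP, and the offsets are accumulated in a fixed order. The
bookkeeping can be sketched in user space as follows. The struct and
function names are hypothetical, and run_ctx_size is a parameter
standing in for sizeof(struct bpf_tramp_run_ctx), whose real value
depends on the kernel configuration.

```c
#include <stdbool.h>

struct tramp_layout {
	int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
	int stack_size;
};

static int round_up_to(int v, int align)
{
	return (v + align - 1) / align * align;
}

static struct tramp_layout sketch_layout(int nargs, bool save_ret,
					 bool ip_arg, int run_ctx_size)
{
	struct tramp_layout l = {0};
	int size = 16; /* return address and frame pointer */

	if (save_ret) { /* room for A0 and BPF R0 */
		size += 16;
		l.retval_off = size;
	}
	size += nargs * 8; /* one 8-byte slot per register argument */
	l.args_off = size;
	size += 8;         /* argument count */
	l.nargs_off = size;
	if (ip_arg) {      /* address of the traced function */
		size += 8;
		l.ip_off = size;
	}
	size += round_up_to(run_ctx_size, 8);
	l.run_ctx_off = size;
	size += 8;         /* callee-saved register used for the start time */
	l.sreg_off = size;
	l.stack_size = round_up_to(size, 16); /* keep SP 16-byte aligned */
	return l;
}
```

With two arguments, a saved return value, the IP argument and an
assumed 32-byte run context, the sketch gives a 112-byte frame.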
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
Reported-by: Geliang Tang <geliang@kernel.org>
Co-developed-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: George Guo <guodongtai@kylinos.cn>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
---
arch/loongarch/net/bpf_jit.c | 390 +++++++++++++++++++++++++++++++++++
arch/loongarch/net/bpf_jit.h | 6 +
2 files changed, 396 insertions(+)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 5e6ae7e0e..eddf582e4 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -7,9 +7,15 @@
#include <linux/memory.h>
#include "bpf_jit.h"
+#define LOONGARCH_MAX_REG_ARGS 8
+
#define LOONGARCH_LONG_JUMP_NINSNS 5
#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
+#define LOONGARCH_FENTRY_NINSNS 2
+#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
+#define LOONGARCH_BPF_FENTRY_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
+
#define REG_TCC LOONGARCH_GPR_A6
#define TCC_SAVED LOONGARCH_GPR_S5
@@ -1407,6 +1413,11 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
(unsigned long)target);
}
+static int emit_call(struct jit_ctx *ctx, u64 addr)
+{
+ return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, addr);
+}
+
int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
void *old_addr, void *new_addr)
{
@@ -1471,3 +1482,382 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
return dst;
}
+
+static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
+{
+ int i;
+
+ for (i = 0; i < nargs; i++) {
+ emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
+ args_off -= 8;
+ }
+}
+
+static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
+{
+ int i;
+
+ for (i = 0; i < nargs; i++) {
+ emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
+ args_off -= 8;
+ }
+}
+
+static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
+ int args_off, int retval_off,
+ int run_ctx_off, bool save_ret)
+{
+ int ret;
+ u32 *branch;
+ struct bpf_prog *p = l->link.prog;
+ int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
+
+ if (l->cookie) {
+ move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
+ emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
+ } else {
+ emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
+ -run_ctx_off + cookie_off);
+ }
+
+ /* arg1: prog */
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
+ /* arg2: &run_ctx */
+ emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
+ ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
+ if (ret)
+ return ret;
+
+ /* store prog start time */
+ move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
+
+ /* if (__bpf_prog_enter(prog) == 0)
+ * goto skip_exec_of_prog;
+ *
+ */
+ branch = (u32 *)ctx->image + ctx->idx;
+ /* nop reserved for conditional jump */
+ emit_insn(ctx, nop);
+
+ /* arg1: &args_off */
+ emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
+ if (!p->jited)
+ move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
+ ret = emit_call(ctx, (const u64)p->bpf_func);
+ if (ret)
+ return ret;
+
+ if (save_ret) {
+ emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
+ }
+
+ /* update branch with beqz */
+ if (ctx->image) {
+ int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
+ *branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
+ }
+
+ /* arg1: prog */
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
+ /* arg2: prog start time */
+ move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
+ /* arg3: &run_ctx */
+ emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
+ ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
+
+ return ret;
+}
+
+static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
+ int args_off, int retval_off, int run_ctx_off, u32 **branches)
+{
+ int i;
+
+ emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
+ for (i = 0; i < tl->nr_links; i++) {
+ invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
+ run_ctx_off, true);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
+ branches[i] = (u32 *)ctx->image + ctx->idx;
+ emit_insn(ctx, nop);
+ }
+}
+
+u64 bpf_jit_alloc_exec_limit(void)
+{
+ return VMALLOC_END - VMALLOC_START;
+}
+
+void *arch_alloc_bpf_trampoline(unsigned int size)
+{
+ return bpf_prog_pack_alloc(size, jit_fill_hole);
+}
+
+void arch_free_bpf_trampoline(void *image, unsigned int size)
+{
+ bpf_prog_pack_free(image, size);
+}
+
+static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
+ const struct btf_func_model *m,
+ struct bpf_tramp_links *tlinks,
+ void *func_addr, u32 flags)
+{
+ int i;
+ int stack_size = 0, nargs = 0;
+ int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
+ struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
+ struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
+ struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+ int ret, save_ret;
+ void *orig_call = func_addr;
+ u32 **branches = NULL;
+
+ if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
+ return -ENOTSUPP;
+
+ /*
+ * FP + 8 [ RA to parent func ] return address to parent
+ * function
+ * FP + 0 [ FP of parent func ] frame pointer of parent
+ * function
+ * FP - 8 [ T0 to traced func ] return address of traced
+ * function
+ * FP - 16 [ FP of traced func ] frame pointer of traced
+ * function
+ *
+ * FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
+ * BPF_TRAMP_F_RET_FENTRY_RET
+ * [ argN ]
+ * [ ... ]
+ * FP - args_off [ arg1 ]
+ *
+ * FP - nargs_off [ regs count ]
+ *
+ * FP - ip_off [ traced func ] BPF_TRAMP_F_IP_ARG
+ *
+ * FP - run_ctx_off [ bpf_tramp_run_ctx ]
+ *
+ * FP - sreg_off [ callee saved reg ]
+ *
+ */
+
+ if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
+ return -ENOTSUPP;
+
+ if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
+ return -ENOTSUPP;
+
+ stack_size = 0;
+
+ /* room of trampoline frame to store return address and frame pointer */
+ stack_size += 16;
+
+ save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
+ if (save_ret) {
+ /* Save BPF R0 and A0 */
+ stack_size += 16;
+ retval_off = stack_size;
+ }
+
+ /* room of trampoline frame to store args */
+ nargs = m->nr_args;
+ stack_size += nargs * 8;
+ args_off = stack_size;
+
+ /* room of trampoline frame to store args number */
+ stack_size += 8;
+ nargs_off = stack_size;
+
+ /* room of trampoline frame to store ip address */
+ if (flags & BPF_TRAMP_F_IP_ARG) {
+ stack_size += 8;
+ ip_off = stack_size;
+ }
+
+ /* room of trampoline frame to store struct bpf_tramp_run_ctx */
+ stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
+ run_ctx_off = stack_size;
+
+ stack_size += 8;
+ sreg_off = stack_size;
+
+ stack_size = round_up(stack_size, 16);
+
+ /* For the trampoline called from function entry */
+ /* RA and FP for parent function*/
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
+ emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
+
+ /* RA and FP for traced function*/
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
+ emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
+
+ /* callee saved register S1 to pass start time */
+ emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
+
+ /* store ip address of the traced function */
+ if (flags & BPF_TRAMP_F_IP_ARG) {
+ move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
+ emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
+ }
+
+ /* store nargs number*/
+ move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
+ emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
+
+ store_args(ctx, nargs, args_off);
+
+ /* To traced function */
+ /* Ftrace jump skips 2 NOP instructions */
+ if (is_kernel_text((unsigned long)orig_call))
+ orig_call += LOONGARCH_FENTRY_NBYTES;
+ /* Direct jump skips 5 NOP instructions */
+ else if (is_bpf_text_address((unsigned long)orig_call))
+ orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
+ ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
+ if (ret)
+ return ret;
+ }
+
+ for (i = 0; i < fentry->nr_links; i++) {
+ ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
+ run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
+ if (ret)
+ return ret;
+ }
+ if (fmod_ret->nr_links) {
+ branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
+ if (!branches)
+ return -ENOMEM;
+
+ invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
+ run_ctx_off, branches);
+ }
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ restore_args(ctx, m->nr_args, args_off);
+ ret = emit_call(ctx, (const u64)orig_call);
+ if (ret)
+ goto out;
+ emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
+ im->ip_after_call = ctx->ro_image + ctx->idx;
+ /* Reserve space for the move_imm + jirl instruction */
+ for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
+ emit_insn(ctx, nop);
+ }
+
+ for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
+ int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
+ *branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
+ }
+
+ for (i = 0; i < fexit->nr_links; i++) {
+ ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
+ run_ctx_off, false);
+ if (ret)
+ goto out;
+ }
+
+ if (flags & BPF_TRAMP_F_CALL_ORIG) {
+ im->ip_epilogue = ctx->ro_image + ctx->idx;
+ move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
+ ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
+ if (ret)
+ goto out;
+ }
+
+ if (flags & BPF_TRAMP_F_RESTORE_REGS)
+ restore_args(ctx, m->nr_args, args_off);
+
+ if (save_ret) {
+ emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
+ emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
+ }
+
+ emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
+
+ /* trampoline called from function entry */
+ emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
+
+ emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
+
+ if (flags & BPF_TRAMP_F_SKIP_FRAME)
+ /* return to parent function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
+ else
+ /* return to traced function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
+
+ ret = ctx->idx;
+out:
+ kfree(branches);
+
+ return ret;
+}
+
+int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
+ void *ro_image_end, const struct btf_func_model *m,
+ u32 flags, struct bpf_tramp_links *tlinks,
+ void *func_addr)
+{
+ int ret;
+ void *image, *tmp;
+ struct jit_ctx ctx;
+ u32 size = ro_image_end - ro_image;
+
+ image = kvmalloc(size, GFP_KERNEL);
+ if (!image)
+ return -ENOMEM;
+
+ ctx.image = (union loongarch_instruction *)image;
+ ctx.ro_image = (union loongarch_instruction *)ro_image;
+ ctx.idx = 0;
+
+ jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
+ ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
+ if (ret > 0 && validate_code(&ctx) < 0) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ tmp = bpf_arch_text_copy(ro_image, image, size);
+ if (IS_ERR(tmp)) {
+ ret = PTR_ERR(tmp);
+ goto out;
+ }
+
+ bpf_flush_icache(ro_image, ro_image_end);
+out:
+ kvfree(image);
+ return ret < 0 ? ret : size;
+}
+
+int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
+ struct bpf_tramp_links *tlinks, void *func_addr)
+{
+ struct bpf_tramp_image im;
+ struct jit_ctx ctx;
+ int ret;
+
+ ctx.image = NULL;
+ ctx.idx = 0;
+
+ ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
+
+ /* Page align */
+ return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
+}
diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
index f9c569f53..5697158fd 100644
--- a/arch/loongarch/net/bpf_jit.h
+++ b/arch/loongarch/net/bpf_jit.h
@@ -18,6 +18,7 @@ struct jit_ctx {
u32 *offset;
int num_exentries;
union loongarch_instruction *image;
+ union loongarch_instruction *ro_image;
u32 stack_size;
};
@@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
return -EINVAL;
}
+
+static inline void bpf_flush_icache(void *start, void *end)
+{
+ flush_icache_range((unsigned long)start, (unsigned long)end);
+}
--
2.25.1
* [PATCH v5 5/5] LoongArch: BPF: Add struct ops support for trampoline
2025-07-30 13:12 [PATCH v5 0/5] Support trampoline for LoongArch Chenghao Duan
` (3 preceding siblings ...)
2025-07-30 13:12 ` [PATCH v5 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch Chenghao Duan
@ 2025-07-30 13:12 ` Chenghao Duan
2025-08-01 5:21 ` [PATCH v5 0/5] Support trampoline for LoongArch Vincent Li
5 siblings, 0 replies; 35+ messages in thread
From: Chenghao Duan @ 2025-07-30 13:12 UTC
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, duanchenghao, youling.tang, jianghaoran,
vincent.mc.li, geliang
From: Tiezhu Yang <yangtiezhu@loongson.cn>
Use the BPF_TRAMP_F_INDIRECT flag to detect struct ops and emit the proper
prologue and epilogue for this case.
With this patch, all of the struct_ops related testcases (except
struct_ops_multi_pages) pass on LoongArch.
The testcase struct_ops_multi_pages fails because the actual
image_pages_cnt is 40, which is bigger than MAX_TRAMP_IMAGE_PAGES.
Before:
$ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
...
WATCHDOG: test case struct_ops_module/struct_ops_load executes for 10 seconds...
After:
$ sudo ./test_progs -t struct_ops -d struct_ops_multi_pages
...
#15 bad_struct_ops:OK
...
#399 struct_ops_autocreate:OK
...
#400 struct_ops_kptr_return:OK
...
#401 struct_ops_maybe_null:OK
...
#402 struct_ops_module:OK
...
#404 struct_ops_no_cfi:OK
...
#405 struct_ops_private_stack:SKIP
...
#406 struct_ops_refcounted:OK
Summary: 8/25 PASSED, 3 SKIPPED, 0 FAILED
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
arch/loongarch/net/bpf_jit.c | 71 ++++++++++++++++++++++++------------
1 file changed, 47 insertions(+), 24 deletions(-)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index eddf582e4..725c2d5ee 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -1610,6 +1610,7 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+ bool is_struct_ops = flags & BPF_TRAMP_F_INDIRECT;
int ret, save_ret;
void *orig_call = func_addr;
u32 **branches = NULL;
@@ -1685,18 +1686,31 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
stack_size = round_up(stack_size, 16);
- /* For the trampoline called from function entry */
- /* RA and FP for parent function*/
- emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
- emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
- emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
- emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
-
- /* RA and FP for traced function*/
- emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
- emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
- emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
- emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
+ if (!is_struct_ops) {
+ /*
+ * For the trampoline called from function entry,
+ * the frame of traced function and the frame of
+ * trampoline need to be considered.
+ */
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
+ emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
+
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
+ emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
+ } else {
+ /*
+ * For the trampoline called directly, just handle
+ * the frame of trampoline.
+ */
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
+ emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
+ }
/* callee saved register S1 to pass start time */
emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
@@ -1786,21 +1800,30 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
- /* trampoline called from function entry */
- emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
- emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
- emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
+ if (!is_struct_ops) {
+ /* trampoline called from function entry */
+ emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
+
+ emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
- emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
- emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
- emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
+ if (flags & BPF_TRAMP_F_SKIP_FRAME)
+ /* return to parent function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
+ else
+ /* return to traced function */
+ emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
+ } else {
+ /* trampoline called directly */
+ emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, stack_size - 8);
+ emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
+ emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
- if (flags & BPF_TRAMP_F_SKIP_FRAME)
- /* return to parent function */
emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
- else
- /* return to traced function */
- emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
+ }
ret = ctx->idx;
out:
--
2.25.1
* Re: [PATCH v5 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers
2025-07-30 13:12 ` [PATCH v5 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers Chenghao Duan
@ 2025-07-31 1:41 ` Hengqi Chen
0 siblings, 0 replies; 35+ messages in thread
From: Hengqi Chen @ 2025-07-31 1:41 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, geliang, Youling Tang
On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Add larch_insn_gen_beq() and larch_insn_gen_bne() helpers which will
> be used in BPF trampoline implementation.
>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Co-developed-by: Youling Tang <tangyouling@kylinos.cn>
> Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/include/asm/inst.h | 2 ++
> arch/loongarch/kernel/inst.c | 28 ++++++++++++++++++++++++++++
> 2 files changed, 30 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 3089785ca..2ae96a35d 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -511,6 +511,8 @@ u32 larch_insn_gen_lu12iw(enum loongarch_gpr rd, int imm);
> u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm);
> u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
> u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
> +u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
> +u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
>
> static inline bool signed_imm_check(long val, unsigned int bit)
> {
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index 14d7d700b..674e3b322 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -336,3 +336,31 @@ u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
>
> return insn.word;
> }
> +
> +u32 larch_insn_gen_beq(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
> +{
> + union loongarch_instruction insn;
> +
> + if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
> + pr_warn("The generated beq instruction is out of range.\n");
> + return INSN_BREAK;
> + }
> +
> + emit_beq(&insn, rj, rd, imm >> 2);
> +
> + return insn.word;
> +}
> +
> +u32 larch_insn_gen_bne(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm)
> +{
> + union loongarch_instruction insn;
> +
> + if ((imm & 3) || imm < -SZ_128K || imm >= SZ_128K) {
> + pr_warn("The generated bne instruction is out of range.\n");
> + return INSN_BREAK;
> + }
> +
> + emit_bne(&insn, rj, rd, imm >> 2);
> +
> + return insn.word;
> +}
> --
> 2.25.1
>
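
The range check in larch_insn_gen_beq() and larch_insn_gen_bne() quoted
above can be tried out in userspace with the sketch below. It assumes that
the branch offset is stored as a 16-bit signed count of instructions, which
is what the 4-byte alignment test and the [-128K, 128K) bound suggest. The
names branch_offset_ok(), branch_offset_field() and SKETCH_SZ_128K are
hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SZ_128K (128 * 1024)

/*
 * A byte offset is accepted only if it is a multiple of 4 and lies
 * in [-128K, 128K), the same condition the quoted helpers test.
 */
static bool branch_offset_ok(int imm)
{
	return !(imm & 3) && imm >= -SKETCH_SZ_128K && imm < SKETCH_SZ_128K;
}

/*
 * Convert a valid byte offset into the assumed 16-bit field value,
 * which counts instructions rather than bytes.
 */
static uint16_t branch_offset_field(int imm)
{
	assert(branch_offset_ok(imm));
	return (uint16_t)(imm / 4);
}
```

An offset such as 6 is rejected because it is not instruction aligned, and
131072 is rejected because it is one instruction past the largest forward
branch.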
* Re: [PATCH v5 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx
2025-07-30 13:12 ` [PATCH v5 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
@ 2025-07-31 1:44 ` Hengqi Chen
0 siblings, 0 replies; 35+ messages in thread
From: Hengqi Chen @ 2025-07-31 1:44 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, geliang
On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> Rename the existing validate_code() to validate_ctx()
> Factor out the code validation handling into a new helper validate_code()
>
> * validate_code is used to check the validity of code.
> * validate_ctx is used to check both code validity and table entry
> correctness.
>
> The new validate_code() will be used in subsequent changes.
>
I still feel uncomfortable about the subject line.
I hope Huacai can rephrase it when applying.
Other than that,
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/net/bpf_jit.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index fa1500d4a..7032f11d3 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -1180,6 +1180,14 @@ static int validate_code(struct jit_ctx *ctx)
> return -1;
> }
>
> + return 0;
> +}
> +
> +static int validate_ctx(struct jit_ctx *ctx)
> +{
> + if (validate_code(ctx))
> + return -1;
> +
> if (WARN_ON_ONCE(ctx->num_exentries != ctx->prog->aux->num_exentries))
> return -1;
>
> @@ -1288,7 +1296,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
> build_epilogue(&ctx);
>
> /* 3. Extra pass to validate JITed code */
> - if (validate_code(&ctx)) {
> + if (validate_ctx(&ctx)) {
> bpf_jit_binary_free(header);
> prog = orig_prog;
> goto out_offset;
> --
> 2.25.1
>
* Re: [PATCH v5 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-30 13:12 ` [PATCH v5 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch Chenghao Duan
@ 2025-07-31 2:17 ` Chenghao Duan
2025-08-01 8:04 ` Huacai Chen
2025-08-03 14:17 ` Huacai Chen
1 sibling, 1 reply; 35+ messages in thread
From: Chenghao Duan @ 2025-07-31 2:17 UTC (permalink / raw)
To: ast, daniel, andrii, yangtiezhu, hengqi.chen, chenhuacai
Cc: martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, youling.tang, jianghaoran, vincent.mc.li, geliang,
kernel test robot
On Wed, Jul 30, 2025 at 09:12:56PM +0800, Chenghao Duan wrote:
> BPF trampoline is a critical piece of infrastructure in the BPF subsystem,
> acting as a mediator between kernel functions and BPF programs. Numerous
> important features, such as using BPF programs for zero-overhead kernel
> introspection, rely on this key component.
>
> The related tests have passed, including the following technical points:
> 1. fentry
> 2. fmod_ret
> 3. fexit
>
> The following related testcases passed on LoongArch:
> sudo ./test_progs -a fentry_test/fentry
> sudo ./test_progs -a fexit_test/fexit
> sudo ./test_progs -a fentry_fexit
> sudo ./test_progs -a modify_return
> sudo ./test_progs -a fexit_sleep
> sudo ./test_progs -a test_overhead
> sudo ./test_progs -a trampoline_count
Hi Teacher Huacai,
If no code modifications are needed, please help add the following
commit log proposed by Teacher Geliang. If code modifications are
required, I will add it in the next version.
'''
This issue was first reported by Geliang Tang in June 2024 while
debugging MPTCP BPF selftests on a LoongArch machine (see commit
eef0532e900c "selftests/bpf: Null checks for links in bpf_tcp_ca").
Geliang, Huacai, and Tiezhu then worked together to drive the
implementation of this feature, encouraging broader collaboration among
Chinese kernel engineers.
'''
This log was proposed at:
https://lore.kernel.org/all/828dd09de3b86f81c8f25130ae209d0d12b0fd9f.camel@kernel.org/
>
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
> Reported-by: Geliang Tang <geliang@kernel.org>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> Tested-by: Vincent Li <vincent.mc.li@gmail.com>
> ---
> arch/loongarch/net/bpf_jit.c | 390 +++++++++++++++++++++++++++++++++++
> arch/loongarch/net/bpf_jit.h | 6 +
> 2 files changed, 396 insertions(+)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 5e6ae7e0e..eddf582e4 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -7,9 +7,15 @@
> #include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_MAX_REG_ARGS 8
> +
> #define LOONGARCH_LONG_JUMP_NINSNS 5
> #define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
>
> +#define LOONGARCH_FENTRY_NINSNS 2
> +#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
> +#define LOONGARCH_BPF_FENTRY_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -1407,6 +1413,11 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> (unsigned long)target);
> }
>
> +static int emit_call(struct jit_ctx *ctx, u64 addr)
> +{
> + return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, addr);
> +}
> +
> int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> void *old_addr, void *new_addr)
> {
> @@ -1471,3 +1482,382 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
>
> return dst;
> }
> +
> +static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> + int i;
> +
> + for (i = 0; i < nargs; i++) {
> + emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> + args_off -= 8;
> + }
> +}
> +
> +static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> + int i;
> +
> + for (i = 0; i < nargs; i++) {
> + emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> + args_off -= 8;
> + }
> +}
> +
> +static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
> + int args_off, int retval_off,
> + int run_ctx_off, bool save_ret)
> +{
> + int ret;
> + u32 *branch;
> + struct bpf_prog *p = l->link.prog;
> + int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
> +
> + if (l->cookie) {
> + move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
> + } else {
> + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
> + -run_ctx_off + cookie_off);
> + }
> +
> + /* arg1: prog */
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> + /* arg2: &run_ctx */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
> + ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
> + if (ret)
> + return ret;
> +
> + /* store prog start time */
> + move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
> +
> + /* if (__bpf_prog_enter(prog) == 0)
> + * goto skip_exec_of_prog;
> + *
> + */
> + branch = (u32 *)ctx->image + ctx->idx;
> + /* nop reserved for conditional jump */
> + emit_insn(ctx, nop);
> +
> + /* arg1: &args_off */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
> + if (!p->jited)
> + move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
> + ret = emit_call(ctx, (const u64)p->bpf_func);
> + if (ret)
> + return ret;
> +
> + if (save_ret) {
> + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + }
> +
> + /* update branch with beqz */
> + if (ctx->image) {
> + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
> + *branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
> + }
> +
> + /* arg1: prog */
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> + /* arg2: prog start time */
> + move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
> + /* arg3: &run_ctx */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
> + ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
> +
> + return ret;
> +}
> +
> +static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
> + int args_off, int retval_off, int run_ctx_off, u32 **branches)
> +{
> + int i;
> +
> + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
> + for (i = 0; i < tl->nr_links; i++) {
> + invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
> + run_ctx_off, true);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
> + branches[i] = (u32 *)ctx->image + ctx->idx;
> + emit_insn(ctx, nop);
> + }
> +}
> +
> +u64 bpf_jit_alloc_exec_limit(void)
> +{
> + return VMALLOC_END - VMALLOC_START;
> +}
> +
> +void *arch_alloc_bpf_trampoline(unsigned int size)
> +{
> + return bpf_prog_pack_alloc(size, jit_fill_hole);
> +}
> +
> +void arch_free_bpf_trampoline(void *image, unsigned int size)
> +{
> + bpf_prog_pack_free(image, size);
> +}
> +
> +static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
> + const struct btf_func_model *m,
> + struct bpf_tramp_links *tlinks,
> + void *func_addr, u32 flags)
> +{
> + int i;
> + int stack_size = 0, nargs = 0;
> + int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
> + struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> + struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> + struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> + int ret, save_ret;
> + void *orig_call = func_addr;
> + u32 **branches = NULL;
> +
> + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> + return -ENOTSUPP;
> +
> + /*
> + * FP + 8 [ RA to parent func ] return address to parent
> + * function
> + * FP + 0 [ FP of parent func ] frame pointer of parent
> + * function
> + * FP - 8 [ T0 to traced func ] return address of traced
> + * function
> + * FP - 16 [ FP of traced func ] frame pointer of traced
> + * function
> + *
> + * FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
> + * BPF_TRAMP_F_RET_FENTRY_RET
> + * [ argN ]
> + * [ ... ]
> + * FP - args_off [ arg1 ]
> + *
> + * FP - nargs_off [ regs count ]
> + *
> + * FP - ip_off [ traced func ] BPF_TRAMP_F_IP_ARG
> + *
> + * FP - run_ctx_off [ bpf_tramp_run_ctx ]
> + *
> + * FP - sreg_off [ callee saved reg ]
> + *
> + */
> +
> + if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
> + return -ENOTSUPP;
> +
> + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> + return -ENOTSUPP;
> +
> + stack_size = 0;
> +
> + /* room of trampoline frame to store return address and frame pointer */
> + stack_size += 16;
> +
> + save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
> + if (save_ret) {
> + /* Save BPF R0 and A0 */
> + stack_size += 16;
> + retval_off = stack_size;
> + }
> +
> + /* room of trampoline frame to store args */
> + nargs = m->nr_args;
> + stack_size += nargs * 8;
> + args_off = stack_size;
> +
> + /* room of trampoline frame to store args number */
> + stack_size += 8;
> + nargs_off = stack_size;
> +
> + /* room of trampoline frame to store ip address */
> + if (flags & BPF_TRAMP_F_IP_ARG) {
> + stack_size += 8;
> + ip_off = stack_size;
> + }
> +
> + /* room of trampoline frame to store struct bpf_tramp_run_ctx */
> + stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
> + run_ctx_off = stack_size;
> +
> + stack_size += 8;
> + sreg_off = stack_size;
> +
> + stack_size = round_up(stack_size, 16);
> +
> + /* For the trampoline called from function entry */
> + /* RA and FP for parent function*/
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> +
> + /* RA and FP for traced function*/
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> + emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> +
> + /* callee saved register S1 to pass start time */
> + emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> + /* store ip address of the traced function */
> + if (flags & BPF_TRAMP_F_IP_ARG) {
> + move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
> + }
> +
> + /* store nargs number*/
> + move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
> +
> + store_args(ctx, nargs, args_off);
> +
> + /* To traced function */
> + /* Ftrace jump skips 2 NOP instructions */
> + if (is_kernel_text((unsigned long)orig_call))
> + orig_call += LOONGARCH_FENTRY_NBYTES;
> + /* Direct jump skips 5 NOP instructions */
> + else if (is_bpf_text_address((unsigned long)orig_call))
> + orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> + ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
> + if (ret)
> + return ret;
> + }
> +
> + for (i = 0; i < fentry->nr_links; i++) {
> + ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
> + run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
> + if (ret)
> + return ret;
> + }
> + if (fmod_ret->nr_links) {
> + branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
> + if (!branches)
> + return -ENOMEM;
> +
> + invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
> + run_ctx_off, branches);
> + }
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + restore_args(ctx, m->nr_args, args_off);
> + ret = emit_call(ctx, (const u64)orig_call);
> + if (ret)
> + goto out;
> + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + im->ip_after_call = ctx->ro_image + ctx->idx;
> + /* Reserve space for the move_imm + jirl instruction */
> + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> + emit_insn(ctx, nop);
> + }
> +
> + for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
> + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
> + *branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
> + }
> +
> + for (i = 0; i < fexit->nr_links; i++) {
> + ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
> + run_ctx_off, false);
> + if (ret)
> + goto out;
> + }
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + im->ip_epilogue = ctx->ro_image + ctx->idx;
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> + ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
> + if (ret)
> + goto out;
> + }
> +
> + if (flags & BPF_TRAMP_F_RESTORE_REGS)
> + restore_args(ctx, m->nr_args, args_off);
> +
> + if (save_ret) {
> + emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + }
> +
> + emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> + /* trampoline called from function entry */
> + emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> +
> + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> +
> + if (flags & BPF_TRAMP_F_SKIP_FRAME)
> + /* return to parent function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> + else
> + /* return to traced function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> +
> + ret = ctx->idx;
> +out:
> + kfree(branches);
> +
> + return ret;
> +}
> +
> +int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
> + void *ro_image_end, const struct btf_func_model *m,
> + u32 flags, struct bpf_tramp_links *tlinks,
> + void *func_addr)
> +{
> + int ret;
> + void *image, *tmp;
> + struct jit_ctx ctx;
> + u32 size = ro_image_end - ro_image;
> +
> + image = kvmalloc(size, GFP_KERNEL);
> + if (!image)
> + return -ENOMEM;
> +
> + ctx.image = (union loongarch_instruction *)image;
> + ctx.ro_image = (union loongarch_instruction *)ro_image;
> + ctx.idx = 0;
> +
> + jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
> + ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
> + if (ret > 0 && validate_code(&ctx) < 0) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + tmp = bpf_arch_text_copy(ro_image, image, size);
> + if (IS_ERR(tmp)) {
> + ret = PTR_ERR(tmp);
> + goto out;
> + }
> +
> + bpf_flush_icache(ro_image, ro_image_end);
> +out:
> + kvfree(image);
> + return ret < 0 ? ret : size;
> +}
> +
> +int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
> + struct bpf_tramp_links *tlinks, void *func_addr)
> +{
> + struct bpf_tramp_image im;
> + struct jit_ctx ctx;
> + int ret;
> +
> + ctx.image = NULL;
> + ctx.idx = 0;
> +
> + ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
> +
> + /* Page align */
> + return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
> +}
> diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
> index f9c569f53..5697158fd 100644
> --- a/arch/loongarch/net/bpf_jit.h
> +++ b/arch/loongarch/net/bpf_jit.h
> @@ -18,6 +18,7 @@ struct jit_ctx {
> u32 *offset;
> int num_exentries;
> union loongarch_instruction *image;
> + union loongarch_instruction *ro_image;
> u32 stack_size;
> };
>
> @@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
>
> return -EINVAL;
> }
> +
> +static inline void bpf_flush_icache(void *start, void *end)
> +{
> + flush_icache_range((unsigned long)start, (unsigned long)end);
> +}
> --
> 2.25.1
>
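
The frame layout comment and the offset bookkeeping in
__arch_prepare_bpf_trampoline() quoted above can be checked with a small
userspace sketch. It follows the same order of additions as the patch. The
struct tramp_layout type, the tramp_layout_calc() helper and the
run_ctx_size parameter (a stand-in for sizeof(struct bpf_tramp_run_ctx),
whose real value is not stated in this thread) are assumptions for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Offsets below FP, as in the layout comment of the patch. */
struct tramp_layout {
	int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
	int stack_size;
};

/* Round x up to a multiple of align (align must be a power of two). */
static int round_up_int(int x, int align)
{
	return (x + align - 1) & ~(align - 1);
}

static struct tramp_layout tramp_layout_calc(int nargs, bool save_ret,
					     bool ip_arg, int run_ctx_size)
{
	struct tramp_layout l = { 0 };
	int stack_size = 16;			/* return address + frame pointer */

	if (save_ret) {				/* BPF R0 and A0 */
		stack_size += 16;
		l.retval_off = stack_size;
	}
	stack_size += nargs * 8;		/* saved argument registers */
	l.args_off = stack_size;
	stack_size += 8;			/* number of arguments */
	l.nargs_off = stack_size;
	if (ip_arg) {				/* address of the traced function */
		stack_size += 8;
		l.ip_off = stack_size;
	}
	stack_size += round_up_int(run_ctx_size, 8);
	l.run_ctx_off = stack_size;
	stack_size += 8;			/* callee saved register S1 */
	l.sreg_off = stack_size;
	l.stack_size = round_up_int(stack_size, 16);
	return l;
}
```

For example, with three arguments, a saved return value, no IP argument and
an assumed 32-byte run context, the offsets come out as retval_off = 32,
args_off = 56, nargs_off = 64, run_ctx_off = 96 and sreg_off = 104, and the
frame is rounded up to 112 bytes.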
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-01 5:21 ` [PATCH v5 0/5] Support trampoline for LoongArch Vincent Li
@ 2025-08-01 2:00 ` Vincent Li
2025-08-02 9:19 ` Tiezhu Yang
1 sibling, 0 replies; 35+ messages in thread
From: Vincent Li @ 2025-08-01 2:00 UTC (permalink / raw)
To: Chenghao Duan
Cc: yangtiezhu, hengqi.chen, chenhuacai, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Thu, Jul 31, 2025 at 10:21 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> Hi Chenghao,
>
> I trimmed the email recipients down to the loongarch mailing list and
> the folks who might pay attention to this; I personally don't like to
> bother other people who may not be interested in this :). Folks, let me
> know if this is not OK. Anyway, please check my bpf selftest results
> inline. The fentry_attach_stress test results in a kernel lockup.
>
> On Wed, Jul 30, 2025 at 6:13 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > v5:
> > 1. Modify the internal implementation of larch_insn_text_copy by
> > removing the while loop processing. There is a while loop inside
> > copy_to_kernel_nofault that handles and copies all data.
> >
> > 2. text_mutex has been added to all usage contexts of
> > larch_insn_text_copy, and the relevant tests have passed.
> >
> > -----------------------------------------------------------------------
> > Historical Version:
> > v4:
> > 1. Delete the #3 patch of version V3.
> >
> > 2. Add 5 NOP instructions in build_prologue().
> > Reserve space for the move_imm + jirl instruction.
> >
> > 3. Differentiate between direct jumps and ftrace jumps of trampoline:
> > direct jumps skip 5 instructions.
> > ftrace jumps skip 2 instructions.
> >
> > 4. Remove the generation of BL jump instructions in emit_jump_and_link().
> > After the trampoline ends, it will jump to the specified register.
> > The BL instruction writes PC+4 to r1 instead of allowing the
> > specification of rd.
> >
> > URL for version v4:
> > https://lore.kernel.org/all/20250724141929.691853-1-duanchenghao@kylinos.cn/
> > ---------
> > v3:
> > 1. Patch 0003 adds EXECMEM_BPF memory type to the execmem subsystem.
> >
> > 2. Align the size calculated by arch_bpf_trampoline_size to page
> > boundaries.
> >
> > 3. Add the flush icache operation to larch_insn_text_copy.
> >
> > 4. Unify the implementation of bpf_arch_xxx into the patch
> > "0004-LoongArch-BPF-Add-bpf_arch_xxxxx-support-for-Loong.patch".
> >
> > 5. Change the patch order. Move the patch
> > "0002-LoongArch-BPF-Update-the-code-to-rename-validate_.patch" before
> > "0005-LoongArch-BPF-Add-bpf-trampoline-support-for-Loon.patch".
> >
> > URL for version v3:
> > https://lore.kernel.org/all/20250709055029.723243-1-duanchenghao@kylinos.cn/
> > ---------
> > v2:
> > 1. Change the fixmap in the instruction copy function to set_memory_xxx.
> >
> > 2. Change the implementation method of the following code.
> > - arch_alloc_bpf_trampoline
> > - arch_free_bpf_trampoline
> > Use the BPF core's allocation and free functions.
> >
> > - bpf_arch_text_invalidate
> > Operate with the function larch_insn_text_copy that carries
> > memory attribute modifications.
> >
> > 3. Correct the incorrect code formatting.
> >
> > URL for version v2:
> > https://lore.kernel.org/all/20250618105048.1510560-1-duanchenghao@kylinos.cn/
> > ---------
> > v1:
> > Support trampoline for LoongArch. The following feature tests have been
> > completed:
> > 1. fentry
> > 2. fexit
> > 3. fmod_ret
> >
>
> I used ./test_progs --list to select the following tests, and
> fentry_attach_stress caused a kernel lockup, so I had to power reset
> the machine.
>
> fentry_attach_btf_presence
>
> fentry_attach_stress
>
> fentry_fexit
>
> fentry_test
>
> fexit_bpf2bpf
>
> fexit_noreturns
>
> fexit_sleep
>
> fexit_stress
>
> fexit_test
>
>
> [root@fedora bpf]# ./test_progs -a fexit_test
> #112/1 fexit_test/fexit:OK
> fexit_many_args:PASS:fexit_many_args_skel_load 0 nsec
> libbpf: prog 'test2': failed to attach: -ENOTSUPP
> libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
> fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524 (errno 524)
> #112/2 fexit_test/fexit_many_args:FAIL
> #112 fexit_test:FAIL
>
> All error logs:
> fexit_many_args:PASS:fexit_many_args_skel_load 0 nsec
> libbpf: prog 'test2': failed to attach: -ENOTSUPP
> libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
> fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524 (errno 524)
> #112/2 fexit_test/fexit_many_args:FAIL
> #112 fexit_test:FAIL
> Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED
>
> [root@fedora bpf]# ./test_progs -a fexit_stress
> #111 fexit_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fexit_sleep
> #110 fexit_sleep:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
>
> [root@fedora bpf]# ./test_progs -a fexit_noreturns
> #109/1 fexit_noreturns/noreturns:OK
> #109 fexit_noreturns:OK
> Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fexit_bpf2bpf
> #108/1 fexit_bpf2bpf/target_no_callees:OK
> #108/2 fexit_bpf2bpf/target_yes_callees:OK
> #108/3 fexit_bpf2bpf/func_replace:OK
> #108/4 fexit_bpf2bpf/func_replace_verify:OK
> #108/5 fexit_bpf2bpf/func_sockmap_update:OK
> #108/6 fexit_bpf2bpf/func_replace_return_code:OK
> #108/7 fexit_bpf2bpf/func_map_prog_compatibility:OK
> #108/8 fexit_bpf2bpf/func_replace_unreliable:OK
> #108/9 fexit_bpf2bpf/func_replace_multi:OK
> #108/10 fexit_bpf2bpf/fmod_ret_freplace:OK
> #108/11 fexit_bpf2bpf/func_replace_global_func:OK
> (cgroup_helpers.c:100: errno: Invalid argument) Enabling controller
> cpu: /mnt/cgroup.subtree_control
> #108 fexit_bpf2bpf: Failed to setup cgroup environment
> test_fentry_to_cgroup_bpf:FAIL:cgroup_fd unexpected cgroup_fd: actual
> -1 < expected 0
> #108/12 fexit_bpf2bpf/fentry_to_cgroup_bpf:FAIL
> #108/13 fexit_bpf2bpf/func_replace_progmap:OK
> #108 fexit_bpf2bpf:FAIL
>
> All error logs:
> (cgroup_helpers.c:100: errno: Invalid argument) Enabling controller
> cpu: /mnt/cgroup.subtree_control
> #108 fexit_bpf2bpf: Failed to setup cgroup environment
> test_fentry_to_cgroup_bpf:FAIL:cgroup_fd unexpected cgroup_fd: actual
> -1 < expected 0
> #108/12 fexit_bpf2bpf/fentry_to_cgroup_bpf:FAIL
> #108 fexit_bpf2bpf:FAIL
> Summary: 0/12 PASSED, 0 SKIPPED, 1 FAILED
>
> [root@fedora bpf]# ./test_progs -a fentry_test
> #107/1 fentry_test/fentry:OK
> fentry_many_args:PASS:fentry_many_args_skel_load 0 nsec
> libbpf: prog 'test2': failed to attach: -ENOTSUPP
> libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
> fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524 (errno 524)
> #107/2 fentry_test/fentry_many_args:FAIL
> #107 fentry_test:FAIL
>
> All error logs:
> fentry_many_args:PASS:fentry_many_args_skel_load 0 nsec
> libbpf: prog 'test2': failed to attach: -ENOTSUPP
> libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
> fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524 (errno 524)
> #107/2 fentry_test/fentry_many_args:FAIL
> #107 fentry_test:FAIL
> Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_fexit
> #106 fentry_fexit:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
>
> [root@fedora bpf]# ./test_progs -a fentry_attach_btf_presence
> #104 fentry_attach_btf_presence:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
>
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress. <-----kernel
> locked up after this
> client_loop: send disconnect: Broken pipe
>
The fentry_attach_stress test seems to trigger a race condition in
bpf_arch_text_poke(); see
commit d459dbbbfa323165849451edb9690a933c210bac
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Wed Jul 16 21:35:07 2025 +0200
selftests/bpf: Stress test attaching a BPF prog to another BPF prog
Add a test that invokes a BPF prog in a loop, while concurrently
attaching and detaching another BPF prog to and from it. This helps
identifying race conditions in bpf_arch_text_poke().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250716194524.48109-3-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>
> > TODO: The support for the struct_ops feature will be provided in
> > subsequent patches.
> >
> > URL for version v1:
> > https://lore.kernel.org/all/20250611035952.111182-1-duanchenghao@kylinos.cn/
> > -----------------------------------------------------------------------
> >
> > Chenghao Duan (4):
> > LoongArch: Add larch_insn_gen_{beq,bne} helpers
> > LoongArch: BPF: Update the code to rename validate_code to
> > validate_ctx
> > LoongArch: BPF: Implement dynamic code modification support
> > LoongArch: BPF: Add bpf trampoline support for Loongarch
> >
> > Tiezhu Yang (1):
> > LoongArch: BPF: Add struct ops support for trampoline
> >
> > arch/loongarch/include/asm/inst.h | 3 +
> > arch/loongarch/kernel/inst.c | 54 +++
> > arch/loongarch/net/bpf_jit.c | 527 +++++++++++++++++++++++++++++-
> > arch/loongarch/net/bpf_jit.h | 6 +
> > 4 files changed, 589 insertions(+), 1 deletion(-)
> >
> > --
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-07-30 13:12 [PATCH v5 0/5] Support trampoline for LoongArch Chenghao Duan
` (4 preceding siblings ...)
2025-07-30 13:12 ` [PATCH v5 5/5] LoongArch: BPF: Add struct ops support for trampoline Chenghao Duan
@ 2025-08-01 5:21 ` Vincent Li
2025-08-01 2:00 ` Vincent Li
2025-08-02 9:19 ` Tiezhu Yang
5 siblings, 2 replies; 35+ messages in thread
From: Vincent Li @ 2025-08-01 5:21 UTC (permalink / raw)
To: Chenghao Duan
Cc: yangtiezhu, hengqi.chen, chenhuacai, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
Hi Chenghao,
I trimmed the recipients down to the loongarch mailing list and the
folks who might pay attention to this; I personally don't like to
bother people who may not be interested :). Folks, let me know if
this is not OK. Anyway, please check my BPF selftest results inline.
The fentry_attach_stress test results in a kernel lockup.
On Wed, Jul 30, 2025 at 6:13 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> v5:
> 1. Modify the internal implementation of larch_insn_text_copy by
> removing the while loop processing. There is a while loop inside
> copy_to_kernel_nofault that handles and copies all data.
>
> 2. text_mutex has been added to all usage contexts of
> larch_insn_text_copy, and the relevant tests have passed.
>
> -----------------------------------------------------------------------
> Historical Version:
> v4:
> 1. Delete the #3 patch of version V3.
>
> 2. Add 5 NOP instructions in build_prologue().
> Reserve space for the move_imm + jirl instruction.
>
> 3. Differentiate between direct jumps and ftrace jumps of trampoline:
> direct jumps skip 5 instructions.
> ftrace jumps skip 2 instructions.
>
> 4. Remove the generation of BL jump instructions in emit_jump_and_link().
> After the trampoline ends, it will jump to the specified register.
> The BL instruction writes PC+4 to r1 instead of allowing the
> specification of rd.
>
> URL for version v4:
> https://lore.kernel.org/all/20250724141929.691853-1-duanchenghao@kylinos.cn/
> ---------
> v3:
> 1. Patch 0003 adds EXECMEM_BPF memory type to the execmem subsystem.
>
> 2. Align the size calculated by arch_bpf_trampoline_size to page
> boundaries.
>
> 3. Add the flush icache operation to larch_insn_text_copy.
>
> 4. Unify the implementation of bpf_arch_xxx into the patch
> "0004-LoongArch-BPF-Add-bpf_arch_xxxxx-support-for-Loong.patch".
>
> 5. Change the patch order. Move the patch
> "0002-LoongArch-BPF-Update-the-code-to-rename-validate_.patch" before
> "0005-LoongArch-BPF-Add-bpf-trampoline-support-for-Loon.patch".
>
> URL for version v3:
> https://lore.kernel.org/all/20250709055029.723243-1-duanchenghao@kylinos.cn/
> ---------
> v2:
> 1. Change the fixmap in the instruction copy function to set_memory_xxx.
>
> 2. Change the implementation method of the following code.
> - arch_alloc_bpf_trampoline
> - arch_free_bpf_trampoline
> Use the BPF core's allocation and free functions.
>
> - bpf_arch_text_invalidate
> Operate with the function larch_insn_text_copy that carries
> memory attribute modifications.
>
> 3. Correct the incorrect code formatting.
>
> URL for version v2:
> https://lore.kernel.org/all/20250618105048.1510560-1-duanchenghao@kylinos.cn/
> ---------
> v1:
> Support trampoline for LoongArch. The following feature tests have been
> completed:
> 1. fentry
> 2. fexit
> 3. fmod_ret
>
I ran ./test_progs --list to select the following tests, and
fentry_attach_stress caused a kernel lockup, so I had to power-reset
the machine.
fentry_attach_btf_presence
fentry_attach_stress
fentry_fexit
fentry_test
fexit_bpf2bpf
fexit_noreturns
fexit_sleep
fexit_stress
fexit_test
[root@fedora bpf]# ./test_progs -a fexit_test
#112/1 fexit_test/fexit:OK
fexit_many_args:PASS:fexit_many_args_skel_load 0 nsec
libbpf: prog 'test2': failed to attach: -ENOTSUPP
libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524 (errno 524)
#112/2 fexit_test/fexit_many_args:FAIL
#112 fexit_test:FAIL
All error logs:
fexit_many_args:PASS:fexit_many_args_skel_load 0 nsec
libbpf: prog 'test2': failed to attach: -ENOTSUPP
libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524 (errno 524)
#112/2 fexit_test/fexit_many_args:FAIL
#112 fexit_test:FAIL
Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED
[root@fedora bpf]# ./test_progs -a fexit_stress
#111 fexit_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fexit_sleep
#110 fexit_sleep:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fexit_noreturns
#109/1 fexit_noreturns/noreturns:OK
#109 fexit_noreturns:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fexit_bpf2bpf
#108/1 fexit_bpf2bpf/target_no_callees:OK
#108/2 fexit_bpf2bpf/target_yes_callees:OK
#108/3 fexit_bpf2bpf/func_replace:OK
#108/4 fexit_bpf2bpf/func_replace_verify:OK
#108/5 fexit_bpf2bpf/func_sockmap_update:OK
#108/6 fexit_bpf2bpf/func_replace_return_code:OK
#108/7 fexit_bpf2bpf/func_map_prog_compatibility:OK
#108/8 fexit_bpf2bpf/func_replace_unreliable:OK
#108/9 fexit_bpf2bpf/func_replace_multi:OK
#108/10 fexit_bpf2bpf/fmod_ret_freplace:OK
#108/11 fexit_bpf2bpf/func_replace_global_func:OK
(cgroup_helpers.c:100: errno: Invalid argument) Enabling controller
cpu: /mnt/cgroup.subtree_control
#108 fexit_bpf2bpf: Failed to setup cgroup environment
test_fentry_to_cgroup_bpf:FAIL:cgroup_fd unexpected cgroup_fd: actual
-1 < expected 0
#108/12 fexit_bpf2bpf/fentry_to_cgroup_bpf:FAIL
#108/13 fexit_bpf2bpf/func_replace_progmap:OK
#108 fexit_bpf2bpf:FAIL
All error logs:
(cgroup_helpers.c:100: errno: Invalid argument) Enabling controller
cpu: /mnt/cgroup.subtree_control
#108 fexit_bpf2bpf: Failed to setup cgroup environment
test_fentry_to_cgroup_bpf:FAIL:cgroup_fd unexpected cgroup_fd: actual
-1 < expected 0
#108/12 fexit_bpf2bpf/fentry_to_cgroup_bpf:FAIL
#108 fexit_bpf2bpf:FAIL
Summary: 0/12 PASSED, 0 SKIPPED, 1 FAILED
[root@fedora bpf]# ./test_progs -a fentry_test
#107/1 fentry_test/fentry:OK
fentry_many_args:PASS:fentry_many_args_skel_load 0 nsec
libbpf: prog 'test2': failed to attach: -ENOTSUPP
libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524 (errno 524)
#107/2 fentry_test/fentry_many_args:FAIL
#107 fentry_test:FAIL
All error logs:
fentry_many_args:PASS:fentry_many_args_skel_load 0 nsec
libbpf: prog 'test2': failed to attach: -ENOTSUPP
libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524 (errno 524)
#107/2 fentry_test/fentry_many_args:FAIL
#107 fentry_test:FAIL
Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED
[root@fedora bpf]# ./test_progs -a fentry_fexit
#106 fentry_fexit:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_btf_presence
#104 fentry_attach_btf_presence:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress. <-----kernel
locked up after this
client_loop: send disconnect: Broken pipe
> TODO: The support for the struct_ops feature will be provided in
> subsequent patches.
>
> URL for version v1:
> https://lore.kernel.org/all/20250611035952.111182-1-duanchenghao@kylinos.cn/
> -----------------------------------------------------------------------
>
> Chenghao Duan (4):
> LoongArch: Add larch_insn_gen_{beq,bne} helpers
> LoongArch: BPF: Update the code to rename validate_code to
> validate_ctx
> LoongArch: BPF: Implement dynamic code modification support
> LoongArch: BPF: Add bpf trampoline support for Loongarch
>
> Tiezhu Yang (1):
> LoongArch: BPF: Add struct ops support for trampoline
>
> arch/loongarch/include/asm/inst.h | 3 +
> arch/loongarch/kernel/inst.c | 54 +++
> arch/loongarch/net/bpf_jit.c | 527 +++++++++++++++++++++++++++++-
> arch/loongarch/net/bpf_jit.h | 6 +
> 4 files changed, 589 insertions(+), 1 deletion(-)
>
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-31 2:17 ` Chenghao Duan
@ 2025-08-01 8:04 ` Huacai Chen
0 siblings, 0 replies; 35+ messages in thread
From: Huacai Chen @ 2025-08-01 8:04 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, geliang, kernel test robot
On Thu, Jul 31, 2025 at 10:18 AM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> On Wed, Jul 30, 2025 at 09:12:56PM +0800, Chenghao Duan wrote:
> > BPF trampoline is the critical infrastructure of the BPF subsystem, acting
> > as a mediator between kernel functions and BPF programs. Numerous important
> > features, such as using BPF program for zero overhead kernel introspection,
> > rely on this key component.
> >
> > The related tests have passed, Including the following technical points:
> > 1. fentry
> > 2. fmod_ret
> > 3. fexit
> >
> > The following related testcases passed on LoongArch:
> > sudo ./test_progs -a fentry_test/fentry
> > sudo ./test_progs -a fexit_test/fexit
> > sudo ./test_progs -a fentry_fexit
> > sudo ./test_progs -a modify_return
> > sudo ./test_progs -a fexit_sleep
> > sudo ./test_progs -a test_overhead
> > sudo ./test_progs -a trampoline_count
>
> Hi Teacher Huacai,
>
> If no code modifications are needed, please help add the following
> commit log proposed by Teacher Geliang. If code modifications are
> required, I will add it in the next version.
It probably needs a new version, since Vincent Li has reported a bug. Sadly.
Huacai
>
> '''
> This issue was first reported by Geliang Tang in June 2024 while
> debugging MPTCP BPF selftests on a LoongArch machine (see commit
> eef0532e900c "selftests/bpf: Null checks for links in bpf_tcp_ca").
> Geliang, Huacai, and Tiezhu then worked together to drive the
> implementation of this feature, encouraging broader collaboration among
> Chinese kernel engineers.
> '''
>
> This log was proposed at:
> https://lore.kernel.org/all/828dd09de3b86f81c8f25130ae209d0d12b0fd9f.camel@kernel.org/
>
> >
> > Reported-by: kernel test robot <lkp@intel.com>
> > Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
> > Reported-by: Geliang Tang <geliang@kernel.org>
> > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> > Tested-by: Vincent Li <vincent.mc.li@gmail.com>
> > ---
> > arch/loongarch/net/bpf_jit.c | 390 +++++++++++++++++++++++++++++++++++
> > arch/loongarch/net/bpf_jit.h | 6 +
> > 2 files changed, 396 insertions(+)
> >
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index 5e6ae7e0e..eddf582e4 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -7,9 +7,15 @@
> > #include <linux/memory.h>
> > #include "bpf_jit.h"
> >
> > +#define LOONGARCH_MAX_REG_ARGS 8
> > +
> > #define LOONGARCH_LONG_JUMP_NINSNS 5
> > #define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> >
> > +#define LOONGARCH_FENTRY_NINSNS 2
> > +#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
> > +#define LOONGARCH_BPF_FENTRY_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> > +
> > #define REG_TCC LOONGARCH_GPR_A6
> > #define TCC_SAVED LOONGARCH_GPR_S5
> >
> > @@ -1407,6 +1413,11 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > (unsigned long)target);
> > }
> >
> > +static int emit_call(struct jit_ctx *ctx, u64 addr)
> > +{
> > + return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, addr);
> > +}
> > +
> > int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > void *old_addr, void *new_addr)
> > {
> > @@ -1471,3 +1482,382 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> >
> > return dst;
> > }
> > +
> > +static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
> > +{
> > + int i;
> > +
> > + for (i = 0; i < nargs; i++) {
> > + emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> > + args_off -= 8;
> > + }
> > +}
> > +
> > +static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
> > +{
> > + int i;
> > +
> > + for (i = 0; i < nargs; i++) {
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> > + args_off -= 8;
> > + }
> > +}
> > +
> > +static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
> > + int args_off, int retval_off,
> > + int run_ctx_off, bool save_ret)
> > +{
> > + int ret;
> > + u32 *branch;
> > + struct bpf_prog *p = l->link.prog;
> > + int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
> > +
> > + if (l->cookie) {
> > + move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
> > + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
> > + } else {
> > + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
> > + -run_ctx_off + cookie_off);
> > + }
> > +
> > + /* arg1: prog */
> > + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> > + /* arg2: &run_ctx */
> > + emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
> > + ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
> > + if (ret)
> > + return ret;
> > +
> > + /* store prog start time */
> > + move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
> > +
> > + /* if (__bpf_prog_enter(prog) == 0)
> > + * goto skip_exec_of_prog;
> > + *
> > + */
> > + branch = (u32 *)ctx->image + ctx->idx;
> > + /* nop reserved for conditional jump */
> > + emit_insn(ctx, nop);
> > +
> > + /* arg1: &args_off */
> > + emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
> > + if (!p->jited)
> > + move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
> > + ret = emit_call(ctx, (const u64)p->bpf_func);
> > + if (ret)
> > + return ret;
> > +
> > + if (save_ret) {
> > + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> > + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> > + }
> > +
> > + /* update branch with beqz */
> > + if (ctx->image) {
> > + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
> > + *branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
> > + }
> > +
> > + /* arg1: prog */
> > + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> > + /* arg2: prog start time */
> > + move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
> > + /* arg3: &run_ctx */
> > + emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
> > + ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
> > +
> > + return ret;
> > +}
> > +
> > +static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
> > + int args_off, int retval_off, int run_ctx_off, u32 **branches)
> > +{
> > + int i;
> > +
> > + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
> > + for (i = 0; i < tl->nr_links; i++) {
> > + invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
> > + run_ctx_off, true);
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
> > + branches[i] = (u32 *)ctx->image + ctx->idx;
> > + emit_insn(ctx, nop);
> > + }
> > +}
> > +
> > +u64 bpf_jit_alloc_exec_limit(void)
> > +{
> > + return VMALLOC_END - VMALLOC_START;
> > +}
> > +
> > +void *arch_alloc_bpf_trampoline(unsigned int size)
> > +{
> > + return bpf_prog_pack_alloc(size, jit_fill_hole);
> > +}
> > +
> > +void arch_free_bpf_trampoline(void *image, unsigned int size)
> > +{
> > + bpf_prog_pack_free(image, size);
> > +}
> > +
> > +static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
> > + const struct btf_func_model *m,
> > + struct bpf_tramp_links *tlinks,
> > + void *func_addr, u32 flags)
> > +{
> > + int i;
> > + int stack_size = 0, nargs = 0;
> > + int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
> > + struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> > + struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> > + struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> > + int ret, save_ret;
> > + void *orig_call = func_addr;
> > + u32 **branches = NULL;
> > +
> > + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> > + return -ENOTSUPP;
> > +
> > + /*
> > + * FP + 8 [ RA to parent func ] return address to parent
> > + * function
> > + * FP + 0 [ FP of parent func ] frame pointer of parent
> > + * function
> > + * FP - 8 [ T0 to traced func ] return address of traced
> > + * function
> > + * FP - 16 [ FP of traced func ] frame pointer of traced
> > + * function
> > + *
> > + * FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
> > + * BPF_TRAMP_F_RET_FENTRY_RET
> > + * [ argN ]
> > + * [ ... ]
> > + * FP - args_off [ arg1 ]
> > + *
> > + * FP - nargs_off [ regs count ]
> > + *
> > + * FP - ip_off [ traced func ] BPF_TRAMP_F_IP_ARG
> > + *
> > + * FP - run_ctx_off [ bpf_tramp_run_ctx ]
> > + *
> > + * FP - sreg_off [ callee saved reg ]
> > + *
> > + */
> > +
> > + if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
> > + return -ENOTSUPP;
> > +
> > + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> > + return -ENOTSUPP;
> > +
> > + stack_size = 0;
> > +
> > + /* room of trampoline frame to store return address and frame pointer */
> > + stack_size += 16;
> > +
> > + save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
> > + if (save_ret) {
> > + /* Save BPF R0 and A0 */
> > + stack_size += 16;
> > + retval_off = stack_size;
> > + }
> > +
> > + /* room of trampoline frame to store args */
> > + nargs = m->nr_args;
> > + stack_size += nargs * 8;
> > + args_off = stack_size;
> > +
> > + /* room of trampoline frame to store args number */
> > + stack_size += 8;
> > + nargs_off = stack_size;
> > +
> > + /* room of trampoline frame to store ip address */
> > + if (flags & BPF_TRAMP_F_IP_ARG) {
> > + stack_size += 8;
> > + ip_off = stack_size;
> > + }
> > +
> > + /* room of trampoline frame to store struct bpf_tramp_run_ctx */
> > + stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
> > + run_ctx_off = stack_size;
> > +
> > + stack_size += 8;
> > + sreg_off = stack_size;
> > +
> > + stack_size = round_up(stack_size, 16);
> > +
> > + /* For the trampoline called from function entry */
> > + /* RA and FP for parent function*/
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> > + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> > +
> > + /* RA and FP for traced function*/
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> > + emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> > +
> > + /* callee saved register S1 to pass start time */
> > + emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> > +
> > + /* store ip address of the traced function */
> > + if (flags & BPF_TRAMP_F_IP_ARG) {
> > + move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
> > + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
> > + }
> > +
> > + /* store nargs number*/
> > + move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
> > + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
> > +
> > + store_args(ctx, nargs, args_off);
> > +
> > + /* To traced function */
> > + /* Ftrace jump skips 2 NOP instructions */
> > + if (is_kernel_text((unsigned long)orig_call))
> > + orig_call += LOONGARCH_FENTRY_NBYTES;
> > + /* Direct jump skips 5 NOP instructions */
> > + else if (is_bpf_text_address((unsigned long)orig_call))
> > + orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
> > +
> > + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> > + ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + for (i = 0; i < fentry->nr_links; i++) {
> > + ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
> > + run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
> > + if (ret)
> > + return ret;
> > + }
> > + if (fmod_ret->nr_links) {
> > + branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
> > + if (!branches)
> > + return -ENOMEM;
> > +
> > + invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
> > + run_ctx_off, branches);
> > + }
> > +
> > + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > + restore_args(ctx, m->nr_args, args_off);
> > + ret = emit_call(ctx, (const u64)orig_call);
> > + if (ret)
> > + goto out;
> > + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> > + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> > + im->ip_after_call = ctx->ro_image + ctx->idx;
> > + /* Reserve space for the move_imm + jirl instruction */
> > + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > + emit_insn(ctx, nop);
> > + }
> > +
> > + for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
> > + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
> > + *branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
> > + }
> > +
> > + for (i = 0; i < fexit->nr_links; i++) {
> > + ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
> > + run_ctx_off, false);
> > + if (ret)
> > + goto out;
> > + }
> > +
> > + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > + im->ip_epilogue = ctx->ro_image + ctx->idx;
> > + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> > + ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
> > + if (ret)
> > + goto out;
> > + }
> > +
> > + if (flags & BPF_TRAMP_F_RESTORE_REGS)
> > + restore_args(ctx, m->nr_args, args_off);
> > +
> > + if (save_ret) {
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> > + emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> > + }
> > +
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> > +
> > + /* trampoline called from function entry */
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> > +
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> > + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> > + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> > +
> > + if (flags & BPF_TRAMP_F_SKIP_FRAME)
> > + /* return to parent function */
> > + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> > + else
> > + /* return to traced function */
> > + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> > +
> > + ret = ctx->idx;
> > +out:
> > + kfree(branches);
> > +
> > + return ret;
> > +}
> > +
> > +int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
> > + void *ro_image_end, const struct btf_func_model *m,
> > + u32 flags, struct bpf_tramp_links *tlinks,
> > + void *func_addr)
> > +{
> > + int ret;
> > + void *image, *tmp;
> > + struct jit_ctx ctx;
> > + u32 size = ro_image_end - ro_image;
> > +
> > + image = kvmalloc(size, GFP_KERNEL);
> > + if (!image)
> > + return -ENOMEM;
> > +
> > + ctx.image = (union loongarch_instruction *)image;
> > + ctx.ro_image = (union loongarch_instruction *)ro_image;
> > + ctx.idx = 0;
> > +
> > + jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
> > + ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
> > + if (ret > 0 && validate_code(&ctx) < 0) {
> > + ret = -EINVAL;
> > + goto out;
> > + }
> > +
> > + tmp = bpf_arch_text_copy(ro_image, image, size);
> > + if (IS_ERR(tmp)) {
> > + ret = PTR_ERR(tmp);
> > + goto out;
> > + }
> > +
> > + bpf_flush_icache(ro_image, ro_image_end);
> > +out:
> > + kvfree(image);
> > + return ret < 0 ? ret : size;
> > +}
> > +
> > +int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
> > + struct bpf_tramp_links *tlinks, void *func_addr)
> > +{
> > + struct bpf_tramp_image im;
> > + struct jit_ctx ctx;
> > + int ret;
> > +
> > + ctx.image = NULL;
> > + ctx.idx = 0;
> > +
> > + ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
> > +
> > + /* Page align */
> > + return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
> > +}
> > diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
> > index f9c569f53..5697158fd 100644
> > --- a/arch/loongarch/net/bpf_jit.h
> > +++ b/arch/loongarch/net/bpf_jit.h
> > @@ -18,6 +18,7 @@ struct jit_ctx {
> > u32 *offset;
> > int num_exentries;
> > union loongarch_instruction *image;
> > + union loongarch_instruction *ro_image;
> > u32 stack_size;
> > };
> >
> > @@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
> >
> > return -EINVAL;
> > }
> > +
> > +static inline void bpf_flush_icache(void *start, void *end)
> > +{
> > + flush_icache_range((unsigned long)start, (unsigned long)end);
> > +}
> > --
> > 2.25.1
> >
>
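For readers following the frame layout comment in
__arch_prepare_bpf_trampoline() above, here is a minimal user-space sketch
of the same offset arithmetic. It is only an illustration, not kernel code:
the names SKETCH_F_SAVE_RET, SKETCH_F_IP_ARG, SKETCH_MAX_REG_ARGS,
struct tramp_layout and compute_layout() are hypothetical, and the size of
struct bpf_tramp_run_ctx is passed in as a parameter because the real value
depends on the kernel build.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration only; the kernel's real
 * BPF_TRAMP_F_* values are defined in include/linux/bpf.h. */
#define SKETCH_F_SAVE_RET (1u << 0) /* CALL_ORIG or RET_FENTRY_RET is set */
#define SKETCH_F_IP_ARG   (1u << 1) /* BPF_TRAMP_F_IP_ARG is set */

#define SKETCH_MAX_REG_ARGS 8       /* mirrors LOONGARCH_MAX_REG_ARGS */

/* Offsets are distances below FP, as in the patch's layout comment. */
struct tramp_layout {
	int retval_off;  /* -1 when no return-value slot is reserved */
	int args_off;
	int nargs_off;
	int ip_off;      /* -1 when no IP slot is reserved */
	int run_ctx_off;
	int sreg_off;
	int stack_size;  /* total frame size, 16-byte aligned */
};

static int round_up_to(int value, int align)
{
	return (value + align - 1) / align * align;
}

/* run_ctx_size is an assumed sizeof(struct bpf_tramp_run_ctx). */
struct tramp_layout compute_layout(int nr_args, unsigned int flags,
				   int run_ctx_size)
{
	struct tramp_layout l = { .retval_off = -1, .ip_off = -1 };
	int stack_size = 0;

	assert(nr_args >= 0 && nr_args <= SKETCH_MAX_REG_ARGS);

	/* return address (T0) and frame pointer of the traced function */
	stack_size += 16;

	/* two slots: A0 and BPF R0 */
	if (flags & SKETCH_F_SAVE_RET) {
		stack_size += 16;
		l.retval_off = stack_size;
	}

	/* one 8-byte slot per register argument */
	stack_size += nr_args * 8;
	l.args_off = stack_size;

	/* argument count */
	stack_size += 8;
	l.nargs_off = stack_size;

	/* address of the traced function, only when requested */
	if (flags & SKETCH_F_IP_ARG) {
		stack_size += 8;
		l.ip_off = stack_size;
	}

	/* struct bpf_tramp_run_ctx, rounded up to 8 bytes */
	stack_size += round_up_to(run_ctx_size, 8);
	l.run_ctx_off = stack_size;

	/* callee-saved S1, used to carry the program start time */
	stack_size += 8;
	l.sreg_off = stack_size;

	l.stack_size = round_up_to(stack_size, 16);
	return l;
}
```

With an assumed 16-byte run context, compute_layout(3, SKETCH_F_SAVE_RET |
SKETCH_F_IP_ARG, 16) gives retval_off = 32, args_off = 56, nargs_off = 64,
ip_off = 72, run_ctx_off = 88, sreg_off = 96 and a 96-byte frame. The extra
16 bytes for the parent's RA and FP are pushed separately in the patch and
are not part of stack_size.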
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-01 5:21 ` [PATCH v5 0/5] Support trampoline for LoongArch Vincent Li
2025-08-01 2:00 ` Vincent Li
@ 2025-08-02 9:19 ` Tiezhu Yang
2025-08-02 13:53 ` Vincent Li
1 sibling, 1 reply; 35+ messages in thread
From: Tiezhu Yang @ 2025-08-02 9:19 UTC (permalink / raw)
To: Vincent Li, Chenghao Duan
Cc: hengqi.chen, chenhuacai, kernel, loongarch, guodongtai,
youling.tang, jianghaoran, geliang
On 2025/8/1 下午1:21, Vincent Li wrote:
> Hi Chenghao,
>
> I trimmed the email recipients only to the loongarch mailing list and
> folks who might pay attention to this, I personally don't like to
> bother other people who may not be interested in this :). Folks let me
> know if this is not ok. anyway, please check my bpf selftest result
> inline. The fentry_attach_stress results in kernel lockup.
It passed on my test environment.
$ sudo ./test_progs -a fentry_attach_stress
#104 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
I used loongson3_defconfig and the following additional configs:
CONFIG_KPROBES=y
CONFIG_FUNCTION_ERROR_INJECTION=y
CONFIG_TEST_BPF=m
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FPROBE=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_BPF_KPROBE_OVERRIDE=y
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_INFO_BTF=y
CONFIG_NET_SCH_BPF=y
CONFIG_BPF_LSM=y
CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
CONFIG_ARCH_STRICT_ALIGN=n
I am not sure whether it is related to the configs; you can test it again.
Thanks,
Tiezhu
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-02 9:19 ` Tiezhu Yang
@ 2025-08-02 13:53 ` Vincent Li
2025-08-02 14:47 ` Vincent Li
0 siblings, 1 reply; 35+ messages in thread
From: Vincent Li @ 2025-08-02 13:53 UTC (permalink / raw)
To: Tiezhu Yang
Cc: Chenghao Duan, hengqi.chen, chenhuacai, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
>
> On 2025/8/1 1:21 PM, Vincent Li wrote:
> > Hi Chenghao,
> >
> > I trimmed the email recipients only to the loongarch mailing list and
> > folks who might pay attention to this, I personally don't like to
> > bother other people who may not be interested in this :). Folks let me
> > know if this is not ok. anyway, please check my bpf selftest result
> > inline. The fentry_attach_stress results in kernel lockup.
>
> It passed on my test environment.
>
> $ sudo ./test_progs -a fentry_attach_stress
> #104 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
>
> I used loongson3_defconfig and the following additional configs:
>
> CONFIG_KPROBES=y
> CONFIG_FUNCTION_ERROR_INJECTION=y
> CONFIG_TEST_BPF=m
> CONFIG_FTRACE=y
> CONFIG_FUNCTION_TRACER=y
> CONFIG_DYNAMIC_FTRACE=y
> CONFIG_FPROBE=y
> CONFIG_FTRACE_SYSCALLS=y
> CONFIG_BPF_KPROBE_OVERRIDE=y
> CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> CONFIG_DEBUG_INFO_BTF=y
> CONFIG_NET_SCH_BPF=y
> CONFIG_BPF_LSM=y
> CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> CONFIG_ARCH_STRICT_ALIGN=n
>
> I am not sure whether it is related with configs, you can test it again.
>
Have you tried running fentry_attach_stress multiple times, or in a
loop like this?
while true; do ./test_progs -a fentry_attach_stress; sleep 1; done
The lockup happens intermittently: sometimes the test passes, sometimes
the kernel locks up. I merged tools/testing/selftests/bpf/config into my
original config with:
./scripts/kconfig/merge_config.sh -y .config tools/testing/selftests/bpf/config
My config seems to include everything you listed above, except that
CONFIG_ARCH_STRICT_ALIGN is not set. Here is my config:
https://www.bpfire.net/download/loongfire/config.txt
> Thanks,
> Tiezhu
>
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-02 13:53 ` Vincent Li
@ 2025-08-02 14:47 ` Vincent Li
2025-08-02 15:52 ` Vincent Li
0 siblings, 1 reply; 35+ messages in thread
From: Vincent Li @ 2025-08-02 14:47 UTC (permalink / raw)
To: Tiezhu Yang
Cc: Chenghao Duan, hengqi.chen, chenhuacai, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> >
> > On 2025/8/1 1:21 PM, Vincent Li wrote:
> > > Hi Chenghao,
> > >
> > > I trimmed the email recipients only to the loongarch mailing list and
> > > folks who might pay attention to this, I personally don't like to
> > > bother other people who may not be interested in this :). Folks let me
> > > know if this is not ok. anyway, please check my bpf selftest result
> > > inline. The fentry_attach_stress results in kernel lockup.
> >
> > It passed on my test environment.
> >
> > $ sudo ./test_progs -a fentry_attach_stress
> > #104 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> >
> > I used loongson3_defconfig and the following additional configs:
> >
> > CONFIG_KPROBES=y
> > CONFIG_FUNCTION_ERROR_INJECTION=y
> > CONFIG_TEST_BPF=m
> > CONFIG_FTRACE=y
> > CONFIG_FUNCTION_TRACER=y
> > CONFIG_DYNAMIC_FTRACE=y
> > CONFIG_FPROBE=y
> > CONFIG_FTRACE_SYSCALLS=y
> > CONFIG_BPF_KPROBE_OVERRIDE=y
> > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > CONFIG_DEBUG_INFO_BTF=y
> > CONFIG_NET_SCH_BPF=y
> > CONFIG_BPF_LSM=y
> > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > CONFIG_ARCH_STRICT_ALIGN=n
> >
> > I am not sure whether it is related with configs, you can test it again.
> >
I did the following, where config-tiezhu contains the options you
listed above:
cp arch/loongarch/configs/loongson3_defconfig .config
./scripts/kconfig/merge_config.sh -y .config config-tiezhu
Indeed, I could not reproduce the lockup, even when running the test in
a loop. It seems to be related to the kernel config I use; maybe you
could try my kernel config?
[root@fedora bpf]# while true; do ./test_progs -a fentry_attach_stress; sleep 1; done
#105 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#105 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#105 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#105 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#105 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#105 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#105 fentry_attach_stress:OK
>
> Have you tried to run the same fentry_attach_stress multiple times or
> in a loop like while true; do ./test_progs -a fentry_attach_stress;
> sleep 1; done
> the lockup happens intermittently, sometime it PASSED, sometime kernel
> locks up. I merged the tools/testing/selftests/bpf/config with my
> original config by
> ./scripts/kconfig/merge_config.sh -y .config
> tools/testing/selftests/bpf/config. my config seems including
> everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> here is my config https://www.bpfire.net/download/loongfire/config.txt
>
> > Thanks,
> > Tiezhu
> >
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-02 14:47 ` Vincent Li
@ 2025-08-02 15:52 ` Vincent Li
2025-08-03 14:10 ` Huacai Chen
0 siblings, 1 reply; 35+ messages in thread
From: Vincent Li @ 2025-08-02 15:52 UTC (permalink / raw)
To: Tiezhu Yang
Cc: Chenghao Duan, hengqi.chen, chenhuacai, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > >
> > > On 2025/8/1 1:21 PM, Vincent Li wrote:
> > > > Hi Chenghao,
> > > >
> > > > I trimmed the email recipients only to the loongarch mailing list and
> > > > folks who might pay attention to this, I personally don't like to
> > > > bother other people who may not be interested in this :). Folks let me
> > > > know if this is not ok. anyway, please check my bpf selftest result
> > > > inline. The fentry_attach_stress results in kernel lockup.
> > >
> > > It passed on my test environment.
> > >
> > > $ sudo ./test_progs -a fentry_attach_stress
> > > #104 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > >
> > > I used loongson3_defconfig and the following additional configs:
> > >
> > > CONFIG_KPROBES=y
> > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > CONFIG_TEST_BPF=m
> > > CONFIG_FTRACE=y
> > > CONFIG_FUNCTION_TRACER=y
> > > CONFIG_DYNAMIC_FTRACE=y
> > > CONFIG_FPROBE=y
> > > CONFIG_FTRACE_SYSCALLS=y
> > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > CONFIG_DEBUG_INFO_BTF=y
> > > CONFIG_NET_SCH_BPF=y
> > > CONFIG_BPF_LSM=y
> > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > CONFIG_ARCH_STRICT_ALIGN=n
> > >
> > > I am not sure whether it is related with configs, you can test it again.
> > >
>
> I did:
> cp arch/loongarch/configs/loongson3_defconfig .config
> ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> indeed I could not reproduce the lockup, even run the test in a loop.
> it seems to be related to the kernel config I use, maybe you could try
> my kernel config?
>
> [root@fedora bpf]# while true; do ./test_progs -a
> fentry_attach_stress; sleep 1; done
> #105 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> #105 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
>
> #105 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> #105 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> #105 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> #105 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> #105 fentry_attach_stress:OK
>
While checking dmesg with your no-lockup config, I see the following log:
[ 3469.410821] Hardware name: Loongson Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
[ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp 90000002975d8000 sp 90000002975dbc10
[ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2 00007ffff1c02638 a3 0000000000000000
[ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6 00007ffff0ce0f20 a7 0000000000000118
[ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2 d665bdcea9f14eb9 t3 90000001dc9c1000
[ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6 0000000000000000 t7 0000000000000000
[ 3469.410834] t8 000000000000000f u0 000000000000000a s9 90000002975dbec0 s0 90000002975dbc70
[ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3 0000000000000000 s4 ffff8000128f0000
[ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7 0000000000000050 s8 fffffffffffffdf4
[ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
[ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
[ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
[ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
[ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
[ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
[ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
[ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram 842_decompress 842_compress lz4hc_compress lz4_compress uas usb_storage efivarfs [last unloaded: bpf_testmod(O)]
[ 3469.410946] Process test_progs (pid: 37338, threadinfo=00000000760120b6, task=00000000620daecd)
[ 3469.410950] Stack : 0000000000000000 0000000000000000 0000000000000000 d665bdcea9f14eb9
[ 3469.410956] 000000000000000f 0000000000000000 90000002975dbc70 90000000076a5000
[ 3469.410961] 00007ffff1c02638 ffff8000128f0000 90000002975dbd90 900000000608ab28
[ 3469.410966] ffff8000128f0000 0000000000000000 0000000000000000 d665bdcea9f14eb9
[ 3469.410971] 0000000000000000 90000000076a5000 90000002975dbd90 00007ffff1c02638
[ 3469.410976] 0000000000000000 000000000000000a ffff8000128f0000 9000000004f21328
[ 3469.410981] 0000000000000000 9000000004e554c0 0000000000000000 900000010f6a4240
[ 3469.410986] 0000000129e18000 ffffffff00003500 0000000000000000 0000000000000000
[ 3469.410990] 0000000000000000 0000000000000000 0000000000000000 d665bdcea9f14eb9
[ 3469.410995] 0000000000000001 00007ffff1c03040 0000000000000000 0000000000000002
[ 3469.411000] ...
[ 3469.411002] Call Trace:
[ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
[ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000 975d8000 90000002 00000003 00402040
[ 3469.411024] ---[ end trace 0000000000000000 ]---
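As a side note on reading the dump above: the ESTAT line reports ECode=13, which the kernel labels INE (instruction non-defined), and ERA equals the pc value, so the CPU appears to have fetched an invalid instruction word at 0x90000002456d4880, where the Code: line marks <00000000>. Below is a minimal user-space sketch of how ECode and EsubCode can be pulled out of the raw ESTAT value. It assumes the field layout from the LoongArch reference manual (IS in bits 12:0, Ecode in bits 21:16, EsubCode in bits 30:22); decode_estat() and ecode_name() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed CSR.ESTAT layout: IS[12:0], Ecode[21:16], EsubCode[30:22]. */
struct estat_fields {
	uint32_t is;       /* pending interrupt status bits */
	uint32_t ecode;    /* exception code */
	uint32_t esubcode; /* exception sub-code */
};

/* Split a raw ESTAT value into its three fields. */
static struct estat_fields decode_estat(uint32_t estat)
{
	struct estat_fields f;

	f.is = estat & 0x1fff;
	f.ecode = (estat >> 16) & 0x3f;
	f.esubcode = (estat >> 22) & 0x1ff;
	return f;
}

/* Hypothetical helper: names only the one code that appears in this log. */
static const char *ecode_name(uint32_t ecode)
{
	assert(ecode < 64); /* Ecode is a 6-bit field */
	return ecode == 13 ? "INE" : "other";
}
```

For the value in this log, decode_estat(0x000d0000) yields ecode 13 and esubcode 0, which matches the annotation the kernel printed next to the raw value.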
Here is the relevant config diff between your no-lockup config and my
lockup config. I suspect these options are the reason your config does
not end in a kernel lockup:
diff -u config-nolockup config-lockup
# Debug Oops, Lockups and Hangs
#
-# CONFIG_PANIC_ON_OOPS is not set
-CONFIG_PANIC_ON_OOPS_VALUE=0
+CONFIG_PANIC_ON_OOPS=y
+CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=0
-# CONFIG_SOFTLOCKUP_DETECTOR is not set
+CONFIG_LOCKUP_DETECTOR=y
+CONFIG_SOFTLOCKUP_DETECTOR=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
-# CONFIG_HARDLOCKUP_DETECTOR is not set
-# CONFIG_DETECT_HUNG_TASK is not set
-# CONFIG_WQ_WATCHDOG is not set
-# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
+CONFIG_HARDLOCKUP_DETECTOR=y
+# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
+CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
+# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
+CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
+CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
+CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
+CONFIG_DETECT_HUNG_TASK_BLOCKER=y
+CONFIG_WQ_WATCHDOG=y
+CONFIG_WQ_CPU_INTENSIVE_REPORT=y
# CONFIG_TEST_LOCKUP is not set
# end of Debug Oops, Lockups and Hangs
> >
> > Have you tried to run the same fentry_attach_stress multiple times or
> > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > sleep 1; done
> > the lockup happens intermittently, sometime it PASSED, sometime kernel
> > locks up. I merged the tools/testing/selftests/bpf/config with my
> > original config by
> > ./scripts/kconfig/merge_config.sh -y .config
> > tools/testing/selftests/bpf/config. my config seems including
> > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > here is my config https://www.bpfire.net/download/loongfire/config.txt
> >
> > > Thanks,
> > > Tiezhu
> > >
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-02 15:52 ` Vincent Li
@ 2025-08-03 14:10 ` Huacai Chen
2025-08-03 15:24 ` Vincent Li
0 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2025-08-03 14:10 UTC (permalink / raw)
To: Vincent Li
Cc: Tiezhu Yang, Chenghao Duan, hengqi.chen, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > >
> > > > On 2025/8/1 1:21 PM, Vincent Li wrote:
> > > > > Hi Chenghao,
> > > > >
> > > > > I trimmed the email recipients only to the loongarch mailing list and
> > > > > folks who might pay attention to this, I personally don't like to
> > > > > bother other people who may not be interested in this :). Folks let me
> > > > > know if this is not ok. anyway, please check my bpf selftest result
> > > > > inline. The fentry_attach_stress results in kernel lockup.
> > > >
> > > > It passed on my test environment.
> > > >
> > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > #104 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > >
> > > > I used loongson3_defconfig and the following additional configs:
> > > >
> > > > CONFIG_KPROBES=y
> > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > CONFIG_TEST_BPF=m
> > > > CONFIG_FTRACE=y
> > > > CONFIG_FUNCTION_TRACER=y
> > > > CONFIG_DYNAMIC_FTRACE=y
> > > > CONFIG_FPROBE=y
> > > > CONFIG_FTRACE_SYSCALLS=y
> > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > CONFIG_DEBUG_INFO_BTF=y
> > > > CONFIG_NET_SCH_BPF=y
> > > > CONFIG_BPF_LSM=y
> > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > >
> > > > I am not sure whether it is related with configs, you can test it again.
> > > >
> >
> > I did:
> > cp arch/loongarch/configs/loongson3_defconfig .config
> > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> > indeed I could not reproduce the lockup, even run the test in a loop.
> > it seems to be related to the kernel config I use, maybe you could try
> > my kernel config?
> >
> > [root@fedora bpf]# while true; do ./test_progs -a
> > fentry_attach_stress; sleep 1; done
> > #105 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > #105 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> >
> > #105 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > #105 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > #105 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > #105 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > #105 fentry_attach_stress:OK
> >
>
> while checking dmesg with your nolockup config, I see following log:
>
> [ 3469.410821] Hardware name: Loongson
> Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> 90000002975d8000 sp 90000002975dbc10
> [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> 00007ffff1c02638 a3 0000000000000000
> [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> 00007ffff0ce0f20 a7 0000000000000118
> [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> d665bdcea9f14eb9 t3 90000001dc9c1000
> [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> 0000000000000000 t7 0000000000000000
> [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> 90000002975dbec0 s0 90000002975dbc70
> [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> 0000000000000000 s4 ffff8000128f0000
> [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> 0000000000000050 s8 fffffffffffffdf4
> [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> 842_decompress 842_compress lz4hc_compress lz4_compress uas
> usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> [ 3469.410946] Process test_progs (pid: 37338,
> threadinfo=00000000760120b6, task=00000000620daecd)
> [ 3469.410950] Stack : 0000000000000000 0000000000000000
> 0000000000000000 d665bdcea9f14eb9
> [ 3469.410956] 000000000000000f 0000000000000000
> 90000002975dbc70 90000000076a5000
> [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> 90000002975dbd90 900000000608ab28
> [ 3469.410966] ffff8000128f0000 0000000000000000
> 0000000000000000 d665bdcea9f14eb9
> [ 3469.410971] 0000000000000000 90000000076a5000
> 90000002975dbd90 00007ffff1c02638
> [ 3469.410976] 0000000000000000 000000000000000a
> ffff8000128f0000 9000000004f21328
> [ 3469.410981] 0000000000000000 9000000004e554c0
> 0000000000000000 900000010f6a4240
> [ 3469.410986] 0000000129e18000 ffffffff00003500
> 0000000000000000 0000000000000000
> [ 3469.410990] 0000000000000000 0000000000000000
> 0000000000000000 d665bdcea9f14eb9
> [ 3469.410995] 0000000000000001 00007ffff1c03040
> 0000000000000000 0000000000000002
> [ 3469.411000] ...
> [ 3469.411002] Call Trace:
> [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
>
> [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> 975d8000 90000002 00000003 00402040
>
> [ 3469.411024] ---[ end trace 0000000000000000 ]---
>
> here is the relevant config diff between your nolockup config and my
> lockup config that I suspect your nolockup config didn't cause kernel
> lockup
>
> diff -u config-nolockup config-lockup
>
> # Debug Oops, Lockups and Hangs
> #
> -# CONFIG_PANIC_ON_OOPS is not set
> -CONFIG_PANIC_ON_OOPS_VALUE=0
> +CONFIG_PANIC_ON_OOPS=y
> +CONFIG_PANIC_ON_OOPS_VALUE=1
> CONFIG_PANIC_TIMEOUT=0
> -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> +CONFIG_LOCKUP_DETECTOR=y
> +CONFIG_SOFTLOCKUP_DETECTOR=y
> +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> -# CONFIG_HARDLOCKUP_DETECTOR is not set
> -# CONFIG_DETECT_HUNG_TASK is not set
> -# CONFIG_WQ_WATCHDOG is not set
> -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> +CONFIG_HARDLOCKUP_DETECTOR=y
> +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> +CONFIG_DETECT_HUNG_TASK=y
> +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> +CONFIG_WQ_WATCHDOG=y
> +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> # CONFIG_TEST_LOCKUP is not set
> # end of Debug Oops, Lockups and Hangs
Hi, Vincent,
I have applied all BPF patches with some small modifications. Can you
test it again?
https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
Huacai
>
> > >
> > > Have you tried to run the same fentry_attach_stress multiple times or
> > > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > > sleep 1; done
> > > the lockup happens intermittently, sometime it PASSED, sometime kernel
> > > locks up. I merged the tools/testing/selftests/bpf/config with my
> > > original config by
> > > ./scripts/kconfig/merge_config.sh -y .config
> > > tools/testing/selftests/bpf/config. my config seems including
> > > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > > here is my config https://www.bpfire.net/download/loongfire/config.txt
> > >
> > > > Thanks,
> > > > Tiezhu
> > > >
* Re: [PATCH v5 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch
2025-07-30 13:12 ` [PATCH v5 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch Chenghao Duan
2025-07-31 2:17 ` Chenghao Duan
@ 2025-08-03 14:17 ` Huacai Chen
1 sibling, 0 replies; 35+ messages in thread
From: Huacai Chen @ 2025-08-03 14:17 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, hengqi.chen, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, geliang, kernel test robot
Hi, Chenghao,
On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> The BPF trampoline is critical infrastructure of the BPF subsystem, acting
> as a mediator between kernel functions and BPF programs. Numerous important
> features, such as using BPF programs for zero-overhead kernel introspection,
> rely on this key component.
>
> The related tests have passed, covering the following features:
> 1. fentry
> 2. fmod_ret
> 3. fexit
>
> The following related testcases passed on LoongArch:
> sudo ./test_progs -a fentry_test/fentry
> sudo ./test_progs -a fexit_test/fexit
> sudo ./test_progs -a fentry_fexit
> sudo ./test_progs -a modify_return
> sudo ./test_progs -a fexit_sleep
> sudo ./test_progs -a test_overhead
> sudo ./test_progs -a trampoline_count
>
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/oe-kbuild-all/202507100034.wXofj6VX-lkp@intel.com/
> Reported-by: Geliang Tang <geliang@kernel.org>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> Tested-by: Vincent Li <vincent.mc.li@gmail.com>
> ---
> arch/loongarch/net/bpf_jit.c | 390 +++++++++++++++++++++++++++++++++++
> arch/loongarch/net/bpf_jit.h | 6 +
> 2 files changed, 396 insertions(+)
>
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 5e6ae7e0e..eddf582e4 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -7,9 +7,15 @@
> #include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_MAX_REG_ARGS 8
> +
> #define LOONGARCH_LONG_JUMP_NINSNS 5
> #define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
>
> +#define LOONGARCH_FENTRY_NINSNS 2
> +#define LOONGARCH_FENTRY_NBYTES (LOONGARCH_FENTRY_NINSNS * 4)
> +#define LOONGARCH_BPF_FENTRY_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -1407,6 +1413,11 @@ static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> (unsigned long)target);
> }
>
> +static int emit_call(struct jit_ctx *ctx, u64 addr)
> +{
> + return emit_jump_and_link(ctx, LOONGARCH_GPR_RA, addr);
> +}
> +
> int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> void *old_addr, void *new_addr)
> {
> @@ -1471,3 +1482,382 @@ void *bpf_arch_text_copy(void *dst, void *src, size_t len)
>
> return dst;
> }
> +
> +static void store_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> + int i;
> +
> + for (i = 0; i < nargs; i++) {
> + emit_insn(ctx, std, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> + args_off -= 8;
> + }
> +}
> +
> +static void restore_args(struct jit_ctx *ctx, int nargs, int args_off)
> +{
> + int i;
> +
> + for (i = 0; i < nargs; i++) {
> + emit_insn(ctx, ldd, LOONGARCH_GPR_A0 + i, LOONGARCH_GPR_FP, -args_off);
> + args_off -= 8;
> + }
> +}
> +
> +static int invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l,
> + int args_off, int retval_off,
> + int run_ctx_off, bool save_ret)
> +{
> + int ret;
> + u32 *branch;
> + struct bpf_prog *p = l->link.prog;
> + int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
> +
> + if (l->cookie) {
> + move_imm(ctx, LOONGARCH_GPR_T1, l->cookie, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -run_ctx_off + cookie_off);
> + } else {
> + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP,
> + -run_ctx_off + cookie_off);
> + }
> +
> + /* arg1: prog */
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> + /* arg2: &run_ctx */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A1, LOONGARCH_GPR_FP, -run_ctx_off);
> + ret = emit_call(ctx, (const u64)bpf_trampoline_enter(p));
> + if (ret)
> + return ret;
> +
> + /* store prog start time */
> + move_reg(ctx, LOONGARCH_GPR_S1, LOONGARCH_GPR_A0);
> +
> + /* if (__bpf_prog_enter(prog) == 0)
> + * goto skip_exec_of_prog;
> + *
> + */
> + branch = (u32 *)ctx->image + ctx->idx;
> + /* nop reserved for conditional jump */
> + emit_insn(ctx, nop);
> +
> + /* arg1: &args_off */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -args_off);
> + if (!p->jited)
> + move_imm(ctx, LOONGARCH_GPR_A1, (const s64)p->insnsi, false);
> + ret = emit_call(ctx, (const u64)p->bpf_func);
> + if (ret)
> + return ret;
> +
> + if (save_ret) {
> + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + }
> +
> + /* update branch with beqz */
> + if (ctx->image) {
> + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branch;
> + *branch = larch_insn_gen_beq(LOONGARCH_GPR_A0, LOONGARCH_GPR_ZERO, offset);
> + }
> +
> + /* arg1: prog */
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)p, false);
> + /* arg2: prog start time */
> + move_reg(ctx, LOONGARCH_GPR_A1, LOONGARCH_GPR_S1);
> + /* arg3: &run_ctx */
> + emit_insn(ctx, addid, LOONGARCH_GPR_A2, LOONGARCH_GPR_FP, -run_ctx_off);
> + ret = emit_call(ctx, (const u64)bpf_trampoline_exit(p));
> +
> + return ret;
> +}
> +
> +static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl,
> + int args_off, int retval_off, int run_ctx_off, u32 **branches)
> +{
> + int i;
> +
> + emit_insn(ctx, std, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_FP, -retval_off);
> + for (i = 0; i < tl->nr_links; i++) {
> + invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off,
> + run_ctx_off, true);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -retval_off);
> + branches[i] = (u32 *)ctx->image + ctx->idx;
> + emit_insn(ctx, nop);
> + }
> +}
> +
> +u64 bpf_jit_alloc_exec_limit(void)
> +{
> + return VMALLOC_END - VMALLOC_START;
> +}
I think this function should be removed, because BPF memory is
allocated in the module region.
Huacai
> +
> +void *arch_alloc_bpf_trampoline(unsigned int size)
> +{
> + return bpf_prog_pack_alloc(size, jit_fill_hole);
> +}
> +
> +void arch_free_bpf_trampoline(void *image, unsigned int size)
> +{
> + bpf_prog_pack_free(image, size);
> +}
> +
> +static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
> + const struct btf_func_model *m,
> + struct bpf_tramp_links *tlinks,
> + void *func_addr, u32 flags)
> +{
> + int i;
> + int stack_size = 0, nargs = 0;
> + int retval_off, args_off, nargs_off, ip_off, run_ctx_off, sreg_off;
> + struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> + struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> + struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> + int ret, save_ret;
> + void *orig_call = func_addr;
> + u32 **branches = NULL;
> +
> + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> + return -ENOTSUPP;
> +
> + /*
> + * FP + 8 [ RA to parent func ] return address to parent
> + * function
> + * FP + 0 [ FP of parent func ] frame pointer of parent
> + * function
> + * FP - 8 [ T0 to traced func ] return address of traced
> + * function
> + * FP - 16 [ FP of traced func ] frame pointer of traced
> + * function
> + *
> + * FP - retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or
> + * BPF_TRAMP_F_RET_FENTRY_RET
> + * [ argN ]
> + * [ ... ]
> + * FP - args_off [ arg1 ]
> + *
> + * FP - nargs_off [ regs count ]
> + *
> + * FP - ip_off [ traced func ] BPF_TRAMP_F_IP_ARG
> + *
> + * FP - run_ctx_off [ bpf_tramp_run_ctx ]
> + *
> + * FP - sreg_off [ callee saved reg ]
> + *
> + */
> +
> + if (m->nr_args > LOONGARCH_MAX_REG_ARGS)
> + return -ENOTSUPP;
> +
> + if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
> + return -ENOTSUPP;
> +
> + stack_size = 0;
> +
> + /* room of trampoline frame to store return address and frame pointer */
> + stack_size += 16;
> +
> + save_ret = flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET);
> + if (save_ret) {
> + /* Save BPF R0 and A0 */
> + stack_size += 16;
> + retval_off = stack_size;
> + }
> +
> + /* room of trampoline frame to store args */
> + nargs = m->nr_args;
> + stack_size += nargs * 8;
> + args_off = stack_size;
> +
> + /* room of trampoline frame to store args number */
> + stack_size += 8;
> + nargs_off = stack_size;
> +
> + /* room of trampoline frame to store ip address */
> + if (flags & BPF_TRAMP_F_IP_ARG) {
> + stack_size += 8;
> + ip_off = stack_size;
> + }
> +
> + /* room of trampoline frame to store struct bpf_tramp_run_ctx */
> + stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8);
> + run_ctx_off = stack_size;
> +
> + stack_size += 8;
> + sreg_off = stack_size;
> +
> + stack_size = round_up(stack_size, 16);
> +
> + /* For the trampoline called from function entry */
> + /* RA and FP for parent function*/
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -16);
> + emit_insn(ctx, std, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 16);
> +
> + /* RA and FP for traced function*/
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, -stack_size);
> + emit_insn(ctx, std, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, std, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size);
> +
> + /* callee saved register S1 to pass start time */
> + emit_insn(ctx, std, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> + /* store ip address of the traced function */
> + if (flags & BPF_TRAMP_F_IP_ARG) {
> + move_imm(ctx, LOONGARCH_GPR_T1, (const s64)func_addr, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -ip_off);
> + }
> +
> + /* store nargs number*/
> + move_imm(ctx, LOONGARCH_GPR_T1, nargs, false);
> + emit_insn(ctx, std, LOONGARCH_GPR_T1, LOONGARCH_GPR_FP, -nargs_off);
> +
> + store_args(ctx, nargs, args_off);
> +
> + /* To traced function */
> + /* Ftrace jump skips 2 NOP instructions */
> + if (is_kernel_text((unsigned long)orig_call))
> + orig_call += LOONGARCH_FENTRY_NBYTES;
> + /* Direct jump skips 5 NOP instructions */
> + else if (is_bpf_text_address((unsigned long)orig_call))
> + orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> + ret = emit_call(ctx, (const u64)__bpf_tramp_enter);
> + if (ret)
> + return ret;
> + }
> +
> + for (i = 0; i < fentry->nr_links; i++) {
> + ret = invoke_bpf_prog(ctx, fentry->links[i], args_off, retval_off,
> + run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
> + if (ret)
> + return ret;
> + }
> + if (fmod_ret->nr_links) {
> + branches = kcalloc(fmod_ret->nr_links, sizeof(u32 *), GFP_KERNEL);
> + if (!branches)
> + return -ENOMEM;
> +
> + invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off,
> + run_ctx_off, branches);
> + }
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + restore_args(ctx, m->nr_args, args_off);
> + ret = emit_call(ctx, (const u64)orig_call);
> + if (ret)
> + goto out;
> + emit_insn(ctx, std, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, std, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + im->ip_after_call = ctx->ro_image + ctx->idx;
> + /* Reserve space for the move_imm + jirl instruction */
> + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> + emit_insn(ctx, nop);
> + }
> +
> + for (i = 0; ctx->image && i < fmod_ret->nr_links; i++) {
> + int offset = (void *)(&ctx->image[ctx->idx]) - (void *)branches[i];
> + *branches[i] = larch_insn_gen_bne(LOONGARCH_GPR_T1, LOONGARCH_GPR_ZERO, offset);
> + }
> +
> + for (i = 0; i < fexit->nr_links; i++) {
> + ret = invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off,
> + run_ctx_off, false);
> + if (ret)
> + goto out;
> + }
> +
> + if (flags & BPF_TRAMP_F_CALL_ORIG) {
> + im->ip_epilogue = ctx->ro_image + ctx->idx;
> + move_imm(ctx, LOONGARCH_GPR_A0, (const s64)im, false);
> + ret = emit_call(ctx, (const u64)__bpf_tramp_exit);
> + if (ret)
> + goto out;
> + }
> +
> + if (flags & BPF_TRAMP_F_RESTORE_REGS)
> + restore_args(ctx, m->nr_args, args_off);
> +
> + if (save_ret) {
> + emit_insn(ctx, ldd, LOONGARCH_GPR_A0, LOONGARCH_GPR_FP, -retval_off);
> + emit_insn(ctx, ldd, regmap[BPF_REG_0], LOONGARCH_GPR_FP, -(retval_off - 8));
> + }
> +
> + emit_insn(ctx, ldd, LOONGARCH_GPR_S1, LOONGARCH_GPR_FP, -sreg_off);
> +
> + /* trampoline called from function entry */
> + emit_insn(ctx, ldd, LOONGARCH_GPR_T0, LOONGARCH_GPR_SP, stack_size - 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, stack_size - 16);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, stack_size);
> +
> + emit_insn(ctx, ldd, LOONGARCH_GPR_RA, LOONGARCH_GPR_SP, 8);
> + emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
> + emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
> +
> + if (flags & BPF_TRAMP_F_SKIP_FRAME)
> + /* return to parent function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
> + else
> + /* return to traced function */
> + emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
> +
> + ret = ctx->idx;
> +out:
> + kfree(branches);
> +
> + return ret;
> +}
> +
> +int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image,
> + void *ro_image_end, const struct btf_func_model *m,
> + u32 flags, struct bpf_tramp_links *tlinks,
> + void *func_addr)
> +{
> + int ret;
> + void *image, *tmp;
> + struct jit_ctx ctx;
> + u32 size = ro_image_end - ro_image;
> +
> + image = kvmalloc(size, GFP_KERNEL);
> + if (!image)
> + return -ENOMEM;
> +
> + ctx.image = (union loongarch_instruction *)image;
> + ctx.ro_image = (union loongarch_instruction *)ro_image;
> + ctx.idx = 0;
> +
> + jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image));
> + ret = __arch_prepare_bpf_trampoline(&ctx, im, m, tlinks, func_addr, flags);
> + if (ret > 0 && validate_code(&ctx) < 0) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + tmp = bpf_arch_text_copy(ro_image, image, size);
> + if (IS_ERR(tmp)) {
> + ret = PTR_ERR(tmp);
> + goto out;
> + }
> +
> + bpf_flush_icache(ro_image, ro_image_end);
> +out:
> + kvfree(image);
> + return ret < 0 ? ret : size;
> +}
> +
> +int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags,
> + struct bpf_tramp_links *tlinks, void *func_addr)
> +{
> + struct bpf_tramp_image im;
> + struct jit_ctx ctx;
> + int ret;
> +
> + ctx.image = NULL;
> + ctx.idx = 0;
> +
> + ret = __arch_prepare_bpf_trampoline(&ctx, &im, m, tlinks, func_addr, flags);
> +
> + /* Page align */
> + return ret < 0 ? ret : round_up(ret * LOONGARCH_INSN_SIZE, PAGE_SIZE);
> +}
> diff --git a/arch/loongarch/net/bpf_jit.h b/arch/loongarch/net/bpf_jit.h
> index f9c569f53..5697158fd 100644
> --- a/arch/loongarch/net/bpf_jit.h
> +++ b/arch/loongarch/net/bpf_jit.h
> @@ -18,6 +18,7 @@ struct jit_ctx {
> u32 *offset;
> int num_exentries;
> union loongarch_instruction *image;
> + union loongarch_instruction *ro_image;
> u32 stack_size;
> };
>
> @@ -308,3 +309,8 @@ static inline int emit_tailcall_jmp(struct jit_ctx *ctx, u8 cond, enum loongarch
>
> return -EINVAL;
> }
> +
> +static inline void bpf_flush_icache(void *start, void *end)
> +{
> + flush_icache_range((unsigned long)start, (unsigned long)end);
> +}
> --
> 2.25.1
>
>
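For readers following the frame diagram in the quoted patch, the offset arithmetic in __arch_prepare_bpf_trampoline() can be modelled in user space. This is only a sketch under stated assumptions: struct tramp_layout, tramp_layout_calc() and round_up_int() are hypothetical names, and run_ctx_size stands in for sizeof(struct bpf_tramp_run_ctx), whose real value depends on the kernel build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Offsets are distances below FP, as in the frame diagram of the patch. */
struct tramp_layout {
	int retval_off;   /* 0 when no return-value slot is reserved */
	int args_off;
	int nargs_off;
	int ip_off;       /* 0 when the IP slot is not reserved */
	int run_ctx_off;
	int sreg_off;
	int stack_size;   /* total, rounded up to 16 bytes */
};

/* Round x up to a power-of-two alignment. */
static int round_up_int(int x, int align)
{
	assert(align > 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/*
 * run_ctx_size is an assumption: it replaces
 * sizeof(struct bpf_tramp_run_ctx) from the kernel.
 */
static struct tramp_layout tramp_layout_calc(int nargs, bool save_ret,
					     bool ip_arg, size_t run_ctx_size)
{
	struct tramp_layout l = {0};
	int size = 0;

	size += 16;                 /* RA and FP of the trampoline frame */
	if (save_ret) {
		size += 16;         /* A0 and BPF R0 */
		l.retval_off = size;
	}
	size += nargs * 8;          /* saved argument registers */
	l.args_off = size;
	size += 8;                  /* argument count */
	l.nargs_off = size;
	if (ip_arg) {
		size += 8;          /* address of the traced function */
		l.ip_off = size;
	}
	size += round_up_int((int)run_ctx_size, 8);
	l.run_ctx_off = size;
	size += 8;                  /* callee-saved S1 */
	l.sreg_off = size;
	l.stack_size = round_up_int(size, 16);
	return l;
}
```

With two arguments, both optional slots enabled and an assumed 16-byte run context, tramp_layout_calc(2, true, true, 16) gives sreg_off = 88 and a total of 96 bytes once the 16-byte rounding is applied.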
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-03 14:10 ` Huacai Chen
@ 2025-08-03 15:24 ` Vincent Li
2025-08-04 2:12 ` Hengqi Chen
2025-08-04 8:24 ` Huacai Chen
0 siblings, 2 replies; 35+ messages in thread
From: Vincent Li @ 2025-08-03 15:24 UTC (permalink / raw)
To: Huacai Chen
Cc: Tiezhu Yang, Chenghao Duan, hengqi.chen, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > >
> > > > > On 2025/8/1 1:21 PM, Vincent Li wrote:
> > > > > > Hi Chenghao,
> > > > > >
> > > > > > I trimmed the email recipients down to the loongarch mailing list and
> > > > > > the folks who might pay attention to this; I personally don't like to
> > > > > > bother other people who may not be interested in this :). Folks, let me
> > > > > > know if this is not OK. Anyway, please check my BPF selftest results
> > > > > > inline. The fentry_attach_stress test results in a kernel lockup.
> > > > >
> > > > > It passed on my test environment.
> > > > >
> > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > #104 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > >
> > > > > I used loongson3_defconfig and the following additional configs:
> > > > >
> > > > > CONFIG_KPROBES=y
> > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > CONFIG_TEST_BPF=m
> > > > > CONFIG_FTRACE=y
> > > > > CONFIG_FUNCTION_TRACER=y
> > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > CONFIG_FPROBE=y
> > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > CONFIG_NET_SCH_BPF=y
> > > > > CONFIG_BPF_LSM=y
> > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > >
> > > > > I am not sure whether it is related to the configs; you can test it again.
> > > > >
> > >
> > > I did:
> > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu (your config above)
> > > Indeed, I could not reproduce the lockup, even when running the test in a
> > > loop. It seems to be related to the kernel config I use; maybe you could
> > > try my kernel config?
> > >
> > > [root@fedora bpf]# while true; do ./test_progs -a
> > > fentry_attach_stress; sleep 1; done
> > > #105 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > #105 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > >
> > > #105 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > #105 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > #105 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > #105 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > #105 fentry_attach_stress:OK
> > >
> >
> > While checking dmesg with your nolockup config, I see the following log:
> >
> > [ 3469.410821] Hardware name: Loongson
> > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > 90000002975d8000 sp 90000002975dbc10
> > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > 00007ffff1c02638 a3 0000000000000000
> > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > 00007ffff0ce0f20 a7 0000000000000118
> > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > d665bdcea9f14eb9 t3 90000001dc9c1000
> > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > 0000000000000000 t7 0000000000000000
> > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > 90000002975dbec0 s0 90000002975dbc70
> > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > 0000000000000000 s4 ffff8000128f0000
> > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > 0000000000000050 s8 fffffffffffffdf4
> > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > [ 3469.410946] Process test_progs (pid: 37338,
> > threadinfo=00000000760120b6, task=00000000620daecd)
> > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > 0000000000000000 d665bdcea9f14eb9
> > [ 3469.410956] 000000000000000f 0000000000000000
> > 90000002975dbc70 90000000076a5000
> > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > 90000002975dbd90 900000000608ab28
> > [ 3469.410966] ffff8000128f0000 0000000000000000
> > 0000000000000000 d665bdcea9f14eb9
> > [ 3469.410971] 0000000000000000 90000000076a5000
> > 90000002975dbd90 00007ffff1c02638
> > [ 3469.410976] 0000000000000000 000000000000000a
> > ffff8000128f0000 9000000004f21328
> > [ 3469.410981] 0000000000000000 9000000004e554c0
> > 0000000000000000 900000010f6a4240
> > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > 0000000000000000 0000000000000000
> > [ 3469.410990] 0000000000000000 0000000000000000
> > 0000000000000000 d665bdcea9f14eb9
> > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > 0000000000000000 0000000000000002
> > [ 3469.411000] ...
> > [ 3469.411002] Call Trace:
> > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> >
> > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > 975d8000 90000002 00000003 00402040
> >
> > [ 3469.411024] ---[ end trace 0000000000000000 ]---
> >
> > Here is the relevant config diff between your nolockup config and my
> > lockup config. I suspect these differences are why your nolockup config
> > did not cause a kernel lockup:
> >
> > diff -u config-nolockup config-lockup
> >
> > # Debug Oops, Lockups and Hangs
> > #
> > -# CONFIG_PANIC_ON_OOPS is not set
> > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > +CONFIG_PANIC_ON_OOPS=y
> > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > CONFIG_PANIC_TIMEOUT=0
> > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > +CONFIG_LOCKUP_DETECTOR=y
> > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > -# CONFIG_DETECT_HUNG_TASK is not set
> > -# CONFIG_WQ_WATCHDOG is not set
> > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > +CONFIG_HARDLOCKUP_DETECTOR=y
> > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > +CONFIG_DETECT_HUNG_TASK=y
> > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > +CONFIG_WQ_WATCHDOG=y
> > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > # CONFIG_TEST_LOCKUP is not set
> > # end of Debug Oops, Lockups and Hangs
> Hi, Vincent,
>
> I have applied all BPF patches with some small modifications. Can you
> test it again?
> https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
>
> Huacai
>
I see you applied the tail call bug patches. I had thought about applying
them just in case, but I am not sure that fentry_attach_stress involves the
tail call count bugs. I tried your loongarch-next branch anyway, but the
lockup still happens when I run fentry_attach_stress in a while loop. I am
not sure this particular test should block the merge of the BPF trampoline
patches; it looks like an extreme and rare case. Maybe track it as a bug
report and fix it after the merge? Just my two cents :)
[root@fedora ~]# cd /usr/src/linux-loongson/
[root@fedora linux-loongson]# git branch
* loongarch-next
master
[root@fedora linux-loongson]# uname -a
Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
2025 loongarch64 GNU/Linux
[root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_stress
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# while true; do ./test_progs -a
fentry_attach_stress; sleep 5; done
client_loop: send disconnect: Broken pipe
> >
> > > >
> > > > Have you tried running the same fentry_attach_stress test multiple times,
> > > > or in a loop such as:
> > > > while true; do ./test_progs -a fentry_attach_stress; sleep 1; done
> > > > The lockup happens intermittently: sometimes the test passes, sometimes
> > > > the kernel locks up. I merged tools/testing/selftests/bpf/config into my
> > > > original config with:
> > > > ./scripts/kconfig/merge_config.sh -y .config tools/testing/selftests/bpf/config
> > > > My config seems to include everything you listed above, except that
> > > > CONFIG_ARCH_STRICT_ALIGN is not set. Here is my config:
> > > > https://www.bpfire.net/download/loongfire/config.txt
> > > >
> > > > > Thanks,
> > > > > Tiezhu
> > > > >
* Re: [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support
2025-07-30 13:12 ` [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support Chenghao Duan
@ 2025-08-04 2:02 ` Hengqi Chen
2025-08-05 4:10 ` Huacai Chen
2025-08-04 2:24 ` Hengqi Chen
1 sibling, 1 reply; 35+ messages in thread
From: Hengqi Chen @ 2025-08-04 2:02 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, geliang
On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> This commit adds support for BPF dynamic code modification on the
> LoongArch architecture:
> 1. Implement bpf_arch_text_poke() for runtime instruction patching.
> 2. Add bpf_arch_text_copy() for instruction block copying.
> 3. Create bpf_arch_text_invalidate() for code invalidation.
>
> On LoongArch, since symbol addresses in the direct mapping
> region cannot be reached via relative jump instructions from the paged
> mapping region, we use the move_imm+jirl instruction pair as absolute
> jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> instructions in the program as placeholders for function jumps.
>
> larch_insn_text_copy is solely used for BPF. The use of
> larch_insn_text_copy() requires page_size alignment. Currently, only
> the size of the trampoline is page-aligned.
>
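The page-span arithmetic that larch_insn_text_copy() performs before changing permissions can be sketched in user space as below. This is an illustration, not the kernel code: span_start(), span_end() and span_pages() are hypothetical helper names, and the 16 KiB page size used in the usage note is an assumption about a typical LoongArch configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round dst down to the start of its page; page_size is a power of two. */
static uintptr_t span_start(uintptr_t dst, uintptr_t page_size)
{
	assert(page_size && (page_size & (page_size - 1)) == 0);
	return dst & ~(page_size - 1);
}

/* Round the end of the copy (dst + len) up to the next page boundary. */
static uintptr_t span_end(uintptr_t dst, size_t len, uintptr_t page_size)
{
	assert(page_size && (page_size & (page_size - 1)) == 0);
	return (dst + len + page_size - 1) & ~(page_size - 1);
}

/* Page count that the patch passes to set_memory_rw()/set_memory_rox(). */
static size_t span_pages(uintptr_t dst, size_t len, uintptr_t page_size)
{
	return (span_end(dst, len, page_size) - span_start(dst, page_size)) / page_size;
}
```

For a 20-byte poke (five instructions) with 16 KiB pages, span_pages(0x4000, 20, 0x4000) is 1, while a poke starting at 0x7ff8 straddles a boundary and yields 2. In both cases the whole page range is made writable for the duration of the copy, not just the 20 bytes.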
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/include/asm/inst.h | 1 +
> arch/loongarch/kernel/inst.c | 27 ++++++++
> arch/loongarch/net/bpf_jit.c | 104 ++++++++++++++++++++++++++++++
> 3 files changed, 132 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 2ae96a35d..88bb73e46 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> int larch_insn_read(void *addr, u32 *insnp);
> int larch_insn_write(void *addr, u32 insn);
> int larch_insn_patch_text(void *addr, u32 insn);
> +int larch_insn_text_copy(void *dst, void *src, size_t len);
>
> u32 larch_insn_gen_nop(void);
> u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index 674e3b322..7df63a950 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -4,6 +4,7 @@
> */
> #include <linux/sizes.h>
> #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
>
> #include <asm/cacheflush.h>
> #include <asm/inst.h>
> @@ -218,6 +219,32 @@ int larch_insn_patch_text(void *addr, u32 insn)
> return ret;
> }
>
> +int larch_insn_text_copy(void *dst, void *src, size_t len)
> +{
> + int ret;
> + unsigned long flags;
> + unsigned long dst_start, dst_end, dst_len;
> +
> + dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> + dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> + dst_len = dst_end - dst_start;
> +
> + set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> + raw_spin_lock_irqsave(&patch_lock, flags);
> +
> + ret = copy_to_kernel_nofault(dst, src, len);
> + if (ret)
> + pr_err("%s: operation failed\n", __func__);
> +
> + raw_spin_unlock_irqrestore(&patch_lock, flags);
> + set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> +
> + if (!ret)
> + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> +
> + return ret;
> +}
> +
> u32 larch_insn_gen_nop(void)
> {
> return INSN_NOP;
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 7032f11d3..5e6ae7e0e 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -4,8 +4,12 @@
> *
> * Copyright (C) 2022 Loongson Technology Corporation Limited
> */
> +#include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_LONG_JUMP_NINSNS 5
> +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> */
> static void build_prologue(struct jit_ctx *ctx)
> {
> + int i;
> int stack_adjust = 0, store_offset, bpf_stack_adjust;
>
> bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> stack_adjust = round_up(stack_adjust, 16);
> stack_adjust += bpf_stack_adjust;
>
> + /* Reserve space for the move_imm + jirl instruction */
> + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> + emit_insn(ctx, nop);
> +
> /*
> * First instruction initializes the tail call count (TCC).
> * On tail call we skip this instruction, and the TCC is
> @@ -1367,3 +1376,98 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> {
> return true;
> }
> +
> +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> +{
> + if (!target) {
> + pr_err("bpf_jit: jump target address is error\n");
> + return -EFAULT;
> + }
> +
> + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> +
> + return 0;
> +}
> +
> +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> +{
> + struct jit_ctx ctx;
> +
> + ctx.idx = 0;
> + ctx.image = (union loongarch_instruction *)insns;
> +
> + if (!target) {
> + emit_insn((&ctx), nop);
> + emit_insn((&ctx), nop);
There should be 5 NOPs here, no?
> + return 0;
> + }
> +
> + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> + (unsigned long)target);
> +}
> +
> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> + void *old_addr, void *new_addr)
> +{
> + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + bool is_call = poke_type == BPF_MOD_CALL;
> + int ret;
> +
> + if (!is_kernel_text((unsigned long)ip) &&
> + !is_bpf_text_address((unsigned long)ip))
> + return -ENOTSUPP;
> +
> + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> + if (ret)
> + return ret;
> +
> + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + return -EFAULT;
> +
> + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&text_mutex);
> + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> + mutex_unlock(&text_mutex);
> + return ret;
> +}
> +
> +int bpf_arch_text_invalidate(void *dst, size_t len)
> +{
> + int i;
> + int ret = 0;
> + u32 *inst;
> +
> + inst = kvmalloc(len, GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + for (i = 0; i < (len/sizeof(u32)); i++)
> + inst[i] = INSN_BREAK;
> +
> + mutex_lock(&text_mutex);
> + if (larch_insn_text_copy(dst, inst, len))
> + ret = -EINVAL;
> + mutex_unlock(&text_mutex);
> +
> + kvfree(inst);
> + return ret;
> +}
> +
> +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> +{
> + int ret;
> +
> + mutex_lock(&text_mutex);
> + ret = larch_insn_text_copy(dst, src, len);
> + mutex_unlock(&text_mutex);
> + if (ret)
> + return ERR_PTR(-EINVAL);
> +
> + return dst;
> +}
> --
bpf_arch_text_invalidate() and bpf_arch_text_copy() are not related to
the BPF trampoline, right?
> 2.25.1
>
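On the NOP count in gen_jump_or_nops(): if I read the quoted code correctly, old_insns and new_insns are pre-filled with INSN_NOP in bpf_arch_text_poke(), so the 20-byte memcmp() against a five-NOP site should give the same result whether the !target branch writes two or five NOPs. The user-space sketch below models only that comparison. NINSNS, NOP_WORD, slot_init(), write_nops() and matches_nop_site() are hypothetical names, and the NOP encoding is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NINSNS   5            /* mirrors LOONGARCH_LONG_JUMP_NINSNS */
#define NOP_WORD 0x03400000u  /* assumed encoding of "andi $zero, $zero, 0" */

/* Fill the whole slot with NOPs, as the array initializers in the patch do. */
static void slot_init(uint32_t slot[NINSNS])
{
	for (int i = 0; i < NINSNS; i++)
		slot[i] = NOP_WORD;
}

/* Model of the !target branch: only the first `count` words are written. */
static void write_nops(uint32_t *slot, int count)
{
	assert(count >= 0 && count <= NINSNS);
	for (int i = 0; i < count; i++)
		slot[i] = NOP_WORD;
}

/* Returns 1 when the expected image matches a five-NOP patch site. */
static int matches_nop_site(int nops_written)
{
	uint32_t site[NINSNS], expect[NINSNS];

	slot_init(site);     /* what build_prologue() reserved */
	slot_init(expect);   /* old_insns before gen_jump_or_nops() runs */
	write_nops(expect, nops_written);
	return memcmp(site, expect, sizeof(site)) == 0;
}
```

Both matches_nop_site(2) and matches_nop_site(5) return 1 in this model, which suggests the two-NOP branch is harmless for BPF sites but still inconsistent with the five-instruction reservation; emitting all five would make the intent explicit.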
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-03 15:24 ` Vincent Li
@ 2025-08-04 2:12 ` Hengqi Chen
2025-08-04 2:28 ` Huacai Chen
2025-08-04 3:10 ` Vincent Li
2025-08-04 8:24 ` Huacai Chen
1 sibling, 2 replies; 35+ messages in thread
From: Hengqi Chen @ 2025-08-04 2:12 UTC (permalink / raw)
To: Vincent Li
Cc: Huacai Chen, Tiezhu Yang, Chenghao Duan, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Sun, Aug 3, 2025 at 11:25 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > > >
> > > > > > On 2025/8/1 1:21 PM, Vincent Li wrote:
> > > > > > > Hi Chenghao,
> > > > > > >
> > > > > > > I trimmed the email recipients down to the loongarch mailing list and
> > > > > > > the folks who might pay attention to this; I personally don't like to
> > > > > > > bother other people who may not be interested in this :). Folks, let me
> > > > > > > know if this is not OK. Anyway, please check my BPF selftest results
> > > > > > > inline. The fentry_attach_stress test results in a kernel lockup.
> > > > > >
> > > > > > It passed on my test environment.
> > > > > >
> > > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > > #104 fentry_attach_stress:OK
> > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > >
> > > > > > I used loongson3_defconfig and the following additional configs:
> > > > > >
> > > > > > CONFIG_KPROBES=y
> > > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > > CONFIG_TEST_BPF=m
> > > > > > CONFIG_FTRACE=y
> > > > > > CONFIG_FUNCTION_TRACER=y
> > > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > > CONFIG_FPROBE=y
> > > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > > CONFIG_NET_SCH_BPF=y
> > > > > > CONFIG_BPF_LSM=y
> > > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > > >
> > > > > > I am not sure whether it is related to the configs; you can test it again.
> > > > > >
> > > >
> > > > I did:
> > > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu (your config above)
> > > > Indeed, I could not reproduce the lockup, even when running the test in a
> > > > loop. It seems to be related to the kernel config I use; maybe you could
> > > > try my kernel config?
> > > >
> > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > fentry_attach_stress; sleep 1; done
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > >
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > >
> > >
> > > While checking dmesg with your nolockup config, I see the following log:
> > >
> > > [ 3469.410821] Hardware name: Loongson
> > > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > > 90000002975d8000 sp 90000002975dbc10
> > > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > > 00007ffff1c02638 a3 0000000000000000
> > > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > > 00007ffff0ce0f20 a7 0000000000000118
> > > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > > d665bdcea9f14eb9 t3 90000001dc9c1000
> > > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > > 0000000000000000 t7 0000000000000000
> > > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > > 90000002975dbec0 s0 90000002975dbc70
> > > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > > 0000000000000000 s4 ffff8000128f0000
> > > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > > 0000000000000050 s8 fffffffffffffdf4
> > > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > > [ 3469.410946] Process test_progs (pid: 37338,
> > > threadinfo=00000000760120b6, task=00000000620daecd)
> > > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > > 0000000000000000 d665bdcea9f14eb9
> > > [ 3469.410956] 000000000000000f 0000000000000000
> > > 90000002975dbc70 90000000076a5000
> > > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > > 90000002975dbd90 900000000608ab28
> > > [ 3469.410966] ffff8000128f0000 0000000000000000
> > > 0000000000000000 d665bdcea9f14eb9
> > > [ 3469.410971] 0000000000000000 90000000076a5000
> > > 90000002975dbd90 00007ffff1c02638
> > > [ 3469.410976] 0000000000000000 000000000000000a
> > > ffff8000128f0000 9000000004f21328
> > > [ 3469.410981] 0000000000000000 9000000004e554c0
> > > 0000000000000000 900000010f6a4240
> > > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > > 0000000000000000 0000000000000000
> > > [ 3469.410990] 0000000000000000 0000000000000000
> > > 0000000000000000 d665bdcea9f14eb9
> > > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > > 0000000000000000 0000000000000002
> > > [ 3469.411000] ...
> > > [ 3469.411002] Call Trace:
> > > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> > >
> > > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > > 975d8000 90000002 00000003 00402040
> > >
> > > [ 3469.411024] ---[ end trace 0000000000000000 ]---
> > >
> > > Here is the relevant config diff between your nolockup config and my
> > > lockup config. I suspect these options are why your nolockup config
> > > didn't cause the kernel lockup.
> > >
> > > diff -u config-nolockup config-lockup
> > >
> > > # Debug Oops, Lockups and Hangs
> > > #
> > > -# CONFIG_PANIC_ON_OOPS is not set
> > > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > > +CONFIG_PANIC_ON_OOPS=y
> > > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > > CONFIG_PANIC_TIMEOUT=0
> > > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > > +CONFIG_LOCKUP_DETECTOR=y
> > > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > > -# CONFIG_DETECT_HUNG_TASK is not set
> > > -# CONFIG_WQ_WATCHDOG is not set
> > > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > > +CONFIG_HARDLOCKUP_DETECTOR=y
> > > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > > +CONFIG_DETECT_HUNG_TASK=y
> > > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > > +CONFIG_WQ_WATCHDOG=y
> > > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > > # CONFIG_TEST_LOCKUP is not set
> > > # end of Debug Oops, Lockups and Hangs
> > Hi, Vincent,
> >
> > I have applied all BPF patches with some small modifications. Can you
> > test it again?
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
> >
> > Huacai
> >
> I see you applied the tail call bug patches. I thought about applying
> the tail call patches just in case, but I am not sure that
> fentry_attach_stress involves the tail call count bugs. I tried your
> loongarch-next branch anyway, but the lockup still happens when I run
> fentry_attach_stress in a while loop. I am not sure whether this particular
> test should block the merge of the bpf trampoline patches. It looks to be
> an extreme and rare case, so maybe track it as a bug report and fix it
> later, after the merge? Just my two cents :)
>
I think this violates the community rules, and it could be a disaster in
production.
> [root@fedora ~]# cd /usr/src/linux-loongson/
> [root@fedora linux-loongson]# git branch
> * loongarch-next
> master
> [root@fedora linux-loongson]# uname -a
> Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
> 2025 loongarch64 GNU/Linux
>
> [root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
>
> [root@fedora bpf]# while true; do ./test_progs -a
> fentry_attach_stress; sleep 5; done
> client_loop: send disconnect: Broken pipe
>
> > >
> > > > >
> > > > > Have you tried running the same fentry_attach_stress test multiple times, or
> > > > > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > > > > sleep 1; done
> > > > > The lockup happens intermittently: sometimes it PASSED, sometimes the kernel
> > > > > locks up. I merged the tools/testing/selftests/bpf/config with my
> > > > > original config by
> > > > > ./scripts/kconfig/merge_config.sh -y .config
> > > > > tools/testing/selftests/bpf/config. My config seems to include
> > > > > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > > > > here is my config https://www.bpfire.net/download/loongfire/config.txt
> > > > >
> > > > > > Thanks,
> > > > > > Tiezhu
> > > > > >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support
2025-07-30 13:12 ` [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support Chenghao Duan
2025-08-04 2:02 ` Hengqi Chen
@ 2025-08-04 2:24 ` Hengqi Chen
1 sibling, 0 replies; 35+ messages in thread
From: Hengqi Chen @ 2025-08-04 2:24 UTC (permalink / raw)
To: Chenghao Duan
Cc: ast, daniel, andrii, yangtiezhu, chenhuacai, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, geliang
On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> This commit adds support for BPF dynamic code modification on the
> LoongArch architecture:
> 1. Implement bpf_arch_text_poke() for runtime instruction patching.
> 2. Add bpf_arch_text_copy() for instruction block copying.
> 3. Create bpf_arch_text_invalidate() for code invalidation.
>
> On LoongArch, since symbol addresses in the direct mapping
> region cannot be reached via relative jump instructions from the paged
> mapping region, we use the move_imm+jirl instruction pair as absolute
> jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> instructions in the program as placeholders for function jumps.
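To make the "2-5 instructions" figure concrete, here is a small user-space sketch (not kernel code) that splits a 64-bit address into the four immediate fields of the usual LoongArch load sequence. It assumes the sequence lu12i.w / ori / lu32i.d / lu52i.d; the names imm_fields, split_imm64 and join_imm64 are hypothetical and are not part of this patch. The actual move_imm() may emit fewer instructions when some fields can be skipped.

```c
#include <assert.h>
#include <stdint.h>

/* Immediate fields, one per instruction of the assumed load sequence. */
struct imm_fields {
	uint32_t bits_11_0;   /* low 12 bits, OR-ed in by ori     */
	uint32_t bits_31_12;  /* 20 bits, loaded by lu12i.w       */
	uint32_t bits_51_32;  /* 20 bits, loaded by lu32i.d       */
	uint32_t bits_63_52;  /* top 12 bits, loaded by lu52i.d   */
};

/* Split a 64-bit immediate into the four fields. */
static struct imm_fields split_imm64(uint64_t imm)
{
	struct imm_fields f = {
		.bits_11_0  = (uint32_t)(imm & 0xfff),
		.bits_31_12 = (uint32_t)((imm >> 12) & 0xfffff),
		.bits_51_32 = (uint32_t)((imm >> 32) & 0xfffff),
		.bits_63_52 = (uint32_t)((imm >> 52) & 0xfff),
	};
	return f;
}

/* Reassemble the immediate; used only to check the split is lossless. */
static uint64_t join_imm64(struct imm_fields f)
{
	assert(f.bits_11_0 <= 0xfff && f.bits_63_52 <= 0xfff);
	assert(f.bits_31_12 <= 0xfffff && f.bits_51_32 <= 0xfffff);
	return ((uint64_t)f.bits_63_52 << 52) |
	       ((uint64_t)f.bits_51_32 << 32) |
	       ((uint64_t)f.bits_31_12 << 12) |
	       (uint64_t)f.bits_11_0;
}
```

For a direct-mapped address such as 0x90000002456d4880 (the pc value in the register dump elsewhere in this thread), all four fields are non-zero, so four loads plus the jirl give the worst case of five instructions that the NOP slot reserves.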
>
> larch_insn_text_copy is solely used for BPF. The use of
> larch_insn_text_copy() requires page_size alignment. Currently, only
> the size of the trampoline is page-aligned.
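On the alignment point: larch_insn_text_copy() below widens [dst, dst + len) to whole pages before it changes permissions. A user-space model of that arithmetic, as a rough sketch only, is shown here. MODEL_PAGE_SIZE (16 KiB), struct page_span and text_copy_span() are assumptions made for illustration, not names from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 16384UL  /* assumption: 16 KiB pages */

/* Page-granular range that would have its permissions changed. */
struct page_span {
	uintptr_t start;   /* dst rounded down to a page boundary       */
	uintptr_t end;     /* dst + len rounded up to a page boundary   */
	size_t npages;     /* number of pages between start and end     */
};

/* Mirror of the round_down()/round_up() computation in the patch. */
static struct page_span text_copy_span(uintptr_t dst, size_t len)
{
	struct page_span s;

	assert(len > 0);
	s.start = dst & ~(MODEL_PAGE_SIZE - 1);
	s.end = (dst + len + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
	s.npages = (s.end - s.start) / MODEL_PAGE_SIZE;
	return s;
}
```

With these assumptions, a 20-byte poke that starts 16 bytes before a page boundary spans two pages, so both pages are briefly made writable even though only five instructions change.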
>
> Co-developed-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: George Guo <guodongtai@kylinos.cn>
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> ---
> arch/loongarch/include/asm/inst.h | 1 +
> arch/loongarch/kernel/inst.c | 27 ++++++++
> arch/loongarch/net/bpf_jit.c | 104 ++++++++++++++++++++++++++++++
> 3 files changed, 132 insertions(+)
>
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 2ae96a35d..88bb73e46 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> int larch_insn_read(void *addr, u32 *insnp);
> int larch_insn_write(void *addr, u32 insn);
> int larch_insn_patch_text(void *addr, u32 insn);
> +int larch_insn_text_copy(void *dst, void *src, size_t len);
>
> u32 larch_insn_gen_nop(void);
> u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index 674e3b322..7df63a950 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -4,6 +4,7 @@
> */
> #include <linux/sizes.h>
> #include <linux/uaccess.h>
> +#include <linux/set_memory.h>
>
> #include <asm/cacheflush.h>
> #include <asm/inst.h>
> @@ -218,6 +219,32 @@ int larch_insn_patch_text(void *addr, u32 insn)
> return ret;
> }
>
> +int larch_insn_text_copy(void *dst, void *src, size_t len)
> +{
> + int ret;
> + unsigned long flags;
> + unsigned long dst_start, dst_end, dst_len;
> +
> + dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> + dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> + dst_len = dst_end - dst_start;
> +
> + set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> + raw_spin_lock_irqsave(&patch_lock, flags);
> +
> + ret = copy_to_kernel_nofault(dst, src, len);
> + if (ret)
> + pr_err("%s: operation failed\n", __func__);
> +
> + raw_spin_unlock_irqrestore(&patch_lock, flags);
> + set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> +
> + if (!ret)
> + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> +
> + return ret;
> +}
> +
> u32 larch_insn_gen_nop(void)
> {
> return INSN_NOP;
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 7032f11d3..5e6ae7e0e 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -4,8 +4,12 @@
> *
> * Copyright (C) 2022 Loongson Technology Corporation Limited
> */
> +#include <linux/memory.h>
> #include "bpf_jit.h"
>
> +#define LOONGARCH_LONG_JUMP_NINSNS 5
> +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> +
> #define REG_TCC LOONGARCH_GPR_A6
> #define TCC_SAVED LOONGARCH_GPR_S5
>
> @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> */
> static void build_prologue(struct jit_ctx *ctx)
> {
> + int i;
> int stack_adjust = 0, store_offset, bpf_stack_adjust;
>
> bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> stack_adjust = round_up(stack_adjust, 16);
> stack_adjust += bpf_stack_adjust;
>
> + /* Reserve space for the move_imm + jirl instruction */
> + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> + emit_insn(ctx, nop);
> +
> /*
> * First instruction initializes the tail call count (TCC).
> * On tail call we skip this instruction, and the TCC is
> @@ -1367,3 +1376,98 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> {
> return true;
> }
> +
> +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> +{
> + if (!target) {
> + pr_err("bpf_jit: jump target address is error\n");
> + return -EFAULT;
> + }
> +
> + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> +
> + return 0;
> +}
> +
> +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> +{
> + struct jit_ctx ctx;
> +
> + ctx.idx = 0;
> + ctx.image = (union loongarch_instruction *)insns;
> +
> + if (!target) {
> + emit_insn((&ctx), nop);
> + emit_insn((&ctx), nop);
> + return 0;
> + }
> +
> + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> + (unsigned long)target);
> +}
> +
> +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> + void *old_addr, void *new_addr)
> +{
> + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> + bool is_call = poke_type == BPF_MOD_CALL;
> + int ret;
> +
> + if (!is_kernel_text((unsigned long)ip) &&
> + !is_bpf_text_address((unsigned long)ip))
> + return -ENOTSUPP;
> +
> + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> + if (ret)
> + return ret;
> +
> + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + return -EFAULT;
> +
> + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> + if (ret)
> + return ret;
> +
> + mutex_lock(&text_mutex);
> + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> + mutex_unlock(&text_mutex);
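The text_mutex taken here and the patch_lock inside larch_insn_text_copy()
ONLY serialize concurrent modifications. They do not stop another CPU from
executing these instructions while they are being rewritten.
You may need stop_machine() to guard against concurrent execution as well.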
> + return ret;
> +}
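For reference, the verify-then-write logic of bpf_arch_text_poke() can be modelled in user space roughly as follows. This is a single-threaded sketch under stated assumptions: model_text_poke(), SLOT_INSNS and MODEL_NOP are hypothetical names, and MODEL_NOP assumes the andi $zero, $zero, 0 encoding. It says nothing about the concurrent-execution problem raised above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SLOT_INSNS 5                               /* reserved instructions */
#define SLOT_BYTES (SLOT_INSNS * sizeof(uint32_t))
#define MODEL_NOP  0x03400000u                     /* assumed NOP encoding  */

/*
 * Replace the 5-instruction slot only if it currently holds what the
 * caller expects, and skip the write when nothing would change.
 */
static int model_text_poke(uint32_t *slot, const uint32_t *expect_old,
			   const uint32_t *want_new)
{
	assert(slot && expect_old && want_new);

	/* Refuse to patch if the slot is not in the expected state. */
	if (memcmp(slot, expect_old, SLOT_BYTES))
		return -EFAULT;

	/* Only write when the new contents differ from the current ones. */
	if (memcmp(slot, want_new, SLOT_BYTES))
		memcpy(slot, want_new, SLOT_BYTES);

	return 0;
}
```

A second attach with a stale expectation fails with -EFAULT in this model, which is the behaviour the memcmp() against old_insns is meant to provide.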
> +
> +int bpf_arch_text_invalidate(void *dst, size_t len)
> +{
> + int i;
> + int ret = 0;
> + u32 *inst;
> +
> + inst = kvmalloc(len, GFP_KERNEL);
> + if (!inst)
> + return -ENOMEM;
> +
> + for (i = 0; i < (len/sizeof(u32)); i++)
> + inst[i] = INSN_BREAK;
> +
> + mutex_lock(&text_mutex);
> + if (larch_insn_text_copy(dst, inst, len))
> + ret = -EINVAL;
> + mutex_unlock(&text_mutex);
> +
> + kvfree(inst);
> + return ret;
> +}
> +
> +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> +{
> + int ret;
> +
> + mutex_lock(&text_mutex);
> + ret = larch_insn_text_copy(dst, src, len);
> + mutex_unlock(&text_mutex);
> + if (ret)
> + return ERR_PTR(-EINVAL);
> +
> + return dst;
> +}
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-04 2:12 ` Hengqi Chen
@ 2025-08-04 2:28 ` Huacai Chen
2025-08-04 3:10 ` Vincent Li
1 sibling, 0 replies; 35+ messages in thread
From: Huacai Chen @ 2025-08-04 2:28 UTC (permalink / raw)
To: Hengqi Chen
Cc: Vincent Li, Tiezhu Yang, Chenghao Duan, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Mon, Aug 4, 2025 at 10:13 AM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> On Sun, Aug 3, 2025 at 11:25 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > > > >
> > > > > > > On 2025/8/1 1:21 PM, Vincent Li wrote:
> > > > > > > > Hi Chenghao,
> > > > > > > >
> > > > > > > > I trimmed the email recipients only to the loongarch mailing list and
> > > > > > > > folks who might pay attention to this, I personally don't like to
> > > > > > > > bother other people who may not be interested in this :). Folks let me
> > > > > > > > know if this is not ok. anyway, please check my bpf selftest result
> > > > > > > > inline. The fentry_attach_stress results in kernel lockup.
> > > > > > >
> > > > > > > It passed on my test environment.
> > > > > > >
> > > > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > > > #104 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > >
> > > > > > > I used loongson3_defconfig and the following additional configs:
> > > > > > >
> > > > > > > CONFIG_KPROBES=y
> > > > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > > > CONFIG_TEST_BPF=m
> > > > > > > CONFIG_FTRACE=y
> > > > > > > CONFIG_FUNCTION_TRACER=y
> > > > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > > > CONFIG_FPROBE=y
> > > > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > > > CONFIG_NET_SCH_BPF=y
> > > > > > > CONFIG_BPF_LSM=y
> > > > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > > > >
> > > > > > > I am not sure whether it is related to the configs; you can test it again.
> > > > > > >
> > > > >
> > > > > I did:
> > > > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> > > > > Indeed, I could not reproduce the lockup, even when running the test in a loop.
> > > > > It seems to be related to the kernel config I use; maybe you could try
> > > > > my kernel config?
> > > > >
> > > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > > fentry_attach_stress; sleep 1; done
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > >
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > >
> > > >
> > > > While checking dmesg with your nolockup config, I see the following log:
> > > >
> > > > [ 3469.410821] Hardware name: Loongson
> > > > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > > > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > > > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > > > 90000002975d8000 sp 90000002975dbc10
> > > > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > > > 00007ffff1c02638 a3 0000000000000000
> > > > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > > > 00007ffff0ce0f20 a7 0000000000000118
> > > > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > > > d665bdcea9f14eb9 t3 90000001dc9c1000
> > > > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > > > 0000000000000000 t7 0000000000000000
> > > > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > > > 90000002975dbec0 s0 90000002975dbc70
> > > > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > > > 0000000000000000 s4 ffff8000128f0000
> > > > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > > > 0000000000000050 s8 fffffffffffffdf4
> > > > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > > > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > > > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > > > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > > > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > > > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > > > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > > > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > > > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > > > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > > > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > > > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > > > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > > > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > > > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > > > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > > > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > > > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > > > [ 3469.410946] Process test_progs (pid: 37338,
> > > > threadinfo=00000000760120b6, task=00000000620daecd)
> > > > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > > > 0000000000000000 d665bdcea9f14eb9
> > > > [ 3469.410956] 000000000000000f 0000000000000000
> > > > 90000002975dbc70 90000000076a5000
> > > > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > > > 90000002975dbd90 900000000608ab28
> > > > [ 3469.410966] ffff8000128f0000 0000000000000000
> > > > 0000000000000000 d665bdcea9f14eb9
> > > > [ 3469.410971] 0000000000000000 90000000076a5000
> > > > 90000002975dbd90 00007ffff1c02638
> > > > [ 3469.410976] 0000000000000000 000000000000000a
> > > > ffff8000128f0000 9000000004f21328
> > > > [ 3469.410981] 0000000000000000 9000000004e554c0
> > > > 0000000000000000 900000010f6a4240
> > > > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > > > 0000000000000000 0000000000000000
> > > > [ 3469.410990] 0000000000000000 0000000000000000
> > > > 0000000000000000 d665bdcea9f14eb9
> > > > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > > > 0000000000000000 0000000000000002
> > > > [ 3469.411000] ...
> > > > [ 3469.411002] Call Trace:
> > > > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> > > >
> > > > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > > > 975d8000 90000002 00000003 00402040
> > > >
> > > > [ 3469.411024] ---[ end trace 0000000000000000 ]---
> > > >
> > > > Here is the relevant config diff between your nolockup config and my
> > > > lockup config. I suspect these options are why your nolockup config
> > > > didn't cause the kernel lockup.
> > > >
> > > > diff -u config-nolockup config-lockup
> > > >
> > > > # Debug Oops, Lockups and Hangs
> > > > #
> > > > -# CONFIG_PANIC_ON_OOPS is not set
> > > > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > > > +CONFIG_PANIC_ON_OOPS=y
> > > > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > > > CONFIG_PANIC_TIMEOUT=0
> > > > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > > > +CONFIG_LOCKUP_DETECTOR=y
> > > > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > > > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > > > -# CONFIG_DETECT_HUNG_TASK is not set
> > > > -# CONFIG_WQ_WATCHDOG is not set
> > > > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > > > +CONFIG_HARDLOCKUP_DETECTOR=y
> > > > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > > > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > > > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > > > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > > > +CONFIG_DETECT_HUNG_TASK=y
> > > > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > > > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > > > +CONFIG_WQ_WATCHDOG=y
> > > > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > > > # CONFIG_TEST_LOCKUP is not set
> > > > # end of Debug Oops, Lockups and Hangs
> > > Hi, Vincent,
> > >
> > > I have applied all BPF patches with some small modifications. Can you
> > > test it again?
> > > https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
> > >
> > > Huacai
> > >
> > I see you applied the tail call bug patches. I thought about applying
> > the tail call patches just in case, but I am not sure that
> > fentry_attach_stress involves the tail call count bugs. I tried your
> > loongarch-next branch anyway, but the lockup still happens when I run
> > fentry_attach_stress in a while loop. I am not sure whether this particular
> > test should block the merge of the bpf trampoline patches. It looks to be
> > an extreme and rare case, so maybe track it as a bug report and fix it
> > later, after the merge? Just my two cents :)
> >
>
> I think this violates the community rules. And could be a disaster in
> production.
Chenghao, please fix it as soon as possible, on top of loongarch-next.
Huacai
>
> > [root@fedora ~]# cd /usr/src/linux-loongson/
> > [root@fedora linux-loongson]# git branch
> > * loongarch-next
> > master
> > [root@fedora linux-loongson]# uname -a
> > Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
> > 2025 loongarch64 GNU/Linux
> >
> > [root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> >
> > [root@fedora bpf]# while true; do ./test_progs -a
> > fentry_attach_stress; sleep 5; done
> > client_loop: send disconnect: Broken pipe
> >
> > > >
> > > > > >
> > > > > > Have you tried running the same fentry_attach_stress test multiple times, or
> > > > > > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > > > > > sleep 1; done
> > > > > > The lockup happens intermittently: sometimes it PASSED, sometimes the kernel
> > > > > > locks up. I merged the tools/testing/selftests/bpf/config with my
> > > > > > original config by
> > > > > > ./scripts/kconfig/merge_config.sh -y .config
> > > > > > tools/testing/selftests/bpf/config. My config seems to include
> > > > > > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > > > > > here is my config https://www.bpfire.net/download/loongfire/config.txt
> > > > > >
> > > > > > > Thanks,
> > > > > > > Tiezhu
> > > > > > >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-04 2:12 ` Hengqi Chen
2025-08-04 2:28 ` Huacai Chen
@ 2025-08-04 3:10 ` Vincent Li
1 sibling, 0 replies; 35+ messages in thread
From: Vincent Li @ 2025-08-04 3:10 UTC (permalink / raw)
To: Hengqi Chen
Cc: Huacai Chen, Tiezhu Yang, Chenghao Duan, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Sun, Aug 3, 2025 at 7:13 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> On Sun, Aug 3, 2025 at 11:25 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > > > >
> > > > > > > On 2025/8/1 1:21 PM, Vincent Li wrote:
> > > > > > > > Hi Chenghao,
> > > > > > > >
> > > > > > > > I trimmed the email recipients only to the loongarch mailing list and
> > > > > > > > folks who might pay attention to this, I personally don't like to
> > > > > > > > bother other people who may not be interested in this :). Folks let me
> > > > > > > > know if this is not ok. anyway, please check my bpf selftest result
> > > > > > > > inline. The fentry_attach_stress results in kernel lockup.
> > > > > > >
> > > > > > > It passed on my test environment.
> > > > > > >
> > > > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > > > #104 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > >
> > > > > > > I used loongson3_defconfig and the following additional configs:
> > > > > > >
> > > > > > > CONFIG_KPROBES=y
> > > > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > > > CONFIG_TEST_BPF=m
> > > > > > > CONFIG_FTRACE=y
> > > > > > > CONFIG_FUNCTION_TRACER=y
> > > > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > > > CONFIG_FPROBE=y
> > > > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > > > CONFIG_NET_SCH_BPF=y
> > > > > > > CONFIG_BPF_LSM=y
> > > > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > > > >
> > > > > > > I am not sure whether it is related to the configs; you can test it again.
> > > > > > >
> > > > >
> > > > > I did:
> > > > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> > > > > Indeed, I could not reproduce the lockup, even when running the test in a loop.
> > > > > It seems to be related to the kernel config I use; maybe you could try
> > > > > my kernel config?
> > > > >
> > > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > > fentry_attach_stress; sleep 1; done
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > >
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > >
> > > >
> > > > While checking dmesg with your nolockup config, I see the following log:
> > > >
> > > > [ 3469.410821] Hardware name: Loongson
> > > > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > > > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > > > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > > > 90000002975d8000 sp 90000002975dbc10
> > > > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > > > 00007ffff1c02638 a3 0000000000000000
> > > > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > > > 00007ffff0ce0f20 a7 0000000000000118
> > > > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > > > d665bdcea9f14eb9 t3 90000001dc9c1000
> > > > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > > > 0000000000000000 t7 0000000000000000
> > > > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > > > 90000002975dbec0 s0 90000002975dbc70
> > > > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > > > 0000000000000000 s4 ffff8000128f0000
> > > > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > > > 0000000000000050 s8 fffffffffffffdf4
> > > > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > > > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > > > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > > > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > > > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > > > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > > > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > > > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > > > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > > > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > > > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > > > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > > > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > > > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > > > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > > > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > > > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > > > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > > > [ 3469.410946] Process test_progs (pid: 37338,
> > > > threadinfo=00000000760120b6, task=00000000620daecd)
> > > > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > > > 0000000000000000 d665bdcea9f14eb9
> > > > [ 3469.410956] 000000000000000f 0000000000000000
> > > > 90000002975dbc70 90000000076a5000
> > > > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > > > 90000002975dbd90 900000000608ab28
> > > > [ 3469.410966] ffff8000128f0000 0000000000000000
> > > > 0000000000000000 d665bdcea9f14eb9
> > > > [ 3469.410971] 0000000000000000 90000000076a5000
> > > > 90000002975dbd90 00007ffff1c02638
> > > > [ 3469.410976] 0000000000000000 000000000000000a
> > > > ffff8000128f0000 9000000004f21328
> > > > [ 3469.410981] 0000000000000000 9000000004e554c0
> > > > 0000000000000000 900000010f6a4240
> > > > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > > > 0000000000000000 0000000000000000
> > > > [ 3469.410990] 0000000000000000 0000000000000000
> > > > 0000000000000000 d665bdcea9f14eb9
> > > > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > > > 0000000000000000 0000000000000002
> > > > [ 3469.411000] ...
> > > > [ 3469.411002] Call Trace:
> > > > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> > > >
> > > > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > > > 975d8000 90000002 00000003 00402040
> > > >
> > > > [ 3469.411024] ---[ end trace 0000000000000000 ]---
> > > >
> > > > here is the relevant config diff between your nolockup config and my
> > > > lockup config that I suspect your nolockup config didn't cause kernel
> > > > lockup
> > > >
> > > > diff -u config-nolockup config-lockup
> > > >
> > > > # Debug Oops, Lockups and Hangs
> > > > #
> > > > -# CONFIG_PANIC_ON_OOPS is not set
> > > > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > > > +CONFIG_PANIC_ON_OOPS=y
> > > > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > > > CONFIG_PANIC_TIMEOUT=0
> > > > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > > > +CONFIG_LOCKUP_DETECTOR=y
> > > > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > > > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > > > -# CONFIG_DETECT_HUNG_TASK is not set
> > > > -# CONFIG_WQ_WATCHDOG is not set
> > > > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > > > +CONFIG_HARDLOCKUP_DETECTOR=y
> > > > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > > > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > > > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > > > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > > > +CONFIG_DETECT_HUNG_TASK=y
> > > > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > > > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > > > +CONFIG_WQ_WATCHDOG=y
> > > > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > > > # CONFIG_TEST_LOCKUP is not set
> > > > # end of Debug Oops, Lockups and Hangs
> > > Hi, Vincent,
> > >
> > > I have applied all BPF patches with some small modifications. Can you
> > > test it again?
> > > https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
> > >
> > > Huacai
> > >
> > I see you applied the tail call bug patches, I thought about applying
> > tail call patches just in case, but I am not sure the
> > fentry_attach_stress involves tail call count bugs. I tried your
> > loongarch-next branch anyway, but the lockup still happens when I run
> > fentry_attach_stress in a while loop. I am not sure if this particular
> > test should block the merge of bpf trampoline patches, it looks to be
> > an extreme and rare case, maybe track it as a bug report and fix it
> > later after merge? just my two cents :)
> >
>
> I think this violates the community rules. And could be a disaster in
> production.
>
Got it, sorry for my ignorance :)
> > [root@fedora ~]# cd /usr/src/linux-loongson/
> > [root@fedora linux-loongson]# git branch
> > * loongarch-next
> > master
> > [root@fedora linux-loongson]# uname -a
> > Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
> > 2025 loongarch64 GNU/Linux
> >
> > [root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> >
> > [root@fedora bpf]# while true; do ./test_progs -a
> > fentry_attach_stress; sleep 5; done
> > client_loop: send disconnect: Broken pipe
> >
> > > >
> > > > > >
> > > > > > Have you tried to run the same fentry_attach_stress multiple times or
> > > > > > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > > > > > sleep 1; done
> > > > > > the lockup happens intermittently, sometime it PASSED, sometime kernel
> > > > > > locks up. I merged the tools/testing/selftests/bpf/config with my
> > > > > > original config by
> > > > > > ./scripts/kconfig/merge_config.sh -y .config
> > > > > > tools/testing/selftests/bpf/config. my config seems including
> > > > > > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > > > > > here is my config https://www.bpfire.net/download/loongfire/config.txt
> > > > > >
> > > > > > > Thanks,
> > > > > > > Tiezhu
> > > > > > >
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-03 15:24 ` Vincent Li
2025-08-04 2:12 ` Hengqi Chen
@ 2025-08-04 8:24 ` Huacai Chen
2025-08-04 13:28 ` Vincent Li
1 sibling, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2025-08-04 8:24 UTC (permalink / raw)
To: Vincent Li
Cc: Tiezhu Yang, Chenghao Duan, hengqi.chen, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Sun, Aug 3, 2025 at 11:25 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > > >
> > > > > > On 2025/8/1 下午1:21, Vincent Li wrote:
> > > > > > > Hi Chenghao,
> > > > > > >
> > > > > > > I trimmed the email recipients only to the loongarch mailing list and
> > > > > > > folks who might pay attention to this, I personally don't like to
> > > > > > > bother other people who may not be interested in this :). Folks let me
> > > > > > > know if this is not ok. anyway, please check my bpf selftest result
> > > > > > > inline. The fentry_attach_stress results in kernel lockup.
> > > > > >
> > > > > > It passed on my test environment.
> > > > > >
> > > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > > #104 fentry_attach_stress:OK
> > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > >
> > > > > > I used loongson3_defconfig and the following additional configs:
> > > > > >
> > > > > > CONFIG_KPROBES=y
> > > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > > CONFIG_TEST_BPF=m
> > > > > > CONFIG_FTRACE=y
> > > > > > CONFIG_FUNCTION_TRACER=y
> > > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > > CONFIG_FPROBE=y
> > > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > > CONFIG_NET_SCH_BPF=y
> > > > > > CONFIG_BPF_LSM=y
> > > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > > >
> > > > > > I am not sure whether it is related with configs, you can test it again.
> > > > > >
> > > >
> > > > I did:
> > > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> > > > indeed I could not reproduce the lockup, even run the test in a loop.
> > > > it seems to be related to the kernel config I use, maybe you could try
> > > > my kernel config?
> > > >
> > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > fentry_attach_stress; sleep 1; done
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > >
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > #105 fentry_attach_stress:OK
> > > >
> > >
> > > while checking dmesg with your nolockup config, I see following log:
> > >
> > > [ 3469.410821] Hardware name: Loongson
> > > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > > 90000002975d8000 sp 90000002975dbc10
> > > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > > 00007ffff1c02638 a3 0000000000000000
> > > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > > 00007ffff0ce0f20 a7 0000000000000118
> > > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > > d665bdcea9f14eb9 t3 90000001dc9c1000
> > > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > > 0000000000000000 t7 0000000000000000
> > > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > > 90000002975dbec0 s0 90000002975dbc70
> > > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > > 0000000000000000 s4 ffff8000128f0000
> > > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > > 0000000000000050 s8 fffffffffffffdf4
> > > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > > [ 3469.410946] Process test_progs (pid: 37338,
> > > threadinfo=00000000760120b6, task=00000000620daecd)
> > > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > > 0000000000000000 d665bdcea9f14eb9
> > > [ 3469.410956] 000000000000000f 0000000000000000
> > > 90000002975dbc70 90000000076a5000
> > > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > > 90000002975dbd90 900000000608ab28
> > > [ 3469.410966] ffff8000128f0000 0000000000000000
> > > 0000000000000000 d665bdcea9f14eb9
> > > [ 3469.410971] 0000000000000000 90000000076a5000
> > > 90000002975dbd90 00007ffff1c02638
> > > [ 3469.410976] 0000000000000000 000000000000000a
> > > ffff8000128f0000 9000000004f21328
> > > [ 3469.410981] 0000000000000000 9000000004e554c0
> > > 0000000000000000 900000010f6a4240
> > > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > > 0000000000000000 0000000000000000
> > > [ 3469.410990] 0000000000000000 0000000000000000
> > > 0000000000000000 d665bdcea9f14eb9
> > > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > > 0000000000000000 0000000000000002
> > > [ 3469.411000] ...
> > > [ 3469.411002] Call Trace:
> > > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> > >
> > > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > > 975d8000 90000002 00000003 00402040
> > >
> > > [ 3469.411024] ---[ end trace 0000000000000000 ]---
> > >
> > > here is the relevant config diff between your nolockup config and my
> > > lockup config that I suspect your nolockup config didn't cause kernel
> > > lockup
> > >
> > > diff -u config-nolockup config-lockup
> > >
> > > # Debug Oops, Lockups and Hangs
> > > #
> > > -# CONFIG_PANIC_ON_OOPS is not set
> > > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > > +CONFIG_PANIC_ON_OOPS=y
> > > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > > CONFIG_PANIC_TIMEOUT=0
> > > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > > +CONFIG_LOCKUP_DETECTOR=y
> > > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > > -# CONFIG_DETECT_HUNG_TASK is not set
> > > -# CONFIG_WQ_WATCHDOG is not set
> > > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > > +CONFIG_HARDLOCKUP_DETECTOR=y
> > > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > > +CONFIG_DETECT_HUNG_TASK=y
> > > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > > +CONFIG_WQ_WATCHDOG=y
> > > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > > # CONFIG_TEST_LOCKUP is not set
> > > # end of Debug Oops, Lockups and Hangs
> > Hi, Vincent,
> >
> > I have applied all BPF patches with some small modifications. Can you
> > test it again?
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
> >
> > Huacai
> >
> I see you applied the tail call bug patches, I thought about applying
> tail call patches just in case, but I am not sure the
> fentry_attach_stress involves tail call count bugs. I tried your
> loongarch-next branch anyway, but the lockup still happens when I run
> fentry_attach_stress in a while loop. I am not sure if this particular
> test should block the merge of bpf trampoline patches, it looks to be
> an extreme and rare case, maybe track it as a bug report and fix it
> later after merge? just my two cents :)
>
> [root@fedora ~]# cd /usr/src/linux-loongson/
> [root@fedora linux-loongson]# git branch
> * loongarch-next
> master
> [root@fedora linux-loongson]# uname -a
> Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
> 2025 loongarch64 GNU/Linux
>
> [root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> #107 fentry_attach_stress:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
>
> [root@fedora bpf]# while true; do ./test_progs -a
> fentry_attach_stress; sleep 5; done
> client_loop: send disconnect: Broken pipe
Could you please try this on top of loongarch-next?
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index e61c482068fe..c63c78a99f99 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -5,6 +5,7 @@
#include <linux/sizes.h>
#include <linux/uaccess.h>
#include <linux/set_memory.h>
+#include <linux/stop_machine.h>
#include <asm/cacheflush.h>
#include <asm/inst.h>
@@ -219,32 +220,49 @@ int larch_insn_patch_text(void *addr, u32 insn)
return ret;
}
-int larch_insn_text_copy(void *dst, void *src, size_t len)
+struct insn_copy {
+ void *dst;
+ void *src;
+ size_t len;
+ unsigned int cpu;
+};
+
+static int text_copy_cb(void *data)
{
- int ret;
- unsigned long flags;
- unsigned long dst_start, dst_end, dst_len;
+ int ret = 0;
+ size_t start, end;
+ struct insn_copy *copy = data;
- dst_start = round_down((unsigned long)dst, PAGE_SIZE);
- dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
- dst_len = dst_end - dst_start; /* page-aligned */
+ if (smp_processor_id() == copy->cpu) {
+ start = round_down((size_t)copy->dst, PAGE_SIZE);
+ end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
- set_memory_rw(dst_start, dst_len / PAGE_SIZE);
- raw_spin_lock_irqsave(&patch_lock, flags);
+ set_memory_rw(start, (end - start) / PAGE_SIZE);
- ret = copy_to_kernel_nofault(dst, src, len);
- if (ret)
- pr_err("%s: operation failed\n", __func__);
+ ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
+ if (ret)
+ pr_err("%s: operation failed\n", __func__);
- raw_spin_unlock_irqrestore(&patch_lock, flags);
- set_memory_rox(dst_start, dst_len / PAGE_SIZE);
+ set_memory_rox(start, (end - start) / PAGE_SIZE);
+ }
- if (!ret)
- flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
+ flush_icache_range((unsigned long)copy->dst, (unsigned long)copy->dst + copy->len);
return ret;
}
+int larch_insn_text_copy(void *dst, void *src, size_t len)
+{
+ struct insn_copy copy = {
+ .dst = dst,
+ .src = src,
+ .len = len,
+ .cpu = smp_processor_id(),
+ };
+
+ return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
+}
+
u32 larch_insn_gen_nop(void)
{
return INSN_NOP;
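
For anyone checking the set_memory_rw()/set_memory_rox() arguments in text_copy_cb() above, here is a rough userspace sketch of the same page-span arithmetic. It is only an illustration, not kernel code: the helper names (round_down_pow2, round_up_pow2, pages_spanned) are made up for this example, and the 16 KiB page size is an assumption (the kernel uses whatever PAGE_SIZE is configured).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for illustration only; the kernel uses PAGE_SIZE. */
#define EXAMPLE_PAGE_SIZE 0x4000UL

/* Round x down to a multiple of align (align must be a power of two). */
static uintptr_t round_down_pow2(uintptr_t x, uintptr_t align)
{
	return x & ~(align - 1);
}

/* Round x up to a multiple of align (align must be a power of two). */
static uintptr_t round_up_pow2(uintptr_t x, uintptr_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Number of pages covered by [dst, dst + len), computed the same way as
 * (end - start) / PAGE_SIZE in the patch: start is dst rounded down,
 * end is dst + len rounded up.
 */
static size_t pages_spanned(uintptr_t dst, size_t len, uintptr_t page_size)
{
	uintptr_t start = round_down_pow2(dst, page_size);
	uintptr_t end = round_up_pow2(dst + len, page_size);

	return (end - start) / page_size;
}
```

With these assumptions, a copy that starts 4 bytes before a page boundary and is 8 bytes long covers two pages, so both pages would have their permissions changed, even though only 8 bytes are written.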
>
> > >
> > > > >
> > > > > Have you tried to run the same fentry_attach_stress multiple times or
> > > > > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > > > > sleep 1; done
> > > > > the lockup happens intermittently, sometime it PASSED, sometime kernel
> > > > > locks up. I merged the tools/testing/selftests/bpf/config with my
> > > > > original config by
> > > > > ./scripts/kconfig/merge_config.sh -y .config
> > > > > tools/testing/selftests/bpf/config. my config seems including
> > > > > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > > > > here is my config https://www.bpfire.net/download/loongfire/config.txt
> > > > >
> > > > > > Thanks,
> > > > > > Tiezhu
> > > > > >
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-04 8:24 ` Huacai Chen
@ 2025-08-04 13:28 ` Vincent Li
2025-08-04 14:24 ` Vincent Li
0 siblings, 1 reply; 35+ messages in thread
From: Vincent Li @ 2025-08-04 13:28 UTC (permalink / raw)
To: Huacai Chen
Cc: Tiezhu Yang, Chenghao Duan, hengqi.chen, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Mon, Aug 4, 2025 at 1:24 AM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Sun, Aug 3, 2025 at 11:25 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > > > >
> > > > > > > On 2025/8/1 下午1:21, Vincent Li wrote:
> > > > > > > > Hi Chenghao,
> > > > > > > >
> > > > > > > > I trimmed the email recipients only to the loongarch mailing list and
> > > > > > > > folks who might pay attention to this, I personally don't like to
> > > > > > > > bother other people who may not be interested in this :). Folks let me
> > > > > > > > know if this is not ok. anyway, please check my bpf selftest result
> > > > > > > > inline. The fentry_attach_stress results in kernel lockup.
> > > > > > >
> > > > > > > It passed on my test environment.
> > > > > > >
> > > > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > > > #104 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > >
> > > > > > > I used loongson3_defconfig and the following additional configs:
> > > > > > >
> > > > > > > CONFIG_KPROBES=y
> > > > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > > > CONFIG_TEST_BPF=m
> > > > > > > CONFIG_FTRACE=y
> > > > > > > CONFIG_FUNCTION_TRACER=y
> > > > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > > > CONFIG_FPROBE=y
> > > > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > > > CONFIG_NET_SCH_BPF=y
> > > > > > > CONFIG_BPF_LSM=y
> > > > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > > > >
> > > > > > > I am not sure whether it is related with configs, you can test it again.
> > > > > > >
> > > > >
> > > > > I did:
> > > > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> > > > > indeed I could not reproduce the lockup, even run the test in a loop.
> > > > > it seems to be related to the kernel config I use, maybe you could try
> > > > > my kernel config?
> > > > >
> > > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > > fentry_attach_stress; sleep 1; done
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > >
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > #105 fentry_attach_stress:OK
> > > > >
> > > >
> > > > while checking dmesg with your nolockup config, I see following log:
> > > >
> > > > [ 3469.410821] Hardware name: Loongson
> > > > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > > > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > > > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > > > 90000002975d8000 sp 90000002975dbc10
> > > > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > > > 00007ffff1c02638 a3 0000000000000000
> > > > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > > > 00007ffff0ce0f20 a7 0000000000000118
> > > > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > > > d665bdcea9f14eb9 t3 90000001dc9c1000
> > > > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > > > 0000000000000000 t7 0000000000000000
> > > > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > > > 90000002975dbec0 s0 90000002975dbc70
> > > > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > > > 0000000000000000 s4 ffff8000128f0000
> > > > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > > > 0000000000000050 s8 fffffffffffffdf4
> > > > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > > > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > > > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > > > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > > > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > > > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > > > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > > > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > > > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > > > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > > > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > > > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > > > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > > > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > > > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > > > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > > > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > > > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > > > [ 3469.410946] Process test_progs (pid: 37338,
> > > > threadinfo=00000000760120b6, task=00000000620daecd)
> > > > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > > > 0000000000000000 d665bdcea9f14eb9
> > > > [ 3469.410956] 000000000000000f 0000000000000000
> > > > 90000002975dbc70 90000000076a5000
> > > > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > > > 90000002975dbd90 900000000608ab28
> > > > [ 3469.410966] ffff8000128f0000 0000000000000000
> > > > 0000000000000000 d665bdcea9f14eb9
> > > > [ 3469.410971] 0000000000000000 90000000076a5000
> > > > 90000002975dbd90 00007ffff1c02638
> > > > [ 3469.410976] 0000000000000000 000000000000000a
> > > > ffff8000128f0000 9000000004f21328
> > > > [ 3469.410981] 0000000000000000 9000000004e554c0
> > > > 0000000000000000 900000010f6a4240
> > > > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > > > 0000000000000000 0000000000000000
> > > > [ 3469.410990] 0000000000000000 0000000000000000
> > > > 0000000000000000 d665bdcea9f14eb9
> > > > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > > > 0000000000000000 0000000000000002
> > > > [ 3469.411000] ...
> > > > [ 3469.411002] Call Trace:
> > > > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> > > >
> > > > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > > > 975d8000 90000002 00000003 00402040
> > > >
> > > > [ 3469.411024] ---[ end trace 0000000000000000 ]---
> > > >
> > > > here is the relevant config diff between your nolockup config and my
> > > > lockup config that I suspect your nolockup config didn't cause kernel
> > > > lockup
> > > >
> > > > diff -u config-nolockup config-lockup
> > > >
> > > > # Debug Oops, Lockups and Hangs
> > > > #
> > > > -# CONFIG_PANIC_ON_OOPS is not set
> > > > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > > > +CONFIG_PANIC_ON_OOPS=y
> > > > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > > > CONFIG_PANIC_TIMEOUT=0
> > > > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > > > +CONFIG_LOCKUP_DETECTOR=y
> > > > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > > > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > > > -# CONFIG_DETECT_HUNG_TASK is not set
> > > > -# CONFIG_WQ_WATCHDOG is not set
> > > > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > > > +CONFIG_HARDLOCKUP_DETECTOR=y
> > > > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > > > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > > > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > > > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > > > +CONFIG_DETECT_HUNG_TASK=y
> > > > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > > > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > > > +CONFIG_WQ_WATCHDOG=y
> > > > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > > > # CONFIG_TEST_LOCKUP is not set
> > > > # end of Debug Oops, Lockups and Hangs
> > > Hi, Vincent,
> > >
> > > I have applied all BPF patches with some small modifications. Can you
> > > test it again?
> > > https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
> > >
> > > Huacai
> > >
> > I see you applied the tail call bug patches, I thought about applying
> > tail call patches just in case, but I am not sure the
> > fentry_attach_stress involves tail call count bugs. I tried your
> > loongarch-next branch anyway, but the lockup still happens when I run
> > fentry_attach_stress in a while loop. I am not sure if this particular
> > test should block the merge of bpf trampoline patches, it looks to be
> > an extreme and rare case, maybe track it as a bug report and fix it
> > later after merge? just my two cents :)
> >
> > [root@fedora ~]# cd /usr/src/linux-loongson/
> > [root@fedora linux-loongson]# git branch
> > * loongarch-next
> > master
> > [root@fedora linux-loongson]# uname -a
> > Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
> > 2025 loongarch64 GNU/Linux
> >
> > [root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > #107 fentry_attach_stress:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> >
> > [root@fedora bpf]# while true; do ./test_progs -a
> > fentry_attach_stress; sleep 5; done
> > client_loop: send disconnect: Broken pipe
> Could you please try this on top of loongarch-next?
> diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> index e61c482068fe..c63c78a99f99 100644
> --- a/arch/loongarch/kernel/inst.c
> +++ b/arch/loongarch/kernel/inst.c
> @@ -5,6 +5,7 @@
> #include <linux/sizes.h>
> #include <linux/uaccess.h>
> #include <linux/set_memory.h>
> +#include <linux/stop_machine.h>
>
> #include <asm/cacheflush.h>
> #include <asm/inst.h>
> @@ -219,32 +220,49 @@ int larch_insn_patch_text(void *addr, u32 insn)
> return ret;
> }
>
> -int larch_insn_text_copy(void *dst, void *src, size_t len)
> +struct insn_copy {
> + void *dst;
> + void *src;
> + size_t len;
> + unsigned int cpu;
> +};
> +
> +static int text_copy_cb(void *data)
> {
> - int ret;
> - unsigned long flags;
> - unsigned long dst_start, dst_end, dst_len;
> + int ret = 0;
> + size_t start, end;
> + struct insn_copy *copy = data;
>
> - dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> - dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> - dst_len = dst_end - dst_start; /* page-aligned */
> + if (smp_processor_id() == copy->cpu) {
> + start = round_down((size_t)copy->dst, PAGE_SIZE);
> + end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
>
> - set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> - raw_spin_lock_irqsave(&patch_lock, flags);
> + set_memory_rw(start, (end - start) / PAGE_SIZE);
>
> - ret = copy_to_kernel_nofault(dst, src, len);
> - if (ret)
> - pr_err("%s: operation failed\n", __func__);
> + ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> + if (ret)
> + pr_err("%s: operation failed\n", __func__);
>
> - raw_spin_unlock_irqrestore(&patch_lock, flags);
> - set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> + set_memory_rox(start, (end - start) / PAGE_SIZE);
> + }
>
> - if (!ret)
> - flush_icache_range((unsigned long)dst, (unsigned
> long)dst + len);
> + flush_icache_range((unsigned long)copy->dst, (unsigned
> long)copy->dst + copy->len);
>
> return ret;
> }
>
> +int larch_insn_text_copy(void *dst, void *src, size_t len)
> +{
> + struct insn_copy copy = {
> + .dst = dst,
> + .src = src,
> + .len = len,
> + .cpu = smp_processor_id(),
> + };
> +
> + return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> +}
> +
> u32 larch_insn_gen_nop(void)
> {
> return INSN_NOP;
>
Here is the resulting code after I manually applied the above diff.
Unfortunately, it made the lockup issue worse: even with Tiezhu's
no-lockup config, the kernel locked up immediately when
fentry_attach_stress started.

struct insn_copy {
	void *dst;
	void *src;
	size_t len;
	unsigned int cpu;
};

static int text_copy_cb(void *data)
{
	int ret = 0;
	size_t start, end;
	struct insn_copy *copy = data;

	if (smp_processor_id() == copy->cpu) {
		start = round_down((size_t)copy->dst, PAGE_SIZE);
		end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
		set_memory_rw(start, (end - start) / PAGE_SIZE);
		ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
		if (ret)
			pr_err("%s: operation failed\n", __func__);

		set_memory_rox(start, (end - start) / PAGE_SIZE);
	}

	flush_icache_range((unsigned long)copy->dst,
			   (unsigned long)copy->dst + copy->len);

	return ret;
}

int larch_insn_text_copy(void *dst, void *src, size_t len)
{
	struct insn_copy copy = {
		.dst = dst,
		.src = src,
		.len = len,
		.cpu = smp_processor_id(),
	};

	return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
}

> >
> > > >
> > > > > >
> > > > > > Have you tried to run the same fentry_attach_stress multiple times or
> > > > > > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > > > > > sleep 1; done
> > > > > > the lockup happens intermittently, sometime it PASSED, sometime kernel
> > > > > > locks up. I merged the tools/testing/selftests/bpf/config with my
> > > > > > original config by
> > > > > > ./scripts/kconfig/merge_config.sh -y .config
> > > > > > tools/testing/selftests/bpf/config. my config seems including
> > > > > > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > > > > > here is my config https://www.bpfire.net/download/loongfire/config.txt
> > > > > >
> > > > > > > Thanks,
> > > > > > > Tiezhu
> > > > > > >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-04 13:28 ` Vincent Li
@ 2025-08-04 14:24 ` Vincent Li
2025-08-04 14:58 ` Vincent Li
2025-08-04 15:36 ` Vincent Li
0 siblings, 2 replies; 35+ messages in thread
From: Vincent Li @ 2025-08-04 14:24 UTC (permalink / raw)
To: Huacai Chen
Cc: Tiezhu Yang, Chenghao Duan, hengqi.chen, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Mon, Aug 4, 2025 at 6:28 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Mon, Aug 4, 2025 at 1:24 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Sun, Aug 3, 2025 at 11:25 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > >
> > > > > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > > > > >
> > > > > > > > On 2025/8/1 下午1:21, Vincent Li wrote:
> > > > > > > > > Hi Chenghao,
> > > > > > > > >
> > > > > > > > > I trimmed the email recipients only to the loongarch mailing list and
> > > > > > > > > folks who might pay attention to this, I personally don't like to
> > > > > > > > > bother other people who may not be interested in this :). Folks let me
> > > > > > > > > know if this is not ok. anyway, please check my bpf selftest result
> > > > > > > > > inline. The fentry_attach_stress results in kernel lockup.
> > > > > > > >
> > > > > > > > It passed on my test environment.
> > > > > > > >
> > > > > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > > > > #104 fentry_attach_stress:OK
> > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > >
> > > > > > > > I used loongson3_defconfig and the following additional configs:
> > > > > > > >
> > > > > > > > CONFIG_KPROBES=y
> > > > > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > > > > CONFIG_TEST_BPF=m
> > > > > > > > CONFIG_FTRACE=y
> > > > > > > > CONFIG_FUNCTION_TRACER=y
> > > > > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > > > > CONFIG_FPROBE=y
> > > > > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > > > > CONFIG_NET_SCH_BPF=y
> > > > > > > > CONFIG_BPF_LSM=y
> > > > > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > > > > >
> > > > > > > > I am not sure whether it is related with configs, you can test it again.
> > > > > > > >
> > > > > >
> > > > > > I did:
> > > > > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > > > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> > > > > > indeed I could not reproduce the lockup, even run the test in a loop.
> > > > > > it seems to be related to the kernel config I use, maybe you could try
> > > > > > my kernel config?
> > > > > >
> > > > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > > > fentry_attach_stress; sleep 1; done
> > > > > > #105 fentry_attach_stress:OK
> > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > #105 fentry_attach_stress:OK
> > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > >
> > > > > > #105 fentry_attach_stress:OK
> > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > #105 fentry_attach_stress:OK
> > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > #105 fentry_attach_stress:OK
> > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > #105 fentry_attach_stress:OK
> > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > #105 fentry_attach_stress:OK
> > > > > >
> > > > >
> > > > > while checking dmesg with your nolockup config, I see following log:
> > > > >
> > > > > [ 3469.410821] Hardware name: Loongson
> > > > > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > > > > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > > > > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > > > > 90000002975d8000 sp 90000002975dbc10
> > > > > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > > > > 00007ffff1c02638 a3 0000000000000000
> > > > > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > > > > 00007ffff0ce0f20 a7 0000000000000118
> > > > > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > > > > d665bdcea9f14eb9 t3 90000001dc9c1000
> > > > > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > > > > 0000000000000000 t7 0000000000000000
> > > > > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > > > > 90000002975dbec0 s0 90000002975dbc70
> > > > > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > > > > 0000000000000000 s4 ffff8000128f0000
> > > > > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > > > > 0000000000000050 s8 fffffffffffffdf4
> > > > > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > > > > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > > > > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > > > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > > > > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > > > > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > > > > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > > > > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > > > > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > > > > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > > > > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > > > > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > > > > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > > > > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > > > > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > > > > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > > > > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > > > > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > > > > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > > > > [ 3469.410946] Process test_progs (pid: 37338,
> > > > > threadinfo=00000000760120b6, task=00000000620daecd)
> > > > > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > [ 3469.410956] 000000000000000f 0000000000000000
> > > > > 90000002975dbc70 90000000076a5000
> > > > > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > > > > 90000002975dbd90 900000000608ab28
> > > > > [ 3469.410966] ffff8000128f0000 0000000000000000
> > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > [ 3469.410971] 0000000000000000 90000000076a5000
> > > > > 90000002975dbd90 00007ffff1c02638
> > > > > [ 3469.410976] 0000000000000000 000000000000000a
> > > > > ffff8000128f0000 9000000004f21328
> > > > > [ 3469.410981] 0000000000000000 9000000004e554c0
> > > > > 0000000000000000 900000010f6a4240
> > > > > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > > > > 0000000000000000 0000000000000000
> > > > > [ 3469.410990] 0000000000000000 0000000000000000
> > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > > > > 0000000000000000 0000000000000002
> > > > > [ 3469.411000] ...
> > > > > [ 3469.411002] Call Trace:
> > > > > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> > > > >
> > > > > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > > > > 975d8000 90000002 00000003 00402040
> > > > >
> > > > > [ 3469.411024] ---[ end trace 0000000000000000 ]---
> > > > >
> > > > > here is the relevant config diff between your nolockup config and my
> > > > > lockup config that I suspect your nolockup config didn't cause kernel
> > > > > lockup
> > > > >
> > > > > diff -u config-nolockup config-lockup
> > > > >
> > > > > # Debug Oops, Lockups and Hangs
> > > > > #
> > > > > -# CONFIG_PANIC_ON_OOPS is not set
> > > > > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > > > > +CONFIG_PANIC_ON_OOPS=y
> > > > > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > > > > CONFIG_PANIC_TIMEOUT=0
> > > > > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > > > > +CONFIG_LOCKUP_DETECTOR=y
> > > > > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > > > > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > > > > -# CONFIG_DETECT_HUNG_TASK is not set
> > > > > -# CONFIG_WQ_WATCHDOG is not set
> > > > > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > > > > +CONFIG_HARDLOCKUP_DETECTOR=y
> > > > > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > > > > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > > > > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > > > > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > > > > +CONFIG_DETECT_HUNG_TASK=y
> > > > > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > > > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > > > > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > > > > +CONFIG_WQ_WATCHDOG=y
> > > > > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > > > > # CONFIG_TEST_LOCKUP is not set
> > > > > # end of Debug Oops, Lockups and Hangs
> > > > Hi, Vincent,
> > > >
> > > > I have applied all BPF patches with some small modifications. Can you
> > > > test it again?
> > > > https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
> > > >
> > > > Huacai
> > > >
> > > I see you applied the tail call bug patches, I thought about applying
> > > tail call patches just in case, but I am not sure the
> > > fentry_attach_stress involves tail call count bugs. I tried your
> > > loongarch-next branch anyway, but the lockup still happens when I run
> > > fentry_attach_stress in a while loop. I am not sure if this particular
> > > test should block the merge of bpf trampoline patches, it looks to be
> > > an extreme and rare case, maybe track it as a bug report and fix it
> > > later after merge? just my two cents :)
> > >
> > > [root@fedora ~]# cd /usr/src/linux-loongson/
> > > [root@fedora linux-loongson]# git branch
> > > * loongarch-next
> > > master
> > > [root@fedora linux-loongson]# uname -a
> > > Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
> > > 2025 loongarch64 GNU/Linux
> > >
> > > [root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > #107 fentry_attach_stress:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > >
> > > [root@fedora bpf]# while true; do ./test_progs -a
> > > fentry_attach_stress; sleep 5; done
> > > client_loop: send disconnect: Broken pipe
> > Could you please try this on top of loongarch-next?
> > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > index e61c482068fe..c63c78a99f99 100644
> > --- a/arch/loongarch/kernel/inst.c
> > +++ b/arch/loongarch/kernel/inst.c
> > @@ -5,6 +5,7 @@
> > #include <linux/sizes.h>
> > #include <linux/uaccess.h>
> > #include <linux/set_memory.h>
> > +#include <linux/stop_machine.h>
> >
> > #include <asm/cacheflush.h>
> > #include <asm/inst.h>
> > @@ -219,32 +220,49 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > return ret;
> > }
> >
> > -int larch_insn_text_copy(void *dst, void *src, size_t len)
> > +struct insn_copy {
> > + void *dst;
> > + void *src;
> > + size_t len;
> > + unsigned int cpu;
> > +};
> > +
> > +static int text_copy_cb(void *data)
> > {
> > - int ret;
> > - unsigned long flags;
> > - unsigned long dst_start, dst_end, dst_len;
> > + int ret = 0;
> > + size_t start, end;
> > + struct insn_copy *copy = data;
> >
> > - dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> > - dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> > - dst_len = dst_end - dst_start; /* page-aligned */
> > + if (smp_processor_id() == copy->cpu) {
> > + start = round_down((size_t)copy->dst, PAGE_SIZE);
> > + end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
> >
> > - set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> > - raw_spin_lock_irqsave(&patch_lock, flags);
> > + set_memory_rw(start, (end - start) / PAGE_SIZE);
> >
> > - ret = copy_to_kernel_nofault(dst, src, len);
> > - if (ret)
> > - pr_err("%s: operation failed\n", __func__);
> > + ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> > + if (ret)
> > + pr_err("%s: operation failed\n", __func__);
> >
> > - raw_spin_unlock_irqrestore(&patch_lock, flags);
> > - set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> > + set_memory_rox(start, (end - start) / PAGE_SIZE);
> > + }
> >
> > - if (!ret)
> > - flush_icache_range((unsigned long)dst, (unsigned
> > long)dst + len);
> > + flush_icache_range((unsigned long)copy->dst, (unsigned
> > long)copy->dst + copy->len);
> >
> > return ret;
> > }
> >
> > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > +{
> > + struct insn_copy copy = {
> > + .dst = dst,
> > + .src = src,
> > + .len = len,
> > + .cpu = smp_processor_id(),
> > + };
> > +
> > + return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> > +}
> > +
> > u32 larch_insn_gen_nop(void)
> > {
> > return INSN_NOP;
> >
> Here is the result code I manually patched according to the above
> diff. unfortunately, it made the lockup issue worse, even with
> Tiezhu's no lockup config, it locked up immediately when start
> fentry_attach_stress
>
> struct insn_copy {
> void *dst;
> void *src;
> size_t len;
> unsigned int cpu;
> };
>
> static int text_copy_cb(void *data)
> {
> int ret = 0;
> size_t start, end;
> struct insn_copy *copy = data;
>
> if (smp_processor_id() == copy->cpu) {
> start = round_down((size_t)copy->dst, PAGE_SIZE);
> end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
> set_memory_rw(start, (end - start) / PAGE_SIZE);
> ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> if (ret)
> pr_err("%s: operation failed\n", __func__);
>
> set_memory_rox(start, (end - start) / PAGE_SIZE);
> }
>
> flush_icache_range((unsigned long)copy->dst, (unsigned
> long)copy->dst + copy->len);
>
> return ret;
> }
>
> int larch_insn_text_copy(void *dst, void *src, size_t len)
> {
> struct insn_copy copy = {
> .dst = dst,
> .src = src,
> .len = len,
> .cpu = smp_processor_id(),
> };
>
> return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> }
>
Huacai,

Here is your second revision of the code, which has worked well in
testing so far. With Tiezhu's kernel config there is no lockup when
running fentry_attach_stress in a while loop and no kernel panic trace
in dmesg, and the other fentry*/fexit* tests also passed. I am going to
do a clean kernel build with my own kernel config, with
softlockup/hardlockup detection on, and will report the result ASAP.

struct insn_copy {
	void *dst;
	void *src;
	size_t len;
	unsigned int cpu;
};

static int text_copy_cb(void *data)
{
	int ret = 0;
	struct insn_copy *copy = data;

	/* Only the CPU that initiated the copy writes the new text. */
	if (smp_processor_id() == copy->cpu) {
		ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
		if (ret)
			pr_err("%s: operation failed\n", __func__);
	}

	/* Every CPU flushes its icache for the patched range. */
	flush_icache_range((unsigned long)copy->dst,
			   (unsigned long)copy->dst + copy->len);

	return ret;
}

int larch_insn_text_copy(void *dst, void *src, size_t len)
{
	int ret = 0;
	size_t start, end;
	struct insn_copy copy = {
		.dst = dst,
		.src = src,
		.len = len,
		.cpu = smp_processor_id(),
	};

	start = round_down((size_t)dst, PAGE_SIZE);
	end = round_up((size_t)dst + len, PAGE_SIZE);

	/* Permission changes now happen outside the stop_machine callback. */
	set_memory_rw(start, (end - start) / PAGE_SIZE);
	ret = stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
	set_memory_rox(start, (end - start) / PAGE_SIZE);

	return ret;
}
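
For anyone who wants to double-check the page-range arithmetic in
larch_insn_text_copy() without booting a kernel, below is a minimal
user-space sketch. It is not kernel code: the sketch_* helpers are
hypothetical stand-ins for the kernel's round_down()/round_up(), and the
16 KiB page size is only an assumption for illustration (the real value is
whatever PAGE_SIZE the kernel was built with).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed page size, for illustration only. The kernel uses PAGE_SIZE,
 * which depends on the build configuration.
 */
#define SKETCH_PAGE_SIZE ((uintptr_t)16384)

/*
 * Hypothetical user-space stand-ins for the kernel's round_down() and
 * round_up(). Both are only valid when align is a power of two.
 */
static uintptr_t sketch_round_down(uintptr_t x, uintptr_t align)
{
	return x & ~(align - 1);
}

static uintptr_t sketch_round_up(uintptr_t x, uintptr_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Number of pages covered by [dst, dst + len), i.e. the page count that
 * would be passed to set_memory_rw()/set_memory_rox() in the code above.
 */
static size_t sketch_pages_spanned(uintptr_t dst, size_t len)
{
	uintptr_t start = sketch_round_down(dst, SKETCH_PAGE_SIZE);
	uintptr_t end = sketch_round_up(dst + len, SKETCH_PAGE_SIZE);

	return (size_t)((end - start) / SKETCH_PAGE_SIZE);
}
```

With these assumptions, a 16-byte copy that starts 8 bytes before a page
boundary spans two pages, so both pages would have their permissions
changed around the copy.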
>
> > >
> > > > >
> > > > > > >
> > > > > > > Have you tried to run the same fentry_attach_stress multiple times or
> > > > > > > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > > > > > > sleep 1; done
> > > > > > > the lockup happens intermittently, sometime it PASSED, sometime kernel
> > > > > > > locks up. I merged the tools/testing/selftests/bpf/config with my
> > > > > > > original config by
> > > > > > > ./scripts/kconfig/merge_config.sh -y .config
> > > > > > > tools/testing/selftests/bpf/config. my config seems including
> > > > > > > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > > > > > > here is my config https://www.bpfire.net/download/loongfire/config.txt
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Tiezhu
> > > > > > > >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-04 14:24 ` Vincent Li
@ 2025-08-04 14:58 ` Vincent Li
2025-08-04 15:36 ` Vincent Li
1 sibling, 0 replies; 35+ messages in thread
From: Vincent Li @ 2025-08-04 14:58 UTC (permalink / raw)
To: Huacai Chen
Cc: Tiezhu Yang, Chenghao Duan, hengqi.chen, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Mon, Aug 4, 2025 at 7:24 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Mon, Aug 4, 2025 at 6:28 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Mon, Aug 4, 2025 at 1:24 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Sun, Aug 3, 2025 at 11:25 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > >
> > > > > > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > > > > > >
> > > > > > > > > On 2025/8/1 下午1:21, Vincent Li wrote:
> > > > > > > > > > Hi Chenghao,
> > > > > > > > > >
> > > > > > > > > > I trimmed the email recipients only to the loongarch mailing list and
> > > > > > > > > > folks who might pay attention to this, I personally don't like to
> > > > > > > > > > bother other people who may not be interested in this :). Folks let me
> > > > > > > > > > know if this is not ok. anyway, please check my bpf selftest result
> > > > > > > > > > inline. The fentry_attach_stress results in kernel lockup.
> > > > > > > > >
> > > > > > > > > It passed on my test environment.
> > > > > > > > >
> > > > > > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > > > > > #104 fentry_attach_stress:OK
> > > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > > >
> > > > > > > > > I used loongson3_defconfig and the following additional configs:
> > > > > > > > >
> > > > > > > > > CONFIG_KPROBES=y
> > > > > > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > > > > > CONFIG_TEST_BPF=m
> > > > > > > > > CONFIG_FTRACE=y
> > > > > > > > > CONFIG_FUNCTION_TRACER=y
> > > > > > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > > > > > CONFIG_FPROBE=y
> > > > > > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > > > > > CONFIG_NET_SCH_BPF=y
> > > > > > > > > CONFIG_BPF_LSM=y
> > > > > > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > > > > > >
> > > > > > > > > I am not sure whether it is related with configs, you can test it again.
> > > > > > > > >
> > > > > > >
> > > > > > > I did:
> > > > > > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > > > > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> > > > > > > indeed I could not reproduce the lockup, even run the test in a loop.
> > > > > > > it seems to be related to the kernel config I use, maybe you could try
> > > > > > > my kernel config?
> > > > > > >
> > > > > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > > > > fentry_attach_stress; sleep 1; done
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > >
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > >
> > > > > >
> > > > > > while checking dmesg with your nolockup config, I see following log:
> > > > > >
> > > > > > [ 3469.410821] Hardware name: Loongson
> > > > > > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > > > > > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > > > > > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > > > > > 90000002975d8000 sp 90000002975dbc10
> > > > > > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > > > > > 00007ffff1c02638 a3 0000000000000000
> > > > > > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > > > > > 00007ffff0ce0f20 a7 0000000000000118
> > > > > > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > > > > > d665bdcea9f14eb9 t3 90000001dc9c1000
> > > > > > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > > > > > 0000000000000000 t7 0000000000000000
> > > > > > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > > > > > 90000002975dbec0 s0 90000002975dbc70
> > > > > > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > > > > > 0000000000000000 s4 ffff8000128f0000
> > > > > > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > > > > > 0000000000000050 s8 fffffffffffffdf4
> > > > > > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > > > > > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > > > > > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > > > > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > > > > > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > > > > > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > > > > > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > > > > > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > > > > > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > > > > > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > > > > > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > > > > > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > > > > > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > > > > > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > > > > > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > > > > > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > > > > > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > > > > > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > > > > > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > > > > > [ 3469.410946] Process test_progs (pid: 37338,
> > > > > > threadinfo=00000000760120b6, task=00000000620daecd)
> > > > > > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > > [ 3469.410956] 000000000000000f 0000000000000000
> > > > > > 90000002975dbc70 90000000076a5000
> > > > > > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > > > > > 90000002975dbd90 900000000608ab28
> > > > > > [ 3469.410966] ffff8000128f0000 0000000000000000
> > > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > > [ 3469.410971] 0000000000000000 90000000076a5000
> > > > > > 90000002975dbd90 00007ffff1c02638
> > > > > > [ 3469.410976] 0000000000000000 000000000000000a
> > > > > > ffff8000128f0000 9000000004f21328
> > > > > > [ 3469.410981] 0000000000000000 9000000004e554c0
> > > > > > 0000000000000000 900000010f6a4240
> > > > > > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > > > > > 0000000000000000 0000000000000000
> > > > > > [ 3469.410990] 0000000000000000 0000000000000000
> > > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > > > > > 0000000000000000 0000000000000002
> > > > > > [ 3469.411000] ...
> > > > > > [ 3469.411002] Call Trace:
> > > > > > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> > > > > >
> > > > > > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > > > > > 975d8000 90000002 00000003 00402040
> > > > > >
> > > > > > [ 3469.411024] ---[ end trace 0000000000000000 ]---
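For anyone reading the register dump above: the ESTAT value 000d0000 is what
the kernel prints as ECode=13 [INE], and the Code: line marks an all-zero word
at the faulting PC, which is consistent with an instruction-not-exist
exception. Below is a minimal userspace sketch of that decoding. It assumes the
CSR.ESTAT field layout IS[12:0], Ecode[21:16], EsubCode[30:22]; the struct and
function names are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields (illustration only). */
struct estat_fields {
	unsigned int is;       /* interrupt status bits, assumed ESTAT[12:0]  */
	unsigned int ecode;    /* exception code, assumed ESTAT[21:16]        */
	unsigned int esubcode; /* exception sub-code, assumed ESTAT[30:22]    */
};

/* Split a raw ESTAT value into its fields under the layout assumed above. */
static struct estat_fields decode_estat(uint32_t estat)
{
	struct estat_fields f;

	f.is = estat & 0x1fff;
	f.ecode = (estat >> 16) & 0x3f;
	f.esubcode = (estat >> 22) & 0x1ff;

	return f;
}
```

Feeding it the value from the log, decode_estat(0x000d0000), gives the same
ECode and EsubCode that dmesg printed, so the dump and the decoding agree.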
> > > > > >
> > > > > > Here is the relevant config diff between your nolockup config and my
> > > > > > lockup config, which I suspect is why your nolockup config did not
> > > > > > cause the kernel lockup:
> > > > > >
> > > > > > diff -u config-nolockup config-lockup
> > > > > >
> > > > > > # Debug Oops, Lockups and Hangs
> > > > > > #
> > > > > > -# CONFIG_PANIC_ON_OOPS is not set
> > > > > > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > > > > > +CONFIG_PANIC_ON_OOPS=y
> > > > > > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > > > > > CONFIG_PANIC_TIMEOUT=0
> > > > > > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > > > > > +CONFIG_LOCKUP_DETECTOR=y
> > > > > > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > > > > > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > > > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > > > > > -# CONFIG_DETECT_HUNG_TASK is not set
> > > > > > -# CONFIG_WQ_WATCHDOG is not set
> > > > > > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > > > > > +CONFIG_HARDLOCKUP_DETECTOR=y
> > > > > > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > > > > > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > > > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > > > > > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > > > > > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > > > > > +CONFIG_DETECT_HUNG_TASK=y
> > > > > > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > > > > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > > > > > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > > > > > +CONFIG_WQ_WATCHDOG=y
> > > > > > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > > > > > # CONFIG_TEST_LOCKUP is not set
> > > > > > # end of Debug Oops, Lockups and Hangs
> > > > > Hi, Vincent,
> > > > >
> > > > > I have applied all BPF patches with some small modifications. Can you
> > > > > test it again?
> > > > > https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
> > > > >
> > > > > Huacai
> > > > >
> > > > I see you applied the tail call bug patches. I thought about applying
> > > > the tail call patches just in case, but I am not sure that
> > > > fentry_attach_stress involves the tail call count bugs. I tried your
> > > > loongarch-next branch anyway, but the lockup still happens when I run
> > > > fentry_attach_stress in a while loop. I am not sure whether this
> > > > particular test should block the merge of the BPF trampoline patches; it
> > > > looks like an extreme and rare case. Maybe track it as a bug report and
> > > > fix it later, after the merge? Just my two cents :)
> > > >
> > > > [root@fedora ~]# cd /usr/src/linux-loongson/
> > > > [root@fedora linux-loongson]# git branch
> > > > * loongarch-next
> > > > master
> > > > [root@fedora linux-loongson]# uname -a
> > > > Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
> > > > 2025 loongarch64 GNU/Linux
> > > >
> > > > [root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > >
> > > > [root@fedora bpf]# while true; do ./test_progs -a fentry_attach_stress; sleep 5; done
> > > > client_loop: send disconnect: Broken pipe
> > > Could you please try this on top of loongarch-next?
> > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > index e61c482068fe..c63c78a99f99 100644
> > > --- a/arch/loongarch/kernel/inst.c
> > > +++ b/arch/loongarch/kernel/inst.c
> > > @@ -5,6 +5,7 @@
> > > #include <linux/sizes.h>
> > > #include <linux/uaccess.h>
> > > #include <linux/set_memory.h>
> > > +#include <linux/stop_machine.h>
> > >
> > > #include <asm/cacheflush.h>
> > > #include <asm/inst.h>
> > > @@ -219,32 +220,49 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > return ret;
> > > }
> > >
> > > -int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > +struct insn_copy {
> > > + void *dst;
> > > + void *src;
> > > + size_t len;
> > > + unsigned int cpu;
> > > +};
> > > +
> > > +static int text_copy_cb(void *data)
> > > {
> > > - int ret;
> > > - unsigned long flags;
> > > - unsigned long dst_start, dst_end, dst_len;
> > > + int ret = 0;
> > > + size_t start, end;
> > > + struct insn_copy *copy = data;
> > >
> > > - dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> > > - dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> > > - dst_len = dst_end - dst_start; /* page-aligned */
> > > + if (smp_processor_id() == copy->cpu) {
> > > + start = round_down((size_t)copy->dst, PAGE_SIZE);
> > > + end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
> > >
> > > - set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> > > - raw_spin_lock_irqsave(&patch_lock, flags);
> > > + set_memory_rw(start, (end - start) / PAGE_SIZE);
> > >
> > > - ret = copy_to_kernel_nofault(dst, src, len);
> > > - if (ret)
> > > - pr_err("%s: operation failed\n", __func__);
> > > + ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> > > + if (ret)
> > > + pr_err("%s: operation failed\n", __func__);
> > >
> > > - raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > - set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> > > + set_memory_rox(start, (end - start) / PAGE_SIZE);
> > > + }
> > >
> > > - if (!ret)
> > > - flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > > + flush_icache_range((unsigned long)copy->dst, (unsigned long)copy->dst + copy->len);
> > >
> > > return ret;
> > > }
> > >
> > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > +{
> > > + struct insn_copy copy = {
> > > + .dst = dst,
> > > + .src = src,
> > > + .len = len,
> > > + .cpu = smp_processor_id(),
> > > + };
> > > +
> > > + return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> > > +}
> > > +
> > > u32 larch_insn_gen_nop(void)
> > > {
> > > return INSN_NOP;
> > >
> > Here is the resulting code after I manually applied the above diff.
> > Unfortunately, it made the lockup issue worse: even with Tiezhu's
> > nolockup config, the kernel locked up immediately when I started
> > fentry_attach_stress.
> >
> > struct insn_copy {
> > void *dst;
> > void *src;
> > size_t len;
> > unsigned int cpu;
> > };
> >
> > static int text_copy_cb(void *data)
> > {
> > int ret = 0;
> > size_t start, end;
> > struct insn_copy *copy = data;
> >
> > if (smp_processor_id() == copy->cpu) {
> > start = round_down((size_t)copy->dst, PAGE_SIZE);
> > end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
> > set_memory_rw(start, (end - start) / PAGE_SIZE);
> > ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> > if (ret)
> > pr_err("%s: operation failed\n", __func__);
> >
> > set_memory_rox(start, (end - start) / PAGE_SIZE);
> > }
> >
> > flush_icache_range((unsigned long)copy->dst, (unsigned long)copy->dst + copy->len);
> >
> > return ret;
> > }
> >
> > int larch_insn_text_copy(void *dst, void *src, size_t len)
> > {
> > struct insn_copy copy = {
> > .dst = dst,
> > .src = src,
> > .len = len,
> > .cpu = smp_processor_id(),
> > };
> >
> > return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> > }
> >
> Huacai,
>
> Here is your second revision of the code. So far it tests well: with
> Tiezhu's kernel config there is no lockup when running fentry_attach_stress
> in a while loop, no kernel panic trace in dmesg, and the other
> fentry*/fexit* tests also passed. I am going to do a clean kernel build with
> my kernel config, with softlockup/hardlockup detection on, and will report
> the result ASAP.
>
All tests look good on a kernel with softlockup/hardlockup detection on:
[root@fedora bpf]# while true; do ./test_progs -a fentry_attach_stress; sleep 1; done
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
#107 fentry_attach_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
^C
[root@fedora bpf]# ./test_progs -a fexit_bpf2bpf
#110/1 fexit_bpf2bpf/target_no_callees:OK
#110/2 fexit_bpf2bpf/target_yes_callees:OK
#110/3 fexit_bpf2bpf/func_replace:OK
#110/4 fexit_bpf2bpf/func_replace_verify:OK
#110/5 fexit_bpf2bpf/func_sockmap_update:OK
#110/6 fexit_bpf2bpf/func_replace_return_code:OK
#110/7 fexit_bpf2bpf/func_map_prog_compatibility:OK
#110/8 fexit_bpf2bpf/func_replace_unreliable:OK
#110/9 fexit_bpf2bpf/func_replace_multi:OK
#110/10 fexit_bpf2bpf/fmod_ret_freplace:OK
#110/11 fexit_bpf2bpf/func_replace_global_func:OK
(cgroup_helpers.c:101: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
#110 fexit_bpf2bpf: Failed to setup cgroup environment
test_fentry_to_cgroup_bpf:FAIL:cgroup_fd unexpected cgroup_fd: actual -1 < expected 0
#110/12 fexit_bpf2bpf/fentry_to_cgroup_bpf:FAIL
#110/13 fexit_bpf2bpf/func_replace_progmap:OK
#110 fexit_bpf2bpf:FAIL
All error logs:
(cgroup_helpers.c:101: errno: Invalid argument) Enabling controller cpu: /mnt/cgroup.subtree_control
#110 fexit_bpf2bpf: Failed to setup cgroup environment
test_fentry_to_cgroup_bpf:FAIL:cgroup_fd unexpected cgroup_fd: actual -1 < expected 0
#110/12 fexit_bpf2bpf/fentry_to_cgroup_bpf:FAIL
#110 fexit_bpf2bpf:FAIL
Summary: 0/12 PASSED, 0 SKIPPED, 1 FAILED
[root@fedora bpf]# ./test_progs -a fentry_fexit
#108 fentry_fexit:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fentry_test
#109/1 fentry_test/fentry:OK
fentry_many_args:PASS:fentry_many_args_skel_load 0 nsec
libbpf: prog 'test2': failed to attach: -ENOTSUPP
libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524 (errno 524)
#109/2 fentry_test/fentry_many_args:FAIL
#109 fentry_test:FAIL
All error logs:
fentry_many_args:PASS:fentry_many_args_skel_load 0 nsec
libbpf: prog 'test2': failed to attach: -ENOTSUPP
libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524 (errno 524)
#109/2 fentry_test/fentry_many_args:FAIL
#109 fentry_test:FAIL
Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED
[root@fedora bpf]# ./test_progs -a fentry_attach_btf_presence
#106 fentry_attach_btf_presence:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fexit_noreturns
Summary: 0/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fexit_sleep
#111 fexit_sleep:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fexit_stress
#112 fexit_stress:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
[root@fedora bpf]# ./test_progs -a fexit_test
#113/1 fexit_test/fexit:OK
fexit_many_args:PASS:fexit_many_args_skel_load 0 nsec
libbpf: prog 'test2': failed to attach: -ENOTSUPP
libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524 (errno 524)
#113/2 fexit_test/fexit_many_args:FAIL
#113 fexit_test:FAIL
All error logs:
fexit_many_args:PASS:fexit_many_args_skel_load 0 nsec
libbpf: prog 'test2': failed to attach: -ENOTSUPP
libbpf: prog 'test2': failed to auto-attach: -ENOTSUPP
fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524 (errno 524)
#113/2 fexit_test/fexit_many_args:FAIL
#113 fexit_test:FAIL
Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED
[root@fedora linux-loongson]# xdp-loader status
CURRENT XDP PROGRAM STATUS:

Interface        Prio  Program name      Mode     ID     Tag               Chain actions
--------------------------------------------------------------------------------------
lo                     xdp_dispatcher    skb      74280  4d7e87c0d30db711
 =>              10    xdpfilt_alw_all            74289  320c53c06933a8fa  XDP_PASS
dummy0                 <No XDP program loaded!>
> struct insn_copy {
> void *dst;
> void *src;
> size_t len;
> unsigned int cpu;
> };
>
> static int text_copy_cb(void *data)
> {
> int ret = 0;
> struct insn_copy *copy = data;
>
> if (smp_processor_id() == copy->cpu) {
> ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> if (ret)
> pr_err("%s: operation failed\n", __func__);
>
> }
>
> flush_icache_range((unsigned long)copy->dst, (unsigned long)copy->dst + copy->len);
>
> return ret;
> }
>
> int larch_insn_text_copy(void *dst, void *src, size_t len)
> {
> int ret = 0;
> size_t start, end;
> struct insn_copy copy = {
> .dst = dst,
> .src = src,
> .len = len,
> .cpu = smp_processor_id(),
> };
>
> start = round_down((size_t)dst, PAGE_SIZE);
> end = round_up((size_t)dst + len, PAGE_SIZE);
>
> set_memory_rw(start, (end - start) / PAGE_SIZE);
> ret = stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> set_memory_rox(start, (end - start) / PAGE_SIZE);
>
> return ret;
> }
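As a side note on the start/end arithmetic in the quoted function: the page
count passed to set_memory_rw()/set_memory_rox() covers every page touched by
[dst, dst + len), so a copy that straddles a page boundary changes two pages.
A minimal userspace sketch of that calculation follows. It assumes 16 KiB
pages (the real PAGE_SIZE depends on the kernel config), and the sketch_*
helpers are hypothetical stand-ins for the kernel's round_down()/round_up(),
not the kernel macros themselves.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: 16 KiB pages. */
#define SKETCH_PAGE_SIZE ((uintptr_t)16384)

/* Round x down to a multiple of align (align must be a power of two). */
static uintptr_t sketch_round_down(uintptr_t x, uintptr_t align)
{
	return x & ~(align - 1);
}

/* Round x up to a multiple of align (align must be a power of two). */
static uintptr_t sketch_round_up(uintptr_t x, uintptr_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Number of pages covering the byte range [dst, dst + len). */
static size_t sketch_pages_spanned(uintptr_t dst, size_t len)
{
	uintptr_t start = sketch_round_down(dst, SKETCH_PAGE_SIZE);
	uintptr_t end = sketch_round_up(dst + len, SKETCH_PAGE_SIZE);

	return (size_t)((end - start) / SKETCH_PAGE_SIZE);
}
```

With these assumptions, a 16-byte copy at a page-aligned address spans one
page, while the same copy starting 8 bytes before a page boundary spans two.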
>
> >
> > > >
> > > > > >
> > > > > > > >
> > > > > > > > Have you tried running the same fentry_attach_stress test multiple
> > > > > > > > times, or in a loop such as: while true; do ./test_progs -a
> > > > > > > > fentry_attach_stress; sleep 1; done
> > > > > > > > The lockup happens intermittently: sometimes the test passes,
> > > > > > > > sometimes the kernel locks up. I merged
> > > > > > > > tools/testing/selftests/bpf/config into my original config with
> > > > > > > > ./scripts/kconfig/merge_config.sh -y .config
> > > > > > > > tools/testing/selftests/bpf/config. My config seems to include
> > > > > > > > everything you listed above, except that CONFIG_ARCH_STRICT_ALIGN
> > > > > > > > is not set. Here is my config: https://www.bpfire.net/download/loongfire/config.txt
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Tiezhu
> > > > > > > > >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-04 14:24 ` Vincent Li
2025-08-04 14:58 ` Vincent Li
@ 2025-08-04 15:36 ` Vincent Li
2025-08-04 15:51 ` Vincent Li
1 sibling, 1 reply; 35+ messages in thread
From: Vincent Li @ 2025-08-04 15:36 UTC (permalink / raw)
To: Huacai Chen
Cc: Tiezhu Yang, Chenghao Duan, hengqi.chen, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Mon, Aug 4, 2025 at 7:24 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Mon, Aug 4, 2025 at 6:28 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Mon, Aug 4, 2025 at 1:24 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > On Sun, Aug 3, 2025 at 11:25 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > >
> > > > On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > >
> > > > > > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > >
> > > > > > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > > > > > >
> > > > > > > > > On 2025/8/1 下午1:21, Vincent Li wrote:
> > > > > > > > > > Hi Chenghao,
> > > > > > > > > >
> > > > > > > > > > I trimmed the email recipients down to the loongarch mailing list and
> > > > > > > > > > the folks who might pay attention to this; I personally don't like to
> > > > > > > > > > bother other people who may not be interested in this :). Folks, let me
> > > > > > > > > > know if this is not OK. Anyway, please check my bpf selftest results
> > > > > > > > > > inline. The fentry_attach_stress test results in a kernel lockup.
> > > > > > > > >
> > > > > > > > > It passed on my test environment.
> > > > > > > > >
> > > > > > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > > > > > #104 fentry_attach_stress:OK
> > > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > > >
> > > > > > > > > I used loongson3_defconfig and the following additional configs:
> > > > > > > > >
> > > > > > > > > CONFIG_KPROBES=y
> > > > > > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > > > > > CONFIG_TEST_BPF=m
> > > > > > > > > CONFIG_FTRACE=y
> > > > > > > > > CONFIG_FUNCTION_TRACER=y
> > > > > > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > > > > > CONFIG_FPROBE=y
> > > > > > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > > > > > CONFIG_NET_SCH_BPF=y
> > > > > > > > > CONFIG_BPF_LSM=y
> > > > > > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > > > > > >
> > > > > > > > > I am not sure whether it is related to the configs; you can test it again.
> > > > > > > > >
> > > > > > >
> > > > > > > I did:
> > > > > > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > > > > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> > > > > > > Indeed, I could not reproduce the lockup, even when running the test in
> > > > > > > a loop. It seems to be related to the kernel config I use; maybe you
> > > > > > > could try my kernel config?
> > > > > > >
> > > > > > > [root@fedora bpf]# while true; do ./test_progs -a fentry_attach_stress; sleep 1; done
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > >
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > #105 fentry_attach_stress:OK
> > > > > > >
> > > > > >
> > > > > > While checking dmesg with your nolockup config, I see the following log:
> > > > > >
> > > > > > [ 3469.410821] Hardware name: Loongson
> > > > > > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > > > > > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > > > > > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > > > > > 90000002975d8000 sp 90000002975dbc10
> > > > > > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > > > > > 00007ffff1c02638 a3 0000000000000000
> > > > > > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > > > > > 00007ffff0ce0f20 a7 0000000000000118
> > > > > > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > > > > > d665bdcea9f14eb9 t3 90000001dc9c1000
> > > > > > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > > > > > 0000000000000000 t7 0000000000000000
> > > > > > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > > > > > 90000002975dbec0 s0 90000002975dbc70
> > > > > > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > > > > > 0000000000000000 s4 ffff8000128f0000
> > > > > > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > > > > > 0000000000000050 s8 fffffffffffffdf4
> > > > > > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > > > > > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > > > > > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > > > > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > > > > > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > > > > > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > > > > > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > > > > > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > > > > > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > > > > > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > > > > > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > > > > > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > > > > > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > > > > > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > > > > > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > > > > > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > > > > > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > > > > > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > > > > > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > > > > > [ 3469.410946] Process test_progs (pid: 37338,
> > > > > > threadinfo=00000000760120b6, task=00000000620daecd)
> > > > > > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > > [ 3469.410956] 000000000000000f 0000000000000000
> > > > > > 90000002975dbc70 90000000076a5000
> > > > > > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > > > > > 90000002975dbd90 900000000608ab28
> > > > > > [ 3469.410966] ffff8000128f0000 0000000000000000
> > > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > > [ 3469.410971] 0000000000000000 90000000076a5000
> > > > > > 90000002975dbd90 00007ffff1c02638
> > > > > > [ 3469.410976] 0000000000000000 000000000000000a
> > > > > > ffff8000128f0000 9000000004f21328
> > > > > > [ 3469.410981] 0000000000000000 9000000004e554c0
> > > > > > 0000000000000000 900000010f6a4240
> > > > > > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > > > > > 0000000000000000 0000000000000000
> > > > > > [ 3469.410990] 0000000000000000 0000000000000000
> > > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > > > > > 0000000000000000 0000000000000002
> > > > > > [ 3469.411000] ...
> > > > > > [ 3469.411002] Call Trace:
> > > > > > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> > > > > >
> > > > > > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > > > > > 975d8000 90000002 00000003 00402040
> > > > > >
> > > > > > [ 3469.411024] ---[ end trace 0000000000000000 ]---
> > > > > >
> > > > > > Here is the relevant config diff between your nolockup config and my
> > > > > > lockup config, which I suspect is why your nolockup config did not
> > > > > > cause the kernel lockup:
> > > > > >
> > > > > > diff -u config-nolockup config-lockup
> > > > > >
> > > > > > # Debug Oops, Lockups and Hangs
> > > > > > #
> > > > > > -# CONFIG_PANIC_ON_OOPS is not set
> > > > > > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > > > > > +CONFIG_PANIC_ON_OOPS=y
> > > > > > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > > > > > CONFIG_PANIC_TIMEOUT=0
> > > > > > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > > > > > +CONFIG_LOCKUP_DETECTOR=y
> > > > > > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > > > > > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > > > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > > > > > -# CONFIG_DETECT_HUNG_TASK is not set
> > > > > > -# CONFIG_WQ_WATCHDOG is not set
> > > > > > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > > > > > +CONFIG_HARDLOCKUP_DETECTOR=y
> > > > > > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > > > > > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > > > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > > > > > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > > > > > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > > > > > +CONFIG_DETECT_HUNG_TASK=y
> > > > > > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > > > > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > > > > > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > > > > > +CONFIG_WQ_WATCHDOG=y
> > > > > > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > > > > > # CONFIG_TEST_LOCKUP is not set
> > > > > > # end of Debug Oops, Lockups and Hangs
> > > > > Hi, Vincent,
> > > > >
> > > > > I have applied all BPF patches with some small modifications. Can you
> > > > > test it again?
> > > > > https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
> > > > >
> > > > > Huacai
> > > > >
> > > > I see you applied the tail call bug patches. I thought about applying
> > > > the tail call patches just in case, but I am not sure that
> > > > fentry_attach_stress involves the tail call count bugs. I tried your
> > > > loongarch-next branch anyway, but the lockup still happens when I run
> > > > fentry_attach_stress in a while loop. I am not sure whether this
> > > > particular test should block the merge of the BPF trampoline patches; it
> > > > looks like an extreme and rare case. Maybe track it as a bug report and
> > > > fix it later, after the merge? Just my two cents :)
> > > >
> > > > [root@fedora ~]# cd /usr/src/linux-loongson/
> > > > [root@fedora linux-loongson]# git branch
> > > > * loongarch-next
> > > > master
> > > > [root@fedora linux-loongson]# uname -a
> > > > Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
> > > > 2025 loongarch64 GNU/Linux
> > > >
> > > > [root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > #107 fentry_attach_stress:OK
> > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > >
> > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > fentry_attach_stress; sleep 5; done
> > > > client_loop: send disconnect: Broken pipe
> > > Could you please try this on top of loongarch-next?
> > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > index e61c482068fe..c63c78a99f99 100644
> > > --- a/arch/loongarch/kernel/inst.c
> > > +++ b/arch/loongarch/kernel/inst.c
> > > @@ -5,6 +5,7 @@
> > > #include <linux/sizes.h>
> > > #include <linux/uaccess.h>
> > > #include <linux/set_memory.h>
> > > +#include <linux/stop_machine.h>
> > >
> > > #include <asm/cacheflush.h>
> > > #include <asm/inst.h>
> > > @@ -219,32 +220,49 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > return ret;
> > > }
> > >
> > > -int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > +struct insn_copy {
> > > + void *dst;
> > > + void *src;
> > > + size_t len;
> > > + unsigned int cpu;
> > > +};
> > > +
> > > +static int text_copy_cb(void *data)
> > > {
> > > - int ret;
> > > - unsigned long flags;
> > > - unsigned long dst_start, dst_end, dst_len;
> > > + int ret = 0;
> > > + size_t start, end;
> > > + struct insn_copy *copy = data;
> > >
> > > - dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> > > - dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> > > - dst_len = dst_end - dst_start; /* page-aligned */
> > > + if (smp_processor_id() == copy->cpu) {
> > > + start = round_down((size_t)copy->dst, PAGE_SIZE);
> > > + end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
> > >
> > > - set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> > > - raw_spin_lock_irqsave(&patch_lock, flags);
> > > + set_memory_rw(start, (end - start) / PAGE_SIZE);
> > >
> > > - ret = copy_to_kernel_nofault(dst, src, len);
> > > - if (ret)
> > > - pr_err("%s: operation failed\n", __func__);
> > > + ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> > > + if (ret)
> > > + pr_err("%s: operation failed\n", __func__);
> > >
> > > - raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > - set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> > > + set_memory_rox(start, (end - start) / PAGE_SIZE);
> > > + }
> > >
> > > - if (!ret)
> > > - flush_icache_range((unsigned long)dst, (unsigned
> > > long)dst + len);
> > > + flush_icache_range((unsigned long)copy->dst, (unsigned
> > > long)copy->dst + copy->len);
> > >
> > > return ret;
> > > }
> > >
> > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > +{
> > > + struct insn_copy copy = {
> > > + .dst = dst,
> > > + .src = src,
> > > + .len = len,
> > > + .cpu = smp_processor_id(),
> > > + };
> > > +
> > > > + return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> > > +}
> > > +
> > > u32 larch_insn_gen_nop(void)
> > > {
> > > return INSN_NOP;
> > >
> > Here is the result code I manually patched according to the above
> > diff. unfortunately, it made the lockup issue worse, even with
> > Tiezhu's no lockup config, it locked up immediately when start
> > fentry_attach_stress
> >
> > struct insn_copy {
> > void *dst;
> > void *src;
> > size_t len;
> > unsigned int cpu;
> > };
> >
> > static int text_copy_cb(void *data)
> > {
> > int ret = 0;
> > size_t start, end;
> > struct insn_copy *copy = data;
> >
> > if (smp_processor_id() == copy->cpu) {
> > start = round_down((size_t)copy->dst, PAGE_SIZE);
> > end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
> > set_memory_rw(start, (end - start) / PAGE_SIZE);
> > ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> > if (ret)
> > pr_err("%s: operation failed\n", __func__);
> >
> > set_memory_rox(start, (end - start) / PAGE_SIZE);
> > }
> >
> > flush_icache_range((unsigned long)copy->dst, (unsigned
> > long)copy->dst + copy->len);
> >
> > return ret;
> > }
> >
> > int larch_insn_text_copy(void *dst, void *src, size_t len)
> > {
> > struct insn_copy copy = {
> > .dst = dst,
> > .src = src,
> > .len = len,
> > .cpu = smp_processor_id(),
> > };
> >
> > > return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> > }
> >
> Huacai,
>
> Here is your second revision of the code, which has tested well so far:
> with Tiezhu's kernel config, there is no lockup when running
> fentry_attach_stress in a while loop, no kernel panic trace in dmesg, and
> the other fentry*/fexit* tests also pass. I am going to do a clean kernel
> build using my kernel config, with softlockup/hardlockup detection on, and
> will report the result ASAP.
>
> struct insn_copy {
> void *dst;
> void *src;
> size_t len;
> unsigned int cpu;
> };
>
> static int text_copy_cb(void *data)
> {
> int ret = 0;
> struct insn_copy *copy = data;
>
> if (smp_processor_id() == copy->cpu) {
> ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> if (ret)
> pr_err("%s: operation failed\n", __func__);
>
> }
>
> flush_icache_range((unsigned long)copy->dst, (unsigned
> long)copy->dst + copy->len);
>
> return ret;
> }
>
> int larch_insn_text_copy(void *dst, void *src, size_t len)
> {
> int ret = 0;
> size_t start, end;
> struct insn_copy copy = {
> .dst = dst,
> .src = src,
> .len = len,
> .cpu = smp_processor_id(),
> };
>
> start = round_down((size_t)dst, PAGE_SIZE);
> end = round_up((size_t)dst + len, PAGE_SIZE);
>
> set_memory_rw(start, (end - start) / PAGE_SIZE);
> return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> set_memory_rox(start, (end - start) / PAGE_SIZE);
>
> return ret;
> }
Here is the correct code; I have re-run the test and it passed.

struct insn_copy {
	void *dst;
	void *src;
	size_t len;
	unsigned int cpu;
};

static int text_copy_cb(void *data)
{
	int ret = 0;
	struct insn_copy *copy = data;

	if (smp_processor_id() == copy->cpu) {
		ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
		if (ret)
			pr_err("%s: operation failed\n", __func__);
	}

	flush_icache_range((unsigned long)copy->dst,
			   (unsigned long)copy->dst + copy->len);

	return ret;
}

int larch_insn_text_copy(void *dst, void *src, size_t len)
{
	int ret = 0;
	size_t start, end;
	struct insn_copy copy = {
		.dst = dst,
		.src = src,
		.len = len,
		.cpu = smp_processor_id(),
	};

	start = round_down((size_t)dst, PAGE_SIZE);
	end = round_up((size_t)dst + len, PAGE_SIZE);

	set_memory_rw(start, (end - start) / PAGE_SIZE);
	ret = stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
	set_memory_rox(start, (end - start) / PAGE_SIZE);

	return ret;
}
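
For anyone comparing this with the second revision quoted above: the only difference is that the earlier paste returned directly from the stop_machine call, so the set_memory_rox() after it could never run and the pages would have stayed writable. A minimal userspace sketch of that difference follows. Every mock_* name and the region_writable flag are hypothetical stand-ins for illustration, not kernel APIs, and the copy step is assumed to always succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the protection state of the target pages. */
static bool region_writable;

/* Hypothetical stand-ins for set_memory_rw() / set_memory_rox(). */
static void mock_set_rw(void)  { region_writable = true; }
static void mock_set_rox(void) { region_writable = false; }

/* Stand-in for the stop_machine call; assumed to always succeed here. */
static int mock_copy(void)
{
	assert(region_writable); /* the copy needs writable pages */
	return 0;
}

/* Shape of the earlier paste: returns before restoring protection. */
static int text_copy_early_return(void)
{
	mock_set_rw();
	return mock_copy();
	/* a mock_set_rox() call placed after the return would never run */
}

/* Shape of the corrected code: save the result, restore, then return. */
static int text_copy_fixed(void)
{
	int ret;

	mock_set_rw();
	ret = mock_copy();
	mock_set_rox();

	return ret;
}
```

In this model both variants report success, which may be why the earlier paste still passed the selftest; only the final protection state differs.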
>
> >
> > > >
> > > > > >
> > > > > > > >
> > > > > > > > Have you tried to run the same fentry_attach_stress multiple times or
> > > > > > > > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > > > > > > > sleep 1; done
> > > > > > > > the lockup happens intermittently, sometime it PASSED, sometime kernel
> > > > > > > > locks up. I merged the tools/testing/selftests/bpf/config with my
> > > > > > > > original config by
> > > > > > > > ./scripts/kconfig/merge_config.sh -y .config
> > > > > > > > tools/testing/selftests/bpf/config. my config seems including
> > > > > > > > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > > > > > > > here is my config https://www.bpfire.net/download/loongfire/config.txt
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Tiezhu
> > > > > > > > >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 0/5] Support trampoline for LoongArch
2025-08-04 15:36 ` Vincent Li
@ 2025-08-04 15:51 ` Vincent Li
0 siblings, 0 replies; 35+ messages in thread
From: Vincent Li @ 2025-08-04 15:51 UTC (permalink / raw)
To: Huacai Chen
Cc: Tiezhu Yang, Chenghao Duan, hengqi.chen, kernel, loongarch,
guodongtai, youling.tang, jianghaoran, geliang
On Mon, Aug 4, 2025 at 8:36 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
>
> On Mon, Aug 4, 2025 at 7:24 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> >
> > On Mon, Aug 4, 2025 at 6:28 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > >
> > > On Mon, Aug 4, 2025 at 1:24 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > On Sun, Aug 3, 2025 at 11:25 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > >
> > > > > On Sun, Aug 3, 2025 at 7:11 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > On Sat, Aug 2, 2025 at 11:52 PM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > >
> > > > > > > On Sat, Aug 2, 2025 at 7:47 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Sat, Aug 2, 2025 at 6:53 AM Vincent Li <vincent.mc.li@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Sat, Aug 2, 2025 at 2:19 AM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
> > > > > > > > > >
> > > > > > > > > > On 2025/8/1 下午1:21, Vincent Li wrote:
> > > > > > > > > > > Hi Chenghao,
> > > > > > > > > > >
> > > > > > > > > > > I trimmed the email recipients only to the loongarch mailing list and
> > > > > > > > > > > folks who might pay attention to this, I personally don't like to
> > > > > > > > > > > bother other people who may not be interested in this :). Folks let me
> > > > > > > > > > > know if this is not ok. anyway, please check my bpf selftest result
> > > > > > > > > > > inline. The fentry_attach_stress results in kernel lockup.
> > > > > > > > > >
> > > > > > > > > > It passed on my test environment.
> > > > > > > > > >
> > > > > > > > > > $ sudo ./test_progs -a fentry_attach_stress
> > > > > > > > > > #104 fentry_attach_stress:OK
> > > > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > > > >
> > > > > > > > > > I used loongson3_defconfig and the following additional configs:
> > > > > > > > > >
> > > > > > > > > > CONFIG_KPROBES=y
> > > > > > > > > > CONFIG_FUNCTION_ERROR_INJECTION=y
> > > > > > > > > > CONFIG_TEST_BPF=m
> > > > > > > > > > CONFIG_FTRACE=y
> > > > > > > > > > CONFIG_FUNCTION_TRACER=y
> > > > > > > > > > CONFIG_DYNAMIC_FTRACE=y
> > > > > > > > > > CONFIG_FPROBE=y
> > > > > > > > > > CONFIG_FTRACE_SYSCALLS=y
> > > > > > > > > > CONFIG_BPF_KPROBE_OVERRIDE=y
> > > > > > > > > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> > > > > > > > > > CONFIG_DEBUG_INFO_BTF=y
> > > > > > > > > > CONFIG_NET_SCH_BPF=y
> > > > > > > > > > CONFIG_BPF_LSM=y
> > > > > > > > > > CONFIG_BPF_UNPRIV_DEFAULT_OFF=n
> > > > > > > > > > CONFIG_ARCH_STRICT_ALIGN=n
> > > > > > > > > >
> > > > > > > > > > I am not sure whether it is related with configs, you can test it again.
> > > > > > > > > >
> > > > > > > >
> > > > > > > > I did:
> > > > > > > > cp arch/loongarch/configs/loongson3_defconfig .config
> > > > > > > > ./scripts/kconfig/merge_config.sh -y .config config-tiezhu(your above config)
> > > > > > > > indeed I could not reproduce the lockup, even run the test in a loop.
> > > > > > > > it seems to be related to the kernel config I use, maybe you could try
> > > > > > > > my kernel config?
> > > > > > > >
> > > > > > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > > > > > fentry_attach_stress; sleep 1; done
> > > > > > > > #105 fentry_attach_stress:OK
> > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > > #105 fentry_attach_stress:OK
> > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > >
> > > > > > > > #105 fentry_attach_stress:OK
> > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > > #105 fentry_attach_stress:OK
> > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > > #105 fentry_attach_stress:OK
> > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > > #105 fentry_attach_stress:OK
> > > > > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > > > > #105 fentry_attach_stress:OK
> > > > > > > >
> > > > > > >
> > > > > > > while checking dmesg with your nolockup config, I see following log:
> > > > > > >
> > > > > > > [ 3469.410821] Hardware name: Loongson
> > > > > > > Loongson-3A6000-7A2000-NUC/Loongson-3A6000-7A2000-NUC, BIOS
> > > > > > > Loongson-UDK2018-V4.0.05759-stable202405 07/12/24 15:49:14
> > > > > > > [ 3469.410824] pc 90000002456d4880 ra 90000000060885f4 tp
> > > > > > > 90000002975d8000 sp 90000002975dbc10
> > > > > > > [ 3469.410826] a0 0000000000000000 a1 ffff8000128f0048 a2
> > > > > > > 00007ffff1c02638 a3 0000000000000000
> > > > > > > [ 3469.410828] a4 00007ffff1c02680 a5 00007ffff0ce0f20 a6
> > > > > > > 00007ffff0ce0f20 a7 0000000000000118
> > > > > > > [ 3469.410830] t0 ffff80000338dd44 t1 90000002456d4880 t2
> > > > > > > d665bdcea9f14eb9 t3 90000001dc9c1000
> > > > > > > [ 3469.410832] t4 90000001dea99670 t5 0000000000000000 t6
> > > > > > > 0000000000000000 t7 0000000000000000
> > > > > > > [ 3469.410834] t8 000000000000000f u0 000000000000000a s9
> > > > > > > 90000002975dbec0 s0 90000002975dbc70
> > > > > > > [ 3469.410836] s1 0000000000000000 s2 90000000076a5000 s3
> > > > > > > 0000000000000000 s4 ffff8000128f0000
> > > > > > > [ 3469.410838] s5 0000000000000000 s6 0000000000000000 s7
> > > > > > > 0000000000000050 s8 fffffffffffffdf4
> > > > > > > [ 3469.410840] ra: 90000000060885f4 __bpf_prog_test_run_raw_tp+0x6c/0x108
> > > > > > > [ 3469.410848] ERA: 90000002456d4880 0x90000002456d4880
> > > > > > > [ 3469.410851] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > > > > > [ 3469.410860] PRMD: 00000004 (PPLV0 +PIE -PWE)
> > > > > > > [ 3469.410865] EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> > > > > > > [ 3469.410870] ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> > > > > > > [ 3469.410875] ESTAT: 000d0000 [INE] (IS= ECode=13 EsubCode=0)
> > > > > > > [ 3469.410878] PRID: 0014d000 (Loongson-64bit, Loongson-3A6000)
> > > > > > > [ 3469.410880] Modules linked in: bpf_testmod(O) tls nft_fib_inet
> > > > > > > nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> > > > > > > nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> > > > > > > nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink cmac
> > > > > > > algif_hash algif_skcipher af_alg bnep vfat fat rtw88_8821ce
> > > > > > > rtw88_8821c rtw88_pci rtw88_core mac80211 btusb libarc4 btrtl btbcm
> > > > > > > btmtk btintel cfg80211 bluetooth sha3_generic kvm jitterentropy_rng
> > > > > > > drbg ecdh_generic ecc loongson3_cpufreq spi_loongson_pci rfkill
> > > > > > > spi_loongson_core uio_pdrv_genirq uio lm75 fuse efi_pstore pstore zram
> > > > > > > 842_decompress 842_compress lz4hc_compress lz4_compress uas
> > > > > > > usb_storage efivarfs [last unloaded: bpf_testmod(O)]
> > > > > > > [ 3469.410946] Process test_progs (pid: 37338,
> > > > > > > threadinfo=00000000760120b6, task=00000000620daecd)
> > > > > > > [ 3469.410950] Stack : 0000000000000000 0000000000000000
> > > > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > > > [ 3469.410956] 000000000000000f 0000000000000000
> > > > > > > 90000002975dbc70 90000000076a5000
> > > > > > > [ 3469.410961] 00007ffff1c02638 ffff8000128f0000
> > > > > > > 90000002975dbd90 900000000608ab28
> > > > > > > [ 3469.410966] ffff8000128f0000 0000000000000000
> > > > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > > > [ 3469.410971] 0000000000000000 90000000076a5000
> > > > > > > 90000002975dbd90 00007ffff1c02638
> > > > > > > [ 3469.410976] 0000000000000000 000000000000000a
> > > > > > > ffff8000128f0000 9000000004f21328
> > > > > > > [ 3469.410981] 0000000000000000 9000000004e554c0
> > > > > > > 0000000000000000 900000010f6a4240
> > > > > > > [ 3469.410986] 0000000129e18000 ffffffff00003500
> > > > > > > 0000000000000000 0000000000000000
> > > > > > > [ 3469.410990] 0000000000000000 0000000000000000
> > > > > > > 0000000000000000 d665bdcea9f14eb9
> > > > > > > [ 3469.410995] 0000000000000001 00007ffff1c03040
> > > > > > > 0000000000000000 0000000000000002
> > > > > > > [ 3469.411000] ...
> > > > > > > [ 3469.411002] Call Trace:
> > > > > > > [ 3469.411005] [<9000000004d11fb8>] handle_syscall+0xb8/0x158
> > > > > > >
> > > > > > > [ 3469.411012] Code: 4c2e0031 31465341 00323731 <00000000> 00000000
> > > > > > > 975d8000 90000002 00000003 00402040
> > > > > > >
> > > > > > > [ 3469.411024] ---[ end trace 0000000000000000 ]---
> > > > > > >
> > > > > > > here is the relevant config diff between your nolockup config and my
> > > > > > > lockup config that I suspect your nolockup config didn't cause kernel
> > > > > > > lockup
> > > > > > >
> > > > > > > diff -u config-nolockup config-lockup
> > > > > > >
> > > > > > > # Debug Oops, Lockups and Hangs
> > > > > > > #
> > > > > > > -# CONFIG_PANIC_ON_OOPS is not set
> > > > > > > -CONFIG_PANIC_ON_OOPS_VALUE=0
> > > > > > > +CONFIG_PANIC_ON_OOPS=y
> > > > > > > +CONFIG_PANIC_ON_OOPS_VALUE=1
> > > > > > > CONFIG_PANIC_TIMEOUT=0
> > > > > > > -# CONFIG_SOFTLOCKUP_DETECTOR is not set
> > > > > > > +CONFIG_LOCKUP_DETECTOR=y
> > > > > > > +CONFIG_SOFTLOCKUP_DETECTOR=y
> > > > > > > +CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
> > > > > > > CONFIG_HAVE_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > > > > -# CONFIG_HARDLOCKUP_DETECTOR is not set
> > > > > > > -# CONFIG_DETECT_HUNG_TASK is not set
> > > > > > > -# CONFIG_WQ_WATCHDOG is not set
> > > > > > > -# CONFIG_WQ_CPU_INTENSIVE_REPORT is not set
> > > > > > > +CONFIG_HARDLOCKUP_DETECTOR=y
> > > > > > > +# CONFIG_HARDLOCKUP_DETECTOR_PERF is not set
> > > > > > > +CONFIG_HARDLOCKUP_DETECTOR_BUDDY=y
> > > > > > > +# CONFIG_HARDLOCKUP_DETECTOR_ARCH is not set
> > > > > > > +CONFIG_HARDLOCKUP_DETECTOR_COUNTS_HRTIMER=y
> > > > > > > +CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
> > > > > > > +CONFIG_DETECT_HUNG_TASK=y
> > > > > > > +CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
> > > > > > > +CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
> > > > > > > +CONFIG_DETECT_HUNG_TASK_BLOCKER=y
> > > > > > > +CONFIG_WQ_WATCHDOG=y
> > > > > > > +CONFIG_WQ_CPU_INTENSIVE_REPORT=y
> > > > > > > # CONFIG_TEST_LOCKUP is not set
> > > > > > > # end of Debug Oops, Lockups and Hangs
> > > > > > Hi, Vincent,
> > > > > >
> > > > > > I have applied all BPF patches with some small modifications. Can you
> > > > > > test it again?
> > > > > > https://web.git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
> > > > > >
> > > > > > Huacai
> > > > > >
> > > > > I see you applied the tail call bug patches, I thought about applying
> > > > > tail call patches just in case, but I am not sure the
> > > > > fentry_attach_stress involves tail call count bugs. I tried your
> > > > > loongarch-next branch anyway, but the lockup still happens when I run
> > > > > fentry_attach_stress in a while loop. I am not sure if this particular
> > > > > test should block the merge of bpf trampoline patches, it looks to be
> > > > > an extreme and rare case, maybe track it as a bug report and fix it
> > > > > later after merge? just my two cents :)
> > > > >
> > > > > [root@fedora ~]# cd /usr/src/linux-loongson/
> > > > > [root@fedora linux-loongson]# git branch
> > > > > * loongarch-next
> > > > > master
> > > > > [root@fedora linux-loongson]# uname -a
> > > > > Linux fedora 6.16.0+ #1 SMP PREEMPT_DYNAMIC Sun Aug 3 08:01:34 PDT
> > > > > 2025 loongarch64 GNU/Linux
> > > > >
> > > > > [root@fedora linux-loongson]# cd tools/testing/selftests/bpf/
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > > [root@fedora bpf]# ./test_progs -a fentry_attach_stress
> > > > > #107 fentry_attach_stress:OK
> > > > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > > >
> > > > > [root@fedora bpf]# while true; do ./test_progs -a
> > > > > fentry_attach_stress; sleep 5; done
> > > > > client_loop: send disconnect: Broken pipe
> > > > Could you please try this on top of loongarch-next?
> > > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > > index e61c482068fe..c63c78a99f99 100644
> > > > --- a/arch/loongarch/kernel/inst.c
> > > > +++ b/arch/loongarch/kernel/inst.c
> > > > @@ -5,6 +5,7 @@
> > > > #include <linux/sizes.h>
> > > > #include <linux/uaccess.h>
> > > > #include <linux/set_memory.h>
> > > > +#include <linux/stop_machine.h>
> > > >
> > > > #include <asm/cacheflush.h>
> > > > #include <asm/inst.h>
> > > > @@ -219,32 +220,49 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > > return ret;
> > > > }
> > > >
> > > > -int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > > +struct insn_copy {
> > > > + void *dst;
> > > > + void *src;
> > > > + size_t len;
> > > > + unsigned int cpu;
> > > > +};
> > > > +
> > > > +static int text_copy_cb(void *data)
> > > > {
> > > > - int ret;
> > > > - unsigned long flags;
> > > > - unsigned long dst_start, dst_end, dst_len;
> > > > + int ret = 0;
> > > > + size_t start, end;
> > > > + struct insn_copy *copy = data;
> > > >
> > > > - dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> > > > - dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> > > > - dst_len = dst_end - dst_start; /* page-aligned */
> > > > + if (smp_processor_id() == copy->cpu) {
> > > > + start = round_down((size_t)copy->dst, PAGE_SIZE);
> > > > + end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
> > > >
> > > > - set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> > > > - raw_spin_lock_irqsave(&patch_lock, flags);
> > > > + set_memory_rw(start, (end - start) / PAGE_SIZE);
> > > >
> > > > - ret = copy_to_kernel_nofault(dst, src, len);
> > > > - if (ret)
> > > > - pr_err("%s: operation failed\n", __func__);
> > > > + ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> > > > + if (ret)
> > > > + pr_err("%s: operation failed\n", __func__);
> > > >
> > > > - raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > > - set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> > > > + set_memory_rox(start, (end - start) / PAGE_SIZE);
> > > > + }
> > > >
> > > > - if (!ret)
> > > > - flush_icache_range((unsigned long)dst, (unsigned
> > > > long)dst + len);
> > > > + flush_icache_range((unsigned long)copy->dst, (unsigned
> > > > long)copy->dst + copy->len);
> > > >
> > > > return ret;
> > > > }
> > > >
> > > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > > +{
> > > > + struct insn_copy copy = {
> > > > + .dst = dst,
> > > > + .src = src,
> > > > + .len = len,
> > > > + .cpu = smp_processor_id(),
> > > > + };
> > > > +
> > > > > + return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> > > > +}
> > > > +
> > > > u32 larch_insn_gen_nop(void)
> > > > {
> > > > return INSN_NOP;
> > > >
> > > Here is the result code I manually patched according to the above
> > > diff. unfortunately, it made the lockup issue worse, even with
> > > Tiezhu's no lockup config, it locked up immediately when start
> > > fentry_attach_stress
> > >
> > > struct insn_copy {
> > > void *dst;
> > > void *src;
> > > size_t len;
> > > unsigned int cpu;
> > > };
> > >
> > > static int text_copy_cb(void *data)
> > > {
> > > int ret = 0;
> > > size_t start, end;
> > > struct insn_copy *copy = data;
> > >
> > > if (smp_processor_id() == copy->cpu) {
> > > start = round_down((size_t)copy->dst, PAGE_SIZE);
> > > end = round_up((size_t)copy->dst + copy->len, PAGE_SIZE);
> > > set_memory_rw(start, (end - start) / PAGE_SIZE);
> > > ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> > > if (ret)
> > > pr_err("%s: operation failed\n", __func__);
> > >
> > > set_memory_rox(start, (end - start) / PAGE_SIZE);
> > > }
> > >
> > > flush_icache_range((unsigned long)copy->dst, (unsigned
> > > long)copy->dst + copy->len);
> > >
> > > return ret;
> > > }
> > >
> > > int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > {
> > > struct insn_copy copy = {
> > > .dst = dst,
> > > .src = src,
> > > .len = len,
> > > .cpu = smp_processor_id(),
> > > };
> > >
> > > > return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> > > }
> > >
> > Huacai,
> >
> > Here is your second revision of the code, which has tested well so far:
> > with Tiezhu's kernel config, there is no lockup when running
> > fentry_attach_stress in a while loop, no kernel panic trace in dmesg, and
> > the other fentry*/fexit* tests also pass. I am going to do a clean kernel
> > build using my kernel config, with softlockup/hardlockup detection on, and
> > will report the result ASAP.
> >
> > struct insn_copy {
> > void *dst;
> > void *src;
> > size_t len;
> > unsigned int cpu;
> > };
> >
> > static int text_copy_cb(void *data)
> > {
> > int ret = 0;
> > struct insn_copy *copy = data;
> >
> > if (smp_processor_id() == copy->cpu) {
> > ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> > if (ret)
> > pr_err("%s: operation failed\n", __func__);
> >
> > }
> >
> > flush_icache_range((unsigned long)copy->dst, (unsigned
> > long)copy->dst + copy->len);
> >
> > return ret;
> > }
> >
> > int larch_insn_text_copy(void *dst, void *src, size_t len)
> > {
> > int ret = 0;
> > size_t start, end;
> > struct insn_copy copy = {
> > .dst = dst,
> > .src = src,
> > .len = len,
> > .cpu = smp_processor_id(),
> > };
> >
> > start = round_down((size_t)dst, PAGE_SIZE);
> > end = round_up((size_t)dst + len, PAGE_SIZE);
> >
> > set_memory_rw(start, (end - start) / PAGE_SIZE);
> > > return stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> > set_memory_rox(start, (end - start) / PAGE_SIZE);
> >
> > return ret;
> > }
>
> here is the correct code and I have re-run the test, test passed
>
> struct insn_copy {
> void *dst;
> void *src;
> size_t len;
> unsigned int cpu;
> };
>
> static int text_copy_cb(void *data)
> {
> int ret = 0;
> struct insn_copy *copy = data;
>
> if (smp_processor_id() == copy->cpu) {
> ret = copy_to_kernel_nofault(copy->dst, copy->src, copy->len);
> if (ret)
> pr_err("%s: operation failed\n", __func__);
>
> }
>
> flush_icache_range((unsigned long)copy->dst, (unsigned
> long)copy->dst + copy->len);
>
> return ret;
> }
>
> int larch_insn_text_copy(void *dst, void *src, size_t len)
> {
> int ret = 0;
> size_t start, end;
> struct insn_copy copy = {
> .dst = dst,
> .src = src,
> .len = len,
> .cpu = smp_processor_id(),
> };
>
> start = round_down((size_t)dst, PAGE_SIZE);
> end = round_up((size_t)dst + len, PAGE_SIZE);
>
> set_memory_rw(start, (end - start) / PAGE_SIZE);
> ret = stop_machine_cpuslocked(text_copy_cb, &copy, cpu_online_mask);
> set_memory_rox(start, (end - start) / PAGE_SIZE);
>
> return ret;
> }
>
I changed stop_machine_cpuslocked() to stop_machine() and tested again;
the test passed.

int larch_insn_text_copy(void *dst, void *src, size_t len)
{
	int ret = 0;
	size_t start, end;
	struct insn_copy copy = {
		.dst = dst,
		.src = src,
		.len = len,
		.cpu = smp_processor_id(),
	};

	start = round_down((size_t)dst, PAGE_SIZE);
	end = round_up((size_t)dst + len, PAGE_SIZE);

	set_memory_rw(start, (end - start) / PAGE_SIZE);
	ret = stop_machine(text_copy_cb, &copy, cpu_online_mask);
	set_memory_rox(start, (end - start) / PAGE_SIZE);

	return ret;
}
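
Some context on the two variants, as far as I understand the generic code: stop_machine() takes the CPU hotplug read lock itself and then calls stop_machine_cpuslocked(), while the _cpuslocked variant expects the caller to hold that lock already. The sketch below is only a userspace model of that relationship. All mock_* names are hypothetical, the cpumask argument is left out, and the depth counter merely imitates the lock-held check; it is not how the kernel implements it.

```c
#include <assert.h>

typedef int (*mock_stop_fn_t)(void *data);

/* Models how many times the CPU hotplug read lock is currently held. */
static int mock_hotplug_read_depth;

static void mock_cpus_read_lock(void)   { mock_hotplug_read_depth++; }
static void mock_cpus_read_unlock(void) { mock_hotplug_read_depth--; }

/* Models stop_machine_cpuslocked(): the caller must already hold the lock. */
static int mock_stop_machine_cpuslocked(mock_stop_fn_t fn, void *data)
{
	/* Imitates a "hotplug lock must be held" assertion. */
	assert(mock_hotplug_read_depth > 0);
	return fn(data);
}

/* Models stop_machine(): takes and releases the lock around the call. */
static int mock_stop_machine(mock_stop_fn_t fn, void *data)
{
	int ret;

	mock_cpus_read_lock();
	ret = mock_stop_machine_cpuslocked(fn, data);
	mock_cpus_read_unlock();

	return ret;
}

/* Trivial callback standing in for text_copy_cb(); counts invocations. */
static int mock_text_copy_cb(void *data)
{
	int *calls = data;

	(*calls)++;
	return 0;
}
```

If none of the larch_insn_text_copy() callers hold the hotplug lock, which I have not audited, the plain stop_machine() form would be the one that matches this model.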
> >
> > >
> > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > Have you tried to run the same fentry_attach_stress multiple times or
> > > > > > > > > in a loop like while true; do ./test_progs -a fentry_attach_stress;
> > > > > > > > > sleep 1; done
> > > > > > > > > the lockup happens intermittently, sometime it PASSED, sometime kernel
> > > > > > > > > locks up. I merged the tools/testing/selftests/bpf/config with my
> > > > > > > > > original config by
> > > > > > > > > ./scripts/kconfig/merge_config.sh -y .config
> > > > > > > > > tools/testing/selftests/bpf/config. my config seems including
> > > > > > > > > everything you listed above except CONFIG_ARCH_STRICT_ALIGN not set,
> > > > > > > > > here is my config https://www.bpfire.net/download/loongfire/config.txt
> > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Tiezhu
> > > > > > > > > >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support
2025-08-04 2:02 ` Hengqi Chen
@ 2025-08-05 4:10 ` Huacai Chen
2025-08-05 6:30 ` Chenghao Duan
0 siblings, 1 reply; 35+ messages in thread
From: Huacai Chen @ 2025-08-05 4:10 UTC (permalink / raw)
To: Hengqi Chen
Cc: Chenghao Duan, ast, daniel, andrii, yangtiezhu, martin.lau,
eddyz87, song, yonghong.song, john.fastabend, kpsingh, sdf,
haoluo, jolsa, kernel, linux-kernel, loongarch, bpf, guodongtai,
youling.tang, jianghaoran, vincent.mc.li, geliang
On Mon, Aug 4, 2025 at 10:02 AM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > This commit adds support for BPF dynamic code modification on the
> > LoongArch architecture.:
> > 1. Implement bpf_arch_text_poke() for runtime instruction patching.
> > 2. Add bpf_arch_text_copy() for instruction block copying.
> > 3. Create bpf_arch_text_invalidate() for code invalidation.
> >
> > On LoongArch, since symbol addresses in the direct mapping
> > region cannot be reached via relative jump instructions from the paged
> > mapping region, we use the move_imm+jirl instruction pair as absolute
> > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > instructions in the program as placeholders for function jumps.
> >
> > larch_insn_text_copy is solely used for BPF. The use of
> > larch_insn_text_copy() requires page_size alignment. Currently, only
> > the size of the trampoline is page-aligned.
> >
> > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > ---
> > arch/loongarch/include/asm/inst.h | 1 +
> > arch/loongarch/kernel/inst.c | 27 ++++++++
> > arch/loongarch/net/bpf_jit.c | 104 ++++++++++++++++++++++++++++++
> > 3 files changed, 132 insertions(+)
> >
> > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > index 2ae96a35d..88bb73e46 100644
> > --- a/arch/loongarch/include/asm/inst.h
> > +++ b/arch/loongarch/include/asm/inst.h
> > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > int larch_insn_read(void *addr, u32 *insnp);
> > int larch_insn_write(void *addr, u32 insn);
> > int larch_insn_patch_text(void *addr, u32 insn);
> > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> >
> > u32 larch_insn_gen_nop(void);
> > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > index 674e3b322..7df63a950 100644
> > --- a/arch/loongarch/kernel/inst.c
> > +++ b/arch/loongarch/kernel/inst.c
> > @@ -4,6 +4,7 @@
> > */
> > #include <linux/sizes.h>
> > #include <linux/uaccess.h>
> > +#include <linux/set_memory.h>
> >
> > #include <asm/cacheflush.h>
> > #include <asm/inst.h>
> > @@ -218,6 +219,32 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > return ret;
> > }
> >
> > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > +{
> > + int ret;
> > + unsigned long flags;
> > + unsigned long dst_start, dst_end, dst_len;
> > +
> > + dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> > + dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> > + dst_len = dst_end - dst_start;
> > +
> > + set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> > + raw_spin_lock_irqsave(&patch_lock, flags);
> > +
> > + ret = copy_to_kernel_nofault(dst, src, len);
> > + if (ret)
> > + pr_err("%s: operation failed\n", __func__);
> > +
> > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > + set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> > +
> > + if (!ret)
> > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > +
> > + return ret;
> > +}
> > +
> > u32 larch_insn_gen_nop(void)
> > {
> > return INSN_NOP;
> > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > index 7032f11d3..5e6ae7e0e 100644
> > --- a/arch/loongarch/net/bpf_jit.c
> > +++ b/arch/loongarch/net/bpf_jit.c
> > @@ -4,8 +4,12 @@
> > *
> > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > */
> > +#include <linux/memory.h>
> > #include "bpf_jit.h"
> >
> > +#define LOONGARCH_LONG_JUMP_NINSNS 5
> > +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> > +
> > #define REG_TCC LOONGARCH_GPR_A6
> > #define TCC_SAVED LOONGARCH_GPR_S5
> >
> > @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> > */
> > static void build_prologue(struct jit_ctx *ctx)
> > {
> > + int i;
> > int stack_adjust = 0, store_offset, bpf_stack_adjust;
> >
> > bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> > @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> > stack_adjust = round_up(stack_adjust, 16);
> > stack_adjust += bpf_stack_adjust;
> >
> > + /* Reserve space for the move_imm + jirl instruction */
> > + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > + emit_insn(ctx, nop);
> > +
> > /*
> > * First instruction initializes the tail call count (TCC).
> > * On tail call we skip this instruction, and the TCC is
> > @@ -1367,3 +1376,98 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > {
> > return true;
> > }
> > +
> > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> > +{
> > + if (!target) {
> > + pr_err("bpf_jit: jump target address is error\n");
> > + return -EFAULT;
> > + }
> > +
> > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > +
> > + return 0;
> > +}
> > +
> > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > +{
> > + struct jit_ctx ctx;
> > +
> > + ctx.idx = 0;
> > + ctx.image = (union loongarch_instruction *)insns;
> > +
> > + if (!target) {
> > + emit_insn((&ctx), nop);
> > + emit_insn((&ctx), nop);
>
> There should be 5 nops, no ?
Chenghao,
We have already fixed the concurrency problem, so this is now the only
remaining issue. Please reply as soon as possible.
Huacai
>
> > + return 0;
> > + }
> > +
> > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > + (unsigned long)target);
> > +}
> > +
> > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > + void *old_addr, void *new_addr)
> > +{
> > + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > + bool is_call = poke_type == BPF_MOD_CALL;
> > + int ret;
> > +
> > + if (!is_kernel_text((unsigned long)ip) &&
> > + !is_bpf_text_address((unsigned long)ip))
> > + return -ENOTSUPP;
> > +
> > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > + if (ret)
> > + return ret;
> > +
> > + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > + return -EFAULT;
> > +
> > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > + if (ret)
> > + return ret;
> > +
> > + mutex_lock(&text_mutex);
> > + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> > + mutex_unlock(&text_mutex);
> > + return ret;
> > +}
> > +
> > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > +{
> > + int i;
> > + int ret = 0;
> > + u32 *inst;
> > +
> > + inst = kvmalloc(len, GFP_KERNEL);
> > + if (!inst)
> > + return -ENOMEM;
> > +
> > + for (i = 0; i < (len/sizeof(u32)); i++)
> > + inst[i] = INSN_BREAK;
> > +
> > + mutex_lock(&text_mutex);
> > + if (larch_insn_text_copy(dst, inst, len))
> > + ret = -EINVAL;
> > + mutex_unlock(&text_mutex);
> > +
> > + kvfree(inst);
> > + return ret;
> > +}
> > +
> > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > +{
> > + int ret;
> > +
> > + mutex_lock(&text_mutex);
> > + ret = larch_insn_text_copy(dst, src, len);
> > + mutex_unlock(&text_mutex);
> > + if (ret)
> > + return ERR_PTR(-EINVAL);
> > +
> > + return dst;
> > +}
> > --
>
> bpf_arch_text_invalidate() and bpf_arch_text_copy() is not related to
> BPF trampoline, right ?
>
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support
2025-08-05 4:10 ` Huacai Chen
@ 2025-08-05 6:30 ` Chenghao Duan
2025-08-05 11:13 ` Huacai Chen
0 siblings, 1 reply; 35+ messages in thread
From: Chenghao Duan @ 2025-08-05 6:30 UTC (permalink / raw)
To: Huacai Chen
Cc: Hengqi Chen, ast, daniel, andrii, yangtiezhu, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, geliang
On Tue, Aug 05, 2025 at 12:10:05PM +0800, Huacai Chen wrote:
> On Mon, Aug 4, 2025 at 10:02 AM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> >
> > On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > >
> > > This commit adds support for BPF dynamic code modification on the
> > > LoongArch architecture.:
> > > 1. Implement bpf_arch_text_poke() for runtime instruction patching.
> > > 2. Add bpf_arch_text_copy() for instruction block copying.
> > > 3. Create bpf_arch_text_invalidate() for code invalidation.
> > >
> > > On LoongArch, since symbol addresses in the direct mapping
> > > region cannot be reached via relative jump instructions from the paged
> > > mapping region, we use the move_imm+jirl instruction pair as absolute
> > > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > > instructions in the program as placeholders for function jumps.
> > >
> > > larch_insn_text_copy is solely used for BPF. The use of
> > > larch_insn_text_copy() requires page_size alignment. Currently, only
> > > the size of the trampoline is page-aligned.
> > >
> > > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > > ---
> > > arch/loongarch/include/asm/inst.h | 1 +
> > > arch/loongarch/kernel/inst.c | 27 ++++++++
> > > arch/loongarch/net/bpf_jit.c | 104 ++++++++++++++++++++++++++++++
> > > 3 files changed, 132 insertions(+)
> > >
> > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > > index 2ae96a35d..88bb73e46 100644
> > > --- a/arch/loongarch/include/asm/inst.h
> > > +++ b/arch/loongarch/include/asm/inst.h
> > > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > > int larch_insn_read(void *addr, u32 *insnp);
> > > int larch_insn_write(void *addr, u32 insn);
> > > int larch_insn_patch_text(void *addr, u32 insn);
> > > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> > >
> > > u32 larch_insn_gen_nop(void);
> > > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > index 674e3b322..7df63a950 100644
> > > --- a/arch/loongarch/kernel/inst.c
> > > +++ b/arch/loongarch/kernel/inst.c
> > > @@ -4,6 +4,7 @@
> > > */
> > > #include <linux/sizes.h>
> > > #include <linux/uaccess.h>
> > > +#include <linux/set_memory.h>
> > >
> > > #include <asm/cacheflush.h>
> > > #include <asm/inst.h>
> > > @@ -218,6 +219,32 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > return ret;
> > > }
> > >
> > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > +{
> > > + int ret;
> > > + unsigned long flags;
> > > + unsigned long dst_start, dst_end, dst_len;
> > > +
> > > + dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> > > + dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> > > + dst_len = dst_end - dst_start;
> > > +
> > > + set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> > > + raw_spin_lock_irqsave(&patch_lock, flags);
> > > +
> > > + ret = copy_to_kernel_nofault(dst, src, len);
> > > + if (ret)
> > > + pr_err("%s: operation failed\n", __func__);
> > > +
> > > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > + set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> > > +
> > > + if (!ret)
> > > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > > +
> > > + return ret;
> > > +}
> > > +
> > > u32 larch_insn_gen_nop(void)
> > > {
> > > return INSN_NOP;
> > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > index 7032f11d3..5e6ae7e0e 100644
> > > --- a/arch/loongarch/net/bpf_jit.c
> > > +++ b/arch/loongarch/net/bpf_jit.c
> > > @@ -4,8 +4,12 @@
> > > *
> > > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > > */
> > > +#include <linux/memory.h>
> > > #include "bpf_jit.h"
> > >
> > > +#define LOONGARCH_LONG_JUMP_NINSNS 5
> > > +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> > > +
> > > #define REG_TCC LOONGARCH_GPR_A6
> > > #define TCC_SAVED LOONGARCH_GPR_S5
> > >
> > > @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> > > */
> > > static void build_prologue(struct jit_ctx *ctx)
> > > {
> > > + int i;
> > > int stack_adjust = 0, store_offset, bpf_stack_adjust;
> > >
> > > bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> > > @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> > > stack_adjust = round_up(stack_adjust, 16);
> > > stack_adjust += bpf_stack_adjust;
> > >
> > > + /* Reserve space for the move_imm + jirl instruction */
> > > + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > > + emit_insn(ctx, nop);
> > > +
> > > /*
> > > * First instruction initializes the tail call count (TCC).
> > > * On tail call we skip this instruction, and the TCC is
> > > @@ -1367,3 +1376,98 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > > {
> > > return true;
> > > }
> > > +
> > > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> > > +{
> > > + if (!target) {
> > > + pr_err("bpf_jit: jump target address is error\n");
> > > + return -EFAULT;
> > > + }
> > > +
> > > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > > +{
> > > + struct jit_ctx ctx;
> > > +
> > > + ctx.idx = 0;
> > > + ctx.image = (union loongarch_instruction *)insns;
> > > +
> > > + if (!target) {
> > > + emit_insn((&ctx), nop);
> > > + emit_insn((&ctx), nop);
> >
> > There should be 5 nops, no ?
> Chenghao,
>
> We have already fixed the concurrency problem, so this is now the only
> remaining issue. Please reply as soon as possible.
>
> Huacai
Hi Hengqi & Huacai,
Sorry, I have only just seen this email.
Yes, this path can emit 5 NOP instructions instead of 2. I have tested
that change and the following tests pass:
sudo ./test_progs -a fentry_test/fentry
sudo ./test_progs -a fexit_test/fexit
sudo ./test_progs -a fentry_fexit
sudo ./test_progs -a modify_return
sudo ./test_progs -a fexit_sleep
sudo ./test_progs -a test_overhead
sudo ./test_progs -a trampoline_count
sudo ./test_progs -a fexit_bpf2bpf
	if (!target) {
		int i;

		for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
			emit_insn((&ctx), nop);
		return 0;
	}
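
How the five-instruction NOP sequence interacts with the old/new comparison in bpf_arch_text_poke() can be modelled in user space roughly as below. This is a sketch under assumptions, not the patch itself: fill_nops() and poke_slot() are hypothetical names, the NOP encoding 0x03400000 is assumed to match the kernel's INSN_NOP, and a plain memcpy() stands in for larch_insn_text_copy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LONG_JUMP_NINSNS 5
#define LONG_JUMP_NBYTES (LONG_JUMP_NINSNS * sizeof(uint32_t))
/* Assumed to match INSN_NOP in the kernel (andi $zero, $zero, 0). */
#define INSN_NOP 0x03400000u

/* Fill the whole reserved slot with NOPs, as the 5-NOP variant does. */
static void fill_nops(uint32_t *insns)
{
	size_t i;

	for (i = 0; i < LONG_JUMP_NINSNS; i++)
		insns[i] = INSN_NOP;
}

/*
 * Model of the poke logic: refuse to patch unless the slot currently
 * holds the expected old sequence, then write the new sequence only
 * if it differs from what is already there.
 * Returns 0 on success, -1 if the slot does not match old_insns.
 */
static int poke_slot(uint32_t *slot, const uint32_t *old_insns,
		     const uint32_t *new_insns)
{
	if (memcmp(slot, old_insns, LONG_JUMP_NBYTES))
		return -1;

	if (memcmp(slot, new_insns, LONG_JUMP_NBYTES))
		memcpy(slot, new_insns, LONG_JUMP_NBYTES);

	return 0;
}
```

In the posted patch the caller also pre-fills old_insns and new_insns with INSN_NOP, so the memcmp() always covers all five words. Emitting all five NOPs in the helper means it no longer depends on that pre-initialization.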
Chenghao
>
> >
> > > + return 0;
> > > + }
> > > +
> > > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > > + (unsigned long)target);
> > > +}
> > > +
> > > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > > + void *old_addr, void *new_addr)
> > > +{
> > > + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > + bool is_call = poke_type == BPF_MOD_CALL;
> > > + int ret;
> > > +
> > > + if (!is_kernel_text((unsigned long)ip) &&
> > > + !is_bpf_text_address((unsigned long)ip))
> > > + return -ENOTSUPP;
> > > +
> > > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > + return -EFAULT;
> > > +
> > > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + mutex_lock(&text_mutex);
> > > + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> > > + mutex_unlock(&text_mutex);
> > > + return ret;
> > > +}
> > > +
> > > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > > +{
> > > + int i;
> > > + int ret = 0;
> > > + u32 *inst;
> > > +
> > > + inst = kvmalloc(len, GFP_KERNEL);
> > > + if (!inst)
> > > + return -ENOMEM;
> > > +
> > > + for (i = 0; i < (len/sizeof(u32)); i++)
> > > + inst[i] = INSN_BREAK;
> > > +
> > > + mutex_lock(&text_mutex);
> > > + if (larch_insn_text_copy(dst, inst, len))
> > > + ret = -EINVAL;
> > > + mutex_unlock(&text_mutex);
> > > +
> > > + kvfree(inst);
> > > + return ret;
> > > +}
> > > +
> > > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > > +{
> > > + int ret;
> > > +
> > > + mutex_lock(&text_mutex);
> > > + ret = larch_insn_text_copy(dst, src, len);
> > > + mutex_unlock(&text_mutex);
> > > + if (ret)
> > > + return ERR_PTR(-EINVAL);
> > > +
> > > + return dst;
> > > +}
> > > --
> >
> > bpf_arch_text_invalidate() and bpf_arch_text_copy() is not related to
> > BPF trampoline, right ?
Judging from their callers in the BPF core, bpf_arch_text_invalidate()
and bpf_arch_text_copy() are not used only for trampolines.
> >
> > > 2.25.1
> > >
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support
2025-08-05 6:30 ` Chenghao Duan
@ 2025-08-05 11:13 ` Huacai Chen
2025-08-05 13:42 ` Vincent Li
2025-08-07 10:26 ` Chenghao Duan
0 siblings, 2 replies; 35+ messages in thread
From: Huacai Chen @ 2025-08-05 11:13 UTC (permalink / raw)
To: Chenghao Duan
Cc: Hengqi Chen, ast, daniel, andrii, yangtiezhu, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, geliang
On Tue, Aug 5, 2025 at 2:30 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
>
> On Tue, Aug 05, 2025 at 12:10:05PM +0800, Huacai Chen wrote:
> > On Mon, Aug 4, 2025 at 10:02 AM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > >
> > > On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > >
> > > > This commit adds support for BPF dynamic code modification on the
> > > > LoongArch architecture.:
> > > > 1. Implement bpf_arch_text_poke() for runtime instruction patching.
> > > > 2. Add bpf_arch_text_copy() for instruction block copying.
> > > > 3. Create bpf_arch_text_invalidate() for code invalidation.
> > > >
> > > > On LoongArch, since symbol addresses in the direct mapping
> > > > region cannot be reached via relative jump instructions from the paged
> > > > mapping region, we use the move_imm+jirl instruction pair as absolute
> > > > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > > > instructions in the program as placeholders for function jumps.
> > > >
> > > > larch_insn_text_copy is solely used for BPF. The use of
> > > > larch_insn_text_copy() requires page_size alignment. Currently, only
> > > > the size of the trampoline is page-aligned.
> > > >
> > > > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > > > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > > > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > > > ---
> > > > arch/loongarch/include/asm/inst.h | 1 +
> > > > arch/loongarch/kernel/inst.c | 27 ++++++++
> > > > arch/loongarch/net/bpf_jit.c | 104 ++++++++++++++++++++++++++++++
> > > > 3 files changed, 132 insertions(+)
> > > >
> > > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > > > index 2ae96a35d..88bb73e46 100644
> > > > --- a/arch/loongarch/include/asm/inst.h
> > > > +++ b/arch/loongarch/include/asm/inst.h
> > > > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > > > int larch_insn_read(void *addr, u32 *insnp);
> > > > int larch_insn_write(void *addr, u32 insn);
> > > > int larch_insn_patch_text(void *addr, u32 insn);
> > > > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> > > >
> > > > u32 larch_insn_gen_nop(void);
> > > > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > > index 674e3b322..7df63a950 100644
> > > > --- a/arch/loongarch/kernel/inst.c
> > > > +++ b/arch/loongarch/kernel/inst.c
> > > > @@ -4,6 +4,7 @@
> > > > */
> > > > #include <linux/sizes.h>
> > > > #include <linux/uaccess.h>
> > > > +#include <linux/set_memory.h>
> > > >
> > > > #include <asm/cacheflush.h>
> > > > #include <asm/inst.h>
> > > > @@ -218,6 +219,32 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > > return ret;
> > > > }
> > > >
> > > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > > +{
> > > > + int ret;
> > > > + unsigned long flags;
> > > > + unsigned long dst_start, dst_end, dst_len;
> > > > +
> > > > + dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> > > > + dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> > > > + dst_len = dst_end - dst_start;
> > > > +
> > > > + set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> > > > + raw_spin_lock_irqsave(&patch_lock, flags);
> > > > +
> > > > + ret = copy_to_kernel_nofault(dst, src, len);
> > > > + if (ret)
> > > > + pr_err("%s: operation failed\n", __func__);
> > > > +
> > > > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > > + set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> > > > +
> > > > + if (!ret)
> > > > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > > > +
> > > > + return ret;
> > > > +}
> > > > +
> > > > u32 larch_insn_gen_nop(void)
> > > > {
> > > > return INSN_NOP;
> > > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > > index 7032f11d3..5e6ae7e0e 100644
> > > > --- a/arch/loongarch/net/bpf_jit.c
> > > > +++ b/arch/loongarch/net/bpf_jit.c
> > > > @@ -4,8 +4,12 @@
> > > > *
> > > > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > > > */
> > > > +#include <linux/memory.h>
> > > > #include "bpf_jit.h"
> > > >
> > > > +#define LOONGARCH_LONG_JUMP_NINSNS 5
> > > > +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> > > > +
> > > > #define REG_TCC LOONGARCH_GPR_A6
> > > > #define TCC_SAVED LOONGARCH_GPR_S5
> > > >
> > > > @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> > > > */
> > > > static void build_prologue(struct jit_ctx *ctx)
> > > > {
> > > > + int i;
> > > > int stack_adjust = 0, store_offset, bpf_stack_adjust;
> > > >
> > > > bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> > > > @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> > > > stack_adjust = round_up(stack_adjust, 16);
> > > > stack_adjust += bpf_stack_adjust;
> > > >
> > > > + /* Reserve space for the move_imm + jirl instruction */
> > > > + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > > > + emit_insn(ctx, nop);
> > > > +
> > > > /*
> > > > * First instruction initializes the tail call count (TCC).
> > > > * On tail call we skip this instruction, and the TCC is
> > > > @@ -1367,3 +1376,98 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > > > {
> > > > return true;
> > > > }
> > > > +
> > > > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> > > > +{
> > > > + if (!target) {
> > > > + pr_err("bpf_jit: jump target address is error\n");
> > > > + return -EFAULT;
> > > > + }
> > > > +
> > > > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > > > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > > > +{
> > > > + struct jit_ctx ctx;
> > > > +
> > > > + ctx.idx = 0;
> > > > + ctx.image = (union loongarch_instruction *)insns;
> > > > +
> > > > + if (!target) {
> > > > + emit_insn((&ctx), nop);
> > > > + emit_insn((&ctx), nop);
> > >
> > > There should be 5 nops, no ?
> > Chenghao,
> >
> > We have already fixed the concurrency problem, so this is now the only
> > remaining issue. Please reply as soon as possible.
> >
> > Huacai
>
> Hi Hengqi & Huacai,
>
> I'm sorry I just saw the email.
> This position can be configured with 5 NOP instructions, and I have
> tested it successfully.
OK, now loongarch-next [1] has integrated all needed changes, you and
Vincent can test to see if everything is OK.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
Huacai
>
> sudo ./test_progs -a fentry_test/fentry
> sudo ./test_progs -a fexit_test/fexit
> sudo ./test_progs -a fentry_fexit
> sudo ./test_progs -a modify_return
> sudo ./test_progs -a fexit_sleep
> sudo ./test_progs -a test_overhead
> sudo ./test_progs -a trampoline_count
> sudo ./test_progs -a fexit_bpf2bpf
>
> if (!target) {
> int i;
> for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> emit_insn((&ctx), nop);
> return 0;
> }
>
>
> Chenghao
>
> >
> > >
> > > > + return 0;
> > > > + }
> > > > +
> > > > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > > > + (unsigned long)target);
> > > > +}
> > > > +
> > > > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > > > + void *old_addr, void *new_addr)
> > > > +{
> > > > + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > > + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > > + bool is_call = poke_type == BPF_MOD_CALL;
> > > > + int ret;
> > > > +
> > > > + if (!is_kernel_text((unsigned long)ip) &&
> > > > + !is_bpf_text_address((unsigned long)ip))
> > > > + return -ENOTSUPP;
> > > > +
> > > > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > > > + if (ret)
> > > > + return ret;
> > > > +
> > > > + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > > + return -EFAULT;
> > > > +
> > > > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > > > + if (ret)
> > > > + return ret;
> > > > +
> > > > + mutex_lock(&text_mutex);
> > > > + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > > + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> > > > + mutex_unlock(&text_mutex);
> > > > + return ret;
> > > > +}
> > > > +
> > > > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > > > +{
> > > > + int i;
> > > > + int ret = 0;
> > > > + u32 *inst;
> > > > +
> > > > + inst = kvmalloc(len, GFP_KERNEL);
> > > > + if (!inst)
> > > > + return -ENOMEM;
> > > > +
> > > > + for (i = 0; i < (len/sizeof(u32)); i++)
> > > > + inst[i] = INSN_BREAK;
> > > > +
> > > > + mutex_lock(&text_mutex);
> > > > + if (larch_insn_text_copy(dst, inst, len))
> > > > + ret = -EINVAL;
> > > > + mutex_unlock(&text_mutex);
> > > > +
> > > > + kvfree(inst);
> > > > + return ret;
> > > > +}
> > > > +
> > > > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > > > +{
> > > > + int ret;
> > > > +
> > > > + mutex_lock(&text_mutex);
> > > > + ret = larch_insn_text_copy(dst, src, len);
> > > > + mutex_unlock(&text_mutex);
> > > > + if (ret)
> > > > + return ERR_PTR(-EINVAL);
> > > > +
> > > > + return dst;
> > > > +}
> > > > --
> > >
> > > bpf_arch_text_invalidate() and bpf_arch_text_copy() is not related to
> > > BPF trampoline, right ?
>
> From the perspective of BPF core source code calls, the two functions
> bpf_arch_text_invalidate() and bpf_arch_text_copy() are not only used for
> trampolines.
>
> > >
> > > > 2.25.1
> > > >
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support
2025-08-05 11:13 ` Huacai Chen
@ 2025-08-05 13:42 ` Vincent Li
2025-08-07 10:26 ` Chenghao Duan
1 sibling, 0 replies; 35+ messages in thread
From: Vincent Li @ 2025-08-05 13:42 UTC (permalink / raw)
To: Huacai Chen
Cc: Chenghao Duan, Hengqi Chen, ast, daniel, andrii, yangtiezhu,
martin.lau, eddyz87, song, yonghong.song, john.fastabend, kpsingh,
sdf, haoluo, jolsa, kernel, linux-kernel, loongarch, bpf,
guodongtai, youling.tang, jianghaoran, geliang
On Tue, Aug 5, 2025 at 4:13 AM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Tue, Aug 5, 2025 at 2:30 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > On Tue, Aug 05, 2025 at 12:10:05PM +0800, Huacai Chen wrote:
> > > On Mon, Aug 4, 2025 at 10:02 AM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > >
> > > > On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > >
> > > > > This commit adds support for BPF dynamic code modification on the
> > > > > LoongArch architecture.:
> > > > > 1. Implement bpf_arch_text_poke() for runtime instruction patching.
> > > > > 2. Add bpf_arch_text_copy() for instruction block copying.
> > > > > 3. Create bpf_arch_text_invalidate() for code invalidation.
> > > > >
> > > > > On LoongArch, since symbol addresses in the direct mapping
> > > > > region cannot be reached via relative jump instructions from the paged
> > > > > mapping region, we use the move_imm+jirl instruction pair as absolute
> > > > > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > > > > instructions in the program as placeholders for function jumps.
> > > > >
> > > > > larch_insn_text_copy is solely used for BPF. The use of
> > > > > larch_insn_text_copy() requires page_size alignment. Currently, only
> > > > > the size of the trampoline is page-aligned.
> > > > >
> > > > > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > > > > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > > > > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > > > > ---
> > > > > arch/loongarch/include/asm/inst.h | 1 +
> > > > > arch/loongarch/kernel/inst.c | 27 ++++++++
> > > > > arch/loongarch/net/bpf_jit.c | 104 ++++++++++++++++++++++++++++++
> > > > > 3 files changed, 132 insertions(+)
> > > > >
> > > > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > > > > index 2ae96a35d..88bb73e46 100644
> > > > > --- a/arch/loongarch/include/asm/inst.h
> > > > > +++ b/arch/loongarch/include/asm/inst.h
> > > > > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > > > > int larch_insn_read(void *addr, u32 *insnp);
> > > > > int larch_insn_write(void *addr, u32 insn);
> > > > > int larch_insn_patch_text(void *addr, u32 insn);
> > > > > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> > > > >
> > > > > u32 larch_insn_gen_nop(void);
> > > > > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > > > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > > > index 674e3b322..7df63a950 100644
> > > > > --- a/arch/loongarch/kernel/inst.c
> > > > > +++ b/arch/loongarch/kernel/inst.c
> > > > > @@ -4,6 +4,7 @@
> > > > > */
> > > > > #include <linux/sizes.h>
> > > > > #include <linux/uaccess.h>
> > > > > +#include <linux/set_memory.h>
> > > > >
> > > > > #include <asm/cacheflush.h>
> > > > > #include <asm/inst.h>
> > > > > @@ -218,6 +219,32 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > > > return ret;
> > > > > }
> > > > >
> > > > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > > > +{
> > > > > + int ret;
> > > > > + unsigned long flags;
> > > > > + unsigned long dst_start, dst_end, dst_len;
> > > > > +
> > > > > + dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> > > > > + dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> > > > > + dst_len = dst_end - dst_start;
> > > > > +
> > > > > + set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> > > > > + raw_spin_lock_irqsave(&patch_lock, flags);
> > > > > +
> > > > > + ret = copy_to_kernel_nofault(dst, src, len);
> > > > > + if (ret)
> > > > > + pr_err("%s: operation failed\n", __func__);
> > > > > +
> > > > > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > > > + set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> > > > > +
> > > > > + if (!ret)
> > > > > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > > > > +
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > > u32 larch_insn_gen_nop(void)
> > > > > {
> > > > > return INSN_NOP;
> > > > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > > > index 7032f11d3..5e6ae7e0e 100644
> > > > > --- a/arch/loongarch/net/bpf_jit.c
> > > > > +++ b/arch/loongarch/net/bpf_jit.c
> > > > > @@ -4,8 +4,12 @@
> > > > > *
> > > > > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > > > > */
> > > > > +#include <linux/memory.h>
> > > > > #include "bpf_jit.h"
> > > > >
> > > > > +#define LOONGARCH_LONG_JUMP_NINSNS 5
> > > > > +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> > > > > +
> > > > > #define REG_TCC LOONGARCH_GPR_A6
> > > > > #define TCC_SAVED LOONGARCH_GPR_S5
> > > > >
> > > > > @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> > > > > */
> > > > > static void build_prologue(struct jit_ctx *ctx)
> > > > > {
> > > > > + int i;
> > > > > int stack_adjust = 0, store_offset, bpf_stack_adjust;
> > > > >
> > > > > bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> > > > > @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> > > > > stack_adjust = round_up(stack_adjust, 16);
> > > > > stack_adjust += bpf_stack_adjust;
> > > > >
> > > > > + /* Reserve space for the move_imm + jirl instruction */
> > > > > + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > > > > + emit_insn(ctx, nop);
> > > > > +
> > > > > /*
> > > > > * First instruction initializes the tail call count (TCC).
> > > > > * On tail call we skip this instruction, and the TCC is
> > > > > @@ -1367,3 +1376,98 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > > > > {
> > > > > return true;
> > > > > }
> > > > > +
> > > > > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> > > > > +{
> > > > > + if (!target) {
> > > > > + pr_err("bpf_jit: jump target address is error\n");
> > > > > + return -EFAULT;
> > > > > + }
> > > > > +
> > > > > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > > > > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > > > > +{
> > > > > + struct jit_ctx ctx;
> > > > > +
> > > > > + ctx.idx = 0;
> > > > > + ctx.image = (union loongarch_instruction *)insns;
> > > > > +
> > > > > + if (!target) {
> > > > > + emit_insn((&ctx), nop);
> > > > > + emit_insn((&ctx), nop);
> > > >
> > > > > There should be 5 NOPs here, no?
> > > Chenghao,
> > >
> > > > We have already fixed the concurrency problem, so this is now the only
> > > > remaining issue. Please reply as soon as possible.
> > >
> > > Huacai
> >
> > Hi Hengqi & Huacai,
> >
> > I'm sorry I just saw the email.
> > This position can be configured with 5 NOP instructions, and I have
> > tested it successfully.
> OK, now loongarch-next [1] has integrated all needed changes, you and
> Vincent can test to see if everything is OK.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
>
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
> Huacai
>
> >
> > sudo ./test_progs -a fentry_test/fentry
> > sudo ./test_progs -a fexit_test/fexit
> > sudo ./test_progs -a fentry_fexit
> > sudo ./test_progs -a modify_return
> > sudo ./test_progs -a fexit_sleep
> > sudo ./test_progs -a test_overhead
> > sudo ./test_progs -a trampoline_count
> > sudo ./test_progs -a fexit_bpf2bpf
> >
> > if (!target) {
> > int i;
> > for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > emit_insn((&ctx), nop);
> > return 0;
> > }
> >
> >
> > Chenghao
> >
> > >
> > > >
> > > > > + return 0;
> > > > > + }
> > > > > +
> > > > > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > > > > + (unsigned long)target);
> > > > > +}
> > > > > +
> > > > > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > > > > + void *old_addr, void *new_addr)
> > > > > +{
> > > > > + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > > > + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > > > + bool is_call = poke_type == BPF_MOD_CALL;
> > > > > + int ret;
> > > > > +
> > > > > + if (!is_kernel_text((unsigned long)ip) &&
> > > > > + !is_bpf_text_address((unsigned long)ip))
> > > > > + return -ENOTSUPP;
> > > > > +
> > > > > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > > > + return -EFAULT;
> > > > > +
> > > > > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + mutex_lock(&text_mutex);
> > > > > + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > > > + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> > > > > + mutex_unlock(&text_mutex);
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > > > > +{
> > > > > + int i;
> > > > > + int ret = 0;
> > > > > + u32 *inst;
> > > > > +
> > > > > + inst = kvmalloc(len, GFP_KERNEL);
> > > > > + if (!inst)
> > > > > + return -ENOMEM;
> > > > > +
> > > > > + for (i = 0; i < (len/sizeof(u32)); i++)
> > > > > + inst[i] = INSN_BREAK;
> > > > > +
> > > > > + mutex_lock(&text_mutex);
> > > > > + if (larch_insn_text_copy(dst, inst, len))
> > > > > + ret = -EINVAL;
> > > > > + mutex_unlock(&text_mutex);
> > > > > +
> > > > > + kvfree(inst);
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > > > > +{
> > > > > + int ret;
> > > > > +
> > > > > + mutex_lock(&text_mutex);
> > > > > + ret = larch_insn_text_copy(dst, src, len);
> > > > > + mutex_unlock(&text_mutex);
> > > > > + if (ret)
> > > > > + return ERR_PTR(-EINVAL);
> > > > > +
> > > > > + return dst;
> > > > > +}
> > > > > --
> > > >
> > > > > bpf_arch_text_invalidate() and bpf_arch_text_copy() are not related to
> > > > > the BPF trampoline, right?
> >
> > From the perspective of BPF core source code calls, the two functions
> > bpf_arch_text_invalidate() and bpf_arch_text_copy() are not only used for
> > trampolines.
> >
> > > >
> > > > > 2.25.1
> > > > >
> >
* Re: [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support
2025-08-05 11:13 ` Huacai Chen
2025-08-05 13:42 ` Vincent Li
@ 2025-08-07 10:26 ` Chenghao Duan
1 sibling, 0 replies; 35+ messages in thread
From: Chenghao Duan @ 2025-08-07 10:26 UTC (permalink / raw)
To: Huacai Chen
Cc: Hengqi Chen, ast, daniel, andrii, yangtiezhu, martin.lau, eddyz87,
song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
kernel, linux-kernel, loongarch, bpf, guodongtai, youling.tang,
jianghaoran, vincent.mc.li, geliang
On Tue, Aug 05, 2025 at 07:13:04PM +0800, Huacai Chen wrote:
> On Tue, Aug 5, 2025 at 2:30 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> >
> > On Tue, Aug 05, 2025 at 12:10:05PM +0800, Huacai Chen wrote:
> > > On Mon, Aug 4, 2025 at 10:02 AM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > >
> > > > On Wed, Jul 30, 2025 at 9:13 PM Chenghao Duan <duanchenghao@kylinos.cn> wrote:
> > > > >
> > > > > > This commit adds support for BPF dynamic code modification on the
> > > > > > LoongArch architecture:
> > > > > 1. Implement bpf_arch_text_poke() for runtime instruction patching.
> > > > > 2. Add bpf_arch_text_copy() for instruction block copying.
> > > > > 3. Create bpf_arch_text_invalidate() for code invalidation.
> > > > >
> > > > > On LoongArch, since symbol addresses in the direct mapping
> > > > > region cannot be reached via relative jump instructions from the paged
> > > > > mapping region, we use the move_imm+jirl instruction pair as absolute
> > > > > jump instructions. These require 2-5 instructions, so we reserve 5 NOP
> > > > > instructions in the program as placeholders for function jumps.
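To make the slot arithmetic concrete, here is a minimal userspace sketch, not kernel code. It assumes the worst case quoted above (five 4-byte instructions, so 20 bytes) and assumes the NOP encoding 0x03400000; the helper names fill_nop_slot() and slot_matches() are hypothetical and only mirror the memcmp() check in the quoted bpf_arch_text_poke().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sizes mirrored from the quoted patch. */
#define LONG_JUMP_NINSNS 5
#define LONG_JUMP_NBYTES (LONG_JUMP_NINSNS * 4)

/* Assumed NOP encoding (andi $zero, $zero, 0); check asm/inst.h before relying on it. */
#define INSN_NOP 0x03400000u

/* Hypothetical helper: fill the whole reserved slot with NOPs, not just part of it. */
static void fill_nop_slot(uint32_t insns[LONG_JUMP_NINSNS])
{
	for (int i = 0; i < LONG_JUMP_NINSNS; i++)
		insns[i] = INSN_NOP;
}

/* Hypothetical helper: return 1 if the slot at ip holds exactly the expected words. */
static int slot_matches(const void *ip, const uint32_t expected[LONG_JUMP_NINSNS])
{
	assert(ip != NULL);
	return memcmp(ip, expected, LONG_JUMP_NBYTES) == 0;
}
```

Comparing all 20 bytes, rather than only the instructions a shorter jump sequence would occupy, is what lets the poke path reject a slot that has been changed behind its back.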
> > > > >
> > > > > larch_insn_text_copy is solely used for BPF. The use of
> > > > > larch_insn_text_copy() requires page_size alignment. Currently, only
> > > > > the size of the trampoline is page-aligned.
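The page rounding that larch_insn_text_copy() performs before changing permissions can be illustrated in plain C. This is a userspace sketch under assumptions: struct page_span and page_span_of() are hypothetical names, and the page size is passed in as a parameter instead of using the kernel's PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the page-aligned range covering [dst, dst + len). */
struct page_span {
	uintptr_t start;   /* dst rounded down to a page boundary */
	uintptr_t end;     /* dst + len rounded up to a page boundary */
	size_t npages;     /* number of pages whose permissions would change */
};

/* Hypothetical helper mirroring the round_down()/round_up() steps in the patch. */
static struct page_span page_span_of(uintptr_t dst, size_t len, size_t page_size)
{
	struct page_span s;

	/* The mask arithmetic below is only valid for power-of-two page sizes. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	s.start = dst & ~(uintptr_t)(page_size - 1);
	s.end = (dst + len + page_size - 1) & ~(uintptr_t)(page_size - 1);
	s.npages = (s.end - s.start) / page_size;
	return s;
}
```

For example, page_span_of(0x1ff0, 0x20, 0x1000) covers two pages because the write straddles a page boundary. Since the permission change applies to whole pages, any other code that shares those pages is affected as well, which is presumably why the commit message calls out the alignment requirement.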
> > > > >
> > > > > Co-developed-by: George Guo <guodongtai@kylinos.cn>
> > > > > Signed-off-by: George Guo <guodongtai@kylinos.cn>
> > > > > Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
> > > > > ---
> > > > > arch/loongarch/include/asm/inst.h | 1 +
> > > > > arch/loongarch/kernel/inst.c | 27 ++++++++
> > > > > arch/loongarch/net/bpf_jit.c | 104 ++++++++++++++++++++++++++++++
> > > > > 3 files changed, 132 insertions(+)
> > > > >
> > > > > diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> > > > > index 2ae96a35d..88bb73e46 100644
> > > > > --- a/arch/loongarch/include/asm/inst.h
> > > > > +++ b/arch/loongarch/include/asm/inst.h
> > > > > @@ -497,6 +497,7 @@ void arch_simulate_insn(union loongarch_instruction insn, struct pt_regs *regs);
> > > > > int larch_insn_read(void *addr, u32 *insnp);
> > > > > int larch_insn_write(void *addr, u32 insn);
> > > > > int larch_insn_patch_text(void *addr, u32 insn);
> > > > > +int larch_insn_text_copy(void *dst, void *src, size_t len);
> > > > >
> > > > > u32 larch_insn_gen_nop(void);
> > > > > u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
> > > > > diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
> > > > > index 674e3b322..7df63a950 100644
> > > > > --- a/arch/loongarch/kernel/inst.c
> > > > > +++ b/arch/loongarch/kernel/inst.c
> > > > > @@ -4,6 +4,7 @@
> > > > > */
> > > > > #include <linux/sizes.h>
> > > > > #include <linux/uaccess.h>
> > > > > +#include <linux/set_memory.h>
> > > > >
> > > > > #include <asm/cacheflush.h>
> > > > > #include <asm/inst.h>
> > > > > @@ -218,6 +219,32 @@ int larch_insn_patch_text(void *addr, u32 insn)
> > > > > return ret;
> > > > > }
> > > > >
> > > > > +int larch_insn_text_copy(void *dst, void *src, size_t len)
> > > > > +{
> > > > > + int ret;
> > > > > + unsigned long flags;
> > > > > + unsigned long dst_start, dst_end, dst_len;
> > > > > +
> > > > > + dst_start = round_down((unsigned long)dst, PAGE_SIZE);
> > > > > + dst_end = round_up((unsigned long)dst + len, PAGE_SIZE);
> > > > > + dst_len = dst_end - dst_start;
> > > > > +
> > > > > + set_memory_rw(dst_start, dst_len / PAGE_SIZE);
> > > > > + raw_spin_lock_irqsave(&patch_lock, flags);
> > > > > +
> > > > > + ret = copy_to_kernel_nofault(dst, src, len);
> > > > > + if (ret)
> > > > > + pr_err("%s: operation failed\n", __func__);
> > > > > +
> > > > > + raw_spin_unlock_irqrestore(&patch_lock, flags);
> > > > > + set_memory_rox(dst_start, dst_len / PAGE_SIZE);
> > > > > +
> > > > > + if (!ret)
> > > > > + flush_icache_range((unsigned long)dst, (unsigned long)dst + len);
> > > > > +
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > > u32 larch_insn_gen_nop(void)
> > > > > {
> > > > > return INSN_NOP;
> > > > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> > > > > index 7032f11d3..5e6ae7e0e 100644
> > > > > --- a/arch/loongarch/net/bpf_jit.c
> > > > > +++ b/arch/loongarch/net/bpf_jit.c
> > > > > @@ -4,8 +4,12 @@
> > > > > *
> > > > > * Copyright (C) 2022 Loongson Technology Corporation Limited
> > > > > */
> > > > > +#include <linux/memory.h>
> > > > > #include "bpf_jit.h"
> > > > >
> > > > > +#define LOONGARCH_LONG_JUMP_NINSNS 5
> > > > > +#define LOONGARCH_LONG_JUMP_NBYTES (LOONGARCH_LONG_JUMP_NINSNS * 4)
> > > > > +
> > > > > #define REG_TCC LOONGARCH_GPR_A6
> > > > > #define TCC_SAVED LOONGARCH_GPR_S5
> > > > >
> > > > > @@ -88,6 +92,7 @@ static u8 tail_call_reg(struct jit_ctx *ctx)
> > > > > */
> > > > > static void build_prologue(struct jit_ctx *ctx)
> > > > > {
> > > > > + int i;
> > > > > int stack_adjust = 0, store_offset, bpf_stack_adjust;
> > > > >
> > > > > bpf_stack_adjust = round_up(ctx->prog->aux->stack_depth, 16);
> > > > > @@ -98,6 +103,10 @@ static void build_prologue(struct jit_ctx *ctx)
> > > > > stack_adjust = round_up(stack_adjust, 16);
> > > > > stack_adjust += bpf_stack_adjust;
> > > > >
> > > > > + /* Reserve space for the move_imm + jirl instruction */
> > > > > + for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > > > > + emit_insn(ctx, nop);
> > > > > +
> > > > > /*
> > > > > * First instruction initializes the tail call count (TCC).
> > > > > * On tail call we skip this instruction, and the TCC is
> > > > > @@ -1367,3 +1376,98 @@ bool bpf_jit_supports_subprog_tailcalls(void)
> > > > > {
> > > > > return true;
> > > > > }
> > > > > +
> > > > > +static int emit_jump_and_link(struct jit_ctx *ctx, u8 rd, u64 target)
> > > > > +{
> > > > > + if (!target) {
> > > > > + pr_err("bpf_jit: jump target address is error\n");
> > > > > + return -EFAULT;
> > > > > + }
> > > > > +
> > > > > + move_imm(ctx, LOONGARCH_GPR_T1, target, false);
> > > > > + emit_insn(ctx, jirl, rd, LOONGARCH_GPR_T1, 0);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > +static int gen_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
> > > > > +{
> > > > > + struct jit_ctx ctx;
> > > > > +
> > > > > + ctx.idx = 0;
> > > > > + ctx.image = (union loongarch_instruction *)insns;
> > > > > +
> > > > > + if (!target) {
> > > > > + emit_insn((&ctx), nop);
> > > > > + emit_insn((&ctx), nop);
> > > >
> > > > > There should be 5 NOPs here, no?
> > > Chenghao,
> > >
> > > > We have already fixed the concurrency problem, so this is now the only
> > > > remaining issue. Please reply as soon as possible.
> > >
> > > Huacai
> >
> > Hi Hengqi & Huacai,
> >
> > I'm sorry I just saw the email.
> > This position can be configured with 5 NOP instructions, and I have
> > tested it successfully.
> OK, now loongarch-next [1] has integrated all needed changes, you and
> Vincent can test to see if everything is OK.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next
>
> Huacai
The following tests passed:
./test_progs -a fentry_test/fentry
./test_progs -a fexit_test/fexit
./test_progs -a fentry_fexit
./test_progs -a modify_return
./test_progs -a fexit_sleep
./test_progs -a test_overhead
./test_progs -a trampoline_count
./test_progs -a fexit_bpf2bpf
./test_progs -t struct_ops -d struct_ops_multi_pages
#15/1 bad_struct_ops/invalid_prog_reuse:OK
#15/2 bad_struct_ops/unused_program:OK
#15 bad_struct_ops:OK
#408/1 struct_ops_autocreate/cant_load_full_object:OK
#408/2 struct_ops_autocreate/can_load_partial_object:OK
#408/3 struct_ops_autocreate/autoload_and_shadow_vars:OK
#408/4 struct_ops_autocreate/optional_maps:OK
#408 struct_ops_autocreate:OK
#409/1 struct_ops_kptr_return/kptr_return:OK
#409/2 struct_ops_kptr_return/kptr_return_fail__wrong_type:OK
#409/3 struct_ops_kptr_return/kptr_return_fail__invalid_scalar:OK
#409/4 struct_ops_kptr_return/kptr_return_fail__nonzero_offset:OK
#409/5 struct_ops_kptr_return/kptr_return_fail__local_kptr:OK
#409 struct_ops_kptr_return:OK
#410/1 struct_ops_maybe_null/maybe_null:OK
#410/2 struct_ops_maybe_null/maybe_null_fail:OK
#410 struct_ops_maybe_null:OK
#411/1 struct_ops_module/struct_ops_load:OK
#411/2 struct_ops_module/struct_ops_not_zeroed:OK
#411/3 struct_ops_module/struct_ops_incompatible:OK
#411/4 struct_ops_module/struct_ops_null_out_cb:OK
#411/5 struct_ops_module/struct_ops_forgotten_cb:OK
#411/6 struct_ops_module/test_detach_link:OK
#411/7 struct_ops_module/unsupported_ops:OK
#411 struct_ops_module:OK
#413/1 struct_ops_no_cfi/load_bpf_test_no_cfi:OK
#413 struct_ops_no_cfi:OK
#414/1 struct_ops_private_stack/private_stack:SKIP
#414/2 struct_ops_private_stack/private_stack_fail:SKIP
#414/3 struct_ops_private_stack/private_stack_recur:SKIP
#414 struct_ops_private_stack:SKIP
#415/1 struct_ops_refcounted/refcounted:OK
#415/2 struct_ops_refcounted/refcounted_fail__ref_leak:OK
#415/3 struct_ops_refcounted/refcounted_fail__global_subprog:OK
#415/4 struct_ops_refcounted/refcounted_fail__tail_call:OK
#415 struct_ops_refcounted:OK
Summary: 8/25 PASSED, 3 SKIPPED, 0 FAILED
while true; do ./test_progs -a fentry_attach_stress; sleep 1; done
(Loop 60 times.)
Chenghao
>
> >
> > sudo ./test_progs -a fentry_test/fentry
> > sudo ./test_progs -a fexit_test/fexit
> > sudo ./test_progs -a fentry_fexit
> > sudo ./test_progs -a modify_return
> > sudo ./test_progs -a fexit_sleep
> > sudo ./test_progs -a test_overhead
> > sudo ./test_progs -a trampoline_count
> > sudo ./test_progs -a fexit_bpf2bpf
> >
> > if (!target) {
> > int i;
> > for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
> > emit_insn((&ctx), nop);
> > return 0;
> > }
> >
> >
> > Chenghao
> >
> > >
> > > >
> > > > > + return 0;
> > > > > + }
> > > > > +
> > > > > + return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO,
> > > > > + (unsigned long)target);
> > > > > +}
> > > > > +
> > > > > +int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type poke_type,
> > > > > + void *old_addr, void *new_addr)
> > > > > +{
> > > > > + u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > > > + u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
> > > > > + bool is_call = poke_type == BPF_MOD_CALL;
> > > > > + int ret;
> > > > > +
> > > > > + if (!is_kernel_text((unsigned long)ip) &&
> > > > > + !is_bpf_text_address((unsigned long)ip))
> > > > > + return -ENOTSUPP;
> > > > > +
> > > > > + ret = gen_jump_or_nops(old_addr, ip, old_insns, is_call);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + if (memcmp(ip, old_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > > > + return -EFAULT;
> > > > > +
> > > > > + ret = gen_jump_or_nops(new_addr, ip, new_insns, is_call);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > +
> > > > > + mutex_lock(&text_mutex);
> > > > > + if (memcmp(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES))
> > > > > + ret = larch_insn_text_copy(ip, new_insns, LOONGARCH_LONG_JUMP_NBYTES);
> > > > > + mutex_unlock(&text_mutex);
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > > +int bpf_arch_text_invalidate(void *dst, size_t len)
> > > > > +{
> > > > > + int i;
> > > > > + int ret = 0;
> > > > > + u32 *inst;
> > > > > +
> > > > > + inst = kvmalloc(len, GFP_KERNEL);
> > > > > + if (!inst)
> > > > > + return -ENOMEM;
> > > > > +
> > > > > + for (i = 0; i < (len/sizeof(u32)); i++)
> > > > > + inst[i] = INSN_BREAK;
> > > > > +
> > > > > + mutex_lock(&text_mutex);
> > > > > + if (larch_insn_text_copy(dst, inst, len))
> > > > > + ret = -EINVAL;
> > > > > + mutex_unlock(&text_mutex);
> > > > > +
> > > > > + kvfree(inst);
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > > +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> > > > > +{
> > > > > + int ret;
> > > > > +
> > > > > + mutex_lock(&text_mutex);
> > > > > + ret = larch_insn_text_copy(dst, src, len);
> > > > > + mutex_unlock(&text_mutex);
> > > > > + if (ret)
> > > > > + return ERR_PTR(-EINVAL);
> > > > > +
> > > > > + return dst;
> > > > > +}
> > > > > --
> > > >
> > > > > bpf_arch_text_invalidate() and bpf_arch_text_copy() are not related to
> > > > > the BPF trampoline, right?
> >
> > From the perspective of BPF core source code calls, the two functions
> > bpf_arch_text_invalidate() and bpf_arch_text_copy() are not only used for
> > trampolines.
> >
> > > >
> > > > > 2.25.1
> > > > >
> >
end of thread, other threads:[~2025-08-07 10:27 UTC | newest]
Thread overview: 35+ messages
2025-07-30 13:12 [PATCH v5 0/5] Support trampoline for LoongArch Chenghao Duan
2025-07-30 13:12 ` [PATCH v5 1/5] LoongArch: Add larch_insn_gen_{beq,bne} helpers Chenghao Duan
2025-07-31 1:41 ` Hengqi Chen
2025-07-30 13:12 ` [PATCH v5 2/5] LoongArch: BPF: Update the code to rename validate_code to validate_ctx Chenghao Duan
2025-07-31 1:44 ` Hengqi Chen
2025-07-30 13:12 ` [PATCH v5 3/5] LoongArch: BPF: Implement dynamic code modification support Chenghao Duan
2025-08-04 2:02 ` Hengqi Chen
2025-08-05 4:10 ` Huacai Chen
2025-08-05 6:30 ` Chenghao Duan
2025-08-05 11:13 ` Huacai Chen
2025-08-05 13:42 ` Vincent Li
2025-08-07 10:26 ` Chenghao Duan
2025-08-04 2:24 ` Hengqi Chen
2025-07-30 13:12 ` [PATCH v5 4/5] LoongArch: BPF: Add bpf trampoline support for Loongarch Chenghao Duan
2025-07-31 2:17 ` Chenghao Duan
2025-08-01 8:04 ` Huacai Chen
2025-08-03 14:17 ` Huacai Chen
2025-07-30 13:12 ` [PATCH v5 5/5] LoongArch: BPF: Add struct ops support for trampoline Chenghao Duan
2025-08-01 5:21 ` [PATCH v5 0/5] Support trampoline for LoongArch Vincent Li
2025-08-01 2:00 ` Vincent Li
2025-08-02 9:19 ` Tiezhu Yang
2025-08-02 13:53 ` Vincent Li
2025-08-02 14:47 ` Vincent Li
2025-08-02 15:52 ` Vincent Li
2025-08-03 14:10 ` Huacai Chen
2025-08-03 15:24 ` Vincent Li
2025-08-04 2:12 ` Hengqi Chen
2025-08-04 2:28 ` Huacai Chen
2025-08-04 3:10 ` Vincent Li
2025-08-04 8:24 ` Huacai Chen
2025-08-04 13:28 ` Vincent Li
2025-08-04 14:24 ` Vincent Li
2025-08-04 14:58 ` Vincent Li
2025-08-04 15:36 ` Vincent Li
2025-08-04 15:51 ` Vincent Li