* [PATCH v2 0/4] LoongArch bpf kptr xchg inline support
@ 2026-06-03 10:04 Chenguang Zhao
2026-06-03 10:04 ` [PATCH v2 1/4] LoongArch: bpf: Use amswap_db for BPF atomic xchg Chenguang Zhao
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Chenguang Zhao @ 2026-06-03 10:04 UTC
To: Huacai Chen, WANG Xuerui, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Tiezhu Yang
Cc: Chenguang Zhao, Hengqi Chen, Song Liu, Yonghong Song, Jiri Olsa,
Shuah Khan, loongarch, bpf, linux-kselftest
This series replaces the single patch "LoongArch: bpf: Support kptr xchg
inline" with four logically separate changes as requested during review.
Patch 1 fixes BPF JIT atomic xchg to use amswap_db.{w,d}, which provide
the full barrier semantics required by LKMM for value-returning atomic RMW.
This is independent of kptr inlining.
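As a loose userspace analogy only (C11 atomics, not kernel or JIT code), the
two instruction forms differ roughly the way a relaxed exchange differs from
a sequentially consistent one. The helper names below are hypothetical, and
C11 seq_cst is not identical to an LKMM full barrier:

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: exchange with no ordering guarantees, loosely
 * analogous to plain amswap.{w,d}.
 */
static inline uint64_t xchg_relaxed(_Atomic uint64_t *p, uint64_t v)
{
	return atomic_exchange_explicit(p, v, memory_order_relaxed);
}

/*
 * Hypothetical helper: ordered exchange, loosely analogous to
 * amswap_db.{w,d}. This is the kind of ordering expected from a
 * value-returning atomic RMW.
 */
static inline uint64_t xchg_full(_Atomic uint64_t *p, uint64_t v)
{
	return atomic_exchange_explicit(p, v, memory_order_seq_cst);
}
```

Both helpers return the previous value; only the ordering of surrounding
memory accesses differs.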
Patch 2 advertises bpf_jit_supports_ptr_xchg() so the verifier may inline
bpf_kptr_xchg() to BPF_XCHG on LoongArch.
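The gating can be pictured with the small model below. It is a sketch, not
verifier code: the enum, the function names, and the assumption that the
verifier also requires a requested JIT and a 64-bit kernel are illustrative
only.

```c
#include <stdbool.h>

/* Hypothetical model of the lowering decision, not verifier code. */
enum kptr_xchg_lowering {
	KPTR_XCHG_HELPER_CALL,	/* keep the call to bpf_kptr_xchg() */
	KPTR_XCHG_INLINE_XCHG,	/* rewrite the call to a BPF_XCHG atomic */
};

/* Answer given by a JIT that does not opt in. */
static bool jit_supports_ptr_xchg_default(void)
{
	return false;
}

/* Answer the LoongArch JIT gives once patch 2 is applied. */
static bool jit_supports_ptr_xchg_loongarch(void)
{
	return true;
}

/* Inline only when every precondition holds; otherwise keep the helper. */
static enum kptr_xchg_lowering
choose_kptr_xchg_lowering(bool jit_requested, bool is_64bit,
			  bool (*supports_ptr_xchg)(void))
{
	if (jit_requested && is_64bit && supports_ptr_xchg())
		return KPTR_XCHG_INLINE_XCHG;
	return KPTR_XCHG_HELPER_CALL;
}
```

With the default callback the helper call is kept; with the LoongArch
callback and a requested 64-bit JIT the model picks the inline path.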
Patches 3 and 4 extend the bpf selftests: functional coverage via the
kptr_xchg_inline test, and a kptr-xchg benchmark that can be run as:
    ./bench -d 30 -w 5 -p 1 kptr-xchg --nr_loops 256
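The benchmark program in patch 4 bounds its loop with MAX_XCHG_LOOPS and
breaks early on the runtime nr_loops value. A plain C model of that loop
shape (the function name is hypothetical, and the loop body is elided) shows
how the loop count behaves:

```c
#include <stdint.h>

#define MAX_XCHG_LOOPS 4096

/*
 * Hypothetical userspace model of the benchmark loop: returns how many
 * iterations run for a requested nr_loops.
 */
static uint32_t bounded_iterations(uint32_t nr_loops)
{
	uint32_t i;

	for (i = 0; i < MAX_XCHG_LOOPS; i++) {
		if (i >= nr_loops)
			break;
		/* The BPF program calls bpf_kptr_xchg() at this point. */
	}

	return i;
}
```

In this model any nr_loops above 4096 is capped at 4096, so larger values do
not add work per trigger.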
Chenguang Zhao (4):
LoongArch: bpf: Use amswap_db for BPF atomic xchg
LoongArch: bpf: Advertise JIT support for kptr xchg inline
selftests/bpf: Enable kptr_xchg_inline test on LoongArch
selftests/bpf: Add kptr-xchg benchmark
arch/loongarch/include/asm/inst.h | 2 +
arch/loongarch/net/bpf_jit.c | 9 +-
tools/testing/selftests/bpf/Makefile | 2 +
tools/testing/selftests/bpf/bench.c | 2 +
.../selftests/bpf/benchs/bench_kptr_xchg.c | 96 +++++++++++++++++++
.../bpf/prog_tests/kptr_xchg_inline.c | 3 +-
.../selftests/bpf/progs/kptr_xchg_bench.c | 48 ++++++++++
7 files changed, 159 insertions(+), 3 deletions(-)
create mode 100644 tools/testing/selftests/bpf/benchs/bench_kptr_xchg.c
create mode 100644 tools/testing/selftests/bpf/progs/kptr_xchg_bench.c
---
v2:
- Split the original single patch into four separate commits.
- Switch BPF_W/BPF_DW atomic xchg from plain amswap.{w,d} to the barrier-carrying amswap_db.{w,d}.
v1:
- https://lore.kernel.org/all/20260602021515.214560-1-zhaochenguang@kylinos.cn/
--
2.25.1
* [PATCH v2 1/4] LoongArch: bpf: Use amswap_db for BPF atomic xchg
2026-06-03 10:04 [PATCH v2 0/4] LoongArch bpf kptr xchg inline support Chenguang Zhao
@ 2026-06-03 10:04 ` Chenguang Zhao
2026-06-03 10:16 ` sashiko-bot
2026-06-03 10:59 ` bot+bpf-ci
2026-06-03 10:04 ` [PATCH v2 2/4] LoongArch: bpf: Advertise JIT support for kptr xchg inline Chenguang Zhao
` (2 subsequent siblings)
3 siblings, 2 replies; 10+ messages in thread
From: Chenguang Zhao @ 2026-06-03 10:04 UTC
To: Huacai Chen, WANG Xuerui, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Tiezhu Yang
Cc: Chenguang Zhao, Hengqi Chen, Song Liu, Yonghong Song, Jiri Olsa,
Shuah Khan, loongarch, bpf, linux-kselftest
Per the Linux Kernel Memory Model, value-returning atomic RMW instructions
imply a full barrier. On LoongArch the plain amswap.{w,d} variants do not
provide that ordering, so emit amswap_db.{w,d} for 32-bit and 64-bit
BPF_XCHG instead. This makes scalar atomic exchanges fully ordered,
independent of kptr inlining.
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
---
arch/loongarch/include/asm/inst.h | 2 ++
arch/loongarch/net/bpf_jit.c | 4 ++--
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index 76b723590023..636cfc524b02 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -783,6 +783,8 @@ DEF_EMIT_REG3_FORMAT(amswapb, amswapb_op)
DEF_EMIT_REG3_FORMAT(amswaph, amswaph_op)
DEF_EMIT_REG3_FORMAT(amswapw, amswapw_op)
DEF_EMIT_REG3_FORMAT(amswapd, amswapd_op)
+DEF_EMIT_REG3_FORMAT(amswapdbw, amswapdbw_op)
+DEF_EMIT_REG3_FORMAT(amswapdbd, amswapdbd_op)
#define DEF_EMIT_REG3SA2_FORMAT(NAME, OP) \
static inline void emit_##NAME(union loongarch_instruction *insn, \
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 24913dc7f4e8..f071d913e054 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -482,11 +482,11 @@ static int emit_atomic_rmw(const struct bpf_insn *insn, struct jit_ctx *ctx)
emit_zext_32(ctx, src, true);
break;
case BPF_W:
- emit_insn(ctx, amswapw, src, t1, t3);
+ emit_insn(ctx, amswapdbw, src, t1, t3);
emit_zext_32(ctx, src, true);
break;
case BPF_DW:
- emit_insn(ctx, amswapd, src, t1, t3);
+ emit_insn(ctx, amswapdbd, src, t1, t3);
break;
}
break;
--
2.25.1
* [PATCH v2 2/4] LoongArch: bpf: Advertise JIT support for kptr xchg inline
2026-06-03 10:04 [PATCH v2 0/4] LoongArch bpf kptr xchg inline support Chenguang Zhao
2026-06-03 10:04 ` [PATCH v2 1/4] LoongArch: bpf: Use amswap_db for BPF atomic xchg Chenguang Zhao
@ 2026-06-03 10:04 ` Chenguang Zhao
2026-06-03 10:27 ` sashiko-bot
2026-06-03 10:41 ` bot+bpf-ci
2026-06-03 10:04 ` [PATCH v2 3/4] selftests/bpf: Enable kptr_xchg_inline test on LoongArch Chenguang Zhao
2026-06-03 10:04 ` [PATCH v2 4/4] selftests/bpf: Add kptr-xchg benchmark Chenguang Zhao
3 siblings, 2 replies; 10+ messages in thread
From: Chenguang Zhao @ 2026-06-03 10:04 UTC
To: Huacai Chen, WANG Xuerui, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Tiezhu Yang
Cc: Chenguang Zhao, Hengqi Chen, Song Liu, Yonghong Song, Jiri Olsa,
Shuah Khan, loongarch, bpf, linux-kselftest
The BPF verifier can lower bpf_kptr_xchg() to BPF_XCHG when the JIT
advertises ptr xchg support. With ordered amswap_db.* emission from the
previous patch, declare that LoongArch bpf JIT supports this inlining.
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
---
arch/loongarch/net/bpf_jit.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index f071d913e054..4f3aa53eda20 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -2362,6 +2362,11 @@ bool bpf_jit_supports_fsession(void)
return true;
}
+bool bpf_jit_supports_ptr_xchg(void)
+{
+ return true;
+}
+
/* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
bool bpf_jit_supports_subprog_tailcalls(void)
{
--
2.25.1
* [PATCH v2 3/4] selftests/bpf: Enable kptr_xchg_inline test on LoongArch
2026-06-03 10:04 [PATCH v2 0/4] LoongArch bpf kptr xchg inline support Chenguang Zhao
2026-06-03 10:04 ` [PATCH v2 1/4] LoongArch: bpf: Use amswap_db for BPF atomic xchg Chenguang Zhao
2026-06-03 10:04 ` [PATCH v2 2/4] LoongArch: bpf: Advertise JIT support for kptr xchg inline Chenguang Zhao
@ 2026-06-03 10:04 ` Chenguang Zhao
2026-06-03 10:04 ` [PATCH v2 4/4] selftests/bpf: Add kptr-xchg benchmark Chenguang Zhao
3 siblings, 0 replies; 10+ messages in thread
From: Chenguang Zhao @ 2026-06-03 10:04 UTC
To: Huacai Chen, WANG Xuerui, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Tiezhu Yang
Cc: Chenguang Zhao, Hengqi Chen, Song Liu, Yonghong Song, Jiri Olsa,
Shuah Khan, loongarch, bpf, linux-kselftest
Run the kptr_xchg_inline functional test on LoongArch64 now that the bpf
JIT can inline bpf_kptr_xchg() with correct memory ordering.
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
---
tools/testing/selftests/bpf/prog_tests/kptr_xchg_inline.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/kptr_xchg_inline.c b/tools/testing/selftests/bpf/prog_tests/kptr_xchg_inline.c
index 7def158da9eb..8f7b58727416 100644
--- a/tools/testing/selftests/bpf/prog_tests/kptr_xchg_inline.c
+++ b/tools/testing/selftests/bpf/prog_tests/kptr_xchg_inline.c
@@ -14,7 +14,8 @@ void test_kptr_xchg_inline(void)
int err;
#if !(defined(__x86_64__) || defined(__aarch64__) || \
- (defined(__riscv) && __riscv_xlen == 64))
+ (defined(__riscv) && __riscv_xlen == 64) || \
+ (defined(__loongarch__) && __loongarch_grlen == 64))
test__skip();
return;
#endif
--
2.25.1
* [PATCH v2 4/4] selftests/bpf: Add kptr-xchg benchmark
2026-06-03 10:04 [PATCH v2 0/4] LoongArch bpf kptr xchg inline support Chenguang Zhao
` (2 preceding siblings ...)
2026-06-03 10:04 ` [PATCH v2 3/4] selftests/bpf: Enable kptr_xchg_inline test on LoongArch Chenguang Zhao
@ 2026-06-03 10:04 ` Chenguang Zhao
2026-06-03 10:40 ` sashiko-bot
3 siblings, 1 reply; 10+ messages in thread
From: Chenguang Zhao @ 2026-06-03 10:04 UTC
To: Huacai Chen, WANG Xuerui, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Kumar Kartikeya Dwivedi, Tiezhu Yang
Cc: Chenguang Zhao, Hengqi Chen, Song Liu, Yonghong Song, Jiri Olsa,
Shuah Khan, loongarch, bpf, linux-kselftest
Add a bpf selftest benchmark that exercises bpf_kptr_xchg() in a tight loop
so helper vs inlined JIT paths can be compared on supported architectures.
Signed-off-by: Chenguang Zhao <zhaochenguang@kylinos.cn>
---
tools/testing/selftests/bpf/Makefile | 2 +
tools/testing/selftests/bpf/bench.c | 2 +
.../selftests/bpf/benchs/bench_kptr_xchg.c | 96 +++++++++++++++++++
.../selftests/bpf/progs/kptr_xchg_bench.c | 48 ++++++++++
4 files changed, 148 insertions(+)
create mode 100644 tools/testing/selftests/bpf/benchs/bench_kptr_xchg.c
create mode 100644 tools/testing/selftests/bpf/progs/kptr_xchg_bench.c
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 6ef6872adbc3..ea4c22e20f3c 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -866,6 +866,7 @@ $(OUTPUT)/bench_htab_mem.o: $(OUTPUT)/htab_mem_bench.skel.h
$(OUTPUT)/bench_bpf_crypto.o: $(OUTPUT)/crypto_bench.skel.h
$(OUTPUT)/bench_sockmap.o: $(OUTPUT)/bench_sockmap_prog.skel.h
$(OUTPUT)/bench_lpm_trie_map.o: $(OUTPUT)/lpm_trie_bench.skel.h $(OUTPUT)/lpm_trie_map.skel.h
+$(OUTPUT)/bench_kptr_xchg.o: $(OUTPUT)/kptr_xchg_bench.skel.h
$(OUTPUT)/bench.o: bench.h testing_helpers.h $(BPFOBJ)
$(OUTPUT)/bench: LDLIBS += -lm
$(OUTPUT)/bench: $(OUTPUT)/bench.o \
@@ -888,6 +889,7 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o \
$(OUTPUT)/bench_bpf_crypto.o \
$(OUTPUT)/bench_sockmap.o \
$(OUTPUT)/bench_lpm_trie_map.o \
+ $(OUTPUT)/bench_kptr_xchg.o \
$(OUTPUT)/usdt_1.o \
$(OUTPUT)/usdt_2.o \
#
diff --git a/tools/testing/selftests/bpf/bench.c b/tools/testing/selftests/bpf/bench.c
index 029b3e21f438..2b6dd8aec282 100644
--- a/tools/testing/selftests/bpf/bench.c
+++ b/tools/testing/selftests/bpf/bench.c
@@ -575,6 +575,7 @@ extern const struct bench bench_lpm_trie_insert;
extern const struct bench bench_lpm_trie_update;
extern const struct bench bench_lpm_trie_delete;
extern const struct bench bench_lpm_trie_free;
+extern const struct bench bench_kptr_xchg;
static const struct bench *benchs[] = {
&bench_count_global,
@@ -653,6 +654,7 @@ static const struct bench *benchs[] = {
&bench_lpm_trie_update,
&bench_lpm_trie_delete,
&bench_lpm_trie_free,
+ &bench_kptr_xchg,
};
static void find_benchmark(void)
diff --git a/tools/testing/selftests/bpf/benchs/bench_kptr_xchg.c b/tools/testing/selftests/bpf/benchs/bench_kptr_xchg.c
new file mode 100644
index 000000000000..b8a0d346fda6
--- /dev/null
+++ b/tools/testing/selftests/bpf/benchs/bench_kptr_xchg.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2026. Loongson Technology Corporation Limited */
+#include <argp.h>
+#include "bench.h"
+#include "kptr_xchg_bench.skel.h"
+
+static struct ctx {
+ struct kptr_xchg_bench *skel;
+} ctx;
+
+static struct {
+ __u32 nr_loops;
+} args = {
+ .nr_loops = 256,
+};
+
+enum {
+ ARG_NR_LOOPS = 7000,
+};
+
+static const struct argp_option opts[] = {
+ { "nr_loops", ARG_NR_LOOPS, "nr_loops", 0,
+ "Set number of bpf_kptr_xchg() calls per trigger"},
+ {},
+};
+
+static error_t parse_arg(int key, char *arg, struct argp_state *state)
+{
+ switch (key) {
+ case ARG_NR_LOOPS:
+ args.nr_loops = strtol(arg, NULL, 10);
+ break;
+ default:
+ return ARGP_ERR_UNKNOWN;
+ }
+
+ return 0;
+}
+
+static const struct argp bench_kptr_xchg_argp = {
+ .options = opts,
+ .parser = parse_arg,
+};
+
+static void validate(void)
+{
+ if (env.consumer_cnt != 0) {
+ fprintf(stderr, "benchmark doesn't support consumer!\n");
+ exit(1);
+ }
+}
+
+static void *producer(void *input)
+{
+ while (true)
+ syscall(__NR_getpgid);
+
+ return NULL;
+}
+
+static void measure(struct bench_res *res)
+{
+ res->hits = atomic_swap(&ctx.skel->bss->hits, 0);
+}
+
+static void setup(void)
+{
+ struct bpf_link *link;
+
+ setup_libbpf();
+
+ ctx.skel = kptr_xchg_bench__open_and_load();
+ if (!ctx.skel) {
+ fprintf(stderr, "failed to open skeleton\n");
+ exit(1);
+ }
+
+ ctx.skel->data->nr_loops = args.nr_loops;
+
+ link = bpf_program__attach(ctx.skel->progs.benchmark);
+ if (!link) {
+ fprintf(stderr, "failed to attach program!\n");
+ exit(1);
+ }
+}
+
+const struct bench bench_kptr_xchg = {
+ .name = "kptr-xchg",
+ .argp = &bench_kptr_xchg_argp,
+ .validate = validate,
+ .setup = setup,
+ .producer_thread = producer,
+ .measure = measure,
+ .report_progress = ops_report_progress,
+ .report_final = ops_report_final,
+};
diff --git a/tools/testing/selftests/bpf/progs/kptr_xchg_bench.c b/tools/testing/selftests/bpf/progs/kptr_xchg_bench.c
new file mode 100644
index 000000000000..ff146e4dcde7
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/kptr_xchg_bench.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2026. Loongson Technology Corporation Limited */
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+
+#include "bpf_experimental.h"
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define MAX_XCHG_LOOPS 4096
+
+struct bin_data {
+ char blob[32];
+};
+
+#define private(name) SEC(".bss." #name) __hidden __attribute__((aligned(8)))
+
+private(kptr) struct bin_data __kptr *ptr;
+u32 nr_loops = 256;
+long hits;
+
+SEC("fentry/" SYS_PREFIX "sys_getpgid")
+int benchmark(void *ctx)
+{
+ struct bin_data *old;
+ u32 i;
+
+ for (i = 0; i < MAX_XCHG_LOOPS; i++) {
+ if (i >= nr_loops)
+ break;
+
+ old = bpf_kptr_xchg(&ptr, NULL);
+ if (old)
+ bpf_obj_drop(old);
+ }
+
+ __sync_add_and_fetch(&hits, i);
+ return 0;
+}
+
+/* BTF FUNC records are not generated for kfuncs referenced only through
+ * optimized paths. Keep bpf_obj_drop() visible to libbpf's kfunc linker.
+ */
+void __btf_root(void)
+{
+ bpf_obj_drop(NULL);
+}
--
2.25.1
* Re: [PATCH v2 1/4] LoongArch: bpf: Use amswap_db for BPF atomic xchg
2026-06-03 10:04 ` [PATCH v2 1/4] LoongArch: bpf: Use amswap_db for BPF atomic xchg Chenguang Zhao
@ 2026-06-03 10:16 ` sashiko-bot
2026-06-03 10:59 ` bot+bpf-ci
1 sibling, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-06-03 10:16 UTC
To: Chenguang Zhao; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] The patch incompletely fixes memory ordering for value-returning BPF atomic RMW instructions. While it adds full barriers for 32/64-bit BPF_XCHG by emitting `amswap_db`, it misses other BPF_FETCH operations, sub-word BPF_XCHG, and BPF_CMPXCHG, leaving them weakly ordered.
--
commit f10ece83a8a616fba153a535a64e9160f3060f64
Author: Chenguang Zhao <zhaochenguang@kylinos.cn>
LoongArch: bpf: Use amswap_db for BPF atomic xchg
The patch updates the LoongArch BPF JIT to use amswap_db.{w,d} instead of
plain amswap.{w,d} for 32-bit and 64-bit BPF_XCHG instructions. This ensures
that value-returning atomic exchanges provide the full memory barrier required
by the Linux Kernel Memory Model.
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 76b723590023d..636cfc524b020 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
[ ... ]
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 24913dc7f4e83..f071d913e0545 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -482,11 +482,11 @@ static int emit_atomic_rmw(const struct bpf_insn *insn, struct jit_ctx *ctx)
> emit_zext_32(ctx, src, true);
> break;
> case BPF_W:
> - emit_insn(ctx, amswapw, src, t1, t3);
> + emit_insn(ctx, amswapdbw, src, t1, t3);
> emit_zext_32(ctx, src, true);
> break;
> case BPF_DW:
> - emit_insn(ctx, amswapd, src, t1, t3);
> + emit_insn(ctx, amswapdbd, src, t1, t3);
> break;
> }
> break;
[Severity: High]
This is a pre-existing issue, but does this patch fully resolve the memory
ordering requirements for BPF atomics?
Per the LKMM and BPF documentation, all atomic RMW operations that return a
value must have sequentially consistent memory ordering (a full memory
barrier). While this patch addresses 32-bit and 64-bit BPF_XCHG, several other
value-returning atomics in emit_atomic_rmw() appear to still use weakly
ordered instructions.
For example, BPF_FETCH operations use plain amadd.w and amadd.d:
case BPF_ADD | BPF_FETCH:
switch (BPF_SIZE(insn->code)) {
...
case BPF_W:
emit_insn(ctx, amaddw, src, t1, t3);
emit_zext_32(ctx, src, true);
break;
case BPF_DW:
emit_insn(ctx, amaddd, src, t1, t3);
break;
Sub-word BPF_XCHG operations still emit weakly ordered amswap.b and amswap.h
without explicit barriers:
case BPF_XCHG:
switch (BPF_SIZE(insn->code)) {
case BPF_B:
...
emit_insn(ctx, amswapb, src, t1, t3);
...
case BPF_H:
...
emit_insn(ctx, amswaph, src, t1, t3);
Additionally, BPF_CMPXCHG emits a bare ll and sc loop without the required
dbar (e.g., dbar 0x700) before and after the loop:
case BPF_CMPXCHG:
...
if (isdw) {
emit_insn(ctx, lld, r0, t1, 0);
emit_insn(ctx, bne, t2, r0, 4);
move_reg(ctx, t3, src);
emit_insn(ctx, scd, t3, t1, 0);
emit_insn(ctx, beq, t3, LOONGARCH_GPR_ZERO, -4);
Could we also update these other value-returning atomic instructions in
emit_atomic_rmw() so that all lockless BPF algorithms execute safely?
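As a rough userspace analogy (C11 atomics, not JIT code), a compare-and-exchange
retry loop bracketed by full fences has the shape described above for the ll/sc
case. The name cmpxchg_full is hypothetical, and the fences only loosely
correspond to dbar:

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the value observed at *p. The store of
 * 'desired' happens only if that value equals 'old'.
 */
static uint64_t cmpxchg_full(_Atomic uint64_t *p, uint64_t old,
			     uint64_t desired)
{
	uint64_t expected = old;

	/* Barrier before the loop. */
	atomic_thread_fence(memory_order_seq_cst);

	/* A weak CAS may fail spuriously, much like a failed sc. */
	while (!atomic_compare_exchange_weak_explicit(p, &expected, desired,
						      memory_order_relaxed,
						      memory_order_relaxed)) {
		/* Genuine mismatch: stop, like the bne exit of the loop. */
		if (expected != old)
			break;
	}

	/* Barrier after the loop. */
	atomic_thread_fence(memory_order_seq_cst);

	return expected;
}
```

The relaxed loop alone would give atomicity without ordering; the two fences
are what supply the ordering in this model.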
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603100438.2177817-1-zhaochenguang@kylinos.cn?part=1
* Re: [PATCH v2 2/4] LoongArch: bpf: Advertise JIT support for kptr xchg inline
2026-06-03 10:04 ` [PATCH v2 2/4] LoongArch: bpf: Advertise JIT support for kptr xchg inline Chenguang Zhao
@ 2026-06-03 10:27 ` sashiko-bot
2026-06-03 10:41 ` bot+bpf-ci
1 sibling, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-06-03 10:27 UTC
To: Chenguang Zhao; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] LoongArch BPF JIT lacks required memory barriers for value-returning atomic instructions (`BPF_CMPXCHG` and `BPF_FETCH` variants), violating the Linux Kernel Memory Model.
--
commit ae71c5cdb14e5f9c74cac323c884423ce8768951
Author: Chenguang Zhao <zhaochenguang@kylinos.cn>
LoongArch: bpf: Advertise JIT support for kptr xchg inline
The BPF verifier can lower bpf_kptr_xchg() to BPF_XCHG when the JIT
advertises ptr xchg support. With ordered amswap_db.* emission from the
previous patch, declare that LoongArch bpf JIT supports this inlining.
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index f071d913e0545..4f3aa53eda205 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -2362,6 +2362,11 @@ bool bpf_jit_supports_fsession(void)
> return true;
> }
>
> +bool bpf_jit_supports_ptr_xchg(void)
> +{
> + return true;
> +}
> +
[Severity: High]
This is a pre-existing issue, but while the previous patch updated BPF_XCHG
to emit fully ordered amswap_db.* instructions, do the other value-returning
BPF atomics (BPF_CMPXCHG and BPF_FETCH variants) also need to be updated to
ensure sequential consistency?
According to the Linux Kernel Memory Model, value-returning atomic RMW
instructions must be fully ordered. In emit_atomic_rmw(), BPF_ADD | BPF_FETCH
and similar fetch variants emit weakly ordered instructions:
arch/loongarch/net/bpf_jit.c:emit_atomic_rmw() {
/* src = atomic_fetch_<op>(dst + off, src) */
case BPF_ADD | BPF_FETCH:
...
emit_insn(ctx, amaddd, src, t1, t3);
...
}
Should these be using the fully ordered _db variants (e.g., amadd_db.d) or
be wrapped with dbar 0?
Similarly, BPF_CMPXCHG emits an ll/sc loop without memory barriers before
or after the loop:
arch/loongarch/net/bpf_jit.c:emit_atomic_rmw() {
/* r0 = atomic_cmpxchg(dst + off, r0, src); */
case BPF_CMPXCHG:
move_reg(ctx, t2, r0);
if (isdw) {
emit_insn(ctx, lld, r0, t1, 0);
emit_insn(ctx, bne, t2, r0, 4);
move_reg(ctx, t3, src);
emit_insn(ctx, scd, t3, t1, 0);
emit_insn(ctx, beq, t3, LOONGARCH_GPR_ZERO, -4);
} ...
}
Could this leave BPF programs vulnerable to memory reordering bugs on
LoongArch if dbar 0 is not emitted?
> /* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
> bool bpf_jit_supports_subprog_tailcalls(void)
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603100438.2177817-1-zhaochenguang@kylinos.cn?part=2
* Re: [PATCH v2 4/4] selftests/bpf: Add kptr-xchg benchmark
2026-06-03 10:04 ` [PATCH v2 4/4] selftests/bpf: Add kptr-xchg benchmark Chenguang Zhao
@ 2026-06-03 10:40 ` sashiko-bot
0 siblings, 0 replies; 10+ messages in thread
From: sashiko-bot @ 2026-06-03 10:40 UTC
To: Chenguang Zhao; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] The multi-line comment in `kptr_xchg_bench.c` violates the mandatory BPF subsystem comment style.
--
commit 65ef658c1862e0b10ff0a6614c9d69939e4240fa
Author: Chenguang Zhao <zhaochenguang@kylinos.cn>
selftests/bpf: Add kptr-xchg benchmark
Add a bpf selftest benchmark that exercises bpf_kptr_xchg() in a tight loop
so helper vs inlined JIT paths can be compared on supported architectures.
> diff --git a/tools/testing/selftests/bpf/progs/kptr_xchg_bench.c b/tools/testing/selftests/bpf/progs/kptr_xchg_bench.c
> new file mode 100644
> index 0000000000000..ff146e4dcde7a
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/kptr_xchg_bench.c
[ ... ]
> + __sync_add_and_fetch(&hits, i);
> + return 0;
> +}
> +
> +/* BTF FUNC records are not generated for kfuncs referenced only through
> + * optimized paths. Keep bpf_obj_drop() visible to libbpf's kfunc linker.
> + */
[Severity: Low]
This isn't a bug, but the multi-line comment here violates the BPF subsystem
comment style, which strictly requires the opening /* to be on its own line.
Could you move the opening /* to its own empty line?
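For illustration, the comment reshaped as requested would look roughly like
this. The bpf_obj_drop_stub() function and the drops counter are hypothetical
stand-ins so the snippet compiles outside the selftests tree:

```c
#include <stddef.h>

/* Hypothetical stand-in for bpf_obj_drop(); counts calls for checking. */
static int drops;

static void bpf_obj_drop_stub(void *obj)
{
	(void)obj;
	drops++;
}

/*
 * BTF FUNC records are not generated for kfuncs referenced only through
 * optimized paths. Keep bpf_obj_drop() visible to libbpf's kfunc linker.
 */
void __btf_root(void)
{
	bpf_obj_drop_stub(NULL);
}
```

Only the placement of the opening /* changes; the comment text is the same.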
> +void __btf_root(void)
> +{
> + bpf_obj_drop(NULL);
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603100438.2177817-1-zhaochenguang@kylinos.cn?part=4
* Re: [PATCH v2 2/4] LoongArch: bpf: Advertise JIT support for kptr xchg inline
2026-06-03 10:04 ` [PATCH v2 2/4] LoongArch: bpf: Advertise JIT support for kptr xchg inline Chenguang Zhao
2026-06-03 10:27 ` sashiko-bot
@ 2026-06-03 10:41 ` bot+bpf-ci
1 sibling, 0 replies; 10+ messages in thread
From: bot+bpf-ci @ 2026-06-03 10:41 UTC
To: zhaochenguang, chenhuacai, kernel, ast, daniel, andrii,
martin.lau, eddyz87, memxor, yangtiezhu
Cc: zhaochenguang, hengqi.chen, song, yonghong.song, jolsa, shuah,
loongarch, bpf, linux-kselftest, ast, andrii, daniel, martin.lau,
eddyz87, yonghong.song, clm, ihor.solodrai
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index f071d913e054..4f3aa53eda20 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -2362,6 +2362,11 @@ bool bpf_jit_supports_fsession(void)
> return true;
> }
>
> +bool bpf_jit_supports_ptr_xchg(void)
> +{
> + return true;
> +}
> +
While patch 1 fixed BPF_XCHG (32/64-bit) to use amswap_db.{w,d} for full
barriers, do the other value-returning atomic RMW instructions also need
updates to ensure sequential consistency?
According to the Linux Kernel Memory Model, value-returning atomic RMW
instructions must be fully ordered. In emit_atomic_rmw():
1. BPF_FETCH operations (BPF_ADD|BPF_FETCH, BPF_AND|BPF_FETCH, etc.) use
plain amadd.{b,h,w,d}, amand.{w,d}, amor.{w,d}, amxor.{w,d} without _db
variants
2. Sub-word BPF_XCHG (BPF_B, BPF_H) use plain amswap.{b,h} without barriers
3. BPF_CMPXCHG uses bare ll/sc loops without dbar instructions
Could this leave BPF programs vulnerable to memory reordering bugs on
LoongArch if dbar 0 is not emitted or if the _db variants are not used?
(This was raised by sashiko-bot@kernel.org in v2:
https://lore.kernel.org/bpf/20260603102720.20CEA1F00893@smtp.kernel.org/)
> /* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
> bool bpf_jit_supports_subprog_tailcalls(void)
> {
---
AI reviewed your patch. Please fix the bug or reply by email explaining why it is not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26878707871
* Re: [PATCH v2 1/4] LoongArch: bpf: Use amswap_db for BPF atomic xchg
2026-06-03 10:04 ` [PATCH v2 1/4] LoongArch: bpf: Use amswap_db for BPF atomic xchg Chenguang Zhao
2026-06-03 10:16 ` sashiko-bot
@ 2026-06-03 10:59 ` bot+bpf-ci
1 sibling, 0 replies; 10+ messages in thread
From: bot+bpf-ci @ 2026-06-03 10:59 UTC
To: zhaochenguang, chenhuacai, kernel, ast, daniel, andrii,
martin.lau, eddyz87, memxor, yangtiezhu
Cc: zhaochenguang, hengqi.chen, song, yonghong.song, jolsa, shuah,
loongarch, bpf, linux-kselftest, ast, andrii, daniel, martin.lau,
eddyz87, yonghong.song, clm, ihor.solodrai
> diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
> index 76b723590023..636cfc524b02 100644
> --- a/arch/loongarch/include/asm/inst.h
> +++ b/arch/loongarch/include/asm/inst.h
> @@ -783,6 +783,8 @@ DEF_EMIT_REG3_FORMAT(amswapb, amswapb_op)
> DEF_EMIT_REG3_FORMAT(amswaph, amswaph_op)
> DEF_EMIT_REG3_FORMAT(amswapw, amswapw_op)
> DEF_EMIT_REG3_FORMAT(amswapd, amswapd_op)
> +DEF_EMIT_REG3_FORMAT(amswapdbw, amswapdbw_op)
> +DEF_EMIT_REG3_FORMAT(amswapdbd, amswapdbd_op)
>
> #define DEF_EMIT_REG3SA2_FORMAT(NAME, OP) \
> static inline void emit_##NAME(union loongarch_instruction *insn, \
> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
> index 24913dc7f4e8..f071d913e054 100644
> --- a/arch/loongarch/net/bpf_jit.c
> +++ b/arch/loongarch/net/bpf_jit.c
> @@ -482,11 +482,11 @@ static int emit_atomic_rmw(const struct bpf_insn *insn, struct jit_ctx *ctx)
> emit_zext_32(ctx, src, true);
> break;
> case BPF_W:
> - emit_insn(ctx, amswapw, src, t1, t3);
> + emit_insn(ctx, amswapdbw, src, t1, t3);
> emit_zext_32(ctx, src, true);
> break;
> case BPF_DW:
> - emit_insn(ctx, amswapd, src, t1, t3);
> + emit_insn(ctx, amswapdbd, src, t1, t3);
> break;
> }
> break;
The commit message states that value-returning atomic RMW instructions
imply a full barrier per the LKMM.
This change only updates the BPF_W and BPF_DW BPF_XCHG cases. Should the
other value-returning atomics in emit_atomic_rmw() get the same treatment?
The BPF_*_FETCH cases still emit the plain amadd/amand/amor/amxor variants:
arch/loongarch/net/bpf_jit.c:emit_atomic_rmw() {
...
case BPF_W:
emit_insn(ctx, amaddw, src, t1, t3);
emit_zext_32(ctx, src, true);
break;
case BPF_DW:
emit_insn(ctx, amaddd, src, t1, t3);
break;
...
case BPF_AND | BPF_FETCH:
if (isdw) {
emit_insn(ctx, amandd, src, t1, t3);
} else {
emit_insn(ctx, amandw, src, t1, t3);
...
}
The sub-word BPF_XCHG cases (BPF_B and BPF_H) still emit plain
amswap.b/amswap.h:
arch/loongarch/net/bpf_jit.c:emit_atomic_rmw() {
...
case BPF_B:
...
emit_insn(ctx, amswapb, src, t1, t3);
...
case BPF_H:
...
emit_insn(ctx, amswaph, src, t1, t3);
...
}
And BPF_CMPXCHG uses an ll/sc loop with no dbar barrier:
arch/loongarch/net/bpf_jit.c:emit_atomic_rmw() {
...
case BPF_CMPXCHG:
move_reg(ctx, t2, r0);
if (isdw) {
emit_insn(ctx, lld, r0, t1, 0);
emit_insn(ctx, bne, t2, r0, 4);
move_reg(ctx, t3, src);
emit_insn(ctx, scd, t3, t1, 0);
emit_insn(ctx, beq, t3, LOONGARCH_GPR_ZERO, -4);
} else {
...
}
These are also value-returning atomics. If the LKMM requires a full
barrier for all of them, would these cases need updating too so that
lockless BPF algorithms stay sequentially consistent?
This was also raised by the Sashiko AI review on v2:
https://lore.kernel.org/bpf/20260603101633.25FFA1F00893@smtp.kernel.org
There is no Fixes: tag on this commit.
The amswap.{w,d} BPF_XCHG path was introduced when the LoongArch BPF JIT
first implemented BPF_XCHG. Should this carry:
Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
---
AI reviewed your patch. Please fix the bug or reply by email explaining why it is not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26878707871