* [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking
@ 2026-03-30 13:27 Daniel Borkmann
2026-03-30 13:27 ` [PATCH bpf 2/2] selftests/bpf: Add more precision tracking tests for atomics Daniel Borkmann
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Daniel Borkmann @ 2026-03-30 13:27 UTC (permalink / raw)
To: eddyz87; +Cc: ast, bpf, STAR Labs SG
When backtrack_insn encounters a BPF_STX instruction with BPF_ATOMIC
and BPF_FETCH, the src register (or r0 for BPF_CMPXCHG) also acts as
a destination, thus receiving the old value from the memory location.
The current backtracking logic does not account for this. It treats
atomic fetch operations the same as regular stores where the src
register is only an input. This causes backtrack_insn to fail to
propagate precision to the stack slot, which is then not marked
as precise!
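For illustration, the pre-fix store backtracking path looks roughly
like the following sketch (paraphrased, not the exact upstream code):
precision moves from the tracked stack slot into sreg as the value's
source, but sreg is never treated as a destination of the instruction:

	/* Paraphrased sketch of the BPF_ST/BPF_STX backtracking
	 * path in backtrack_insn() before this fix.
	 */
	if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
		return 0;
	spi = insn_stack_access_spi(hist->flags);
	fr = insn_stack_access_frameno(hist->flags);
	if (!bt_is_frame_slot_set(bt, fr, spi))
		return 0;
	bt_clear_frame_slot(bt, fr, spi);
	if (class == BPF_STX)
		bt_set_reg(bt, sreg);	/* sreg only ever feeds the slot */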
Later, the verifier's path pruning can incorrectly consider two states
equivalent even though they differ in their stack state. In other
words, two branches can be treated as equivalent and thus get pruned
when they should not be.
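To illustrate the hazard, here is a hypothetical sketch in the style
of the selftests in patch 2/2 (function name and constants made up,
not the reproducer from the report). Two paths spill different
constants to fp-8, and the fetch result is later used as a variable
stack offset:

	SEC("?raw_tp")
	__naked int atomic_prune_sketch(void)
	{
		asm volatile (
		"call %[bpf_get_prandom_u32];"
		"r2 = 8;"
		"if r0 != 0 goto l0_%=;"
		"r2 = 16;"		/* second path differs only in fp-8 */
		"l0_%=:"
		"*(u64 *)(r10 - 8) = r2;"
		"r2 = 0;"
		".8byte %[fetch_add_insn];" /* r2 = atomic_fetch_add(*(u64 *)(r10 - 8), r2) */
		"r3 = r10;"
		"r3 += r2;"		/* offset depends on the spilled constant */
		"r0 = 0;"
		"exit;"
		:
		: __imm(bpf_get_prandom_u32),
		  __imm_insn(fetch_add_insn,
			     BPF_ATOMIC_OP(BPF_DW, BPF_ADD | BPF_FETCH, BPF_REG_10, BPF_REG_2, -8))
		: __clobber_all);
	}

Without fp-8 marked precise, the state from the first path can prune
the second one even though r3 ends up pointing at a different stack
offset.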
Fix it as follows: When the fetch dst register is being tracked for
precision, clear it and propagate precision over to the stack slot.
This is similar to how BPF_LDX handles loads from the stack.
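For comparison, the ldx-from-stack path referenced above does roughly
the following (again paraphrased from backtrack_insn, not the exact
code): the tracked destination register is cleared and precision is
transferred to the spilled slot:

	/* Paraphrased sketch of the BPF_LDX backtracking path. */
	if (!bt_is_reg_set(bt, dreg))
		return 0;
	bt_clear_reg(bt, dreg);
	/* loads from non-stack memory can't be backtracked further */
	if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
		return 0;
	spi = insn_stack_access_spi(hist->flags);
	fr = insn_stack_access_frameno(hist->flags);
	bt_set_frame_slot(bt, fr, spi);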
Before:
0: (b7) r1 = 8 ; R1=8
1: (7b) *(u64 *)(r10 -8) = r1 ; R1=8 R10=fp0 fp-8=8
2: (b7) r2 = 0 ; R2=0
3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2) ; R2=8 R10=fp0 fp-8=mmmmmmmm
4: (bf) r3 = r10 ; R3=fp0 R10=fp0
5: (0f) r3 += r2
mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
mark_precise: frame0: regs=r2 stack= before 4: (bf) r3 = r10
mark_precise: frame0: regs=r2 stack= before 3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2)
mark_precise: frame0: regs=r2 stack= before 2: (b7) r2 = 0
6: R2=8 R3=fp8
6: (b7) r0 = 0 ; R0=0
7: (95) exit
After:
0: (b7) r1 = 8 ; R1=8
1: (7b) *(u64 *)(r10 -8) = r1 ; R1=8 R10=fp0 fp-8=8
2: (b7) r2 = 0 ; R2=0
3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2) ; R2=8 R10=fp0 fp-8=mmmmmmmm
4: (bf) r3 = r10 ; R3=fp0 R10=fp0
5: (0f) r3 += r2
mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1
mark_precise: frame0: regs=r2 stack= before 4: (bf) r3 = r10
mark_precise: frame0: regs=r2 stack= before 3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2)
mark_precise: frame0: regs= stack=-8 before 2: (b7) r2 = 0
mark_precise: frame0: regs= stack=-8 before 1: (7b) *(u64 *)(r10 -8) = r1
mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 8
6: R2=8 R3=fp8
6: (b7) r0 = 0 ; R0=0
7: (95) exit
Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
Fixes: 5ca419f2864a ("bpf: Add BPF_FETCH field / create atomic_fetch_add instruction")
Reported-by: STAR Labs SG <info@starlabs.sg>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
kernel/bpf/verifier.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f108c01ff6d0..293aa957a5ff 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4474,6 +4474,31 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 		 * encountered a case of pointer subtraction.
 		 */
 		return -ENOTSUPP;
+
+		/* atomic fetch operation writes the old value into a
+		 * register (sreg or r0) and if it was tracked for
+		 * precision, propagate to the stack slot like we do
+		 * in ldx.
+		 */
+		if (class == BPF_STX && mode == BPF_ATOMIC &&
+		    (insn->imm & BPF_FETCH)) {
+			u32 load_reg = insn->imm == BPF_CMPXCHG ?
+				       BPF_REG_0 : sreg;
+
+			if (bt_is_reg_set(bt, load_reg)) {
+				bt_clear_reg(bt, load_reg);
+				/* atomic fetch from non-stack memory
+				 * can't be further backtracked, same
+				 * as for ldx.
+				 */
+				if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
+					return 0;
+				spi = insn_stack_access_spi(hist->flags);
+				fr = insn_stack_access_frameno(hist->flags);
+				bt_set_frame_slot(bt, fr, spi);
+				return 0;
+			}
+		}
 		/* scalars can only be spilled into stack */
 		if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
 			return 0;
--
2.43.0
^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH bpf 2/2] selftests/bpf: Add more precision tracking tests for atomics
  2026-03-30 13:27 [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking Daniel Borkmann
@ 2026-03-30 13:27 ` Daniel Borkmann
  2026-03-30 14:42   ` Puranjay Mohan
  2026-03-30 14:41 ` [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking Puranjay Mohan
  2026-03-30 14:45 ` Alexei Starovoitov
  2 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2026-03-30 13:27 UTC (permalink / raw)
  To: eddyz87; +Cc: ast, bpf

Add verifier precision tracking tests for BPF atomic fetch operations.
Validate that backtrack_insn correctly propagates precision from the
fetch dst_reg to the stack slot for atomic64_{fetch_add,xchg,cmpxchg}.

# LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t verifier_precision
[...]
+ /etc/rcS.d/S50-startup ./test_progs -t verifier_precision
[    1.697105] bpf_testmod: loading out-of-tree module taints kernel.
[    1.700220] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
[    1.777043] tsc: Refined TSC clocksource calibration: 3407.986 MHz
[    1.777619] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc6d7268, max_idle_ns: 440795260133 ns
[    1.778658] clocksource: Switched to clocksource tsc
#623/1   verifier_precision/bpf_neg:OK
#623/2   verifier_precision/bpf_end_to_le:OK
#623/3   verifier_precision/bpf_end_to_be:OK
#623/4   verifier_precision/bpf_end_bswap:OK
#623/5   verifier_precision/bpf_load_acquire:OK
#623/6   verifier_precision/bpf_store_release:OK
#623/7   verifier_precision/state_loop_first_last_equal:OK
#623/8   verifier_precision/bpf_cond_op_r10:OK
#623/9   verifier_precision/bpf_cond_op_not_r10:OK
#623/10  verifier_precision/bpf_atomic_fetch_add_precision:OK
#623/11  verifier_precision/bpf_atomic_xchg_precision:OK
#623/12  verifier_precision/bpf_atomic_cmpxchg_precision:OK
#623/13  verifier_precision/bpf_atomic_fetch_add_dual_precision:OK
#623/14  verifier_precision/bpf_atomic_cmpxchg_dual_precision:OK
#623/15  verifier_precision/bpf_atomic_fetch_add_map_precision:OK
#623/16  verifier_precision/bpf_atomic_cmpxchg_map_precision:OK
#623/17  verifier_precision/bpf_neg_2:OK
#623/18  verifier_precision/bpf_neg_3:OK
#623/19  verifier_precision/bpf_neg_4:OK
#623/20  verifier_precision/bpf_neg_5:OK
#623     verifier_precision:OK
Summary: 1/20 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 .../selftests/bpf/progs/verifier_precision.c | 207 ++++++++++++++++++
 1 file changed, 207 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_precision.c b/tools/testing/selftests/bpf/progs/verifier_precision.c
index 1fe090cd6744..6dbd4feda337 100644
--- a/tools/testing/selftests/bpf/progs/verifier_precision.c
+++ b/tools/testing/selftests/bpf/progs/verifier_precision.c
@@ -5,6 +5,13 @@
 #include "../../../include/linux/filter.h"
 #include "bpf_misc.h"
 
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, __u32);
+	__type(value, __u64);
+} precision_map SEC(".maps");
+
 SEC("?raw_tp")
 __success __log_level(2)
 __msg("mark_precise: frame0: regs=r2 stack= before 3: (bf) r1 = r10")
@@ -301,4 +308,204 @@ __naked int bpf_neg_5(void)
 	::: __clobber_all);
 }
 
+SEC("?raw_tp")
+__success __log_level(2)
+__msg("mark_precise: frame0: regs=r2 stack= before 4: (bf) r3 = r10")
+__msg("mark_precise: frame0: regs=r2 stack= before 3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2)")
+__msg("mark_precise: frame0: regs= stack=-8 before 2: (b7) r2 = 0")
+__msg("mark_precise: frame0: regs= stack=-8 before 1: (7b) *(u64 *)(r10 -8) = r1")
+__msg("mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 8")
+__naked int bpf_atomic_fetch_add_precision(void)
+{
+	asm volatile (
+		"r1 = 8;"
+		"*(u64 *)(r10 - 8) = r1;"
+		"r2 = 0;"
+		".8byte %[fetch_add_insn];" /* r2 = atomic_fetch_add(*(u64 *)(r10 - 8), r2) */
+		"r3 = r10;"
+		"r3 += r2;" /* mark_precise */
+		"r0 = 0;"
+		"exit;"
+		:
+		: __imm_insn(fetch_add_insn,
+			     BPF_ATOMIC_OP(BPF_DW, BPF_ADD | BPF_FETCH, BPF_REG_10, BPF_REG_2, -8))
+		: __clobber_all);
+}
+
+SEC("?raw_tp")
+__success __log_level(2)
+__msg("mark_precise: frame0: regs=r2 stack= before 4: (bf) r3 = r10")
+__msg("mark_precise: frame0: regs=r2 stack= before 3: (db) r2 = atomic64_xchg((u64 *)(r10 -8), r2)")
+__msg("mark_precise: frame0: regs= stack=-8 before 2: (b7) r2 = 0")
+__msg("mark_precise: frame0: regs= stack=-8 before 1: (7b) *(u64 *)(r10 -8) = r1")
+__msg("mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 8")
+__naked int bpf_atomic_xchg_precision(void)
+{
+	asm volatile (
+		"r1 = 8;"
+		"*(u64 *)(r10 - 8) = r1;"
+		"r2 = 0;"
+		".8byte %[xchg_insn];" /* r2 = atomic_xchg(*(u64 *)(r10 - 8), r2) */
+		"r3 = r10;"
+		"r3 += r2;" /* mark_precise */
+		"r0 = 0;"
+		"exit;"
+		:
+		: __imm_insn(xchg_insn,
+			     BPF_ATOMIC_OP(BPF_DW, BPF_XCHG, BPF_REG_10, BPF_REG_2, -8))
+		: __clobber_all);
+}
+
+SEC("?raw_tp")
+__success __log_level(2)
+__msg("mark_precise: frame0: regs=r0 stack= before 5: (bf) r3 = r10")
+__msg("mark_precise: frame0: regs=r0 stack= before 4: (db) r0 = atomic64_cmpxchg((u64 *)(r10 -8), r0, r2)")
+__msg("mark_precise: frame0: regs= stack=-8 before 3: (b7) r2 = 0")
+__msg("mark_precise: frame0: regs= stack=-8 before 2: (b7) r0 = 0")
+__msg("mark_precise: frame0: regs= stack=-8 before 1: (7b) *(u64 *)(r10 -8) = r1")
+__msg("mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 8")
+__naked int bpf_atomic_cmpxchg_precision(void)
+{
+	asm volatile (
+		"r1 = 8;"
+		"*(u64 *)(r10 - 8) = r1;"
+		"r0 = 0;"
+		"r2 = 0;"
+		".8byte %[cmpxchg_insn];" /* r0 = atomic_cmpxchg(*(u64 *)(r10 - 8), r0, r2) */
+		"r3 = r10;"
+		"r3 += r0;" /* mark_precise */
+		"r0 = 0;"
+		"exit;"
+		:
+		: __imm_insn(cmpxchg_insn,
+			     BPF_ATOMIC_OP(BPF_DW, BPF_CMPXCHG, BPF_REG_10, BPF_REG_2, -8))
+		: __clobber_all);
+}
+
+/* Regression test for dual precision: Both the fetched value (r2) and
+ * a reread of the same stack slot (r3) are tracked for precision. After
+ * the atomic operation, the stack slot is STACK_MISC. Thus, the ldx at
+ * insn 4 does NOT set INSN_F_STACK_ACCESS. Precision for the stack slot
+ * propagates solely through the atomic fetch's load side (insn 3).
+ */
+SEC("?raw_tp")
+__success __log_level(2)
+__msg("mark_precise: frame0: regs=r2,r3 stack= before 4: (79) r3 = *(u64 *)(r10 -8)")
+__msg("mark_precise: frame0: regs=r2 stack= before 3: (db) r2 = atomic64_fetch_add((u64 *)(r10 -8), r2)")
+__msg("mark_precise: frame0: regs= stack=-8 before 2: (b7) r2 = 0")
+__msg("mark_precise: frame0: regs= stack=-8 before 1: (7b) *(u64 *)(r10 -8) = r1")
+__msg("mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 8")
+__naked int bpf_atomic_fetch_add_dual_precision(void)
+{
+	asm volatile (
+		"r1 = 8;"
+		"*(u64 *)(r10 - 8) = r1;"
+		"r2 = 0;"
+		".8byte %[fetch_add_insn];" /* r2 = atomic_fetch_add(*(u64 *)(r10 - 8), r2) */
+		"r3 = *(u64 *)(r10 - 8);"
+		"r4 = r2;"
+		"r4 += r3;"
+		"r4 &= 7;"
+		"r5 = r10;"
+		"r5 += r4;" /* mark_precise */
+		"r0 = 0;"
+		"exit;"
+		:
+		: __imm_insn(fetch_add_insn,
+			     BPF_ATOMIC_OP(BPF_DW, BPF_ADD | BPF_FETCH, BPF_REG_10, BPF_REG_2, -8))
+		: __clobber_all);
+}
+
+SEC("?raw_tp")
+__success __log_level(2)
+__msg("mark_precise: frame0: regs=r0,r3 stack= before 5: (79) r3 = *(u64 *)(r10 -8)")
+__msg("mark_precise: frame0: regs=r0 stack= before 4: (db) r0 = atomic64_cmpxchg((u64 *)(r10 -8), r0, r2)")
+__msg("mark_precise: frame0: regs= stack=-8 before 3: (b7) r2 = 0")
+__msg("mark_precise: frame0: regs= stack=-8 before 2: (b7) r0 = 8")
+__msg("mark_precise: frame0: regs= stack=-8 before 1: (7b) *(u64 *)(r10 -8) = r1")
+__msg("mark_precise: frame0: regs=r1 stack= before 0: (b7) r1 = 8")
+__naked int bpf_atomic_cmpxchg_dual_precision(void)
+{
+	asm volatile (
+		"r1 = 8;"
+		"*(u64 *)(r10 - 8) = r1;"
+		"r0 = 8;"
+		"r2 = 0;"
+		".8byte %[cmpxchg_insn];" /* r0 = atomic_cmpxchg(*(u64 *)(r10 - 8), r0, r2) */
+		"r3 = *(u64 *)(r10 - 8);"
+		"r4 = r0;"
+		"r4 += r3;"
+		"r4 &= 7;"
+		"r5 = r10;"
+		"r5 += r4;" /* mark_precise */
+		"r0 = 0;"
+		"exit;"
+		:
+		: __imm_insn(cmpxchg_insn,
+			     BPF_ATOMIC_OP(BPF_DW, BPF_CMPXCHG, BPF_REG_10, BPF_REG_2, -8))
+		: __clobber_all);
+}
+
+SEC("?raw_tp")
+__success __log_level(2)
+__msg("mark_precise: frame0: regs=r1 stack= before 10: (57) r1 &= 7")
+__msg("mark_precise: frame0: regs=r1 stack= before 9: (db) r1 = atomic64_fetch_add((u64 *)(r0 +0), r1)")
+__not_msg("falling back to forcing all scalars precise")
+__naked int bpf_atomic_fetch_add_map_precision(void)
+{
+	asm volatile (
+		"r1 = 0;"
+		"*(u64 *)(r10 - 8) = r1;"
+		"r2 = r10;"
+		"r2 += -8;"
+		"r1 = %[precision_map] ll;"
+		"call %[bpf_map_lookup_elem];"
+		"if r0 == 0 goto 1f;"
+		"r1 = 0;"
+		".8byte %[fetch_add_insn];" /* r1 = atomic_fetch_add(*(u64 *)(r0 + 0), r1) */
+		"r1 &= 7;"
+		"r2 = r10;"
+		"r2 += r1;" /* mark_precise */
+		"1: r0 = 0;"
+		"exit;"
+		:
+		: __imm_addr(precision_map),
+		  __imm(bpf_map_lookup_elem),
+		  __imm_insn(fetch_add_insn,
+			     BPF_ATOMIC_OP(BPF_DW, BPF_ADD | BPF_FETCH, BPF_REG_0, BPF_REG_1, 0))
+		: __clobber_all);
+}
+
+SEC("?raw_tp")
+__success __log_level(2)
+__msg("mark_precise: frame0: regs=r0 stack= before 12: (57) r0 &= 7")
+__msg("mark_precise: frame0: regs=r0 stack= before 11: (db) r0 = atomic64_cmpxchg((u64 *)(r6 +0), r0, r1)")
+__not_msg("falling back to forcing all scalars precise")
+__naked int bpf_atomic_cmpxchg_map_precision(void)
+{
+	asm volatile (
+		"r1 = 0;"
+		"*(u64 *)(r10 - 8) = r1;"
+		"r2 = r10;"
+		"r2 += -8;"
+		"r1 = %[precision_map] ll;"
+		"call %[bpf_map_lookup_elem];"
+		"if r0 == 0 goto 1f;"
+		"r6 = r0;"
+		"r0 = 0;"
+		"r1 = 0;"
+		".8byte %[cmpxchg_insn];" /* r0 = atomic_cmpxchg(*(u64 *)(r6 + 0), r0, r1) */
+		"r0 &= 7;"
+		"r2 = r10;"
+		"r2 += r0;" /* mark_precise */
+		"1: r0 = 0;"
+		"exit;"
+		:
+		: __imm_addr(precision_map),
+		  __imm(bpf_map_lookup_elem),
+		  __imm_insn(cmpxchg_insn,
+			     BPF_ATOMIC_OP(BPF_DW, BPF_CMPXCHG, BPF_REG_6, BPF_REG_1, 0))
+		: __clobber_all);
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread
* Re: [PATCH bpf 2/2] selftests/bpf: Add more precision tracking tests for atomics
  2026-03-30 13:27 ` [PATCH bpf 2/2] selftests/bpf: Add more precision tracking tests for atomics Daniel Borkmann
@ 2026-03-30 14:42   ` Puranjay Mohan
  0 siblings, 0 replies; 7+ messages in thread
From: Puranjay Mohan @ 2026-03-30 14:42 UTC (permalink / raw)
  To: Daniel Borkmann, eddyz87; +Cc: ast, bpf

Daniel Borkmann <daniel@iogearbox.net> writes:

> Add verifier precision tracking tests for BPF atomic fetch operations.
> Validate that backtrack_insn correctly propagates precision from the
> fetch dst_reg to the stack slot for atomic64_{fetch_add,xchg,cmpxchg}.
>
> [...]
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Reviewed-by: Puranjay Mohan <puranjay@kernel.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking
  2026-03-30 13:27 [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking Daniel Borkmann
  2026-03-30 13:27 ` [PATCH bpf 2/2] selftests/bpf: Add more precision tracking tests for atomics Daniel Borkmann
@ 2026-03-30 14:41 ` Puranjay Mohan
  2026-03-30 21:56   ` Daniel Borkmann
  2 siblings, 1 reply; 7+ messages in thread
From: Puranjay Mohan @ 2026-03-30 14:41 UTC (permalink / raw)
  To: Daniel Borkmann, eddyz87; +Cc: ast, bpf, STAR Labs SG

Daniel Borkmann <daniel@iogearbox.net> writes:

> When backtrack_insn encounters a BPF_STX instruction with BPF_ATOMIC
> and BPF_FETCH, the src register (or r0 for BPF_CMPXCHG) also acts as
> a destination, thus receiving the old value from the memory location.
>
> [...]
>
> Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg")
> Fixes: 5ca419f2864a ("bpf: Add BPF_FETCH field / create atomic_fetch_add instruction")
> Reported-by: STAR Labs SG <info@starlabs.sg>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Although not related to this patch, it made me think about how the
conditional semantics of CMPXCHG could interact with precision
tracking, because the CMPXCHG could write to a stack slot or not
depending on the comparison, and that slot could later be used for
precision tracking.

But I found that although check_atomic_rmw() sets INSN_F_STACK_ACCESS
in the flags, it also makes the stack slot STACK_MISC, therefore an
instruction reading from it will not have INSN_F_STACK_ACCESS, so the
precision tracking will stop there.

This patch looks correct to me.

Reviewed-by: Puranjay Mohan <puranjay@kernel.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking
  2026-03-30 14:41 ` [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking Puranjay Mohan
@ 2026-03-30 21:56   ` Daniel Borkmann
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2026-03-30 21:56 UTC (permalink / raw)
  To: Puranjay Mohan, eddyz87; +Cc: ast, bpf, STAR Labs SG

On 3/30/26 4:41 PM, Puranjay Mohan wrote:
> Daniel Borkmann <daniel@iogearbox.net> writes:
>
>> [...]
>
> Although not related to this patch, it made me think about how the
> conditional semantics of CMPXCHG could interact with precision
> tracking, because the CMPXCHG could write to a stack slot or not
> depending on the comparison, and that slot could later be used for
> precision tracking.
>
> But I found that although check_atomic_rmw() sets INSN_F_STACK_ACCESS
> in the flags, it also makes the stack slot STACK_MISC, therefore an
> instruction reading from it will not have INSN_F_STACK_ACCESS, so the
> precision tracking will stop there.

Yeap, there should be test coverage in patch 2/2 for this case to
assert the current behavior, making sure we don't forget about this
case when the internals change.

> This patch looks correct to me.
>
> Reviewed-by: Puranjay Mohan <puranjay@kernel.org>

Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking
  2026-03-30 13:27 [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking Daniel Borkmann
  2026-03-30 13:27 ` [PATCH bpf 2/2] selftests/bpf: Add more precision tracking tests for atomics Daniel Borkmann
  2026-03-30 14:41 ` [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking Puranjay Mohan
@ 2026-03-30 14:45 ` Alexei Starovoitov
  2026-03-30 22:02   ` Daniel Borkmann
  2 siblings, 1 reply; 7+ messages in thread
From: Alexei Starovoitov @ 2026-03-30 14:45 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Eduard, Alexei Starovoitov, bpf, STAR Labs SG

On Mon, Mar 30, 2026 at 6:27 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> When backtrack_insn encounters a BPF_STX instruction with BPF_ATOMIC
> and BPF_FETCH, the src register (or r0 for BPF_CMPXCHG) also acts as
> a destination, thus receiving the old value from the memory location.
>
> [...]
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index f108c01ff6d0..293aa957a5ff 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -4474,6 +4474,31 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
>                  * encountered a case of pointer subtraction.
>                  */
>                 return -ENOTSUPP;
> +
> +               /* atomic fetch operation writes the old value into a

kernel comment style pls.

> +                * register (sreg or r0) and if it was tracked for
> +                * precision, propagate to the stack slot like we do
> +                * in ldx.
> +                */
> +               if (class == BPF_STX && mode == BPF_ATOMIC &&
> +                   (insn->imm & BPF_FETCH)) {
> +                       u32 load_reg = insn->imm == BPF_CMPXCHG ?
> +                                      BPF_REG_0 : sreg;
> +
> +                       if (bt_is_reg_set(bt, load_reg)) {

too much claude smell in here.
remove unnecessary indent pls.

> +                               bt_clear_reg(bt, load_reg);
> +                               /* atomic fetch from non-stack memory
> +                                * can't be further backtracked, same
> +                                * as for ldx.
> +                                */
> +                               if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
> +                                       return 0;

too much copy paste. I bet above can be combined with the same line below.

> +                               spi = insn_stack_access_spi(hist->flags);
> +                               fr = insn_stack_access_frameno(hist->flags);
> +                               bt_set_frame_slot(bt, fr, spi);
> +                               return 0;
> +                       }
> +               }
>                 /* scalars can only be spilled into stack */
>                 if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
>                         return 0;

Considering we're in rc6+, bpf-next is better.

pw-bot: cr

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking
  2026-03-30 14:45 ` Alexei Starovoitov
@ 2026-03-30 22:02   ` Daniel Borkmann
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2026-03-30 22:02 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Eduard, Alexei Starovoitov, bpf, STAR Labs SG

On 3/30/26 4:45 PM, Alexei Starovoitov wrote:
> On Mon, Mar 30, 2026 at 6:27 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>> [...]
>>
>> @@ -4474,6 +4474,31 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
>>                  * encountered a case of pointer subtraction.
>>                  */
>>                 return -ENOTSUPP;
>> +
>> +               /* atomic fetch operation writes the old value into a
>
> kernel comment style pls.

Ok, will change. The comment style in the rest of backtrack_insn is the
same, but if we're converging long-term to kernel style that's fine.

>> +                * register (sreg or r0) and if it was tracked for
>> +                * precision, propagate to the stack slot like we do
>> +                * in ldx.
>> +                */
>> +               if (class == BPF_STX && mode == BPF_ATOMIC &&
>> +                   (insn->imm & BPF_FETCH)) {
>> +                       u32 load_reg = insn->imm == BPF_CMPXCHG ?
>> +                                      BPF_REG_0 : sreg;
>> +
>> +                       if (bt_is_reg_set(bt, load_reg)) {
>
> too much claude smell in here.
> remove unnecessary indent pls.
>
>> +                               bt_clear_reg(bt, load_reg);
>> +                               /* atomic fetch from non-stack memory
>> +                                * can't be further backtracked, same
>> +                                * as for ldx.
>> +                                */
>> +                               if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
>> +                                       return 0;
>
> too much copy paste. I bet above can be combined with the same line below.

I'll play around with removing the indent and refactoring. It's a bit
of a mix between stx and ldx, closer to ldx actually. Worst case I'll
have this as a proper 'else if' case if it gets too convoluted in
either of the two.

>> +                               spi = insn_stack_access_spi(hist->flags);
>> +                               fr = insn_stack_access_frameno(hist->flags);
>> +                               bt_set_frame_slot(bt, fr, spi);
>> +                               return 0;
>> +                       }
>> +               }
>>                 /* scalars can only be spilled into stack */
>>                 if (!hist || !(hist->flags & INSN_F_STACK_ACCESS))
>>                         return 0;
>
> Considering we're in rc6+, bpf-next is better.

Yeap, will do!

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-03-30 22:02 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-30 13:27 [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking Daniel Borkmann
2026-03-30 13:27 ` [PATCH bpf 2/2] selftests/bpf: Add more precision tracking tests for atomics Daniel Borkmann
2026-03-30 14:42   ` Puranjay Mohan
2026-03-30 14:41 ` [PATCH bpf 1/2] bpf: Fix incorrect pruning due to atomic fetch precision tracking Puranjay Mohan
2026-03-30 21:56   ` Daniel Borkmann
2026-03-30 14:45 ` Alexei Starovoitov
2026-03-30 22:02   ` Daniel Borkmann