* [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping @ 2025-05-11 16:27 Yonghong Song 2025-05-11 16:28 ` [PATCH bpf-next 2/2] selftests/bpf: Add a test with r10 in conditional jmp Yonghong Song ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Yonghong Song @ 2025-05-11 16:27 UTC (permalink / raw) To: bpf Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau Yi Lai reported an issue ([1]) where the following warning appears in kernel dmesg: [ 60.643604] verifier backtracking bug [ 60.643635] WARNING: CPU: 10 PID: 2315 at kernel/bpf/verifier.c:4302 __mark_chain_precision+0x3a6c/0x3e10 [ 60.648428] Modules linked in: bpf_testmod(OE) [ 60.650471] CPU: 10 UID: 0 PID: 2315 Comm: test_progs Tainted: G OE 6.15.0-rc4-gef11287f8289-dirty #327 PREEMPT(full) [ 60.654385] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 60.656682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 60.660475] RIP: 0010:__mark_chain_precision+0x3a6c/0x3e10 [ 60.662814] Code: 5a 30 84 89 ea e8 c4 d9 01 00 80 3d 3e 7d d8 04 00 0f 85 60 fa ff ff c6 05 31 7d d8 04 01 48 c7 c7 00 58 30 84 e8 c4 06 a5 ff <0f> 0b e9 46 fa ff ff 48 ... [ 60.668720] RSP: 0018:ffff888116cc7298 EFLAGS: 00010246 [ 60.671075] RAX: 54d70e82dfd31900 RBX: ffff888115b65e20 RCX: 0000000000000000 [ 60.673659] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 00000000ffffffff [ 60.676241] RBP: 0000000000000400 R08: ffff8881f6f23bd3 R09: 1ffff1103ede477a [ 60.678787] R10: dffffc0000000000 R11: ffffed103ede477b R12: ffff888115b60ae8 [ 60.681420] R13: 1ffff11022b6cbc4 R14: 00000000fffffff2 R15: 0000000000000001 [ 60.684030] FS: 00007fc2aedd80c0(0000) GS:ffff88826fa8a000(0000) knlGS:0000000000000000 [ 60.686837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 60.689027] CR2: 000056325369e000 CR3: 000000011088b002 CR4: 0000000000370ef0 [ 60.691623] Call Trace: [ 60.692821] <TASK> [ 60.693960] ? __pfx_verbose+0x10/0x10 [ 60.695656] ? __pfx_disasm_kfunc_name+0x10/0x10 [ 60.697495] check_cond_jmp_op+0x16f7/0x39b0 [ 60.699237] do_check+0x58fa/0xab10 ... Further analysis shows the warning is at line 4302 as below: 4294 /* static subprog call instruction, which 4295 * means that we are exiting current subprog, 4296 * so only r1-r5 could be still requested as 4297 * precise, r0 and r6-r10 or any stack slot in 4298 * the current frame should be zero by now 4299 */ 4300 if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) { 4301 verbose(env, "BUG regs %x\n", bt_reg_mask(bt)); 4302 WARN_ONCE(1, "verifier backtracking bug"); 4303 return -EFAULT; 4304 } With the below test (also in the next patch): __used __naked static void __bpf_jmp_r10(void) { asm volatile ( "r2 = 2314885393468386424 ll;" "goto +0;" "if r2 <= r10 goto +3;" "if r1 >= -1835016 goto +0;" "if r2 <= 8 goto +0;" "if r3 <= 0 goto +0;" "exit;" ::: __clobber_all); } SEC("?raw_tp") __naked void bpf_jmp_r10(void) { asm volatile ( "r3 = 0 ll;" "call __bpf_jmp_r10;" "r0 = 0;" "exit;" ::: __clobber_all); } The following is the verifier failure log: 0: (18) r3 = 0x0 ; R3_w=0 2: (85) call pc+2 caller: R10=fp0 callee: frame1: R1=ctx() R3_w=0 R10=fp0 5: frame1: R1=ctx() R3_w=0 R10=fp0 ; asm volatile (" \ @ verifier_precision.c:184 5: (18) r2 = 0x20202000256c6c78 ; frame1: R2_w=0x20202000256c6c78 7: (05) goto pc+0 8: (bd) if r2 <= r10 goto pc+3 ; frame1: R2_w=0x20202000256c6c78 R10=fp0 9: (35) if r1 >= 0xffe3fff8 goto pc+0 ; frame1: R1=ctx() 10: (b5) if r2 <= 0x8 goto pc+0 mark_precise: frame1: last_idx 10 first_idx 0 subseq_idx -1 mark_precise: frame1: regs=r2 stack= before 9: (35) if r1 >= 0xffe3fff8 goto pc+0 mark_precise: frame1: regs=r2 stack= before 8: (bd) if r2 <= r10 goto pc+3 mark_precise: frame1: regs=r2,r10 stack= before 7: (05) goto pc+0 mark_precise: frame1: regs=r2,r10 stack= before 5: (18) r2 = 0x20202000256c6c78 mark_precise: frame1: regs=r10 stack= before 2: (85) call pc+2 BUG regs 400 The main failure reason is due to r10 in precision backtracking bookkeeping. Actually r10 is always precise and there is no need to add it the precision backtracking bookkeeping. This patch fixed the problem by not adding r10 to prevision backtracking bookkeeping. [1] https://lore.kernel.org/bpf/Z%2F8q3xzpU59CIYQE@ly-workstation/ Reported by: Yi Lai <yi1.lai@linux.intel.com> Fixes: 407958a0e980 ("bpf: encapsulate precision backtracking bookkeeping") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> --- kernel/bpf/verifier.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 28f5a7899bd6..1cb4d80d15c1 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4413,8 +4413,10 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, * before it would be equally necessary to * propagate it to dreg. */ - bt_set_reg(bt, dreg); - bt_set_reg(bt, sreg); + if (dreg != BPF_REG_FP) + bt_set_reg(bt, dreg); + if (sreg != BPF_REG_FP) + bt_set_reg(bt, sreg); } else if (BPF_SRC(insn->code) == BPF_K) { /* dreg <cond> K * Only dreg still needs precision before -- 2.47.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH bpf-next 2/2] selftests/bpf: Add a test with r10 in conditional jmp 2025-05-11 16:27 [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping Yonghong Song @ 2025-05-11 16:28 ` Yonghong Song 2025-05-11 22:33 ` [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping Alexei Starovoitov 2025-05-12 16:26 ` Andrii Nakryiko 2 siblings, 0 replies; 10+ messages in thread From: Yonghong Song @ 2025-05-11 16:28 UTC (permalink / raw) To: bpf Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau Add a test with r10 in conditional jmp where the test passed. Without previous verifier change, the test will fail. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> --- .../selftests/bpf/progs/verifier_precision.c | 32 +++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/tools/testing/selftests/bpf/progs/verifier_precision.c b/tools/testing/selftests/bpf/progs/verifier_precision.c index 2dd0d15c2678..1591635e6e93 100644 --- a/tools/testing/selftests/bpf/progs/verifier_precision.c +++ b/tools/testing/selftests/bpf/progs/verifier_precision.c @@ -178,4 +178,36 @@ __naked int state_loop_first_last_equal(void) ); } +__used __naked static void __bpf_jmp_r10(void) +{ + asm volatile ( + "r2 = 2314885393468386424 ll;" + "goto +0;" + "if r2 <= r10 goto +3;" + "if r1 >= -1835016 goto +0;" + "if r2 <= 8 goto +0;" + "if r3 <= 0 goto +0;" + "exit;" + ::: __clobber_all); +} + +SEC("?raw_tp") +__success __log_level(2) +__msg("8: (bd) if r2 <= r10 goto pc+3") +__msg("9: (35) if r1 >= 0xffe3fff8 goto pc+0") +__msg("10: (b5) if r2 <= 0x8 goto pc+0") +__msg("mark_precise: frame1: last_idx 10 first_idx 0 subseq_idx -1") +__msg("mark_precise: frame1: regs=r2 stack= before 9: (35) if r1 >= 0xffe3fff8 goto pc+0") +__msg("mark_precise: frame1: regs=r2 stack= before 8: (bd) if r2 <= r10 goto pc+3") +__msg("mark_precise: frame1: regs=r2 stack= before 7: (05) goto pc+0") +__naked void bpf_jmp_r10(void) +{ + asm volatile ( + "r3 = 0 ll;" + "call __bpf_jmp_r10;" + "r0 = 0;" + "exit;" + ::: __clobber_all); +} + char _license[] SEC("license") = "GPL"; -- 2.47.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping 2025-05-11 16:27 [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping Yonghong Song 2025-05-11 16:28 ` [PATCH bpf-next 2/2] selftests/bpf: Add a test with r10 in conditional jmp Yonghong Song @ 2025-05-11 22:33 ` Alexei Starovoitov 2025-05-12 12:03 ` Ilya Leoshkevich 2025-05-12 16:26 ` Andrii Nakryiko 2 siblings, 1 reply; 10+ messages in thread From: Alexei Starovoitov @ 2025-05-11 22:33 UTC (permalink / raw) To: Yonghong Song, Ilya Leoshkevich Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team, Martin KaFai Lau On Sun, May 11, 2025 at 9:28 AM Yonghong Song <yonghong.song@linux.dev> wrote: > > Reported by: Yi Lai <yi1.lai@linux.intel.com> > Fixes: 407958a0e980 ("bpf: encapsulate precision backtracking bookkeeping") > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > --- > kernel/bpf/verifier.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 28f5a7899bd6..1cb4d80d15c1 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -4413,8 +4413,10 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, > * before it would be equally necessary to > * propagate it to dreg. > */ > - bt_set_reg(bt, dreg); > - bt_set_reg(bt, sreg); > + if (dreg != BPF_REG_FP) > + bt_set_reg(bt, dreg); > + if (sreg != BPF_REG_FP) > + bt_set_reg(bt, sreg); The fix makes sense to me. but it crashes on s390 according to CI: 2025-05-11T16:48:18.5929491Z #401 struct_ops_refcounted:OK 2025-05-11T16:48:18.7330807Z ------------[ cut here ]------------ 2025-05-11T16:48:18.7333824Z kernel BUG at kernel/bpf/core.c:533! 2025-05-11T16:48:18.7335154Z monitor event: 0040 ilc:2 [#1]SMP 2025-05-11T16:48:18.7336972Z Modules linked in: bpf_testmod(OE) [last unloaded: bpf_test_no_cfi(OE)] 2025-05-11T16:48:18.7341000Z CPU: 0 UID: 0 PID: 109 Comm: new_name Tainted: G OE 6.15.0-rc4-ga9827e5c6a13-dirty #13 NONE 2025-05-11T16:48:18.7343245Z Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE 2025-05-11T16:48:18.7344697Z Hardware name: IBM 8561 LT1 400 (KVM/Linux) 2025-05-11T16:48:18.7347056Z Krnl PSW : 0704d00180000000 000003320039d8ca (bpf_patch_insn_single+0x29a/0x2a0) 2025-05-11T16:48:18.7349372Z R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 2025-05-11T16:48:18.7351910Z Krnl GPRS: 000002b200000016 ffffffff7ffffffe ffffffffffffffde 00000000ffffffde 2025-05-11T16:48:18.7354602Z 0000000000000003 0000000000000005 0000000000000000 000002b2000b5048 2025-05-11T16:48:18.7356934Z 0000000000000018 000002b2000b5000 0000000000000003 0000000000000002 2025-05-11T16:48:18.7359164Z 000003ff81badf98 0000000000000002 000003320039d738 000002b200687840 2025-05-11T16:48:18.7361217Z Krnl Code: 000003320039d8bc: e3005ff0ff50 sty %r0,-16(%r5) 2025-05-11T16:48:18.7363048Z 000003320039d8c2: a7f4ffc6 brc 15,000003320039d84e 2025-05-11T16:48:18.7364611Z #000003320039d8c6: af000000 mc 0,0 2025-05-11T16:48:18.7366106Z >000003320039d8ca: 0707 bcr 0,%r7 2025-05-11T16:48:18.7367449Z 000003320039d8cc: 0707 bcr 0,%r7 2025-05-11T16:48:18.7368855Z 000003320039d8ce: 0707 bcr 0,%r7 2025-05-11T16:48:18.7403748Z 000003320039d8d0: c004004bdc60 brcl 0,0000033200d19190 2025-05-11T16:48:18.7407899Z 000003320039d8d6: eb6ff0480024 stmg %r6,%r15,72(%r15) 2025-05-11T16:48:18.7410576Z Call Trace: 2025-05-11T16:48:18.7411713Z [<000003320039d8ca>] bpf_patch_insn_single+0x29a/0x2a0 2025-05-11T16:48:18.7413433Z ([<000003320039d738>] bpf_patch_insn_single+0x108/0x2a0) 2025-05-11T16:48:18.7415210Z [<000003320039eb72>] bpf_jit_blind_constants+0xd2/0x1b0 2025-05-11T16:48:18.7416879Z [<000003320020b5ee>] bpf_int_jit_compile+0x46/0x448 2025-05-11T16:48:18.7418417Z [<00000332003c12d4>] jit_subprogs+0x594/0xbe0 2025-05-11T16:48:18.7419782Z [<00000332003dacc8>] bpf_check+0xe28/0x14b0 2025-05-11T16:48:18.7421128Z [<00000332003a9328>] bpf_prog_load+0x4d8/0xba0 2025-05-11T16:48:18.7422570Z [<00000332003ab976>] __sys_bpf+0x98e/0xdd0 2025-05-11T16:48:18.7423887Z [<00000332003abdfc>] __s390x_sys_bpf+0x44/0x50 2025-05-11T16:48:18.7425227Z [<0000033200ce61b2>] __do_syscall+0x132/0x260 2025-05-11T16:48:18.7426522Z [<0000033200cf162c>] system_call+0x74/0x98 Ilya, Could you please verify whether the fix is related or not ? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping 2025-05-11 22:33 ` [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping Alexei Starovoitov @ 2025-05-12 12:03 ` Ilya Leoshkevich 0 siblings, 0 replies; 10+ messages in thread From: Ilya Leoshkevich @ 2025-05-12 12:03 UTC (permalink / raw) To: Alexei Starovoitov, Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team, Martin KaFai Lau On Sun, 2025-05-11 at 15:33 -0700, Alexei Starovoitov wrote: > On Sun, May 11, 2025 at 9:28 AM Yonghong Song > <yonghong.song@linux.dev> wrote: > > > > Reported by: Yi Lai <yi1.lai@linux.intel.com> > > Fixes: 407958a0e980 ("bpf: encapsulate precision backtracking > > bookkeeping") > > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > > --- > > kernel/bpf/verifier.c | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 28f5a7899bd6..1cb4d80d15c1 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -4413,8 +4413,10 @@ static int backtrack_insn(struct > > bpf_verifier_env *env, int idx, int subseq_idx, > > * before it would be equally necessary to > > * propagate it to dreg. > > */ > > - bt_set_reg(bt, dreg); > > - bt_set_reg(bt, sreg); > > + if (dreg != BPF_REG_FP) > > + bt_set_reg(bt, dreg); > > + if (sreg != BPF_REG_FP) > > + bt_set_reg(bt, sreg); > > The fix makes sense to me. > > but it crashes on s390 according to CI: > > 2025-05-11T16:48:18.5929491Z #401 struct_ops_refcounted:OK > 2025-05-11T16:48:18.7330807Z ------------[ cut here ]------------ > 2025-05-11T16:48:18.7333824Z kernel BUG at kernel/bpf/core.c:533! > 2025-05-11T16:48:18.7335154Z monitor event: 0040 ilc:2 [#1]SMP > 2025-05-11T16:48:18.7336972Z Modules linked in: bpf_testmod(OE) [last > unloaded: bpf_test_no_cfi(OE)] > 2025-05-11T16:48:18.7341000Z CPU: 0 UID: 0 PID: 109 Comm: new_name > Tainted: G OE 6.15.0-rc4-ga9827e5c6a13-dirty #13 NONE > 2025-05-11T16:48:18.7343245Z Tainted: [O]=OOT_MODULE, > [E]=UNSIGNED_MODULE > 2025-05-11T16:48:18.7344697Z Hardware name: IBM 8561 LT1 400 > (KVM/Linux) > 2025-05-11T16:48:18.7347056Z Krnl PSW : 0704d00180000000 > 000003320039d8ca (bpf_patch_insn_single+0x29a/0x2a0) > 2025-05-11T16:48:18.7349372Z R:0 T:1 IO:1 EX:1 Key:0 M:1 > W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 > 2025-05-11T16:48:18.7351910Z Krnl GPRS: 000002b200000016 > ffffffff7ffffffe ffffffffffffffde 00000000ffffffde > 2025-05-11T16:48:18.7354602Z 0000000000000003 > 0000000000000005 0000000000000000 000002b2000b5048 > 2025-05-11T16:48:18.7356934Z 0000000000000018 > 000002b2000b5000 0000000000000003 0000000000000002 > 2025-05-11T16:48:18.7359164Z 000003ff81badf98 > 0000000000000002 000003320039d738 000002b200687840 > 2025-05-11T16:48:18.7361217Z Krnl Code: 000003320039d8bc: > e3005ff0ff50 > sty %r0,-16(%r5) > 2025-05-11T16:48:18.7363048Z 000003320039d8c2: a7f4ffc6 > brc > 15,000003320039d84e > 2025-05-11T16:48:18.7364611Z #000003320039d8c6: af000000 mc > 0,0 > 2025-05-11T16:48:18.7366106Z >000003320039d8ca: 0707 bcr > 0,%r7 > 2025-05-11T16:48:18.7367449Z 000003320039d8cc: 0707 bcr > 0,%r7 > 2025-05-11T16:48:18.7368855Z 000003320039d8ce: 0707 bcr > 0,%r7 > 2025-05-11T16:48:18.7403748Z 000003320039d8d0: > c004004bdc60 > brcl 0,0000033200d19190 > 2025-05-11T16:48:18.7407899Z 000003320039d8d6: > eb6ff0480024 > stmg %r6,%r15,72(%r15) > 2025-05-11T16:48:18.7410576Z Call Trace: > 2025-05-11T16:48:18.7411713Z [<000003320039d8ca>] > bpf_patch_insn_single+0x29a/0x2a0 > 2025-05-11T16:48:18.7413433Z ([<000003320039d738>] > bpf_patch_insn_single+0x108/0x2a0) > 2025-05-11T16:48:18.7415210Z [<000003320039eb72>] > bpf_jit_blind_constants+0xd2/0x1b0 > 2025-05-11T16:48:18.7416879Z [<000003320020b5ee>] > bpf_int_jit_compile+0x46/0x448 > 2025-05-11T16:48:18.7418417Z [<00000332003c12d4>] > jit_subprogs+0x594/0xbe0 > 2025-05-11T16:48:18.7419782Z [<00000332003dacc8>] > bpf_check+0xe28/0x14b0 > 2025-05-11T16:48:18.7421128Z [<00000332003a9328>] > bpf_prog_load+0x4d8/0xba0 > 2025-05-11T16:48:18.7422570Z [<00000332003ab976>] > __sys_bpf+0x98e/0xdd0 > 2025-05-11T16:48:18.7423887Z [<00000332003abdfc>] > __s390x_sys_bpf+0x44/0x50 > 2025-05-11T16:48:18.7425227Z [<0000033200ce61b2>] > __do_syscall+0x132/0x260 > 2025-05-11T16:48:18.7426522Z [<0000033200cf162c>] > system_call+0x74/0x98 > > > Ilya, > > Could you please verify whether the fix is related or not ? I assume what crashes here is subprogs_and_jit_harden. I could reproduce this neither using the build artifacts, nor in my own development setup with this series applied. subprogs_and_jit_harden is trying to induce a race condition by constantly toggling bpf_jit_harden. Running it in a loop for a while does not lead to any failures. I also cannot see how this can create problems with the existing code structure. bpf_jit_harden is used by bpf_jit_blinding_enabled() and bpf_jit_kallsyms_enabled(), each of which has only one call site. When these two see different values of bpf_jit_harden, nothing bad should happen. I suspect that this must an existing intermittently occurring issue. Can we re-run the CI to see if the BUG() is reproducible in the GHA environment? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping 2025-05-11 16:27 [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping Yonghong Song 2025-05-11 16:28 ` [PATCH bpf-next 2/2] selftests/bpf: Add a test with r10 in conditional jmp Yonghong Song 2025-05-11 22:33 ` [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping Alexei Starovoitov @ 2025-05-12 16:26 ` Andrii Nakryiko 2025-05-12 22:05 ` Andrii Nakryiko 2 siblings, 1 reply; 10+ messages in thread From: Andrii Nakryiko @ 2025-05-12 16:26 UTC (permalink / raw) To: Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau On Sun, May 11, 2025 at 9:28 AM Yonghong Song <yonghong.song@linux.dev> wrote: > > Yi Lai reported an issue ([1]) where the following warning appears > in kernel dmesg: > [ 60.643604] verifier backtracking bug > [ 60.643635] WARNING: CPU: 10 PID: 2315 at kernel/bpf/verifier.c:4302 __mark_chain_precision+0x3a6c/0x3e10 > [ 60.648428] Modules linked in: bpf_testmod(OE) > [ 60.650471] CPU: 10 UID: 0 PID: 2315 Comm: test_progs Tainted: G OE 6.15.0-rc4-gef11287f8289-dirty #327 PREEMPT(full) > [ 60.654385] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE > [ 60.656682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 > [ 60.660475] RIP: 0010:__mark_chain_precision+0x3a6c/0x3e10 > [ 60.662814] Code: 5a 30 84 89 ea e8 c4 d9 01 00 80 3d 3e 7d d8 04 00 0f 85 60 fa ff ff c6 05 31 7d d8 04 > 01 48 c7 c7 00 58 30 84 e8 c4 06 a5 ff <0f> 0b e9 46 fa ff ff 48 ... > [ 60.668720] RSP: 0018:ffff888116cc7298 EFLAGS: 00010246 > [ 60.671075] RAX: 54d70e82dfd31900 RBX: ffff888115b65e20 RCX: 0000000000000000 > [ 60.673659] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 00000000ffffffff > [ 60.676241] RBP: 0000000000000400 R08: ffff8881f6f23bd3 R09: 1ffff1103ede477a > [ 60.678787] R10: dffffc0000000000 R11: ffffed103ede477b R12: ffff888115b60ae8 > [ 60.681420] R13: 1ffff11022b6cbc4 R14: 00000000fffffff2 R15: 0000000000000001 > [ 60.684030] FS: 00007fc2aedd80c0(0000) GS:ffff88826fa8a000(0000) knlGS:0000000000000000 > [ 60.686837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 60.689027] CR2: 000056325369e000 CR3: 000000011088b002 CR4: 0000000000370ef0 > [ 60.691623] Call Trace: > [ 60.692821] <TASK> > [ 60.693960] ? __pfx_verbose+0x10/0x10 > [ 60.695656] ? __pfx_disasm_kfunc_name+0x10/0x10 > [ 60.697495] check_cond_jmp_op+0x16f7/0x39b0 > [ 60.699237] do_check+0x58fa/0xab10 > ... > > Further analysis shows the warning is at line 4302 as below: > > 4294 /* static subprog call instruction, which > 4295 * means that we are exiting current subprog, > 4296 * so only r1-r5 could be still requested as > 4297 * precise, r0 and r6-r10 or any stack slot in > 4298 * the current frame should be zero by now > 4299 */ > 4300 if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) { > 4301 verbose(env, "BUG regs %x\n", bt_reg_mask(bt)); > 4302 WARN_ONCE(1, "verifier backtracking bug"); > 4303 return -EFAULT; > 4304 } > > With the below test (also in the next patch): > __used __naked static void __bpf_jmp_r10(void) > { > asm volatile ( > "r2 = 2314885393468386424 ll;" > "goto +0;" > "if r2 <= r10 goto +3;" > "if r1 >= -1835016 goto +0;" > "if r2 <= 8 goto +0;" > "if r3 <= 0 goto +0;" > "exit;" > ::: __clobber_all); > } > > SEC("?raw_tp") > __naked void bpf_jmp_r10(void) > { > asm volatile ( > "r3 = 0 ll;" > "call __bpf_jmp_r10;" > "r0 = 0;" > "exit;" > ::: __clobber_all); > } > > The following is the verifier failure log: > 0: (18) r3 = 0x0 ; R3_w=0 > 2: (85) call pc+2 > caller: > R10=fp0 > callee: > frame1: R1=ctx() R3_w=0 R10=fp0 > 5: frame1: R1=ctx() R3_w=0 R10=fp0 > ; asm volatile (" \ @ verifier_precision.c:184 > 5: (18) r2 = 0x20202000256c6c78 ; frame1: R2_w=0x20202000256c6c78 > 7: (05) goto pc+0 > 8: (bd) if r2 <= r10 goto pc+3 ; frame1: R2_w=0x20202000256c6c78 R10=fp0 For stacks spill/fill we use INSN_F_STACK_ACCESS because not just r10 can be used to point to the stack. I wonder if we need to handle r10 more generically here? E.g., if here we had something like r1 = r10 r1 += -8 if r2 <= r1 goto pc +3 is it fine to track r1 as precise or we need to know that r1 is an alias to r10? Not sure myself yet, but I thought I'd bring this up as a concern. > 9: (35) if r1 >= 0xffe3fff8 goto pc+0 ; frame1: R1=ctx() > 10: (b5) if r2 <= 0x8 goto pc+0 > mark_precise: frame1: last_idx 10 first_idx 0 subseq_idx -1 > mark_precise: frame1: regs=r2 stack= before 9: (35) if r1 >= 0xffe3fff8 goto pc+0 > mark_precise: frame1: regs=r2 stack= before 8: (bd) if r2 <= r10 goto pc+3 > mark_precise: frame1: regs=r2,r10 stack= before 7: (05) goto pc+0 > mark_precise: frame1: regs=r2,r10 stack= before 5: (18) r2 = 0x20202000256c6c78 > mark_precise: frame1: regs=r10 stack= before 2: (85) call pc+2 > BUG regs 400 > > The main failure reason is due to r10 in precision backtracking bookkeeping. > Actually r10 is always precise and there is no need to add it the precision > backtracking bookkeeping. > > This patch fixed the problem by not adding r10 to prevision backtracking bookkeeping. > > [1] https://lore.kernel.org/bpf/Z%2F8q3xzpU59CIYQE@ly-workstation/ > > Reported by: Yi Lai <yi1.lai@linux.intel.com> > Fixes: 407958a0e980 ("bpf: encapsulate precision backtracking bookkeeping") > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > --- > kernel/bpf/verifier.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 28f5a7899bd6..1cb4d80d15c1 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -4413,8 +4413,10 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, > * before it would be equally necessary to > * propagate it to dreg. > */ > - bt_set_reg(bt, dreg); > - bt_set_reg(bt, sreg); > + if (dreg != BPF_REG_FP) > + bt_set_reg(bt, dreg); > + if (sreg != BPF_REG_FP) > + bt_set_reg(bt, sreg); > } else if (BPF_SRC(insn->code) == BPF_K) { > /* dreg <cond> K > * Only dreg still needs precision before > -- > 2.47.1 > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping 2025-05-12 16:26 ` Andrii Nakryiko @ 2025-05-12 22:05 ` Andrii Nakryiko 2025-05-13 5:20 ` Yonghong Song 2025-05-15 1:44 ` Yonghong Song 0 siblings, 2 replies; 10+ messages in thread From: Andrii Nakryiko @ 2025-05-12 22:05 UTC (permalink / raw) To: Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau On Mon, May 12, 2025 at 9:26 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Sun, May 11, 2025 at 9:28 AM Yonghong Song <yonghong.song@linux.dev> wrote: > > > > Yi Lai reported an issue ([1]) where the following warning appears > > in kernel dmesg: > > [ 60.643604] verifier backtracking bug > > [ 60.643635] WARNING: CPU: 10 PID: 2315 at kernel/bpf/verifier.c:4302 __mark_chain_precision+0x3a6c/0x3e10 > > [ 60.648428] Modules linked in: bpf_testmod(OE) > > [ 60.650471] CPU: 10 UID: 0 PID: 2315 Comm: test_progs Tainted: G OE 6.15.0-rc4-gef11287f8289-dirty #327 PREEMPT(full) > > [ 60.654385] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE > > [ 60.656682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 > > [ 60.660475] RIP: 0010:__mark_chain_precision+0x3a6c/0x3e10 > > [ 60.662814] Code: 5a 30 84 89 ea e8 c4 d9 01 00 80 3d 3e 7d d8 04 00 0f 85 60 fa ff ff c6 05 31 7d d8 04 > > 01 48 c7 c7 00 58 30 84 e8 c4 06 a5 ff <0f> 0b e9 46 fa ff ff 48 ... > > [ 60.668720] RSP: 0018:ffff888116cc7298 EFLAGS: 00010246 > > [ 60.671075] RAX: 54d70e82dfd31900 RBX: ffff888115b65e20 RCX: 0000000000000000 > > [ 60.673659] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 00000000ffffffff > > [ 60.676241] RBP: 0000000000000400 R08: ffff8881f6f23bd3 R09: 1ffff1103ede477a > > [ 60.678787] R10: dffffc0000000000 R11: ffffed103ede477b R12: ffff888115b60ae8 > > [ 60.681420] R13: 1ffff11022b6cbc4 R14: 00000000fffffff2 R15: 0000000000000001 > > [ 60.684030] FS: 00007fc2aedd80c0(0000) GS:ffff88826fa8a000(0000) knlGS:0000000000000000 > > [ 60.686837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 60.689027] CR2: 000056325369e000 CR3: 000000011088b002 CR4: 0000000000370ef0 > > [ 60.691623] Call Trace: > > [ 60.692821] <TASK> > > [ 60.693960] ? __pfx_verbose+0x10/0x10 > > [ 60.695656] ? __pfx_disasm_kfunc_name+0x10/0x10 > > [ 60.697495] check_cond_jmp_op+0x16f7/0x39b0 > > [ 60.699237] do_check+0x58fa/0xab10 > > ... > > > > Further analysis shows the warning is at line 4302 as below: > > > > 4294 /* static subprog call instruction, which > > 4295 * means that we are exiting current subprog, > > 4296 * so only r1-r5 could be still requested as > > 4297 * precise, r0 and r6-r10 or any stack slot in > > 4298 * the current frame should be zero by now > > 4299 */ > > 4300 if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) { > > 4301 verbose(env, "BUG regs %x\n", bt_reg_mask(bt)); > > 4302 WARN_ONCE(1, "verifier backtracking bug"); > > 4303 return -EFAULT; > > 4304 } > > > > With the below test (also in the next patch): > > __used __naked static void __bpf_jmp_r10(void) > > { > > asm volatile ( > > "r2 = 2314885393468386424 ll;" > > "goto +0;" > > "if r2 <= r10 goto +3;" > > "if r1 >= -1835016 goto +0;" > > "if r2 <= 8 goto +0;" > > "if r3 <= 0 goto +0;" > > "exit;" > > ::: __clobber_all); > > } > > > > SEC("?raw_tp") > > __naked void bpf_jmp_r10(void) > > { > > asm volatile ( > > "r3 = 0 ll;" > > "call __bpf_jmp_r10;" > > "r0 = 0;" > > "exit;" > > ::: __clobber_all); > > } > > > > The following is the verifier failure log: > > 0: (18) r3 = 0x0 ; R3_w=0 > > 2: (85) call pc+2 > > caller: > > R10=fp0 > > callee: > > frame1: R1=ctx() R3_w=0 R10=fp0 > > 5: frame1: R1=ctx() R3_w=0 R10=fp0 > > ; asm volatile (" \ @ verifier_precision.c:184 > > 5: (18) r2 = 0x20202000256c6c78 ; frame1: R2_w=0x20202000256c6c78 > > 7: (05) goto pc+0 > > 8: (bd) if r2 <= r10 goto pc+3 ; frame1: R2_w=0x20202000256c6c78 R10=fp0 > > For stacks spill/fill we use INSN_F_STACK_ACCESS because not just r10 > can be used to point to the stack. I wonder if we need to handle r10 > more generically here? > > E.g., if here we had something like > > r1 = r10 > r1 += -8 > if r2 <= r1 goto pc +3 > > is it fine to track r1 as precise or we need to know that r1 is an alias to r10? > > Not sure myself yet, but I thought I'd bring this up as a concern. > After discussing this with Eduard offline, I think that we should generalize this a bit and not hard-code r10 handling like this. Note how we use INSN_F_STACK_ACCESS to mark LDX and STX instructions as "accesses stack through register", regardless of whether that register is r10 or any other rx after `rX = r10; rX += <offset>`. I think we should do the same here more generally for all instructions, especially for conditional jumps. The only complication is that with INSN_F_STACK_ACCESS we have only one possible register within LDX/STX, while with conditional jumps we can have two registers (and both might be PTR_TO_STACK registers!). So I propose we split INSN_F_STACK_ACCESS into INSN_F_STACK_SRC and INSN_F_STACK_DST and use that to mark either src or dst register as being a PTR_TO_STACK. Then we can generically ignore any register that was a PTR_TO_STACK, because any such register is already implicitly precise. We'd need to slightly update existing code to use either INSN_F_STACK_SRC or INSN_F_STACK_DST, depending on LDX or STX, and then generalize all that to conditionals (and, technically, any other instruction). WDYT? Does it make sense? > > 9: (35) if r1 >= 0xffe3fff8 goto pc+0 ; frame1: R1=ctx() > > 10: (b5) if r2 <= 0x8 goto pc+0 > > mark_precise: frame1: last_idx 10 first_idx 0 subseq_idx -1 > > mark_precise: frame1: regs=r2 stack= before 9: (35) if r1 >= 0xffe3fff8 goto pc+0 > > mark_precise: frame1: regs=r2 stack= before 8: (bd) if r2 <= r10 goto pc+3 > > mark_precise: frame1: regs=r2,r10 stack= before 7: (05) goto pc+0 > > mark_precise: frame1: regs=r2,r10 stack= before 5: (18) r2 = 0x20202000256c6c78 > > mark_precise: frame1: regs=r10 stack= before 2: (85) call pc+2 > > BUG regs 400 > > > > The main failure reason is due to r10 in precision backtracking bookkeeping. > > Actually r10 is always precise and there is no need to add it the precision > > backtracking bookkeeping. > > > > This patch fixed the problem by not adding r10 to prevision backtracking bookkeeping. > > > > [1] https://lore.kernel.org/bpf/Z%2F8q3xzpU59CIYQE@ly-workstation/ > > > > Reported by: Yi Lai <yi1.lai@linux.intel.com> > > Fixes: 407958a0e980 ("bpf: encapsulate precision backtracking bookkeeping") > > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > > --- > > kernel/bpf/verifier.c | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 28f5a7899bd6..1cb4d80d15c1 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -4413,8 +4413,10 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, > > * before it would be equally necessary to > > * propagate it to dreg. > > */ > > - bt_set_reg(bt, dreg); > > - bt_set_reg(bt, sreg); > > + if (dreg != BPF_REG_FP) > > + bt_set_reg(bt, dreg); > > + if (sreg != BPF_REG_FP) > > + bt_set_reg(bt, sreg); > > } else if (BPF_SRC(insn->code) == BPF_K) { > > /* dreg <cond> K > > * Only dreg still needs precision before > > -- > > 2.47.1 > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping 2025-05-12 22:05 ` Andrii Nakryiko @ 2025-05-13 5:20 ` Yonghong Song 2025-05-15 1:44 ` Yonghong Song 1 sibling, 0 replies; 10+ messages in thread From: Yonghong Song @ 2025-05-13 5:20 UTC (permalink / raw) To: Andrii Nakryiko Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau On 5/12/25 6:05 AM, Andrii Nakryiko wrote: > On Mon, May 12, 2025 at 9:26 AM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: >> On Sun, May 11, 2025 at 9:28 AM Yonghong Song <yonghong.song@linux.dev> wrote: >>> Yi Lai reported an issue ([1]) where the following warning appears >>> in kernel dmesg: >>> [ 60.643604] verifier backtracking bug >>> [ 60.643635] WARNING: CPU: 10 PID: 2315 at kernel/bpf/verifier.c:4302 __mark_chain_precision+0x3a6c/0x3e10 >>> [ 60.648428] Modules linked in: bpf_testmod(OE) >>> [ 60.650471] CPU: 10 UID: 0 PID: 2315 Comm: test_progs Tainted: G OE 6.15.0-rc4-gef11287f8289-dirty #327 PREEMPT(full) >>> [ 60.654385] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE >>> [ 60.656682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 >>> [ 60.660475] RIP: 0010:__mark_chain_precision+0x3a6c/0x3e10 >>> [ 60.662814] Code: 5a 30 84 89 ea e8 c4 d9 01 00 80 3d 3e 7d d8 04 00 0f 85 60 fa ff ff c6 05 31 7d d8 04 >>> 01 48 c7 c7 00 58 30 84 e8 c4 06 a5 ff <0f> 0b e9 46 fa ff ff 48 ... >>> [ 60.668720] RSP: 0018:ffff888116cc7298 EFLAGS: 00010246 >>> [ 60.671075] RAX: 54d70e82dfd31900 RBX: ffff888115b65e20 RCX: 0000000000000000 >>> [ 60.673659] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 00000000ffffffff >>> [ 60.676241] RBP: 0000000000000400 R08: ffff8881f6f23bd3 R09: 1ffff1103ede477a >>> [ 60.678787] R10: dffffc0000000000 R11: ffffed103ede477b R12: ffff888115b60ae8 >>> [ 60.681420] R13: 1ffff11022b6cbc4 R14: 00000000fffffff2 R15: 0000000000000001 >>> [ 60.684030] FS: 00007fc2aedd80c0(0000) GS:ffff88826fa8a000(0000) knlGS:0000000000000000 >>> [ 60.686837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [ 60.689027] CR2: 000056325369e000 CR3: 000000011088b002 CR4: 0000000000370ef0 >>> [ 60.691623] Call Trace: >>> [ 60.692821] <TASK> >>> [ 60.693960] ? __pfx_verbose+0x10/0x10 >>> [ 60.695656] ? __pfx_disasm_kfunc_name+0x10/0x10 >>> [ 60.697495] check_cond_jmp_op+0x16f7/0x39b0 >>> [ 60.699237] do_check+0x58fa/0xab10 >>> ... >>> >>> Further analysis shows the warning is at line 4302 as below: >>> >>> 4294 /* static subprog call instruction, which >>> 4295 * means that we are exiting current subprog, >>> 4296 * so only r1-r5 could be still requested as >>> 4297 * precise, r0 and r6-r10 or any stack slot in >>> 4298 * the current frame should be zero by now >>> 4299 */ >>> 4300 if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) { >>> 4301 verbose(env, "BUG regs %x\n", bt_reg_mask(bt)); >>> 4302 WARN_ONCE(1, "verifier backtracking bug"); >>> 4303 return -EFAULT; >>> 4304 } >>> >>> With the below test (also in the next patch): >>> __used __naked static void __bpf_jmp_r10(void) >>> { >>> asm volatile ( >>> "r2 = 2314885393468386424 ll;" >>> "goto +0;" >>> "if r2 <= r10 goto +3;" >>> "if r1 >= -1835016 goto +0;" >>> "if r2 <= 8 goto +0;" >>> "if r3 <= 0 goto +0;" >>> "exit;" >>> ::: __clobber_all); >>> } >>> >>> SEC("?raw_tp") >>> __naked void bpf_jmp_r10(void) >>> { >>> asm volatile ( >>> "r3 = 0 ll;" >>> "call __bpf_jmp_r10;" >>> "r0 = 0;" >>> "exit;" >>> ::: __clobber_all); >>> } >>> >>> The following is the verifier failure log: >>> 0: (18) r3 = 0x0 ; R3_w=0 >>> 2: (85) call pc+2 >>> caller: >>> R10=fp0 >>> callee: >>> frame1: R1=ctx() R3_w=0 R10=fp0 >>> 5: frame1: R1=ctx() R3_w=0 R10=fp0 >>> ; asm volatile (" \ @ verifier_precision.c:184 >>> 5: (18) r2 = 0x20202000256c6c78 ; frame1: R2_w=0x20202000256c6c78 >>> 7: (05) goto pc+0 >>> 8: (bd) if r2 <= r10 goto pc+3 ; frame1: R2_w=0x20202000256c6c78 R10=fp0 >> For stacks spill/fill we use INSN_F_STACK_ACCESS because not just r10 >> can be used to point to the stack. I wonder if we need to handle r10 >> more generically here? >> >> E.g., if here we had something like >> >> r1 = r10 >> r1 += -8 >> if r2 <= r1 goto pc +3 >> >> is it fine to track r1 as precise or we need to know that r1 is an alias to r10? >> >> Not sure myself yet, but I thought I'd bring this up as a concern. >> > After discussing this with Eduard offline, I think that we should > generalize this a bit and not hard-code r10 handling like this. > > Note how we use INSN_F_STACK_ACCESS to mark LDX and STX instructions > as "accesses stack through register", regardless of whether that > register is r10 or any other rx after `rX = r10; rX += <offset>`. I > think we should do the same here more generally for all instructions, > especially for conditional jumps. > > The only complication is that with INSN_F_STACK_ACCESS we have only > one possible register within LDX/STX, while with conditional jumps we > can have two registers (and both might be PTR_TO_STACK registers!). > > So I propose we split INSN_F_STACK_ACCESS into INSN_F_STACK_SRC and > INSN_F_STACK_DST and use that to mark either src or dst register as > being a PTR_TO_STACK. Then we can generically ignore any register that > was a PTR_TO_STACK, because any such register is already implicitly > precise. > > We'd need to slightly update existing code to use either > INSN_F_STACK_SRC or INSN_F_STACK_DST, depending on LDX or STX, and > then generalize all that to conditionals (and, technically, any other > instruction). > > WDYT? Does it make sense? Indeed, what you suggested does make sense. As you mentioned that we could have r7 = r10 and later r7 may appear in comparison. INSN_F_STACK_SRC and INSN_F_STACK_DST makes sense. I will make changes along this line. > >>> 9: (35) if r1 >= 0xffe3fff8 goto pc+0 ; frame1: R1=ctx() >>> 10: (b5) if r2 <= 0x8 goto pc+0 >>> mark_precise: frame1: last_idx 10 first_idx 0 subseq_idx -1 >>> mark_precise: frame1: regs=r2 stack= before 9: (35) if r1 >= 0xffe3fff8 goto pc+0 >>> mark_precise: frame1: regs=r2 stack= before 8: (bd) if r2 <= r10 goto pc+3 >>> mark_precise: frame1: regs=r2,r10 stack= before 7: (05) goto pc+0 >>> mark_precise: frame1: regs=r2,r10 stack= before 5: (18) r2 = 0x20202000256c6c78 >>> mark_precise: frame1: regs=r10 stack= before 2: (85) call pc+2 >>> BUG regs 400 >>> >>> The main failure reason is due to r10 in precision backtracking bookkeeping. >>> Actually r10 is always precise and there is no need to add it the precision >>> backtracking bookkeeping. >>> >>> This patch fixed the problem by not adding r10 to prevision backtracking bookkeeping. >>> >>> [1] https://lore.kernel.org/bpf/Z%2F8q3xzpU59CIYQE@ly-workstation/ >>> >>> Reported by: Yi Lai <yi1.lai@linux.intel.com> >>> Fixes: 407958a0e980 ("bpf: encapsulate precision backtracking bookkeeping") >>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> >>> --- >>> kernel/bpf/verifier.c | 6 ++++-- >>> 1 file changed, 4 insertions(+), 2 deletions(-) >>> >>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >>> index 28f5a7899bd6..1cb4d80d15c1 100644 >>> --- a/kernel/bpf/verifier.c >>> +++ b/kernel/bpf/verifier.c >>> @@ -4413,8 +4413,10 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, >>> * before it would be equally necessary to >>> * propagate it to dreg. >>> */ >>> - bt_set_reg(bt, dreg); >>> - bt_set_reg(bt, sreg); >>> + if (dreg != BPF_REG_FP) >>> + bt_set_reg(bt, dreg); >>> + if (sreg != BPF_REG_FP) >>> + bt_set_reg(bt, sreg); >>> } else if (BPF_SRC(insn->code) == BPF_K) { >>> /* dreg <cond> K >>> * Only dreg still needs precision before >>> -- >>> 2.47.1 >>> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping 2025-05-12 22:05 ` Andrii Nakryiko 2025-05-13 5:20 ` Yonghong Song @ 2025-05-15 1:44 ` Yonghong Song 2025-05-15 17:47 ` Andrii Nakryiko 1 sibling, 1 reply; 10+ messages in thread From: Yonghong Song @ 2025-05-15 1:44 UTC (permalink / raw) To: Andrii Nakryiko Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau On 5/12/25 6:05 AM, Andrii Nakryiko wrote: > On Mon, May 12, 2025 at 9:26 AM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: >> On Sun, May 11, 2025 at 9:28 AM Yonghong Song <yonghong.song@linux.dev> wrote: >>> Yi Lai reported an issue ([1]) where the following warning appears >>> in kernel dmesg: >>> [ 60.643604] verifier backtracking bug >>> [ 60.643635] WARNING: CPU: 10 PID: 2315 at kernel/bpf/verifier.c:4302 __mark_chain_precision+0x3a6c/0x3e10 >>> [ 60.648428] Modules linked in: bpf_testmod(OE) >>> [ 60.650471] CPU: 10 UID: 0 PID: 2315 Comm: test_progs Tainted: G OE 6.15.0-rc4-gef11287f8289-dirty #327 PREEMPT(full) >>> [ 60.654385] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE >>> [ 60.656682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 >>> [ 60.660475] RIP: 0010:__mark_chain_precision+0x3a6c/0x3e10 >>> [ 60.662814] Code: 5a 30 84 89 ea e8 c4 d9 01 00 80 3d 3e 7d d8 04 00 0f 85 60 fa ff ff c6 05 31 7d d8 04 >>> 01 48 c7 c7 00 58 30 84 e8 c4 06 a5 ff <0f> 0b e9 46 fa ff ff 48 ... >>> [ 60.668720] RSP: 0018:ffff888116cc7298 EFLAGS: 00010246 >>> [ 60.671075] RAX: 54d70e82dfd31900 RBX: ffff888115b65e20 RCX: 0000000000000000 >>> [ 60.673659] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 00000000ffffffff >>> [ 60.676241] RBP: 0000000000000400 R08: ffff8881f6f23bd3 R09: 1ffff1103ede477a >>> [ 60.678787] R10: dffffc0000000000 R11: ffffed103ede477b R12: ffff888115b60ae8 >>> [ 60.681420] R13: 1ffff11022b6cbc4 R14: 00000000fffffff2 R15: 0000000000000001 >>> [ 60.684030] FS: 00007fc2aedd80c0(0000) GS:ffff88826fa8a000(0000) knlGS:0000000000000000 >>> [ 60.686837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [ 60.689027] CR2: 000056325369e000 CR3: 000000011088b002 CR4: 0000000000370ef0 >>> [ 60.691623] Call Trace: >>> [ 60.692821] <TASK> >>> [ 60.693960] ? __pfx_verbose+0x10/0x10 >>> [ 60.695656] ? __pfx_disasm_kfunc_name+0x10/0x10 >>> [ 60.697495] check_cond_jmp_op+0x16f7/0x39b0 >>> [ 60.699237] do_check+0x58fa/0xab10 >>> ... >>> >>> Further analysis shows the warning is at line 4302 as below: >>> >>> 4294 /* static subprog call instruction, which >>> 4295 * means that we are exiting current subprog, >>> 4296 * so only r1-r5 could be still requested as >>> 4297 * precise, r0 and r6-r10 or any stack slot in >>> 4298 * the current frame should be zero by now >>> 4299 */ >>> 4300 if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) { >>> 4301 verbose(env, "BUG regs %x\n", bt_reg_mask(bt)); >>> 4302 WARN_ONCE(1, "verifier backtracking bug"); >>> 4303 return -EFAULT; >>> 4304 } >>> >>> With the below test (also in the next patch): >>> __used __naked static void __bpf_jmp_r10(void) >>> { >>> asm volatile ( >>> "r2 = 2314885393468386424 ll;" >>> "goto +0;" >>> "if r2 <= r10 goto +3;" >>> "if r1 >= -1835016 goto +0;" >>> "if r2 <= 8 goto +0;" >>> "if r3 <= 0 goto +0;" >>> "exit;" >>> ::: __clobber_all); >>> } >>> >>> SEC("?raw_tp") >>> __naked void bpf_jmp_r10(void) >>> { >>> asm volatile ( >>> "r3 = 0 ll;" >>> "call __bpf_jmp_r10;" >>> "r0 = 0;" >>> "exit;" >>> ::: __clobber_all); >>> } >>> >>> The following is the verifier failure log: >>> 0: (18) r3 = 0x0 ; R3_w=0 >>> 2: (85) call pc+2 >>> caller: >>> R10=fp0 >>> callee: >>> frame1: R1=ctx() R3_w=0 R10=fp0 >>> 5: frame1: R1=ctx() R3_w=0 R10=fp0 >>> ; asm volatile (" \ @ verifier_precision.c:184 >>> 5: (18) r2 = 0x20202000256c6c78 ; frame1: R2_w=0x20202000256c6c78 >>> 7: (05) goto pc+0 >>> 8: (bd) if r2 <= r10 goto pc+3 ; frame1: R2_w=0x20202000256c6c78 R10=fp0 >> For stacks spill/fill we use INSN_F_STACK_ACCESS because not just r10 >> can be used to point to the stack. I wonder if we need to handle r10 >> more generically here? >> >> E.g., if here we had something like >> >> r1 = r10 >> r1 += -8 >> if r2 <= r1 goto pc +3 >> >> is it fine to track r1 as precise or we need to know that r1 is an alias to r10? >> >> Not sure myself yet, but I thought I'd bring this up as a concern. In backtrack_insn, we have: } else if (opcode == BPF_MOV) { if (BPF_SRC(insn->code) == BPF_X) { /* dreg = sreg or dreg = (s8, s16, s32)sreg * dreg needs precision after this insn * sreg needs precision before this insn */ bt_clear_reg(bt, dreg); if (sreg != BPF_REG_FP) bt_set_reg(bt, sreg); } else { /* dreg = K * dreg needs precision after this insn. * Corresponding register is already marked * as precise=true in this verifier state. * No further markings in parent are necessary */ bt_clear_reg(bt, dreg); } So for insn 'r1 = r10', even if r1 is marked precise, but based on the above code r1 will be cleared and r10 will not be added to bt_set_reg due to 'sreg != BPF_REG_FP'. So the current implementation should be okay. >> > After discussing this with Eduard offline, I think that we should > generalize this a bit and not hard-code r10 handling like this. > > Note how we use INSN_F_STACK_ACCESS to mark LDX and STX instructions > as "accesses stack through register", regardless of whether that > register is r10 or any other rx after `rX = r10; rX += <offset>`. I > think we should do the same here more generally for all instructions, > especially for conditional jumps. > > The only complication is that with INSN_F_STACK_ACCESS we have only > one possible register within LDX/STX, while with conditional jumps we > can have two registers (and both might be PTR_TO_STACK registers!). > > So I propose we split INSN_F_STACK_ACCESS into INSN_F_STACK_SRC and > INSN_F_STACK_DST and use that to mark either src or dst register as > being a PTR_TO_STACK. Then we can generically ignore any register that > was a PTR_TO_STACK, because any such register is already implicitly > precise. > > We'd need to slightly update existing code to use either > INSN_F_STACK_SRC or INSN_F_STACK_DST, depending on LDX or STX, and > then generalize all that to conditionals (and, technically, any other > instruction). > > WDYT? Does it make sense? I tried to prototype based on the above idea. But ultimately I gave up. The following are some of my analysis. The INSN_F_STACK_ACCESS is used for stack access (load and store). See: /* instruction history flags, used in bpf_insn_hist_entry.flags field */ enum { /* instruction references stack slot through PTR_TO_STACK register; * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8) * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512, * 8 bytes per slot, so slot index (spi) is [0, 63]) */ INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */ INSN_F_SPI_MASK = 0x3f, /* 6 bits */ INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */ INSN_F_STACK_ACCESS = BIT(9), /* we need 10 bits total */ }; static int insn_stack_access_flags(int frameno, int spi) { return INSN_F_STACK_ACCESS | (spi << INSN_F_SPI_SHIFT) | frameno; } insn_stack_access_flags() is used by check_stack_read_fixed_off() and check_stack_write_fixed_off(). For these two functions, eventually a push_insn_history() push_insn_history(env, env->cur_state, insn_flags, 0) is done to record related insn_flags info. Note that insn_flags could be 0 or could be a actual insn_stack_access_flags() which depends on other contexts. For cond op's like 'rX <op> rY', it is similar to other ALU{32,64} operations. The decision can be made on the spot about to either clear or add related registers to bt_reg_set. I understand that it is desirable to avoid explicit checking BPF_REG_FP register. But this seems the simplest workable approach without involving push_insn_history(). The more complex option is to do push_insn_history() for 'rX <op> rY' conditions with information about how to deal with r10 register, e.g., to enforce the register must be one of r0-r9. That way, in backtrack_insn, the code can simply to if (hist->dst_reg_mask & dreg) bt_set_reg(bt, dreg); if (hist->src_reg_mask & sreg) bt_set_reg(bt, sreg); But this seems more complex than current simple approach. WDYT? > >>> 9: (35) if r1 >= 0xffe3fff8 goto pc+0 ; frame1: R1=ctx() >>> 10: (b5) if r2 <= 0x8 goto pc+0 >>> mark_precise: frame1: last_idx 10 first_idx 0 subseq_idx -1 >>> mark_precise: frame1: regs=r2 stack= before 9: (35) if r1 >= 0xffe3fff8 goto pc+0 >>> mark_precise: frame1: regs=r2 stack= before 8: (bd) if r2 <= r10 goto pc+3 >>> mark_precise: frame1: regs=r2,r10 stack= before 7: (05) goto pc+0 >>> mark_precise: frame1: regs=r2,r10 stack= before 5: (18) r2 = 0x20202000256c6c78 >>> mark_precise: frame1: regs=r10 stack= before 2: (85) call pc+2 >>> BUG regs 400 >>> >>> The main failure reason is due to r10 in precision backtracking bookkeeping. >>> Actually r10 is always precise and there is no need to add it the precision >>> backtracking bookkeeping. >>> >>> This patch fixed the problem by not adding r10 to prevision backtracking bookkeeping. >>> >>> [1] https://lore.kernel.org/bpf/Z%2F8q3xzpU59CIYQE@ly-workstation/ >>> >>> Reported by: Yi Lai <yi1.lai@linux.intel.com> >>> Fixes: 407958a0e980 ("bpf: encapsulate precision backtracking bookkeeping") >>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> >>> --- >>> kernel/bpf/verifier.c | 6 ++++-- >>> 1 file changed, 4 insertions(+), 2 deletions(-) >>> >>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >>> index 28f5a7899bd6..1cb4d80d15c1 100644 >>> --- a/kernel/bpf/verifier.c >>> +++ b/kernel/bpf/verifier.c >>> @@ -4413,8 +4413,10 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, >>> * before it would be equally necessary to >>> * propagate it to dreg. >>> */ >>> - bt_set_reg(bt, dreg); >>> - bt_set_reg(bt, sreg); >>> + if (dreg != BPF_REG_FP) >>> + bt_set_reg(bt, dreg); >>> + if (sreg != BPF_REG_FP) >>> + bt_set_reg(bt, sreg); >>> } else if (BPF_SRC(insn->code) == BPF_K) { >>> /* dreg <cond> K >>> * Only dreg still needs precision before >>> -- >>> 2.47.1 >>> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping 2025-05-15 1:44 ` Yonghong Song @ 2025-05-15 17:47 ` Andrii Nakryiko 2025-05-15 19:07 ` Yonghong Song 0 siblings, 1 reply; 10+ messages in thread From: Andrii Nakryiko @ 2025-05-15 17:47 UTC (permalink / raw) To: Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau On Wed, May 14, 2025 at 6:44 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > > > On 5/12/25 6:05 AM, Andrii Nakryiko wrote: > > On Mon, May 12, 2025 at 9:26 AM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > >> On Sun, May 11, 2025 at 9:28 AM Yonghong Song <yonghong.song@linux.dev> wrote: > >>> Yi Lai reported an issue ([1]) where the following warning appears > >>> in kernel dmesg: > >>> [ 60.643604] verifier backtracking bug > >>> [ 60.643635] WARNING: CPU: 10 PID: 2315 at kernel/bpf/verifier.c:4302 __mark_chain_precision+0x3a6c/0x3e10 > >>> [ 60.648428] Modules linked in: bpf_testmod(OE) > >>> [ 60.650471] CPU: 10 UID: 0 PID: 2315 Comm: test_progs Tainted: G OE 6.15.0-rc4-gef11287f8289-dirty #327 PREEMPT(full) > >>> [ 60.654385] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE > >>> [ 60.656682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 > >>> [ 60.660475] RIP: 0010:__mark_chain_precision+0x3a6c/0x3e10 > >>> [ 60.662814] Code: 5a 30 84 89 ea e8 c4 d9 01 00 80 3d 3e 7d d8 04 00 0f 85 60 fa ff ff c6 05 31 7d d8 04 > >>> 01 48 c7 c7 00 58 30 84 e8 c4 06 a5 ff <0f> 0b e9 46 fa ff ff 48 ... > >>> [ 60.668720] RSP: 0018:ffff888116cc7298 EFLAGS: 00010246 > >>> [ 60.671075] RAX: 54d70e82dfd31900 RBX: ffff888115b65e20 RCX: 0000000000000000 > >>> [ 60.673659] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 00000000ffffffff > >>> [ 60.676241] RBP: 0000000000000400 R08: ffff8881f6f23bd3 R09: 1ffff1103ede477a > >>> [ 60.678787] R10: dffffc0000000000 R11: ffffed103ede477b R12: ffff888115b60ae8 > >>> [ 60.681420] R13: 1ffff11022b6cbc4 R14: 00000000fffffff2 R15: 0000000000000001 > >>> [ 60.684030] FS: 00007fc2aedd80c0(0000) GS:ffff88826fa8a000(0000) knlGS:0000000000000000 > >>> [ 60.686837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >>> [ 60.689027] CR2: 000056325369e000 CR3: 000000011088b002 CR4: 0000000000370ef0 > >>> [ 60.691623] Call Trace: > >>> [ 60.692821] <TASK> > >>> [ 60.693960] ? __pfx_verbose+0x10/0x10 > >>> [ 60.695656] ? __pfx_disasm_kfunc_name+0x10/0x10 > >>> [ 60.697495] check_cond_jmp_op+0x16f7/0x39b0 > >>> [ 60.699237] do_check+0x58fa/0xab10 > >>> ... > >>> > >>> Further analysis shows the warning is at line 4302 as below: > >>> > >>> 4294 /* static subprog call instruction, which > >>> 4295 * means that we are exiting current subprog, > >>> 4296 * so only r1-r5 could be still requested as > >>> 4297 * precise, r0 and r6-r10 or any stack slot in > >>> 4298 * the current frame should be zero by now > >>> 4299 */ > >>> 4300 if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) { > >>> 4301 verbose(env, "BUG regs %x\n", bt_reg_mask(bt)); > >>> 4302 WARN_ONCE(1, "verifier backtracking bug"); > >>> 4303 return -EFAULT; > >>> 4304 } > >>> > >>> With the below test (also in the next patch): > >>> __used __naked static void __bpf_jmp_r10(void) > >>> { > >>> asm volatile ( > >>> "r2 = 2314885393468386424 ll;" > >>> "goto +0;" > >>> "if r2 <= r10 goto +3;" > >>> "if r1 >= -1835016 goto +0;" > >>> "if r2 <= 8 goto +0;" > >>> "if r3 <= 0 goto +0;" > >>> "exit;" > >>> ::: __clobber_all); > >>> } > >>> > >>> SEC("?raw_tp") > >>> __naked void bpf_jmp_r10(void) > >>> { > >>> asm volatile ( > >>> "r3 = 0 ll;" > >>> "call __bpf_jmp_r10;" > >>> "r0 = 0;" > >>> "exit;" > >>> ::: __clobber_all); > >>> } > >>> > >>> The following is the verifier failure log: > >>> 0: (18) r3 = 0x0 ; R3_w=0 > >>> 2: (85) call pc+2 > >>> caller: > >>> R10=fp0 > >>> callee: > >>> frame1: R1=ctx() R3_w=0 R10=fp0 > >>> 5: frame1: R1=ctx() R3_w=0 R10=fp0 > >>> ; asm volatile (" \ @ verifier_precision.c:184 > >>> 5: (18) r2 = 0x20202000256c6c78 ; frame1: R2_w=0x20202000256c6c78 > >>> 7: (05) goto pc+0 > >>> 8: (bd) if r2 <= r10 goto pc+3 ; frame1: R2_w=0x20202000256c6c78 R10=fp0 > >> For stacks spill/fill we use INSN_F_STACK_ACCESS because not just r10 > >> can be used to point to the stack. I wonder if we need to handle r10 > >> more generically here? > >> > >> E.g., if here we had something like > >> > >> r1 = r10 > >> r1 += -8 > >> if r2 <= r1 goto pc +3 > >> > >> is it fine to track r1 as precise or we need to know that r1 is an alias to r10? > >> > >> Not sure myself yet, but I thought I'd bring this up as a concern. > > In backtrack_insn, we have: > > } else if (opcode == BPF_MOV) { > if (BPF_SRC(insn->code) == BPF_X) { > /* dreg = sreg or dreg = (s8, s16, s32)sreg > * dreg needs precision after this insn > * sreg needs precision before this insn > */ > bt_clear_reg(bt, dreg); > if (sreg != BPF_REG_FP) > bt_set_reg(bt, sreg); > } else { > /* dreg = K > * dreg needs precision after this insn. > * Corresponding register is already marked > * as precise=true in this verifier state. > * No further markings in parent are necessary > */ > bt_clear_reg(bt, dreg); > } > > So for insn 'r1 = r10', even if r1 is marked precise, but based on the above > code r1 will be cleared and r10 will not be added to bt_set_reg due to > 'sreg != BPF_REG_FP'. So the current implementation should be okay. > > > >> > > After discussing this with Eduard offline, I think that we should > > generalize this a bit and not hard-code r10 handling like this. > > > > Note how we use INSN_F_STACK_ACCESS to mark LDX and STX instructions > > as "accesses stack through register", regardless of whether that > > register is r10 or any other rx after `rX = r10; rX += <offset>`. I > > think we should do the same here more generally for all instructions, > > especially for conditional jumps. > > > > The only complication is that with INSN_F_STACK_ACCESS we have only > > one possible register within LDX/STX, while with conditional jumps we > > can have two registers (and both might be PTR_TO_STACK registers!). > > > > So I propose we split INSN_F_STACK_ACCESS into INSN_F_STACK_SRC and > > INSN_F_STACK_DST and use that to mark either src or dst register as > > being a PTR_TO_STACK. Then we can generically ignore any register that > > was a PTR_TO_STACK, because any such register is already implicitly > > precise. > > > > We'd need to slightly update existing code to use either > > INSN_F_STACK_SRC or INSN_F_STACK_DST, depending on LDX or STX, and > > then generalize all that to conditionals (and, technically, any other > > instruction). > > > > WDYT? Does it make sense? > > I tried to prototype based on the above idea. But ultimately I gave up. > The following are some of my analysis. > > The INSN_F_STACK_ACCESS is used for stack access (load and store). > See: > > /* instruction history flags, used in bpf_insn_hist_entry.flags field */ > enum { > /* instruction references stack slot through PTR_TO_STACK register; > * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8) > * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512, > * 8 bytes per slot, so slot index (spi) is [0, 63]) > */ > INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */ > > INSN_F_SPI_MASK = 0x3f, /* 6 bits */ > INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */ > > INSN_F_STACK_ACCESS = BIT(9), /* we need 10 bits total */ > }; > > static int insn_stack_access_flags(int frameno, int spi) > { > return INSN_F_STACK_ACCESS | (spi << INSN_F_SPI_SHIFT) | frameno; > } > > insn_stack_access_flags() is used by check_stack_read_fixed_off() > and check_stack_write_fixed_off(). For these two functions, > eventually a push_insn_history() > push_insn_history(env, env->cur_state, insn_flags, 0) > is done to record related insn_flags info. Note that > insn_flags could be 0 or could be a actual insn_stack_access_flags() > which depends on other contexts. > > For cond op's like 'rX <op> rY', it is similar to other ALU{32,64} operations. > The decision can be made on the spot about to either clear or add related > registers to bt_reg_set. > > I understand that it is desirable to avoid explicit checking BPF_REG_FP > register. But this seems the simplest workable approach without > involving push_insn_history(). > > The more complex option is to do push_insn_history() for 'rX <op> rY' > conditions with information about how to deal with r10 register, e.g., > to enforce the register must be one of r0-r9. That way, in backtrack_insn, > the code can simply to > if (hist->dst_reg_mask & dreg) > bt_set_reg(bt, dreg); > if (hist->src_reg_mask & sreg) > bt_set_reg(bt, sreg); > > But this seems more complex than current simple approach. > > WDYT? Doing the push_insn_history() is exactly what I had in mind from the very beginning. I'd do that. It's a bit more code, but it sets us up better for generic handling of PTR_TO_STACK registers, regardless if they are r10 or any other rX. This is the general direction we started on with INSN_F_STACK_ACCESS, so I think it makes sense to take another step in that direction, instead of reverting back to hacky BPF_REG_FP handling. > > > > > >>> 9: (35) if r1 >= 0xffe3fff8 goto pc+0 ; frame1: R1=ctx() > >>> 10: (b5) if r2 <= 0x8 goto pc+0 > >>> mark_precise: frame1: last_idx 10 first_idx 0 subseq_idx -1 > >>> mark_precise: frame1: regs=r2 stack= before 9: (35) if r1 >= 0xffe3fff8 goto pc+0 > >>> mark_precise: frame1: regs=r2 stack= before 8: (bd) if r2 <= r10 goto pc+3 > >>> mark_precise: frame1: regs=r2,r10 stack= before 7: (05) goto pc+0 > >>> mark_precise: frame1: regs=r2,r10 stack= before 5: (18) r2 = 0x20202000256c6c78 > >>> mark_precise: frame1: regs=r10 stack= before 2: (85) call pc+2 > >>> BUG regs 400 > >>> > >>> The main failure reason is due to r10 in precision backtracking bookkeeping. > >>> Actually r10 is always precise and there is no need to add it the precision > >>> backtracking bookkeeping. > >>> > >>> This patch fixed the problem by not adding r10 to prevision backtracking bookkeeping. > >>> > >>> [1] https://lore.kernel.org/bpf/Z%2F8q3xzpU59CIYQE@ly-workstation/ > >>> > >>> Reported by: Yi Lai <yi1.lai@linux.intel.com> > >>> Fixes: 407958a0e980 ("bpf: encapsulate precision backtracking bookkeeping") > >>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > >>> --- > >>> kernel/bpf/verifier.c | 6 ++++-- > >>> 1 file changed, 4 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > >>> index 28f5a7899bd6..1cb4d80d15c1 100644 > >>> --- a/kernel/bpf/verifier.c > >>> +++ b/kernel/bpf/verifier.c > >>> @@ -4413,8 +4413,10 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, > >>> * before it would be equally necessary to > >>> * propagate it to dreg. > >>> */ > >>> - bt_set_reg(bt, dreg); > >>> - bt_set_reg(bt, sreg); > >>> + if (dreg != BPF_REG_FP) > >>> + bt_set_reg(bt, dreg); > >>> + if (sreg != BPF_REG_FP) > >>> + bt_set_reg(bt, sreg); > >>> } else if (BPF_SRC(insn->code) == BPF_K) { > >>> /* dreg <cond> K > >>> * Only dreg still needs precision before > >>> -- > >>> 2.47.1 > >>> > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping 2025-05-15 17:47 ` Andrii Nakryiko @ 2025-05-15 19:07 ` Yonghong Song 0 siblings, 0 replies; 10+ messages in thread From: Yonghong Song @ 2025-05-15 19:07 UTC (permalink / raw) To: Andrii Nakryiko Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau On 5/15/25 1:47 AM, Andrii Nakryiko wrote: > On Wed, May 14, 2025 at 6:44 PM Yonghong Song <yonghong.song@linux.dev> wrote: >> >> >> On 5/12/25 6:05 AM, Andrii Nakryiko wrote: >>> On Mon, May 12, 2025 at 9:26 AM Andrii Nakryiko >>> <andrii.nakryiko@gmail.com> wrote: >>>> On Sun, May 11, 2025 at 9:28 AM Yonghong Song <yonghong.song@linux.dev> wrote: >>>>> Yi Lai reported an issue ([1]) where the following warning appears >>>>> in kernel dmesg: >>>>> [ 60.643604] verifier backtracking bug >>>>> [ 60.643635] WARNING: CPU: 10 PID: 2315 at kernel/bpf/verifier.c:4302 __mark_chain_precision+0x3a6c/0x3e10 >>>>> [ 60.648428] Modules linked in: bpf_testmod(OE) >>>>> [ 60.650471] CPU: 10 UID: 0 PID: 2315 Comm: test_progs Tainted: G OE 6.15.0-rc4-gef11287f8289-dirty #327 PREEMPT(full) >>>>> [ 60.654385] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE >>>>> [ 60.656682] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 >>>>> [ 60.660475] RIP: 0010:__mark_chain_precision+0x3a6c/0x3e10 >>>>> [ 60.662814] Code: 5a 30 84 89 ea e8 c4 d9 01 00 80 3d 3e 7d d8 04 00 0f 85 60 fa ff ff c6 05 31 7d d8 04 >>>>> 01 48 c7 c7 00 58 30 84 e8 c4 06 a5 ff <0f> 0b e9 46 fa ff ff 48 ... >>>>> [ 60.668720] RSP: 0018:ffff888116cc7298 EFLAGS: 00010246 >>>>> [ 60.671075] RAX: 54d70e82dfd31900 RBX: ffff888115b65e20 RCX: 0000000000000000 >>>>> [ 60.673659] RDX: 0000000000000001 RSI: 0000000000000004 RDI: 00000000ffffffff >>>>> [ 60.676241] RBP: 0000000000000400 R08: ffff8881f6f23bd3 R09: 1ffff1103ede477a >>>>> [ 60.678787] R10: dffffc0000000000 R11: ffffed103ede477b R12: ffff888115b60ae8 >>>>> [ 60.681420] R13: 1ffff11022b6cbc4 R14: 00000000fffffff2 R15: 0000000000000001 >>>>> [ 60.684030] FS: 00007fc2aedd80c0(0000) GS:ffff88826fa8a000(0000) knlGS:0000000000000000 >>>>> [ 60.686837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>>>> [ 60.689027] CR2: 000056325369e000 CR3: 000000011088b002 CR4: 0000000000370ef0 >>>>> [ 60.691623] Call Trace: >>>>> [ 60.692821] <TASK> >>>>> [ 60.693960] ? __pfx_verbose+0x10/0x10 >>>>> [ 60.695656] ? __pfx_disasm_kfunc_name+0x10/0x10 >>>>> [ 60.697495] check_cond_jmp_op+0x16f7/0x39b0 >>>>> [ 60.699237] do_check+0x58fa/0xab10 >>>>> ... >>>>> >>>>> Further analysis shows the warning is at line 4302 as below: >>>>> >>>>> 4294 /* static subprog call instruction, which >>>>> 4295 * means that we are exiting current subprog, >>>>> 4296 * so only r1-r5 could be still requested as >>>>> 4297 * precise, r0 and r6-r10 or any stack slot in >>>>> 4298 * the current frame should be zero by now >>>>> 4299 */ >>>>> 4300 if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) { >>>>> 4301 verbose(env, "BUG regs %x\n", bt_reg_mask(bt)); >>>>> 4302 WARN_ONCE(1, "verifier backtracking bug"); >>>>> 4303 return -EFAULT; >>>>> 4304 } >>>>> >>>>> With the below test (also in the next patch): >>>>> __used __naked static void __bpf_jmp_r10(void) >>>>> { >>>>> asm volatile ( >>>>> "r2 = 2314885393468386424 ll;" >>>>> "goto +0;" >>>>> "if r2 <= r10 goto +3;" >>>>> "if r1 >= -1835016 goto +0;" >>>>> "if r2 <= 8 goto +0;" >>>>> "if r3 <= 0 goto +0;" >>>>> "exit;" >>>>> ::: __clobber_all); >>>>> } >>>>> >>>>> SEC("?raw_tp") >>>>> __naked void bpf_jmp_r10(void) >>>>> { >>>>> asm volatile ( >>>>> "r3 = 0 ll;" >>>>> "call __bpf_jmp_r10;" >>>>> "r0 = 0;" >>>>> "exit;" >>>>> ::: __clobber_all); >>>>> } >>>>> >>>>> The following is the verifier failure log: >>>>> 0: (18) r3 = 0x0 ; R3_w=0 >>>>> 2: (85) call pc+2 >>>>> caller: >>>>> R10=fp0 >>>>> callee: >>>>> frame1: R1=ctx() R3_w=0 R10=fp0 >>>>> 5: frame1: R1=ctx() R3_w=0 R10=fp0 >>>>> ; asm volatile (" \ @ verifier_precision.c:184 >>>>> 5: (18) r2 = 0x20202000256c6c78 ; frame1: R2_w=0x20202000256c6c78 >>>>> 7: (05) goto pc+0 >>>>> 8: (bd) if r2 <= r10 goto pc+3 ; frame1: R2_w=0x20202000256c6c78 R10=fp0 >>>> For stacks spill/fill we use INSN_F_STACK_ACCESS because not just r10 >>>> can be used to point to the stack. I wonder if we need to handle r10 >>>> more generically here? >>>> >>>> E.g., if here we had something like >>>> >>>> r1 = r10 >>>> r1 += -8 >>>> if r2 <= r1 goto pc +3 >>>> >>>> is it fine to track r1 as precise or we need to know that r1 is an alias to r10? >>>> >>>> Not sure myself yet, but I thought I'd bring this up as a concern. >> In backtrack_insn, we have: >> >> } else if (opcode == BPF_MOV) { >> if (BPF_SRC(insn->code) == BPF_X) { >> /* dreg = sreg or dreg = (s8, s16, s32)sreg >> * dreg needs precision after this insn >> * sreg needs precision before this insn >> */ >> bt_clear_reg(bt, dreg); >> if (sreg != BPF_REG_FP) >> bt_set_reg(bt, sreg); >> } else { >> /* dreg = K >> * dreg needs precision after this insn. >> * Corresponding register is already marked >> * as precise=true in this verifier state. >> * No further markings in parent are necessary >> */ >> bt_clear_reg(bt, dreg); >> } >> >> So for insn 'r1 = r10', even if r1 is marked precise, but based on the above >> code r1 will be cleared and r10 will not be added to bt_set_reg due to >> 'sreg != BPF_REG_FP'. So the current implementation should be okay. >> >> >>> After discussing this with Eduard offline, I think that we should >>> generalize this a bit and not hard-code r10 handling like this. >>> >>> Note how we use INSN_F_STACK_ACCESS to mark LDX and STX instructions >>> as "accesses stack through register", regardless of whether that >>> register is r10 or any other rx after `rX = r10; rX += <offset>`. I >>> think we should do the same here more generally for all instructions, >>> especially for conditional jumps. >>> >>> The only complication is that with INSN_F_STACK_ACCESS we have only >>> one possible register within LDX/STX, while with conditional jumps we >>> can have two registers (and both might be PTR_TO_STACK registers!). >>> >>> So I propose we split INSN_F_STACK_ACCESS into INSN_F_STACK_SRC and >>> INSN_F_STACK_DST and use that to mark either src or dst register as >>> being a PTR_TO_STACK. Then we can generically ignore any register that >>> was a PTR_TO_STACK, because any such register is already implicitly >>> precise. >>> >>> We'd need to slightly update existing code to use either >>> INSN_F_STACK_SRC or INSN_F_STACK_DST, depending on LDX or STX, and >>> then generalize all that to conditionals (and, technically, any other >>> instruction). >>> >>> WDYT? Does it make sense? >> I tried to prototype based on the above idea. But ultimately I gave up. >> The following are some of my analysis. >> >> The INSN_F_STACK_ACCESS is used for stack access (load and store). >> See: >> >> /* instruction history flags, used in bpf_insn_hist_entry.flags field */ >> enum { >> /* instruction references stack slot through PTR_TO_STACK register; >> * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8) >> * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512, >> * 8 bytes per slot, so slot index (spi) is [0, 63]) >> */ >> INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */ >> >> INSN_F_SPI_MASK = 0x3f, /* 6 bits */ >> INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */ >> >> INSN_F_STACK_ACCESS = BIT(9), /* we need 10 bits total */ >> }; >> >> static int insn_stack_access_flags(int frameno, int spi) >> { >> return INSN_F_STACK_ACCESS | (spi << INSN_F_SPI_SHIFT) | frameno; >> } >> >> insn_stack_access_flags() is used by check_stack_read_fixed_off() >> and check_stack_write_fixed_off(). For these two functions, >> eventually a push_insn_history() >> push_insn_history(env, env->cur_state, insn_flags, 0) >> is done to record related insn_flags info. Note that >> insn_flags could be 0 or could be a actual insn_stack_access_flags() >> which depends on other contexts. >> >> For cond op's like 'rX <op> rY', it is similar to other ALU{32,64} operations. >> The decision can be made on the spot about to either clear or add related >> registers to bt_reg_set. >> >> I understand that it is desirable to avoid explicit checking BPF_REG_FP >> register. But this seems the simplest workable approach without >> involving push_insn_history(). >> >> The more complex option is to do push_insn_history() for 'rX <op> rY' >> conditions with information about how to deal with r10 register, e.g., >> to enforce the register must be one of r0-r9. That way, in backtrack_insn, >> the code can simply to >> if (hist->dst_reg_mask & dreg) >> bt_set_reg(bt, dreg); >> if (hist->src_reg_mask & sreg) >> bt_set_reg(bt, sreg); >> >> But this seems more complex than current simple approach. >> >> WDYT? > Doing the push_insn_history() is exactly what I had in mind from the > very beginning. I'd do that. It's a bit more code, but it sets us up > better for generic handling of PTR_TO_STACK registers, regardless if > they are r10 or any other rX. This is the general direction we started > on with INSN_F_STACK_ACCESS, so I think it makes sense to take another > step in that direction, instead of reverting back to hacky BPF_REG_FP > handling. Okay, I will take a stab on this. > >> >>>>> 9: (35) if r1 >= 0xffe3fff8 goto pc+0 ; frame1: R1=ctx() >>>>> 10: (b5) if r2 <= 0x8 goto pc+0 >>>>> mark_precise: frame1: last_idx 10 first_idx 0 subseq_idx -1 >>>>> mark_precise: frame1: regs=r2 stack= before 9: (35) if r1 >= 0xffe3fff8 goto pc+0 >>>>> mark_precise: frame1: regs=r2 stack= before 8: (bd) if r2 <= r10 goto pc+3 >>>>> mark_precise: frame1: regs=r2,r10 stack= before 7: (05) goto pc+0 >>>>> mark_precise: frame1: regs=r2,r10 stack= before 5: (18) r2 = 0x20202000256c6c78 >>>>> mark_precise: frame1: regs=r10 stack= before 2: (85) call pc+2 >>>>> BUG regs 400 >>>>> >>>>> The main failure reason is due to r10 in precision backtracking bookkeeping. >>>>> Actually r10 is always precise and there is no need to add it the precision >>>>> backtracking bookkeeping. >>>>> >>>>> This patch fixed the problem by not adding r10 to prevision backtracking bookkeeping. >>>>> >>>>> [1] https://lore.kernel.org/bpf/Z%2F8q3xzpU59CIYQE@ly-workstation/ >>>>> >>>>> Reported by: Yi Lai <yi1.lai@linux.intel.com> >>>>> Fixes: 407958a0e980 ("bpf: encapsulate precision backtracking bookkeeping") >>>>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> >>>>> --- >>>>> kernel/bpf/verifier.c | 6 ++++-- >>>>> 1 file changed, 4 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >>>>> index 28f5a7899bd6..1cb4d80d15c1 100644 >>>>> --- a/kernel/bpf/verifier.c >>>>> +++ b/kernel/bpf/verifier.c >>>>> @@ -4413,8 +4413,10 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx, >>>>> * before it would be equally necessary to >>>>> * propagate it to dreg. >>>>> */ >>>>> - bt_set_reg(bt, dreg); >>>>> - bt_set_reg(bt, sreg); >>>>> + if (dreg != BPF_REG_FP) >>>>> + bt_set_reg(bt, dreg); >>>>> + if (sreg != BPF_REG_FP) >>>>> + bt_set_reg(bt, sreg); >>>>> } else if (BPF_SRC(insn->code) == BPF_K) { >>>>> /* dreg <cond> K >>>>> * Only dreg still needs precision before >>>>> -- >>>>> 2.47.1 >>>>> ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-05-15 19:07 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-05-11 16:27 [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping Yonghong Song 2025-05-11 16:28 ` [PATCH bpf-next 2/2] selftests/bpf: Add a test with r10 in conditional jmp Yonghong Song 2025-05-11 22:33 ` [PATCH bpf-next 1/2] bpf: Do not include r10 in precision backtracking bookkeeping Alexei Starovoitov 2025-05-12 12:03 ` Ilya Leoshkevich 2025-05-12 16:26 ` Andrii Nakryiko 2025-05-12 22:05 ` Andrii Nakryiko 2025-05-13 5:20 ` Yonghong Song 2025-05-15 1:44 ` Yonghong Song 2025-05-15 17:47 ` Andrii Nakryiko 2025-05-15 19:07 ` Yonghong Song
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).