* [syzbot] [bpf?] WARNING in reg_bounds_sanity_check @ 2025-07-02 1:55 syzbot 2025-07-03 17:02 ` Paul Chaignon 2025-07-05 16:02 ` syzbot 0 siblings, 2 replies; 17+ messages in thread From: syzbot @ 2025-07-02 1:55 UTC (permalink / raw) To: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song Hello, syzbot found the following issue on: HEAD commit: cce3fee729ee selftests/bpf: Enable dynptr/test_probe_read_.. git tree: bpf-next console+strace: https://syzkaller.appspot.com/x/log.txt?x=147793d4580000 kernel config: https://syzkaller.appspot.com/x/.config?x=79da270cec5ffd65 dashboard link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1594e48c580000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1159388c580000 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/f286a7ef4940/disk-cce3fee7.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/e2f2ebe1fdc3/vmlinux-cce3fee7.xz kernel image: https://storage.googleapis.com/syzbot-assets/6e3070663778/bzImage-cce3fee7.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+c711ce17dd78e5d4fdcf@syzkaller.appspotmail.com ------------[ cut here ]------------ verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0] var_off=(0x0, 0x0)(1) WARNING: CPU: 1 PID: 5833 at kernel/bpf/verifier.c:2688 reg_bounds_sanity_check+0x6e6/0xc20 kernel/bpf/verifier.c:2682 Modules linked in: CPU: 1 UID: 0 PID: 5833 Comm: syz-executor346 Not tainted 6.16.0-rc3-syzkaller-gcce3fee729ee #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025 RIP: 0010:reg_bounds_sanity_check+0x6e6/0xc20 kernel/bpf/verifier.c:2682 Code: 24 20 4c 8b 44 24 60 4c 8b 4c 24 58 41 ff 75 00 53 41 57 55 ff 74 24 38 ff 74 24 70 ff 74 24 40 e8 8f 86 aa ff 48 83 c4 38 90 <0f> 0b 90 90 48 bb 00 00 00 00 00 fc ff df 4d 89 f7 4c 8b 74 24 08 RSP: 0018:ffffc90003f6ec08 EFLAGS: 00010282 RAX: 2c2acf8a45b1bf00 RBX: 0000000000000000 RCX: ffff888029235a00 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000002 RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000004 R10: dffffc0000000000 R11: fffffbfff1bfaa04 R12: ffff888025056000 R13: ffff888025056020 R14: ffff888025056038 R15: 0000000000000000 FS: 00005555860e1380(0000) GS:ffff888125d4d000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00002000002a1000 CR3: 00000000749a0000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> reg_set_min_max+0x264/0x300 kernel/bpf/verifier.c:16262 check_cond_jmp_op+0x159b/0x2910 kernel/bpf/verifier.c:16705 do_check_insn kernel/bpf/verifier.c:19882 [inline] do_check+0x665b/0xe080 kernel/bpf/verifier.c:20017 do_check_common+0x188f/0x23f0 kernel/bpf/verifier.c:23180 do_check_main kernel/bpf/verifier.c:23263 [inline] bpf_check+0x10252/0x1a5d0 kernel/bpf/verifier.c:24619 bpf_prog_load+0x1318/0x1930 kernel/bpf/syscall.c:2972 __sys_bpf+0x5f1/0x860 kernel/bpf/syscall.c:5978 __do_sys_bpf kernel/bpf/syscall.c:6085 [inline] __se_sys_bpf kernel/bpf/syscall.c:6083 [inline] __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:6083 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f37dffe23a9 Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fff6bb80468 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 RAX: ffffffffffffffda RBX: 00007fff6bb80648 RCX: 00007f37dffe23a9 RDX: 0000000000000045 RSI: 00002000002a0fb8 RDI: 0000000000000005 RBP: 00007f37e0055610 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000001 R13: 00007fff6bb80638 R14: 0000000000000001 R15: 0000000000000001 </TASK> --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. If the report is already addressed, let syzbot know by replying with: #syz fix: exact-commit-title If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing. If you want to overwrite report's subsystems, reply with: #syz set subsystems: new-subsystem (See the list of subsystem names on the web dashboard) If the report is a duplicate of another one, reply with: #syz dup: exact-subject-of-another-report If you want to undo deduplication, reply with: #syz undup ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-02 1:55 [syzbot] [bpf?] WARNING in reg_bounds_sanity_check syzbot @ 2025-07-03 17:02 ` Paul Chaignon 2025-07-03 18:54 ` Eduard Zingerman 2025-07-05 16:02 ` syzbot 1 sibling, 1 reply; 17+ messages in thread From: Paul Chaignon @ 2025-07-03 17:02 UTC (permalink / raw) To: syzbot Cc: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Tue, Jul 01, 2025 at 06:55:28PM -0700, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: cce3fee729ee selftests/bpf: Enable dynptr/test_probe_read_.. > git tree: bpf-next > console+strace: https://syzkaller.appspot.com/x/log.txt?x=147793d4580000 > kernel config: https://syzkaller.appspot.com/x/.config?x=79da270cec5ffd65 > dashboard link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf > compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1594e48c580000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1159388c580000 > > Downloadable assets: > disk image: https://storage.googleapis.com/syzbot-assets/f286a7ef4940/disk-cce3fee7.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/e2f2ebe1fdc3/vmlinux-cce3fee7.xz > kernel image: https://storage.googleapis.com/syzbot-assets/6e3070663778/bzImage-cce3fee7.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+c711ce17dd78e5d4fdcf@syzkaller.appspotmail.com > > ------------[ cut here ]------------ > verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0] var_off=(0x0, 0x0)(1) > WARNING: CPU: 1 PID: 5833 at kernel/bpf/verifier.c:2688 reg_bounds_sanity_check+0x6e6/0xc20 kernel/bpf/verifier.c:2682 I'm unsure how to handle this one. One example repro is as follows. 0: call bpf_get_netns_cookie 1: if r0 == 0 goto <exit> 2: if r0 & Oxffffffff goto <exit> The issue is on the path where we fall through both jumps. That path is unreachable at runtime: after insn 1, we know r0 != 0, but with the sign extension on the jset, we would only fallthrough insn 2 if r0 == 0. Unfortunately, is_branch_taken() isn't currently able to figure this out, so the verifier walks all branches. As a result, we end up with inconsistent register ranges on this unreachable path: 0: if r0 == 0 goto <exit> r0: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0xffffffffffffffff) 1: if r0 & 0xffffffff goto <exit> r0 before reg_bounds_sync: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0) r0 after reg_bounds_sync: u64=[0x1, 0] var_off=(0, 0) I suspect there isn't anything specific to these two conditions, and anytime we start walking an unreachable path, we may end up with inconsistent register ranges. The number of times syzkaller is currently hitting this (180 in 1.5 days) suggests there are many different ways to reproduce. We could teach is_branch_taken() about this case, but we probably won't be able to cover all cases. We could stop warning on this, but then we may also miss legitimate cases (i.e., invariants violations on reachable paths). We could also teach reg_bounds_sync() to stop refining the bounds before it gets inconsistent, but I'm unsure how useful that'd be. Paul [...] ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-03 17:02 ` Paul Chaignon @ 2025-07-03 18:54 ` Eduard Zingerman 2025-07-04 17:14 ` Paul Chaignon 0 siblings, 1 reply; 17+ messages in thread From: Eduard Zingerman @ 2025-07-03 18:54 UTC (permalink / raw) To: Paul Chaignon, syzbot Cc: andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Thu, 2025-07-03 at 19:02 +0200, Paul Chaignon wrote: > On Tue, Jul 01, 2025 at 06:55:28PM -0700, syzbot wrote: > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: cce3fee729ee selftests/bpf: Enable dynptr/test_probe_read_.. > > git tree: bpf-next > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=147793d4580000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=79da270cec5ffd65 > > dashboard link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf > > compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1594e48c580000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1159388c580000 > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/f286a7ef4940/disk-cce3fee7.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/e2f2ebe1fdc3/vmlinux-cce3fee7.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/6e3070663778/bzImage-cce3fee7.xz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+c711ce17dd78e5d4fdcf@syzkaller.appspotmail.com > > > > ------------[ cut here ]------------ > > verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0] var_off=(0x0, 0x0)(1) > > WARNING: CPU: 1 PID: 5833 at kernel/bpf/verifier.c:2688 reg_bounds_sanity_check+0x6e6/0xc20 kernel/bpf/verifier.c:2682 > > I'm unsure how to handle this one. > > One example repro is as follows. > > 0: call bpf_get_netns_cookie > 1: if r0 == 0 goto <exit> > 2: if r0 & Oxffffffff goto <exit> > > The issue is on the path where we fall through both jumps. > > That path is unreachable at runtime: after insn 1, we know r0 != 0, but > with the sign extension on the jset, we would only fallthrough insn 2 > if r0 == 0. Unfortunately, is_branch_taken() isn't currently able to > figure this out, so the verifier walks all branches. As a result, we end > up with inconsistent register ranges on this unreachable path: > > 0: if r0 == 0 goto <exit> > r0: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0xffffffffffffffff) > 1: if r0 & 0xffffffff goto <exit> > r0 before reg_bounds_sync: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0) > r0 after reg_bounds_sync: u64=[0x1, 0] var_off=(0, 0) > > I suspect there isn't anything specific to these two conditions, and > anytime we start walking an unreachable path, we may end up with > inconsistent register ranges. The number of times syzkaller is currently > hitting this (180 in 1.5 days) suggests there are many different ways to > reproduce. > > We could teach is_branch_taken() about this case, but we probably won't > be able to cover all cases. We could stop warning on this, but then we > may also miss legitimate cases (i.e., invariants violations on reachable > paths). We could also teach reg_bounds_sync() to stop refining the > bounds before it gets inconsistent, but I'm unsure how useful that'd be. Hi Paul, In general, I think that reg_bounds_sync() can be used as a substitute for is_branch_taken() -> whenever an impossible range is produced, the branch should be deemed impossible at runtime and abandoned. If I recall correctly Andrii considered this too risky some time ago, so this warning is in place to catch bugs. Which leaves only the option to refine is_branch_taken(). I think is_branch_taken() modification should not be too complicated. For JSET it only checks tnum, but does not take ranges into account. Reasoning about ranges is something along the lines: - for unsigned range a = b & CONST -> a is in [b_min & CONST, b_max & CONST]; - for signed ranged same thing, but consider two unsigned sub-ranges; - for non CONST cases, I think same reasoning can apply, but more min/max combinations need to be explored. - then check if zero is a member or 'a' range. Wdyt? > The number of times syzkaller is currently hitting this (180 in 1.5 > days) suggests there are many different ways to reproduce. It is a bit inconvenient to read syzbot BPF reports at the moment, because it us hard to figure out how the program looks like. Do you happen to know how complicated would it be to modify syzbot output to: - produce a comment with BPF program - generating reproducer with a flag, allowing to print level 2 verifier log ? Thanks, Eduard ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-03 18:54 ` Eduard Zingerman @ 2025-07-04 17:14 ` Paul Chaignon 2025-07-04 17:26 ` Eduard Zingerman 0 siblings, 1 reply; 17+ messages in thread From: Paul Chaignon @ 2025-07-04 17:14 UTC (permalink / raw) To: Eduard Zingerman Cc: syzbot, andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Thu, Jul 03, 2025 at 11:54:27AM -0700, Eduard Zingerman wrote: > On Thu, 2025-07-03 at 19:02 +0200, Paul Chaignon wrote: > > On Tue, Jul 01, 2025 at 06:55:28PM -0700, syzbot wrote: > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit: cce3fee729ee selftests/bpf: Enable dynptr/test_probe_read_.. > > > git tree: bpf-next > > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=147793d4580000 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=79da270cec5ffd65 > > > dashboard link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf > > > compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6 > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1594e48c580000 > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1159388c580000 > > > > > > Downloadable assets: > > > disk image: https://storage.googleapis.com/syzbot-assets/f286a7ef4940/disk-cce3fee7.raw.xz > > > vmlinux: https://storage.googleapis.com/syzbot-assets/e2f2ebe1fdc3/vmlinux-cce3fee7.xz > > > kernel image: https://storage.googleapis.com/syzbot-assets/6e3070663778/bzImage-cce3fee7.xz > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > Reported-by: syzbot+c711ce17dd78e5d4fdcf@syzkaller.appspotmail.com > > > > > > ------------[ cut here ]------------ > > > verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0] var_off=(0x0, 0x0)(1) > > > WARNING: CPU: 1 PID: 5833 at kernel/bpf/verifier.c:2688 reg_bounds_sanity_check+0x6e6/0xc20 kernel/bpf/verifier.c:2682 > > > > I'm unsure how to handle this one. > > > > One example repro is as follows. > > > > 0: call bpf_get_netns_cookie > > 1: if r0 == 0 goto <exit> > > 2: if r0 & Oxffffffff goto <exit> > > > > The issue is on the path where we fall through both jumps. > > > > That path is unreachable at runtime: after insn 1, we know r0 != 0, but > > with the sign extension on the jset, we would only fallthrough insn 2 > > if r0 == 0. Unfortunately, is_branch_taken() isn't currently able to > > figure this out, so the verifier walks all branches. As a result, we end > > up with inconsistent register ranges on this unreachable path: > > > > 0: if r0 == 0 goto <exit> > > r0: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0xffffffffffffffff) > > 1: if r0 & 0xffffffff goto <exit> > > r0 before reg_bounds_sync: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0) > > r0 after reg_bounds_sync: u64=[0x1, 0] var_off=(0, 0) > > > > I suspect there isn't anything specific to these two conditions, and > > anytime we start walking an unreachable path, we may end up with > > inconsistent register ranges. The number of times syzkaller is currently > > hitting this (180 in 1.5 days) suggests there are many different ways to > > reproduce. > > > > We could teach is_branch_taken() about this case, but we probably won't > > be able to cover all cases. We could stop warning on this, but then we > > may also miss legitimate cases (i.e., invariants violations on reachable > > paths). We could also teach reg_bounds_sync() to stop refining the > > bounds before it gets inconsistent, but I'm unsure how useful that'd be. > > Hi Paul, > > In general, I think that reg_bounds_sync() can be used as a substitute > for is_branch_taken() -> whenever an impossible range is produced, > the branch should be deemed impossible at runtime and abandoned. > If I recall correctly Andrii considered this too risky some time ago, > so this warning is in place to catch bugs. Hi Eduard, Yeah, that feels risky enough that I didn't even dare mention it as a possibility :) > > Which leaves only the option to refine is_branch_taken(). > > I think is_branch_taken() modification should not be too complicated. > For JSET it only checks tnum, but does not take ranges into account. > Reasoning about ranges is something along the lines: > - for unsigned range a = b & CONST -> a is in [b_min & CONST, b_max & CONST]; > - for signed ranged same thing, but consider two unsigned sub-ranges; > - for non CONST cases, I think same reasoning can apply, but more > min/max combinations need to be explored. > - then check if zero is a member or 'a' range. > > Wdyt? I might be missing something, but I'm not sure that works. For the unsigned range, if we have b & 0x2 with b in [2; 10], then we'd end up with a in [2; 2] and would conclude that the jump is never taken. But b=8 proves us wrong. > > > The number of times syzkaller is currently hitting this (180 in 1.5 > > days) suggests there are many different ways to reproduce. > > It is a bit inconvenient to read syzbot BPF reports at the moment, > because it us hard to figure out how the program looks like. > Do you happen to know how complicated would it be to modify syzbot > output to: > - produce a comment with BPF program > - generating reproducer with a flag, allowing to print level 2 > verifier log > ? I have the same thought sometimes. Right now, I add verifier logs to a syz or C reproducer to see the program. Producing the BPF program in a comment would likely be tricky as we'd need to maintain a disassembler in syzkaller. Adding verifier logs to reproducers that contain bpf(PROG_LOAD) calls seems easier. Then I guess we'd get that output in the strace or console logs of syzbot. > > Thanks, > Eduard ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-04 17:14 ` Paul Chaignon @ 2025-07-04 17:26 ` Eduard Zingerman 2025-07-04 21:13 ` Eduard Zingerman 2025-07-07 21:57 ` Paul Chaignon 0 siblings, 2 replies; 17+ messages in thread From: Eduard Zingerman @ 2025-07-04 17:26 UTC (permalink / raw) To: Paul Chaignon Cc: syzbot, andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Fri, 2025-07-04 at 19:14 +0200, Paul Chaignon wrote: > On Thu, Jul 03, 2025 at 11:54:27AM -0700, Eduard Zingerman wrote: > > On Thu, 2025-07-03 at 19:02 +0200, Paul Chaignon wrote: > > > On Tue, Jul 01, 2025 at 06:55:28PM -0700, syzbot wrote: > > > > Hello, > > > > > > > > syzbot found the following issue on: > > > > > > > > HEAD commit: cce3fee729ee selftests/bpf: Enable dynptr/test_probe_read_.. > > > > git tree: bpf-next > > > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=147793d4580000 > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=79da270cec5ffd65 > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf > > > > compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6 > > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1594e48c580000 > > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1159388c580000 > > > > > > > > Downloadable assets: > > > > disk image: https://storage.googleapis.com/syzbot-assets/f286a7ef4940/disk-cce3fee7.raw.xz > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/e2f2ebe1fdc3/vmlinux-cce3fee7.xz > > > > kernel image: https://storage.googleapis.com/syzbot-assets/6e3070663778/bzImage-cce3fee7.xz > > > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > > Reported-by: syzbot+c711ce17dd78e5d4fdcf@syzkaller.appspotmail.com > > > > > > > > ------------[ cut here ]------------ > > > > verifier bug: REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x0, 0x0] s64=[0x0, 0x0] u32=[0x1, 0x0] s32=[0x0, 0x0] var_off=(0x0, 0x0)(1) > > > > WARNING: CPU: 1 PID: 5833 at kernel/bpf/verifier.c:2688 reg_bounds_sanity_check+0x6e6/0xc20 kernel/bpf/verifier.c:2682 > > > > > > I'm unsure how to handle this one. > > > > > > One example repro is as follows. > > > > > > 0: call bpf_get_netns_cookie > > > 1: if r0 == 0 goto <exit> > > > 2: if r0 & Oxffffffff goto <exit> > > > > > > The issue is on the path where we fall through both jumps. > > > > > > That path is unreachable at runtime: after insn 1, we know r0 != 0, but > > > with the sign extension on the jset, we would only fallthrough insn 2 > > > if r0 == 0. Unfortunately, is_branch_taken() isn't currently able to > > > figure this out, so the verifier walks all branches. As a result, we end > > > up with inconsistent register ranges on this unreachable path: > > > > > > 0: if r0 == 0 goto <exit> > > > r0: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0xffffffffffffffff) > > > 1: if r0 & 0xffffffff goto <exit> > > > r0 before reg_bounds_sync: u64=[0x1, 0xffffffffffffffff] var_off=(0, 0) > > > r0 after reg_bounds_sync: u64=[0x1, 0] var_off=(0, 0) > > > > > > I suspect there isn't anything specific to these two conditions, and > > > anytime we start walking an unreachable path, we may end up with > > > inconsistent register ranges. The number of times syzkaller is currently > > > hitting this (180 in 1.5 days) suggests there are many different ways to > > > reproduce. > > > > > > We could teach is_branch_taken() about this case, but we probably won't > > > be able to cover all cases. We could stop warning on this, but then we > > > may also miss legitimate cases (i.e., invariants violations on reachable > > > paths). We could also teach reg_bounds_sync() to stop refining the > > > bounds before it gets inconsistent, but I'm unsure how useful that'd be. > > > > Hi Paul, > > > > In general, I think that reg_bounds_sync() can be used as a substitute > > for is_branch_taken() -> whenever an impossible range is produced, > > the branch should be deemed impossible at runtime and abandoned. > > If I recall correctly Andrii considered this too risky some time ago, > > so this warning is in place to catch bugs. > > Hi Eduard, > > Yeah, that feels risky enough that I didn't even dare mention it as a > possibility :) > > > > > Which leaves only the option to refine is_branch_taken(). > > > > I think is_branch_taken() modification should not be too complicated. > > For JSET it only checks tnum, but does not take ranges into account. > > Reasoning about ranges is something along the lines: > > - for unsigned range a = b & CONST -> a is in [b_min & CONST, b_max & CONST]; > > - for signed ranged same thing, but consider two unsigned sub-ranges; > > - for non CONST cases, I think same reasoning can apply, but more > > min/max combinations need to be explored. > > - then check if zero is a member or 'a' range. > > > > Wdyt? > > I might be missing something, but I'm not sure that works. For the > unsigned range, if we have b & 0x2 with b in [2; 10], then we'd end up > with a in [2; 2] and would conclude that the jump is never taken. But > b=8 proves us wrong. I see, what is really needed is an 'or' joined mask of all 'b' values. I need to think how that can be obtained (or approximated). > > > > > The number of times syzkaller is currently hitting this (180 in 1.5 > > > days) suggests there are many different ways to reproduce. > > > > It is a bit inconvenient to read syzbot BPF reports at the moment, > > because it us hard to figure out how the program looks like. > > Do you happen to know how complicated would it be to modify syzbot > > output to: > > - produce a comment with BPF program > > - generating reproducer with a flag, allowing to print level 2 > > verifier log > > ? > > I have the same thought sometimes. Right now, I add verifier logs to a > syz or C reproducer to see the program. Producing the BPF program in a > comment would likely be tricky as we'd need to maintain a disassembler > in syzkaller. So, it operates on raw bytes, not on logical instructions? > Adding verifier logs to reproducers that contain bpf(PROG_LOAD) > calls seems easier. Then I guess we'd get that output in the strace > or console logs of syzbot. The log level 2 might be huge, so it shouldn't be enabled by default. But not having to modify the reproducer before investigation would be helpful. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-04 17:26 ` Eduard Zingerman @ 2025-07-04 21:13 ` Eduard Zingerman 2025-07-04 21:27 ` Eduard Zingerman 2025-07-07 22:30 ` Paul Chaignon 2025-07-07 21:57 ` Paul Chaignon 1 sibling, 2 replies; 17+ messages in thread From: Eduard Zingerman @ 2025-07-04 21:13 UTC (permalink / raw) To: Paul Chaignon Cc: syzbot, andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Fri, 2025-07-04 at 10:26 -0700, Eduard Zingerman wrote: > On Fri, 2025-07-04 at 19:14 +0200, Paul Chaignon wrote: > > On Thu, Jul 03, 2025 at 11:54:27AM -0700, Eduard Zingerman wrote: [...] > > > I think is_branch_taken() modification should not be too complicated. > > > For JSET it only checks tnum, but does not take ranges into account. > > > Reasoning about ranges is something along the lines: > > > - for unsigned range a = b & CONST -> a is in [b_min & CONST, b_max & CONST]; > > > - for signed ranged same thing, but consider two unsigned sub-ranges; > > > - for non CONST cases, I think same reasoning can apply, but more > > > min/max combinations need to be explored. > > > - then check if zero is a member or 'a' range. > > > > > > Wdyt? > > > > I might be missing something, but I'm not sure that works. For the > > unsigned range, if we have b & 0x2 with b in [2; 10], then we'd end up > > with a in [2; 2] and would conclude that the jump is never taken. But > > b=8 proves us wrong. > > I see, what is really needed is an 'or' joined mask of all 'b' values. > I need to think how that can be obtained (or approximated). I think the mask can be computed as in or_range() function at the bottom of the email. This gives the following algorithm, if only unsigned range is considered: - assume prediction is needed for "if a & b goto ..." - bits that may be set in 'a' are or_range(a_min, a_max) - bits that may be set in 'b' are or_range(b_min, b_max) - if computed bit masks intersect: both branches are possible - otherwise only false branch is possible. Wdyt? [...] --- #include <stdint.h> #include <stdio.h> static uint64_t or_range(uint64_t lo, uint64_t hi) { uint64_t m; uint32_t i; m = hi; i = 0; while (lo != hi) { m |= 1lu << i; lo >>= 1; hi >>= 1; i++; } return m; } static uint64_t or_range_simple(uint64_t lo, uint64_t hi) { uint64_t m = 0; uint64_t v = 0; for (v = lo; v <= hi; v++) m |= v; return m; } int main(int argc, char *argv[]) { int max = 0x1000; for (int lo = 0; lo < max; lo++) { for (int hi = lo; hi < max; hi++) { uint64_t expected = or_range_simple(lo, hi); uint64_t result = or_range(lo, hi); if (expected != result) { printf("mismatch: %x..%x -> expecting %lx, result %lx\n", lo, hi, expected, result); return 1; } } } printf("all ok\n"); return 0; } ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-04 21:13 ` Eduard Zingerman @ 2025-07-04 21:27 ` Eduard Zingerman 2025-07-07 22:30 ` Paul Chaignon 1 sibling, 0 replies; 17+ messages in thread From: Eduard Zingerman @ 2025-07-04 21:27 UTC (permalink / raw) To: Paul Chaignon Cc: syzbot, andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Fri, 2025-07-04 at 14:13 -0700, Eduard Zingerman wrote: [...] > I think the mask can be computed as in or_range() function at the > bottom of the email. This gives the following algorithm, if only > unsigned range is considered: > > - assume prediction is needed for "if a & b goto ..." > - bits that may be set in 'a' are or_range(a_min, a_max) > - bits that may be set in 'b' are or_range(b_min, b_max) > - if computed bit masks intersect: both branches are possible > - otherwise only false branch is possible. > > Wdyt? This does not help with problem at hand, however. Possible bits set for range [0x1, 0xff] are 0xff, so there would be no prediction. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-04 21:13 ` Eduard Zingerman 2025-07-04 21:27 ` Eduard Zingerman @ 2025-07-07 22:30 ` Paul Chaignon 2025-07-07 23:29 ` Eduard Zingerman 1 sibling, 1 reply; 17+ messages in thread From: Paul Chaignon @ 2025-07-07 22:30 UTC (permalink / raw) To: Eduard Zingerman Cc: syzbot, andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Fri, Jul 04, 2025 at 02:13:15PM -0700, Eduard Zingerman wrote: > On Fri, 2025-07-04 at 10:26 -0700, Eduard Zingerman wrote: > > On Fri, 2025-07-04 at 19:14 +0200, Paul Chaignon wrote: > > > On Thu, Jul 03, 2025 at 11:54:27AM -0700, Eduard Zingerman wrote: > > [...] > > > > > I think is_branch_taken() modification should not be too complicated. > > > > For JSET it only checks tnum, but does not take ranges into account. > > > > Reasoning about ranges is something along the lines: > > > > - for unsigned range a = b & CONST -> a is in [b_min & CONST, b_max & CONST]; > > > > - for signed ranged same thing, but consider two unsigned sub-ranges; > > > > - for non CONST cases, I think same reasoning can apply, but more > > > > min/max combinations need to be explored. > > > > - then check if zero is a member or 'a' range. > > > > > > > > Wdyt? > > > > > > I might be missing something, but I'm not sure that works. For the > > > unsigned range, if we have b & 0x2 with b in [2; 10], then we'd end up > > > with a in [2; 2] and would conclude that the jump is never taken. But > > > b=8 proves us wrong. > > > > I see, what is really needed is an 'or' joined mask of all 'b' values. > > I need to think how that can be obtained (or approximated). > > I think the mask can be computed as in or_range() function at the > bottom of the email. This gives the following algorithm, if only > unsigned range is considered: > > - assume prediction is needed for "if a & b goto ..." > - bits that may be set in 'a' are or_range(a_min, a_max) > - bits that may be set in 'b' are or_range(b_min, b_max) > - if computed bit masks intersect: both branches are possible > - otherwise only false branch is possible. > > Wdyt? This is really nice! I think we can extend it to detect some always-true branches as well, and thus handle the initial case reported by syzbot. - if a_min == 0: we don't deduce anything - bits that may be set in 'a' are: possible_a = or_range(a_min, a_max) - bits that are always set in 'b' are: always_b = b_value & ~b_mask - if possible_a & always_b == possible_a: only true branch is possible - otherwise, we can't deduce anything For BPF_X case, we probably want to also check the reverse with possible_b & always_a. --- #include <stdint.h> #include <stdio.h> #include <stdbool.h> static uint64_t or_range(uint64_t lo, uint64_t hi) { uint64_t m; uint32_t i; m = hi; i = 0; while (lo != hi) { m |= 1lu << i; lo >>= 1; hi >>= 1; i++; } return m; } static bool always_matches(uint64_t lo, uint64_t hi, uint64_t mask) { uint64_t possible_bits = or_range(lo, hi); return possible_bits & mask == possible_bits; } static bool always_matches_naive(uint64_t lo, uint64_t hi, uint64_t mask) { uint64_t v = 0; for (v = lo; v <= hi; v++) { if (!(v & mask)) { return false; } } return true; } int main(int argc, char *argv[]) { int max = 0x300; for (int mask = 0; mask < max; mask++) { for (int lo = 1; lo < max; lo++) { for (int hi = lo; hi < max; hi++) { bool expected = always_matches_naive(lo, hi, mask); bool result = always_matches(lo, hi, mask); if (result == true && expected == false) { printf("mismatch: %x..%x & %x -> expecting %d, result %d\n", lo, hi, mask, expected, result); return 1; } } } } printf("all ok\n"); return 0; } ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-07 22:30 ` Paul Chaignon @ 2025-07-07 23:29 ` Eduard Zingerman 2025-07-08 0:37 ` Eduard Zingerman 0 siblings, 1 reply; 17+ messages in thread From: Eduard Zingerman @ 2025-07-07 23:29 UTC (permalink / raw) To: Paul Chaignon Cc: syzbot, andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Tue, 2025-07-08 at 00:30 +0200, Paul Chaignon wrote: [...] > This is really nice! I think we can extend it to detect some > always-true branches as well, and thus handle the initial case reported > by syzbot. > > - if a_min == 0: we don't deduce anything > - bits that may be set in 'a' are: possible_a = or_range(a_min, a_max) > - bits that are always set in 'b' are: always_b = b_value & ~b_mask > - if possible_a & always_b == possible_a: only true branch is possible > - otherwise, we can't deduce anything > > For BPF_X case, we probably want to also check the reverse with > possible_b & always_a. So, this would extend existing predictions: - [old] always_a & always_b -> infer always true - [old] !(possible_a & possible_b) -> infer always false - [new] if possible_a & always_b == possible_a -> infer true (but make sure 0 is not in possible_a) And it so happens, that it covers example at hand. Note that or_range(1, (u64)-1) == (u64)-1, so maybe tnum would be sufficient, w/o the need for or_range(). The part of the verifier that narrows the range after prediction: regs_refine_cond_op: case BPF_JSET | BPF_X: /* reverse of BPF_JSET, see rev_opcode() */ if (!is_reg_const(reg: reg2, subreg32: is_jmp32)) swap(reg1, reg2); if (!is_reg_const(reg: reg2, subreg32: is_jmp32)) break; val = reg_const_value(reg: reg2, subreg32: is_jmp32); ... reg1->var_off = tnum_and(a: reg1->var_off, b: tnum_const(value: ~val)); ... break; And after suggested change this part would be executed only if tnum bounds can be changed by jset. So, this eliminates at-least a sub-class of a problem. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-07 23:29 ` Eduard Zingerman @ 2025-07-08 0:37 ` Eduard Zingerman 2025-07-08 0:51 ` Alexei Starovoitov 0 siblings, 1 reply; 17+ messages in thread From: Eduard Zingerman @ 2025-07-08 0:37 UTC (permalink / raw) To: Paul Chaignon Cc: syzbot, andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Mon, 2025-07-07 at 16:29 -0700, Eduard Zingerman wrote: > On Tue, 2025-07-08 at 00:30 +0200, Paul Chaignon wrote: > > [...] > > > This is really nice! I think we can extend it to detect some > > always-true branches as well, and thus handle the initial case reported > > by syzbot. > > > > - if a_min == 0: we don't deduce anything > > - bits that may be set in 'a' are: possible_a = or_range(a_min, a_max) > > - bits that are always set in 'b' are: always_b = b_value & ~b_mask > > - if possible_a & always_b == possible_a: only true branch is possible > > - otherwise, we can't deduce anything > > > > For BPF_X case, we probably want to also check the reverse with > > possible_b & always_a. > > So, this would extend existing predictions: > - [old] always_a & always_b -> infer always true > - [old] !(possible_a & possible_b) -> infer always false > - [new] if possible_a & always_b == possible_a -> infer true > (but make sure 0 is not in possible_a) > > And it so happens, that it covers example at hand. > Note that or_range(1, (u64)-1) == (u64)-1, so maybe tnum would be > sufficient, w/o the need for or_range(). > > The part of the verifier that narrows the range after prediction: > > regs_refine_cond_op: > > case BPF_JSET | BPF_X: /* reverse of BPF_JSET, see rev_opcode() */ > if (!is_reg_const(reg: reg2, subreg32: is_jmp32)) > swap(reg1, reg2); > if (!is_reg_const(reg: reg2, subreg32: is_jmp32)) > break; > val = reg_const_value(reg: reg2, subreg32: is_jmp32); > ... > reg1->var_off = tnum_and(a: reg1->var_off, b: tnum_const(value: ~val)); > ... > break; > > And after suggested change this part would be executed only if tnum > bounds can be changed by jset. So, this eliminates at-least a > sub-class of a problem. But I think the program below would still be problematic: SEC("socket") __success __retval(0) __naked void jset_bug1(void) { asm volatile (" \ call %[bpf_get_prandom_u32]; \ if r0 < 2 goto 1f; \ r0 |= 1; \ if r0 & -2 goto 1f; \ 1: r0 = 0; \ exit; \ " : : __imm(bpf_get_prandom_u32) : __clobber_all); } The possible_r0 would be changed by `if r0 & -2`, so new rule will not hit. And the problem remains unsolved. I think we need to reset min/max bounds in regs_refine_cond_op for JSET: - in some cases range is more precise than tnum - in these cases range cannot be compressed to a tnum - predictions in jset are done for a tnum - to avoid issues when narrowing tnum after prediction, forget the range. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-08 0:37 ` Eduard Zingerman @ 2025-07-08 0:51 ` Alexei Starovoitov 2025-07-08 0:57 ` Eduard Zingerman 0 siblings, 1 reply; 17+ messages in thread From: Alexei Starovoitov @ 2025-07-08 0:51 UTC (permalink / raw) To: Eduard Zingerman Cc: Paul Chaignon, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Hao Luo, John Fastabend, Jiri Olsa, KP Singh, LKML, Martin KaFai Lau, Network Development, Stanislav Fomichev, Song Liu, syzkaller-bugs, Yonghong Song On Mon, Jul 7, 2025 at 5:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Mon, 2025-07-07 at 16:29 -0700, Eduard Zingerman wrote: > > On Tue, 2025-07-08 at 00:30 +0200, Paul Chaignon wrote: > > > > [...] > > > > > This is really nice! I think we can extend it to detect some > > > always-true branches as well, and thus handle the initial case reported > > > by syzbot. > > > > > > - if a_min == 0: we don't deduce anything > > > - bits that may be set in 'a' are: possible_a = or_range(a_min, a_max) > > > - bits that are always set in 'b' are: always_b = b_value & ~b_mask > > > - if possible_a & always_b == possible_a: only true branch is possible > > > - otherwise, we can't deduce anything > > > > > > For BPF_X case, we probably want to also check the reverse with > > > possible_b & always_a. > > > > So, this would extend existing predictions: > > - [old] always_a & always_b -> infer always true > > - [old] !(possible_a & possible_b) -> infer always false > > - [new] if possible_a & always_b == possible_a -> infer true > > (but make sure 0 is not in possible_a) > > > > And it so happens, that it covers example at hand. > > Note that or_range(1, (u64)-1) == (u64)-1, so maybe tnum would be > > sufficient, w/o the need for or_range(). > > > > The part of the verifier that narrows the range after prediction: > > > > regs_refine_cond_op: > > > > case BPF_JSET | BPF_X: /* reverse of BPF_JSET, see rev_opcode() */ > > if (!is_reg_const(reg: reg2, subreg32: is_jmp32)) > > swap(reg1, reg2); > > if (!is_reg_const(reg: reg2, subreg32: is_jmp32)) > > break; > > val = reg_const_value(reg: reg2, subreg32: is_jmp32); > > ... > > reg1->var_off = tnum_and(a: reg1->var_off, b: tnum_const(value: ~val)); > > ... > > break; > > > > And after suggested change this part would be executed only if tnum > > bounds can be changed by jset. So, this eliminates at-least a > > sub-class of a problem. > > But I think the program below would still be problematic: > > SEC("socket") > __success > __retval(0) > __naked void jset_bug1(void) > { > asm volatile (" \ > call %[bpf_get_prandom_u32]; \ > if r0 < 2 goto 1f; \ > r0 |= 1; \ > if r0 & -2 goto 1f; \ > 1: r0 = 0; \ > exit; \ > " : > : __imm(bpf_get_prandom_u32) > : __clobber_all); > } > > The possible_r0 would be changed by `if r0 & -2`, so new rule will not hit. > And the problem remains unsolved. I think we need to reset min/max > bounds in regs_refine_cond_op for JSET: > - in some cases range is more precise than tnum > - in these cases range cannot be compressed to a tnum > - predictions in jset are done for a tnum > - to avoid issues when narrowing tnum after prediction, forget the > range. You're digging too deep. llvm doesn't generate JSET insn, so this is syzbot only issue. Let's address it with minimal changes. Do not introduce fancy branch taken analysis. I would be fine with reverting this particular verifier_bug() hunk. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-08 0:51 ` Alexei Starovoitov @ 2025-07-08 0:57 ` Eduard Zingerman 2025-07-08 16:19 ` Paul Chaignon 0 siblings, 1 reply; 17+ messages in thread From: Eduard Zingerman @ 2025-07-08 0:57 UTC (permalink / raw) To: Alexei Starovoitov Cc: Paul Chaignon, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Hao Luo, John Fastabend, Jiri Olsa, KP Singh, LKML, Martin KaFai Lau, Network Development, Stanislav Fomichev, Song Liu, syzkaller-bugs, Yonghong Song On Mon, 2025-07-07 at 17:51 -0700, Alexei Starovoitov wrote: > On Mon, Jul 7, 2025 at 5:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > > > On Mon, 2025-07-07 at 16:29 -0700, Eduard Zingerman wrote: > > > On Tue, 2025-07-08 at 00:30 +0200, Paul Chaignon wrote: > > > > > > [...] > > > > > > > This is really nice! I think we can extend it to detect some > > > > always-true branches as well, and thus handle the initial case reported > > > > by syzbot. > > > > > > > > - if a_min == 0: we don't deduce anything > > > > - bits that may be set in 'a' are: possible_a = or_range(a_min, a_max) > > > > - bits that are always set in 'b' are: always_b = b_value & ~b_mask > > > > - if possible_a & always_b == possible_a: only true branch is possible > > > > - otherwise, we can't deduce anything > > > > > > > > For BPF_X case, we probably want to also check the reverse with > > > > possible_b & always_a. > > > > > > So, this would extend existing predictions: > > > - [old] always_a & always_b -> infer always true > > > - [old] !(possible_a & possible_b) -> infer always false > > > - [new] if possible_a & always_b == possible_a -> infer true > > > (but make sure 0 is not in possible_a) > > > > > > And it so happens, that it covers example at hand. > > > Note that or_range(1, (u64)-1) == (u64)-1, so maybe tnum would be > > > sufficient, w/o the need for or_range(). > > > > > > The part of the verifier that narrows the range after prediction: > > > > > > regs_refine_cond_op: > > > > > > case BPF_JSET | BPF_X: /* reverse of BPF_JSET, see rev_opcode() */ > > > if (!is_reg_const(reg: reg2, subreg32: is_jmp32)) > > > swap(reg1, reg2); > > > if (!is_reg_const(reg: reg2, subreg32: is_jmp32)) > > > break; > > > val = reg_const_value(reg: reg2, subreg32: is_jmp32); > > > ... > > > reg1->var_off = tnum_and(a: reg1->var_off, b: tnum_const(value: ~val)); > > > ... > > > break; > > > > > > And after suggested change this part would be executed only if tnum > > > bounds can be changed by jset. So, this eliminates at-least a > > > sub-class of a problem. > > > > But I think the program below would still be problematic: > > > > SEC("socket") > > __success > > __retval(0) > > __naked void jset_bug1(void) > > { > > asm volatile (" \ > > call %[bpf_get_prandom_u32]; \ > > if r0 < 2 goto 1f; \ > > r0 |= 1; \ > > if r0 & -2 goto 1f; \ > > 1: r0 = 0; \ > > exit; \ > > " : > > : __imm(bpf_get_prandom_u32) > > : __clobber_all); > > } > > > > The possible_r0 would be changed by `if r0 & -2`, so new rule will not hit. > > And the problem remains unsolved. I think we need to reset min/max > > bounds in regs_refine_cond_op for JSET: > > - in some cases range is more precise than tnum > > - in these cases range cannot be compressed to a tnum > > - predictions in jset are done for a tnum > > - to avoid issues when narrowing tnum after prediction, forget the > > range. > > You're digging too deep. llvm doesn't generate JSET insn, > so this is syzbot only issue. Let's address it with minimal changes. > Do not introduce fancy branch taken analysis. > I would be fine with reverting this particular verifier_bug() hunk. My point is that the fix should look as below (but extract it as a utility function): diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 53007182b46b..b2fe665901b7 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -16207,6 +16207,14 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state swap(reg1, reg2); if (!is_reg_const(reg2, is_jmp32)) break; + reg1->u32_max_value = U32_MAX; + reg1->u32_min_value = 0; + reg1->s32_max_value = S32_MAX; + reg1->s32_min_value = S32_MIN; + reg1->umax_value = U64_MAX; + reg1->umin_value = 0; + reg1->smax_value = S64_MAX; + reg1->smin_value = S32_MIN; val = reg_const_value(reg2, is_jmp32); if (is_jmp32) { t = tnum_and(tnum_subreg(reg1->var_off), tnum_const(~val)); ---- Because of irreconcilable differences in what can be represented as a tnum and what can be represented as a range. ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-08 0:57 ` Eduard Zingerman @ 2025-07-08 16:19 ` Paul Chaignon 2025-07-08 17:39 ` Eduard Zingerman 0 siblings, 1 reply; 17+ messages in thread From: Paul Chaignon @ 2025-07-08 16:19 UTC (permalink / raw) To: Eduard Zingerman Cc: Alexei Starovoitov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Hao Luo, John Fastabend, Jiri Olsa, KP Singh, LKML, Martin KaFai Lau, Network Development, Stanislav Fomichev, Song Liu, syzkaller-bugs, Yonghong Song On Mon, Jul 07, 2025 at 05:57:32PM -0700, Eduard Zingerman wrote: > On Mon, 2025-07-07 at 17:51 -0700, Alexei Starovoitov wrote: > > On Mon, Jul 7, 2025 at 5:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > > > > > On Mon, 2025-07-07 at 16:29 -0700, Eduard Zingerman wrote: > > > > On Tue, 2025-07-08 at 00:30 +0200, Paul Chaignon wrote: [...] > > > But I think the program below would still be problematic: > > > > > > SEC("socket") > > > __success > > > __retval(0) > > > __naked void jset_bug1(void) > > > { > > > asm volatile (" \ > > > call %[bpf_get_prandom_u32]; \ > > > if r0 < 2 goto 1f; \ > > > r0 |= 1; \ > > > if r0 & -2 goto 1f; \ > > > 1: r0 = 0; \ > > > exit; \ > > > " : > > > : __imm(bpf_get_prandom_u32) > > > : __clobber_all); > > > } > > > > > > The possible_r0 would be changed by `if r0 & -2`, so new rule will not hit. > > > And the problem remains unsolved. I think we need to reset min/max > > > bounds in regs_refine_cond_op for JSET: > > > - in some cases range is more precise than tnum > > > - in these cases range cannot be compressed to a tnum > > > - predictions in jset are done for a tnum > > > - to avoid issues when narrowing tnum after prediction, forget the > > > range. > > > > You're digging too deep. llvm doesn't generate JSET insn, > > so this is syzbot only issue. Let's address it with minimal changes. > > Do not introduce fancy branch taken analysis. > > I would be fine with reverting this particular verifier_bug() hunk. Ok, if LLVM doesn't generate JSETs, I agree there's not much point trying to reduce false positives. I like Eduard's solution below because it handles the JSET case without removing the warning. Given the number of crashes syzkaller is generating, I suspect this isn't only about JSET, so it'd be good to keep some visibility into invariant violations. > > My point is that the fix should look as below (but extract it as a > utility function): > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 53007182b46b..b2fe665901b7 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -16207,6 +16207,14 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state > swap(reg1, reg2); > if (!is_reg_const(reg2, is_jmp32)) > break; > + reg1->u32_max_value = U32_MAX; > + reg1->u32_min_value = 0; > + reg1->s32_max_value = S32_MAX; > + reg1->s32_min_value = S32_MIN; > + reg1->umax_value = U64_MAX; > + reg1->umin_value = 0; > + reg1->smax_value = S64_MAX; > + reg1->smin_value = S32_MIN; Looks like __mark_reg_unbounded :) I can send a test case + __mark_reg_unbounded for BPF_JSET | BPF_X in regs_refine_cond_op. I suspect we may need the same for the BPF_JSET case as well, but I'm unable to build a repro for that so far. > val = reg_const_value(reg2, is_jmp32); > if (is_jmp32) { > t = tnum_and(tnum_subreg(reg1->var_off), tnum_const(~val)); > > ---- > > Because of irreconcilable differences in what can be represented as a > tnum and what can be represented as a range. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-08 16:19 ` Paul Chaignon @ 2025-07-08 17:39 ` Eduard Zingerman 0 siblings, 0 replies; 17+ messages in thread From: Eduard Zingerman @ 2025-07-08 17:39 UTC (permalink / raw) To: Paul Chaignon Cc: Alexei Starovoitov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Hao Luo, John Fastabend, Jiri Olsa, KP Singh, LKML, Martin KaFai Lau, Network Development, Stanislav Fomichev, Song Liu, syzkaller-bugs, Yonghong Song On Tue, 2025-07-08 at 18:19 +0200, Paul Chaignon wrote: > On Mon, Jul 07, 2025 at 05:57:32PM -0700, Eduard Zingerman wrote: > > On Mon, 2025-07-07 at 17:51 -0700, Alexei Starovoitov wrote: > > > On Mon, Jul 7, 2025 at 5:37 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > > > > > > > On Mon, 2025-07-07 at 16:29 -0700, Eduard Zingerman wrote: > > > > > On Tue, 2025-07-08 at 00:30 +0200, Paul Chaignon wrote: > > [...] > > > > > But I think the program below would still be problematic: > > > > > > > > SEC("socket") > > > > __success > > > > __retval(0) > > > > __naked void jset_bug1(void) > > > > { > > > > asm volatile (" \ > > > > call %[bpf_get_prandom_u32]; \ > > > > if r0 < 2 goto 1f; \ > > > > r0 |= 1; \ > > > > if r0 & -2 goto 1f; \ > > > > 1: r0 = 0; \ > > > > exit; \ > > > > " : > > > > : __imm(bpf_get_prandom_u32) > > > > : __clobber_all); > > > > } > > > > > > > > The possible_r0 would be changed by `if r0 & -2`, so new rule will not hit. > > > > And the problem remains unsolved. I think we need to reset min/max > > > > bounds in regs_refine_cond_op for JSET: > > > > - in some cases range is more precise than tnum > > > > - in these cases range cannot be compressed to a tnum > > > > - predictions in jset are done for a tnum > > > > - to avoid issues when narrowing tnum after prediction, forget the > > > > range. > > > > > > You're digging too deep. llvm doesn't generate JSET insn, > > > so this is syzbot only issue. Let's address it with minimal changes. > > > Do not introduce fancy branch taken analysis. > > > I would be fine with reverting this particular verifier_bug() hunk. > > Ok, if LLVM doesn't generate JSETs, I agree there's not much point > trying to reduce false positives. I like Eduard's solution below > because it handles the JSET case without removing the warning. Given > the number of crashes syzkaller is generating, I suspect this isn't > only about JSET, so it'd be good to keep some visibility into invariant > violations. I suspect similar problems might be found in any place where tnum operations are used to narrow the range. E.g. if a repro for JSET would be found, same repro might be applicable to BPF_AND. In general, it might be the case we should not treat out of sync bounds as an error. Assuming that tnum and bounds based ranges have different precision in different scale regions, situations when one bound is changed w/o changing another can be legit. E.g.: ____ bounds range ____ / \ 0 --------------------------------------------------------- MAX \___________________________________________________/ tnum range Narrowing only tnum: ____ bounds range ____ / \ 0 --------------------------------------------------------- MAX \___________________/ tnum range This does not highlight an error, but a difference in expressive power for specific values. > > My point is that the fix should look as below (but extract it as a > > utility function): > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index 53007182b46b..b2fe665901b7 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -16207,6 +16207,14 @@ static void regs_refine_cond_op(struct bpf_reg_state *reg1, struct bpf_reg_state > > swap(reg1, reg2); > > if (!is_reg_const(reg2, is_jmp32)) > > break; > > + reg1->u32_max_value = U32_MAX; > > + reg1->u32_min_value = 0; > > + reg1->s32_max_value = S32_MAX; > > + reg1->s32_min_value = S32_MIN; > > + reg1->umax_value = U64_MAX; > > + reg1->umin_value = 0; > > + reg1->smax_value = S64_MAX; > > + reg1->smin_value = S32_MIN; > > Looks like __mark_reg_unbounded :) I suspected there should be something already :) > I can send a test case + __mark_reg_unbounded for BPF_JSET | BPF_X in > regs_refine_cond_op. I suspect we may need the same for the BPF_JSET > case as well, but I'm unable to build a repro for that so far. Please go ahead. > > > val = reg_const_value(reg2, is_jmp32); > > if (is_jmp32) { > > t = tnum_and(tnum_subreg(reg1->var_off), tnum_const(~val)); > > > > ---- > > > > Because of irreconcilable differences in what can be represented as a > > tnum and what can be represented as a range. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-04 17:26 ` Eduard Zingerman 2025-07-04 21:13 ` Eduard Zingerman @ 2025-07-07 21:57 ` Paul Chaignon 2025-07-07 23:36 ` Eduard Zingerman 1 sibling, 1 reply; 17+ messages in thread From: Paul Chaignon @ 2025-07-07 21:57 UTC (permalink / raw) To: Eduard Zingerman Cc: syzbot, andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Fri, Jul 04, 2025 at 10:26:14AM -0700, Eduard Zingerman wrote: > On Fri, 2025-07-04 at 19:14 +0200, Paul Chaignon wrote: > > On Thu, Jul 03, 2025 at 11:54:27AM -0700, Eduard Zingerman wrote: > > > On Thu, 2025-07-03 at 19:02 +0200, Paul Chaignon wrote: > > > > The number of times syzkaller is currently hitting this (180 in 1.5 > > > > days) suggests there are many different ways to reproduce. > > > > > > It is a bit inconvenient to read syzbot BPF reports at the moment, > > > because it us hard to figure out how the program looks like. > > > Do you happen to know how complicated would it be to modify syzbot > > > output to: > > > - produce a comment with BPF program > > > - generating reproducer with a flag, allowing to print level 2 > > > verifier log > > > ? > > > > I have the same thought sometimes. Right now, I add verifier logs to a > > syz or C reproducer to see the program. Producing the BPF program in a > > comment would likely be tricky as we'd need to maintain a disassembler > > in syzkaller. > > So, it operates on raw bytes, not on logical instructions? Both I would say. The syzkaller descriptions for BPF are structured around instructions [1], though they may not always match 1:1 with upstream instructions. Syzkaller then mutates raw bytes, taking some information from the descriptions into account (ex. known flag values). 1 - https://github.com/google/syzkaller/blob/master/sys/linux/bpf_prog.txt ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-07 21:57 ` Paul Chaignon @ 2025-07-07 23:36 ` Eduard Zingerman 0 siblings, 0 replies; 17+ messages in thread From: Eduard Zingerman @ 2025-07-07 23:36 UTC (permalink / raw) To: Paul Chaignon Cc: syzbot, andrii, ast, bpf, daniel, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Mon, 2025-07-07 at 23:57 +0200, Paul Chaignon wrote: > On Fri, Jul 04, 2025 at 10:26:14AM -0700, Eduard Zingerman wrote: > > On Fri, 2025-07-04 at 19:14 +0200, Paul Chaignon wrote: > > > On Thu, Jul 03, 2025 at 11:54:27AM -0700, Eduard Zingerman wrote: > > > > On Thu, 2025-07-03 at 19:02 +0200, Paul Chaignon wrote: > > > > > The number of times syzkaller is currently hitting this (180 in 1.5 > > > > > days) suggests there are many different ways to reproduce. > > > > > > > > It is a bit inconvenient to read syzbot BPF reports at the moment, > > > > because it us hard to figure out how the program looks like. > > > > Do you happen to know how complicated would it be to modify syzbot > > > > output to: > > > > - produce a comment with BPF program > > > > - generating reproducer with a flag, allowing to print level 2 > > > > verifier log > > > > ? > > > > > > I have the same thought sometimes. Right now, I add verifier logs to a > > > syz or C reproducer to see the program. Producing the BPF program in a > > > comment would likely be tricky as we'd need to maintain a disassembler > > > in syzkaller. > > > > So, it operates on raw bytes, not on logical instructions? > > Both I would say. The syzkaller descriptions for BPF are structured > around instructions [1], though they may not always match 1:1 with > upstream instructions. Syzkaller then mutates raw bytes, taking some > information from the descriptions into account (ex. known flag values). > > 1 - https://github.com/google/syzkaller/blob/master/sys/linux/bpf_prog.txt I actually took a brief look at syzkaller over the weekend but got lost tbh. BPF disassembler is small (~400Loc in kernel/bpf/disasm.c). I can teach myself some golang and make a copy of it in syzkaller, but having some guidance on where to put/call this code would be of much help. (I think that having program code in the report would be of great help in triaging). ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [syzbot] [bpf?] WARNING in reg_bounds_sanity_check 2025-07-02 1:55 [syzbot] [bpf?] WARNING in reg_bounds_sanity_check syzbot 2025-07-03 17:02 ` Paul Chaignon @ 2025-07-05 16:02 ` syzbot 1 sibling, 0 replies; 17+ messages in thread From: syzbot @ 2025-07-05 16:02 UTC (permalink / raw) To: andrii, ast, bpf, daniel, eddyz87, haoluo, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, paul.chaignon, sdf, song, syzkaller-bugs, yonghong.song syzbot has bisected this issue to: commit 0df1a55afa832f463f9ad68ddc5de92230f1bc8a Author: Paul Chaignon <paul.chaignon@gmail.com> Date: Tue Jul 1 18:36:15 2025 +0000 bpf: Warn on internal verifier errors bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1693cf70580000 start commit: cce3fee729ee selftests/bpf: Enable dynptr/test_probe_read_.. git tree: bpf-next final oops: https://syzkaller.appspot.com/x/report.txt?x=1593cf70580000 console output: https://syzkaller.appspot.com/x/log.txt?x=1193cf70580000 kernel config: https://syzkaller.appspot.com/x/.config?x=79da270cec5ffd65 dashboard link: https://syzkaller.appspot.com/bug?extid=c711ce17dd78e5d4fdcf syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1594e48c580000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1159388c580000 Reported-by: syzbot+c711ce17dd78e5d4fdcf@syzkaller.appspotmail.com Fixes: 0df1a55afa83 ("bpf: Warn on internal verifier errors") For information about bisection process see: https://goo.gl/tpsmEJ#bisection ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2025-07-08 17:39 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-02 1:55 [syzbot] [bpf?] WARNING in reg_bounds_sanity_check syzbot 2025-07-03 17:02 ` Paul Chaignon 2025-07-03 18:54 ` Eduard Zingerman 2025-07-04 17:14 ` Paul Chaignon 2025-07-04 17:26 ` Eduard Zingerman 2025-07-04 21:13 ` Eduard Zingerman 2025-07-04 21:27 ` Eduard Zingerman 2025-07-07 22:30 ` Paul Chaignon 2025-07-07 23:29 ` Eduard Zingerman 2025-07-08 0:37 ` Eduard Zingerman 2025-07-08 0:51 ` Alexei Starovoitov 2025-07-08 0:57 ` Eduard Zingerman 2025-07-08 16:19 ` Paul Chaignon 2025-07-08 17:39 ` Eduard Zingerman 2025-07-07 21:57 ` Paul Chaignon 2025-07-07 23:36 ` Eduard Zingerman 2025-07-05 16:02 ` syzbot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).