* [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) @ 2025-02-05 0:34 syzbot 2025-02-05 14:56 ` Paul E. McKenney ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: syzbot @ 2025-02-05 0:34 UTC (permalink / raw) To: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs Hello, syzbot found the following issue on: HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000 kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000 Downloadable assets: disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576 BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page PGD 0 P4D 0 Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. 
RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100 RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8 RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507 R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8 FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> rcu_do_batch kernel/rcu/tree.c:2546 [inline] rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2802 handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561 __do_softirq kernel/softirq.c:595 [inline] invoke_softirq kernel/softirq.c:435 [inline] __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662 irq_exit_rcu+0x9/0x30 kernel/softirq.c:678 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline] RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194 Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 fe 78 2d f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 0f 95 f5 65 8b 05 d4 58 0b 74 85 c0 74 43 48 c7 04 24 0e 36 RSP: 0018:ffffc900030fef60 EFLAGS: 00000206 RAX: 23438dd059a4b100 RBX: 1ffff9200061fdf0 RCX: ffffffff819b316a RDX: dffffc0000000000 RSI: ffffffff8c0aa680 RDI: 0000000000000001 RBP: ffffc900030feff8 R08: ffffffff942f9847 R09: 1ffffffff285f308 R10: dffffc0000000000 R11: fffffbfff285f309 R12: dffffc0000000000 R13: 1ffff9200061fdec R14: ffffc900030fef80 R15: 
0000000000000246 spin_unlock_irqrestore include/linux/spinlock.h:406 [inline] rmqueue_bulk mm/page_alloc.c:2329 [inline] __rmqueue_pcplist+0x21fd/0x2a90 mm/page_alloc.c:3004 rmqueue_pcplist mm/page_alloc.c:3046 [inline] rmqueue mm/page_alloc.c:3077 [inline] get_page_from_freelist+0x886/0x37a0 mm/page_alloc.c:3474 __alloc_frozen_pages_noprof+0x292/0x710 mm/page_alloc.c:4739 alloc_pages_mpol+0x311/0x660 mm/mempolicy.c:2270 folio_alloc_mpol_noprof mm/mempolicy.c:2289 [inline] vma_alloc_folio_noprof+0x12b/0x260 mm/mempolicy.c:2324 folio_prealloc+0x2e/0x170 wp_page_copy mm/memory.c:3435 [inline] do_wp_page+0x1253/0x49b0 mm/memory.c:3827 handle_pte_fault mm/memory.c:5905 [inline] __handle_mm_fault+0x24d5/0x70f0 mm/memory.c:6032 handle_mm_fault+0x3e5/0x8d0 mm/memory.c:6201 do_user_addr_fault arch/x86/mm/fault.c:1388 [inline] handle_page_fault arch/x86/mm/fault.c:1480 [inline] exc_page_fault+0x2b9/0x8b0 arch/x86/mm/fault.c:1538 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623 RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88 Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90 RSP: 0018:ffffc900030fff00 EFLAGS: 00050202 RAX: 0000000000000005 RBX: 0000000000000000 RCX: 00005555679927d0 RDX: 0000000000000000 RSI: ffffffff8c0ab8e0 RDI: ffffffff8c608a00 RBP: ffff888000dfcf20 R08: ffffffff901b5177 R09: 1ffffffff2036a2e R10: dffffc0000000000 R11: fffffbfff2036a2f R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000005 R15: dffffc0000000000 schedule_tail+0x96/0xb0 kernel/sched/core.c:5312 ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:144 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Modules linked in: CR2: 0000000000000000 ---[ end trace 0000000000000000 ]--- RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. 
RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100 RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8 RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507 R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8 FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess): 0: 9c pushf 1: 8f 44 24 20 pop 0x20(%rsp) 5: 42 80 3c 23 00 cmpb $0x0,(%rbx,%r12,1) a: 74 08 je 0x14 c: 4c 89 f7 mov %r14,%rdi f: e8 fe 78 2d f6 call 0xf62d7912 14: f6 44 24 21 02 testb $0x2,0x21(%rsp) 19: 75 52 jne 0x6d 1b: 41 f7 c7 00 02 00 00 test $0x200,%r15d 22: 74 01 je 0x25 24: fb sti 25: bf 01 00 00 00 mov $0x1,%edi * 2a: e8 c3 0f 95 f5 call 0xf5950ff2 <-- trapping instruction 2f: 65 8b 05 d4 58 0b 74 mov %gs:0x740b58d4(%rip),%eax # 0x740b590a 36: 85 c0 test %eax,%eax 38: 74 43 je 0x7d 3a: 48 rex.W 3b: c7 .byte 0xc7 3c: 04 24 add $0x24,%al 3e: 0e (bad) 3f: 36 ss --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. If the report is already addressed, let syzbot know by replying with: #syz fix: exact-commit-title If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing. 
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-02-05 0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot @ 2025-02-05 14:56 ` Paul E. McKenney 2025-06-08 15:26 ` Kent Overstreet 2025-06-08 6:58 ` [syzbot] [bcachefs?] [rcu?] " syzbot 2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki 2 siblings, 1 reply; 20+ messages in thread From: Paul E. McKenney @ 2025-02-05 14:56 UTC (permalink / raw) To: syzbot Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000 > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000 > > Downloadable assets: > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576 > BUG: kernel NULL pointer dereference, address: 0000000000000000 > #PF: supervisor instruction fetch in kernel mode > #PF: 
error_code(0x0010) - not-present page > PGD 0 P4D 0 > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 > RIP: 0010:0x0 > Code: Unable to access opcode bytes at 0xffffffffffffffd6. > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100 > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8 > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507 > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8 > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <IRQ> > rcu_do_batch kernel/rcu/tree.c:2546 [inline] The usual way that this happens is that someone clobbers the rcu_head structure of something that has been passed to call_rcu(). The most popular way of clobbering this structure is to pass the same something to call_rcu() twice in a row, but other creative arrangements are possible. Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually spot invoking call_rcu() twice in a row. 
Thanx, Paul
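Paul's remark about passing the same object to call_rcu() twice can be illustrated with a small userspace model. This is only a sketch: it assumes a singly linked callback list with a tail pointer, and the names cb_head, cb_list, cb_enqueue() and cb_do_batch() are hypothetical, not kernel APIs. In this model, queueing the same head twice makes it point at itself, so the batch loop sees it a second time after its func has already been cleared.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; the struct and function names are hypothetical. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

struct cb_list {
	struct cb_head *head;
	struct cb_head **tail;	/* points at the last ->next slot */
};

static int cb_hits;

static void count_cb(struct cb_head *head)
{
	(void)head;
	cb_hits++;
}

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/* Tail enqueue with no protection against queueing the same head twice. */
static void cb_enqueue(struct cb_list *l, struct cb_head *h,
		       void (*func)(struct cb_head *head))
{
	h->func = func;
	h->next = NULL;
	*l->tail = h;		/* if h is already the tail, this sets h->next = h */
	l->tail = &h->next;
}

/*
 * Invoke queued callbacks, clearing ->func before each call.
 * Returns the number invoked, or -1 if a NULL ->func is found
 * (the point where the reported oops jumps to address 0).
 */
static int cb_do_batch(struct cb_list *l, int limit)
{
	struct cb_head *h = l->head;
	int invoked = 0;

	cb_list_init(l);
	while (h && invoked < limit) {
		struct cb_head *next = h->next;
		void (*f)(struct cb_head *head) = h->func;

		if (!f)
			return -1;
		h->func = NULL;
		f(h);
		invoked++;
		h = next;
	}
	return invoked;
}
```

Queueing one cb_head twice with cb_enqueue() and then calling cb_do_batch() runs the callback once and then returns -1 on the second visit. Whether that is what happened in this report is not established by the thread.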
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-02-05 14:56 ` Paul E. McKenney @ 2025-06-08 15:26 ` Kent Overstreet 2025-06-08 18:23 ` Uladzislau Rezki 0 siblings, 1 reply; 20+ messages in thread From: Kent Overstreet @ 2025-06-08 15:26 UTC (permalink / raw) To: Paul E. McKenney Cc: syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote: > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote: > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000 > > > > Downloadable assets: > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz > > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > > > slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576 > > BUG: kernel NULL pointer dereference, address: 0000000000000000 > > #PF: supervisor instruction fetch in kernel mode > > #PF: error_code(0x0010) - not-present page > > PGD 0 P4D 0 > > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI 
> > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0 > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 > > RIP: 0010:0x0 > > Code: Unable to access opcode bytes at 0xffffffffffffffd6. > > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246 > > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100 > > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8 > > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a > > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507 > > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8 > > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > <IRQ> > > rcu_do_batch kernel/rcu/tree.c:2546 [inline] > > The usual way that this happens is that someone clobbers the rcu_head > structure of something that has been passed to call_rcu(). The most > popular way of clobbering this structure is to pass the same something to > call_rcu() twice in a row, but other creative arrangements are possible. > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually > spot invoking call_rcu() twice in a row. I don't think it's that - syzbot's .config already has that enabled. KASAN, too. And the only place we do call_rcu() is from rcu_pending.c, where we've got a rearming rcu callback - but we track whether it's outstanding, and we do all relevant operations with a lock held. And we only use rcu_pending.c with SRCU, not regular RCU. We do use kfree_rcu() in a few places (all boring, I expect), but that doesn't (generally?) use the rcu callback list. 
So I'm not sure this is even a bcachefs bug.
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-06-08 15:26 ` Kent Overstreet @ 2025-06-08 18:23 ` Uladzislau Rezki 2025-06-09 0:25 ` Paul E. McKenney 2025-06-09 18:28 ` Vlastimil Babka 0 siblings, 2 replies; 20+ messages in thread From: Uladzislau Rezki @ 2025-06-08 18:23 UTC (permalink / raw) To: Kent Overstreet, Paul E. McKenney Cc: Paul E. McKenney, syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote: > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote: > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote: > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/.. > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce > > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000 > > > > > > Downloadable assets: > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz > > > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz > > > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz > > > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > > > > > slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576 > > > BUG: kernel NULL pointer dereference, 
address: 0000000000000000 > > > #PF: supervisor instruction fetch in kernel mode > > > #PF: error_code(0x0010) - not-present page > > > PGD 0 P4D 0 > > > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI > > > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0 > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 > > > RIP: 0010:0x0 > > > Code: Unable to access opcode bytes at 0xffffffffffffffd6. > > > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246 > > > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100 > > > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8 > > > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a > > > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507 > > > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8 > > > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000 > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0 > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > Call Trace: > > > <IRQ> > > > rcu_do_batch kernel/rcu/tree.c:2546 [inline] > > > > The usual way that this happens is that someone clobbers the rcu_head > > structure of something that has been passed to call_rcu(). The most > > popular way of clobbering this structure is to pass the same something to > > call_rcu() twice in a row, but other creative arrangements are possible. > > > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually > > spot invoking call_rcu() twice in a row. > > I don't think it's that - syzbot's .config already has that enabled. > KASAN, too. 
>
> And the only place we do call_rcu() is from rcu_pending.c, where we've
> got a rearming rcu callback - but we track whether it's outstanding, and
> we do all relevant operations with a lock held.
>
> And we only use rcu_pending.c with SRCU, not regular RCU.
>
> We do use kfree_rcu() in a few places (all boring, I expect), but that
> doesn't (generally?) use the rcu callback list.
>
Right, kvfree_rcu() does not intersect with regular callbacks, it has
its own path.

It looks like the problem is here:

<snip>
  f = rhp->func;
  debug_rcu_head_callback(rhp);
  WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
  f(rhp);
<snip>

we do not check if callback, "f", is a NULL. If it is, the kernel bug
is triggered right away. For example:

call_rcu(&rh, NULL);

@Paul, do you think it makes sense to narrow callers which apparently
pass NULL as a callback? To me it seems the case of this bug. But we
do not know the source.

It would give at least a stack-trace of caller which passes a NULL.

--
Uladzislau Rezki
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-06-08 18:23 ` Uladzislau Rezki @ 2025-06-09 0:25 ` Paul E. McKenney 2025-06-09 8:35 ` Uladzislau Rezki 2025-06-09 18:28 ` Vlastimil Babka 1 sibling, 1 reply; 20+ messages in thread From: Paul E. McKenney @ 2025-06-09 0:25 UTC (permalink / raw) To: Uladzislau Rezki Cc: Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote: > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote: > > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote: > > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote: > > > > Hello, > > > > > > > > syzbot found the following issue on: > > > > > > > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/.. > > > > git tree: upstream > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000 > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a > > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000 > > > > > > > > Downloadable assets: > > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz > > > > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz > > > > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz > > > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > > > > > > > slab radix_tree_node 
start ffff88803bf382c0 pointer offset 24 size 576 > > > > BUG: kernel NULL pointer dereference, address: 0000000000000000 > > > > #PF: supervisor instruction fetch in kernel mode > > > > #PF: error_code(0x0010) - not-present page > > > > PGD 0 P4D 0 > > > > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI > > > > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0 > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 > > > > RIP: 0010:0x0 > > > > Code: Unable to access opcode bytes at 0xffffffffffffffd6. > > > > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246 > > > > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100 > > > > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8 > > > > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a > > > > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507 > > > > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8 > > > > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000 > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0 > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > > Call Trace: > > > > <IRQ> > > > > rcu_do_batch kernel/rcu/tree.c:2546 [inline] > > > > > > The usual way that this happens is that someone clobbers the rcu_head > > > structure of something that has been passed to call_rcu(). The most > > > popular way of clobbering this structure is to pass the same something to > > > call_rcu() twice in a row, but other creative arrangements are possible. > > > > > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually > > > spot invoking call_rcu() twice in a row. > > > > I don't think it's that - syzbot's .config already has that enabled. 
> > KASAN, too.
> >
> > And the only place we do call_rcu() is from rcu_pending.c, where we've
> > got a rearming rcu callback - but we track whether it's outstanding, and
> > we do all relevant operations with a lock held.
> >
> > And we only use rcu_pending.c with SRCU, not regular RCU.
> >
> > We do use kfree_rcu() in a few places (all boring, I expect), but that
> > doesn't (generally?) use the rcu callback list.
> >
> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> its own path.
>
> It looks like the problem is here:
>
> <snip>
>   f = rhp->func;
>   debug_rcu_head_callback(rhp);
>   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
>   f(rhp);
> <snip>
>
> we do not check if callback, "f", is a NULL. If it is, the kernel bug
> is triggered right away. For example:
>
> call_rcu(&rh, NULL);
>
> @Paul, do you think it makes sense to narrow callers which apparently
> pass NULL as a callback? To me it seems the case of this bug. But we
> do not know the source.
>
> It would give at least a stack-trace of caller which passes a NULL.

Adding a check for NULL func passed to __call_rcu_common(), you mean?

That wouldn't hurt, and would either (as you say) catch the culprit
or show that the problem is elsewhere.

Thanx, Paul
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09  0:25 ` Paul E. McKenney
@ 2025-06-09  8:35 ` Uladzislau Rezki
  2025-06-09  9:47 ` Paul E. McKenney
  0 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-09 8:35 UTC
To: Paul E. McKenney
Cc: Uladzislau Rezki, Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs

On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
> On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> > > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > > > [...]
> > > > > Call Trace:
> > > > >  <IRQ>
> > > > >  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> > > >
> > > > The usual way that this happens is that someone clobbers the rcu_head
> > > > structure of something that has been passed to call_rcu(). The most
> > > > popular way of clobbering this structure is to pass the same something to
> > > > call_rcu() twice in a row, but other creative arrangements are possible.
> > > >
> > > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > > > spot invoking call_rcu() twice in a row.
> > >
> > > I don't think it's that - syzbot's .config already has that enabled.
> > > KASAN, too.
> > >
> > > And the only place we do call_rcu() is from rcu_pending.c, where we've
> > > got a rearming rcu callback - but we track whether it's outstanding, and
> > > we do all relevant operations with a lock held.
> > >
> > > And we only use rcu_pending.c with SRCU, not regular RCU.
> > >
> > > We do use kfree_rcu() in a few places (all boring, I expect), but that
> > > doesn't (generally?) use the rcu callback list.
> > >
> > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > its own path.
> >
> > It looks like the problem is here:
> >
> > <snip>
> >   f = rhp->func;
> >   debug_rcu_head_callback(rhp);
> >   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> >   f(rhp);
> > <snip>
> >
> > we do not check if callback, "f", is a NULL. If it is, the kernel bug
> > is triggered right away. For example:
> >
> > call_rcu(&rh, NULL);
> >
> > @Paul, do you think it makes sense to narrow callers which apparently
> > pass NULL as a callback? To me it seems the case of this bug. But we
> > do not know the source.
> >
> > It would give at least a stack-trace of caller which passes a NULL.
>
> Adding a check for NULL func passed to __call_rcu_common(), you mean?
>
Yes. Currently there is no check at all. So passing a NULL just triggers
a kernel panic.

>
> That wouldn't hurt, and would either (as you say) catch the culprit
> or show that the problem is elsewhere.
>
I can add it then and send out the patch if no objections.

--
Uladzislau Rezki
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09  8:35 ` Uladzislau Rezki
@ 2025-06-09  9:47 ` Paul E. McKenney
  2025-06-09 14:20 ` Joel Fernandes
  0 siblings, 1 reply; 20+ messages in thread
From: Paul E. McKenney @ 2025-06-09 9:47 UTC
To: Uladzislau Rezki
Cc: Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs

On Mon, Jun 09, 2025 at 10:35:34AM +0200, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
> > On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> > > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> > > > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > > > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > > > > [...]
> > > > > > Call Trace:
> > > > > >  <IRQ>
> > > > > >  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> > > > >
> > > > > The usual way that this happens is that someone clobbers the rcu_head
> > > > > structure of something that has been passed to call_rcu(). The most
> > > > > popular way of clobbering this structure is to pass the same something to
> > > > > call_rcu() twice in a row, but other creative arrangements are possible.
> > > > >
> > > > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > > > > spot invoking call_rcu() twice in a row.
> > > >
> > > > I don't think it's that - syzbot's .config already has that enabled.
> > > > KASAN, too.
> > > >
> > > > And the only place we do call_rcu() is from rcu_pending.c, where we've
> > > > got a rearming rcu callback - but we track whether it's outstanding, and
> > > > we do all relevant operations with a lock held.
> > > >
> > > > And we only use rcu_pending.c with SRCU, not regular RCU.
> > > >
> > > > We do use kfree_rcu() in a few places (all boring, I expect), but that
> > > > doesn't (generally?) use the rcu callback list.
> > > >
> > > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > > its own path.
> > >
> > > It looks like the problem is here:
> > >
> > > <snip>
> > >   f = rhp->func;
> > >   debug_rcu_head_callback(rhp);
> > >   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> > >   f(rhp);
> > > <snip>
> > >
> > > we do not check if callback, "f", is a NULL. If it is, the kernel bug
> > > is triggered right away. For example:
> > >
> > > call_rcu(&rh, NULL);
> > >
> > > @Paul, do you think it makes sense to narrow callers which apparently
> > > pass NULL as a callback? To me it seems the case of this bug. But we
> > > do not know the source.
> > >
> > > It would give at least a stack-trace of caller which passes a NULL.
> >
> > Adding a check for NULL func passed to __call_rcu_common(), you mean?
> >
> Yes. Currently there is no check at all. So passing a NULL just triggers
> a kernel panic.
>
> >
> > That wouldn't hurt, and would either (as you say) catch the culprit
> > or show that the problem is elsewhere.
> >
> I can add it then and send out the patch if no objections.

No objections from me!

Thanx, Paul
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09  9:47 ` Paul E. McKenney
@ 2025-06-09 14:20 ` Joel Fernandes
  2025-06-10 12:19 ` Uladzislau Rezki
  0 siblings, 1 reply; 20+ messages in thread
From: Joel Fernandes @ 2025-06-09 14:20 UTC
To: paulmck, Uladzislau Rezki
Cc: Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs

On 6/9/2025 5:47 AM, Paul E. McKenney wrote:
> On Mon, Jun 09, 2025 at 10:35:34AM +0200, Uladzislau Rezki wrote:
>> On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
>>> On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
>>>> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
>>>>> On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
>>>>>> On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
>>>>>>> [...]
>>>>>>> Call Trace:
>>>>>>>  <IRQ>
>>>>>>>  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
>>>>>>
>>>>>> The usual way that this happens is that someone clobbers the rcu_head
>>>>>> structure of something that has been passed to call_rcu(). The most
>>>>>> popular way of clobbering this structure is to pass the same something to
>>>>>> call_rcu() twice in a row, but other creative arrangements are possible.
>>>>>>
>>>>>> Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
>>>>>> spot invoking call_rcu() twice in a row.
>>>>>
>>>>> I don't think it's that - syzbot's .config already has that enabled.
>>>>> KASAN, too.
>>>>>
>>>>> And the only place we do call_rcu() is from rcu_pending.c, where we've
>>>>> got a rearming rcu callback - but we track whether it's outstanding, and
>>>>> we do all relevant operations with a lock held.
>>>>>
>>>>> And we only use rcu_pending.c with SRCU, not regular RCU.
>>>>>
>>>>> We do use kfree_rcu() in a few places (all boring, I expect), but that
>>>>> doesn't (generally?) use the rcu callback list.
>>>>>
>>>> Right, kvfree_rcu() does not intersect with regular callbacks, it has
>>>> its own path.
>>>>
>>>> It looks like the problem is here:
>>>>
>>>> <snip>
>>>>   f = rhp->func;
>>>>   debug_rcu_head_callback(rhp);
>>>>   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
>>>>   f(rhp);
>>>> <snip>
>>>>
>>>> we do not check if callback, "f", is a NULL. If it is, the kernel bug
>>>> is triggered right away. For example:
>>>>
>>>> call_rcu(&rh, NULL);
>>>>
>>>> @Paul, do you think it makes sense to narrow callers which apparently
>>>> pass NULL as a callback? To me it seems the case of this bug. But we
>>>> do not know the source.
>>>>
>>>> It would give at least a stack-trace of caller which passes a NULL.
>>>
>>> Adding a check for NULL func passed to __call_rcu_common(), you mean?
>>>
>> Yes. Currently there is no check at all. So passing a NULL just triggers
>> a kernel panic.
>>
>>>
>>> That wouldn't hurt, and would either (as you say) catch the culprit
>>> or show that the problem is elsewhere.
>>>
>> I can add it then and send out the patch if no objections.
>
> No objections from me!

Me neither! And I can push that into an -rc release as well once I have it
(since it is related to a potential bug).

thanks,

 - Joel
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09 14:20 ` Joel Fernandes
@ 2025-06-10 12:19 ` Uladzislau Rezki
  0 siblings, 0 replies; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-10 12:19 UTC
To: Joel Fernandes
Cc: paulmck, Uladzislau Rezki, Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs

On Mon, Jun 09, 2025 at 10:20:58AM -0400, Joel Fernandes wrote:
>
>
> On 6/9/2025 5:47 AM, Paul E. McKenney wrote:
> > On Mon, Jun 09, 2025 at 10:35:34AM +0200, Uladzislau Rezki wrote:
> >> On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
> >>> On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> >>>> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> >>>>> On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> >>>>>> On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> >>>>>>> [...]
> >>>>>>> Call Trace:
> >>>>>>>  <IRQ>
> >>>>>>>  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> >>>>>>
> >>>>>> The usual way that this happens is that someone clobbers the rcu_head
> >>>>>> structure of something that has been passed to call_rcu(). The most
> >>>>>> popular way of clobbering this structure is to pass the same something to
> >>>>>> call_rcu() twice in a row, but other creative arrangements are possible.
> >>>>>>
> >>>>>> Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> >>>>>> spot invoking call_rcu() twice in a row.
> >>>>>
> >>>>> I don't think it's that - syzbot's .config already has that enabled.
> >>>>> KASAN, too.
> >>>>>
> >>>>> And the only place we do call_rcu() is from rcu_pending.c, where we've
> >>>>> got a rearming rcu callback - but we track whether it's outstanding, and
> >>>>> we do all relevant operations with a lock held.
> >>>>>
> >>>>> And we only use rcu_pending.c with SRCU, not regular RCU.
> >>>>>
> >>>>> We do use kfree_rcu() in a few places (all boring, I expect), but that
> >>>>> doesn't (generally?) use the rcu callback list.
> >>>>>
> >>>> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> >>>> its own path.
> >>>>
> >>>> It looks like the problem is here:
> >>>>
> >>>> <snip>
> >>>>   f = rhp->func;
> >>>>   debug_rcu_head_callback(rhp);
> >>>>   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> >>>>   f(rhp);
> >>>> <snip>
> >>>>
> >>>> we do not check if callback, "f", is a NULL. If it is, the kernel bug
> >>>> is triggered right away. For example:
> >>>>
> >>>> call_rcu(&rh, NULL);
> >>>>
> >>>> @Paul, do you think it makes sense to narrow callers which apparently
> >>>> pass NULL as a callback? To me it seems the case of this bug. But we
> >>>> do not know the source.
> >>>>
> >>>> It would give at least a stack-trace of caller which passes a NULL.
> >>>
> >>> Adding a check for NULL func passed to __call_rcu_common(), you mean?
> >>>
> >> Yes. Currently there is no check at all. So passing a NULL just triggers
> >> a kernel panic.
> >>
> >>>
> >>> That wouldn't hurt, and would either (as you say) catch the culprit
> >>> or show that the problem is elsewhere.
> >>>
> >> I can add it then and send out the patch if no objections.
> >
> > No objections from me!
>
> Me neither! And I can push that into an -rc release as well once I have it
> (since it is related to a potential bug).
>
I will prepare it and send it out today.

--
Uladzislau Rezki
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-08 18:23 ` Uladzislau Rezki
  2025-06-09  0:25 ` Paul E. McKenney
@ 2025-06-09 18:28 ` Vlastimil Babka
  2025-06-10 12:33 ` Uladzislau Rezki
  1 sibling, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2025-06-09 18:28 UTC
To: Uladzislau Rezki, Kent Overstreet, Paul E. McKenney
Cc: syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs

On 6/8/25 20:23, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
>>
>> I don't think it's that - syzbot's .config already has that enabled.
>> KASAN, too.
>>
>> And the only place we do call_rcu() is from rcu_pending.c, where we've
>> got a rearming rcu callback - but we track whether it's outstanding, and
>> we do all relevant operations with a lock held.
>>
>> And we only use rcu_pending.c with SRCU, not regular RCU.
>>
>> We do use kfree_rcu() in a few places (all boring, I expect), but that
>> doesn't (generally?) use the rcu callback list.
>>
> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> its own path.

You mean due to the batching? Maybe the batching should be disabled with
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y if it prevents it from detecting issues?
Otherwise we now have kvfree_rcu_cb() so the special handling of
kvfree_rcu() is gone in the non-batching case.

> It looks like the problem is here:
>
> <snip>
>   f = rhp->func;
>   debug_rcu_head_callback(rhp);
>   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
>   f(rhp);
> <snip>
>
> we do not check if callback, "f", is a NULL. If it is, the kernel bug
> is triggered right away. For example:
>
> call_rcu(&rh, NULL);
>
> @Paul, do you think it makes sense to narrow callers which apparently
> pass NULL as a callback? To me it seems the case of this bug. But we
> do not know the source.
>
> It would give at least a stack-trace of caller which passes a NULL.

Right, AFAIU this kind of check is now possible, previously NULL was being
interpreted as a valid __is_kvfree_rcu_offset() (i.e. rcu_head at offset 0).

> --
> Uladzislau Rezki
>
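To illustrate the point about offset 0, here is a minimal user-space sketch,
not the kernel code. It assumes an encoding in which the callback slot can
hold either a real function address or a small offset of the head inside its
enclosing object, with small values read as offsets. The names struct head,
func_word, looks_like_offset() and the SKETCH_OFFSET_LIMIT threshold are all
hypothetical and chosen only for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed threshold for this sketch; values below it are read as offsets. */
#define SKETCH_OFFSET_LIMIT 4096u

/* Hypothetical stand-in for struct rcu_head. */
struct head {
	struct head *next;
	uintptr_t func_word;	/* either a callback address or a small offset */
};

/* Hypothetical object whose head is the first member, i.e. at offset 0. */
struct obj_head_first {
	struct head rh;
	int payload;
};

/* Hypothetical object whose head follows another member. */
struct obj_head_later {
	long payload;
	struct head rh;
};

/* True if the stored word would be taken for an offset, not a callback. */
static bool looks_like_offset(uintptr_t func_word)
{
	return func_word < SKETCH_OFFSET_LIMIT;
}

/* A real callback, used only to obtain a genuine function address. */
static void real_callback(struct head *h)
{
	(void)h;
}
```

Under such an encoding a NULL callback and "head at offset 0" are the same
bit pattern, so a "func must not be NULL" check would also reject a
legitimate offset. Once the offset trick is no longer used on that path, the
NULL check becomes unambiguous. The address of real_callback() is expected to
be far above the threshold on typical systems, which is an assumption about
the platform rather than a guarantee of the C language.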
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09 18:28 ` Vlastimil Babka
@ 2025-06-10 12:33 ` Uladzislau Rezki
  0 siblings, 0 replies; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-10 12:33 UTC
To: Vlastimil Babka
Cc: Uladzislau Rezki, Kent Overstreet, Paul E. McKenney, syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs

On Mon, Jun 09, 2025 at 08:28:56PM +0200, Vlastimil Babka wrote:
> On 6/8/25 20:23, Uladzislau Rezki wrote:
> > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> >>
> >> I don't think it's that - syzbot's .config already has that enabled.
> >> KASAN, too.
> >>
> >> And the only place we do call_rcu() is from rcu_pending.c, where we've
> >> got a rearming rcu callback - but we track whether it's outstanding, and
> >> we do all relevant operations with a lock held.
> >>
> >> And we only use rcu_pending.c with SRCU, not regular RCU.
> >>
> >> We do use kfree_rcu() in a few places (all boring, I expect), but that
> >> doesn't (generally?) use the rcu callback list.
> >>
> > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > its own path.
>
> You mean due to the batching? Maybe the batching should be disabled with
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y if it prevents it from detecting issues?
> Otherwise we now have kvfree_rcu_cb() so the special handling of
> kvfree_rcu() is gone in the non-batching case.
>
Not really. I meant that the call_rcu() API does not check whether the
passed callback, which is executed after a GP, is NULL. If it is, we get
the bug about dereferencing a NULL pointer.

Since it is invoked from the rcu_core() context, we cannot identify the
caller in order to blame someone :)

As for batching, we do support CONFIG_DEBUG_OBJECTS_RCU_HEAD. It helps to
identify double-freeing and probably leaking.

> > It looks like the problem is here:
> >
> > <snip>
> >   f = rhp->func;
> >   debug_rcu_head_callback(rhp);
> >   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> >   f(rhp);
> > <snip>
> >
> > we do not check if callback, "f", is a NULL. If it is, the kernel bug
> > is triggered right away. For example:
> >
> > call_rcu(&rh, NULL);
> >
> > @Paul, do you think it makes sense to narrow callers which apparently
> > pass NULL as a callback? To me it seems the case of this bug. But we
> > do not know the source.
> >
> > It would give at least a stack-trace of caller which passes a NULL.
>
> Right, AFAIU this kind of check is now possible, previously NULL was being
> interpreted as a valid __is_kvfree_rcu_offset() (i.e. rcu_head at offset 0).
>
> > --
> > Uladzislau Rezki
> >
>
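The idea of checking at enqueue time rather than at invocation time can be
sketched in user space as follows. This is only an illustration under stated
assumptions, not the patch discussed in this thread: struct cb_head,
queue_cb(), run_batch() and note_invoked() are hypothetical names, and the
real check would live in the kernel's call_rcu() path with whatever warning
mechanism the maintainers choose.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

static struct cb_head *cb_list;
static struct cb_head **cb_tail = &cb_list;
static int invoked;

/* Sample callback that only counts how often it ran. */
static void note_invoked(struct cb_head *head)
{
	(void)head;
	invoked++;
}

/*
 * Hypothetical analogue of call_rcu(): reject a NULL callback at enqueue
 * time, while the offending caller is still on the stack and can be
 * reported. Returns 0 on success, -1 if func is NULL.
 */
static int queue_cb(struct cb_head *head, void (*func)(struct cb_head *head))
{
	if (func == NULL) {
		fprintf(stderr, "queue_cb: NULL callback rejected\n");
		return -1;
	}
	head->func = func;
	head->next = NULL;
	*cb_tail = head;
	cb_tail = &head->next;
	return 0;
}

/*
 * Analogue of the quoted invocation loop. Returns the number of callbacks
 * invoked. Without the check in queue_cb(), a NULL func would only be
 * noticed here, long after the caller that queued it has returned.
 */
static int run_batch(void)
{
	struct cb_head *head = cb_list;
	int count = 0;

	/* Detach the pending list so callbacks may safely queue new work. */
	cb_list = NULL;
	cb_tail = &cb_list;

	while (head != NULL) {
		struct cb_head *next = head->next;
		void (*f)(struct cb_head *) = head->func;

		head->func = NULL;	/* mirrors clearing rhp->func */
		f(head);
		count++;
		head = next;
	}
	return count;
}
```

With this arrangement, queue_cb(&h, NULL) fails immediately in the context of
the buggy caller, whereas the batch runner never sees a NULL function
pointer that was passed in directly. A head whose func is clobbered after
queueing would still not be caught by this check.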
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-02-05  0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot
  2025-02-05 14:56 ` Paul E. McKenney
@ 2025-06-08  6:58 ` syzbot
  2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki
  2 siblings, 0 replies; 20+ messages in thread
From: syzbot @ 2025-06-08 6:58 UTC
To: akpm, ayaanmirza.788, ayaanmirzabaig85, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, luto, paulmck, peterz, rcu, syzkaller-bugs, tglx

syzbot has bisected this issue to:

commit 14152654805256d760315ec24e414363bfa19a06
Author: Kent Overstreet <kent.overstreet@linux.dev>
Date:   Mon Nov 25 05:21:27 2024 +0000

    bcachefs: Bad btree roots are now autofix

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=12fa0a82580000
start commit:   99fa936e8e4f Merge tag 'affs-6.14-rc5-tag' of git://git.ke..
git tree:       upstream
final oops:     https://syzkaller.appspot.com/x/report.txt?x=11fa0a82580000
console output: https://syzkaller.appspot.com/x/log.txt?x=16fa0a82580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=119d35a8580000

Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
Fixes: 141526548052 ("bcachefs: Bad btree roots are now autofix")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-02-05 0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot 2025-02-05 14:56 ` Paul E. McKenney 2025-06-08 6:58 ` [syzbot] [bcachefs?] [rcu?] " syzbot @ 2025-06-11 15:58 ` Uladzislau Rezki 2025-06-11 18:02 ` [syzbot] [bcachefs?] [rcu?] " syzbot 2 siblings, 1 reply; 20+ messages in thread From: Uladzislau Rezki @ 2025-06-11 15:58 UTC (permalink / raw) To: syzbot Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000 > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000 > > Downloadable assets: > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576 > BUG: kernel NULL pointer dereference, address: 0000000000000000 > #PF: supervisor instruction fetch in kernel mode > #PF: 
error_code(0x0010) - not-present page > PGD 0 P4D 0 > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 > RIP: 0010:0x0 > Code: Unable to access opcode bytes at 0xffffffffffffffd6. > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100 > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8 > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507 > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8 > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <IRQ> > rcu_do_batch kernel/rcu/tree.c:2546 [inline] > rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2802 > handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561 > __do_softirq kernel/softirq.c:595 [inline] > invoke_softirq kernel/softirq.c:435 [inline] > __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662 > irq_exit_rcu+0x9/0x30 kernel/softirq.c:678 > instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline] > sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049 > </IRQ> > <TASK> > asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 > RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline] > RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194 > Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 fe 78 2d f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 
00 <e8> c3 0f 95 f5 65 8b 05 d4 58 0b 74 85 c0 74 43 48 c7 04 24 0e 36 > RSP: 0018:ffffc900030fef60 EFLAGS: 00000206 > RAX: 23438dd059a4b100 RBX: 1ffff9200061fdf0 RCX: ffffffff819b316a > RDX: dffffc0000000000 RSI: ffffffff8c0aa680 RDI: 0000000000000001 > RBP: ffffc900030feff8 R08: ffffffff942f9847 R09: 1ffffffff285f308 > R10: dffffc0000000000 R11: fffffbfff285f309 R12: dffffc0000000000 > R13: 1ffff9200061fdec R14: ffffc900030fef80 R15: 0000000000000246 > spin_unlock_irqrestore include/linux/spinlock.h:406 [inline] > rmqueue_bulk mm/page_alloc.c:2329 [inline] > __rmqueue_pcplist+0x21fd/0x2a90 mm/page_alloc.c:3004 > rmqueue_pcplist mm/page_alloc.c:3046 [inline] > rmqueue mm/page_alloc.c:3077 [inline] > get_page_from_freelist+0x886/0x37a0 mm/page_alloc.c:3474 > __alloc_frozen_pages_noprof+0x292/0x710 mm/page_alloc.c:4739 > alloc_pages_mpol+0x311/0x660 mm/mempolicy.c:2270 > folio_alloc_mpol_noprof mm/mempolicy.c:2289 [inline] > vma_alloc_folio_noprof+0x12b/0x260 mm/mempolicy.c:2324 > folio_prealloc+0x2e/0x170 > wp_page_copy mm/memory.c:3435 [inline] > do_wp_page+0x1253/0x49b0 mm/memory.c:3827 > handle_pte_fault mm/memory.c:5905 [inline] > __handle_mm_fault+0x24d5/0x70f0 mm/memory.c:6032 > handle_mm_fault+0x3e5/0x8d0 mm/memory.c:6201 > do_user_addr_fault arch/x86/mm/fault.c:1388 [inline] > handle_page_fault arch/x86/mm/fault.c:1480 [inline] > exc_page_fault+0x2b9/0x8b0 arch/x86/mm/fault.c:1538 > asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623 > RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88 > Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90 > RSP: 0018:ffffc900030fff00 EFLAGS: 00050202 > RAX: 0000000000000005 RBX: 0000000000000000 RCX: 00005555679927d0 > RDX: 0000000000000000 RSI: ffffffff8c0ab8e0 RDI: ffffffff8c608a00 > RBP: ffff888000dfcf20 R08: ffffffff901b5177 R09: 1ffffffff2036a2e > R10: 
dffffc0000000000 R11: fffffbfff2036a2f R12: 0000000000000000 > R13: 0000000000000000 R14: 0000000000000005 R15: dffffc0000000000 > schedule_tail+0x96/0xb0 kernel/sched/core.c:5312 > ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:144 > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 > </TASK> > Modules linked in: > CR2: 0000000000000000 > ---[ end trace 0000000000000000 ]--- > RIP: 0010:0x0 > Code: Unable to access opcode bytes at 0xffffffffffffffd6. > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100 > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8 > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507 > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8 > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > ---------------- > Code disassembly (best guess): > 0: 9c pushf > 1: 8f 44 24 20 pop 0x20(%rsp) > 5: 42 80 3c 23 00 cmpb $0x0,(%rbx,%r12,1) > a: 74 08 je 0x14 > c: 4c 89 f7 mov %r14,%rdi > f: e8 fe 78 2d f6 call 0xf62d7912 > 14: f6 44 24 21 02 testb $0x2,0x21(%rsp) > 19: 75 52 jne 0x6d > 1b: 41 f7 c7 00 02 00 00 test $0x200,%r15d > 22: 74 01 je 0x25 > 24: fb sti > 25: bf 01 00 00 00 mov $0x1,%edi > * 2a: e8 c3 0f 95 f5 call 0xf5950ff2 <-- trapping instruction > 2f: 65 8b 05 d4 58 0b 74 mov %gs:0x740b58d4(%rip),%eax # 0x740b590a > 36: 85 c0 test %eax,%eax > 38: 74 43 je 0x7d > 3a: 48 rex.W > 3b: c7 .byte 0xc7 > 3c: 04 24 add $0x24,%al > 3e: 0e (bad) > 3f: 36 ss > > > --- > This report is generated by a bot. It may contain errors. > See https://goo.gl/tpsmEJ for more information about syzbot. 
> syzbot engineers can be reached at syzkaller@googlegroups.com. > > syzbot will keep track of this issue. See: > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > If the report is already addressed, let syzbot know by replying with: > #syz fix: exact-commit-title > > If you want syzbot to run the reproducer, reply with: > #syz test: git://repo/address.git branch-or-commit-hash > If you attach or paste a git patch, syzbot will apply it before testing. > > If you want to overwrite report's subsystems, reply with: > #syz set subsystems: new-subsystem > (See the list of subsystem names on the web dashboard) > > If the report is a duplicate of another one, reply with: > #syz dup: exact-subject-of-another-report > > If you want to undo deduplication, reply with: > #syz undup > #syz test diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 475f31deed14..b297a32c6779 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3047,6 +3047,10 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in) /* Misaligned rcu_head! */ WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1)); + /* Avoid NULL dereference if callback is NULL. */ + if (WARN_ON_ONCE(!func)) + return; + if (debug_rcu_head_queue(head)) { /* * Probable double call_rcu(), so leak the callback. ^ permalink raw reply related [flat|nested] 20+ messages in thread
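Editorial note on the patch above: the oops shows an instruction fetch at address 0 from rcu_do_batch(), which is what invoking a callback whose function pointer is NULL would look like. By that point the caller that queued the bad callback is long gone from the stack. The proposed guard moves the check to enqueue time, where the offending caller is still visible in the warning's backtrace. What follows is a minimal userspace sketch of that idea, not the kernel implementation: struct cb_head, struct cb_queue, cb_enqueue(), cb_run_batch() and cb_count() are hypothetical names invented for illustration, and the warn-once latch is a simplified stand-in for WARN_ON_ONCE().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct rcu_head: a link plus a callback. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

/* Hypothetical singly linked callback queue with a tail pointer. */
struct cb_queue {
	struct cb_head *first;
	struct cb_head **tail;	/* points at the last ->next, or at ->first */
	size_t len;
};

static void cb_queue_init(struct cb_queue *q)
{
	q->first = NULL;
	q->tail = &q->first;
	q->len = 0;
}

/*
 * Queue a callback. A NULL func is rejected here, at enqueue time,
 * with a single warning, so it can never reach cb_run_batch().
 * Returns true if the callback was queued.
 */
static bool cb_enqueue(struct cb_queue *q, struct cb_head *head,
		       void (*func)(struct cb_head *head))
{
	static bool warned;	/* simplified model of the _ONCE latch */

	if (!func) {
		if (!warned) {
			warned = true;
			fprintf(stderr, "WARNING: NULL callback in cb_enqueue()\n");
		}
		return false;
	}

	head->func = func;
	head->next = NULL;
	*q->tail = head;
	q->tail = &head->next;
	q->len++;
	return true;
}

/* Detach the whole list, then invoke every callback. Returns the count. */
static size_t cb_run_batch(struct cb_queue *q)
{
	struct cb_head *head = q->first;
	size_t invoked = 0;

	cb_queue_init(q);
	while (head) {
		struct cb_head *next = head->next; /* callback may free head */

		head->func(head);
		invoked++;
		head = next;
	}
	return invoked;
}

/* Sample callback that only counts its invocations. */
static int cb_calls;

static void cb_count(struct cb_head *head)
{
	(void)head;
	cb_calls++;
}
```

In this model a rejected head is never invoked, so whatever it was meant to free stays allocated. That trade (a leak plus a warning instead of a crash in softirq context) is the same one the existing "leak the callback" comment in the patch context describes for the double call_rcu() case.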
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki @ 2025-06-11 18:02 ` syzbot 2025-06-11 19:15 ` Uladzislau Rezki 0 siblings, 1 reply; 20+ messages in thread From: syzbot @ 2025-06-11 18:02 UTC (permalink / raw) To: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs, urezki Hello, syzbot tried to test the proposed patch but the build/boot failed: failed to apply patch: checking file kernel/rcu/tree.c patch: **** unexpected end of file in patch Tested on: commit: aef17cb3 Revert "mm/damon/Kconfig: enable CONFIG_DAMON.. git tree: upstream kernel config: https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a compiler: patch: https://syzkaller.appspot.com/x/patch.diff?x=17de99d4580000 ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-06-11 18:02 ` [syzbot] [bcachefs?] [rcu?] " syzbot @ 2025-06-11 19:15 ` Uladzislau Rezki 2025-06-11 19:57 ` syzbot 0 siblings, 1 reply; 20+ messages in thread From: Uladzislau Rezki @ 2025-06-11 19:15 UTC (permalink / raw) To: syzbot Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs, urezki On Wed, Jun 11, 2025 at 11:02:03AM -0700, syzbot wrote: > Hello, > > syzbot tried to test the proposed patch but the build/boot failed: > > failed to apply patch: > checking file kernel/rcu/tree.c > patch: **** unexpected end of file in patch > > > > Tested on: > > commit: aef17cb3 Revert "mm/damon/Kconfig: enable CONFIG_DAMON.. > git tree: upstream > kernel config: https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a > compiler: > patch: https://syzkaller.appspot.com/x/patch.diff?x=17de99d4580000 > #syz test diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e8a4b720d7d2..14d4499c6fc3 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3072,6 +3072,10 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in) /* Misaligned rcu_head! */ WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1)); + /* Avoid NULL dereference if callback is NULL. */ + if (WARN_ON_ONCE(!func)) + return; + if (debug_rcu_head_queue(head)) { /* * Probable double call_rcu(), so leak the callback. ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-06-11 19:15 ` Uladzislau Rezki @ 2025-06-11 19:57 ` syzbot 2025-06-11 20:58 ` Boqun Feng 0 siblings, 1 reply; 20+ messages in thread From: syzbot @ 2025-06-11 19:57 UTC (permalink / raw) To: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs, urezki Hello, syzbot has tested the proposed patch and the reproducer did not trigger any issue: Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com Tested on: commit: 488ef356 KEYS: Invert FINAL_PUT bit git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000 kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94 dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6 patch: https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000 Note: testing is done by a robot and is best-effort only. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-06-11 19:57 ` syzbot @ 2025-06-11 20:58 ` Boqun Feng 2025-06-12 7:42 ` Aleksandr Nogikh 0 siblings, 1 reply; 20+ messages in thread From: Boqun Feng @ 2025-06-11 20:58 UTC (permalink / raw) To: syzbot Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs, urezki On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote: > Hello, > > syzbot has tested the proposed patch and the reproducer did not trigger any issue: > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > Tested on: > > commit: 488ef356 KEYS: Invert FINAL_PUT bit > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000 Is there a way to see the whole console output? If Ulad's patch fixes the exact issue, we should be able to see a WARN_ON_ONCE() triggered. Regards, Boqun > kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94 > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a > compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6 > patch: https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000 > > Note: testing is done by a robot and is best-effort only. > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-06-11 20:58 ` Boqun Feng @ 2025-06-12 7:42 ` Aleksandr Nogikh 2025-06-12 9:37 ` Uladzislau Rezki 0 siblings, 1 reply; 20+ messages in thread From: Aleksandr Nogikh @ 2025-06-12 7:42 UTC (permalink / raw) To: Boqun Feng Cc: syzbot, akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs, urezki On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun.feng@gmail.com> wrote: > > On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote: > > Hello, > > > > syzbot has tested the proposed patch and the reproducer did not trigger any issue: > > > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > > > Tested on: > > > > commit: 488ef356 KEYS: Invert FINAL_PUT bit > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000 > > Is there a way to see the whole console output? If Ulad's patch fixes > the exact issue, we should be able to see a WARN_ON_ONCE() triggered. If WARN_ON_ONCE() were triggered, the associated kernel panic output would have been at the end of this log. > > Regards, > Boqun > > > kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94 > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a FWIW the last time the bug was observed on syzbot was 100 days ago, so it has likely been fixed since then or has become much harder to reproduce. > > compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6 > > patch: https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000 > > > > Note: testing is done by a robot and is best-effort only. > > > -- Aleksandr ^ permalink raw reply [flat|nested] 20+ messages in thread
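Editorial note on the exchange above: the test run can end in three ways, and only one of them says anything about the patch. If the original oops appears, the guard did not help. If the warning appears, the guard caught a NULL callback, and the reply above indicates that on syzbot this would show up as a panic at the end of the log (presumably because warnings are treated as fatal there, which is an assumption here). If neither appears, the reproducer never passed a NULL callback on the tested kernel, so the run is inconclusive about the patch itself. The sketch below models that reasoning in userspace C. All names (struct warn_site, warn_on_once_model(), classify_run(), enum run_verdict) are hypothetical and are not kernel symbols.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-call-site state for a warn-once check. */
struct warn_site {
	bool warned;		/* latched after the first report */
	unsigned int hits;	/* times the condition was true */
	unsigned int reports;	/* reports that actually reached the log */
};

/* Report the first true condition only; always return the condition. */
static bool warn_on_once_model(struct warn_site *site, bool cond,
			       const char *what)
{
	if (cond) {
		site->hits++;
		if (!site->warned) {
			site->warned = true;
			site->reports++;
			fprintf(stderr, "WARNING: %s\n", what);
		}
	}
	return cond;
}

enum run_verdict {
	RUN_BUG_REPRODUCED,	/* original oops seen: guard did not help */
	RUN_GUARD_FIRED,	/* warning seen: guard caught the bad caller */
	RUN_INCONCLUSIVE,	/* neither seen: condition never occurred */
};

/* Interpret a test run from what the console log contains. */
static enum run_verdict classify_run(const struct warn_site *site,
				     bool oops_seen)
{
	if (oops_seen)
		return RUN_BUG_REPRODUCED;
	if (site->reports > 0)
		return RUN_GUARD_FIRED;
	return RUN_INCONCLUSIVE;
}
```

Under this reading, the "reproducer did not trigger any issue" result reported earlier corresponds to RUN_INCONCLUSIVE rather than to a confirmation that the guard fixed the crash.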
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-06-12 7:42 ` Aleksandr Nogikh @ 2025-06-12 9:37 ` Uladzislau Rezki 2025-06-12 17:20 ` Boqun Feng 0 siblings, 1 reply; 20+ messages in thread From: Uladzislau Rezki @ 2025-06-12 9:37 UTC (permalink / raw) To: Aleksandr Nogikh Cc: Boqun Feng, syzbot, akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs, urezki On Thu, Jun 12, 2025 at 09:42:32AM +0200, Aleksandr Nogikh wrote: > On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun.feng@gmail.com> wrote: > > > > On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote: > > > Hello, > > > > > > syzbot has tested the proposed patch and the reproducer did not trigger any issue: > > > > > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com > > > > > > Tested on: > > > > > > commit: 488ef356 KEYS: Invert FINAL_PUT bit > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000 > > > > Is there a way to see the whole console output? If Ulad's patch fixes > > the exact issue, we should be able to see a WARN_ON_ONCE() triggered. > > If WARN_ON_ONCE() were triggered, the associated kernel panic output > would have been at the end of this log. > > > > > Regards, > > Boqun > > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94 > > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a > > FWIW the last time the bug was observed on syzbot was 100 days ago, so > it has likely been fixed since then or has become much harder to > reproduce. > That is even worse, if it is last for 100 days already. -- Uladzislau Rezki ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) 2025-06-12 9:37 ` Uladzislau Rezki @ 2025-06-12 17:20 ` Boqun Feng 0 siblings, 0 replies; 20+ messages in thread From: Boqun Feng @ 2025-06-12 17:20 UTC (permalink / raw) To: Uladzislau Rezki (Sony), Aleksandr Nogikh Cc: syzbot, Andrew Morton, Josh Triplett, kent.overstreet, linux-bcachefs, linux-kernel, linux-mm, Paul E. McKenney, rcu, syzkaller-bugs On Thu, Jun 12, 2025, at 2:37 AM, Uladzislau Rezki wrote: > On Thu, Jun 12, 2025 at 09:42:32AM +0200, Aleksandr Nogikh wrote: >> On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun.feng@gmail.com> wrote: >> > >> > On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote: >> > > Hello, >> > > >> > > syzbot has tested the proposed patch and the reproducer did not trigger any issue: >> > > >> > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com >> > > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com >> > > >> > > Tested on: >> > > >> > > commit: 488ef356 KEYS: Invert FINAL_PUT bit >> > > git tree: upstream >> > > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000 >> > >> > Is there a way to see the whole console output? If Ulad's patch fixes >> > the exact issue, we should be able to see a WARN_ON_ONCE() triggered. >> >> If WARN_ON_ONCE() were triggered, the associated kernel panic output >> would have been at the end of this log. >> >> > >> > Regards, >> > Boqun >> > >> > > kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94 >> > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a >> >> FWIW the last time the bug was observed on syzbot was 100 days ago, so >> it has likely been fixed since then or has become much harder to >> reproduce. >> > That is even worse, if it is last for 100 days already. 
> My understanding is that the evidence shows that the issue that directly caused the null-ptr-deref was fixed 100 days ago. Regards, Boqun > -- > Uladzislau Rezki ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-06-12 17:20 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-02-05 0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot 2025-02-05 14:56 ` Paul E. McKenney 2025-06-08 15:26 ` Kent Overstreet 2025-06-08 18:23 ` Uladzislau Rezki 2025-06-09 0:25 ` Paul E. McKenney 2025-06-09 8:35 ` Uladzislau Rezki 2025-06-09 9:47 ` Paul E. McKenney 2025-06-09 14:20 ` Joel Fernandes 2025-06-10 12:19 ` Uladzislau Rezki 2025-06-09 18:28 ` Vlastimil Babka 2025-06-10 12:33 ` Uladzislau Rezki 2025-06-08 6:58 ` [syzbot] [bcachefs?] [rcu?] " syzbot 2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki 2025-06-11 18:02 ` [syzbot] [bcachefs?] [rcu?] " syzbot 2025-06-11 19:15 ` Uladzislau Rezki 2025-06-11 19:57 ` syzbot 2025-06-11 20:58 ` Boqun Feng 2025-06-12 7:42 ` Aleksandr Nogikh 2025-06-12 9:37 ` Uladzislau Rezki 2025-06-12 17:20 ` Boqun Feng