* [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
@ 2025-02-05 0:34 syzbot
2025-02-05 14:56 ` Paul E. McKenney
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: syzbot @ 2025-02-05 0:34 UTC
To: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
linux-mm, paulmck, rcu, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 0 P4D 0
Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
rcu_do_batch kernel/rcu/tree.c:2546 [inline]
rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2802
handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
__do_softirq kernel/softirq.c:595 [inline]
invoke_softirq kernel/softirq.c:435 [inline]
__irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 fe 78 2d f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 0f 95 f5 65 8b 05 d4 58 0b 74 85 c0 74 43 48 c7 04 24 0e 36
RSP: 0018:ffffc900030fef60 EFLAGS: 00000206
RAX: 23438dd059a4b100 RBX: 1ffff9200061fdf0 RCX: ffffffff819b316a
RDX: dffffc0000000000 RSI: ffffffff8c0aa680 RDI: 0000000000000001
RBP: ffffc900030feff8 R08: ffffffff942f9847 R09: 1ffffffff285f308
R10: dffffc0000000000 R11: fffffbfff285f309 R12: dffffc0000000000
R13: 1ffff9200061fdec R14: ffffc900030fef80 R15: 0000000000000246
spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
rmqueue_bulk mm/page_alloc.c:2329 [inline]
__rmqueue_pcplist+0x21fd/0x2a90 mm/page_alloc.c:3004
rmqueue_pcplist mm/page_alloc.c:3046 [inline]
rmqueue mm/page_alloc.c:3077 [inline]
get_page_from_freelist+0x886/0x37a0 mm/page_alloc.c:3474
__alloc_frozen_pages_noprof+0x292/0x710 mm/page_alloc.c:4739
alloc_pages_mpol+0x311/0x660 mm/mempolicy.c:2270
folio_alloc_mpol_noprof mm/mempolicy.c:2289 [inline]
vma_alloc_folio_noprof+0x12b/0x260 mm/mempolicy.c:2324
folio_prealloc+0x2e/0x170
wp_page_copy mm/memory.c:3435 [inline]
do_wp_page+0x1253/0x49b0 mm/memory.c:3827
handle_pte_fault mm/memory.c:5905 [inline]
__handle_mm_fault+0x24d5/0x70f0 mm/memory.c:6032
handle_mm_fault+0x3e5/0x8d0 mm/memory.c:6201
do_user_addr_fault arch/x86/mm/fault.c:1388 [inline]
handle_page_fault arch/x86/mm/fault.c:1480 [inline]
exc_page_fault+0x2b9/0x8b0 arch/x86/mm/fault.c:1538
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88
Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90
RSP: 0018:ffffc900030fff00 EFLAGS: 00050202
RAX: 0000000000000005 RBX: 0000000000000000 RCX: 00005555679927d0
RDX: 0000000000000000 RSI: ffffffff8c0ab8e0 RDI: ffffffff8c608a00
RBP: ffff888000dfcf20 R08: ffffffff901b5177 R09: 1ffffffff2036a2e
R10: dffffc0000000000 R11: fffffbfff2036a2f R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000005 R15: dffffc0000000000
schedule_tail+0x96/0xb0 kernel/sched/core.c:5312
ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:144
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 9c pushf
1: 8f 44 24 20 pop 0x20(%rsp)
5: 42 80 3c 23 00 cmpb $0x0,(%rbx,%r12,1)
a: 74 08 je 0x14
c: 4c 89 f7 mov %r14,%rdi
f: e8 fe 78 2d f6 call 0xf62d7912
14: f6 44 24 21 02 testb $0x2,0x21(%rsp)
19: 75 52 jne 0x6d
1b: 41 f7 c7 00 02 00 00 test $0x200,%r15d
22: 74 01 je 0x25
24: fb sti
25: bf 01 00 00 00 mov $0x1,%edi
* 2a: e8 c3 0f 95 f5 call 0xf5950ff2 <-- trapping instruction
2f: 65 8b 05 d4 58 0b 74 mov %gs:0x740b58d4(%rip),%eax # 0x740b590a
36: 85 c0 test %eax,%eax
38: 74 43 je 0x7d
3a: 48 rex.W
3b: c7 .byte 0xc7
3c: 04 24 add $0x24,%al
3e: 0e (bad)
3f: 36 ss
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-02-05 0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot
@ 2025-02-05 14:56 ` Paul E. McKenney
2025-06-08 15:26 ` Kent Overstreet
2025-06-08 6:58 ` [syzbot] [bcachefs?] [rcu?] " syzbot
2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki
2 siblings, 1 reply; 20+ messages in thread
From: Paul E. McKenney @ 2025-02-05 14:56 UTC
To: syzbot
Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
linux-mm, rcu, syzkaller-bugs
On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
>
> slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 0 P4D 0
> Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> rcu_do_batch kernel/rcu/tree.c:2546 [inline]
The usual way that this happens is that someone clobbers the rcu_head
structure of something that has been passed to call_rcu(). The most
popular way of clobbering this structure is to pass the same something to
call_rcu() twice in a row, but other creative arrangements are possible.
Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
spot invoking call_rcu() twice in a row.
Thanx, Paul
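To make the clobbering described above concrete, here is a minimal user-space sketch, not kernel code. It assumes a simplified singly linked callback list with a tail pointer; `cb_head`, `cb_list`, `cb_enqueue` and `cb_count` are hypothetical stand-ins chosen for illustration and are not the kernel's rcu_head queueing code. The point is only that queueing unconditionally overwrites the link field, so queueing the same object again corrupts the list.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct rcu_head: a link plus a callback. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *);
};

/* Simplified callback list: head pointer plus pointer to the last ->next. */
struct cb_list {
	struct cb_head *head;
	struct cb_head **tail;
};

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/* Models queueing: it unconditionally overwrites h->next and h->func. */
static void cb_enqueue(struct cb_list *l, struct cb_head *h,
		       void (*func)(struct cb_head *))
{
	h->func = func;
	h->next = NULL;
	*l->tail = h;
	l->tail = &h->next;
}

/* Counts reachable entries, giving up after 'limit' to survive cycles. */
static int cb_count(const struct cb_list *l, int limit)
{
	const struct cb_head *h;
	int n = 0;

	for (h = l->head; h && n < limit; h = h->next)
		n++;
	return n;
}

static void noop_cb(struct cb_head *h)
{
	(void)h;
}

/* Queue a, b, then a again: b drops off the list. Returns reachable count. */
static int demo_requeue_loses_entry(void)
{
	struct cb_list l;
	struct cb_head a, b;

	cb_list_init(&l);
	cb_enqueue(&l, &a, noop_cb);
	cb_enqueue(&l, &b, noop_cb);
	cb_enqueue(&l, &a, noop_cb);	/* second enqueue resets a.next */
	return cb_count(&l, 16);
}

/* Queue a twice in a row: a.next ends up pointing at a itself. */
static int demo_double_enqueue_cycles(void)
{
	struct cb_list l;
	struct cb_head a;

	cb_list_init(&l);
	cb_enqueue(&l, &a, noop_cb);
	cb_enqueue(&l, &a, noop_cb);
	return cb_count(&l, 16) == 16;	/* 1 if the walk never terminated */
}
```

In this model, re-queueing `a` after `b` leaves only one reachable entry, and queueing `a` twice in a row produces a self-loop. The real kernel lists are segmented and more involved, so the exact failure shape there may differ.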
> rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2802
> handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
> __do_softirq kernel/softirq.c:595 [inline]
> invoke_softirq kernel/softirq.c:435 [inline]
> __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
> irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
> instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
> sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049
> </IRQ>
> <TASK>
> asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
> RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
> RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
> Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 fe 78 2d f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 0f 95 f5 65 8b 05 d4 58 0b 74 85 c0 74 43 48 c7 04 24 0e 36
> RSP: 0018:ffffc900030fef60 EFLAGS: 00000206
> RAX: 23438dd059a4b100 RBX: 1ffff9200061fdf0 RCX: ffffffff819b316a
> RDX: dffffc0000000000 RSI: ffffffff8c0aa680 RDI: 0000000000000001
> RBP: ffffc900030feff8 R08: ffffffff942f9847 R09: 1ffffffff285f308
> R10: dffffc0000000000 R11: fffffbfff285f309 R12: dffffc0000000000
> R13: 1ffff9200061fdec R14: ffffc900030fef80 R15: 0000000000000246
> spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
> rmqueue_bulk mm/page_alloc.c:2329 [inline]
> __rmqueue_pcplist+0x21fd/0x2a90 mm/page_alloc.c:3004
> rmqueue_pcplist mm/page_alloc.c:3046 [inline]
> rmqueue mm/page_alloc.c:3077 [inline]
> get_page_from_freelist+0x886/0x37a0 mm/page_alloc.c:3474
> __alloc_frozen_pages_noprof+0x292/0x710 mm/page_alloc.c:4739
> alloc_pages_mpol+0x311/0x660 mm/mempolicy.c:2270
> folio_alloc_mpol_noprof mm/mempolicy.c:2289 [inline]
> vma_alloc_folio_noprof+0x12b/0x260 mm/mempolicy.c:2324
> folio_prealloc+0x2e/0x170
> wp_page_copy mm/memory.c:3435 [inline]
> do_wp_page+0x1253/0x49b0 mm/memory.c:3827
> handle_pte_fault mm/memory.c:5905 [inline]
> __handle_mm_fault+0x24d5/0x70f0 mm/memory.c:6032
> handle_mm_fault+0x3e5/0x8d0 mm/memory.c:6201
> do_user_addr_fault arch/x86/mm/fault.c:1388 [inline]
> handle_page_fault arch/x86/mm/fault.c:1480 [inline]
> exc_page_fault+0x2b9/0x8b0 arch/x86/mm/fault.c:1538
> asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88
> Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90
> RSP: 0018:ffffc900030fff00 EFLAGS: 00050202
> RAX: 0000000000000005 RBX: 0000000000000000 RCX: 00005555679927d0
> RDX: 0000000000000000 RSI: ffffffff8c0ab8e0 RDI: ffffffff8c608a00
> RBP: ffff888000dfcf20 R08: ffffffff901b5177 R09: 1ffffffff2036a2e
> R10: dffffc0000000000 R11: fffffbfff2036a2f R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000005 R15: dffffc0000000000
> schedule_tail+0x96/0xb0 kernel/sched/core.c:5312
> ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:144
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> </TASK>
> Modules linked in:
> CR2: 0000000000000000
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> ----------------
> Code disassembly (best guess):
> 0: 9c pushf
> 1: 8f 44 24 20 pop 0x20(%rsp)
> 5: 42 80 3c 23 00 cmpb $0x0,(%rbx,%r12,1)
> a: 74 08 je 0x14
> c: 4c 89 f7 mov %r14,%rdi
> f: e8 fe 78 2d f6 call 0xf62d7912
> 14: f6 44 24 21 02 testb $0x2,0x21(%rsp)
> 19: 75 52 jne 0x6d
> 1b: 41 f7 c7 00 02 00 00 test $0x200,%r15d
> 22: 74 01 je 0x25
> 24: fb sti
> 25: bf 01 00 00 00 mov $0x1,%edi
> * 2a: e8 c3 0f 95 f5 call 0xf5950ff2 <-- trapping instruction
> 2f: 65 8b 05 d4 58 0b 74 mov %gs:0x740b58d4(%rip),%eax # 0x740b590a
> 36: 85 c0 test %eax,%eax
> 38: 74 43 je 0x7d
> 3a: 48 rex.W
> 3b: c7 .byte 0xc7
> 3c: 04 24 add $0x24,%al
> 3e: 0e (bad)
> 3f: 36 ss
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-02-05 0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot
2025-02-05 14:56 ` Paul E. McKenney
@ 2025-06-08 6:58 ` syzbot
2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki
2 siblings, 0 replies; 20+ messages in thread
From: syzbot @ 2025-06-08 6:58 UTC
To: akpm, ayaanmirza.788, ayaanmirzabaig85, josh, kent.overstreet,
linux-bcachefs, linux-kernel, linux-mm, luto, paulmck, peterz,
rcu, syzkaller-bugs, tglx
syzbot has bisected this issue to:
commit 14152654805256d760315ec24e414363bfa19a06
Author: Kent Overstreet <kent.overstreet@linux.dev>
Date: Mon Nov 25 05:21:27 2024 +0000
bcachefs: Bad btree roots are now autofix
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=12fa0a82580000
start commit: 99fa936e8e4f Merge tag 'affs-6.14-rc5-tag' of git://git.ke..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=11fa0a82580000
console output: https://syzkaller.appspot.com/x/log.txt?x=16fa0a82580000
kernel config: https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119d35a8580000
Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
Fixes: 141526548052 ("bcachefs: Bad btree roots are now autofix")
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-02-05 14:56 ` Paul E. McKenney
@ 2025-06-08 15:26 ` Kent Overstreet
2025-06-08 18:23 ` Uladzislau Rezki
0 siblings, 1 reply; 20+ messages in thread
From: Kent Overstreet @ 2025-06-08 15:26 UTC
To: Paul E. McKenney
Cc: syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu,
syzkaller-bugs
On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> >
> > Downloadable assets:
> > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> >
> > slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> > BUG: kernel NULL pointer dereference, address: 0000000000000000
> > #PF: supervisor instruction fetch in kernel mode
> > #PF: error_code(0x0010) - not-present page
> > PGD 0 P4D 0
> > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > RIP: 0010:0x0
> > Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <IRQ>
> > rcu_do_batch kernel/rcu/tree.c:2546 [inline]
>
> The usual way that this happens is that someone clobbers the rcu_head
> structure of something that has been passed to call_rcu(). The most
> popular way of clobbering this structure is to pass the same something to
> call_rcu() twice in a row, but other creative arrangements are possible.
>
> Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> spot invoking call_rcu() twice in a row.
I don't think it's that - syzbot's .config already has that enabled.
KASAN, too.
And the only place we do call_rcu() is from rcu_pending.c, where we've
got a rearming rcu callback - but we track whether it's outstanding, and
we do all relevant operations with a lock held.
And we only use rcu_pending.c with SRCU, not regular RCU.
We do use kfree_rcu() in a few places (all boring, I expect), but that
doesn't (generally?) use the rcu callback list.
So I'm not sure this is even a bcachefs bug.
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-08 15:26 ` Kent Overstreet
@ 2025-06-08 18:23 ` Uladzislau Rezki
2025-06-09 0:25 ` Paul E. McKenney
2025-06-09 18:28 ` Vlastimil Babka
0 siblings, 2 replies; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-08 18:23 UTC
To: Kent Overstreet, Paul E. McKenney
Cc: Paul E. McKenney, syzbot, akpm, josh, linux-bcachefs,
linux-kernel, linux-mm, rcu, syzkaller-bugs
On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> > >
> > > Downloadable assets:
> > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> > > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> > > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> > > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > >
> > > slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> > > BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > #PF: supervisor instruction fetch in kernel mode
> > > #PF: error_code(0x0010) - not-present page
> > > PGD 0 P4D 0
> > > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> > > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > > RIP: 0010:0x0
> > > Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> > > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> > > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> > > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> > > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> > > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> > > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> > > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > > <IRQ>
> > > rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> >
> > The usual way that this happens is that someone clobbers the rcu_head
> > structure of something that has been passed to call_rcu(). The most
> > popular way of clobbering this structure is to pass the same something to
> > call_rcu() twice in a row, but other creative arrangements are possible.
> >
> > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > spot invoking call_rcu() twice in a row.
>
> I don't think it's that - syzbot's .config already has that enabled.
> KASAN, too.
>
> And the only place we do call_rcu() is from rcu_pending.c, where we've
> got a rearming rcu callback - but we track whether it's outstanding, and
> we do all relevant operations with a lock held.
>
> And we only use rcu_pending.c with SRCU, not regular RCU.
>
> We do use kfree_rcu() in a few places (all boring, I expect), but that
> doesn't (generally?) use the rcu callback list.
>
Right, kvfree_rcu() does not intersect with regular callbacks, it has
its own path.
It looks like the problem is here:
<snip>
f = rhp->func;
debug_rcu_head_callback(rhp);
WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
f(rhp);
<snip>
We do not check whether the callback, "f", is NULL. If it is, the kernel
oopses right away. For example:
call_rcu(&rh, NULL);
@Paul, do you think it makes sense to catch callers that apparently
pass NULL as a callback? To me that looks like the cause of this bug, but
we do not know the source.
Such a check would at least give a stack trace of the caller that passes NULL.
--
Uladzislau Rezki
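The invocation pattern quoted in the snippet above can be modelled in user space to show why the oops itself says so little about the culprit. This is a sketch under stated assumptions: `cb_head`, `invoke_batch` and `count_cb` are hypothetical names, the list is a plain singly linked list, and the NULL test is added here only so the demonstration does not crash. By the time the callback is invoked, only the address of the queued object is known, not who queued it.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *);
};

static int invoked;

static void count_cb(struct cb_head *h)
{
	(void)h;
	invoked++;
}

/*
 * Walks a list of ready callbacks in the same fetch/clear/call order as the
 * quoted snippet. Unlike that snippet, it stops at a NULL callback instead
 * of calling it. Returns the offending entry, or NULL if every entry ran.
 */
static struct cb_head *invoke_batch(struct cb_head *list)
{
	struct cb_head *h, *next;

	for (h = list; h; h = next) {
		void (*f)(struct cb_head *) = h->func;

		next = h->next;
		if (!f)
			return h;	/* calling f here would jump to address 0 */
		h->func = NULL;
		f(h);
	}
	return NULL;
}

/* Three entries, the middle one queued with a NULL callback. */
static int demo_null_callback(void)
{
	struct cb_head c = { NULL, count_cb };
	struct cb_head b = { &c, NULL };
	struct cb_head a = { &b, count_cb };

	invoked = 0;
	return invoke_batch(&a) == &b && invoked == 1;
}
```

The model can identify which entry is bad, but nothing in the entry records the code path that queued it, which is why a check at queueing time is the more informative place to look.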
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-08 18:23 ` Uladzislau Rezki
@ 2025-06-09 0:25 ` Paul E. McKenney
2025-06-09 8:35 ` Uladzislau Rezki
2025-06-09 18:28 ` Vlastimil Babka
1 sibling, 1 reply; 20+ messages in thread
From: Paul E. McKenney @ 2025-06-09 0:25 UTC
To: Uladzislau Rezki
Cc: Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel,
linux-mm, rcu, syzkaller-bugs
On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> > > > git tree: upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> > > >
> > > > Downloadable assets:
> > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> > > > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > > >
> > > > slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> > > > BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > > #PF: supervisor instruction fetch in kernel mode
> > > > #PF: error_code(0x0010) - not-present page
> > > > PGD 0 P4D 0
> > > > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> > > > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > > > RIP: 0010:0x0
> > > > Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> > > > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> > > > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> > > > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> > > > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> > > > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> > > > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> > > > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > Call Trace:
> > > > <IRQ>
> > > > rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> > >
> > > The usual way that this happens is that someone clobbers the rcu_head
> > > structure of something that has been passed to call_rcu(). The most
> > > popular way of clobbering this structure is to pass the same something to
> > > call_rcu() twice in a row, but other creative arrangements are possible.
> > >
> > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > > spot invoking call_rcu() twice in a row.
> >
> > I don't think it's that - syzbot's .config already has that enabled.
> > KASAN, too.
> >
> > And the only place we do call_rcu() is from rcu_pending.c, where we've
> > got a rearming rcu callback - but we track whether it's outstanding, and
> > we do all relevant operations with a lock held.
> >
> > And we only use rcu_pending.c with SRCU, not regular RCU.
> >
> > We do use kfree_rcu() in a few places (all boring, I expect), but that
> > doesn't (generally?) use the rcu callback list.
> >
> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> its own path.
>
> It looks like the problem is here:
>
> <snip>
> f = rhp->func;
> debug_rcu_head_callback(rhp);
> WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> f(rhp);
> <snip>
>
> we do not check if callback, "f", is a NULL. If it is, the kernel bug
> is triggered right away. For example:
>
> call_rcu(&rh, NULL);
>
> @Paul, do you think it makes sense to narrow callers which apparently
> pass NULL as a callback? To me it seems the case of this bug. But we
> do not know the source.
>
> It would give at least a stack-trace of caller which passes a NULL.
Adding a check for NULL func passed to __call_rcu_common(), you mean?
That wouldn't hurt, and would either (as you say) catch the culprit
or show that the problem is elsewhere.
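Purely as an illustration of the idea (a userspace model, not kernel code): the check rejects a NULL callback at enqueue time, while the offending caller is still on the stack, instead of letting the batch loop jump to address 0 later. The names model_call_rcu(), model_do_batch(), struct cb_list, struct counted and count_cb() are hypothetical, and the single-threaded global list is an assumption made to keep the sketch small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel types; the names mirror the thread. */
struct rcu_head;
typedef void (*rcu_callback_t)(struct rcu_head *head);

struct rcu_head {
	struct rcu_head *next;
	rcu_callback_t func;
};

/* Hypothetical single-threaded callback list (assumption of this sketch). */
struct cb_list {
	struct rcu_head *head;
	struct rcu_head **tail;
	unsigned long rejected;	/* number of NULL callbacks refused */
};

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->rejected = 0;
}

/*
 * Model of the proposed check: refuse a NULL callback at enqueue time
 * and warn only once, so the report points at the caller.
 */
static bool model_call_rcu(struct cb_list *l, struct rcu_head *rhp,
			   rcu_callback_t func)
{
	static bool warned;

	assert(l && rhp);
	if (!func) {
		if (!warned) {
			warned = true;
			fprintf(stderr, "model_call_rcu: NULL callback rejected\n");
		}
		l->rejected++;
		return false;
	}
	rhp->func = func;
	rhp->next = NULL;
	*l->tail = rhp;
	l->tail = &rhp->next;
	return true;
}

/* Model of the quoted invocation loop: f is called unconditionally. */
static unsigned long model_do_batch(struct cb_list *l)
{
	struct rcu_head *rhp = l->head;
	unsigned long invoked = 0;

	/* Detach the whole list first so callbacks may enqueue again. */
	l->head = NULL;
	l->tail = &l->head;

	while (rhp) {
		struct rcu_head *next = rhp->next;
		rcu_callback_t f = rhp->func;

		rhp->func = NULL;
		f(rhp);		/* would fault here if f were NULL */
		invoked++;
		rhp = next;
	}
	return invoked;
}

/* Example object and callback; rh is the first member, so the cast is valid. */
struct counted {
	struct rcu_head rh;
	int hits;
};

static void count_cb(struct rcu_head *rhp)
{
	struct counted *c = (struct counted *)rhp;

	c->hits++;
}
```

With this shape, a NULL callback never reaches the batch loop, and the one-time warning is emitted from the enqueue path rather than from softirq context.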
Thanx, Paul
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-09 0:25 ` Paul E. McKenney
@ 2025-06-09 8:35 ` Uladzislau Rezki
2025-06-09 9:47 ` Paul E. McKenney
0 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-09 8:35 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Uladzislau Rezki, Kent Overstreet, syzbot, akpm, josh,
linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs
On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
> On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> > > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot found the following issue on:
> > > > >
> > > > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> > > > > git tree: upstream
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> > > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> > > > >
> > > > > Downloadable assets:
> > > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> > > > > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> > > > > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> > > > >
> > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > > > >
> > > > > slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> > > > > BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > > > #PF: supervisor instruction fetch in kernel mode
> > > > > #PF: error_code(0x0010) - not-present page
> > > > > PGD 0 P4D 0
> > > > > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> > > > > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > > > > RIP: 0010:0x0
> > > > > Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> > > > > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> > > > > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> > > > > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> > > > > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> > > > > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> > > > > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> > > > > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > > Call Trace:
> > > > > <IRQ>
> > > > > rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> > > >
> > > > The usual way that this happens is that someone clobbers the rcu_head
> > > > structure of something that has been passed to call_rcu(). The most
> > > > popular way of clobbering this structure is to pass the same something to
> > > > call_rcu() twice in a row, but other creative arrangements are possible.
> > > >
> > > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > > > spot invoking call_rcu() twice in a row.
> > >
> > > I don't think it's that - syzbot's .config already has that enabled.
> > > KASAN, too.
> > >
> > > And the only place we do call_rcu() is from rcu_pending.c, where we've
> > > got a rearming rcu callback - but we track whether it's outstanding, and
> > > we do all relevant operations with a lock held.
> > >
> > > And we only use rcu_pending.c with SRCU, not regular RCU.
> > >
> > > We do use kfree_rcu() in a few places (all boring, I expect), but that
> > > doesn't (generally?) use the rcu callback list.
> > >
> > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > its own path.
> >
> > It looks like the problem is here:
> >
> > <snip>
> > f = rhp->func;
> > debug_rcu_head_callback(rhp);
> > WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> > f(rhp);
> > <snip>
> >
> > we do not check if callback, "f", is a NULL. If it is, the kernel bug
> > is triggered right away. For example:
> >
> > call_rcu(&rh, NULL);
> >
> > @Paul, do you think it makes sense to narrow callers which apparently
> > pass NULL as a callback? To me it seems the case of this bug. But we
> > do not know the source.
> >
> > It would give at least a stack-trace of caller which passes a NULL.
>
> Adding a check for NULL func passed to __call_rcu_common(), you mean?
>
Yes. Currently there is no check at all, so passing NULL just triggers
a kernel panic.
>
> That wouldn't hurt, and would either (as you say) catch the culprit
> or show that the problem is elsewhere.
>
I can add it then and send out the patch if no objections.
--
Uladzislau Rezki
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-09 8:35 ` Uladzislau Rezki
@ 2025-06-09 9:47 ` Paul E. McKenney
2025-06-09 14:20 ` Joel Fernandes
0 siblings, 1 reply; 20+ messages in thread
From: Paul E. McKenney @ 2025-06-09 9:47 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel,
linux-mm, rcu, syzkaller-bugs
On Mon, Jun 09, 2025 at 10:35:34AM +0200, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
> > On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> > > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> > > > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > > > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > > > > Hello,
> > > > > >
> > > > > > syzbot found the following issue on:
> > > > > >
> > > > > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> > > > > > git tree: upstream
> > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> > > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> > > > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> > > > > >
> > > > > > Downloadable assets:
> > > > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> > > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> > > > > > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> > > > > > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> > > > > >
> > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > > > > >
> > > > > > slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > > > > #PF: supervisor instruction fetch in kernel mode
> > > > > > #PF: error_code(0x0010) - not-present page
> > > > > > PGD 0 P4D 0
> > > > > > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> > > > > > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> > > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > > > > > RIP: 0010:0x0
> > > > > > Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> > > > > > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> > > > > > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> > > > > > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> > > > > > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> > > > > > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> > > > > > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> > > > > > FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> > > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > > > Call Trace:
> > > > > > <IRQ>
> > > > > > rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> > > > >
> > > > > The usual way that this happens is that someone clobbers the rcu_head
> > > > > structure of something that has been passed to call_rcu(). The most
> > > > > popular way of clobbering this structure is to pass the same something to
> > > > > call_rcu() twice in a row, but other creative arrangements are possible.
> > > > >
> > > > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > > > > spot invoking call_rcu() twice in a row.
> > > >
> > > > I don't think it's that - syzbot's .config already has that enabled.
> > > > KASAN, too.
> > > >
> > > > And the only place we do call_rcu() is from rcu_pending.c, where we've
> > > > got a rearming rcu callback - but we track whether it's outstanding, and
> > > > we do all relevant operations with a lock held.
> > > >
> > > > And we only use rcu_pending.c with SRCU, not regular RCU.
> > > >
> > > > We do use kfree_rcu() in a few places (all boring, I expect), but that
> > > > doesn't (generally?) use the rcu callback list.
> > > >
> > > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > > its own path.
> > >
> > > It looks like the problem is here:
> > >
> > > <snip>
> > > f = rhp->func;
> > > debug_rcu_head_callback(rhp);
> > > WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> > > f(rhp);
> > > <snip>
> > >
> > > we do not check if callback, "f", is a NULL. If it is, the kernel bug
> > > is triggered right away. For example:
> > >
> > > call_rcu(&rh, NULL);
> > >
> > > @Paul, do you think it makes sense to narrow callers which apparently
> > > pass NULL as a callback? To me it seems the case of this bug. But we
> > > do not know the source.
> > >
> > > It would give at least a stack-trace of caller which passes a NULL.
> >
> > Adding a check for NULL func passed to __call_rcu_common(), you mean?
> >
> Yes. Currently there is no check at all, so passing NULL just triggers
> a kernel panic.
>
> >
> > That wouldn't hurt, and would either (as you say) catch the culprit
> > or show that the problem is elsewhere.
> >
> I can add it then and send out the patch if no objections.
No objections from me!
Thanx, Paul
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-09 9:47 ` Paul E. McKenney
@ 2025-06-09 14:20 ` Joel Fernandes
2025-06-10 12:19 ` Uladzislau Rezki
0 siblings, 1 reply; 20+ messages in thread
From: Joel Fernandes @ 2025-06-09 14:20 UTC (permalink / raw)
To: paulmck, Uladzislau Rezki
Cc: Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel,
linux-mm, rcu, syzkaller-bugs
On 6/9/2025 5:47 AM, Paul E. McKenney wrote:
> On Mon, Jun 09, 2025 at 10:35:34AM +0200, Uladzislau Rezki wrote:
>> On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
>>> On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
>>>> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
>>>>> On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
>>>>>> On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> syzbot found the following issue on:
>>>>>>>
>>>>>>> HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
>>>>>>> git tree: upstream
>>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
>>>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
>>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
>>>>>>> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>>>>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
>>>>>>>
>>>>>>> Downloadable assets:
>>>>>>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
>>>>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
>>>>>>> kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
>>>>>>> mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
>>>>>>>
>>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>>> Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
>>>>>>>
>>>>>>> slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
>>>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000000
>>>>>>> #PF: supervisor instruction fetch in kernel mode
>>>>>>> #PF: error_code(0x0010) - not-present page
>>>>>>> PGD 0 P4D 0
>>>>>>> Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
>>>>>>> CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
>>>>>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
>>>>>>> RIP: 0010:0x0
>>>>>>> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
>>>>>>> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
>>>>>>> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
>>>>>>> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
>>>>>>> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
>>>>>>> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
>>>>>>> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
>>>>>>> FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
>>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
>>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>>>>> Call Trace:
>>>>>>> <IRQ>
>>>>>>> rcu_do_batch kernel/rcu/tree.c:2546 [inline]
>>>>>>
>>>>>> The usual way that this happens is that someone clobbers the rcu_head
>>>>>> structure of something that has been passed to call_rcu(). The most
>>>>>> popular way of clobbering this structure is to pass the same something to
>>>>>> call_rcu() twice in a row, but other creative arrangements are possible.
>>>>>>
>>>>>> Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
>>>>>> spot invoking call_rcu() twice in a row.
>>>>>
>>>>> I don't think it's that - syzbot's .config already has that enabled.
>>>>> KASAN, too.
>>>>>
>>>>> And the only place we do call_rcu() is from rcu_pending.c, where we've
>>>>> got a rearming rcu callback - but we track whether it's outstanding, and
>>>>> we do all relevant operations with a lock held.
>>>>>
>>>>> And we only use rcu_pending.c with SRCU, not regular RCU.
>>>>>
>>>>> We do use kfree_rcu() in a few places (all boring, I expect), but that
>>>>> doesn't (generally?) use the rcu callback list.
>>>>>
>>>> Right, kvfree_rcu() does not intersect with regular callbacks, it has
>>>> its own path.
>>>>
>>>> It looks like the problem is here:
>>>>
>>>> <snip>
>>>> f = rhp->func;
>>>> debug_rcu_head_callback(rhp);
>>>> WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
>>>> f(rhp);
>>>> <snip>
>>>>
>>>> we do not check if callback, "f", is a NULL. If it is, the kernel bug
>>>> is triggered right away. For example:
>>>>
>>>> call_rcu(&rh, NULL);
>>>>
>>>> @Paul, do you think it makes sense to narrow callers which apparently
>>>> pass NULL as a callback? To me it seems the case of this bug. But we
>>>> do not know the source.
>>>>
>>>> It would give at least a stack-trace of caller which passes a NULL.
>>>
>>> Adding a check for NULL func passed to __call_rcu_common(), you mean?
>>>
>> Yes. Currently there is no check at all, so passing NULL just triggers
>> a kernel panic.
>>
>>>
>>> That wouldn't hurt, and would either (as you say) catch the culprit
>>> or show that the problem is elsewhere.
>>>
>> I can add it then and send out the patch if no objections.
>
> No objections from me!
Me neither! And I can push that into an -rc release as well once I have it
(since it is related to a potential bug).
thanks,
- Joel
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-08 18:23 ` Uladzislau Rezki
2025-06-09 0:25 ` Paul E. McKenney
@ 2025-06-09 18:28 ` Vlastimil Babka
2025-06-10 12:33 ` Uladzislau Rezki
1 sibling, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2025-06-09 18:28 UTC (permalink / raw)
To: Uladzislau Rezki, Kent Overstreet, Paul E. McKenney
Cc: syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu,
syzkaller-bugs
On 6/8/25 20:23, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
>>
>> I don't think it's that - syzbot's .config already has that enabled.
>> KASAN, too.
>>
>> And the only place we do call_rcu() is from rcu_pending.c, where we've
>> got a rearming rcu callback - but we track whether it's outstanding, and
>> we do all relevant operations with a lock held.
>>
>> And we only use rcu_pending.c with SRCU, not regular RCU.
>>
>> We do use kfree_rcu() in a few places (all boring, I expect), but that
>> doesn't (generally?) use the rcu callback list.
>>
> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> its own path.
You mean due to the batching? Maybe the batching should be disabled with
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y if it prevents it from detecting issues?
Otherwise we now have kvfree_rcu_cb() so the special handling of
kvfree_rcu() is gone in the non-batching case.
> It looks like the problem is here:
>
> <snip>
> f = rhp->func;
> debug_rcu_head_callback(rhp);
> WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> f(rhp);
> <snip>
>
> we do not check if callback, "f", is a NULL. If it is, the kernel bug
> is triggered right away. For example:
>
> call_rcu(&rh, NULL);
>
> @Paul, do you think it makes sense to narrow callers which apparently
> pass NULL as a callback? To me it seems the case of this bug. But we
> do not know the source.
>
> It would give at least a stack-trace of caller which passes a NULL.
Right, AFAIU this kind of check is now possible, previously NULL was being
interpreted as a valid __is_kvfree_rcu_offset() (i.e. rcu_head at offset 0).
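To illustrate why a NULL check could not be applied while the offset encoding was in use, here is a simplified userspace model. It assumes, as an approximation of the older scheme, that the callback slot carried the byte offset of the rcu_head within its enclosing object and that small values were treated as offsets; the 4096 limit and all model_* names are assumptions of this sketch, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct rcu_head;
typedef void (*rcu_callback_t)(struct rcu_head *head);

struct rcu_head {
	struct rcu_head *next;
	rcu_callback_t func;
};

/* Assumed threshold for this model: smaller values are offsets, not code. */
#define MODEL_KVFREE_OFFSET_LIMIT 4096u

/* True if the value stored in the callback slot looks like an offset. */
static bool model_is_kvfree_offset(uintptr_t v)
{
	return v < MODEL_KVFREE_OFFSET_LIMIT;
}

/* Encode the offset of the rcu_head in its enclosing object as a "callback". */
static rcu_callback_t model_encode_offset(size_t offset)
{
	assert(offset < MODEL_KVFREE_OFFSET_LIMIT);
	return (rcu_callback_t)(uintptr_t)offset;
}

/* Recover the start of the enclosing object from the encoded offset. */
static void *model_object_start(struct rcu_head *rhp)
{
	return (char *)rhp - (uintptr_t)rhp->func;
}

/* rcu_head at offset 0: the encoded value is indistinguishable from NULL. */
struct head_first {
	struct rcu_head rh;
	int payload;
};

/* rcu_head at a non-zero offset: the encoded value is small but not NULL. */
struct head_later {
	int payload;
	struct rcu_head rh;
};
```

In this model an object of type struct head_first legitimately produces a callback value of 0, so rejecting NULL at enqueue time would also have rejected a valid request. Once the offset is no longer carried in the callback slot, NULL has no valid meaning and can be refused.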
> --
> Uladzislau Rezki
>
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-09 14:20 ` Joel Fernandes
@ 2025-06-10 12:19 ` Uladzislau Rezki
0 siblings, 0 replies; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-10 12:19 UTC (permalink / raw)
To: Joel Fernandes
Cc: paulmck, Uladzislau Rezki, Kent Overstreet, syzbot, akpm, josh,
linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs
On Mon, Jun 09, 2025 at 10:20:58AM -0400, Joel Fernandes wrote:
>
>
> On 6/9/2025 5:47 AM, Paul E. McKenney wrote:
> > On Mon, Jun 09, 2025 at 10:35:34AM +0200, Uladzislau Rezki wrote:
> >> On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
> >>> On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> >>>> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> >>>>> On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> >>>>>> On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> syzbot found the following issue on:
> >>>>>>>
> >>>>>>> HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> >>>>>>> git tree: upstream
> >>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> >>>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> >>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> >>>>>>> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >>>>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> >>>>>>>
> >>>>>>> Downloadable assets:
> >>>>>>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> >>>>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> >>>>>>> kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> >>>>>>> mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> >>>>>>>
> >>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>>>>> Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> >>>>>>>
> >>>>>>> slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> >>>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000000
> >>>>>>> #PF: supervisor instruction fetch in kernel mode
> >>>>>>> #PF: error_code(0x0010) - not-present page
> >>>>>>> PGD 0 P4D 0
> >>>>>>> Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> >>>>>>> CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> >>>>>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> >>>>>>> RIP: 0010:0x0
> >>>>>>> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> >>>>>>> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> >>>>>>> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> >>>>>>> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> >>>>>>> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> >>>>>>> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> >>>>>>> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> >>>>>>> FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> >>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>>>> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> >>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>>>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >>>>>>> Call Trace:
> >>>>>>> <IRQ>
> >>>>>>> rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> >>>>>>
> >>>>>> The usual way that this happens is that someone clobbers the rcu_head
> >>>>>> structure of something that has been passed to call_rcu(). The most
> >>>>>> popular way of clobbering this structure is to pass the same something to
> >>>>>> call_rcu() twice in a row, but other creative arrangements are possible.
> >>>>>>
> >>>>>> Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> >>>>>> spot invoking call_rcu() twice in a row.
> >>>>>
> >>>>> I don't think it's that - syzbot's .config already has that enabled.
> >>>>> KASAN, too.
> >>>>>
> >>>>> And the only place we do call_rcu() is from rcu_pending.c, where we've
> >>>>> got a rearming rcu callback - but we track whether it's outstanding, and
> >>>>> we do all relevant operations with a lock held.
> >>>>>
> >>>>> And we only use rcu_pending.c with SRCU, not regular RCU.
> >>>>>
> >>>>> We do use kfree_rcu() in a few places (all boring, I expect), but that
> >>>>> doesn't (generally?) use the rcu callback list.
> >>>>>
> >>>> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> >>>> its own path.
> >>>>
> >>>> It looks like the problem is here:
> >>>>
> >>>> <snip>
> >>>> f = rhp->func;
> >>>> debug_rcu_head_callback(rhp);
> >>>> WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> >>>> f(rhp);
> >>>> <snip>
> >>>>
> >>>> we do not check if callback, "f", is a NULL. If it is, the kernel bug
> >>>> is triggered right away. For example:
> >>>>
> >>>> call_rcu(&rh, NULL);
> >>>>
> >>>> @Paul, do you think it makes sense to narrow callers which apparently
> >>>> pass NULL as a callback? To me it seems the case of this bug. But we
> >>>> do not know the source.
> >>>>
> >>>> It would give at least a stack-trace of caller which passes a NULL.
> >>>
> >>> Adding a check for NULL func passed to __call_rcu_common(), you mean?
> >>>
> >> Yes. Currently there is no check at all, so passing NULL just triggers
> >> a kernel panic.
> >>
> >>>
> >>> That wouldn't hurt, and would either (as you say) catch the culprit
> >>> or show that the problem is elsewhere.
> >>>
> >> I can add it then and send out the patch if no objections.
> >
> > No objections from me!
>
> Me neither! And I can push that into an -rc release as well once I have it
> (since it is related to a potential bug).
>
I will prepare it and send it out today.
--
Uladzislau Rezki
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-09 18:28 ` Vlastimil Babka
@ 2025-06-10 12:33 ` Uladzislau Rezki
0 siblings, 0 replies; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-10 12:33 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Uladzislau Rezki, Kent Overstreet, Paul E. McKenney, syzbot, akpm,
josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs
On Mon, Jun 09, 2025 at 08:28:56PM +0200, Vlastimil Babka wrote:
> On 6/8/25 20:23, Uladzislau Rezki wrote:
> > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> >>
> >> I don't think it's that - syzbot's .config already has that enabled.
> >> KASAN, too.
> >>
> >> And the only place we do call_rcu() is from rcu_pending.c, where we've
> >> got a rearming rcu callback - but we track whether it's outstanding, and
> >> we do all relevant operations with a lock held.
> >>
> >> And we only use rcu_pending.c with SRCU, not regular RCU.
> >>
> >> We do use kfree_rcu() in a few places (all boring, I expect), but that
> >> doesn't (generally?) use the rcu callback list.
> >>
> > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > its own path.
>
> You mean due to the batching? Maybe the batching should be disabled with
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y if it prevents it from detecting issues?
> Otherwise we now have kvfree_rcu_cb() so the special handling of
> kvfree_rcu() is gone in the non-batching case.
>
Not really. I meant that the call_rcu() API does not check whether the
callback passed to it, which is executed after a GP, is NULL. If it is, we
get the bug about dereferencing a NULL pointer.
Since the callback is invoked from the rcu_core() context, we cannot
identify the caller in order to blame someone :)
As for batching, we do support CONFIG_DEBUG_OBJECTS_RCU_HEAD there. It
helps to identify double-freeing and probably leaking.
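As a toy illustration of how state tracking can flag a head that is queued twice before its callback has run, here is a small userspace model. It is not the debugobjects API: struct head_tracker, tracker_queue(), tracker_unqueue() and the fixed-size table are all hypothetical and exist only to show the principle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rcu_head;
typedef void (*rcu_callback_t)(struct rcu_head *head);

struct rcu_head {
	struct rcu_head *next;
	rcu_callback_t func;
};

#define TRACK_MAX 16

enum track_result {
	TRACK_OK,	/* head was not queued before, now tracked */
	TRACK_DOUBLE,	/* head is already queued: likely double call */
	TRACK_FULL,	/* toy table is full, head is not tracked */
};

/* Toy tracker: remembers heads that are queued and not yet invoked. */
struct head_tracker {
	const struct rcu_head *active[TRACK_MAX];
	size_t nr_active;
	unsigned long double_queue_reports;
};

static bool tracker_is_active(const struct head_tracker *t,
			      const struct rcu_head *rhp)
{
	for (size_t i = 0; i < t->nr_active; i++)
		if (t->active[i] == rhp)
			return true;
	return false;
}

/* Called at enqueue time, before the head is linked into any list. */
static enum track_result tracker_queue(struct head_tracker *t,
				       const struct rcu_head *rhp)
{
	assert(t && rhp);
	if (tracker_is_active(t, rhp)) {
		t->double_queue_reports++;
		return TRACK_DOUBLE;
	}
	if (t->nr_active == TRACK_MAX)
		return TRACK_FULL;
	t->active[t->nr_active++] = rhp;
	return TRACK_OK;
}

/* Called just before the callback is invoked; the head may be reused after. */
static void tracker_unqueue(struct head_tracker *t,
			    const struct rcu_head *rhp)
{
	for (size_t i = 0; i < t->nr_active; i++) {
		if (t->active[i] == rhp) {
			/* Swap the last entry into the freed slot. */
			t->active[i] = t->active[--t->nr_active];
			return;
		}
	}
}
```

A tracker of this kind catches a second enqueue of the same head, but it says nothing about a NULL callback, which is why a separate enqueue-time check is still useful.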
> > It looks like the problem is here:
> >
> > <snip>
> > f = rhp->func;
> > debug_rcu_head_callback(rhp);
> > WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> > f(rhp);
> > <snip>
> >
> > we do not check if callback, "f", is a NULL. If it is, the kernel bug
> > is triggered right away. For example:
> >
> > call_rcu(&rh, NULL);
> >
> > @Paul, do you think it makes sense to narrow callers which apparently
> > pass NULL as a callback? To me it seems the case of this bug. But we
> > do not know the source.
> >
> > It would give at least a stack-trace of caller which passes a NULL.
>
> Right, AFAIU this kind of check is now possible, previously NULL was being
> interpreted as a valid __is_kvfree_rcu_offset() (i.e. rcu_head at offset 0).
>
> > --
> > Uladzislau Rezki
> >
>
* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-02-05 0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot
2025-02-05 14:56 ` Paul E. McKenney
2025-06-08 6:58 ` [syzbot] [bcachefs?] [rcu?] " syzbot
@ 2025-06-11 15:58 ` Uladzislau Rezki
2025-06-11 18:02 ` [syzbot] [bcachefs?] [rcu?] " syzbot
2 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-11 15:58 UTC (permalink / raw)
To: syzbot
Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
linux-mm, paulmck, rcu, syzkaller-bugs
On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
>
> slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 0 P4D 0
> Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2802
> handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
> __do_softirq kernel/softirq.c:595 [inline]
> invoke_softirq kernel/softirq.c:435 [inline]
> __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
> irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
> instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
> sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049
> </IRQ>
> <TASK>
> asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
> RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
> RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
> Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 fe 78 2d f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 0f 95 f5 65 8b 05 d4 58 0b 74 85 c0 74 43 48 c7 04 24 0e 36
> RSP: 0018:ffffc900030fef60 EFLAGS: 00000206
> RAX: 23438dd059a4b100 RBX: 1ffff9200061fdf0 RCX: ffffffff819b316a
> RDX: dffffc0000000000 RSI: ffffffff8c0aa680 RDI: 0000000000000001
> RBP: ffffc900030feff8 R08: ffffffff942f9847 R09: 1ffffffff285f308
> R10: dffffc0000000000 R11: fffffbfff285f309 R12: dffffc0000000000
> R13: 1ffff9200061fdec R14: ffffc900030fef80 R15: 0000000000000246
> spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
> rmqueue_bulk mm/page_alloc.c:2329 [inline]
> __rmqueue_pcplist+0x21fd/0x2a90 mm/page_alloc.c:3004
> rmqueue_pcplist mm/page_alloc.c:3046 [inline]
> rmqueue mm/page_alloc.c:3077 [inline]
> get_page_from_freelist+0x886/0x37a0 mm/page_alloc.c:3474
> __alloc_frozen_pages_noprof+0x292/0x710 mm/page_alloc.c:4739
> alloc_pages_mpol+0x311/0x660 mm/mempolicy.c:2270
> folio_alloc_mpol_noprof mm/mempolicy.c:2289 [inline]
> vma_alloc_folio_noprof+0x12b/0x260 mm/mempolicy.c:2324
> folio_prealloc+0x2e/0x170
> wp_page_copy mm/memory.c:3435 [inline]
> do_wp_page+0x1253/0x49b0 mm/memory.c:3827
> handle_pte_fault mm/memory.c:5905 [inline]
> __handle_mm_fault+0x24d5/0x70f0 mm/memory.c:6032
> handle_mm_fault+0x3e5/0x8d0 mm/memory.c:6201
> do_user_addr_fault arch/x86/mm/fault.c:1388 [inline]
> handle_page_fault arch/x86/mm/fault.c:1480 [inline]
> exc_page_fault+0x2b9/0x8b0 arch/x86/mm/fault.c:1538
> asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88
> Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90
> RSP: 0018:ffffc900030fff00 EFLAGS: 00050202
> RAX: 0000000000000005 RBX: 0000000000000000 RCX: 00005555679927d0
> RDX: 0000000000000000 RSI: ffffffff8c0ab8e0 RDI: ffffffff8c608a00
> RBP: ffff888000dfcf20 R08: ffffffff901b5177 R09: 1ffffffff2036a2e
> R10: dffffc0000000000 R11: fffffbfff2036a2f R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000005 R15: dffffc0000000000
> schedule_tail+0x96/0xb0 kernel/sched/core.c:5312
> ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:144
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> </TASK>
> Modules linked in:
> CR2: 0000000000000000
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> ----------------
> Code disassembly (best guess):
> 0: 9c pushf
> 1: 8f 44 24 20 pop 0x20(%rsp)
> 5: 42 80 3c 23 00 cmpb $0x0,(%rbx,%r12,1)
> a: 74 08 je 0x14
> c: 4c 89 f7 mov %r14,%rdi
> f: e8 fe 78 2d f6 call 0xf62d7912
> 14: f6 44 24 21 02 testb $0x2,0x21(%rsp)
> 19: 75 52 jne 0x6d
> 1b: 41 f7 c7 00 02 00 00 test $0x200,%r15d
> 22: 74 01 je 0x25
> 24: fb sti
> 25: bf 01 00 00 00 mov $0x1,%edi
> * 2a: e8 c3 0f 95 f5 call 0xf5950ff2 <-- trapping instruction
> 2f: 65 8b 05 d4 58 0b 74 mov %gs:0x740b58d4(%rip),%eax # 0x740b590a
> 36: 85 c0 test %eax,%eax
> 38: 74 43 je 0x7d
> 3a: 48 rex.W
> 3b: c7 .byte 0xc7
> 3c: 04 24 add $0x24,%al
> 3e: 0e (bad)
> 3f: 36 ss
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup
>
#syz test
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 475f31deed14..b297a32c6779 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3047,6 +3047,10 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in)
 	/* Misaligned rcu_head! */
 	WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));
 
+	/* Avoid NULL dereference if callback is NULL. */
+	if (WARN_ON_ONCE(!func))
+		return;
+
 	if (debug_rcu_head_queue(head)) {
 		/*
 		 * Probable double call_rcu(), so leak the callback.
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki
@ 2025-06-11 18:02 ` syzbot
2025-06-11 19:15 ` Uladzislau Rezki
0 siblings, 1 reply; 20+ messages in thread
From: syzbot @ 2025-06-11 18:02 UTC (permalink / raw)
To: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
linux-mm, paulmck, rcu, syzkaller-bugs, urezki
Hello,
syzbot tried to test the proposed patch but the build/boot failed:
failed to apply patch:
checking file kernel/rcu/tree.c
patch: **** unexpected end of file in patch
Tested on:
commit: aef17cb3 Revert "mm/damon/Kconfig: enable CONFIG_DAMON..
git tree: upstream
kernel config: https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
compiler:
patch: https://syzkaller.appspot.com/x/patch.diff?x=17de99d4580000
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-11 18:02 ` [syzbot] [bcachefs?] [rcu?] " syzbot
@ 2025-06-11 19:15 ` Uladzislau Rezki
2025-06-11 19:57 ` syzbot
0 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-11 19:15 UTC (permalink / raw)
To: syzbot
Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
linux-mm, paulmck, rcu, syzkaller-bugs, urezki
On Wed, Jun 11, 2025 at 11:02:03AM -0700, syzbot wrote:
> Hello,
>
> syzbot tried to test the proposed patch but the build/boot failed:
>
> failed to apply patch:
> checking file kernel/rcu/tree.c
> patch: **** unexpected end of file in patch
>
>
>
> Tested on:
>
> commit: aef17cb3 Revert "mm/damon/Kconfig: enable CONFIG_DAMON..
> git tree: upstream
> kernel config: https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a
> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> compiler:
> patch: https://syzkaller.appspot.com/x/patch.diff?x=17de99d4580000
>
#syz test
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e8a4b720d7d2..14d4499c6fc3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3072,6 +3072,10 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in)
 	/* Misaligned rcu_head! */
 	WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));
 
+	/* Avoid NULL dereference if callback is NULL. */
+	if (WARN_ON_ONCE(!func))
+		return;
+
 	if (debug_rcu_head_queue(head)) {
 		/*
 		 * Probable double call_rcu(), so leak the callback.
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-11 19:15 ` Uladzislau Rezki
@ 2025-06-11 19:57 ` syzbot
2025-06-11 20:58 ` Boqun Feng
0 siblings, 1 reply; 20+ messages in thread
From: syzbot @ 2025-06-11 19:57 UTC (permalink / raw)
To: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
linux-mm, paulmck, rcu, syzkaller-bugs, urezki
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
Tested on:
commit: 488ef356 KEYS: Invert FINAL_PUT bit
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
patch: https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000
Note: testing is done by a robot and is best-effort only.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-11 19:57 ` syzbot
@ 2025-06-11 20:58 ` Boqun Feng
2025-06-12 7:42 ` Aleksandr Nogikh
0 siblings, 1 reply; 20+ messages in thread
From: Boqun Feng @ 2025-06-11 20:58 UTC (permalink / raw)
To: syzbot
Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
linux-mm, paulmck, rcu, syzkaller-bugs, urezki
On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:
>
> Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
>
> Tested on:
>
> commit: 488ef356 KEYS: Invert FINAL_PUT bit
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
Is there a way to see the whole console output? If Ulad's patch fixes
the exact issue, we should be able to see a WARN_ON_ONCE() triggered.
Regards,
Boqun
> kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
> patch: https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000
>
> Note: testing is done by a robot and is best-effort only.
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-11 20:58 ` Boqun Feng
@ 2025-06-12 7:42 ` Aleksandr Nogikh
2025-06-12 9:37 ` Uladzislau Rezki
0 siblings, 1 reply; 20+ messages in thread
From: Aleksandr Nogikh @ 2025-06-12 7:42 UTC (permalink / raw)
To: Boqun Feng
Cc: syzbot, akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
linux-mm, paulmck, rcu, syzkaller-bugs, urezki
On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> >
> > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> >
> > Tested on:
> >
> > commit: 488ef356 KEYS: Invert FINAL_PUT bit
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
>
> Is there a way to see the whole console output? If Ulad's patch fixes
> the exact issue, we should be able to see a WARN_ON_ONCE() triggered.
If WARN_ON_ONCE() were triggered, the associated kernel panic output
would have been at the end of this log.
>
> Regards,
> Boqun
>
> > kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
> > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
FWIW the last time the bug was observed on syzbot was 100 days ago, so
it has likely been fixed since then or has become much harder to
reproduce.
> > compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
> > patch: https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000
> >
> > Note: testing is done by a robot and is best-effort only.
> >
>
--
Aleksandr
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-12 7:42 ` Aleksandr Nogikh
@ 2025-06-12 9:37 ` Uladzislau Rezki
2025-06-12 17:20 ` Boqun Feng
0 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-12 9:37 UTC (permalink / raw)
To: Aleksandr Nogikh
Cc: Boqun Feng, syzbot, akpm, josh, kent.overstreet, linux-bcachefs,
linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs, urezki
On Thu, Jun 12, 2025 at 09:42:32AM +0200, Aleksandr Nogikh wrote:
> On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> > >
> > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > >
> > > Tested on:
> > >
> > > commit: 488ef356 KEYS: Invert FINAL_PUT bit
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
> >
> > Is there a way to see the whole console output? If Ulad's patch fixes
> > the exact issue, we should be able to see a WARN_ON_ONCE() triggered.
>
> If WARN_ON_ONCE() were triggered, the associated kernel panic output
> would have been at the end of this log.
>
> >
> > Regards,
> > Boqun
> >
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
>
> FWIW the last time the bug was observed on syzbot was 100 days ago, so
> it has likely been fixed since then or has become much harder to
> reproduce.
>
That is even worse, if it has lasted for 100 days already.
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
2025-06-12 9:37 ` Uladzislau Rezki
@ 2025-06-12 17:20 ` Boqun Feng
0 siblings, 0 replies; 20+ messages in thread
From: Boqun Feng @ 2025-06-12 17:20 UTC (permalink / raw)
To: Uladzislau Rezki (Sony), Aleksandr Nogikh
Cc: syzbot, Andrew Morton, Josh Triplett, kent.overstreet,
linux-bcachefs, linux-kernel, linux-mm, Paul E. McKenney, rcu,
syzkaller-bugs
On Thu, Jun 12, 2025, at 2:37 AM, Uladzislau Rezki wrote:
> On Thu, Jun 12, 2025 at 09:42:32AM +0200, Aleksandr Nogikh wrote:
>> On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>> >
>> > On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
>> > > Hello,
>> > >
>> > > syzbot has tested the proposed patch and the reproducer did not trigger any issue:
>> > >
>> > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
>> > > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
>> > >
>> > > Tested on:
>> > >
>> > > commit: 488ef356 KEYS: Invert FINAL_PUT bit
>> > > git tree: upstream
>> > > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
>> >
>> > Is there a way to see the whole console output? If Ulad's patch fixes
>> > the exact issue, we should be able to see a WARN_ON_ONCE() triggered.
>>
>> If WARN_ON_ONCE() were triggered, the associated kernel panic output
>> would have been at the end of this log.
>>
>> >
>> > Regards,
>> > Boqun
>> >
>> > > kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
>> > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
>>
>> FWIW the last time the bug was observed on syzbot was 100 days ago, so
>> it has likely been fixed since then or has become much harder to
>> reproduce.
>>
> That is even worse, if it has lasted for 100 days already.
>
My understanding is that the evidence shows that the
issue that directly caused the null-ptr-deref was
fixed 100 days ago.
Regards,
Boqun
> --
> Uladzislau Rezki
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-06-12 17:20 UTC | newest]
Thread overview: 20+ messages
2025-02-05 0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot
2025-02-05 14:56 ` Paul E. McKenney
2025-06-08 15:26 ` Kent Overstreet
2025-06-08 18:23 ` Uladzislau Rezki
2025-06-09 0:25 ` Paul E. McKenney
2025-06-09 8:35 ` Uladzislau Rezki
2025-06-09 9:47 ` Paul E. McKenney
2025-06-09 14:20 ` Joel Fernandes
2025-06-10 12:19 ` Uladzislau Rezki
2025-06-09 18:28 ` Vlastimil Babka
2025-06-10 12:33 ` Uladzislau Rezki
2025-06-08 6:58 ` [syzbot] [bcachefs?] [rcu?] " syzbot
2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki
2025-06-11 18:02 ` [syzbot] [bcachefs?] [rcu?] " syzbot
2025-06-11 19:15 ` Uladzislau Rezki
2025-06-11 19:57 ` syzbot
2025-06-11 20:58 ` Boqun Feng
2025-06-12 7:42 ` Aleksandr Nogikh
2025-06-12 9:37 ` Uladzislau Rezki
2025-06-12 17:20 ` Boqun Feng