[syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu


* [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
@ 2025-02-05  0:34 syzbot
  2025-02-05 14:56 ` Paul E. McKenney
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: syzbot @ 2025-02-05  0:34 UTC (permalink / raw)
  To: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
	linux-mm, paulmck, rcu, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com

 slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 0 P4D 0 
Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
FS:  0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 rcu_do_batch kernel/rcu/tree.c:2546 [inline]
 rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2802
 handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
 __do_softirq kernel/softirq.c:595 [inline]
 invoke_softirq kernel/softirq.c:435 [inline]
 __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
 irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
 sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 fe 78 2d f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 0f 95 f5 65 8b 05 d4 58 0b 74 85 c0 74 43 48 c7 04 24 0e 36
RSP: 0018:ffffc900030fef60 EFLAGS: 00000206
RAX: 23438dd059a4b100 RBX: 1ffff9200061fdf0 RCX: ffffffff819b316a
RDX: dffffc0000000000 RSI: ffffffff8c0aa680 RDI: 0000000000000001
RBP: ffffc900030feff8 R08: ffffffff942f9847 R09: 1ffffffff285f308
R10: dffffc0000000000 R11: fffffbfff285f309 R12: dffffc0000000000
R13: 1ffff9200061fdec R14: ffffc900030fef80 R15: 0000000000000246
 spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
 rmqueue_bulk mm/page_alloc.c:2329 [inline]
 __rmqueue_pcplist+0x21fd/0x2a90 mm/page_alloc.c:3004
 rmqueue_pcplist mm/page_alloc.c:3046 [inline]
 rmqueue mm/page_alloc.c:3077 [inline]
 get_page_from_freelist+0x886/0x37a0 mm/page_alloc.c:3474
 __alloc_frozen_pages_noprof+0x292/0x710 mm/page_alloc.c:4739
 alloc_pages_mpol+0x311/0x660 mm/mempolicy.c:2270
 folio_alloc_mpol_noprof mm/mempolicy.c:2289 [inline]
 vma_alloc_folio_noprof+0x12b/0x260 mm/mempolicy.c:2324
 folio_prealloc+0x2e/0x170
 wp_page_copy mm/memory.c:3435 [inline]
 do_wp_page+0x1253/0x49b0 mm/memory.c:3827
 handle_pte_fault mm/memory.c:5905 [inline]
 __handle_mm_fault+0x24d5/0x70f0 mm/memory.c:6032
 handle_mm_fault+0x3e5/0x8d0 mm/memory.c:6201
 do_user_addr_fault arch/x86/mm/fault.c:1388 [inline]
 handle_page_fault arch/x86/mm/fault.c:1480 [inline]
 exc_page_fault+0x2b9/0x8b0 arch/x86/mm/fault.c:1538
 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88
Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90
RSP: 0018:ffffc900030fff00 EFLAGS: 00050202
RAX: 0000000000000005 RBX: 0000000000000000 RCX: 00005555679927d0
RDX: 0000000000000000 RSI: ffffffff8c0ab8e0 RDI: ffffffff8c608a00
RBP: ffff888000dfcf20 R08: ffffffff901b5177 R09: 1ffffffff2036a2e
R10: dffffc0000000000 R11: fffffbfff2036a2f R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000005 R15: dffffc0000000000
 schedule_tail+0x96/0xb0 kernel/sched/core.c:5312
 ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:144
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
FS:  0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
   0:	9c                   	pushf
   1:	8f 44 24 20          	pop    0x20(%rsp)
   5:	42 80 3c 23 00       	cmpb   $0x0,(%rbx,%r12,1)
   a:	74 08                	je     0x14
   c:	4c 89 f7             	mov    %r14,%rdi
   f:	e8 fe 78 2d f6       	call   0xf62d7912
  14:	f6 44 24 21 02       	testb  $0x2,0x21(%rsp)
  19:	75 52                	jne    0x6d
  1b:	41 f7 c7 00 02 00 00 	test   $0x200,%r15d
  22:	74 01                	je     0x25
  24:	fb                   	sti
  25:	bf 01 00 00 00       	mov    $0x1,%edi
* 2a:	e8 c3 0f 95 f5       	call   0xf5950ff2 <-- trapping instruction
  2f:	65 8b 05 d4 58 0b 74 	mov    %gs:0x740b58d4(%rip),%eax        # 0x740b590a
  36:	85 c0                	test   %eax,%eax
  38:	74 43                	je     0x7d
  3a:	48                   	rex.W
  3b:	c7                   	.byte 0xc7
  3c:	04 24                	add    $0x24,%al
  3e:	0e                   	(bad)
  3f:	36                   	ss


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-02-05  0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot
@ 2025-02-05 14:56 ` Paul E. McKenney
  2025-06-08 15:26   ` Kent Overstreet
  2025-06-08  6:58 ` [syzbot] [bcachefs?] [rcu?] " syzbot
  2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki
  2 siblings, 1 reply; 20+ messages in thread
From: Paul E. McKenney @ 2025-02-05 14:56 UTC (permalink / raw)
  To: syzbot
  Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
	linux-mm, rcu, syzkaller-bugs

On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> Call Trace:
>  <IRQ>
>  rcu_do_batch kernel/rcu/tree.c:2546 [inline]

The usual way that this happens is that someone clobbers the rcu_head
structure of something that has been passed to call_rcu().  The most
popular way of clobbering this structure is to pass the same something to
call_rcu() twice in a row, but other creative arrangements are possible.

Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
spot invoking call_rcu() twice in a row.
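
To make the failure mode concrete, here is a minimal userspace sketch in C.
It is a simplified model, not the kernel's implementation: cb_head, cb_list,
model_call_rcu() and model_do_batch() are hypothetical stand-ins, and the
real callback lists are segmented and considerably more involved. The sketch
assumes a plain tail-linked list and an invocation loop that clears the
function pointer before calling it. Queuing the same element twice makes it
point to itself, so the loop visits it a second time and finds a NULL
function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for an rcu_head-like element (hypothetical names). */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *);
};

/* Simple tail-linked list; the kernel's real lists are more elaborate. */
struct cb_list {
	struct cb_head *head;
	struct cb_head **tail;
};

static int invoked;	/* number of callbacks actually run */

static void count_cb(struct cb_head *h)
{
	(void)h;
	invoked++;
}

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/* Append h to the tail of the list, overwriting h->next and h->func. */
static void model_call_rcu(struct cb_list *l, struct cb_head *h,
			   void (*func)(struct cb_head *))
{
	assert(l != NULL && h != NULL);
	h->func = func;
	h->next = NULL;
	*l->tail = h;		/* on a repeat enqueue this sets h->next = h */
	l->tail = &h->next;
}

/*
 * Walk the list: read func, clear it, then invoke it.
 * Returns 1 if a NULL func is found (where a real kernel would jump to
 * address 0), 0 if the whole list was processed. "limit" bounds the walk.
 */
static int model_do_batch(struct cb_list *l, int limit)
{
	struct cb_head *h = l->head;

	while (h != NULL && limit-- > 0) {
		void (*f)(struct cb_head *) = h->func;
		struct cb_head *next = h->next;

		h->func = NULL;
		if (f == NULL)
			return 1;
		f(h);
		h = next;
	}
	return 0;
}
```

In this model, two distinct elements are invoked once each and
model_do_batch() returns 0; the double enqueue is the only case that trips
the NULL check.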

							Thanx, Paul



* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-02-05 14:56 ` Paul E. McKenney
@ 2025-06-08 15:26   ` Kent Overstreet
  2025-06-08 18:23     ` Uladzislau Rezki
  0 siblings, 1 reply; 20+ messages in thread
From: Kent Overstreet @ 2025-06-08 15:26 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu,
	syzkaller-bugs

On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > Call Trace:
> >  <IRQ>
> >  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> 
> The usual way that this happens is that someone clobbers the rcu_head
> structure of something that has been passed to call_rcu().  The most
> popular way of clobbering this structure is to pass the same something to
> call_rcu() twice in a row, but other creative arrangements are possible.
> 
> Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> spot invoking call_rcu() twice in a row.

I don't think it's that - syzbot's .config already has that enabled.
KASAN, too.

And the only place we do call_rcu() is from rcu_pending.c, where we've
got a rearming rcu callback - but we track whether it's outstanding, and
we do all relevant operations with a lock held.
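
For readers unfamiliar with the pattern, the "track whether it's
outstanding, under a lock" scheme can be sketched in userspace C roughly as
follows. This is not taken from rcu_pending.c: struct rearming_cb,
arm_if_idle() and callback_ran() are hypothetical names, a pthread mutex
stands in for the kernel lock, and a counter stands in for the actual
enqueue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a callback that may re-queue itself. */
struct rearming_cb {
	pthread_mutex_t lock;
	bool armed;	/* true while the callback is queued and has not run */
	int enqueues;	/* stand-in for real enqueue operations */
};

static void rearming_cb_init(struct rearming_cb *p)
{
	assert(p != NULL);
	pthread_mutex_init(&p->lock, NULL);
	p->armed = false;
	p->enqueues = 0;
}

/*
 * Queue the callback only if it is not already outstanding.
 * Returns true if this call queued it, false if it was already armed.
 */
static bool arm_if_idle(struct rearming_cb *p)
{
	bool queued = false;

	pthread_mutex_lock(&p->lock);
	if (!p->armed) {
		p->armed = true;
		p->enqueues++;	/* the one and only enqueue for this round */
		queued = true;
	}
	pthread_mutex_unlock(&p->lock);
	return queued;
}

/*
 * Called from the callback itself: mark it as no longer outstanding and,
 * if more work is pending, re-arm it under the same lock.
 */
static void callback_ran(struct rearming_cb *p, bool more_work)
{
	pthread_mutex_lock(&p->lock);
	p->armed = false;
	if (more_work) {
		p->armed = true;
		p->enqueues++;
	}
	pthread_mutex_unlock(&p->lock);
}
```

Because the flag is tested and set under one lock, a second arm_if_idle()
while the callback is outstanding is a no-op, so the same element is never
queued twice in this model.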

And we only use rcu_pending.c with SRCU, not regular RCU.

We do use kfree_rcu() in a few places (all boring, I expect), but that
doesn't (generally?) use the rcu callback list.

So I'm not sure this is even a bcachefs bug.


* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-08 15:26   ` Kent Overstreet
@ 2025-06-08 18:23     ` Uladzislau Rezki
  2025-06-09  0:25       ` Paul E. McKenney
  2025-06-09 18:28       ` Vlastimil Babka
  0 siblings, 2 replies; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-08 18:23 UTC (permalink / raw)
  To: Kent Overstreet, Paul E. McKenney
  Cc: Paul E. McKenney, syzbot, akpm, josh, linux-bcachefs,
	linux-kernel, linux-mm, rcu, syzkaller-bugs

On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > Call Trace:
> > >  <IRQ>
> > >  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> > 
> > The usual way that this happens is that someone clobbers the rcu_head
> > structure of something that has been passed to call_rcu().  The most
> > popular way of clobbering this structure is to pass the same something to
> > call_rcu() twice in a row, but other creative arrangements are possible.
> > 
> > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > spot invoking call_rcu() twice in a row.
> 
> I don't think it's that - syzbot's .config already has that enabled.
> KASAN, too.
> 
> And the only place we do call_rcu() is from rcu_pending.c, where we've
> got a rearming rcu callback - but we track whether it's outstanding, and
> we do all relevant operations with a lock held.
> 
> And we only use rcu_pending.c with SRCU, not regular RCU.
> 
> We do use kfree_rcu() in a few places (all boring, I expect), but that
> doesn't (generally?) use the rcu callback list.
>
Right, kvfree_rcu() does not intersect with regular callbacks; it has
its own path.

It looks like the problem is here:

<snip>
  f = rhp->func;
  debug_rcu_head_callback(rhp);
  WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
  f(rhp);
<snip>

We do not check whether the callback, "f", is NULL. If it is, the kernel
crashes right away. For example:

call_rcu(&rh, NULL);

@Paul, do you think it makes sense to track down callers that apparently
pass NULL as a callback? That looks to me like the cause of this bug, but
we do not know the source.

A check would at least give a stack trace of the caller that passes NULL.
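
A rough userspace sketch of that idea follows. It is only a model under
stated assumptions: checked_enqueue(), checked_enqueue_at() and the pending
list are hypothetical names, and the file/line report stands in for the
stack trace that a kernel-side warning would print. The point is that
catching a NULL callback at enqueue time identifies the caller, whereas the
crash at invocation time does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for an rcu_head-like element (hypothetical names). */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *);
};

static struct cb_head *pending;	/* queued callbacks, newest first */
static int rejected;		/* number of NULL callbacks refused */

static void noop_cb(struct cb_head *h)
{
	(void)h;
}

/*
 * Queue h with callback func. A NULL func is reported together with the
 * caller's location and refused. Returns 0 on success, -1 on rejection.
 */
static int checked_enqueue_at(struct cb_head *h,
			      void (*func)(struct cb_head *),
			      const char *file, int line)
{
	assert(h != NULL);
	if (func == NULL) {
		fprintf(stderr, "NULL callback queued at %s:%d\n", file, line);
		rejected++;
		return -1;
	}
	h->func = func;
	h->next = pending;
	pending = h;
	return 0;
}

/* Capture the call site, loosely analogous to a warning's stack trace. */
#define checked_enqueue(h, func) \
	checked_enqueue_at((h), (func), __FILE__, __LINE__)
```

With this in place, checked_enqueue(&rh, NULL) prints the offending call
site and leaves the pending list untouched, while checked_enqueue(&rh,
noop_cb) queues the element as usual.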

--
Uladzislau Rezki


* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-08 18:23     ` Uladzislau Rezki
@ 2025-06-09  0:25       ` Paul E. McKenney
  2025-06-09  8:35         ` Uladzislau Rezki
  2025-06-09 18:28       ` Vlastimil Babka
  1 sibling, 1 reply; 20+ messages in thread
From: Paul E. McKenney @ 2025-06-09  0:25 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel,
	linux-mm, rcu, syzkaller-bugs

On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > > Call Trace:
> > > >  <IRQ>
> > > >  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> > > 
> > > The usual way that this happens is that someone clobbers the rcu_head
> > > structure of something that has been passed to call_rcu().  The most
> > > popular way of clobbering this structure is to pass the same something to
> > > call_rcu() twice in a row, but other creative arrangements are possible.
> > > 
> > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > > spot invoking call_rcu() twice in a row.
> > 
> > I don't think it's that - syzbot's .config already has that enabled.
> > KASAN, too.
> > 
> > And the only place we do call_rcu() is from rcu_pending.c, where we've
> > got a rearming rcu callback - but we track whether it's outstanding, and
> > we do all relevant operations with a lock held.
> > 
> > And we only use rcu_pending.c with SRCU, not regular RCU.
> > 
> > We do use kfree_rcu() in a few places (all boring, I expect), but that
> > doesn't (generally?) use the rcu callback list.
> >
> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> its own path. 
> 
> It looks like the problem is here:
> 
> <snip>
>   f = rhp->func;
>   debug_rcu_head_callback(rhp);
>   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
>   f(rhp);
> <snip>
> 
> we do not check if callback, "f", is a NULL. If it is, the kernel bug
> is triggered right away. For example:
> 
> call_rcu(&rh, NULL);
> 
> @Paul, do you think it makes sense to narrow callers which apparently
> pass NULL as a callback? To me it seems the case of this bug. But we
> do not know the source.
> 
> It would give at least a stack-trace of caller which passes a NULL.

Adding a check for NULL func passed to __call_rcu_common(), you mean?

That wouldn't hurt, and would either (as you say) catch the culprit
or show that the problem is elsewhere.

							Thanx, Paul

* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09  0:25       ` Paul E. McKenney
@ 2025-06-09  8:35         ` Uladzislau Rezki
  2025-06-09  9:47           ` Paul E. McKenney
  0 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-09  8:35 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Uladzislau Rezki, Kent Overstreet, syzbot, akpm, josh,
	linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs

On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
> On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> > > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > > > Hello,
> > > > > 
> > > > > syzbot found the following issue on:
> > > > > 
> > > > > HEAD commit:    0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> > > > > git tree:       upstream
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> > > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> > > > > 
> > > > > Downloadable assets:
> > > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> > > > > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> > > > > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> > > > > 
> > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > > > > 
> > > > >  slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> > > > > BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > > > #PF: supervisor instruction fetch in kernel mode
> > > > > #PF: error_code(0x0010) - not-present page
> > > > > PGD 0 P4D 0 
> > > > > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> > > > > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > > > > RIP: 0010:0x0
> > > > > Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> > > > > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> > > > > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> > > > > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> > > > > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> > > > > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> > > > > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> > > > > FS:  0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> > > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > > Call Trace:
> > > > >  <IRQ>
> > > > >  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> > > > 
> > > > The usual way that this happens is that someone clobbers the rcu_head
> > > > structure of something that has been passed to call_rcu().  The most
> > > > popular way of clobbering this structure is to pass the same something to
> > > > call_rcu() twice in a row, but other creative arrangements are possible.
> > > > 
> > > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > > > spot invoking call_rcu() twice in a row.
> > > 
> > > I don't think it's that - syzbot's .config already has that enabled.
> > > KASAN, too.
> > > 
> > > And the only place we do call_rcu() is from rcu_pending.c, where we've
> > > got a rearming rcu callback - but we track whether it's outstanding, and
> > > we do all relevant operations with a lock held.
> > > 
> > > And we only use rcu_pending.c with SRCU, not regular RCU.
> > > 
> > > We do use kfree_rcu() in a few places (all boring, I expect), but that
> > > doesn't (generally?) use the rcu callback list.
> > >
> > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > its own path. 
> > 
> > It looks like the problem is here:
> > 
> > <snip>
> >   f = rhp->func;
> >   debug_rcu_head_callback(rhp);
> >   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> >   f(rhp);
> > <snip>
> > 
> > we do not check if callback, "f", is a NULL. If it is, the kernel bug
> > is triggered right away. For example:
> > 
> > call_rcu(&rh, NULL);
> > 
> > @Paul, do you think it makes sense to narrow callers which apparently
> > pass NULL as a callback? To me it seems the case of this bug. But we
> > do not know the source.
> > 
> > It would give at least a stack-trace of caller which passes a NULL.
> 
> Adding a check for NULL func passed to __call_rcu_common(), you mean?
> 
Yes. Currently there is no check at all, so passing a NULL just triggers
a kernel panic.

>
> That wouldn't hurt, and would either (as you say) catch the culprit
> or show that the problem is elsewhere.
> 
I can add it then and send out the patch if no objections.
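
For illustration only, the idea of rejecting a NULL callback at enqueue
time can be sketched in user space as below. This is not the actual
patch: the *_sketch names, the single global list and the -1 return
value are assumptions made for the example; the real check would go
into __call_rcu_common().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct rcu_head_sketch;
typedef void (*rcu_callback_sketch_t)(struct rcu_head_sketch *head);

struct rcu_head_sketch {
	struct rcu_head_sketch *next;
	rcu_callback_sketch_t func;
};

/* Hypothetical single callback list; the kernel uses per-CPU lists. */
static struct rcu_head_sketch *cb_list_sketch;
static struct rcu_head_sketch **cb_tail_sketch = &cb_list_sketch;

/* Example callback that only counts how often it was invoked. */
static int invoked_count_sketch;
static void count_cb_sketch(struct rcu_head_sketch *head)
{
	(void)head;
	invoked_count_sketch++;
}

/*
 * Enqueue a callback. Rejecting NULL here reports the problem in the
 * caller's context, where a stack trace still identifies the culprit.
 * Returns 0 on success, -1 if func is NULL.
 */
static int call_rcu_sketch(struct rcu_head_sketch *head,
			   rcu_callback_sketch_t func)
{
	if (func == NULL) {
		fprintf(stderr, "call_rcu_sketch: NULL callback rejected\n");
		return -1;
	}
	head->func = func;
	head->next = NULL;
	*cb_tail_sketch = head;
	cb_tail_sketch = &head->next;
	return 0;
}

/*
 * Invoke all queued callbacks, following the f = rhp->func; f(rhp)
 * pattern quoted above. Returns the number of callbacks invoked.
 */
static int do_batch_sketch(void)
{
	struct rcu_head_sketch *rhp = cb_list_sketch;
	int n = 0;

	cb_list_sketch = NULL;
	cb_tail_sketch = &cb_list_sketch;
	while (rhp) {
		struct rcu_head_sketch *next = rhp->next;
		rcu_callback_sketch_t f = rhp->func;

		rhp->func = NULL;
		/* Holds unless the head was clobbered after enqueue. */
		assert(f != NULL);
		f(rhp);
		n++;
		rhp = next;
	}
	return n;
}
```

With this in place, call_rcu_sketch(&rh, NULL) fails right away in the
caller, while a valid callback is queued and later invoked by
do_batch_sketch().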

--
Uladzislau Rezki

* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09  8:35         ` Uladzislau Rezki
@ 2025-06-09  9:47           ` Paul E. McKenney
  2025-06-09 14:20             ` Joel Fernandes
  0 siblings, 1 reply; 20+ messages in thread
From: Paul E. McKenney @ 2025-06-09  9:47 UTC (permalink / raw)
  To: Uladzislau Rezki
  Cc: Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel,
	linux-mm, rcu, syzkaller-bugs

On Mon, Jun 09, 2025 at 10:35:34AM +0200, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
> > On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> > > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> > > > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > > > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > > > > Hello,
> > > > > > 
> > > > > > syzbot found the following issue on:
> > > > > > 
> > > > > > HEAD commit:    0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> > > > > > git tree:       upstream
> > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> > > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> > > > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> > > > > > 
> > > > > > Downloadable assets:
> > > > > > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> > > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> > > > > > kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> > > > > > mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> > > > > > 
> > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > > > > > 
> > > > > >  slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > > > > #PF: supervisor instruction fetch in kernel mode
> > > > > > #PF: error_code(0x0010) - not-present page
> > > > > > PGD 0 P4D 0 
> > > > > > Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> > > > > > CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> > > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> > > > > > RIP: 0010:0x0
> > > > > > Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> > > > > > RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> > > > > > RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> > > > > > RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> > > > > > RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> > > > > > R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> > > > > > R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> > > > > > FS:  0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> > > > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> > > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > > > Call Trace:
> > > > > >  <IRQ>
> > > > > >  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> > > > > 
> > > > > The usual way that this happens is that someone clobbers the rcu_head
> > > > > structure of something that has been passed to call_rcu().  The most
> > > > > popular way of clobbering this structure is to pass the same something to
> > > > > call_rcu() twice in a row, but other creative arrangements are possible.
> > > > > 
> > > > > Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> > > > > spot invoking call_rcu() twice in a row.
> > > > 
> > > > I don't think it's that - syzbot's .config already has that enabled.
> > > > KASAN, too.
> > > > 
> > > > And the only place we do call_rcu() is from rcu_pending.c, where we've
> > > > got a rearming rcu callback - but we track whether it's outstanding, and
> > > > we do all relevant operations with a lock held.
> > > > 
> > > > And we only use rcu_pending.c with SRCU, not regular RCU.
> > > > 
> > > > We do use kfree_rcu() in a few places (all boring, I expect), but that
> > > > doesn't (generally?) use the rcu callback list.
> > > >
> > > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > > its own path. 
> > > 
> > > It looks like the problem is here:
> > > 
> > > <snip>
> > >   f = rhp->func;
> > >   debug_rcu_head_callback(rhp);
> > >   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> > >   f(rhp);
> > > <snip>
> > > 
> > > we do not check if callback, "f", is a NULL. If it is, the kernel bug
> > > is triggered right away. For example:
> > > 
> > > call_rcu(&rh, NULL);
> > > 
> > > @Paul, do you think it makes sense to narrow callers which apparently
> > > pass NULL as a callback? To me it seems the case of this bug. But we
> > > do not know the source.
> > > 
> > > It would give at least a stack-trace of caller which passes a NULL.
> > 
> > Adding a check for NULL func passed to __call_rcu_common(), you mean?
> > 
> Yes. Currently there is no check at all, so passing a NULL just triggers
> a kernel panic.
> 
> >
> > That wouldn't hurt, and would either (as you say) catch the culprit
> > or show that the problem is elsewhere.
> > 
> I can add it then and send out the patch if no objections.

No objections from me!

						Thanx, Paul

* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09  9:47           ` Paul E. McKenney
@ 2025-06-09 14:20             ` Joel Fernandes
  2025-06-10 12:19               ` Uladzislau Rezki
  0 siblings, 1 reply; 20+ messages in thread
From: Joel Fernandes @ 2025-06-09 14:20 UTC (permalink / raw)
  To: paulmck, Uladzislau Rezki
  Cc: Kent Overstreet, syzbot, akpm, josh, linux-bcachefs, linux-kernel,
	linux-mm, rcu, syzkaller-bugs



On 6/9/2025 5:47 AM, Paul E. McKenney wrote:
> On Mon, Jun 09, 2025 at 10:35:34AM +0200, Uladzislau Rezki wrote:
>> On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
>>> On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
>>>> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
>>>>> On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
>>>>>> On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> syzbot found the following issue on:
>>>>>>>
>>>>>>> HEAD commit:    0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
>>>>>>> git tree:       upstream
>>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
>>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
>>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
>>>>>>> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>>>>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
>>>>>>>
>>>>>>> Downloadable assets:
>>>>>>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
>>>>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
>>>>>>> kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
>>>>>>> mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
>>>>>>>
>>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>>> Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
>>>>>>>
>>>>>>>  slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
>>>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000000
>>>>>>> #PF: supervisor instruction fetch in kernel mode
>>>>>>> #PF: error_code(0x0010) - not-present page
>>>>>>> PGD 0 P4D 0 
>>>>>>> Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
>>>>>>> CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
>>>>>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
>>>>>>> RIP: 0010:0x0
>>>>>>> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
>>>>>>> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
>>>>>>> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
>>>>>>> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
>>>>>>> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
>>>>>>> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
>>>>>>> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
>>>>>>> FS:  0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
>>>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
>>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>>>>> Call Trace:
>>>>>>>  <IRQ>
>>>>>>>  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
>>>>>>
>>>>>> The usual way that this happens is that someone clobbers the rcu_head
>>>>>> structure of something that has been passed to call_rcu().  The most
>>>>>> popular way of clobbering this structure is to pass the same something to
>>>>>> call_rcu() twice in a row, but other creative arrangements are possible.
>>>>>>
>>>>>> Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
>>>>>> spot invoking call_rcu() twice in a row.
>>>>>
>>>>> I don't think it's that - syzbot's .config already has that enabled.
>>>>> KASAN, too.
>>>>>
>>>>> And the only place we do call_rcu() is from rcu_pending.c, where we've
>>>>> got a rearming rcu callback - but we track whether it's outstanding, and
>>>>> we do all relevant operations with a lock held.
>>>>>
>>>>> And we only use rcu_pending.c with SRCU, not regular RCU.
>>>>>
>>>>> We do use kfree_rcu() in a few places (all boring, I expect), but that
>>>>> doesn't (generally?) use the rcu callback list.
>>>>>
>>>> Right, kvfree_rcu() does not intersect with regular callbacks, it has
>>>> its own path. 
>>>>
>>>> It looks like the problem is here:
>>>>
>>>> <snip>
>>>>   f = rhp->func;
>>>>   debug_rcu_head_callback(rhp);
>>>>   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
>>>>   f(rhp);
>>>> <snip>
>>>>
>>>> we do not check if callback, "f", is a NULL. If it is, the kernel bug
>>>> is triggered right away. For example:
>>>>
>>>> call_rcu(&rh, NULL);
>>>>
>>>> @Paul, do you think it makes sense to narrow callers which apparently
>>>> pass NULL as a callback? To me it seems the case of this bug. But we
>>>> do not know the source.
>>>>
>>>> It would give at least a stack-trace of caller which passes a NULL.
>>>
>>> Adding a check for NULL func passed to __call_rcu_common(), you mean?
>>>
>> Yes. Currently there is no check at all, so passing a NULL just triggers
>> a kernel panic.
>>
>>>
>>> That wouldn't hurt, and would either (as you say) catch the culprit
>>> or show that the problem is elsewhere.
>>>
>> I can add it then and send out the patch if no objections.
> 
> No objections from me!

Me neither! And I can push that into an -rc release as well once I have it
(since it is related to a potential bug).

thanks,

 - Joel



* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09 14:20             ` Joel Fernandes
@ 2025-06-10 12:19               ` Uladzislau Rezki
  0 siblings, 0 replies; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-10 12:19 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: paulmck, Uladzislau Rezki, Kent Overstreet, syzbot, akpm, josh,
	linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs

On Mon, Jun 09, 2025 at 10:20:58AM -0400, Joel Fernandes wrote:
> 
> 
> On 6/9/2025 5:47 AM, Paul E. McKenney wrote:
> > On Mon, Jun 09, 2025 at 10:35:34AM +0200, Uladzislau Rezki wrote:
> >> On Sun, Jun 08, 2025 at 05:25:05PM -0700, Paul E. McKenney wrote:
> >>> On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> >>>> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> >>>>> On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> >>>>>> On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> syzbot found the following issue on:
> >>>>>>>
> >>>>>>> HEAD commit:    0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> >>>>>>> git tree:       upstream
> >>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> >>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> >>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> >>>>>>> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >>>>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> >>>>>>>
> >>>>>>> Downloadable assets:
> >>>>>>> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> >>>>>>> vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> >>>>>>> kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> >>>>>>> mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> >>>>>>>
> >>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>>>>> Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> >>>>>>>
> >>>>>>>  slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> >>>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000000
> >>>>>>> #PF: supervisor instruction fetch in kernel mode
> >>>>>>> #PF: error_code(0x0010) - not-present page
> >>>>>>> PGD 0 P4D 0 
> >>>>>>> Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> >>>>>>> CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> >>>>>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> >>>>>>> RIP: 0010:0x0
> >>>>>>> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> >>>>>>> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> >>>>>>> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> >>>>>>> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> >>>>>>> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> >>>>>>> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> >>>>>>> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> >>>>>>> FS:  0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> >>>>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>>>> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> >>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>>>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >>>>>>> Call Trace:
> >>>>>>>  <IRQ>
> >>>>>>>  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
> >>>>>>
> >>>>>> The usual way that this happens is that someone clobbers the rcu_head
> >>>>>> structure of something that has been passed to call_rcu().  The most
> >>>>>> popular way of clobbering this structure is to pass the same something to
> >>>>>> call_rcu() twice in a row, but other creative arrangements are possible.
> >>>>>>
> >>>>>> Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
> >>>>>> spot invoking call_rcu() twice in a row.
> >>>>>
> >>>>> I don't think it's that - syzbot's .config already has that enabled.
> >>>>> KASAN, too.
> >>>>>
> >>>>> And the only place we do call_rcu() is from rcu_pending.c, where we've
> >>>>> got a rearming rcu callback - but we track whether it's outstanding, and
> >>>>> we do all relevant operations with a lock held.
> >>>>>
> >>>>> And we only use rcu_pending.c with SRCU, not regular RCU.
> >>>>>
> >>>>> We do use kfree_rcu() in a few places (all boring, I expect), but that
> >>>>> doesn't (generally?) use the rcu callback list.
> >>>>>
> >>>> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> >>>> its own path. 
> >>>>
> >>>> It looks like the problem is here:
> >>>>
> >>>> <snip>
> >>>>   f = rhp->func;
> >>>>   debug_rcu_head_callback(rhp);
> >>>>   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> >>>>   f(rhp);
> >>>> <snip>
> >>>>
> >>>> we do not check if callback, "f", is a NULL. If it is, the kernel bug
> >>>> is triggered right away. For example:
> >>>>
> >>>> call_rcu(&rh, NULL);
> >>>>
> >>>> @Paul, do you think it makes sense to narrow callers which apparently
> >>>> pass NULL as a callback? To me it seems the case of this bug. But we
> >>>> do not know the source.
> >>>>
> >>>> It would give at least a stack-trace of caller which passes a NULL.
> >>>
> >>> Adding a check for NULL func passed to __call_rcu_common(), you mean?
> >>>
> >> Yes. Currently there is no check at all, so passing a NULL just triggers
> >> a kernel panic.
> >>
> >>>
> >>> That wouldn't hurt, and would either (as you say) catch the culprit
> >>> or show that the problem is elsewhere.
> >>>
> >> I can add it then and send out the patch if no objections.
> > 
> > No objections from me!
> 
> Me neither! And I can push that into an -rc release as well once I have it
> (since it is related to a potential bug).
> 
I will prepare it and send it out today.

--
Uladzislau Rezki

* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-08 18:23     ` Uladzislau Rezki
  2025-06-09  0:25       ` Paul E. McKenney
@ 2025-06-09 18:28       ` Vlastimil Babka
  2025-06-10 12:33         ` Uladzislau Rezki
  1 sibling, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2025-06-09 18:28 UTC (permalink / raw)
  To: Uladzislau Rezki, Kent Overstreet, Paul E. McKenney
  Cc: syzbot, akpm, josh, linux-bcachefs, linux-kernel, linux-mm, rcu,
	syzkaller-bugs

On 6/8/25 20:23, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
>> 
>> I don't think it's that - syzbot's .config already has that enabled.
>> KASAN, too.
>> 
>> And the only place we do call_rcu() is from rcu_pending.c, where we've
>> got a rearming rcu callback - but we track whether it's outstanding, and
>> we do all relevant operations with a lock held.
>> 
>> And we only use rcu_pending.c with SRCU, not regular RCU.
>> 
>> We do use kfree_rcu() in a few places (all boring, I expect), but that
>> doesn't (generally?) use the rcu callback list.
>>
> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> its own path. 

You mean due to the batching? Maybe the batching should be disabled with
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y if it prevents it from detecting issues?
Otherwise we now have kvfree_rcu_cb(), so the special handling of
kvfree_rcu() is gone in the non-batching case.

> It looks like the problem is here:
> 
> <snip>
>   f = rhp->func;
>   debug_rcu_head_callback(rhp);
>   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
>   f(rhp);
> <snip>
> 
> we do not check if callback, "f", is a NULL. If it is, the kernel bug
> is triggered right away. For example:
> 
> call_rcu(&rh, NULL);
> 
> @Paul, do you think it makes sense to narrow callers which apparently
> pass NULL as a callback? To me it seems the case of this bug. But we
> do not know the source.
> 
> It would give at least a stack-trace of caller which passes a NULL.

Right, AFAIU this kind of check is now possible, previously NULL was being
interpreted as a valid __is_kvfree_rcu_offset() (i.e. rcu_head at offset 0).
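
A rough user-space model of that old encoding, as I understand it:
kvfree_rcu() used to store the offset of the rcu_head inside the
enclosing object in the func slot, and small values were classified as
offsets rather than function addresses. The 4096 limit and all *_sketch
names below are assumptions for illustration, not the kernel's
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct rcu_head_sketch;
typedef void (*rcu_callback_sketch_t)(struct rcu_head_sketch *head);

struct rcu_head_sketch {
	struct rcu_head_sketch *next;
	rcu_callback_sketch_t func;
};

/* Assumed threshold: smaller values count as offsets, not addresses. */
#define KVFREE_OFFSET_LIMIT_SKETCH 4096UL

/* Hypothetical object with the rcu_head first, i.e. at offset 0. */
struct obj_first_sketch {
	struct rcu_head_sketch rcu;
	int payload;
};

/* Hypothetical object with the rcu_head at a non-zero offset. */
struct obj_later_sketch {
	long payload;
	struct rcu_head_sketch rcu;
};

/* Store an rcu_head offset in the slot that normally holds a callback. */
static rcu_callback_sketch_t encode_offset_sketch(size_t offset)
{
	return (rcu_callback_sketch_t)(uintptr_t)offset;
}

/* Return 1 if the func slot would be read as an offset, 0 otherwise. */
static int func_is_offset_sketch(rcu_callback_sketch_t func)
{
	return (uintptr_t)func < KVFREE_OFFSET_LIMIT_SKETCH;
}
```

In this model an rcu_head at offset 0 encodes to the same value as a
NULL callback, so a NULL check at enqueue time could not tell the two
apart while the offset encoding was in use.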

> --
> Uladzislau Rezki
> 


* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-09 18:28       ` Vlastimil Babka
@ 2025-06-10 12:33         ` Uladzislau Rezki
  0 siblings, 0 replies; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-10 12:33 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Uladzislau Rezki, Kent Overstreet, Paul E. McKenney, syzbot, akpm,
	josh, linux-bcachefs, linux-kernel, linux-mm, rcu, syzkaller-bugs

On Mon, Jun 09, 2025 at 08:28:56PM +0200, Vlastimil Babka wrote:
> On 6/8/25 20:23, Uladzislau Rezki wrote:
> > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> >> 
> >> I don't think it's that - syzbot's .config already has that enabled.
> >> KASAN, too.
> >> 
> >> And the only place we do call_rcu() is from rcu_pending.c, where we've
> >> got a rearming rcu callback - but we track whether it's outstanding, and
> >> we do all relevant operations with a lock held.
> >> 
> >> And we only use rcu_pending.c with SRCU, not regular RCU.
> >> 
> >> We do use kfree_rcu() in a few places (all boring, I expect), but that
> >> doesn't (generally?) use the rcu callback list.
> >>
> > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > its own path. 
> 
> You mean due to the batching? Maybe the batching should be disabled with
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y if it prevents it from detecting issues?
> Otherwise we now have kvfree_rcu_cb(), so the special handling of
> kvfree_rcu() is gone in the non-batching case.
> 
Not really. I meant that the call_rcu() API does not check whether the
passed callback, which is executed after a GP, is NULL. If it is, we get
the NULL pointer dereference bug.

Since the callback is invoked from the rcu_core() context, we cannot
identify the caller in order to blame someone :)

As for batching, we have support for CONFIG_DEBUG_OBJECTS_RCU_HEAD. It
helps to identify double-freeing and probably leaks.

> > It looks like the problem is here:
> > 
> > <snip>
> >   f = rhp->func;
> >   debug_rcu_head_callback(rhp);
> >   WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> >   f(rhp);
> > <snip>
> > 
> > we do not check if callback, "f", is a NULL. If it is, the kernel bug
> > is triggered right away. For example:
> > 
> > call_rcu(&rh, NULL);
> > 
> > @Paul, do you think it makes sense to narrow callers which apparently
> > pass NULL as a callback? To me it seems the case of this bug. But we
> > do not know the source.
> > 
> > It would give at least a stack-trace of caller which passes a NULL.
> 
> Right, AFAIU this kind of check is now possible, previously NULL was being
> interpreted as a valid __is_kvfree_rcu_offset() (i.e. rcu_head at offset 0).
> 
> > --
> > Uladzislau Rezki
> > 
> 

* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-02-05  0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot
  2025-02-05 14:56 ` Paul E. McKenney
@ 2025-06-08  6:58 ` syzbot
  2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki
  2 siblings, 0 replies; 20+ messages in thread
From: syzbot @ 2025-06-08  6:58 UTC (permalink / raw)
  To: akpm, ayaanmirza.788, ayaanmirzabaig85, josh, kent.overstreet,
	linux-bcachefs, linux-kernel, linux-mm, luto, paulmck, peterz,
	rcu, syzkaller-bugs, tglx

syzbot has bisected this issue to:

commit 14152654805256d760315ec24e414363bfa19a06
Author: Kent Overstreet <kent.overstreet@linux.dev>
Date:   Mon Nov 25 05:21:27 2024 +0000

    bcachefs: Bad btree roots are now autofix

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=12fa0a82580000
start commit:   99fa936e8e4f Merge tag 'affs-6.14-rc5-tag' of git://git.ke..
git tree:       upstream
final oops:     https://syzkaller.appspot.com/x/report.txt?x=11fa0a82580000
console output: https://syzkaller.appspot.com/x/log.txt?x=16fa0a82580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=119d35a8580000

Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
Fixes: 141526548052 ("bcachefs: Bad btree roots are now autofix")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

* Re: [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-02-05  0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot
  2025-02-05 14:56 ` Paul E. McKenney
  2025-06-08  6:58 ` [syzbot] [bcachefs?] [rcu?] " syzbot
@ 2025-06-11 15:58 ` Uladzislau Rezki
  2025-06-11 18:02   ` [syzbot] [bcachefs?] [rcu?] " syzbot
  2 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-11 15:58 UTC (permalink / raw)
  To: syzbot
  Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
	linux-mm, paulmck, rcu, syzkaller-bugs

On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000
> 
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
> mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> 
>  slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 0 P4D 0 
> Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
> CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> FS:  0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <IRQ>
>  rcu_do_batch kernel/rcu/tree.c:2546 [inline]
>  rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2802
>  handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
>  __do_softirq kernel/softirq.c:595 [inline]
>  invoke_softirq kernel/softirq.c:435 [inline]
>  __irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
>  irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
>  instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
>  sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049
>  </IRQ>
>  <TASK>
>  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
> RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
> RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
> Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 fe 78 2d f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 0f 95 f5 65 8b 05 d4 58 0b 74 85 c0 74 43 48 c7 04 24 0e 36
> RSP: 0018:ffffc900030fef60 EFLAGS: 00000206
> RAX: 23438dd059a4b100 RBX: 1ffff9200061fdf0 RCX: ffffffff819b316a
> RDX: dffffc0000000000 RSI: ffffffff8c0aa680 RDI: 0000000000000001
> RBP: ffffc900030feff8 R08: ffffffff942f9847 R09: 1ffffffff285f308
> R10: dffffc0000000000 R11: fffffbfff285f309 R12: dffffc0000000000
> R13: 1ffff9200061fdec R14: ffffc900030fef80 R15: 0000000000000246
>  spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
>  rmqueue_bulk mm/page_alloc.c:2329 [inline]
>  __rmqueue_pcplist+0x21fd/0x2a90 mm/page_alloc.c:3004
>  rmqueue_pcplist mm/page_alloc.c:3046 [inline]
>  rmqueue mm/page_alloc.c:3077 [inline]
>  get_page_from_freelist+0x886/0x37a0 mm/page_alloc.c:3474
>  __alloc_frozen_pages_noprof+0x292/0x710 mm/page_alloc.c:4739
>  alloc_pages_mpol+0x311/0x660 mm/mempolicy.c:2270
>  folio_alloc_mpol_noprof mm/mempolicy.c:2289 [inline]
>  vma_alloc_folio_noprof+0x12b/0x260 mm/mempolicy.c:2324
>  folio_prealloc+0x2e/0x170
>  wp_page_copy mm/memory.c:3435 [inline]
>  do_wp_page+0x1253/0x49b0 mm/memory.c:3827
>  handle_pte_fault mm/memory.c:5905 [inline]
>  __handle_mm_fault+0x24d5/0x70f0 mm/memory.c:6032
>  handle_mm_fault+0x3e5/0x8d0 mm/memory.c:6201
>  do_user_addr_fault arch/x86/mm/fault.c:1388 [inline]
>  handle_page_fault arch/x86/mm/fault.c:1480 [inline]
>  exc_page_fault+0x2b9/0x8b0 arch/x86/mm/fault.c:1538
>  asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
> RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88
> Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90
> RSP: 0018:ffffc900030fff00 EFLAGS: 00050202
> RAX: 0000000000000005 RBX: 0000000000000000 RCX: 00005555679927d0
> RDX: 0000000000000000 RSI: ffffffff8c0ab8e0 RDI: ffffffff8c608a00
> RBP: ffff888000dfcf20 R08: ffffffff901b5177 R09: 1ffffffff2036a2e
> R10: dffffc0000000000 R11: fffffbfff2036a2f R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000005 R15: dffffc0000000000
>  schedule_tail+0x96/0xb0 kernel/sched/core.c:5312
>  ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:144
>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>  </TASK>
> Modules linked in:
> CR2: 0000000000000000
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
> RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
> RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
> R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
> R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
> FS:  0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> ----------------
> Code disassembly (best guess):
>    0:	9c                   	pushf
>    1:	8f 44 24 20          	pop    0x20(%rsp)
>    5:	42 80 3c 23 00       	cmpb   $0x0,(%rbx,%r12,1)
>    a:	74 08                	je     0x14
>    c:	4c 89 f7             	mov    %r14,%rdi
>    f:	e8 fe 78 2d f6       	call   0xf62d7912
>   14:	f6 44 24 21 02       	testb  $0x2,0x21(%rsp)
>   19:	75 52                	jne    0x6d
>   1b:	41 f7 c7 00 02 00 00 	test   $0x200,%r15d
>   22:	74 01                	je     0x25
>   24:	fb                   	sti
>   25:	bf 01 00 00 00       	mov    $0x1,%edi
> * 2a:	e8 c3 0f 95 f5       	call   0xf5950ff2 <-- trapping instruction
>   2f:	65 8b 05 d4 58 0b 74 	mov    %gs:0x740b58d4(%rip),%eax        # 0x740b590a
>   36:	85 c0                	test   %eax,%eax
>   38:	74 43                	je     0x7d
>   3a:	48                   	rex.W
>   3b:	c7                   	.byte 0xc7
>   3c:	04 24                	add    $0x24,%al
>   3e:	0e                   	(bad)
>   3f:	36                   	ss
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
> 
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
> 
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
> 
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
> 
> If you want to undo deduplication, reply with:
> #syz undup
> 

#syz test

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 475f31deed14..b297a32c6779 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3047,6 +3047,10 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in)
        /* Misaligned rcu_head! */
        WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));

+       /* Avoid NULL dereference if callback is NULL. */
+       if (WARN_ON_ONCE(!func))
+               return;
+
        if (debug_rcu_head_queue(head)) {
                /*
                 * Probable double call_rcu(), so leak the callback.



* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki
@ 2025-06-11 18:02   ` syzbot
  2025-06-11 19:15     ` Uladzislau Rezki
  0 siblings, 1 reply; 20+ messages in thread
From: syzbot @ 2025-06-11 18:02 UTC (permalink / raw)
  To: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
	linux-mm, paulmck, rcu, syzkaller-bugs, urezki

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

failed to apply patch:
checking file kernel/rcu/tree.c
patch: **** unexpected end of file in patch

Tested on:

commit:         aef17cb3 Revert "mm/damon/Kconfig: enable CONFIG_DAMON..
git tree:       upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
compiler:       
patch:          https://syzkaller.appspot.com/x/patch.diff?x=17de99d4580000


* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-11 18:02   ` [syzbot] [bcachefs?] [rcu?] " syzbot
@ 2025-06-11 19:15     ` Uladzislau Rezki
  2025-06-11 19:57       ` syzbot
  0 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-11 19:15 UTC (permalink / raw)
  To: syzbot
  Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
	linux-mm, paulmck, rcu, syzkaller-bugs, urezki

On Wed, Jun 11, 2025 at 11:02:03AM -0700, syzbot wrote:
> Hello,
> 
> syzbot tried to test the proposed patch but the build/boot failed:
> 
> failed to apply patch:
> checking file kernel/rcu/tree.c
> patch: **** unexpected end of file in patch
> 
> 
> 
> Tested on:
> 
> commit:         aef17cb3 Revert "mm/damon/Kconfig: enable CONFIG_DAMON..
> git tree:       upstream
> kernel config:  https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a
> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> compiler:       
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=17de99d4580000
> 
#syz test

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e8a4b720d7d2..14d4499c6fc3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3072,6 +3072,10 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in)
 	/* Misaligned rcu_head! */
 	WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));
 
+	/* Avoid NULL dereference if callback is NULL. */
+	if (WARN_ON_ONCE(!func))
+		return;
+
 	if (debug_rcu_head_queue(head)) {
 		/*
 		 * Probable double call_rcu(), so leak the callback.
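
[Annotation] A minimal userspace sketch of the idea behind the hunk above,
under stated assumptions: cb_head, cb_queue, cb_enqueue() and cb_run_all()
are hypothetical names, not kernel APIs, and the one-time fprintf() only
imitates WARN_ON_ONCE(). The point is where the failure surfaces. Without
the check, a NULL callback is only discovered when the queue is drained
(in the report, rcu_do_batch() in softirq context with RIP 0x0), long after
the offending caller has returned. With the check, the bad request is
refused at enqueue time, where the caller is still on the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical analogue of struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

/* Simple singly linked callback queue with a tail pointer. */
struct cb_queue {
	struct cb_head *head;
	struct cb_head **tail;
	unsigned int rejected;	/* number of NULL callbacks refused */
};

static void cb_queue_init(struct cb_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
	q->rejected = 0;
}

/* Refuse a NULL callback at enqueue time, warning only on the first hit. */
static bool cb_enqueue(struct cb_queue *q, struct cb_head *h,
		       void (*func)(struct cb_head *head))
{
	static bool warned;

	if (!func) {
		if (!warned) {
			warned = true;
			fprintf(stderr, "warning: NULL callback refused\n");
		}
		q->rejected++;
		return false;
	}
	h->func = func;
	h->next = NULL;
	*q->tail = h;
	q->tail = &h->next;
	return true;
}

/* Drain the queue; every queued func is known to be non-NULL here. */
static unsigned int cb_run_all(struct cb_queue *q)
{
	struct cb_head *h = q->head;
	unsigned int invoked = 0;

	cb_queue_init(q);
	while (h) {
		struct cb_head *next = h->next;	/* read before func may free h */

		h->func(h);
		invoked++;
		h = next;
	}
	return invoked;
}

static unsigned int demo_hits;

/* Trivial callback used to exercise the queue. */
static void demo_cb(struct cb_head *head)
{
	(void)head;
	demo_hits++;
}
```

Queueing one head with demo_cb and one with NULL leaves a single entry on
the queue; cb_run_all() then invokes exactly one callback and q.rejected
records the refused request.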


* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-11 19:15     ` Uladzislau Rezki
@ 2025-06-11 19:57       ` syzbot
  2025-06-11 20:58         ` Boqun Feng
  0 siblings, 1 reply; 20+ messages in thread
From: syzbot @ 2025-06-11 19:57 UTC (permalink / raw)
  To: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
	linux-mm, paulmck, rcu, syzkaller-bugs, urezki

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com

Tested on:

commit:         488ef356 KEYS: Invert FINAL_PUT bit
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
compiler:       Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
patch:          https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000

Note: testing is done by a robot and is best-effort only.


* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-11 19:57       ` syzbot
@ 2025-06-11 20:58         ` Boqun Feng
  2025-06-12  7:42           ` Aleksandr Nogikh
  0 siblings, 1 reply; 20+ messages in thread
From: Boqun Feng @ 2025-06-11 20:58 UTC (permalink / raw)
  To: syzbot
  Cc: akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
	linux-mm, paulmck, rcu, syzkaller-bugs, urezki

On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> 
> Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> 
> Tested on:
> 
> commit:         488ef356 KEYS: Invert FINAL_PUT bit
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000

Is there a way to see the whole console output? If Ulad's patch fixes
the exact issue, we should be able to see a WARN_ON_ONCE() triggered.

Regards,
Boqun

> kernel config:  https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
> dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> compiler:       Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000
> 
> Note: testing is done by a robot and is best-effort only.
> 


* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-11 20:58         ` Boqun Feng
@ 2025-06-12  7:42           ` Aleksandr Nogikh
  2025-06-12  9:37             ` Uladzislau Rezki
  0 siblings, 1 reply; 20+ messages in thread
From: Aleksandr Nogikh @ 2025-06-12  7:42 UTC (permalink / raw)
  To: Boqun Feng
  Cc: syzbot, akpm, josh, kent.overstreet, linux-bcachefs, linux-kernel,
	linux-mm, paulmck, rcu, syzkaller-bugs, urezki

On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> >
> > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> >
> > Tested on:
> >
> > commit:         488ef356 KEYS: Invert FINAL_PUT bit
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
>
> Is there a way to see the whole console output? If Ulad's patch fixes
> the exact issue, we should be able to see a WARN_ON_ONCE() triggered.

If WARN_ON_ONCE() were triggered, the associated kernel panic output
would have been at the end of this log.

>
> Regards,
> Boqun
>
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
> > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a

FWIW the last time the bug was observed on syzbot was 100 days ago, so
it has likely been fixed since then or has become much harder to
reproduce.

> > compiler:       Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
> > patch:          https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000
> >
> > Note: testing is done by a robot and is best-effort only.
> >
>

-- 
Aleksandr


* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-12  7:42           ` Aleksandr Nogikh
@ 2025-06-12  9:37             ` Uladzislau Rezki
  2025-06-12 17:20               ` Boqun Feng
  0 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2025-06-12  9:37 UTC (permalink / raw)
  To: Aleksandr Nogikh
  Cc: Boqun Feng, syzbot, akpm, josh, kent.overstreet, linux-bcachefs,
	linux-kernel, linux-mm, paulmck, rcu, syzkaller-bugs, urezki

On Thu, Jun 12, 2025 at 09:42:32AM +0200, Aleksandr Nogikh wrote:
> On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> > >
> > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
> > >
> > > Tested on:
> > >
> > > commit:         488ef356 KEYS: Invert FINAL_PUT bit
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
> >
> > Is there a way to see the whole console output? If Ulad's patch fixes
> > the exact issue, we should be able to see a WARN_ON_ONCE() triggered.
> 
> If WARN_ON_ONCE() were triggered, the associated kernel panic output
> would have been at the end of this log.
> 
> >
> > Regards,
> > Boqun
> >
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> 
> FWIW the last time the bug was observed on syzbot was 100 days ago, so
> it has likely been fixed since then or has become much harder to
> reproduce.
> 
That is even worse, if it has lasted for 100 days already.

--
Uladzislau Rezki


* Re: [syzbot] [bcachefs?] [rcu?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)
  2025-06-12  9:37             ` Uladzislau Rezki
@ 2025-06-12 17:20               ` Boqun Feng
  0 siblings, 0 replies; 20+ messages in thread
From: Boqun Feng @ 2025-06-12 17:20 UTC (permalink / raw)
  To: Uladzislau Rezki (Sony), Aleksandr Nogikh
  Cc: syzbot, Andrew Morton, Josh Triplett, kent.overstreet,
	linux-bcachefs, linux-kernel, linux-mm, Paul E. McKenney, rcu,
	syzkaller-bugs



On Thu, Jun 12, 2025, at 2:37 AM, Uladzislau Rezki wrote:
> On Thu, Jun 12, 2025 at 09:42:32AM +0200, Aleksandr Nogikh wrote:
>> On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>> >
>> > On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
>> > > Hello,
>> > >
>> > > syzbot has tested the proposed patch and the reproducer did not trigger any issue:
>> > >
>> > > Reported-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
>> > > Tested-by: syzbot+80e5d6f453f14a53383a@syzkaller.appspotmail.com
>> > >
>> > > Tested on:
>> > >
>> > > commit:         488ef356 KEYS: Invert FINAL_PUT bit
>> > > git tree:       upstream
>> > > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
>> >
>> > Is there a way to see the whole console output? If Ulad's patch fixes
>> > the exact issue, we should be able to see a WARN_ON_ONCE() triggered.
>> 
>> If WARN_ON_ONCE() were triggered, the associated kernel panic output
>> would have been at the end of this log.
>> 
>> >
>> > Regards,
>> > Boqun
>> >
>> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
>> > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
>> 
>> FWIW the last time the bug was observed on syzbot was 100 days ago, so
>> it has likely been fixed since then or has become much harder to
>> reproduce.
>> 
> That is even worse, if it has lasted for 100 days already.
>

My understanding is that the evidence shows that the
issue that directly caused the null-ptr-deref was
fixed 100 days ago.

Regards,
Boqun

> --
> Uladzislau Rezki


end of thread, other threads:[~2025-06-12 17:20 UTC | newest]

Thread overview: 20+ messages
2025-02-05  0:34 [syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3) syzbot
2025-02-05 14:56 ` Paul E. McKenney
2025-06-08 15:26   ` Kent Overstreet
2025-06-08 18:23     ` Uladzislau Rezki
2025-06-09  0:25       ` Paul E. McKenney
2025-06-09  8:35         ` Uladzislau Rezki
2025-06-09  9:47           ` Paul E. McKenney
2025-06-09 14:20             ` Joel Fernandes
2025-06-10 12:19               ` Uladzislau Rezki
2025-06-09 18:28       ` Vlastimil Babka
2025-06-10 12:33         ` Uladzislau Rezki
2025-06-08  6:58 ` [syzbot] [bcachefs?] [rcu?] " syzbot
2025-06-11 15:58 ` [syzbot] [rcu?] [bcachefs?] " Uladzislau Rezki
2025-06-11 18:02   ` [syzbot] [bcachefs?] [rcu?] " syzbot
2025-06-11 19:15     ` Uladzislau Rezki
2025-06-11 19:57       ` syzbot
2025-06-11 20:58         ` Boqun Feng
2025-06-12  7:42           ` Aleksandr Nogikh
2025-06-12  9:37             ` Uladzislau Rezki
2025-06-12 17:20               ` Boqun Feng
