public inbox for linux-kernel@vger.kernel.org
* [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues
@ 2024-06-03 10:22 syzbot
  2024-06-03 11:04 ` Hillf Danton
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: syzbot @ 2024-06-03 10:22 UTC
  To: anna-maria, frederic, linux-kernel, syzkaller-bugs, tglx

Hello,

syzbot found the following issue on:

HEAD commit:    4a4be1ad3a6e Revert "vfs: Delete the associated dentry whe..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1422a73c980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=bd6024aedb15e15c
dashboard link: https://syzkaller.appspot.com/bug?extid=558f67d44ad7f098a3de
compiler:       aarch64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15583162980000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12c1b514980000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-4a4be1ad.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/75957361122b/vmlinux-4a4be1ad.xz
kernel image: https://storage.googleapis.com/syzbot-assets/6c766b0ec377/Image-4a4be1ad.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+558f67d44ad7f098a3de@syzkaller.appspotmail.com

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000090
Mem abort info:
  ESR = 0x0000000096000006
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x06: level 2 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 52-bit VAs, pgdp=000000004605bb80
[0000000000000090] pgd=08000000464ee003, p4d=08000000472aa003, pud=08000000471b8003, pmd=0000000000000000
Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 3192 Comm: syz-executor607 Not tainted 6.10.0-rc1-syzkaller-00027-g4a4be1ad3a6e #0
Hardware name: linux,dummy-virt (DT)
pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : rb_next+0x1c/0x54 lib/rbtree.c:505
lr : rb_erase_cached include/linux/rbtree.h:124 [inline]
lr : timerqueue_del+0x38/0x70 lib/timerqueue.c:57
sp : ffff800080003e70
x29: ffff800080003e70 x28: 0000000000000000 x27: fff000007f8cf780
x26: 0000000000000001 x25: 00000000000000c0 x24: 0000001f0198bc90
x23: fff000007f8cf780 x22: fff000007f8cf7e0 x21: fff000007f8cf780
x20: fff000007f8cf7e0 x19: ffff800088c3bd60 x18: 0000000000000000
x17: fff07ffffd319000 x16: ffff800080000000 x15: 0000ffffef309d38
x14: 00000000000003bb x13: 0000000000000000 x12: ffff8000825e0028
x11: 0000000000000001 x10: 0000000000000200 x9 : 0000000000200000
x8 : 0008000000000000 x7 : ff7ffffffffffbff x6 : 00000000019a23f5
x5 : fff07ffffd319000 x4 : 000000000a2dca90 x3 : ffff800088c3bd60
x2 : ff7000007f8cf8e8 x1 : 0000000000000080 x0 : 0000000000000080
Call trace:
 rb_next+0x1c/0x54 lib/rbtree.c:505
 __remove_hrtimer kernel/time/hrtimer.c:1118 [inline]
 __run_hrtimer kernel/time/hrtimer.c:1667 [inline]
 __hrtimer_run_queues+0x104/0x1bc kernel/time/hrtimer.c:1751
 hrtimer_interrupt+0xe8/0x244 kernel/time/hrtimer.c:1813
 timer_handler drivers/clocksource/arm_arch_timer.c:674 [inline]
 arch_timer_handler_phys+0x2c/0x44 drivers/clocksource/arm_arch_timer.c:692
 handle_percpu_devid_irq+0x84/0x130 kernel/irq/chip.c:942
 generic_handle_irq_desc include/linux/irqdesc.h:173 [inline]
 handle_irq_desc kernel/irq/irqdesc.c:691 [inline]
 generic_handle_domain_irq+0x2c/0x44 kernel/irq/irqdesc.c:747
 gic_handle_irq+0x40/0xc4 drivers/irqchip/irq-gic.c:370
 call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:889
 do_interrupt_handler+0x80/0x84 arch/arm64/kernel/entry-common.c:310
 __el1_irq arch/arm64/kernel/entry-common.c:536 [inline]
 el1_interrupt+0x34/0x64 arch/arm64/kernel/entry-common.c:551
 el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:556
 el1h_64_irq+0x64/0x68 arch/arm64/kernel/entry.S:594
 __clear_young_dirty_ptes arch/arm64/include/asm/pgtable.h:1311 [inline]
 contpte_clear_young_dirty_ptes+0x68/0x128 arch/arm64/mm/contpte.c:389
 walk_pmd_range mm/pagewalk.c:143 [inline]
 walk_pud_range mm/pagewalk.c:221 [inline]
 walk_p4d_range mm/pagewalk.c:256 [inline]
 walk_pgd_range+0x4b0/0x8a4 mm/pagewalk.c:293
 __walk_page_range+0x178/0x180 mm/pagewalk.c:395
 walk_page_range+0x144/0x224 mm/pagewalk.c:521
 madvise_free_single_vma+0x134/0x2bc mm/madvise.c:815
 madvise_dontneed_free mm/madvise.c:929 [inline]
 madvise_vma_behavior+0x1d0/0x790 mm/madvise.c:1046
 madvise_walk_vmas+0xbc/0x12c mm/madvise.c:1268
 do_madvise+0x160/0x418 mm/madvise.c:1464
 __do_sys_madvise mm/madvise.c:1481 [inline]
 __se_sys_madvise mm/madvise.c:1479 [inline]
 __arm64_sys_madvise+0x24/0x34 mm/madvise.c:1479
 __invoke_syscall arch/arm64/kernel/syscall.c:34 [inline]
 invoke_syscall+0x48/0x118 arch/arm64/kernel/syscall.c:48
 el0_svc_common.constprop.0+0x40/0xe0 arch/arm64/kernel/syscall.c:133
 do_el0_svc+0x1c/0x28 arch/arm64/kernel/syscall.c:152
 el0_svc+0x34/0xf8 arch/arm64/kernel/entry-common.c:712
 el0t_64_sync_handler+0x100/0x12c arch/arm64/kernel/entry-common.c:730
 el0t_64_sync+0x19c/0x1a0 arch/arm64/kernel/entry.S:598
Code: 54000200 f9400401 b4000141 aa0103e0 (f9400821) 
---[ end trace 0000000000000000 ]---
----------------
Code disassembly (best guess):
   0:	54000200 	b.eq	0x40  // b.none
   4:	f9400401 	ldr	x1, [x0, #8]
   8:	b4000141 	cbz	x1, 0x30
   c:	aa0103e0 	mov	x0, x1
* 10:	f9400821 	ldr	x1, [x1, #16] <-- trapping instruction


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

* Re: [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues
  2024-06-03 10:22 [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues syzbot
@ 2024-06-03 11:04 ` Hillf Danton
  2024-06-04 12:29 ` Thomas Gleixner
  2024-06-04 12:45 ` Hillf Danton
  2 siblings, 0 replies; 7+ messages in thread
From: Hillf Danton @ 2024-06-03 11:04 UTC
  To: syzbot; +Cc: frederic, linux-kernel, syzkaller-bugs, tglx

On Mon, 03 Jun 2024 03:22:29 -0700
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000090
> CPU: 0 PID: 3192 Comm: syz-executor607 Not tainted 6.10.0-rc1-syzkaller-00027-g4a4be1ad3a6e #0
> Hardware name: linux,dummy-virt (DT)
> pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : rb_next+0x1c/0x54 lib/rbtree.c:505
> lr : rb_erase_cached include/linux/rbtree.h:124 [inline]
> lr : timerqueue_del+0x38/0x70 lib/timerqueue.c:57
> sp : ffff800080003e70
> x29: ffff800080003e70 x28: 0000000000000000 x27: fff000007f8cf780
> x26: 0000000000000001 x25: 00000000000000c0 x24: 0000001f0198bc90
> x23: fff000007f8cf780 x22: fff000007f8cf7e0 x21: fff000007f8cf780
> x20: fff000007f8cf7e0 x19: ffff800088c3bd60 x18: 0000000000000000
> x17: fff07ffffd319000 x16: ffff800080000000 x15: 0000ffffef309d38
> x14: 00000000000003bb x13: 0000000000000000 x12: ffff8000825e0028
> x11: 0000000000000001 x10: 0000000000000200 x9 : 0000000000200000
> x8 : 0008000000000000 x7 : ff7ffffffffffbff x6 : 00000000019a23f5
> x5 : fff07ffffd319000 x4 : 000000000a2dca90 x3 : ffff800088c3bd60
> x2 : ff7000007f8cf8e8 x1 : 0000000000000080 x0 : 0000000000000080
> Call trace:
>  rb_next+0x1c/0x54 lib/rbtree.c:505
>  __remove_hrtimer kernel/time/hrtimer.c:1118 [inline]
>  __run_hrtimer kernel/time/hrtimer.c:1667 [inline]
>  __hrtimer_run_queues+0x104/0x1bc kernel/time/hrtimer.c:1751
>  hrtimer_interrupt+0xe8/0x244 kernel/time/hrtimer.c:1813

After scratching my head for 30 minutes, I still could not work out how
the timer was armed.

* Re: [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues
  2024-06-03 10:22 [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues syzbot
  2024-06-03 11:04 ` Hillf Danton
@ 2024-06-04 12:29 ` Thomas Gleixner
  2024-06-04 13:34   ` Will Deacon
  2024-06-04 12:45 ` Hillf Danton
  2 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2024-06-04 12:29 UTC
  To: syzbot, anna-maria, frederic, linux-kernel, syzkaller-bugs,
	Catalin Marinas, Will Deacon

On Mon, Jun 03 2024 at 03:22, syzbot wrote:

Cc+ ARM64 folks

Content untrimmed for reference.

> syzbot found the following issue on:
>
> HEAD commit:    4a4be1ad3a6e Revert "vfs: Delete the associated dentry whe..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1422a73c980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=bd6024aedb15e15c
> dashboard link: https://syzkaller.appspot.com/bug?extid=558f67d44ad7f098a3de
> compiler:       aarch64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> userspace arch: arm64
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15583162980000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12c1b514980000
>
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/384ffdcca292/non_bootable_disk-4a4be1ad.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/75957361122b/vmlinux-4a4be1ad.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/6c766b0ec377/Image-4a4be1ad.gz.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+558f67d44ad7f098a3de@syzkaller.appspotmail.com
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000090
> Mem abort info:
>   ESR = 0x0000000096000006
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
>   FSC = 0x06: level 2 translation fault
> Data abort info:
>   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
>   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 52-bit VAs, pgdp=000000004605bb80
> [0000000000000090] pgd=08000000464ee003, p4d=08000000472aa003, pud=08000000471b8003, pmd=0000000000000000
> Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 3192 Comm: syz-executor607 Not tainted 6.10.0-rc1-syzkaller-00027-g4a4be1ad3a6e #0
> Hardware name: linux,dummy-virt (DT)
> pstate: 204000c9 (nzCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : rb_next+0x1c/0x54 lib/rbtree.c:505
> lr : rb_erase_cached include/linux/rbtree.h:124 [inline]
> lr : timerqueue_del+0x38/0x70 lib/timerqueue.c:57
> sp : ffff800080003e70
> x29: ffff800080003e70 x28: 0000000000000000 x27: fff000007f8cf780
> x26: 0000000000000001 x25: 00000000000000c0 x24: 0000001f0198bc90
> x23: fff000007f8cf780 x22: fff000007f8cf7e0 x21: fff000007f8cf780
> x20: fff000007f8cf7e0 x19: ffff800088c3bd60 x18: 0000000000000000
> x17: fff07ffffd319000 x16: ffff800080000000 x15: 0000ffffef309d38
> x14: 00000000000003bb x13: 0000000000000000 x12: ffff8000825e0028
> x11: 0000000000000001 x10: 0000000000000200 x9 : 0000000000200000
> x8 : 0008000000000000 x7 : ff7ffffffffffbff x6 : 00000000019a23f5
> x5 : fff07ffffd319000 x4 : 000000000a2dca90 x3 : ffff800088c3bd60
> x2 : ff7000007f8cf8e8 x1 : 0000000000000080 x0 : 0000000000000080
> Call trace:
>  rb_next+0x1c/0x54 lib/rbtree.c:505
>  __remove_hrtimer kernel/time/hrtimer.c:1118 [inline]
>  __run_hrtimer kernel/time/hrtimer.c:1667 [inline]
>  __hrtimer_run_queues+0x104/0x1bc kernel/time/hrtimer.c:1751
>  hrtimer_interrupt+0xe8/0x244 kernel/time/hrtimer.c:1813
>  timer_handler drivers/clocksource/arm_arch_timer.c:674 [inline]
>  arch_timer_handler_phys+0x2c/0x44 drivers/clocksource/arm_arch_timer.c:692
>  handle_percpu_devid_irq+0x84/0x130 kernel/irq/chip.c:942
>  generic_handle_irq_desc include/linux/irqdesc.h:173 [inline]
>  handle_irq_desc kernel/irq/irqdesc.c:691 [inline]
>  generic_handle_domain_irq+0x2c/0x44 kernel/irq/irqdesc.c:747
>  gic_handle_irq+0x40/0xc4 drivers/irqchip/irq-gic.c:370
>  call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:889
>  do_interrupt_handler+0x80/0x84 arch/arm64/kernel/entry-common.c:310
>  __el1_irq arch/arm64/kernel/entry-common.c:536 [inline]
>  el1_interrupt+0x34/0x64 arch/arm64/kernel/entry-common.c:551
>  el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:556
>  el1h_64_irq+0x64/0x68 arch/arm64/kernel/entry.S:594
>  __clear_young_dirty_ptes arch/arm64/include/asm/pgtable.h:1311 [inline]
>  contpte_clear_young_dirty_ptes+0x68/0x128 arch/arm64/mm/contpte.c:389
>  walk_pmd_range mm/pagewalk.c:143 [inline]
>  walk_pud_range mm/pagewalk.c:221 [inline]
>  walk_p4d_range mm/pagewalk.c:256 [inline]
>  walk_pgd_range+0x4b0/0x8a4 mm/pagewalk.c:293
>  __walk_page_range+0x178/0x180 mm/pagewalk.c:395
>  walk_page_range+0x144/0x224 mm/pagewalk.c:521
>  madvise_free_single_vma+0x134/0x2bc mm/madvise.c:815
>  madvise_dontneed_free mm/madvise.c:929 [inline]
>  madvise_vma_behavior+0x1d0/0x790 mm/madvise.c:1046
>  madvise_walk_vmas+0xbc/0x12c mm/madvise.c:1268
>  do_madvise+0x160/0x418 mm/madvise.c:1464
>  __do_sys_madvise mm/madvise.c:1481 [inline]
>  __se_sys_madvise mm/madvise.c:1479 [inline]
>  __arm64_sys_madvise+0x24/0x34 mm/madvise.c:1479
>  __invoke_syscall arch/arm64/kernel/syscall.c:34 [inline]
>  invoke_syscall+0x48/0x118 arch/arm64/kernel/syscall.c:48
>  el0_svc_common.constprop.0+0x40/0xe0 arch/arm64/kernel/syscall.c:133
>  do_el0_svc+0x1c/0x28 arch/arm64/kernel/syscall.c:152
>  el0_svc+0x34/0xf8 arch/arm64/kernel/entry-common.c:712
>  el0t_64_sync_handler+0x100/0x12c arch/arm64/kernel/entry-common.c:730
>  el0t_64_sync+0x19c/0x1a0 arch/arm64/kernel/entry.S:598
> Code: 54000200 f9400401 b4000141 aa0103e0 (f9400821) 
> ---[ end trace 0000000000000000 ]---
> ----------------
> Code disassembly (best guess):
>    0:	54000200 	b.eq	0x40  // b.none
>    4:	f9400401 	ldr	x1, [x0, #8]
>    8:	b4000141 	cbz	x1, 0x30
>    c:	aa0103e0 	mov	x0, x1
> * 10:	f9400821 	ldr	x1, [x1, #16] <-- trapping instruction

So this is the following code in rb_next():

>    4:	f9400401 	ldr	x1, [x0, #8]    // Offset 8 in @node
>    8:	b4000141 	cbz	x1, 0x30
	if (node->rb_right) {

>    c:	aa0103e0 	mov	x0, x1          // Saves node::rb_right
		node = node->rb_right;

> * 10:	f9400821 	ldr	x1, [x1, #16] <-- trapping instruction
		while (node->rb_left)

> x2 : ff7000007f8cf8e8 x1 : 0000000000000080 x0 : 0000000000000080

which obviously crashes. Now the question is how does the original node
end up with node::rb_right == 0x80?

I doubt that this is a hrtimer or rbtree problem. It smells like random
data corruption caused by whatever. It might not even be an ARM64
specific issue though the C repro does not trigger on x86...

Handing it over to Catalin and Will.

Thanks,

	tglx

* Re: [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues
  2024-06-03 10:22 [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues syzbot
  2024-06-03 11:04 ` Hillf Danton
  2024-06-04 12:29 ` Thomas Gleixner
@ 2024-06-04 12:45 ` Hillf Danton
  2024-06-04 13:30   ` syzbot
  2 siblings, 1 reply; 7+ messages in thread
From: Hillf Danton @ 2024-06-04 12:45 UTC
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Mon, 03 Jun 2024 03:22:29 -0700
> syzbot found the following issue on:
> 
> HEAD commit:    4a4be1ad3a6e Revert "vfs: Delete the associated dentry whe..
> git tree:       upstream
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12c1b514980000

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git  master

--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -376,7 +376,7 @@ void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,
 	 * clearing access/dirty for the whole block.
 	 */
 	unsigned long start = addr;
-	unsigned long end = start + nr;
+	unsigned long end = start + nr * PAGE_SIZE;
 
 	if (pte_cont(__ptep_get(ptep + nr - 1)))
 		end = ALIGN(end, CONT_PTE_SIZE);
@@ -386,7 +386,7 @@ void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,
 		ptep = contpte_align_down(ptep);
 	}
 
-	__clear_young_dirty_ptes(vma, start, ptep, end - start, flags);
+	__clear_young_dirty_ptes(vma, start, ptep, (end - start) / PAGE_SIZE, flags);
 }
 EXPORT_SYMBOL_GPL(contpte_clear_young_dirty_ptes);
 
--

* Re: [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues
  2024-06-04 12:45 ` Hillf Danton
@ 2024-06-04 13:30   ` syzbot
  0 siblings, 0 replies; 7+ messages in thread
From: syzbot @ 2024-06-04 13:30 UTC
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+558f67d44ad7f098a3de@syzkaller.appspotmail.com

Tested on:

commit:         2ab79514 Merge tag 'cxl-fixes-6.10-rc3' of git://git.k..
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=121dd4ac980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=48aeb395bedeb71f
dashboard link: https://syzkaller.appspot.com/bug?extid=558f67d44ad7f098a3de
compiler:       aarch64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1346c226980000

Note: testing is done by a robot and is best-effort only.

* Re: [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues
  2024-06-04 12:29 ` Thomas Gleixner
@ 2024-06-04 13:34   ` Will Deacon
  2024-06-04 16:10     ` Thomas Gleixner
  0 siblings, 1 reply; 7+ messages in thread
From: Will Deacon @ 2024-06-04 13:34 UTC
  To: Thomas Gleixner
  Cc: syzbot, anna-maria, frederic, linux-kernel, syzkaller-bugs,
	Catalin Marinas

Hi Thomas,

On Tue, Jun 04, 2024 at 02:29:57PM +0200, Thomas Gleixner wrote:
> On Mon, Jun 03 2024 at 03:22, syzbot wrote:
> Cc+ ARM64 folks
> 
> Content untrimmed for reference.

Thanks! I'll trim it now...

> >  __clear_young_dirty_ptes arch/arm64/include/asm/pgtable.h:1311 [inline]
> >  contpte_clear_young_dirty_ptes+0x68/0x128 arch/arm64/mm/contpte.c:389
> >  walk_pmd_range mm/pagewalk.c:143 [inline]
> >  walk_pud_range mm/pagewalk.c:221 [inline]
> >  walk_p4d_range mm/pagewalk.c:256 [inline]
> >  walk_pgd_range+0x4b0/0x8a4 mm/pagewalk.c:293
> >  __walk_page_range+0x178/0x180 mm/pagewalk.c:395
> >  walk_page_range+0x144/0x224 mm/pagewalk.c:521
> >  madvise_free_single_vma+0x134/0x2bc mm/madvise.c:815
> >  madvise_dontneed_free mm/madvise.c:929 [inline]
> >  madvise_vma_behavior+0x1d0/0x790 mm/madvise.c:1046
> >  madvise_walk_vmas+0xbc/0x12c mm/madvise.c:1268
> >  do_madvise+0x160/0x418 mm/madvise.c:1464
> >  __do_sys_madvise mm/madvise.c:1481 [inline]
> >  __se_sys_madvise mm/madvise.c:1479 [inline]
> >  __arm64_sys_madvise+0x24/0x34 mm/madvise.c:1479
> >  __invoke_syscall arch/arm64/kernel/syscall.c:34 [inline]
> >  invoke_syscall+0x48/0x118 arch/arm64/kernel/syscall.c:48
> >  el0_svc_common.constprop.0+0x40/0xe0 arch/arm64/kernel/syscall.c:133
> >  do_el0_svc+0x1c/0x28 arch/arm64/kernel/syscall.c:152
> >  el0_svc+0x34/0xf8 arch/arm64/kernel/entry-common.c:712
> >  el0t_64_sync_handler+0x100/0x12c arch/arm64/kernel/entry-common.c:730
> >  el0t_64_sync+0x19c/0x1a0 arch/arm64/kernel/entry.S:598
> > Code: 54000200 f9400401 b4000141 aa0103e0 (f9400821) 
> > ---[ end trace 0000000000000000 ]---
> > ----------------
> > Code disassembly (best guess):
> >    0:	54000200 	b.eq	0x40  // b.none
> >    4:	f9400401 	ldr	x1, [x0, #8]
> >    8:	b4000141 	cbz	x1, 0x30
> >    c:	aa0103e0 	mov	x0, x1
> > * 10:	f9400821 	ldr	x1, [x1, #16] <-- trapping instruction
> 
> So this is the following code in rb_next():
> 
> >    4:	f9400401 	ldr	x1, [x0, #8]    // Offset 8 in @node
> >    8:	b4000141 	cbz	x1, 0x30
> 	if (node->rb_right) {
> 
> >    c:	aa0103e0 	mov	x0, x1          // Saves node::rb_right
> 		node = node->rb_right;
> 
> > * 10:	f9400821 	ldr	x1, [x1, #16] <-- trapping instruction
> 		while (node->rb_left)
> 
> > x2 : ff7000007f8cf8e8 x1 : 0000000000000080 x0 : 0000000000000080
> 
> which obviously crashes. Now the question is how does the original node
> end up with node::rb_right == 0x80?
> 
> I doubt that this is a hrtimer or rbtree problem. It smells like random
> data corruption caused by whatever. It might not even be an ARM64
> specific issue though the C repro does not trigger on x86...
> 
> Handing it over to Catalin and Will.

I suspect this is a duplicate of:

https://lore.kernel.org/lkml/20240604110119.GA20284@willie-the-truck/

and there's a fix queued in the -mm tree.

Will

* Re: [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues
  2024-06-04 13:34   ` Will Deacon
@ 2024-06-04 16:10     ` Thomas Gleixner
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2024-06-04 16:10 UTC
  To: Will Deacon
  Cc: syzbot, anna-maria, frederic, linux-kernel, syzkaller-bugs,
	Catalin Marinas

Will!

On Tue, Jun 04 2024 at 14:34, Will Deacon wrote:
> On Tue, Jun 04, 2024 at 02:29:57PM +0200, Thomas Gleixner wrote:
>> On Mon, Jun 03 2024 at 03:22, syzbot wrote:
>> > * 10:	f9400821 	ldr	x1, [x1, #16] <-- trapping instruction
>> 		while (node->rb_left)
>> 
>> > x2 : ff7000007f8cf8e8 x1 : 0000000000000080 x0 : 0000000000000080
>> 
>> which obviously crashes. Now the question is how does the original node
>> end up with node::rb_right == 0x80?
>> 
>> I doubt that this is a hrtimer or rbtree problem. It smells like random
>> data corruption caused by whatever. It might not even be an ARM64
>> specific issue though the C repro does not trigger on x86...
>> 
>> Handing it over to Catalin and Will.
>
> I suspect this is a duplicate of:
>
> https://lore.kernel.org/lkml/20240604110119.GA20284@willie-the-truck/
>
> and there's a fix queued in the -mm tree.

It very much looks that way.

Thanks,

        tglx

Thread overview: 7+ messages
2024-06-03 10:22 [syzbot] [kernel?] BUG: unable to handle kernel NULL pointer dereference in __hrtimer_run_queues syzbot
2024-06-03 11:04 ` Hillf Danton
2024-06-04 12:29 ` Thomas Gleixner
2024-06-04 13:34   ` Will Deacon
2024-06-04 16:10     ` Thomas Gleixner
2024-06-04 12:45 ` Hillf Danton
2024-06-04 13:30   ` syzbot
