* BUG: soft lockup in smp_call_function
@ 2023-09-12 23:02 Sanan Hasanov
  2023-09-13 10:05 ` Peter Zijlstra
  2023-09-13 11:07 ` Hillf Danton
  0 siblings, 2 replies; 19+ messages in thread
From: Sanan Hasanov @ 2023-09-12 23:02 UTC
  To: peterz@infradead.org, paulmck@kernel.org, jgross@suse.com,
	vschneid@redhat.com, yury.norov@gmail.com,
	linux-kernel@vger.kernel.org
  Cc: syzkaller@googlegroups.com, contact@pgazz.com

Good day, dear maintainers,

We found a bug using a modified kernel configuration file used by syzbot.

We enhanced the coverage of the configuration file using our tool, klocalizer.

Kernel Branch: 6.3.0-next-20230426
Kernel Config: https://drive.google.com/file/d/1WSUEWrith9-539qo6xRqmwy4LfDtmKpp/view?usp=sharing
Reproducer: https://drive.google.com/file/d/1pN6FfcjuUs6Wx94g1gufuYGjRbMMgiZ4/view?usp=sharing
Thank you!

Best regards,
Sanan Hasanov

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [kworker/u16:1:12]
Modules linked in:
irq event stamp: 192794
hardirqs last  enabled at (192793): [<ffffffff89a0140a>] asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
hardirqs last disabled at (192794): [<ffffffff89975d4f>] sysvec_apic_timer_interrupt+0xf/0xc0 arch/x86/kernel/apic/apic.c:1106
softirqs last  enabled at (187764): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last  enabled at (187764): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
softirqs last disabled at (187671): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last disabled at (187671): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
CPU: 5 PID: 12 Comm: kworker/u16:1 Not tainted 6.3.0-next-20230426 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Workqueue: events_unbound toggle_allocation_gate
RIP: 0010:csd_lock_wait kernel/smp.c:294 [inline]
RIP: 0010:smp_call_function_many_cond+0x5bd/0x1020 kernel/smp.c:828
Code: 0b 00 85 ed 74 4d 48 b8 00 00 00 00 00 fc ff df 4d 89 f4 4c 89 f5 49 c1 ec 03 83 e5 07 49 01 c4 83 c5 03 e8 b5 07 0b 00 f3 90 <41> 0f b6 04 24 40 38 c5 7c 08 84 c0 0f 85 46 08 00 00 8b 43 08 31
RSP: 0018:ffffc900000cf9e8 EFLAGS: 00000293
RAX: 0000000000000000 RBX: ffff888119cc4d80 RCX: 0000000000000000
RDX: ffff888100325940 RSI: ffffffff8176807b RDI: 0000000000000005
RBP: 0000000000000003 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffffed10233989b1
R13: 0000000000000001 R14: ffff888119cc4d88 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555556a6cc88 CR3: 000000000bb73000 CR4: 0000000000350ee0
Call Trace:
 <TASK>
 on_each_cpu_cond_mask+0x40/0x90 kernel/smp.c:996
 on_each_cpu include/linux/smp.h:71 [inline]
 text_poke_sync arch/x86/kernel/alternative.c:1770 [inline]
 text_poke_bp_batch+0x237/0x770 arch/x86/kernel/alternative.c:1970
 text_poke_flush arch/x86/kernel/alternative.c:2161 [inline]
 text_poke_flush arch/x86/kernel/alternative.c:2158 [inline]
 text_poke_finish+0x1a/0x30 arch/x86/kernel/alternative.c:2168
 arch_jump_label_transform_apply+0x17/0x30 arch/x86/kernel/jump_label.c:146
 jump_label_update+0x321/0x400 kernel/jump_label.c:829
 static_key_enable_cpuslocked+0x1b5/0x270 kernel/jump_label.c:205
 static_key_enable+0x1a/0x20 kernel/jump_label.c:218
 toggle_allocation_gate mm/kfence/core.c:831 [inline]
 toggle_allocation_gate+0xf4/0x220 mm/kfence/core.c:823
 process_one_work+0x993/0x15e0 kernel/workqueue.c:2405
 worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
 kthread+0x33e/0x440 kernel/kthread.c:379
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
 </TASK>
Sending NMI from CPU 5 to CPUs 0-4,6-7:
NMI backtrace for cpu 1
CPU: 1 PID: 20602 Comm: syz-executor.3 Not tainted 6.3.0-next-20230426 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:hlock_class kernel/locking/lockdep.c:228 [inline]
RIP: 0010:check_wait_context kernel/locking/lockdep.c:4747 [inline]
RIP: 0010:__lock_acquire+0x489/0x5d00 kernel/locking/lockdep.c:5024
Code: 41 81 e5 ff 1f 45 0f b7 ed be 08 00 00 00 4c 89 e8 48 c1 e8 06 48 8d 3c c5 00 6b 2c 90 e8 5f 90 6e 00 4c 0f a3 2d d7 35 c9 0e <0f> 83 5c 0c 00 00 4f 8d 6c 6d 00 49 c1 e5 06 49 81 c5 20 6f 2c 90
RSP: 0018:ffffc90002aa7350 EFLAGS: 00000047
RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffffff81633521
RDX: fffffbfff2058d62 RSI: 0000000000000008 RDI: ffffffff902c6b08
RBP: ffff888042995940 R08: 0000000000000000 R09: ffffffff902c6b0f
R10: fffffbfff2058d61 R11: 0000000000000001 R12: ffff888119e2b818
R13: 0000000000000063 R14: 0000000000000002 R15: ffff888042996598
FS:  00007fdaad065700(0000) GS:ffff888119c80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30623000 CR3: 0000000101969000 CR4: 0000000000350ee0
Call Trace:
 <TASK>
 lock_acquire kernel/locking/lockdep.c:5691 [inline]
 lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
 _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
 lock_hrtimer_base kernel/time/hrtimer.c:173 [inline]
 hrtimer_try_to_cancel kernel/time/hrtimer.c:1331 [inline]
 hrtimer_try_to_cancel+0xa9/0x2e0 kernel/time/hrtimer.c:1316
 hrtimer_cancel+0x17/0x40 kernel/time/hrtimer.c:1443
 __disable_vblank drivers/gpu/drm/drm_vblank.c:434 [inline]
 drm_vblank_disable_and_save+0x282/0x3d0 drivers/gpu/drm/drm_vblank.c:478
 drm_crtc_vblank_off+0x312/0x970 drivers/gpu/drm/drm_vblank.c:1366
 disable_outputs+0x7c7/0xbb0 drivers/gpu/drm/drm_atomic_helper.c:1202
 drm_atomic_helper_commit_modeset_disables+0x1d/0x40 drivers/gpu/drm/drm_atomic_helper.c:1397
 vkms_atomic_commit_tail+0x51/0x240 drivers/gpu/drm/vkms/vkms_drv.c:71
 commit_tail+0x288/0x420 drivers/gpu/drm/drm_atomic_helper.c:1812
 drm_atomic_helper_commit drivers/gpu/drm/drm_atomic_helper.c:2052 [inline]
 drm_atomic_helper_commit+0x306/0x390 drivers/gpu/drm/drm_atomic_helper.c:1985
 drm_atomic_commit+0x20a/0x2d0 drivers/gpu/drm/drm_atomic.c:1503
 drm_client_modeset_commit_atomic+0x698/0x7e0 drivers/gpu/drm/drm_client_modeset.c:1045
 drm_client_modeset_dpms+0x174/0x200 drivers/gpu/drm/drm_client_modeset.c:1226
 drm_fb_helper_dpms drivers/gpu/drm/drm_fb_helper.c:323 [inline]
 drm_fb_helper_blank+0xd1/0x260 drivers/gpu/drm/drm_fb_helper.c:356
 fb_blank+0x105/0x190 drivers/video/fbdev/core/fbmem.c:1088
 do_fb_ioctl+0x390/0x760 drivers/video/fbdev/core/fbmem.c:1180
 fb_ioctl+0xeb/0x150 drivers/video/fbdev/core/fbmem.c:1204
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:870 [inline]
 __se_sys_ioctl fs/ioctl.c:856 [inline]
 __x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0x80 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fdaabe8edcd
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fdaad064bf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fdaabfbbf80 RCX: 00007fdaabe8edcd
RDX: 0000000000000004 RSI: 0000000000004611 RDI: 0000000000000003
RBP: 00007fdaabefc59c R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffdadeffe9f R14: 00007ffdadf00040 R15: 00007fdaad064d80
 </TASK>
NMI backtrace for cpu 0 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
NMI backtrace for cpu 0 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
NMI backtrace for cpu 0 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
NMI backtrace for cpu 3 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
NMI backtrace for cpu 3 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
NMI backtrace for cpu 3 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
NMI backtrace for cpu 6 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
NMI backtrace for cpu 6 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
NMI backtrace for cpu 6 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
NMI backtrace for cpu 7 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
NMI backtrace for cpu 7 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
NMI backtrace for cpu 7 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
NMI backtrace for cpu 4
CPU: 4 PID: 20623 Comm: syz-executor.6 Not tainted 6.3.0-next-20230426 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:kvm_wait+0xb7/0x110 arch/x86/kernel/kvm.c:1064
Code: 40 38 c6 74 1b 48 83 c4 10 c3 c3 e8 93 d3 50 00 eb 07 0f 00 2d 4a 04 92 08 fb f4 48 83 c4 10 c3 eb 07 0f 00 2d 3a 04 92 08 f4 <48> 83 c4 10 c3 89 74 24 0c 48 89 3c 24 e8 d7 d4 50 00 8b 74 24 0c
RSP: 0018:ffffc90000300b50 EFLAGS: 00000046
RAX: 0000000000000003 RBX: 0000000000000000 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88810b0803d8
RBP: ffff88810b0803d8 R08: 0000000000000001 R09: ffff88810b0803d8
R10: ffffed102161007b R11: ffffc90000300ff8 R12: 0000000000000000
R13: ffffed102161007b R14: 0000000000000001 R15: ffff888119e3d3c0
FS:  0000000000000000(0000) GS:ffff888119e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f28183bd0b0 CR3: 000000000bb73000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 pv_wait arch/x86/include/asm/paravirt.h:598 [inline]
 pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
 __pv_queued_spin_lock_slowpath+0x8e4/0xb80 kernel/locking/qspinlock.c:511
 pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:586 [inline]
 queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
 queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
 do_raw_spin_lock+0x20d/0x2b0 kernel/locking/spinlock_debug.c:115
 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:111 [inline]
 _raw_spin_lock_irqsave+0x45/0x60 kernel/locking/spinlock.c:162
 drm_handle_vblank+0x11e/0xb80 drivers/gpu/drm/drm_vblank.c:1986
 vkms_vblank_simulate+0xe8/0x3e0 drivers/gpu/drm/vkms/vkms_crtc.c:29
 __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
 __hrtimer_run_queues+0x599/0xa30 kernel/time/hrtimer.c:1749
 hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1095 [inline]
 __sysvec_apic_timer_interrupt+0x14a/0x430 arch/x86/kernel/apic/apic.c:1112
 sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1106
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x11/0x70 kernel/kcov.c:207
Code: a8 01 00 00 e8 b0 ff ff ff 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 65 8b 05 0d 33 82 7e 89 c1 48 8b 34 24 <81> e1 00 01 00 00 65 48 8b 14 25 40 bb 03 00 a9 00 01 ff 00 74 0e
RSP: 0018:ffffc90002be76d8 EFLAGS: 00000286
RAX: 0000000080000001 RBX: 0000000000000001 RCX: 0000000080000001
RDX: 00007f2817c77000 RSI: ffffffff81bcd756 RDI: ffffc90002be7ad8
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffffea00014fc480
R13: 0000000000000000 R14: dffffc0000000000 R15: 8000000053f12007
 zap_drop_file_uffd_wp mm/memory.c:1352 [inline]
 zap_install_uffd_wp_if_needed mm/memory.c:1371 [inline]
 zap_pte_range mm/memory.c:1417 [inline]
 zap_pmd_range mm/memory.c:1564 [inline]
 zap_pud_range mm/memory.c:1593 [inline]
 zap_p4d_range mm/memory.c:1614 [inline]
 unmap_page_range+0x1046/0x4470 mm/memory.c:1635
 unmap_single_vma+0x19a/0x2b0 mm/memory.c:1681
 unmap_vmas+0x234/0x380 mm/memory.c:1720
 exit_mmap+0x190/0x930 mm/mmap.c:3111
 __mmput+0x128/0x4c0 kernel/fork.c:1351
 mmput+0x60/0x70 kernel/fork.c:1373
 exit_mm kernel/exit.c:564 [inline]
 do_exit+0x9d1/0x29f0 kernel/exit.c:858
 do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
 get_signal+0x2311/0x25c0 kernel/signal.c:2874
 arch_do_signal_or_restart+0x79/0x5a0 arch/x86/kernel/signal.c:307
 exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
 exit_to_user_mode_prepare+0x11f/0x240 kernel/entry/common.c:204
 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
 syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297
 do_syscall_64+0x46/0x80 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f281828edcd
Code: Unable to access opcode bytes at 0x7f281828eda3.
RSP: 002b:00007f28194c0c98 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00007f28183bbf80 RCX: 00007f281828edcd
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f28183bbf88
RBP: 00007f28183bbf88 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f28183bbf8c
R13: 00007ffd5038e1ef R14: 00007ffd5038e390 R15: 00007f28194c0d80
 </TASK>
NMI backtrace for cpu 2 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
NMI backtrace for cpu 2 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
NMI backtrace for cpu 2 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729


* Re: BUG: soft lockup in smp_call_function
  2023-09-12 23:02 BUG: soft lockup in smp_call_function Sanan Hasanov
@ 2023-09-13 10:05 ` Peter Zijlstra
  2023-09-13 11:07 ` Hillf Danton
  1 sibling, 0 replies; 19+ messages in thread
From: Peter Zijlstra @ 2023-09-13 10:05 UTC
  To: Sanan Hasanov
  Cc: paulmck@kernel.org, jgross@suse.com, vschneid@redhat.com,
	yury.norov@gmail.com, linux-kernel@vger.kernel.org,
	syzkaller@googlegroups.com, contact@pgazz.com,
	rodrigosiqueiramelo, melissa.srw, mairacanal, hamohammed.sa,
	daniel, airlied

On Tue, Sep 12, 2023 at 11:02:56PM +0000, Sanan Hasanov wrote:
> Good day, dear maintainers,
> 
> We found a bug using a modified kernel configuration file used by syzbot.
> 
> We enhanced the coverage of the configuration file using our tool, klocalizer.
> 
> Kernel Branch: 6.3.0-next-20230426
> Kernel Config: https://drive.google.com/file/d/1WSUEWrith9-539qo6xRqmwy4LfDtmKpp/view?usp=sharing
> Reproducer: https://drive.google.com/file/d/1pN6FfcjuUs6Wx94g1gufuYGjRbMMgiZ4/view?usp=sharing
> Thank you!

AFAICT the thing is stuck in DRM somewhere...

> watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [kworker/u16:1:12]
> Modules linked in:
> irq event stamp: 192794
> hardirqs last  enabled at (192793): [<ffffffff89a0140a>] asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
> hardirqs last disabled at (192794): [<ffffffff89975d4f>] sysvec_apic_timer_interrupt+0xf/0xc0 arch/x86/kernel/apic/apic.c:1106
> softirqs last  enabled at (187764): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
> softirqs last  enabled at (187764): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
> softirqs last disabled at (187671): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
> softirqs last disabled at (187671): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
> CPU: 5 PID: 12 Comm: kworker/u16:1 Not tainted 6.3.0-next-20230426 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Workqueue: events_unbound toggle_allocation_gate
> RIP: 0010:csd_lock_wait kernel/smp.c:294 [inline]
> RIP: 0010:smp_call_function_many_cond+0x5bd/0x1020 kernel/smp.c:828
> Code: 0b 00 85 ed 74 4d 48 b8 00 00 00 00 00 fc ff df 4d 89 f4 4c 89 f5 49 c1 ec 03 83 e5 07 49 01 c4 83 c5 03 e8 b5 07 0b 00 f3 90 <41> 0f b6 04 24 40 38 c5 7c 08 84 c0 0f 85 46 08 00 00 8b 43 08 31
> RSP: 0018:ffffc900000cf9e8 EFLAGS: 00000293
> RAX: 0000000000000000 RBX: ffff888119cc4d80 RCX: 0000000000000000
> RDX: ffff888100325940 RSI: ffffffff8176807b RDI: 0000000000000005
> RBP: 0000000000000003 R08: 0000000000000005 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: ffffed10233989b1
> R13: 0000000000000001 R14: ffff888119cc4d88 R15: 0000000000000001
> FS:  0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000555556a6cc88 CR3: 000000000bb73000 CR4: 0000000000350ee0
> Call Trace:
>  <TASK>
>  on_each_cpu_cond_mask+0x40/0x90 kernel/smp.c:996
>  on_each_cpu include/linux/smp.h:71 [inline]
>  text_poke_sync arch/x86/kernel/alternative.c:1770 [inline]
>  text_poke_bp_batch+0x237/0x770 arch/x86/kernel/alternative.c:1970
>  text_poke_flush arch/x86/kernel/alternative.c:2161 [inline]
>  text_poke_flush arch/x86/kernel/alternative.c:2158 [inline]
>  text_poke_finish+0x1a/0x30 arch/x86/kernel/alternative.c:2168
>  arch_jump_label_transform_apply+0x17/0x30 arch/x86/kernel/jump_label.c:146
>  jump_label_update+0x321/0x400 kernel/jump_label.c:829
>  static_key_enable_cpuslocked+0x1b5/0x270 kernel/jump_label.c:205
>  static_key_enable+0x1a/0x20 kernel/jump_label.c:218
>  toggle_allocation_gate mm/kfence/core.c:831 [inline]
>  toggle_allocation_gate+0xf4/0x220 mm/kfence/core.c:823
>  process_one_work+0x993/0x15e0 kernel/workqueue.c:2405
>  worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
>  kthread+0x33e/0x440 kernel/kthread.c:379
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
>  </TASK>

Right, so this is waiting for an IPI to be processed.. while #1 has IRQs
disabled
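
The IRQ state of each CPU can be read off the EFLAGS value in its register
dump: bit 9 (IF) is set when maskable interrupts are enabled. A minimal
user-space sketch of that check follows; the macro and helper names are
made up for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 9 of x86 EFLAGS is IF, the interrupt-enable flag. */
#define EFLAGS_IF_MASK (UINT32_C(1) << 9)

/* Hypothetical helper: true if the dumped EFLAGS had interrupts enabled. */
static bool irqs_enabled(uint32_t eflags)
{
	return (eflags & EFLAGS_IF_MASK) != 0;
}
```

Fed with the values from this report, irqs_enabled(0x293) (CPU 5) is true,
while irqs_enabled(0x47) (CPU 1) and irqs_enabled(0x46) (CPU 4, inside the
timer interrupt) are false. The task that CPU 4 interrupted had 0x286,
which has IF set.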

> Sending NMI from CPU 5 to CPUs 0-4,6-7:
> NMI backtrace for cpu 1
> CPU: 1 PID: 20602 Comm: syz-executor.3 Not tainted 6.3.0-next-20230426 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:hlock_class kernel/locking/lockdep.c:228 [inline]
> RIP: 0010:check_wait_context kernel/locking/lockdep.c:4747 [inline]
> RIP: 0010:__lock_acquire+0x489/0x5d00 kernel/locking/lockdep.c:5024
> Code: 41 81 e5 ff 1f 45 0f b7 ed be 08 00 00 00 4c 89 e8 48 c1 e8 06 48 8d 3c c5 00 6b 2c 90 e8 5f 90 6e 00 4c 0f a3 2d d7 35 c9 0e <0f> 83 5c 0c 00 00 4f 8d 6c 6d 00 49 c1 e5 06 49 81 c5 20 6f 2c 90
> RSP: 0018:ffffc90002aa7350 EFLAGS: 00000047
> RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffffff81633521
> RDX: fffffbfff2058d62 RSI: 0000000000000008 RDI: ffffffff902c6b08
> RBP: ffff888042995940 R08: 0000000000000000 R09: ffffffff902c6b0f
> R10: fffffbfff2058d61 R11: 0000000000000001 R12: ffff888119e2b818
> R13: 0000000000000063 R14: 0000000000000002 R15: ffff888042996598
> FS:  00007fdaad065700(0000) GS:ffff888119c80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b30623000 CR3: 0000000101969000 CR4: 0000000000350ee0
> Call Trace:
>  <TASK>
>  lock_acquire kernel/locking/lockdep.c:5691 [inline]
>  lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656
>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>  _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
>  lock_hrtimer_base kernel/time/hrtimer.c:173 [inline]
>  hrtimer_try_to_cancel kernel/time/hrtimer.c:1331 [inline]
>  hrtimer_try_to_cancel+0xa9/0x2e0 kernel/time/hrtimer.c:1316
>  hrtimer_cancel+0x17/0x40 kernel/time/hrtimer.c:1443

And this is trying to cancel a hrtimer which is running on CPU4 and won't be
making much progress.

>  __disable_vblank drivers/gpu/drm/drm_vblank.c:434 [inline]

So we're here, holding vbl_lock and vblank_time_lock, one of which is what
#4 is waiting on.

This also has IRQs disabled, which is what #5 is waiting on.
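
The waits read off the backtraces can be written down as a small wait-for
table and checked for a loop. This is only a sketch under simplifying
assumptions: each CPU is given a single edge (CPU 5's IPI wait is recorded
against CPU 1 only), the edges are transcribed by hand, and all names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8
#define NOT_WAITING (-1)

/*
 * waits_on[cpu] is the CPU that `cpu` cannot make progress without,
 * or NOT_WAITING. Hand-transcribed from the report:
 *   CPU 5 waits for CPU 1 to take the IPI,
 *   CPU 1 waits for the hrtimer callback running on CPU 4,
 *   CPU 4 waits for a spinlock held by CPU 1.
 */
static const int waits_on[MODEL_NR_CPUS] = {
	[0] = NOT_WAITING, [1] = 4, [2] = NOT_WAITING, [3] = NOT_WAITING,
	[4] = 1, [5] = 1, [6] = NOT_WAITING, [7] = NOT_WAITING,
};

/* Follow the chain from `cpu`; more hops than CPUs means it loops. */
static bool blocked_forever(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	for (int hops = 0; hops <= MODEL_NR_CPUS; hops++) {
		if (waits_on[cpu] == NOT_WAITING)
			return false;
		cpu = waits_on[cpu];
	}
	return true;
}
```

With these edges, blocked_forever() is true for CPUs 1, 4 and 5 and false
for the idle CPUs: 1 and 4 wait on each other, and 5 is stuck behind them.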

>  drm_vblank_disable_and_save+0x282/0x3d0 drivers/gpu/drm/drm_vblank.c:478
>  drm_crtc_vblank_off+0x312/0x970 drivers/gpu/drm/drm_vblank.c:1366
>  disable_outputs+0x7c7/0xbb0 drivers/gpu/drm/drm_atomic_helper.c:1202
>  drm_atomic_helper_commit_modeset_disables+0x1d/0x40 drivers/gpu/drm/drm_atomic_helper.c:1397
>  vkms_atomic_commit_tail+0x51/0x240 drivers/gpu/drm/vkms/vkms_drv.c:71
>  commit_tail+0x288/0x420 drivers/gpu/drm/drm_atomic_helper.c:1812
>  drm_atomic_helper_commit drivers/gpu/drm/drm_atomic_helper.c:2052 [inline]
>  drm_atomic_helper_commit+0x306/0x390 drivers/gpu/drm/drm_atomic_helper.c:1985
>  drm_atomic_commit+0x20a/0x2d0 drivers/gpu/drm/drm_atomic.c:1503
>  drm_client_modeset_commit_atomic+0x698/0x7e0 drivers/gpu/drm/drm_client_modeset.c:1045
>  drm_client_modeset_dpms+0x174/0x200 drivers/gpu/drm/drm_client_modeset.c:1226
>  drm_fb_helper_dpms drivers/gpu/drm/drm_fb_helper.c:323 [inline]
>  drm_fb_helper_blank+0xd1/0x260 drivers/gpu/drm/drm_fb_helper.c:356
>  fb_blank+0x105/0x190 drivers/video/fbdev/core/fbmem.c:1088
>  do_fb_ioctl+0x390/0x760 drivers/video/fbdev/core/fbmem.c:1180
>  fb_ioctl+0xeb/0x150 drivers/video/fbdev/core/fbmem.c:1204
>  vfs_ioctl fs/ioctl.c:51 [inline]
>  __do_sys_ioctl fs/ioctl.c:870 [inline]
>  __se_sys_ioctl fs/ioctl.c:856 [inline]
>  __x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fdaabe8edcd
> Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fdaad064bf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 00007fdaabfbbf80 RCX: 00007fdaabe8edcd
> RDX: 0000000000000004 RSI: 0000000000004611 RDI: 0000000000000003
> RBP: 00007fdaabefc59c R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007ffdadeffe9f R14: 00007ffdadf00040 R15: 00007fdaad064d80
>  </TASK>



> CPU: 4 PID: 20623 Comm: syz-executor.6 Not tainted 6.3.0-next-20230426 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:kvm_wait+0xb7/0x110 arch/x86/kernel/kvm.c:1064
> Code: 40 38 c6 74 1b 48 83 c4 10 c3 c3 e8 93 d3 50 00 eb 07 0f 00 2d 4a 04 92 08 fb f4 48 83 c4 10 c3 eb 07 0f 00 2d 3a 04 92 08 f4 <48> 83 c4 10 c3 89 74 24 0c 48 89 3c 24 e8 d7 d4 50 00 8b 74 24 0c
> RSP: 0018:ffffc90000300b50 EFLAGS: 00000046
> RAX: 0000000000000003 RBX: 0000000000000000 RCX: dffffc0000000000
> RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88810b0803d8
> RBP: ffff88810b0803d8 R08: 0000000000000001 R09: ffff88810b0803d8
> R10: ffffed102161007b R11: ffffc90000300ff8 R12: 0000000000000000
> R13: ffffed102161007b R14: 0000000000000001 R15: ffff888119e3d3c0
> FS:  0000000000000000(0000) GS:ffff888119e00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f28183bd0b0 CR3: 000000000bb73000 CR4: 0000000000350ee0
> Call Trace:
>  <IRQ>
>  pv_wait arch/x86/include/asm/paravirt.h:598 [inline]
>  pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
>  __pv_queued_spin_lock_slowpath+0x8e4/0xb80 kernel/locking/qspinlock.c:511
>  pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:586 [inline]
>  queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
>  queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
>  do_raw_spin_lock+0x20d/0x2b0 kernel/locking/spinlock_debug.c:115
>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:111 [inline]
>  _raw_spin_lock_irqsave+0x45/0x60 kernel/locking/spinlock.c:162
>  drm_handle_vblank+0x11e/0xb80 drivers/gpu/drm/drm_vblank.c:1986

stuck on a spinlock held by #1

>  vkms_vblank_simulate+0xe8/0x3e0 drivers/gpu/drm/vkms/vkms_crtc.c:29
>  __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
>  __hrtimer_run_queues+0x599/0xa30 kernel/time/hrtimer.c:1749
>  hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1095 [inline]
>  __sysvec_apic_timer_interrupt+0x14a/0x430 arch/x86/kernel/apic/apic.c:1112
>  sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1106
>  </IRQ>
>  <TASK>
>  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
> RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
> RIP: 0010:__sanitizer_cov_trace_pc+0x11/0x70 kernel/kcov.c:207
> Code: a8 01 00 00 e8 b0 ff ff ff 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 65 8b 05 0d 33 82 7e 89 c1 48 8b 34 24 <81> e1 00 01 00 00 65 48 8b 14 25 40 bb 03 00 a9 00 01 ff 00 74 0e
> RSP: 0018:ffffc90002be76d8 EFLAGS: 00000286
> RAX: 0000000080000001 RBX: 0000000000000001 RCX: 0000000080000001
> RDX: 00007f2817c77000 RSI: ffffffff81bcd756 RDI: ffffc90002be7ad8
> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: ffffea00014fc480
> R13: 0000000000000000 R14: dffffc0000000000 R15: 8000000053f12007
>  zap_drop_file_uffd_wp mm/memory.c:1352 [inline]
>  zap_install_uffd_wp_if_needed mm/memory.c:1371 [inline]
>  zap_pte_range mm/memory.c:1417 [inline]
>  zap_pmd_range mm/memory.c:1564 [inline]
>  zap_pud_range mm/memory.c:1593 [inline]
>  zap_p4d_range mm/memory.c:1614 [inline]
>  unmap_page_range+0x1046/0x4470 mm/memory.c:1635
>  unmap_single_vma+0x19a/0x2b0 mm/memory.c:1681
>  unmap_vmas+0x234/0x380 mm/memory.c:1720
>  exit_mmap+0x190/0x930 mm/mmap.c:3111
>  __mmput+0x128/0x4c0 kernel/fork.c:1351
>  mmput+0x60/0x70 kernel/fork.c:1373
>  exit_mm kernel/exit.c:564 [inline]
>  do_exit+0x9d1/0x29f0 kernel/exit.c:858
>  do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
>  get_signal+0x2311/0x25c0 kernel/signal.c:2874
>  arch_do_signal_or_restart+0x79/0x5a0 arch/x86/kernel/signal.c:307
>  exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>  exit_to_user_mode_prepare+0x11f/0x240 kernel/entry/common.c:204
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
>  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297
>  do_syscall_64+0x46/0x80 arch/x86/entry/common.c:86
>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7f281828edcd
> Code: Unable to access opcode bytes at 0x7f281828eda3.
> RSP: 002b:00007f28194c0c98 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> RAX: fffffffffffffe00 RBX: 00007f28183bbf80 RCX: 00007f281828edcd
> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f28183bbf88
> RBP: 00007f28183bbf88 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007f28183bbf8c
> R13: 00007ffd5038e1ef R14: 00007ffd5038e390 R15: 00007f28194c0d80
>  </TASK>




* Re: BUG: soft lockup in smp_call_function
  2023-09-12 23:02 BUG: soft lockup in smp_call_function Sanan Hasanov
  2023-09-13 10:05 ` Peter Zijlstra
@ 2023-09-13 11:07 ` Hillf Danton
  2023-09-13 14:21     ` Tetsuo Handa
  2023-09-13 14:30   ` BUG: soft lockup in smp_call_function Tetsuo Handa
  1 sibling, 2 replies; 19+ messages in thread
From: Hillf Danton @ 2023-09-13 11:07 UTC
  To: Sanan Hasanov
  Cc: peterz, Linus Torvalds, Tetsuo Handa, Thomas Gleixner, syzkaller,
	linux-mm, LKML

On Tue, 12 Sep 2023 23:02:56 +0000 Sanan Hasanov <Sanan.Hasanov@ucf.edu> wrote:
> Good day, dear maintainers,
> 
> We found a bug using a modified kernel configuration file used by syzbot.
> 
Thanks for your report.

> We enhanced the coverage of the configuration file using our tool, klocalizer.
> 
> Kernel Branch: 6.3.0-next-20230426
> Kernel Config: https://drive.google.com/file/d/1WSUEWrith9-539qo6xRqmwy4LfDtmKpp/view?usp=sharing
> Reproducer: https://drive.google.com/file/d/1pN6FfcjuUs6Wx94g1gufuYGjRbMMgiZ4/view?usp=sharing
> Thank you!
> 
> Best regards,
> Sanan Hasanov
> 
> watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [kworker/u16:1:12]
> Modules linked in:
> irq event stamp: 192794
> hardirqs last  enabled at (192793): [<ffffffff89a0140a>] asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
> hardirqs last disabled at (192794): [<ffffffff89975d4f>] sysvec_apic_timer_interrupt+0xf/0xc0 arch/x86/kernel/apic/apic.c:1106
> softirqs last  enabled at (187764): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
> softirqs last  enabled at (187764): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
> softirqs last disabled at (187671): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
> softirqs last disabled at (187671): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
> CPU: 5 PID: 12 Comm: kworker/u16:1 Not tainted 6.3.0-next-20230426 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Workqueue: events_unbound toggle_allocation_gate
> RIP: 0010:csd_lock_wait kernel/smp.c:294 [inline]
> RIP: 0010:smp_call_function_many_cond+0x5bd/0x1020 kernel/smp.c:828
> Code: 0b 00 85 ed 74 4d 48 b8 00 00 00 00 00 fc ff df 4d 89 f4 4c 89 f5 49 c1 ec 03 83 e5 07 49 01 c4 83 c5 03 e8 b5 07 0b 00 f3 90 <41> 0f b6 04 24 40 38 c5 7c 08 84 c0 0f 85 46 08 00 00 8b 43 08 31
> RSP: 0018:ffffc900000cf9e8 EFLAGS: 00000293
> RAX: 0000000000000000 RBX: ffff888119cc4d80 RCX: 0000000000000000
> RDX: ffff888100325940 RSI: ffffffff8176807b RDI: 0000000000000005
> RBP: 0000000000000003 R08: 0000000000000005 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: ffffed10233989b1
> R13: 0000000000000001 R14: ffff888119cc4d88 R15: 0000000000000001
> FS:  0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000555556a6cc88 CR3: 000000000bb73000 CR4: 0000000000350ee0
> Call Trace:
>  <TASK>
>  on_each_cpu_cond_mask+0x40/0x90 kernel/smp.c:996
>  on_each_cpu include/linux/smp.h:71 [inline]
>  text_poke_sync arch/x86/kernel/alternative.c:1770 [inline]
>  text_poke_bp_batch+0x237/0x770 arch/x86/kernel/alternative.c:1970
>  text_poke_flush arch/x86/kernel/alternative.c:2161 [inline]
>  text_poke_flush arch/x86/kernel/alternative.c:2158 [inline]
>  text_poke_finish+0x1a/0x30 arch/x86/kernel/alternative.c:2168
>  arch_jump_label_transform_apply+0x17/0x30 arch/x86/kernel/jump_label.c:146
>  jump_label_update+0x321/0x400 kernel/jump_label.c:829
>  static_key_enable_cpuslocked+0x1b5/0x270 kernel/jump_label.c:205
>  static_key_enable+0x1a/0x20 kernel/jump_label.c:218
>  toggle_allocation_gate mm/kfence/core.c:831 [inline]
>  toggle_allocation_gate+0xf4/0x220 mm/kfence/core.c:823
>  process_one_work+0x993/0x15e0 kernel/workqueue.c:2405
>  worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
>  kthread+0x33e/0x440 kernel/kthread.c:379
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
>  </TASK>
> Sending NMI from CPU 5 to CPUs 0-4,6-7:
> NMI backtrace for cpu 1
> CPU: 1 PID: 20602 Comm: syz-executor.3 Not tainted 6.3.0-next-20230426 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:hlock_class kernel/locking/lockdep.c:228 [inline]
> RIP: 0010:check_wait_context kernel/locking/lockdep.c:4747 [inline]
> RIP: 0010:__lock_acquire+0x489/0x5d00 kernel/locking/lockdep.c:5024
> Code: 41 81 e5 ff 1f 45 0f b7 ed be 08 00 00 00 4c 89 e8 48 c1 e8 06 48 8d 3c c5 00 6b 2c 90 e8 5f 90 6e 00 4c 0f a3 2d d7 35 c9 0e <0f> 83 5c 0c 00 00 4f 8d 6c 6d 00 49 c1 e5 06 49 81 c5 20 6f 2c 90
> RSP: 0018:ffffc90002aa7350 EFLAGS: 00000047
> RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffffff81633521
> RDX: fffffbfff2058d62 RSI: 0000000000000008 RDI: ffffffff902c6b08
> RBP: ffff888042995940 R08: 0000000000000000 R09: ffffffff902c6b0f
> R10: fffffbfff2058d61 R11: 0000000000000001 R12: ffff888119e2b818
> R13: 0000000000000063 R14: 0000000000000002 R15: ffff888042996598
> FS:  00007fdaad065700(0000) GS:ffff888119c80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b30623000 CR3: 0000000101969000 CR4: 0000000000350ee0
> Call Trace:
>  <TASK>
>  lock_acquire kernel/locking/lockdep.c:5691 [inline]
>  lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656
>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>  _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
>  lock_hrtimer_base kernel/time/hrtimer.c:173 [inline]
>  hrtimer_try_to_cancel kernel/time/hrtimer.c:1331 [inline]
>  hrtimer_try_to_cancel+0xa9/0x2e0 kernel/time/hrtimer.c:1316
>  hrtimer_cancel+0x17/0x40 kernel/time/hrtimer.c:1443
>  __disable_vblank drivers/gpu/drm/drm_vblank.c:434 [inline]
>  drm_vblank_disable_and_save+0x282/0x3d0 drivers/gpu/drm/drm_vblank.c:478
>  drm_crtc_vblank_off+0x312/0x970 drivers/gpu/drm/drm_vblank.c:1366

	cpu1			cpu4 (see below)
	====			====
	drm_crtc_vblank_off	__run_hrtimer
	spin_lock_irq(&dev->event_lock);
	...
				drm_handle_vblank
	hrtimer_cancel		spin_lock_irqsave(&dev->event_lock, irqflags);


A deadlock would have been reported instead if the lockdep_map in
struct timer_list had also been added to hrtimer, so it would be highly
appreciated if Tetsuo or Thomas could add it before 6.8 or 6.10.

Hillf
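
For illustration, the effect of such an annotation can be modelled in user
space. This is a toy sketch and not the kernel's lockdep: EVENT_LOCK,
TIMER_MAP, held_before and record_dependency() are hypothetical names. It
assumes that the timer callback runs with a per-timer pseudo-lock held and
that a synchronous cancel briefly acquires the same pseudo-lock, which is
how the timer_list lockdep_map is understood to work. Under that assumption
the two paths in the cpu1/cpu4 table record opposite orderings, and the
second one closes a cycle:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes: dev->event_lock and a per-timer pseudo-lock. */
enum { EVENT_LOCK, TIMER_MAP, NR_CLASSES };

/* held_before[a][b] is true once b has been acquired while a was held. */
static bool held_before[NR_CLASSES][NR_CLASSES];

/* Depth-first search: can `to` be reached from `from` along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (held_before[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "inner acquired while outer is held".
 * Returns true if the opposite ordering was already recorded, directly or
 * transitively, i.e. the new edge closes a cycle (a potential deadlock).
 */
bool record_dependency(int outer, int inner)
{
	bool seen[NR_CLASSES] = { false };
	bool inversion;

	assert(outer >= 0 && outer < NR_CLASSES);
	assert(inner >= 0 && inner < NR_CLASSES);

	inversion = reachable(inner, outer, seen);
	held_before[outer][inner] = true;
	return inversion;
}
```

Calling record_dependency(TIMER_MAP, EVENT_LOCK) for the cpu4 path (the
callback takes the event lock) returns false. A later
record_dependency(EVENT_LOCK, TIMER_MAP) for the cpu1 path (cancel under the
event lock) returns true, without either CPU actually having to get stuck.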

>  disable_outputs+0x7c7/0xbb0 drivers/gpu/drm/drm_atomic_helper.c:1202
>  drm_atomic_helper_commit_modeset_disables+0x1d/0x40 drivers/gpu/drm/drm_atomic_helper.c:1397
>  vkms_atomic_commit_tail+0x51/0x240 drivers/gpu/drm/vkms/vkms_drv.c:71
>  commit_tail+0x288/0x420 drivers/gpu/drm/drm_atomic_helper.c:1812
>  drm_atomic_helper_commit drivers/gpu/drm/drm_atomic_helper.c:2052 [inline]
>  drm_atomic_helper_commit+0x306/0x390 drivers/gpu/drm/drm_atomic_helper.c:1985
>  drm_atomic_commit+0x20a/0x2d0 drivers/gpu/drm/drm_atomic.c:1503
>  drm_client_modeset_commit_atomic+0x698/0x7e0 drivers/gpu/drm/drm_client_modeset.c:1045
>  drm_client_modeset_dpms+0x174/0x200 drivers/gpu/drm/drm_client_modeset.c:1226
>  drm_fb_helper_dpms drivers/gpu/drm/drm_fb_helper.c:323 [inline]
>  drm_fb_helper_blank+0xd1/0x260 drivers/gpu/drm/drm_fb_helper.c:356
>  fb_blank+0x105/0x190 drivers/video/fbdev/core/fbmem.c:1088
>  do_fb_ioctl+0x390/0x760 drivers/video/fbdev/core/fbmem.c:1180
>  fb_ioctl+0xeb/0x150 drivers/video/fbdev/core/fbmem.c:1204
>  vfs_ioctl fs/ioctl.c:51 [inline]
>  __do_sys_ioctl fs/ioctl.c:870 [inline]
>  __se_sys_ioctl fs/ioctl.c:856 [inline]
>  __x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fdaabe8edcd
> Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fdaad064bf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 00007fdaabfbbf80 RCX: 00007fdaabe8edcd
> RDX: 0000000000000004 RSI: 0000000000004611 RDI: 0000000000000003
> RBP: 00007fdaabefc59c R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007ffdadeffe9f R14: 00007ffdadf00040 R15: 00007fdaad064d80
>  </TASK>
> NMI backtrace for cpu 0 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
> NMI backtrace for cpu 0 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
> NMI backtrace for cpu 0 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
> NMI backtrace for cpu 3 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
> NMI backtrace for cpu 3 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
> NMI backtrace for cpu 3 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
> NMI backtrace for cpu 6 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
> NMI backtrace for cpu 6 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
> NMI backtrace for cpu 6 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
> NMI backtrace for cpu 7 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
> NMI backtrace for cpu 7 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
> NMI backtrace for cpu 7 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
> NMI backtrace for cpu 4
> CPU: 4 PID: 20623 Comm: syz-executor.6 Not tainted 6.3.0-next-20230426 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:kvm_wait+0xb7/0x110 arch/x86/kernel/kvm.c:1064
> Code: 40 38 c6 74 1b 48 83 c4 10 c3 c3 e8 93 d3 50 00 eb 07 0f 00 2d 4a 04 92 08 fb f4 48 83 c4 10 c3 eb 07 0f 00 2d 3a 04 92 08 f4 <48> 83 c4 10 c3 89 74 24 0c 48 89 3c 24 e8 d7 d4 50 00 8b 74 24 0c
> RSP: 0018:ffffc90000300b50 EFLAGS: 00000046
> RAX: 0000000000000003 RBX: 0000000000000000 RCX: dffffc0000000000
> RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88810b0803d8
> RBP: ffff88810b0803d8 R08: 0000000000000001 R09: ffff88810b0803d8
> R10: ffffed102161007b R11: ffffc90000300ff8 R12: 0000000000000000
> R13: ffffed102161007b R14: 0000000000000001 R15: ffff888119e3d3c0
> FS:  0000000000000000(0000) GS:ffff888119e00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f28183bd0b0 CR3: 000000000bb73000 CR4: 0000000000350ee0
> Call Trace:
>  <IRQ>
>  pv_wait arch/x86/include/asm/paravirt.h:598 [inline]
>  pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
>  __pv_queued_spin_lock_slowpath+0x8e4/0xb80 kernel/locking/qspinlock.c:511
>  pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:586 [inline]
>  queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
>  queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
>  do_raw_spin_lock+0x20d/0x2b0 kernel/locking/spinlock_debug.c:115
>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:111 [inline]
>  _raw_spin_lock_irqsave+0x45/0x60 kernel/locking/spinlock.c:162
>  drm_handle_vblank+0x11e/0xb80 drivers/gpu/drm/drm_vblank.c:1986
>  vkms_vblank_simulate+0xe8/0x3e0 drivers/gpu/drm/vkms/vkms_crtc.c:29
>  __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
>  __hrtimer_run_queues+0x599/0xa30 kernel/time/hrtimer.c:1749
>  hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1095 [inline]
>  __sysvec_apic_timer_interrupt+0x14a/0x430 arch/x86/kernel/apic/apic.c:1112
>  sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1106
>  </IRQ>
>  <TASK>
>  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
> RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
> RIP: 0010:__sanitizer_cov_trace_pc+0x11/0x70 kernel/kcov.c:207
> Code: a8 01 00 00 e8 b0 ff ff ff 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 65 8b 05 0d 33 82 7e 89 c1 48 8b 34 24 <81> e1 00 01 00 00 65 48 8b 14 25 40 bb 03 00 a9 00 01 ff 00 74 0e
> RSP: 0018:ffffc90002be76d8 EFLAGS: 00000286
> RAX: 0000000080000001 RBX: 0000000000000001 RCX: 0000000080000001
> RDX: 00007f2817c77000 RSI: ffffffff81bcd756 RDI: ffffc90002be7ad8
> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: ffffea00014fc480
> R13: 0000000000000000 R14: dffffc0000000000 R15: 8000000053f12007
>  zap_drop_file_uffd_wp mm/memory.c:1352 [inline]
>  zap_install_uffd_wp_if_needed mm/memory.c:1371 [inline]
>  zap_pte_range mm/memory.c:1417 [inline]
>  zap_pmd_range mm/memory.c:1564 [inline]
>  zap_pud_range mm/memory.c:1593 [inline]
>  zap_p4d_range mm/memory.c:1614 [inline]
>  unmap_page_range+0x1046/0x4470 mm/memory.c:1635
>  unmap_single_vma+0x19a/0x2b0 mm/memory.c:1681
>  unmap_vmas+0x234/0x380 mm/memory.c:1720
>  exit_mmap+0x190/0x930 mm/mmap.c:3111
>  __mmput+0x128/0x4c0 kernel/fork.c:1351
>  mmput+0x60/0x70 kernel/fork.c:1373
>  exit_mm kernel/exit.c:564 [inline]
>  do_exit+0x9d1/0x29f0 kernel/exit.c:858
>  do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
>  get_signal+0x2311/0x25c0 kernel/signal.c:2874
>  arch_do_signal_or_restart+0x79/0x5a0 arch/x86/kernel/signal.c:307
>  exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>  exit_to_user_mode_prepare+0x11f/0x240 kernel/entry/common.c:204
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
>  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297
>  do_syscall_64+0x46/0x80 arch/x86/entry/common.c:86
>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7f281828edcd
> Code: Unable to access opcode bytes at 0x7f281828eda3.
> RSP: 002b:00007f28194c0c98 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> RAX: fffffffffffffe00 RBX: 00007f28183bbf80 RCX: 00007f281828edcd
> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f28183bbf88
> RBP: 00007f28183bbf88 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007f28183bbf8c
> R13: 00007ffd5038e1ef R14: 00007ffd5038e390 R15: 00007f28194c0d80
>  </TASK>
> NMI backtrace for cpu 2 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
> NMI backtrace for cpu 2 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
> NMI backtrace for cpu 2 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729



* drm/vkms: deadlock between dev->event_lock and timer
  2023-09-13 11:07 ` Hillf Danton
@ 2023-09-13 14:21     ` Tetsuo Handa
  2023-09-13 14:30   ` BUG: soft lockup in smp_call_function Tetsuo Handa
  1 sibling, 0 replies; 19+ messages in thread
From: Tetsuo Handa @ 2023-09-13 14:21 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maira Canal, Haneen Mohammed,
	Daniel Vetter, David Airlie, DRI
  Cc: Hillf Danton, syzkaller, LKML, Sanan Hasanov

Hello. A deadlock was reported in drivers/gpu/drm/vkms/.
This locking pattern still appears to be present as of 6.6-rc1. Please fix.

void drm_crtc_vblank_off(struct drm_crtc *crtc) {
  spin_lock_irq(&dev->event_lock);
  drm_vblank_disable_and_save(dev, pipe) {
    __disable_vblank(dev, pipe) {
      crtc->funcs->disable_vblank(crtc) == vkms_disable_vblank {
        hrtimer_cancel(&out->vblank_hrtimer) { // Loops with dev->event_lock held for as long as hrtimer_try_to_cancel() returns -1.
          ret = hrtimer_try_to_cancel(timer) {
            base = lock_hrtimer_base(timer, &flags); // The cancel attempt returns -1 forever because vkms_vblank_simulate() is still running.
          }
        }
      }
    }
  }
  spin_unlock_irq(&dev->event_lock);
}

static void __run_hrtimer(...) {
  restart = fn(timer) == vkms_vblank_simulate {
    drm_crtc_handle_vblank(crtc) {
      drm_handle_vblank(struct drm_device *dev, unsigned int pipe) {
        spin_lock_irqsave(&dev->event_lock, irqflags); // Tries to take dev->event_lock inside the timer interrupt handler while drm_crtc_vblank_off() holds it. => The deadlock was reported as a soft lockup.
        spin_unlock_irqrestore(&dev->event_lock, irqflags);
      }
    }
  }
}
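
The two call chains above can be reproduced in a bounded user-space model.
This is only a sketch under stated assumptions: a pthread mutex stands in
for dev->event_lock, a flag stands in for "the timer callback is still
running", and all names (callback_running, give_up, cancel_gets_stuck(),
and so on) are hypothetical. Unlike the kernel, the model gives up after a
fixed number of polls so that it terminates.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-ins; none of these are kernel symbols. */
static pthread_mutex_t event_lock = PTHREAD_MUTEX_INITIALIZER; /* dev->event_lock */
static atomic_bool callback_running;  /* "the timer callback has not returned yet" */
static atomic_bool callback_got_lock; /* the callback managed to take event_lock */
static atomic_bool give_up;           /* bounds the demo so that it terminates */

static void sleep_1ms(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

	nanosleep(&ts, NULL);
}

/* Models vkms_vblank_simulate(): it needs event_lock before it can return. */
static void *timer_callback(void *unused)
{
	(void)unused;
	while (!atomic_load(&give_up)) {
		if (pthread_mutex_trylock(&event_lock) == 0) {
			atomic_store(&callback_got_lock, true);
			pthread_mutex_unlock(&event_lock);
			break;
		}
		sleep_1ms();
	}
	atomic_store(&callback_running, false);
	return NULL;
}

/*
 * Models drm_crtc_vblank_off() waiting in hrtimer_cancel(): poll until the
 * callback has returned, optionally while holding event_lock.
 * Returns true if the callback was still running after max_polls polls.
 */
bool cancel_gets_stuck(bool hold_event_lock, int max_polls)
{
	pthread_t cb;
	bool stuck = true;

	atomic_store(&callback_running, true);
	atomic_store(&callback_got_lock, false);
	atomic_store(&give_up, false);

	if (hold_event_lock)
		pthread_mutex_lock(&event_lock);

	if (pthread_create(&cb, NULL, timer_callback, NULL) != 0) {
		if (hold_event_lock)
			pthread_mutex_unlock(&event_lock);
		return false; /* could not start the model callback */
	}

	for (int i = 0; i < max_polls; i++) {
		if (!atomic_load(&callback_running)) {
			stuck = false;
			break;
		}
		sleep_1ms();
	}

	/* Stop the callback before unlocking so a stuck run cannot "recover". */
	atomic_store(&give_up, true);
	pthread_join(cb, NULL);

	/* While this thread held event_lock, the callback can never have taken it. */
	assert(!hold_event_lock || !atomic_load(&callback_got_lock));

	if (hold_event_lock)
		pthread_mutex_unlock(&event_lock);
	return stuck;
}
```

cancel_gets_stuck(true, 50) returns true, because the lock holder waits out
all 50 polls while the callback can never take the lock. By contrast,
cancel_gets_stuck(false, 2000) is expected to return false, since the
callback can take the lock and finish. Build with -pthread.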

On 2023/09/13 20:07, Hillf Danton wrote:
> On Tue, 12 Sep 2023 23:02:56 +0000 Sanan Hasanov <Sanan.Hasanov@ucf.edu>
>> Good day, dear maintainers,
>>
>> We found a bug using a modified kernel configuration file used by syzbot.
>>
> Thanks for your report.
> 
>> We enhanced the coverage of the configuration file using our tool, klocalizer.
>>
>> Kernel Branch: 6.3.0-next-20230426
>> Kernel Config: https://drive.google.com/file/d/1WSUEWrith9-539qo6xRqmwy4LfDtmKpp/view?usp=sharing
>> Reproducer: https://drive.google.com/file/d/1pN6FfcjuUs6Wx94g1gufuYGjRbMMgiZ4/view?usp=sharing
>> Thank you!
>>
>> Best regards,
>> Sanan Hasanov
>>
>> watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [kworker/u16:1:12]
>> Modules linked in:
>> irq event stamp: 192794
>> hardirqs last  enabled at (192793): [<ffffffff89a0140a>] asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
>> hardirqs last disabled at (192794): [<ffffffff89975d4f>] sysvec_apic_timer_interrupt+0xf/0xc0 arch/x86/kernel/apic/apic.c:1106
>> softirqs last  enabled at (187764): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
>> softirqs last  enabled at (187764): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
>> softirqs last disabled at (187671): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
>> softirqs last disabled at (187671): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
>> CPU: 5 PID: 12 Comm: kworker/u16:1 Not tainted 6.3.0-next-20230426 #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>> Workqueue: events_unbound toggle_allocation_gate
>> RIP: 0010:csd_lock_wait kernel/smp.c:294 [inline]
>> RIP: 0010:smp_call_function_many_cond+0x5bd/0x1020 kernel/smp.c:828
>> Code: 0b 00 85 ed 74 4d 48 b8 00 00 00 00 00 fc ff df 4d 89 f4 4c 89 f5 49 c1 ec 03 83 e5 07 49 01 c4 83 c5 03 e8 b5 07 0b 00 f3 90 <41> 0f b6 04 24 40 38 c5 7c 08 84 c0 0f 85 46 08 00 00 8b 43 08 31
>> RSP: 0018:ffffc900000cf9e8 EFLAGS: 00000293
>> RAX: 0000000000000000 RBX: ffff888119cc4d80 RCX: 0000000000000000
>> RDX: ffff888100325940 RSI: ffffffff8176807b RDI: 0000000000000005
>> RBP: 0000000000000003 R08: 0000000000000005 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000001 R12: ffffed10233989b1
>> R13: 0000000000000001 R14: ffff888119cc4d88 R15: 0000000000000001
>> FS:  0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000555556a6cc88 CR3: 000000000bb73000 CR4: 0000000000350ee0
>> Call Trace:
>>  <TASK>
>>  on_each_cpu_cond_mask+0x40/0x90 kernel/smp.c:996
>>  on_each_cpu include/linux/smp.h:71 [inline]
>>  text_poke_sync arch/x86/kernel/alternative.c:1770 [inline]
>>  text_poke_bp_batch+0x237/0x770 arch/x86/kernel/alternative.c:1970
>>  text_poke_flush arch/x86/kernel/alternative.c:2161 [inline]
>>  text_poke_flush arch/x86/kernel/alternative.c:2158 [inline]
>>  text_poke_finish+0x1a/0x30 arch/x86/kernel/alternative.c:2168
>>  arch_jump_label_transform_apply+0x17/0x30 arch/x86/kernel/jump_label.c:146
>>  jump_label_update+0x321/0x400 kernel/jump_label.c:829
>>  static_key_enable_cpuslocked+0x1b5/0x270 kernel/jump_label.c:205
>>  static_key_enable+0x1a/0x20 kernel/jump_label.c:218
>>  toggle_allocation_gate mm/kfence/core.c:831 [inline]
>>  toggle_allocation_gate+0xf4/0x220 mm/kfence/core.c:823
>>  process_one_work+0x993/0x15e0 kernel/workqueue.c:2405
>>  worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
>>  kthread+0x33e/0x440 kernel/kthread.c:379
>>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
>>  </TASK>
>> Sending NMI from CPU 5 to CPUs 0-4,6-7:
>> NMI backtrace for cpu 1
>> CPU: 1 PID: 20602 Comm: syz-executor.3 Not tainted 6.3.0-next-20230426 #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>> RIP: 0010:hlock_class kernel/locking/lockdep.c:228 [inline]
>> RIP: 0010:check_wait_context kernel/locking/lockdep.c:4747 [inline]
>> RIP: 0010:__lock_acquire+0x489/0x5d00 kernel/locking/lockdep.c:5024
>> Code: 41 81 e5 ff 1f 45 0f b7 ed be 08 00 00 00 4c 89 e8 48 c1 e8 06 48 8d 3c c5 00 6b 2c 90 e8 5f 90 6e 00 4c 0f a3 2d d7 35 c9 0e <0f> 83 5c 0c 00 00 4f 8d 6c 6d 00 49 c1 e5 06 49 81 c5 20 6f 2c 90
>> RSP: 0018:ffffc90002aa7350 EFLAGS: 00000047
>> RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffffff81633521
>> RDX: fffffbfff2058d62 RSI: 0000000000000008 RDI: ffffffff902c6b08
>> RBP: ffff888042995940 R08: 0000000000000000 R09: ffffffff902c6b0f
>> R10: fffffbfff2058d61 R11: 0000000000000001 R12: ffff888119e2b818
>> R13: 0000000000000063 R14: 0000000000000002 R15: ffff888042996598
>> FS:  00007fdaad065700(0000) GS:ffff888119c80000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000001b30623000 CR3: 0000000101969000 CR4: 0000000000350ee0
>> Call Trace:
>>  <TASK>
>>  lock_acquire kernel/locking/lockdep.c:5691 [inline]
>>  lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656
>>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>>  _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
>>  lock_hrtimer_base kernel/time/hrtimer.c:173 [inline]
>>  hrtimer_try_to_cancel kernel/time/hrtimer.c:1331 [inline]
>>  hrtimer_try_to_cancel+0xa9/0x2e0 kernel/time/hrtimer.c:1316
>>  hrtimer_cancel+0x17/0x40 kernel/time/hrtimer.c:1443
>>  __disable_vblank drivers/gpu/drm/drm_vblank.c:434 [inline]
>>  drm_vblank_disable_and_save+0x282/0x3d0 drivers/gpu/drm/drm_vblank.c:478
>>  drm_crtc_vblank_off+0x312/0x970 drivers/gpu/drm/drm_vblank.c:1366
> 
> 	cpu1			cpu4 (see below)
> 	====			====
> 	drm_crtc_vblank_off	__run_hrtimer
> 	spin_lock_irq(&dev->event_lock);
> 	...
> 				drm_handle_vblank
> 	hrtimer_cancel		spin_lock_irqsave(&dev->event_lock, irqflags);
> 
> 
> A deadlock would have been reported instead if the lockdep_map in
> struct timer_list had also been added to hrtimer, so it would be highly
> appreciated if Tetsuo or Thomas could add it before 6.8 or 6.10.
> 
> Hillf
> 
>>  disable_outputs+0x7c7/0xbb0 drivers/gpu/drm/drm_atomic_helper.c:1202
>>  drm_atomic_helper_commit_modeset_disables+0x1d/0x40 drivers/gpu/drm/drm_atomic_helper.c:1397
>>  vkms_atomic_commit_tail+0x51/0x240 drivers/gpu/drm/vkms/vkms_drv.c:71
>>  commit_tail+0x288/0x420 drivers/gpu/drm/drm_atomic_helper.c:1812
>>  drm_atomic_helper_commit drivers/gpu/drm/drm_atomic_helper.c:2052 [inline]
>>  drm_atomic_helper_commit+0x306/0x390 drivers/gpu/drm/drm_atomic_helper.c:1985
>>  drm_atomic_commit+0x20a/0x2d0 drivers/gpu/drm/drm_atomic.c:1503
>>  drm_client_modeset_commit_atomic+0x698/0x7e0 drivers/gpu/drm/drm_client_modeset.c:1045
>>  drm_client_modeset_dpms+0x174/0x200 drivers/gpu/drm/drm_client_modeset.c:1226
>>  drm_fb_helper_dpms drivers/gpu/drm/drm_fb_helper.c:323 [inline]
>>  drm_fb_helper_blank+0xd1/0x260 drivers/gpu/drm/drm_fb_helper.c:356
>>  fb_blank+0x105/0x190 drivers/video/fbdev/core/fbmem.c:1088
>>  do_fb_ioctl+0x390/0x760 drivers/video/fbdev/core/fbmem.c:1180
>>  fb_ioctl+0xeb/0x150 drivers/video/fbdev/core/fbmem.c:1204
>>  vfs_ioctl fs/ioctl.c:51 [inline]
>>  __do_sys_ioctl fs/ioctl.c:870 [inline]
>>  __se_sys_ioctl fs/ioctl.c:856 [inline]
>>  __x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856
>>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:80
>>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> RIP: 0033:0x7fdaabe8edcd
>> Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007fdaad064bf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> RAX: ffffffffffffffda RBX: 00007fdaabfbbf80 RCX: 00007fdaabe8edcd
>> RDX: 0000000000000004 RSI: 0000000000004611 RDI: 0000000000000003
>> RBP: 00007fdaabefc59c R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>> R13: 00007ffdadeffe9f R14: 00007ffdadf00040 R15: 00007fdaad064d80
>>  </TASK>
>> NMI backtrace for cpu 0 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 0 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 0 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
>> NMI backtrace for cpu 3 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 3 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 3 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
>> NMI backtrace for cpu 6 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 6 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 6 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
>> NMI backtrace for cpu 7 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 7 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 7 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
>> NMI backtrace for cpu 4
>> CPU: 4 PID: 20623 Comm: syz-executor.6 Not tainted 6.3.0-next-20230426 #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>> RIP: 0010:kvm_wait+0xb7/0x110 arch/x86/kernel/kvm.c:1064
>> Code: 40 38 c6 74 1b 48 83 c4 10 c3 c3 e8 93 d3 50 00 eb 07 0f 00 2d 4a 04 92 08 fb f4 48 83 c4 10 c3 eb 07 0f 00 2d 3a 04 92 08 f4 <48> 83 c4 10 c3 89 74 24 0c 48 89 3c 24 e8 d7 d4 50 00 8b 74 24 0c
>> RSP: 0018:ffffc90000300b50 EFLAGS: 00000046
>> RAX: 0000000000000003 RBX: 0000000000000000 RCX: dffffc0000000000
>> RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88810b0803d8
>> RBP: ffff88810b0803d8 R08: 0000000000000001 R09: ffff88810b0803d8
>> R10: ffffed102161007b R11: ffffc90000300ff8 R12: 0000000000000000
>> R13: ffffed102161007b R14: 0000000000000001 R15: ffff888119e3d3c0
>> FS:  0000000000000000(0000) GS:ffff888119e00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f28183bd0b0 CR3: 000000000bb73000 CR4: 0000000000350ee0
>> Call Trace:
>>  <IRQ>
>>  pv_wait arch/x86/include/asm/paravirt.h:598 [inline]
>>  pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
>>  __pv_queued_spin_lock_slowpath+0x8e4/0xb80 kernel/locking/qspinlock.c:511
>>  pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:586 [inline]
>>  queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
>>  queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
>>  do_raw_spin_lock+0x20d/0x2b0 kernel/locking/spinlock_debug.c:115
>>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:111 [inline]
>>  _raw_spin_lock_irqsave+0x45/0x60 kernel/locking/spinlock.c:162
>>  drm_handle_vblank+0x11e/0xb80 drivers/gpu/drm/drm_vblank.c:1986
>>  vkms_vblank_simulate+0xe8/0x3e0 drivers/gpu/drm/vkms/vkms_crtc.c:29
>>  __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
>>  __hrtimer_run_queues+0x599/0xa30 kernel/time/hrtimer.c:1749
>>  hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
>>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1095 [inline]
>>  __sysvec_apic_timer_interrupt+0x14a/0x430 arch/x86/kernel/apic/apic.c:1112
>>  sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1106
>>  </IRQ>
>>  <TASK>
>>  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
>> RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
>> RIP: 0010:__sanitizer_cov_trace_pc+0x11/0x70 kernel/kcov.c:207
>> Code: a8 01 00 00 e8 b0 ff ff ff 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 65 8b 05 0d 33 82 7e 89 c1 48 8b 34 24 <81> e1 00 01 00 00 65 48 8b 14 25 40 bb 03 00 a9 00 01 ff 00 74 0e
>> RSP: 0018:ffffc90002be76d8 EFLAGS: 00000286
>> RAX: 0000000080000001 RBX: 0000000000000001 RCX: 0000000080000001
>> RDX: 00007f2817c77000 RSI: ffffffff81bcd756 RDI: ffffc90002be7ad8
>> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000001 R12: ffffea00014fc480
>> R13: 0000000000000000 R14: dffffc0000000000 R15: 8000000053f12007
>>  zap_drop_file_uffd_wp mm/memory.c:1352 [inline]
>>  zap_install_uffd_wp_if_needed mm/memory.c:1371 [inline]
>>  zap_pte_range mm/memory.c:1417 [inline]
>>  zap_pmd_range mm/memory.c:1564 [inline]
>>  zap_pud_range mm/memory.c:1593 [inline]
>>  zap_p4d_range mm/memory.c:1614 [inline]
>>  unmap_page_range+0x1046/0x4470 mm/memory.c:1635
>>  unmap_single_vma+0x19a/0x2b0 mm/memory.c:1681
>>  unmap_vmas+0x234/0x380 mm/memory.c:1720
>>  exit_mmap+0x190/0x930 mm/mmap.c:3111
>>  __mmput+0x128/0x4c0 kernel/fork.c:1351
>>  mmput+0x60/0x70 kernel/fork.c:1373
>>  exit_mm kernel/exit.c:564 [inline]
>>  do_exit+0x9d1/0x29f0 kernel/exit.c:858
>>  do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
>>  get_signal+0x2311/0x25c0 kernel/signal.c:2874
>>  arch_do_signal_or_restart+0x79/0x5a0 arch/x86/kernel/signal.c:307
>>  exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>>  exit_to_user_mode_prepare+0x11f/0x240 kernel/entry/common.c:204
>>  __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
>>  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297
>>  do_syscall_64+0x46/0x80 arch/x86/entry/common.c:86
>>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> RIP: 0033:0x7f281828edcd
>> Code: Unable to access opcode bytes at 0x7f281828eda3.
>> RSP: 002b:00007f28194c0c98 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>> RAX: fffffffffffffe00 RBX: 00007f28183bbf80 RCX: 00007f281828edcd
>> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f28183bbf88
>> RBP: 00007f28183bbf88 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 00007f28183bbf8c
>> R13: 00007ffd5038e1ef R14: 00007ffd5038e390 R15: 00007f28194c0d80
>>  </TASK>
>> NMI backtrace for cpu 2 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 2 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 2 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729



* drm/vkms: deadlock between dev->event_lock and timer
@ 2023-09-13 14:21     ` Tetsuo Handa
  0 siblings, 0 replies; 19+ messages in thread
From: Tetsuo Handa @ 2023-09-13 14:21 UTC (permalink / raw)
  To: Rodrigo Siqueira, Melissa Wen, Maira Canal, Haneen Mohammed,
	Daniel Vetter, David Airlie, DRI
  Cc: syzkaller, LKML, Hillf Danton, Sanan Hasanov

Hello. A deadlock was reported in drivers/gpu/drm/vkms/.
This locking pattern still appears to be present as of 6.6-rc1. Please fix.

void drm_crtc_vblank_off(struct drm_crtc *crtc) {
  spin_lock_irq(&dev->event_lock);
  drm_vblank_disable_and_save(dev, pipe) {
    __disable_vblank(dev, pipe) {
      crtc->funcs->disable_vblank(crtc) == vkms_disable_vblank {
        hrtimer_cancel(&out->vblank_hrtimer) { // Loops with dev->event_lock held for as long as hrtimer_try_to_cancel() returns -1.
          ret = hrtimer_try_to_cancel(timer) {
            base = lock_hrtimer_base(timer, &flags); // The cancel attempt returns -1 forever because vkms_vblank_simulate() is still running.
          }
        }
      }
    }
  }
  spin_unlock_irq(&dev->event_lock);
}

static void __run_hrtimer(...) {
  restart = fn(timer) == vkms_vblank_simulate {
    drm_crtc_handle_vblank(crtc) {
      drm_handle_vblank(struct drm_device *dev, unsigned int pipe) {
        spin_lock_irqsave(&dev->event_lock, irqflags); // Tries to take dev->event_lock inside the timer interrupt handler while drm_crtc_vblank_off() holds it. => The deadlock was reported as a soft lockup.
        spin_unlock_irqrestore(&dev->event_lock, irqflags);
      }
    }
  }
}

On 2023/09/13 20:07, Hillf Danton wrote:
> On Tue, 12 Sep 2023 23:02:56 +0000 Sanan Hasanov <Sanan.Hasanov@ucf.edu>
>> Good day, dear maintainers,
>>
>> We found a bug using a modified kernel configuration file used by syzbot.
>>
> Thanks for your report.
> 
>> We enhanced the coverage of the configuration file using our tool, klocalizer.
>>
>> Kernel Branch: 6.3.0-next-20230426
>> Kernel Config: https://drive.google.com/file/d/1WSUEWrith9-539qo6xRqmwy4LfDtmKpp/view?usp=sharing
>> Reproducer: https://drive.google.com/file/d/1pN6FfcjuUs6Wx94g1gufuYGjRbMMgiZ4/view?usp=sharing
>> Thank you!
>>
>> Best regards,
>> Sanan Hasanov
>>
>> watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [kworker/u16:1:12]
>> Modules linked in:
>> irq event stamp: 192794
>> hardirqs last  enabled at (192793): [<ffffffff89a0140a>] asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
>> hardirqs last disabled at (192794): [<ffffffff89975d4f>] sysvec_apic_timer_interrupt+0xf/0xc0 arch/x86/kernel/apic/apic.c:1106
>> softirqs last  enabled at (187764): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
>> softirqs last  enabled at (187764): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
>> softirqs last disabled at (187671): [<ffffffff814b94bd>] invoke_softirq kernel/softirq.c:445 [inline]
>> softirqs last disabled at (187671): [<ffffffff814b94bd>] __irq_exit_rcu+0x11d/0x190 kernel/softirq.c:650
>> CPU: 5 PID: 12 Comm: kworker/u16:1 Not tainted 6.3.0-next-20230426 #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>> Workqueue: events_unbound toggle_allocation_gate
>> RIP: 0010:csd_lock_wait kernel/smp.c:294 [inline]
>> RIP: 0010:smp_call_function_many_cond+0x5bd/0x1020 kernel/smp.c:828
>> Code: 0b 00 85 ed 74 4d 48 b8 00 00 00 00 00 fc ff df 4d 89 f4 4c 89 f5 49 c1 ec 03 83 e5 07 49 01 c4 83 c5 03 e8 b5 07 0b 00 f3 90 <41> 0f b6 04 24 40 38 c5 7c 08 84 c0 0f 85 46 08 00 00 8b 43 08 31
>> RSP: 0018:ffffc900000cf9e8 EFLAGS: 00000293
>> RAX: 0000000000000000 RBX: ffff888119cc4d80 RCX: 0000000000000000
>> RDX: ffff888100325940 RSI: ffffffff8176807b RDI: 0000000000000005
>> RBP: 0000000000000003 R08: 0000000000000005 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000001 R12: ffffed10233989b1
>> R13: 0000000000000001 R14: ffff888119cc4d88 R15: 0000000000000001
>> FS:  0000000000000000(0000) GS:ffff888119e80000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000555556a6cc88 CR3: 000000000bb73000 CR4: 0000000000350ee0
>> Call Trace:
>>  <TASK>
>>  on_each_cpu_cond_mask+0x40/0x90 kernel/smp.c:996
>>  on_each_cpu include/linux/smp.h:71 [inline]
>>  text_poke_sync arch/x86/kernel/alternative.c:1770 [inline]
>>  text_poke_bp_batch+0x237/0x770 arch/x86/kernel/alternative.c:1970
>>  text_poke_flush arch/x86/kernel/alternative.c:2161 [inline]
>>  text_poke_flush arch/x86/kernel/alternative.c:2158 [inline]
>>  text_poke_finish+0x1a/0x30 arch/x86/kernel/alternative.c:2168
>>  arch_jump_label_transform_apply+0x17/0x30 arch/x86/kernel/jump_label.c:146
>>  jump_label_update+0x321/0x400 kernel/jump_label.c:829
>>  static_key_enable_cpuslocked+0x1b5/0x270 kernel/jump_label.c:205
>>  static_key_enable+0x1a/0x20 kernel/jump_label.c:218
>>  toggle_allocation_gate mm/kfence/core.c:831 [inline]
>>  toggle_allocation_gate+0xf4/0x220 mm/kfence/core.c:823
>>  process_one_work+0x993/0x15e0 kernel/workqueue.c:2405
>>  worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
>>  kthread+0x33e/0x440 kernel/kthread.c:379
>>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
>>  </TASK>
>> Sending NMI from CPU 5 to CPUs 0-4,6-7:
>> NMI backtrace for cpu 1
>> CPU: 1 PID: 20602 Comm: syz-executor.3 Not tainted 6.3.0-next-20230426 #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>> RIP: 0010:hlock_class kernel/locking/lockdep.c:228 [inline]
>> RIP: 0010:check_wait_context kernel/locking/lockdep.c:4747 [inline]
>> RIP: 0010:__lock_acquire+0x489/0x5d00 kernel/locking/lockdep.c:5024
>> Code: 41 81 e5 ff 1f 45 0f b7 ed be 08 00 00 00 4c 89 e8 48 c1 e8 06 48 8d 3c c5 00 6b 2c 90 e8 5f 90 6e 00 4c 0f a3 2d d7 35 c9 0e <0f> 83 5c 0c 00 00 4f 8d 6c 6d 00 49 c1 e5 06 49 81 c5 20 6f 2c 90
>> RSP: 0018:ffffc90002aa7350 EFLAGS: 00000047
>> RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffffff81633521
>> RDX: fffffbfff2058d62 RSI: 0000000000000008 RDI: ffffffff902c6b08
>> RBP: ffff888042995940 R08: 0000000000000000 R09: ffffffff902c6b0f
>> R10: fffffbfff2058d61 R11: 0000000000000001 R12: ffff888119e2b818
>> R13: 0000000000000063 R14: 0000000000000002 R15: ffff888042996598
>> FS:  00007fdaad065700(0000) GS:ffff888119c80000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000001b30623000 CR3: 0000000101969000 CR4: 0000000000350ee0
>> Call Trace:
>>  <TASK>
>>  lock_acquire kernel/locking/lockdep.c:5691 [inline]
>>  lock_acquire+0x1b1/0x520 kernel/locking/lockdep.c:5656
>>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>>  _raw_spin_lock_irqsave+0x3d/0x60 kernel/locking/spinlock.c:162
>>  lock_hrtimer_base kernel/time/hrtimer.c:173 [inline]
>>  hrtimer_try_to_cancel kernel/time/hrtimer.c:1331 [inline]
>>  hrtimer_try_to_cancel+0xa9/0x2e0 kernel/time/hrtimer.c:1316
>>  hrtimer_cancel+0x17/0x40 kernel/time/hrtimer.c:1443
>>  __disable_vblank drivers/gpu/drm/drm_vblank.c:434 [inline]
>>  drm_vblank_disable_and_save+0x282/0x3d0 drivers/gpu/drm/drm_vblank.c:478
>>  drm_crtc_vblank_off+0x312/0x970 drivers/gpu/drm/drm_vblank.c:1366
> 
> 	cpu1			cpu4 (see below)
> 	====			====
> 	drm_crtc_vblank_off	__run_hrtimer
> 	spin_lock_irq(&dev->event_lock);
> 	...
> 				drm_handle_vblank
> 	hrtimer_cancel		spin_lock_irqsave(&dev->event_lock, irqflags);
> 
> 
> A deadlock would have been reported instead if the lockdep_map in
> struct timer_list had also been added to hrtimer, so it would be highly
> appreciated if Tetsuo or Thomas added it before 6.8 or 6.10.
> 
> Hillf
> 
>>  disable_outputs+0x7c7/0xbb0 drivers/gpu/drm/drm_atomic_helper.c:1202
>>  drm_atomic_helper_commit_modeset_disables+0x1d/0x40 drivers/gpu/drm/drm_atomic_helper.c:1397
>>  vkms_atomic_commit_tail+0x51/0x240 drivers/gpu/drm/vkms/vkms_drv.c:71
>>  commit_tail+0x288/0x420 drivers/gpu/drm/drm_atomic_helper.c:1812
>>  drm_atomic_helper_commit drivers/gpu/drm/drm_atomic_helper.c:2052 [inline]
>>  drm_atomic_helper_commit+0x306/0x390 drivers/gpu/drm/drm_atomic_helper.c:1985
>>  drm_atomic_commit+0x20a/0x2d0 drivers/gpu/drm/drm_atomic.c:1503
>>  drm_client_modeset_commit_atomic+0x698/0x7e0 drivers/gpu/drm/drm_client_modeset.c:1045
>>  drm_client_modeset_dpms+0x174/0x200 drivers/gpu/drm/drm_client_modeset.c:1226
>>  drm_fb_helper_dpms drivers/gpu/drm/drm_fb_helper.c:323 [inline]
>>  drm_fb_helper_blank+0xd1/0x260 drivers/gpu/drm/drm_fb_helper.c:356
>>  fb_blank+0x105/0x190 drivers/video/fbdev/core/fbmem.c:1088
>>  do_fb_ioctl+0x390/0x760 drivers/video/fbdev/core/fbmem.c:1180
>>  fb_ioctl+0xeb/0x150 drivers/video/fbdev/core/fbmem.c:1204
>>  vfs_ioctl fs/ioctl.c:51 [inline]
>>  __do_sys_ioctl fs/ioctl.c:870 [inline]
>>  __se_sys_ioctl fs/ioctl.c:856 [inline]
>>  __x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856
>>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:80
>>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> RIP: 0033:0x7fdaabe8edcd
>> Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007fdaad064bf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> RAX: ffffffffffffffda RBX: 00007fdaabfbbf80 RCX: 00007fdaabe8edcd
>> RDX: 0000000000000004 RSI: 0000000000004611 RDI: 0000000000000003
>> RBP: 00007fdaabefc59c R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>> R13: 00007ffdadeffe9f R14: 00007ffdadf00040 R15: 00007fdaad064d80
>>  </TASK>
>> NMI backtrace for cpu 0 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 0 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 0 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
>> NMI backtrace for cpu 3 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 3 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 3 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
>> NMI backtrace for cpu 6 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 6 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 6 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
>> NMI backtrace for cpu 7 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 7 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 7 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729
>> NMI backtrace for cpu 4
>> CPU: 4 PID: 20623 Comm: syz-executor.6 Not tainted 6.3.0-next-20230426 #1
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
>> RIP: 0010:kvm_wait+0xb7/0x110 arch/x86/kernel/kvm.c:1064
>> Code: 40 38 c6 74 1b 48 83 c4 10 c3 c3 e8 93 d3 50 00 eb 07 0f 00 2d 4a 04 92 08 fb f4 48 83 c4 10 c3 eb 07 0f 00 2d 3a 04 92 08 f4 <48> 83 c4 10 c3 89 74 24 0c 48 89 3c 24 e8 d7 d4 50 00 8b 74 24 0c
>> RSP: 0018:ffffc90000300b50 EFLAGS: 00000046
>> RAX: 0000000000000003 RBX: 0000000000000000 RCX: dffffc0000000000
>> RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88810b0803d8
>> RBP: ffff88810b0803d8 R08: 0000000000000001 R09: ffff88810b0803d8
>> R10: ffffed102161007b R11: ffffc90000300ff8 R12: 0000000000000000
>> R13: ffffed102161007b R14: 0000000000000001 R15: ffff888119e3d3c0
>> FS:  0000000000000000(0000) GS:ffff888119e00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f28183bd0b0 CR3: 000000000bb73000 CR4: 0000000000350ee0
>> Call Trace:
>>  <IRQ>
>>  pv_wait arch/x86/include/asm/paravirt.h:598 [inline]
>>  pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
>>  __pv_queued_spin_lock_slowpath+0x8e4/0xb80 kernel/locking/qspinlock.c:511
>>  pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:586 [inline]
>>  queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
>>  queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
>>  do_raw_spin_lock+0x20d/0x2b0 kernel/locking/spinlock_debug.c:115
>>  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:111 [inline]
>>  _raw_spin_lock_irqsave+0x45/0x60 kernel/locking/spinlock.c:162
>>  drm_handle_vblank+0x11e/0xb80 drivers/gpu/drm/drm_vblank.c:1986
>>  vkms_vblank_simulate+0xe8/0x3e0 drivers/gpu/drm/vkms/vkms_crtc.c:29
>>  __run_hrtimer kernel/time/hrtimer.c:1685 [inline]
>>  __hrtimer_run_queues+0x599/0xa30 kernel/time/hrtimer.c:1749
>>  hrtimer_interrupt+0x320/0x7b0 kernel/time/hrtimer.c:1811
>>  local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1095 [inline]
>>  __sysvec_apic_timer_interrupt+0x14a/0x430 arch/x86/kernel/apic/apic.c:1112
>>  sysvec_apic_timer_interrupt+0x92/0xc0 arch/x86/kernel/apic/apic.c:1106
>>  </IRQ>
>>  <TASK>
>>  asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
>> RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
>> RIP: 0010:__sanitizer_cov_trace_pc+0x11/0x70 kernel/kcov.c:207
>> Code: a8 01 00 00 e8 b0 ff ff ff 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 65 8b 05 0d 33 82 7e 89 c1 48 8b 34 24 <81> e1 00 01 00 00 65 48 8b 14 25 40 bb 03 00 a9 00 01 ff 00 74 0e
>> RSP: 0018:ffffc90002be76d8 EFLAGS: 00000286
>> RAX: 0000000080000001 RBX: 0000000000000001 RCX: 0000000080000001
>> RDX: 00007f2817c77000 RSI: ffffffff81bcd756 RDI: ffffc90002be7ad8
>> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000001 R12: ffffea00014fc480
>> R13: 0000000000000000 R14: dffffc0000000000 R15: 8000000053f12007
>>  zap_drop_file_uffd_wp mm/memory.c:1352 [inline]
>>  zap_install_uffd_wp_if_needed mm/memory.c:1371 [inline]
>>  zap_pte_range mm/memory.c:1417 [inline]
>>  zap_pmd_range mm/memory.c:1564 [inline]
>>  zap_pud_range mm/memory.c:1593 [inline]
>>  zap_p4d_range mm/memory.c:1614 [inline]
>>  unmap_page_range+0x1046/0x4470 mm/memory.c:1635
>>  unmap_single_vma+0x19a/0x2b0 mm/memory.c:1681
>>  unmap_vmas+0x234/0x380 mm/memory.c:1720
>>  exit_mmap+0x190/0x930 mm/mmap.c:3111
>>  __mmput+0x128/0x4c0 kernel/fork.c:1351
>>  mmput+0x60/0x70 kernel/fork.c:1373
>>  exit_mm kernel/exit.c:564 [inline]
>>  do_exit+0x9d1/0x29f0 kernel/exit.c:858
>>  do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
>>  get_signal+0x2311/0x25c0 kernel/signal.c:2874
>>  arch_do_signal_or_restart+0x79/0x5a0 arch/x86/kernel/signal.c:307
>>  exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>>  exit_to_user_mode_prepare+0x11f/0x240 kernel/entry/common.c:204
>>  __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
>>  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297
>>  do_syscall_64+0x46/0x80 arch/x86/entry/common.c:86
>>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> RIP: 0033:0x7f281828edcd
>> Code: Unable to access opcode bytes at 0x7f281828eda3.
>> RSP: 002b:00007f28194c0c98 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>> RAX: fffffffffffffe00 RBX: 00007f28183bbf80 RCX: 00007f281828edcd
>> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f28183bbf88
>> RBP: 00007f28183bbf88 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 00007f28183bbf8c
>> R13: 00007ffd5038e1ef R14: 00007ffd5038e390 R15: 00007f28194c0d80
>>  </TASK>
>> NMI backtrace for cpu 2 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
>> NMI backtrace for cpu 2 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
>> NMI backtrace for cpu 2 skipped: idling at default_idle+0xf/0x20 arch/x86/kernel/process.c:729



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: drm/vkms: deadlock between dev->event_lock and timer
  2023-09-13 14:21     ` Tetsuo Handa
@ 2023-09-13 16:47       ` Linus Torvalds
  -1 siblings, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2023-09-13 16:47 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Haneen Mohammed, Hillf Danton, Sanan Hasanov, Rodrigo Siqueira,
	LKML, DRI, Melissa Wen, Maira Canal, syzkaller

On Wed, 13 Sept 2023 at 07:21, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Hello. A deadlock was reported in drivers/gpu/drm/vkms/ .
> It looks like this locking pattern remains as of 6.6-rc1. Please fix.
>
> void drm_crtc_vblank_off(struct drm_crtc *crtc) {
>   spin_lock_irq(&dev->event_lock);
>   drm_vblank_disable_and_save(dev, pipe) {
>     __disable_vblank(dev, pipe) {
>       crtc->funcs->disable_vblank(crtc) == vkms_disable_vblank {
>         hrtimer_cancel(&out->vblank_hrtimer) { // Retries with dev->event_lock held until lock_hrtimer_base() succeeds.
>           ret = hrtimer_try_to_cancel(timer) {
>             base = lock_hrtimer_base(timer, &flags); // Fails forever because vkms_vblank_simulate() is in progress.

Heh. Ok. This is clearly a bug, but it does seem to be limited to just
the vkms driver, and literally only to the "simulate vblank" case.

The worst part about it is that it's so subtle and not obvious.

Some light grepping seems to show that amdgpu has almost the exact
same pattern in its own vkms thing, except it uses

        hrtimer_try_to_cancel(&amdgpu_crtc->vblank_timer);

directly, which presumably fixes the livelock, but means that the
cancel will fail if it's currently running.

So just doing the same thing in the vkms driver probably fixes things.

Maybe the vkms people need to add a flag to say "it's canceled" so
that it doesn't then get re-enabled?  Or maybe it doesn't matter
and/or already happens for some reason I didn't look into.

                       Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread
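
The difference Linus describes between hrtimer_cancel() and
hrtimer_try_to_cancel() can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: it is not kernel code, the
toy_* names and the "cancelled" flag are hypothetical, and it only mimics
the return convention of hrtimer_try_to_cancel() (1 = was queued and is now
cancelled, 0 = was not active, -1 = the callback is currently running).

```c
/*
 * Userspace model only, not kernel code. All toy_* names are hypothetical.
 * toy_timer_try_to_cancel() mimics the return convention of
 * hrtimer_try_to_cancel(): 1 = was queued and is now cancelled,
 * 0 = was not active, -1 = callback is running and cannot be stopped.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_timer_state { TOY_INACTIVE, TOY_QUEUED, TOY_RUNNING };

struct toy_timer {
	enum toy_timer_state state;
	bool cancelled;	/* hypothetical "it's canceled" flag */
};

/* Non-blocking cancel: never waits for a running callback. */
int toy_timer_try_to_cancel(struct toy_timer *t)
{
	assert(t != NULL);
	if (t->state == TOY_RUNNING)
		return -1;
	if (t->state == TOY_QUEUED) {
		t->state = TOY_INACTIVE;
		return 1;
	}
	return 0;
}

/*
 * Disable path that may run with a lock held that the callback also
 * takes: it records the request first and never spins on the callback.
 */
int toy_timer_disable(struct toy_timer *t)
{
	assert(t != NULL);
	t->cancelled = true;
	return toy_timer_try_to_cancel(t);
}

/* End of the callback: re-arm only if nobody asked for cancellation. */
bool toy_timer_callback_finish(struct toy_timer *t)
{
	assert(t != NULL);
	bool restart = !t->cancelled;

	t->state = restart ? TOY_QUEUED : TOY_INACTIVE;
	return restart;
}
```

In this model the disable path returns -1 instead of spinning when the
callback is running, and the flag keeps the callback from re-arming the
timer afterwards. Whether vkms needs such a flag is exactly the open
question raised above.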

* Re: drm/vkms: deadlock between dev->event_lock and timer
  2023-09-13 16:47       ` Linus Torvalds
@ 2023-09-13 21:08         ` Thomas Gleixner
  -1 siblings, 0 replies; 19+ messages in thread
From: Thomas Gleixner @ 2023-09-13 21:08 UTC (permalink / raw)
  To: Linus Torvalds, Tetsuo Handa
  Cc: Haneen Mohammed, Hillf Danton, Sanan Hasanov, Rodrigo Siqueira,
	LKML, DRI, Melissa Wen, Maira Canal, syzkaller

On Wed, Sep 13 2023 at 09:47, Linus Torvalds wrote:
> On Wed, 13 Sept 2023 at 07:21, Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> Hello. A deadlock was reported in drivers/gpu/drm/vkms/ .
>> It looks like this locking pattern remains as of 6.6-rc1. Please fix.
>>
>> void drm_crtc_vblank_off(struct drm_crtc *crtc) {
>>   spin_lock_irq(&dev->event_lock);
>>   drm_vblank_disable_and_save(dev, pipe) {
>>     __disable_vblank(dev, pipe) {
>>       crtc->funcs->disable_vblank(crtc) == vkms_disable_vblank {
>>         hrtimer_cancel(&out->vblank_hrtimer) { // Retries with dev->event_lock held until lock_hrtimer_base() succeeds.
>>           ret = hrtimer_try_to_cancel(timer) {
>>             base = lock_hrtimer_base(timer, &flags); // Fails forever because vkms_vblank_simulate() is in progress.
>
> Heh. Ok. This is clearly a bug, but it does seem to be limited to just
> the vkms driver, and literally only to the "simulate vblank" case.
>
> The worst part about it is that it's so subtle and not obvious.
>
> Some light grepping seems to show that amdgpu has almost the exact
> same pattern in its own vkms thing, except it uses
>
>         hrtimer_try_to_cancel(&amdgpu_crtc->vblank_timer);
>
> directly, which presumably fixes the livelock, but means that the
> cancel will fail if it's currently running.
>
> So just doing the same thing in the vkms driver probably fixes things.
>
> Maybe the vkms people need to add a flag to say "it's canceled" so
> that it doesn't then get re-enabled?  Or maybe it doesn't matter
> and/or already happens for some reason I didn't look into.

Maybe the VKMS people need to understand locking in the first place. The
first thing I saw in this code is:

static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
{
   ...
   mutex_unlock(&output->enabled_lock);

What?

Unlocking a mutex in the context of a hrtimer callback is simply
violating all mutex locking rules.

How has this code ever survived lock debugging without triggering a big
fat warning?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: drm/vkms: deadlock between dev->event_lock and timer
  2023-09-13 21:08         ` Thomas Gleixner
@ 2023-09-14  6:33           ` Tetsuo Handa
  -1 siblings, 0 replies; 19+ messages in thread
From: Tetsuo Handa @ 2023-09-14  6:33 UTC (permalink / raw)
  To: Maira Canal, Arthur Grillo
  Cc: Haneen Mohammed, Hillf Danton, Sanan Hasanov, Rodrigo Siqueira,
	Linus Torvalds, LKML, DRI, Melissa Wen, syzkaller,
	Thomas Gleixner

On 2023/09/14 6:08, Thomas Gleixner wrote:
> Maybe the VKMS people need to understand locking in the first place. The
> first thing I saw in this code is:
> 
> static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> {
>    ...
>    mutex_unlock(&output->enabled_lock);
> 
> What?
> 
> Unlocking a mutex in the context of a hrtimer callback is simply
> violating all mutex locking rules.
> 
> How has this code ever survived lock debugging without triggering a big
> fat warning?

Commit a0e6a017ab56936c ("drm/vkms: Fix race-condition between the hrtimer
and the atomic commit") in 6.6-rc1 replaced the spinlock with a mutex. So we
haven't tested it with lock debugging yet...

Maíra and Arthur, mutex_unlock() from interrupt context is not permitted.
Please revert that patch immediately.
I guess that a semaphore (down()/up()) could be used instead of a mutex.


^ permalink raw reply	[flat|nested] 19+ messages in thread
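
The rule cited above follows from the kernel's mutex semantics: a mutex may
sleep, only its owner may unlock it, and mutexes may not be used in
interrupt contexts such as timers. A minimal userspace model of those rules
is sketched below. It is not the kernel implementation; every mm_* name is
hypothetical, and a contended lock is reported rather than slept on.

```c
/*
 * Userspace model of mutex usage rules, not kernel code.
 * All mm_* names are hypothetical and exist only for illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm_ctx {
	int task_id;		/* which task is executing */
	bool in_interrupt;	/* true while modelling a timer/IRQ callback */
};

struct mm_mutex {
	bool locked;
	int owner;		/* task_id of the holder, valid when locked */
};

enum mm_result { MM_OK, MM_ERR_IRQ_CONTEXT, MM_ERR_NOT_OWNER, MM_ERR_BUSY };

enum mm_result mm_lock(struct mm_mutex *m, const struct mm_ctx *c)
{
	assert(m != NULL && c != NULL);
	if (c->in_interrupt)
		return MM_ERR_IRQ_CONTEXT;	/* a real mutex could sleep here */
	if (m->locked)
		return MM_ERR_BUSY;		/* the model reports instead of sleeping */
	m->locked = true;
	m->owner = c->task_id;
	return MM_OK;
}

enum mm_result mm_unlock(struct mm_mutex *m, const struct mm_ctx *c)
{
	assert(m != NULL && c != NULL);
	if (c->in_interrupt)
		return MM_ERR_IRQ_CONTEXT;	/* the case flagged in vkms_vblank_simulate() */
	if (!m->locked || m->owner != c->task_id)
		return MM_ERR_NOT_OWNER;	/* only the owner may unlock */
	m->locked = false;
	return MM_OK;
}
```

An unlock attempted from the modelled timer callback is rejected even when
the task id matches, which is the situation the lock debugging code is
expected to warn about.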

* Re: drm/vkms: deadlock between dev->event_lock and timer
  2023-09-14  6:33           ` Tetsuo Handa
@ 2023-09-14  8:12             ` Daniel Vetter
  -1 siblings, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2023-09-14  8:12 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Haneen Mohammed, Hillf Danton, Sanan Hasanov, Rodrigo Siqueira,
	Linus Torvalds, LKML, DRI, Melissa Wen, Maira Canal, syzkaller,
	Thomas Gleixner, Arthur Grillo

On Thu, Sep 14, 2023 at 03:33:41PM +0900, Tetsuo Handa wrote:
> On 2023/09/14 6:08, Thomas Gleixner wrote:
> > Maybe the VKMS people need to understand locking in the first place. The
> > first thing I saw in this code is:
> > 
> > static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > {
> >    ...
> >    mutex_unlock(&output->enabled_lock);
> > 
> > What?
> > 
> > Unlocking a mutex in the context of a hrtimer callback is simply
> > violating all mutex locking rules.
> > 
> > How has this code ever survived lock debugging without triggering a big
> > fat warning?
> 
> Commit a0e6a017ab56936c ("drm/vkms: Fix race-condition between the hrtimer
> and the atomic commit") in 6.6-rc1 replaced spinlock with mutex. So we haven't
> tested with the lock debugging yet...

Yeah that needs an immediate revert, there's not much that looks legit in
that patch. I'll chat with Maira.

Also yes how that landed without anyone running lockdep is ... not good. I
guess we need a lockdep enabled drm ci target that runs vkms tests asap
:-)

> Maíra and Arthur, mutex_unlock() from interrupt context is not permitted.
> Please revert that patch immediately.
> I guess that a semaphore (down()/up()) could be used instead of a mutex.

From a quick look this smells like a classic "try to use locking when you
want synchronization primitives", so semaphore here doesn't look any
better. The vkms_set_composer() function was originally for crc
generation, where it's userspace's job to make sure they wait for all the
crc they need to be generated before they shut it down again. But for
writeback the kernel must guarantee that the composition actually
happens, and the current function just doesn't make any such guarantees.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 19+ messages in thread
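
The distinction drawn above between locking and synchronization can be
sketched as follows. A lock only excludes concurrent access; it does not
make anyone wait until work has finished. The model below tracks which
frames were requested and which were composed, and refuses to shut the
composer down while requested work is outstanding. This is a hypothetical
userspace illustration, not the vkms design: the wb_* names and the frame
counters are assumptions. In the kernel, such waiting is usually expressed
with a completion or a wait queue rather than a mutex or semaphore.

```c
/*
 * Userspace model of "wait for the work, don't just lock around it".
 * All wb_* names are hypothetical; this does not reflect vkms code.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct wb_sync {
	uint64_t requested;	/* highest frame a writeback job asked for */
	uint64_t composed;	/* highest frame the composer has finished */
};

/* A writeback job needs the given frame to be composed. */
void wb_request(struct wb_sync *s, uint64_t frame)
{
	assert(s != NULL);
	if (frame > s->requested)
		s->requested = frame;
}

/* The composer reports that it finished the given frame. */
void wb_frame_composed(struct wb_sync *s, uint64_t frame)
{
	assert(s != NULL);
	if (frame > s->composed)
		s->composed = frame;
}

/* The composer may be disabled only once all requested frames exist. */
bool wb_may_disable_composer(const struct wb_sync *s)
{
	assert(s != NULL);
	return s->composed >= s->requested;
}
```

A real implementation would block the caller until the condition holds
instead of polling a predicate; the model only shows which condition has
to be waited for.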

* Re: drm/vkms: deadlock between dev->event_lock and timer
  2023-09-14  8:12             ` Daniel Vetter
  (?)
@ 2023-09-18 22:02             ` Helen Koike
  2023-09-19  6:38                 ` Daniel Stone
  -1 siblings, 1 reply; 19+ messages in thread
From: Helen Koike @ 2023-09-18 22:02 UTC (permalink / raw)
  To: Tetsuo Handa, Maira Canal, Arthur Grillo, Rodrigo Siqueira,
	Melissa Wen, Haneen Mohammed, David Airlie, DRI, syzkaller, LKML,
	Hillf Danton, Sanan Hasanov, Thomas Gleixner, Linus Torvalds,
	Daniel Stone, David Heidelberg, Vignesh Raman



On 14/09/2023 05:12, Daniel Vetter wrote:
> On Thu, Sep 14, 2023 at 03:33:41PM +0900, Tetsuo Handa wrote:
>> On 2023/09/14 6:08, Thomas Gleixner wrote:
>>> Maybe the VKMS people need to understand locking in the first place. The
>>> first thing I saw in this code is:
>>>
>>> static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
>>> {
>>>     ...
>>>     mutex_unlock(&output->enabled_lock);
>>>
>>> What?
>>>
>>> Unlocking a mutex in the context of a hrtimer callback is simply
>>> violating all mutex locking rules.
>>>
>>> How has this code ever survived lock debugging without triggering a big
>>> fat warning?
>>
>> Commit a0e6a017ab56936c ("drm/vkms: Fix race-condition between the hrtimer
>> and the atomic commit") in 6.6-rc1 replaced spinlock with mutex. So we haven't
>> tested with the lock debugging yet...
> 
> Yeah that needs an immediate revert, there's not much that looks legit in
> that patch. I'll chat with Maira.
> 
> Also yes how that landed without anyone running lockdep is ... not good. I
> guess we need a lockdep enabled drm ci target that runs vkms tests asap
> :-)

btw, I just executed a draft version of the vkms target on the CI, and we do get
the warning:

https://gitlab.freedesktop.org/helen.fornazier/linux/-/jobs/49156305#L623

I'm just not sure if the tests would fail (since it is a warning), and it has
a chance of being ignored if people don't look at the logs (unless igt
already handles that).

I still need to make some adjustments (it seems the tests are hanging
somewhere and we got a timeout), but we should have vkms in drm ci soon.

Regards,
Helen


> 
>> Maíra and Arthur, mutex_unlock() from interrupt context is not permitted.
>> Please revert that patch immediately.
>> I guess that a semaphore (down()/up()) could be used instead of a mutex.
> 
> From a quick look this smells like a classic "try to use locking when you
> want synchronization primitives", so semaphore here doesn't look any
> better. The vkms_set_composer() function was originally for crc
> generation, where it's userspace's job to make sure they wait for all the
> crc they need to be generated before they shut it down again. But for
> writeback the kernel must guarantee that the composition actually
> happens, and the current function just doesn't make any such guarantees.
> 
> Cheers, Daniel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: drm/vkms: deadlock between dev->event_lock and timer
  2023-09-18 22:02             ` Helen Koike
@ 2023-09-19  6:38                 ` Daniel Stone
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Stone @ 2023-09-19  6:38 UTC (permalink / raw)
  To: Helen Koike
  Cc: Haneen Mohammed, Hillf Danton, Sanan Hasanov, Rodrigo Siqueira,
	David Heidelberg, Tetsuo Handa, Linus Torvalds, Vignesh Raman,
	LKML, DRI, Melissa Wen, Maira Canal, syzkaller, Thomas Gleixner,
	Arthur Grillo, Daniel Stone

On Mon, 18 Sept 2023 at 23:02, Helen Koike <helen.koike@collabora.com> wrote:
> On 14/09/2023 05:12, Daniel Vetter wrote:
> > Also yes how that landed without anyone running lockdep is ... not good. I
> > guess we need a lockdep enabled drm ci target that runs vkms tests asap
> > :-)
>
> btw, I just executed a draft version of the vkms target on the CI, and we do get
> the warning:
>
> https://gitlab.freedesktop.org/helen.fornazier/linux/-/jobs/49156305#L623
>
> I'm just not sure if the tests would fail (since it is a warning), and it has
> a chance of being ignored if people don't look at the logs (unless igt
> already handles that).

Hmm, dmesg-warn is already a separate igt test status. I guess it just
needs to be handled explicitly.

> I still need to make some adjustments (it seems the tests are hanging
> somewhere and we got a timeout), but we should have vkms in drm ci soon.

Might be due to the locking explosion? The kernels should probably
have soft-lockup detection enabled as well as lockdep.

Cheers,
Daniel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: BUG: soft lockup in smp_call_function
  2023-09-13 11:07 ` Hillf Danton
  2023-09-13 14:21     ` Tetsuo Handa
@ 2023-09-13 14:30   ` Tetsuo Handa
  2023-09-14 12:21     ` Hillf Danton
  1 sibling, 1 reply; 19+ messages in thread
From: Tetsuo Handa @ 2023-09-13 14:30 UTC (permalink / raw)
  To: Hillf Danton, Sanan Hasanov, Thomas Gleixner, peterz
  Cc: Linus Torvalds, syzkaller, LKML

On 2023/09/13 20:07, Hillf Danton wrote:
> 
> 	cpu1			cpu4 (see below)
> 	====			====
> 	drm_crtc_vblank_off	__run_hrtimer
> 	spin_lock_irq(&dev->event_lock);
> 	...
> 				drm_handle_vblank
> 	hrtimer_cancel		spin_lock_irqsave(&dev->event_lock, irqflags);
> 
> 
> Deadlock should have been reported instead provided the lockdep_map in
> struct timer_list were added also to hrtimer, so it is highly appreciated
> if Tetsuo or Thomas adds it before 6.8 or 6.10.

Not me. ;-)

Since hrtimer_cancel() retries forever until lock_hrtimer_base() succeeds,
we want to add a lockdep annotation into hrtimer_cancel() so that we can
detect this type of deadlock?
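
A minimal userspace sketch of the annotation idea, for illustration only: the timer is treated as a pseudo-lock that is "held" while its callback runs and briefly "taken" by the cancel path. The names dep_graph, dep_acquire, dep_release, EVENT_LOCK and TIMER_MAP are hypothetical and are not kernel APIs; real lockdep is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace model only: none of these names exist in the kernel. */
enum { MAX_CLASSES = 8, MAX_HELD = 8 };
enum { EVENT_LOCK, TIMER_MAP };	/* stand-ins for dev->event_lock and the timer's map */

struct dep_graph {
	bool edge[MAX_CLASSES][MAX_CLASSES];	/* edge[a][b]: b was taken while a was held */
	int held[MAX_HELD];			/* classes currently held, oldest first */
	int nheld;
};

static void dep_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'to' be reached from 'from' over recorded edges? */
static bool dep_reachable(const struct dep_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (g->edge[from][next] && !seen[next] &&
		    dep_reachable(g, next, to, seen))
			return true;
	return false;
}

/* Record that 'cls' is taken now; returns false if that closes a cycle. */
static bool dep_acquire(struct dep_graph *g, int cls)
{
	bool ok = true;

	assert(g->nheld < MAX_HELD);
	for (int i = 0; i < g->nheld; i++) {
		bool seen[MAX_CLASSES] = { false };

		/* Adding held[i] -> cls is a cycle if cls already reaches held[i]. */
		if (dep_reachable(g, cls, g->held[i], seen))
			ok = false;
		g->edge[g->held[i]][cls] = true;
	}
	g->held[g->nheld++] = cls;
	return ok;
}

/* Drop the most recently acquired class. */
static void dep_release(struct dep_graph *g)
{
	assert(g->nheld > 0);
	g->nheld--;
}

/* Replays both paths from the diagram; true means the inversion was flagged. */
static bool model_detects_inversion(void)
{
	struct dep_graph g;
	bool ok;

	dep_init(&g);

	/* Callback path: runs "inside" the timer's map, then takes event_lock. */
	ok = dep_acquire(&g, TIMER_MAP);
	ok = dep_acquire(&g, EVENT_LOCK) && ok;
	dep_release(&g);
	dep_release(&g);

	/* Cancel path: event_lock is held, then cancel touches the timer's map. */
	ok = dep_acquire(&g, EVENT_LOCK) && ok;
	ok = dep_acquire(&g, TIMER_MAP) && ok;
	dep_release(&g);
	dep_release(&g);

	return !ok;
}
```

In this model the two paths never have to race: recording TIMER_MAP -> EVENT_LOCK once and EVENT_LOCK -> TIMER_MAP once is enough for model_detects_inversion() to report the cycle.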



* Re: BUG: soft lockup in smp_call_function
  2023-09-13 14:30   ` BUG: soft lockup in smp_call_function Tetsuo Handa
@ 2023-09-14 12:21     ` Hillf Danton
  2023-09-14 13:13       ` Tetsuo Handa
  0 siblings, 1 reply; 19+ messages in thread
From: Hillf Danton @ 2023-09-14 12:21 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Sanan Hasanov, Thomas Gleixner, Linus Torvalds, syzkaller,
	linux-mm, LKML

On Wed, 13 Sep 2023 23:30:23 +0900 Tetsuo Handa wrote:
> On 2023/09/13 20:07, Hillf Danton wrote:
> > 
> > 	cpu1			cpu4 (see below)
> > 	====			====
> > 	drm_crtc_vblank_off	__run_hrtimer
> > 	spin_lock_irq(&dev->event_lock);
> > 	...
> > 				drm_handle_vblank
> > 	hrtimer_cancel		spin_lock_irqsave(&dev->event_lock, irqflags);
> > 
> > 
> > Deadlock should have been reported instead provided the lockdep_map in
> > struct timer_list were added also to hrtimer, so it is highly appreciated
> > if Tetsuo or Thomas adds it before 6.8 or 6.10.
> 
> Not me. ;-)
> 
> Since hrtimer_cancel() retries forever until lock_hrtimer_base() succeeds,
> we want to add a lockdep annotation into hrtimer_cancel() so that we can
> detect this type of deadlock?

Yes, you are right.

The diff below is my two cents (only for thoughts).

--- x/include/linux/hrtimer.h
+++ y/include/linux/hrtimer.h
@@ -124,6 +124,9 @@ struct hrtimer {
 	u8				is_rel;
 	u8				is_soft;
 	u8				is_hard;
+#ifdef CONFIG_LOCKDEP
+	struct lockdep_map 		lockdep_map;
+#endif
 };
 
 /**
@@ -369,33 +372,65 @@ static inline void hrtimer_cancel_wait_r
 /* Exported timer functions: */
 
 /* Initialize timers: */
-extern void hrtimer_init(struct hrtimer *timer, clockid_t which_clock,
-			 enum hrtimer_mode mode);
-extern void hrtimer_init_sleeper(struct hrtimer_sleeper *sl, clockid_t clock_id,
-				 enum hrtimer_mode mode);
+extern void hrtimer_init_key(struct hrtimer *timer, clockid_t which_clock,
+			 enum hrtimer_mode mode,
+			const char *name, struct lock_class_key *key);
+extern void hrtimer_init_sleeper_key(struct hrtimer_sleeper *sl, clockid_t clock_id,
+				 enum hrtimer_mode mode,
+				const char *name, struct lock_class_key *key);
+#ifdef CONFIG_LOCKDEP
+#define hrtimer_init(t, c, m)						\
+	do {								\
+		static struct lock_class_key __key;			\
+		hrtimer_init_key(t, c, m, #t, &__key);			\
+	} while (0)
+
+#define hrtimer_init_sleeper(s, c, m)					\
+	do {								\
+		static struct lock_class_key __key;			\
+		hrtimer_init_sleeper_key(s, c, m, #s, &__key);		\
+	} while (0)
+#else
+#define hrtimer_init(t, c, m) \
+	hrtimer_init_key(t, c, m, NULL, NULL)
+
+#define hrtimer_init_sleeper(s, c, m) \
+	hrtimer_init_sleeper_key(s, c, m, NULL, NULL)
+#endif
 
 #ifdef CONFIG_DEBUG_OBJECTS_TIMERS
-extern void hrtimer_init_on_stack(struct hrtimer *timer, clockid_t which_clock,
-				  enum hrtimer_mode mode);
-extern void hrtimer_init_sleeper_on_stack(struct hrtimer_sleeper *sl,
+extern void hrtimer_init_on_stack_key(struct hrtimer *timer, clockid_t which_clock,
+				enum hrtimer_mode mode,
+				const char *name, struct lock_class_key *key);
+extern void hrtimer_init_sleeper_on_stack_key(struct hrtimer_sleeper *sl,
 					  clockid_t clock_id,
-					  enum hrtimer_mode mode);
+				enum hrtimer_mode mode,
+				const char *name, struct lock_class_key *key);
+#ifdef CONFIG_LOCKDEP
+  #define hrtimer_init_on_stack(t, c, m) 				\
+	  do {								\
+		static struct lock_class_key __key;			\
+		hrtimer_init_on_stack_key(t, c, m, #t, &__key);		\
+	  } while (0)
+  #define hrtimer_init_sleeper_on_stack(s, c, m) 			\
+	  do {								\
+		static struct lock_class_key __key;			\
+	  	hrtimer_init_sleeper_on_stack_key(s, c, m, #s, &__key);	\
+	  } while (0)
+#else
+  #define hrtimer_init_on_stack(t, c, m) \
+	  hrtimer_init_on_stack_key(t, c, m, NULL, NULL)
+  #define hrtimer_init_sleeper_on_stack(s, c, m) \
+	  hrtimer_init_sleeper_on_stack_key(s, c, m, NULL, NULL)
+#endif
 
 extern void destroy_hrtimer_on_stack(struct hrtimer *timer);
 #else
-static inline void hrtimer_init_on_stack(struct hrtimer *timer,
-					 clockid_t which_clock,
-					 enum hrtimer_mode mode)
-{
-	hrtimer_init(timer, which_clock, mode);
-}
+#define hrtimer_init_on_stack(t, c, m) \
+	hrtimer_init(t, c, m)
 
-static inline void hrtimer_init_sleeper_on_stack(struct hrtimer_sleeper *sl,
-						 clockid_t clock_id,
-						 enum hrtimer_mode mode)
-{
-	hrtimer_init_sleeper(sl, clock_id, mode);
-}
+#define hrtimer_init_sleeper_on_stack(s, c, m) \
+	hrtimer_init_sleeper(s, c, m)
 
 static inline void destroy_hrtimer_on_stack(struct hrtimer *timer) { }
 #endif
--- x/kernel/time/hrtimer.c
+++ y/kernel/time/hrtimer.c
@@ -428,22 +428,26 @@ static inline void debug_hrtimer_deactiv
 static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
 			   enum hrtimer_mode mode);
 
-void hrtimer_init_on_stack(struct hrtimer *timer, clockid_t clock_id,
-			   enum hrtimer_mode mode)
+void hrtimer_init_on_stack_key(struct hrtimer *timer, clockid_t clock_id,
+			enum hrtimer_mode mode,
+			const char *name, struct lock_class_key *key)
 {
 	debug_object_init_on_stack(timer, &hrtimer_debug_descr);
 	__hrtimer_init(timer, clock_id, mode);
+	lockdep_init_map(&timer->lockdep_map, name, key, 0);
 }
-EXPORT_SYMBOL_GPL(hrtimer_init_on_stack);
+EXPORT_SYMBOL_GPL(hrtimer_init_on_stack_key);
 
 static void __hrtimer_init_sleeper(struct hrtimer_sleeper *sl,
 				   clockid_t clock_id, enum hrtimer_mode mode);
 
-void hrtimer_init_sleeper_on_stack(struct hrtimer_sleeper *sl,
-				   clockid_t clock_id, enum hrtimer_mode mode)
+void hrtimer_init_sleeper_on_stack_key(struct hrtimer_sleeper *sl,
+					clockid_t clock_id, enum hrtimer_mode mode,
+					const char *name, struct lock_class_key *key)
 {
 	debug_object_init_on_stack(&sl->timer, &hrtimer_debug_descr);
 	__hrtimer_init_sleeper(sl, clock_id, mode);
+	lockdep_init_map(&sl->timer.lockdep_map, name, key, 0);
 }
-EXPORT_SYMBOL_GPL(hrtimer_init_sleeper_on_stack);
+EXPORT_SYMBOL_GPL(hrtimer_init_sleeper_on_stack_key);
 
@@ -1439,6 +1443,8 @@ int hrtimer_cancel(struct hrtimer *timer
 {
 	int ret;
 
+	lock_map_acquire(&timer->lockdep_map);
+	lock_map_release(&timer->lockdep_map);
 	do {
 		ret = hrtimer_try_to_cancel(timer);
 
@@ -1586,11 +1592,12 @@ static void __hrtimer_init(struct hrtime
  *              but the PINNED bit is ignored as pinning happens
  *              when the hrtimer is started
  */
-void hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
-		  enum hrtimer_mode mode)
+void hrtimer_init_key(struct hrtimer *timer, clockid_t clock_id, enum hrtimer_mode mode,
+			const char *name, struct lock_class_key *key)
 {
 	debug_init(timer, clock_id, mode);
 	__hrtimer_init(timer, clock_id, mode);
+	lockdep_init_map(&timer->lockdep_map, name, key, 0);
 }
-EXPORT_SYMBOL_GPL(hrtimer_init);
+EXPORT_SYMBOL_GPL(hrtimer_init_key);
 
@@ -1647,6 +1654,11 @@ static void __run_hrtimer(struct hrtimer
 	enum hrtimer_restart (*fn)(struct hrtimer *);
 	bool expires_in_hardirq;
 	int restart;
+#ifdef CONFIG_LOCKDEP
+	struct lockdep_map lockdep_map;
+
+	lockdep_copy_map(&lockdep_map, &timer->lockdep_map);
+#endif
 
 	lockdep_assert_held(&cpu_base->lock);
 
@@ -1682,7 +1694,9 @@ static void __run_hrtimer(struct hrtimer
 	trace_hrtimer_expire_entry(timer, now);
 	expires_in_hardirq = lockdep_hrtimer_enter(timer);
 
+	lock_map_acquire(&lockdep_map);
 	restart = fn(timer);
+	lock_map_release(&lockdep_map);
 
 	lockdep_hrtimer_exit(expires_in_hardirq);
 	trace_hrtimer_expire_exit(timer);
@@ -2004,12 +2018,13 @@ static void __hrtimer_init_sleeper(struc
  * @clock_id:	the clock to be used
  * @mode:	timer mode abs/rel
  */
-void hrtimer_init_sleeper(struct hrtimer_sleeper *sl, clockid_t clock_id,
-			  enum hrtimer_mode mode)
+void hrtimer_init_sleeper_key(struct hrtimer_sleeper *sl, clockid_t clock_id,
+			enum hrtimer_mode mode,
+			const char *name, struct lock_class_key *key)
 {
 	debug_init(&sl->timer, clock_id, mode);
 	__hrtimer_init_sleeper(sl, clock_id, mode);
-
+	lockdep_init_map(&sl->timer.lockdep_map, name, key, 0);
 }
-EXPORT_SYMBOL_GPL(hrtimer_init_sleeper);
+EXPORT_SYMBOL_GPL(hrtimer_init_sleeper_key);
 
--



* Re: BUG: soft lockup in smp_call_function
  2023-09-14 12:21     ` Hillf Danton
@ 2023-09-14 13:13       ` Tetsuo Handa
  0 siblings, 0 replies; 19+ messages in thread
From: Tetsuo Handa @ 2023-09-14 13:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Sanan Hasanov, Linus Torvalds, syzkaller, LKML, Hillf Danton

On 2023/09/14 21:21, Hillf Danton wrote:
> On Wed, 13 Sep 2023 23:30:23 +0900 Tetsuo Handa wrote:
>> On 2023/09/13 20:07, Hillf Danton wrote:
>>>
>>> 	cpu1			cpu4 (see below)
>>> 	====			====
>>> 	drm_crtc_vblank_off	__run_hrtimer
>>> 	spin_lock_irq(&dev->event_lock);
>>> 	...
>>> 				drm_handle_vblank
>>> 	hrtimer_cancel		spin_lock_irqsave(&dev->event_lock, irqflags);
>>>
>>>
>>> Deadlock should have been reported instead provided the lockdep_map in
>>> struct timer_list were added also to hrtimer, so it is highly appreciated
>>> if Tetsuo or Thomas adds it before 6.8 or 6.10.
>>
>> Not me. ;-)
>>
>> Since hrtimer_cancel() retries forever until lock_hrtimer_base() succeeds,
>> we want to add a lockdep annotation into hrtimer_cancel() so that we can
>> detect this type of deadlock?

Here is a reproducer.

----------------------------------------
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/hrtimer.h>
#include <linux/spinlock.h>
static DEFINE_SPINLOCK(lock1);
static struct hrtimer timer1;
static enum hrtimer_restart timer_func(struct hrtimer *timer)
{
	unsigned long flags;
	mdelay(100); // Wait for test_init() to hold lock1.
	spin_lock_irqsave(&lock1, flags);
	spin_unlock_irqrestore(&lock1, flags);
	return HRTIMER_RESTART;
}
static int __init test_init(void)
{
	unsigned long flags;
	hrtimer_init(&timer1, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
	timer1.function = &timer_func;
	hrtimer_start(&timer1, 1, HRTIMER_MODE_REL);
	mdelay(10); // Wait for timer_func() to start.
	spin_lock_irqsave(&lock1, flags);
	hrtimer_cancel(&timer1); // Wait for timer_func() to finish.
	spin_unlock_irqrestore(&lock1, flags);
	return -EINVAL;
}
module_init(test_init);
MODULE_LICENSE("GPL");
----------------------------------------

----------------------------------------
[  996.507681] test: loading out-of-tree module taints kernel.
[  996.514019] test: module verification failed: signature and/or required key missing - tainting kernel
[ 1061.893054] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 1061.900411] rcu: 	4-...0: (1 GPs behind) idle=ed6c/1/0x4000000000000000 softirq=3304/3305 fqs=15784
[ 1061.909128] rcu: 	(detected by 0, t=65018 jiffies, g=12625, q=4422 ncpus=12)
[ 1061.915003] Sending NMI from CPU 0 to CPUs 4:
[ 1061.918972] NMI backtrace for cpu 4
[ 1061.919036] CPU: 4 PID: 3826 Comm: insmod Tainted: G           OE      6.6.0-rc1+ #20
[ 1061.919093] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 1061.919095] RIP: 0010:delay_tsc+0x34/0xa0
[ 1061.919560] Code: ff 05 e8 b1 26 70 65 44 8b 0d e4 b1 26 70 0f 01 f9 66 90 48 c1 e2 20 48 09 c2 49 89 d0 eb 21 65 ff 0d c8 b1 26 70 74 54 f3 90 <65> ff 05 bd b1 26 70 65 8b 35 ba b1 26 70 41 39 f1 75 28 41 89 f1
[ 1061.919563] RSP: 0018:ffffb471c059cf00 EFLAGS: 00000083
[ 1061.919567] RAX: 0000028efe104ef6 RBX: 0000000000000041 RCX: 0000000000000004
[ 1061.919569] RDX: 00000000002192f8 RSI: 0000000000000004 RDI: 000000000027d81e
[ 1061.919571] RBP: ffff8970dafe5040 R08: 0000028efdeebbfe R09: 0000000000000004
[ 1061.919572] R10: 0000000000000001 R11: ffffffffc0a8d600 R12: ffffffff90e030e0
[ 1061.919574] R13: ffff8970dafe5040 R14: ffffffffc0a8b010 R15: ffff8970dafe5100
[ 1061.919630] FS:  00007fdd998eb740(0000) GS:ffff8970dae00000(0000) knlGS:0000000000000000
[ 1061.919633] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1061.919635] CR2: 0000000001edf678 CR3: 00000001a1060000 CR4: 0000000000350ee0
[ 1061.919639] Call Trace:
[ 1061.919641]  <NMI>
[ 1061.919646]  ? nmi_cpu_backtrace+0xb1/0x130
[ 1061.919711]  ? nmi_cpu_backtrace_handler+0x11/0x20
[ 1061.922096]  ? nmi_handle+0xe4/0x290
[ 1061.922163]  ? default_do_nmi+0x49/0x100
[ 1061.922196]  ? exc_nmi+0x152/0x1e0
[ 1061.922198]  ? end_repeat_nmi+0x16/0x67
[ 1061.922340]  ? __pfx_timer_func+0x10/0x10 [test]
[ 1061.922347]  ? delay_tsc+0x34/0xa0
[ 1061.922349]  ? delay_tsc+0x34/0xa0
[ 1061.922350]  ? delay_tsc+0x34/0xa0
[ 1061.922352]  </NMI>
[ 1061.922353]  <IRQ>
[ 1061.922353]  timer_func+0x19/0xff0 [test]
[ 1061.922358]  __hrtimer_run_queues+0x177/0x3a0
[ 1061.922362]  hrtimer_interrupt+0x104/0x240
[ 1061.922364]  ? __do_softirq+0x2db/0x392
[ 1061.922827]  __sysvec_apic_timer_interrupt+0x64/0x180
[ 1061.922833]  sysvec_apic_timer_interrupt+0x65/0x80
[ 1061.922894]  </IRQ>
[ 1061.922896]  <TASK>
[ 1061.922898]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 1061.922902] RIP: 0010:delay_tsc+0x4d/0xa0
[ 1061.922907] Code: c2 49 89 d0 eb 21 65 ff 0d c8 b1 26 70 74 54 f3 90 65 ff 05 bd b1 26 70 65 8b 35 ba b1 26 70 41 39 f1 75 28 41 89 f1 0f 01 f9 <66> 90 48 c1 e2 20 48 09 d0 48 89 c2 4c 29 c2 48 39 fa 72 c8 65 ff
[ 1061.922909] RSP: 0018:ffffb471c1e63bd0 EFLAGS: 00000246
[ 1061.922912] RAX: 00000000751ed8ab RBX: 000000000000000a RCX: 0000000000000004
[ 1061.922914] RDX: 0000000000000267 RSI: 0000000000000004 RDI: 000000000027d81e
[ 1061.922915] RBP: ffffffffc0a91010 R08: 00000267751adc59 R09: 0000000000000004
[ 1061.922917] R10: 0000000000000001 R11: ffffffff90cd85c8 R12: 0000000000000000
[ 1061.922918] R13: ffffb471c1e63d20 R14: 0000000000000000 R15: ffffffffc0a8d080
[ 1061.922923]  ? __pfx_test_init+0x10/0x10 [test]
[ 1061.922934]  test_init+0x52/0xff0 [test]
[ 1061.922941]  do_one_initcall+0x5c/0x280
[ 1061.923004]  ? kmalloc_trace+0xa9/0xc0
[ 1061.923105]  do_init_module+0x60/0x240
[ 1061.923111]  load_module+0x1f6e/0x20d0
[ 1061.923119]  ? ima_post_read_file+0xe3/0xf0
[ 1061.923225]  ? init_module_from_file+0x88/0xc0
[ 1061.923229]  init_module_from_file+0x88/0xc0
[ 1061.923238]  idempotent_init_module+0x19c/0x250
[ 1061.923244]  ? security_capable+0x39/0x60
[ 1061.923304]  __x64_sys_finit_module+0x5b/0xb0
[ 1061.923310]  do_syscall_64+0x3b/0x90
[ 1061.923366]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 1061.923421] RIP: 0033:0x7fdd986f8e29
[ 1061.923427] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 17 e0 2c 00 f7 d8 64 89 01 48
[ 1061.923429] RSP: 002b:00007ffe2f34dd18 EFLAGS: 00000206 ORIG_RAX: 0000000000000139
[ 1061.923432] RAX: ffffffffffffffda RBX: 0000000001ede240 RCX: 00007fdd986f8e29
[ 1061.923434] RDX: 0000000000000000 RSI: 000000000041a96e RDI: 0000000000000003
[ 1061.923435] RBP: 000000000041a96e R08: 0000000000000000 R09: 00007ffe2f34deb8
[ 1061.923436] R10: 0000000000000003 R11: 0000000000000206 R12: 0000000000000000
[ 1061.923437] R13: 0000000001ede210 R14: 0000000000000000 R15: 0000000000000000
[ 1061.923444]  </TASK>
[ 1061.923446] INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 4.474 msecs
----------------------------------------

> 
> Yes, you are right.
> 
> The diff below is my two cents (only for thoughts).
> 

I'm thinking something like below. (Completely untested.)

I haven't checked IRQ state handling. But in the last diff chunk, why is
raw_spin_unlock_irqrestore() (which does not re-enable IRQs if the caller
already disabled IRQs) used before calling the callback function, while
raw_spin_lock_irq() (which always disables IRQs) is used after calling the
callback function? Is it legal to disable IRQs again when the caller already
disabled IRQs?
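
The asymmetry described above can be modelled in userspace with a single flag standing in for the CPU's IRQ-enable state. This is only a sketch of the semantics stated in the parentheses. It assumes the flags handed to the irqrestore call were saved by an earlier irqsave in the caller, which the quoted hunk does not show. All names (cpu_model, model_lock_irqsave and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for the CPU's IRQ-enable state. */
struct cpu_model {
	bool irqs_enabled;
};

/* What the model observes at three points around the callback. */
struct callback_window {
	bool irqs_on_in_callback;
	bool irqs_on_after_relock;
	bool irqs_on_at_exit;
};

/* IRQ half of a lock_irqsave(): remember the current state, then disable. */
static bool model_lock_irqsave(struct cpu_model *cpu)
{
	bool flags = cpu->irqs_enabled;

	cpu->irqs_enabled = false;
	return flags;
}

/* IRQ half of an unlock_irqrestore(): put back whatever was saved. */
static void model_unlock_irqrestore(struct cpu_model *cpu, bool flags)
{
	cpu->irqs_enabled = flags;
}

/* IRQ half of a lock_irq(): disable unconditionally, remember nothing. */
static void model_lock_irq(struct cpu_model *cpu)
{
	cpu->irqs_enabled = false;
}

/* Replays the lock/unlock sequence around the callback for one caller state. */
static struct callback_window model_run_callback(bool caller_irqs_enabled)
{
	struct cpu_model cpu = { .irqs_enabled = caller_irqs_enabled };
	struct callback_window w;
	bool flags;

	flags = model_lock_irqsave(&cpu);	/* assumed: caller took the lock earlier */
	assert(!cpu.irqs_enabled);

	model_unlock_irqrestore(&cpu, flags);	/* lock dropped before the callback */
	w.irqs_on_in_callback = cpu.irqs_enabled;

	model_lock_irq(&cpu);			/* lock retaken after the callback */
	w.irqs_on_after_relock = cpu.irqs_enabled;

	model_unlock_irqrestore(&cpu, flags);	/* assumed: caller's final unlock */
	w.irqs_on_at_exit = cpu.irqs_enabled;
	return w;
}
```

In the model, the callback sees the caller's original IRQ state, the relock always leaves IRQs off, and disabling an already-disabled flag changes nothing. Whether the same holds for the real call sites is the open question above.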

----------------------------------------
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 0ee140176f10..5640730ec31c 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -123,8 +123,11 @@ struct hrtimer {
 	u8				state;
 	u8				is_rel;
 	u8				is_soft;
 	u8				is_hard;
+#ifdef CONFIG_LOCKDEP
+	struct lockdep_map lockdep_map;
+#endif
 };
 
 /**
  * struct hrtimer_sleeper - simple sleeper structure
@@ -440,15 +443,15 @@ static inline void hrtimer_restart(struct hrtimer *timer)
 	hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
 }
 
 /* Query timers: */
-extern ktime_t __hrtimer_get_remaining(const struct hrtimer *timer, bool adjust);
+extern ktime_t __hrtimer_get_remaining(struct hrtimer *timer, bool adjust);
 
 /**
  * hrtimer_get_remaining - get remaining time for the timer
  * @timer:	the timer to read
  */
-static inline ktime_t hrtimer_get_remaining(const struct hrtimer *timer)
+static inline ktime_t hrtimer_get_remaining(struct hrtimer *timer)
 {
 	return __hrtimer_get_remaining(timer, false);
 }
 
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 238262e4aba7..fe0681d34b56 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -161,14 +161,23 @@ static inline bool is_migration_base(struct hrtimer_clock_base *base)
  * possible to set timer->base = &migration_base and drop the lock: the timer
  * remains locked.
  */
 static
-struct hrtimer_clock_base *lock_hrtimer_base(const struct hrtimer *timer,
+struct hrtimer_clock_base *lock_hrtimer_base(struct hrtimer *timer,
 					     unsigned long *flags)
 	__acquires(&timer->base->lock)
 {
 	struct hrtimer_clock_base *base;
 
+#ifdef CONFIG_LOCKDEP
+	unsigned long flags2;
+
+	local_irq_save(flags2);
+	lock_map_acquire(&timer->lockdep_map);
+	lock_map_release(&timer->lockdep_map);
+	local_irq_restore(flags2);
+#endif
+
 	for (;;) {
 		base = READ_ONCE(timer->base);
 		if (likely(base != &migration_base)) {
 			raw_spin_lock_irqsave(&base->cpu_base->lock, *flags);
@@ -1456,9 +1465,9 @@ EXPORT_SYMBOL_GPL(hrtimer_cancel);
  * __hrtimer_get_remaining - get remaining time for the timer
  * @timer:	the timer to read
  * @adjust:	adjust relative timers when CONFIG_TIME_LOW_RES=y
  */
-ktime_t __hrtimer_get_remaining(const struct hrtimer *timer, bool adjust)
+ktime_t __hrtimer_get_remaining(struct hrtimer *timer, bool adjust)
 {
 	unsigned long flags;
 	ktime_t rem;
 
@@ -1574,8 +1583,14 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
 	timer->is_soft = softtimer;
 	timer->is_hard = !!(mode & HRTIMER_MODE_HARD);
 	timer->base = &cpu_base->clock_base[base];
 	timerqueue_init(&timer->node);
+#ifdef CONFIG_LOCKDEP
+	{
+		static struct lock_class_key __key;
+		lockdep_init_map(&timer->lockdep_map, "hrtimer", &__key, 0);
+	}
+#endif
 }
 
 /**
  * hrtimer_init - initialize a timer to the given clock
@@ -1684,9 +1699,19 @@ static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
 	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
 	trace_hrtimer_expire_entry(timer, now);
 	expires_in_hardirq = lockdep_hrtimer_enter(timer);
 
+#ifdef CONFIG_LOCKDEP
+	local_irq_save(flags);
+	lock_map_acquire(&timer->lockdep_map);
+	local_irq_restore(flags);
+#endif
 	restart = fn(timer);
+#ifdef CONFIG_LOCKDEP
+	local_irq_save(flags);
+	lock_map_release(&timer->lockdep_map);
+	local_irq_restore(flags);
+#endif
 
 	lockdep_hrtimer_exit(expires_in_hardirq);
 	trace_hrtimer_expire_exit(timer);
 	raw_spin_lock_irq(&cpu_base->lock);
----------------------------------------




Thread overview: 19+ messages
2023-09-12 23:02 BUG: soft lockup in smp_call_function Sanan Hasanov
2023-09-13 10:05 ` Peter Zijlstra
2023-09-13 11:07 ` Hillf Danton
2023-09-13 14:21   ` drm/vkms: deadlock between dev->event_lock and timer Tetsuo Handa
2023-09-13 14:21     ` Tetsuo Handa
2023-09-13 16:47     ` Linus Torvalds
2023-09-13 16:47       ` Linus Torvalds
2023-09-13 21:08       ` Thomas Gleixner
2023-09-13 21:08         ` Thomas Gleixner
2023-09-14  6:33         ` Tetsuo Handa
2023-09-14  6:33           ` Tetsuo Handa
2023-09-14  8:12           ` Daniel Vetter
2023-09-14  8:12             ` Daniel Vetter
2023-09-18 22:02             ` Helen Koike
2023-09-19  6:38               ` Daniel Stone
2023-09-19  6:38                 ` Daniel Stone
2023-09-13 14:30   ` BUG: soft lockup in smp_call_function Tetsuo Handa
2023-09-14 12:21     ` Hillf Danton
2023-09-14 13:13       ` Tetsuo Handa
