* next-20250327 - lockdep whine and USB issues at boot
@ 2025-03-28 18:10 Valdis Klētnieks
2025-03-28 20:32 ` Luck, Tony
2025-03-29 2:06 ` Ming Lei
0 siblings, 2 replies; 5+ messages in thread
From: Valdis Klētnieks @ 2025-03-28 18:10 UTC (permalink / raw)
To: Stephen Rothwell, Jens Axboe; +Cc: linux-next, linux-kernel, linux-block
Saw this during boot on a Dell Inspiron 5559 laptop.
In addition, the external USB ports all gave up, rendering a USB mouse and a
USB external drive totally dead in the water. May or may not be related, I didn't
dig too far into it.
[ 40.842033] [ T953] io scheduler bfq registered
[ 41.022391] [ T817] ======================================================
[ 41.103507] [ T817] WARNING: possible circular locking dependency detected
[ 41.184587] [ T817] 6.14.0-next-20250327 #110 Tainted: G I T
[ 41.265700] [ T817] ------------------------------------------------------
[ 41.346832] [ T817] (udev-worker)/817 is trying to acquire lock:
[ 41.427952] [ T817] ffff93a2c80ae9f0 (&q->elevator_lock){+.+.}-{4:4}, at: elv_iosched_store+0xe1/0x260
[ 41.830112] [ T817]
but task is already holding lock:
[ 41.912022] [ T817] ffff93a2c80ae460 (&q->q_usage_counter(io)#10){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x11/0x20
[ 42.394431] [ T817]
which lock already depends on the new lock.
[ 42.477193] [ T817]
the existing dependency chain (in reverse order) is:
[ 42.559132] [ T817]
-> #2 (&q->q_usage_counter(io)#10){++++}-{0:0}:
[ 43.042361] [ T817] lock_acquire.part.0+0xbe/0x240
[ 43.123452] [ T817] blk_alloc_queue+0x30b/0x350
[ 43.204547] [ T817] blk_mq_alloc_queue+0x62/0xd0
[ 43.285646] [ T817] scsi_alloc_sdev+0x29c/0x3d0
[ 43.366744] [ T817] scsi_probe_and_add_lun+0x1d8/0x2b0
[ 43.447847] [ T817] __scsi_add_device+0x114/0x130
[ 43.528950] [ T817] ata_scsi_scan_host+0x7a/0x190
[ 43.610047] [ T817] async_run_entry_fn+0x24/0xc0
[ 43.691137] [ T817] process_one_work+0x21e/0x5a0
[ 43.772226] [ T817] worker_thread+0x1d5/0x3c0
[ 43.853316] [ T817] kthread+0x114/0x230
[ 43.934369] [ T817] ret_from_fork+0x2c/0x50
[ 44.015453] [ T817] ret_from_fork_asm+0x1a/0x30
[ 44.096532] [ T817]
-> #1 (fs_reclaim){+.+.}-{0:0}:
[ 44.499438] [ T817] lock_acquire.part.0+0xbe/0x240
[ 44.580476] [ T817] fs_reclaim_acquire+0xa8/0xe0
[ 44.661541] [ T817] blk_mq_alloc_and_init_hctx+0x16a/0x240
[ 44.742613] [ T817] blk_mq_realloc_hw_ctxs+0x2fb/0x390
[ 44.823681] [ T817] blk_mq_init_allocated_queue+0x13b/0x460
[ 44.904744] [ T817] blk_mq_alloc_queue+0x7b/0xd0
[ 44.985800] [ T817] __blk_mq_alloc_disk+0x13/0x60
[ 45.066848] [ T817] loop_add+0x1fd/0x3e0
[ 45.147899] [ T817] loop_init+0x17b/0x1d0
[ 45.228942] [ T817] do_one_initcall+0x83/0x3c0
[ 45.309991] [ T817] do_initcalls+0x130/0x1b0
[ 45.391040] [ T817] kernel_init_freeable+0x292/0x300
[ 45.472088] [ T817] kernel_init+0x15/0x130
[ 45.553131] [ T817] ret_from_fork+0x2c/0x50
[ 45.634147] [ T817] ret_from_fork_asm+0x1a/0x30
[ 45.715192] [ T817]
-> #0 (&q->elevator_lock){+.+.}-{4:4}:
[ 46.117972] [ T817] check_prev_add+0xe1/0xcf0
[ 46.198974] [ T817] __lock_acquire+0x1031/0x13b0
[ 46.280004] [ T817] lock_acquire.part.0+0xbe/0x240
[ 46.361026] [ T817] __mutex_lock+0xcb/0xfc0
[ 46.442014] [ T817] elv_iosched_store+0xe1/0x260
[ 46.523029] [ T817] kernfs_fop_write_iter+0x160/0x240
[ 46.604038] [ T817] vfs_write+0x2ec/0x5c0
[ 46.685016] [ T817] ksys_write+0x7a/0xf0
[ 46.766019] [ T817] do_syscall_64+0x68/0x140
[ 46.847024] [ T817] entry_SYSCALL_64_after_hwframe+0x71/0x79
[ 46.927997] [ T817]
other info that might help us debug this:
[ 47.010420] [ T817] Chain exists of:
&q->elevator_lock --> fs_reclaim --> &q->q_usage_counter(io)#10
[ 47.654630] [ T817] Possible unsafe locking scenario:
[ 47.736287] [ T817] CPU0 CPU1
[ 47.817245] [ T817] ---- ----
[ 47.898175] [ T817] lock(&q->q_usage_counter(io)#10);
[ 48.219898] [ T817] lock(fs_reclaim);
[ 48.461357] [ T817] lock(&q->q_usage_counter(io)#10);
[ 48.783086] [ T817] lock(&q->elevator_lock);
[ 49.024555] [ T817]
*** DEADLOCK ***
[ 49.106827] [ T817] 5 locks held by (udev-worker)/817:
[ 49.187752] [ T817] #0: ffff93a2e079a440 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x7a/0xf0
[ 49.750206] [ T817] #1: ffff93a2c62c6690 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x119/0x240
[ 50.312697] [ T817] #2: ffff93a2c2312ad0 (kn->active#89){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x122/0x240
[ 50.875194] [ T817] #3: ffff93a2c80ae460 (&q->q_usage_counter(io)#10){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x11/0x20
[ 51.437693] [ T817] #4: ffff93a2c80ae4a0 (&q->q_usage_counter(queue)){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x11/0x20
[ 51.919935] [ T817]
stack backtrace:
[ 52.001533] [ T817] CPU: 0 UID: 0 PID: 817 Comm: (udev-worker) Tainted: G I T 6.14.0-next-20250327 #110 PREEMPT(full) 238541a35a93f433dad83bf02a8fb62af6fea5a1
[ 52.081794] [ T817] Tainted: [I]=FIRMWARE_WORKAROUND, [T]=RANDSTRUCT
[ 52.162052] [ T817] Hardware name: Dell Inc. Unidentified System/052K07, BIOS 1.9.0 09/07/2020
[ 52.242309] [ T817] Call Trace:
[ 52.322568] [ T817] <TASK>
[ 52.402828] [ T817] dump_stack_lvl+0x65/0x90
[ 52.483089] [ T817] print_circular_bug.cold+0x38/0x46
[ 52.563351] [ T817] check_noncircular+0x169/0x190
[ 52.643613] [ T817] check_prev_add+0xe1/0xcf0
[ 52.723871] [ T817] ? add_chain_cache+0x115/0x4c0
[ 52.804133] [ T817] __lock_acquire+0x1031/0x13b0
[ 52.884394] [ T817] ? mark_usage+0x65/0x180
[ 52.964654] [ T817] lock_acquire.part.0+0xbe/0x240
[ 53.044913] [ T817] ? elv_iosched_store+0xe1/0x260
[ 53.125175] [ T817] ? lock_acquire+0xf8/0x140
[ 53.205435] [ T817] __mutex_lock+0xcb/0xfc0
[ 53.285695] [ T817] ? elv_iosched_store+0xe1/0x260
[ 53.365955] [ T817] ? mark_held_locks+0x40/0x70
[ 53.446214] [ T817] ? elv_iosched_store+0xe1/0x260
[ 53.526473] [ T817] ? lockdep_hardirqs_on_prepare.part.0+0x8e/0x170
[ 53.606736] [ T817] ? elv_iosched_store+0xe1/0x260
[ 53.686996] [ T817] elv_iosched_store+0xe1/0x260
[ 53.767260] [ T817] kernfs_fop_write_iter+0x160/0x240
[ 53.847520] [ T817] vfs_write+0x2ec/0x5c0
[ 53.927784] [ T817] ksys_write+0x7a/0xf0
[ 54.008044] [ T817] do_syscall_64+0x68/0x140
[ 54.088307] [ T817] entry_SYSCALL_64_after_hwframe+0x71/0x79
[ 54.168565] [ T817] RIP: 0033:0x7fd9f9a7ca06
[ 54.248825] [ T817] Code: 5d e8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 19 83 e2 39 83 fa 08 75 11 e8 26 ff ff ff 66 0f 1f 44 00 00 48 8b 45 10 0f 05 <48> 8b 5d f8 c9 c3 0f 1f 40 00 f3 0f 1e fa 55 48 89 e5 48 83 ec 08
[ 54.329083] [ T817] RSP: 002b:00007ffe23386d50 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 54.489599] [ T817] RAX: ffffffffffffffda RBX: 0000564ed601dff0 RCX: 00007fd9f9a7ca06
[ 54.569857] [ T817] RDX: 0000000000000003 RSI: 00007ffe233870a0 RDI: 000000000000001c
[ 54.650115] [ T817] RBP: 00007ffe23386d70 R08: 0000000000000000 R09: 0000000000000000
[ 54.730373] [ T817] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
[ 54.810\0\0\0\0\0\0\0\0630] [ T817] R13: 0000000000000003 R14: 00007ffe233870a0 R15: 00007ffe233870a0
[ 54.890894] [ T817] </TASK>
[ 54.971553] [ T25] xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command
[ 55.051817] [ T25] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
[ 55.132085] [ T25] xhci_hcd 0000:00:14.0: HC died; cleaning up
[ 55.212411] [ T59] xhci_hcd 0000:00:14.0: Timeout while waiting for stop endpoint command
[ 55.253106] [ T47] kauditd_printk_skb: 81 callbacks suppressed
[ 55.354880] [ T984] usb 1-2: USB disconnect, device number 2
[ 55.375007] [ T986] usb 2-3: USB disconnect, device number 2
[ 55.568668] [ T984] usb 1-3: USB disconnect, device number 3
[ 55.784923] [ T984] usb 1-6: USB disconnect, device number 4
[ 55.874703] [ T984] usb 1-7: USB disconnect, device number 5
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: next-20250327 - lockdep whine and USB issues at boot
2025-03-28 18:10 next-20250327 - lockdep whine and USB issues at boot Valdis Klētnieks
@ 2025-03-28 20:32 ` Luck, Tony
2025-03-29 2:06 ` Ming Lei
1 sibling, 0 replies; 5+ messages in thread
From: Luck, Tony @ 2025-03-28 20:32 UTC (permalink / raw)
To: Valdis Klētnieks
Cc: Stephen Rothwell, Jens Axboe, linux-next, linux-kernel,
linux-block
On Fri, Mar 28, 2025 at 02:10:53PM -0400, Valdis Klētnieks wrote:
> Saw this during boot on a Dell Inspiron 5559 laptop.
>
> In addition, the external USB ports all gave up, rendering a USB mouse and a
> USB external drive totally dead in the water. May or may not be related, I didn't
> dig too far into it.
>
> [ 40.842033] [ T953] io scheduler bfq registered
>
> [ 41.022391] [ T817] ======================================================
> [ 41.103507] [ T817] WARNING: possible circular locking dependency detected
> [ 41.184587] [ T817] 6.14.0-next-20250327 #110 Tainted: G I T
> [ 41.265700] [ T817] ------------------------------------------------------
> [ 41.346832] [ T817] (udev-worker)/817 is trying to acquire lock:
> [ 41.427952] [ T817] ffff93a2c80ae9f0 (&q->elevator_lock){+.+.}-{4:4}, at: elv_iosched_store+0xe1/0x260
> [ 41.830112] [ T817]
> but task is already holding lock:
> [ 41.912022] [ T817] ffff93a2c80ae460 (&q->q_usage_counter(io)#10){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x11/0x20
> [ 42.394431] [ T817]
> which lock already depends on the new lock.
>
> [ 42.477193] [ T817]
> the existing dependency chain (in reverse order) is:
> [ 42.559132] [ T817]
> -> #2 (&q->q_usage_counter(io)#10){++++}-{0:0}:
> [ 43.042361] [ T817] lock_acquire.part.0+0xbe/0x240
> [ 43.123452] [ T817] blk_alloc_queue+0x30b/0x350
> [ 43.204547] [ T817] blk_mq_alloc_queue+0x62/0xd0
> [ 43.285646] [ T817] scsi_alloc_sdev+0x29c/0x3d0
> [ 43.366744] [ T817] scsi_probe_and_add_lun+0x1d8/0x2b0
> [ 43.447847] [ T817] __scsi_add_device+0x114/0x130
> [ 43.528950] [ T817] ata_scsi_scan_host+0x7a/0x190
> [ 43.610047] [ T817] async_run_entry_fn+0x24/0xc0
> [ 43.691137] [ T817] process_one_work+0x21e/0x5a0
> [ 43.772226] [ T817] worker_thread+0x1d5/0x3c0
> [ 43.853316] [ T817] kthread+0x114/0x230
> [ 43.934369] [ T817] ret_from_fork+0x2c/0x50
> [ 44.015453] [ T817] ret_from_fork_asm+0x1a/0x30
> [ 44.096532] [ T817]
I see similar looking lockdep splat on a 2 socket Icelake server.
Kernel built from this Linus commit:
acb4f33713b9 ("Merge tag 'm68knommu-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu")
[ 30.253858] systemd-journald[2054]: Received client request to flush runtime journal.
[ 31.187581] ======================================================
[ 31.195139] WARNING: possible circular locking dependency detected
[ 31.202480] 6.14.0+ #32 Not tainted
[ 31.207291] ------------------------------------------------------
[ 31.214922] (udev-worker)/2193 is trying to acquire lock:
[ 31.221796] ff41ee0f1836f4d8 (&q->elevator_lock){+.+.}-{4:4}, at: elv_iosched_store+0xe9/0x260
[ 31.231864]
but task is already holding lock:
[ 31.239595] ff41ee0f1836efa8 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x12/0x20
[ 31.250814]
which lock already depends on the new lock.
[ 31.261167]
the existing dependency chain (in reverse order) is:
[ 31.271104]
-> #2 (&q->q_usage_counter(io)){++++}-{0:0}:
[ 31.279299] blk_alloc_queue+0x30e/0x350
[ 31.284432] blk_mq_alloc_queue+0x62/0xd0
[ 31.290416] scsi_alloc_sdev+0x280/0x3c0
[ 31.296321] scsi_probe_and_add_lun+0x223/0x470
[ 31.302846] __scsi_add_device+0x10d/0x120
[ 31.308927] ata_scsi_scan_host+0x9c/0x1b0
[ 31.315003] async_run_entry_fn+0x31/0x130
[ 31.321016] process_one_work+0x22f/0x5b0
[ 31.326962] worker_thread+0x1cc/0x3c0
[ 31.332665] kthread+0xfc/0x240
[ 31.337784] ret_from_fork+0x31/0x50
[ 31.342252] ret_from_fork_asm+0x1a/0x30
[ 31.347061]
-> #1 (fs_reclaim){+.+.}-{0:0}:
[ 31.353404] fs_reclaim_acquire+0x9d/0xd0
[ 31.358273] kmem_cache_alloc_node_noprof+0x59/0x420
[ 31.364083] scsi_mq_init_request+0x32/0xe0
[ 31.369107] blk_mq_alloc_and_init_hctx+0x143/0x4d0
[ 31.374826] blk_mq_realloc_hw_ctxs+0x2f5/0x390
[ 31.380176] blk_mq_init_allocated_queue+0x17b/0x570
[ 31.385962] blk_mq_alloc_queue+0x7b/0xd0
[ 31.390795] scsi_alloc_sdev+0x280/0x3c0
[ 31.395603] scsi_probe_and_add_lun+0x223/0x470
[ 31.400929] __scsi_add_device+0x10d/0x120
[ 31.405828] ata_scsi_scan_host+0x9c/0x1b0
[ 31.410719] async_run_entry_fn+0x31/0x130
[ 31.415601] process_one_work+0x22f/0x5b0
[ 31.420396] worker_thread+0x1cc/0x3c0
[ 31.424997] kthread+0xfc/0x240
[ 31.428983] ret_from_fork+0x31/0x50
[ 31.433334] ret_from_fork_asm+0x1a/0x30
[ 31.438026]
-> #0 (&q->elevator_lock){+.+.}-{4:4}:
[ 31.444767] __lock_acquire+0x1510/0x2630
[ 31.449606] lock_acquire+0xcb/0x2d0
[ 31.454040] __mutex_lock+0xca/0xe50
[ 31.458382] elv_iosched_store+0xe9/0x260
[ 31.463145] kernfs_fop_write_iter+0x165/0x240
[ 31.468334] vfs_write+0x2b0/0x540
[ 31.472499] ksys_write+0x71/0xf0
[ 31.476573] do_syscall_64+0x95/0x180
[ 31.481122] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 31.486936]
other info that might help us debug this:
[ 31.495922] Chain exists of:
&q->elevator_lock --> fs_reclaim --> &q->q_usage_counter(io)
[ 31.508010] Possible unsafe locking scenario:
[ 31.514353] CPU0 CPU1
[ 31.519098] ---- ----
[ 31.523833] lock(&q->q_usage_counter(io));
[ 31.528314] lock(fs_reclaim);
[ 31.534183] lock(&q->q_usage_counter(io));
[ 31.541178] lock(&q->elevator_lock);
[ 31.545145]
*** DEADLOCK ***
[ 31.551676] 5 locks held by (udev-worker)/2193:
[ 31.556420] #0: ff41ee0f18e77420 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x71/0xf0
[ 31.564568] #1: ff41ee0f1aabbe88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x11e/0x240
[ 31.573842] #2: ff41ee0f0cfefe48 (kn->active#86){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x127/0x240
[ 31.583197] #3: ff41ee0f1836efa8 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x12/0x20
[ 31.593949] #4: ff41ee0f1836efe0 (&q->q_usage_counter(queue)){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x12/0x20
[ 31.604967]
stack backtrace:
[ 31.609793] CPU: 1 UID: 0 PID: 2193 Comm: (udev-worker) Not tainted 6.14.0+ #32 PREEMPT(voluntary)
[ 31.609797] Hardware name: Intel Corporation WilsonCity/WilsonCity, BIOS WLYDCRB1.SYS.0021.P06.2104260458 04/26/2021
[ 31.609800] Call Trace:
[ 31.609803] <TASK>
[ 31.609807] dump_stack_lvl+0x6e/0xa0
[ 31.609816] print_circular_bug.cold+0x178/0x1be
[ 31.609822] check_noncircular+0x146/0x160
[ 31.609828] __lock_acquire+0x1510/0x2630
[ 31.609833] lock_acquire+0xcb/0x2d0
[ 31.609835] ? elv_iosched_store+0xe9/0x260
[ 31.609842] __mutex_lock+0xca/0xe50
[ 31.609846] ? elv_iosched_store+0xe9/0x260
[ 31.609849] ? elv_iosched_store+0xe9/0x260
[ 31.609853] ? elv_iosched_store+0xe9/0x260
[ 31.609855] elv_iosched_store+0xe9/0x260
[ 31.609859] kernfs_fop_write_iter+0x165/0x240
[ 31.609862] vfs_write+0x2b0/0x540
[ 31.609868] ksys_write+0x71/0xf0
[ 31.609871] do_syscall_64+0x95/0x180
[ 31.609874] ? lock_acquire+0xcb/0x2d0
[ 31.609875] ? ktime_get_coarse_real_ts64+0x15/0x70
[ 31.609881] ? find_held_lock+0x2b/0x80
[ 31.609888] ? ktime_get_coarse_real_ts64+0x15/0x70
[ 31.609890] ? file_has_perm+0xa8/0xf0
[ 31.609897] ? syscall_exit_to_user_mode_prepare+0x21b/0x250
[ 31.609901] ? lockdep_hardirqs_on_prepare+0xdb/0x190
[ 31.609904] ? syscall_exit_to_user_mode+0x97/0x290
[ 31.609907] ? do_syscall_64+0xa1/0x180
[ 31.609908] ? find_held_lock+0x2b/0x80
[ 31.609911] ? fd_install+0xa4/0x2c0
[ 31.609917] ? do_sys_openat2+0xa4/0xe0
[ 31.609918] ? kmem_cache_free+0x13b/0x460
[ 31.609926] ? do_sys_openat2+0xa4/0xe0
[ 31.609928] ? syscall_exit_to_user_mode_prepare+0x21b/0x250
[ 31.609930] ? lockdep_hardirqs_on_prepare+0xdb/0x190
[ 31.609931] ? syscall_exit_to_user_mode+0x97/0x290
[ 31.609933] ? do_syscall_64+0xa1/0x180
[ 31.609935] ? lock_acquire+0xcb/0x2d0
[ 31.609937] ? ktime_get+0x22/0x100
[ 31.609940] ? find_held_lock+0x2b/0x80
[ 31.609942] ? ktime_get+0x22/0x100
[ 31.609946] ? sched_clock+0x10/0x30
[ 31.609947] ? sched_clock_cpu+0xf/0x1f0
[ 31.609951] ? irqtime_account_irq+0x3e/0xc0
[ 31.609955] ? handle_softirqs+0x475/0x4d0
[ 31.609960] ? sched_clock_cpu+0xf/0x1f0
[ 31.609963] ? clear_bhb_loop+0x15/0x70
[ 31.609966] ? clear_bhb_loop+0x15/0x70
[ 31.609968] ? clear_bhb_loop+0x15/0x70
[ 31.609971] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 31.609973] RIP: 0033:0x7fd14baf4484
[ 31.609978] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 45 9c 10 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
[ 31.609979] RSP: 002b:00007ffe2d91f6e8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 31.609982] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fd14baf4484
[ 31.609984] RDX: 0000000000000003 RSI: 00007ffe2d91f9f0 RDI: 000000000000005a
[ 31.609985] RBP: 00007ffe2d91f710 R08: 00007fd14bbf51c8 R09: 00007ffe2d91f7c0
[ 31.609986] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
[ 31.609987] R13: 00007ffe2d91f9f0 R14: 000055eccc8f80a0 R15: 00007fd14bbf4e80
[ 31.609992] </TASK>
-Tony
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: next-20250327 - lockdep whine and USB issues at boot
2025-03-28 18:10 next-20250327 - lockdep whine and USB issues at boot Valdis Klētnieks
2025-03-28 20:32 ` Luck, Tony
@ 2025-03-29 2:06 ` Ming Lei
2025-03-29 6:02 ` Valdis Klētnieks
1 sibling, 1 reply; 5+ messages in thread
From: Ming Lei @ 2025-03-29 2:06 UTC (permalink / raw)
To: Valdis Klētnieks
Cc: Stephen Rothwell, Jens Axboe, linux-next, linux-kernel,
linux-block
On Fri, Mar 28, 2025 at 02:10:53PM -0400, Valdis Klētnieks wrote:
> Saw this during boot on a Dell Inspiron 5559 laptop.
>
> In addition, the external USB ports all gave up, rendering a USB mouse and a
> USB external drive totally dead in the water. May or may not be related, I didn't
> dig too far into it.
It shouldn't be related to the warning.
>
> [ 40.842033] [ T953] io scheduler bfq registered
>
> [ 41.022391] [ T817] ======================================================
> [ 41.103507] [ T817] WARNING: possible circular locking dependency detected
> [ 41.184587] [ T817] 6.14.0-next-20250327 #110 Tainted: G I T
> [ 41.265700] [ T817] ------------------------------------------------------
> [ 41.346832] [ T817] (udev-worker)/817 is trying to acquire lock:
> [ 41.427952] [ T817] ffff93a2c80ae9f0 (&q->elevator_lock){+.+.}-{4:4}, at: elv_iosched_store+0xe1/0x260
> [ 41.830112] [ T817]
> but task is already holding lock:
> [ 41.912022] [ T817] ffff93a2c80ae460 (&q->q_usage_counter(io)#10){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x11/0x20
For this lockdep warning, feel free to try patch in the following link:
https://lore.kernel.org/linux-block/Z-dUCLvf06SfTOHy@fedora/
Thanks,
Ming
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: next-20250327 - lockdep whine and USB issues at boot
2025-03-29 2:06 ` Ming Lei
@ 2025-03-29 6:02 ` Valdis Klētnieks
2025-03-31 9:21 ` Ming Lei
0 siblings, 1 reply; 5+ messages in thread
From: Valdis Klētnieks @ 2025-03-29 6:02 UTC (permalink / raw)
To: Ming Lei
Cc: Stephen Rothwell, Jens Axboe, linux-next, linux-kernel,
linux-block
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=us-ascii, Size: 10548 bytes --]
On Sat, 29 Mar 2025 10:06:39 +0800, Ming Lei said:
> On Fri, Mar 28, 2025 at 02:10:53PM -0400, Valdis Klētnieks wrote:
> > Saw this during boot on a Dell Inspiron 5559 laptop.
> >
> > In addition, the external USB ports all gave up, rendering a USB mouse and a
> > USB external drive totally dead in the water. May or may not be related, I didn't
> > dig too far into it.
>
> It shouldn't be related to the warning.
> For this lockdep warning, feel free to try patch in the following link:
>
> https://lore.kernel.org/linux-block/Z-dUCLvf06SfTOHy@fedora/
After applying that patch, USB *didn't* die during boot. So apparently
*something* changed.
Also, the patch merely caused a *different* lockdep warning.
Rather than &q->q_usage_counter(io) and &q->elevator_lock, the
new one is &q->elevator_lock versus pcpu_alloc_mutex...
Looks like it's a bit more convoluted than first looked?
[ 52.140990] [ T954] io scheduler bfq registered
[ 52.355286] [ T848] ======================================================
[ 52.436367] [ T848] WARNING: possible circular locking dependency detected
[ 52.517414] [ T848] 6.14.0-next-20250327-dirty #111 Tainted: G I T
[ 52.598492] [ T848] ------------------------------------------------------
[ 52.679571] [ T848] (udev-worker)/848 is trying to acquire lock:
[ 52.760647] [ T848] ffffffff99bb4470 (pcpu_alloc_mutex){+.+.}-{4:4}, at: pcpu_alloc_noprof+0x524/0x840
[ 53.162776] [ T848]
but task is already holding lock:
[ 53.244616] [ T848] ffff8c22c3392428 (&q->elevator_lock){+.+.}-{4:4}, at: elv_iosched_store+0xe1/0x260
[ 53.646741] [ T848]
which lock already depends on the new lock.
[ 53.729459] [ T848]
the existing dependency chain (in reverse order) is:
[ 53.811348] [ T848]
-> #3 (&q->elevator_lock){+.+.}-{4:4}:
[ 54.214266] [ T848] lock_acquire.part.0+0xbe/0x240
[ 54.295342] [ T848] __mutex_lock+0xcb/0xfc0
[ 54.376409] [ T848] elv_iosched_store+0xe1/0x260
[ 54.457476] [ T848] kernfs_fop_write_iter+0x160/0x240
[ 54.538548] [ T848] vfs_write+0x2ec/0x5c0
[ 54.619619] [ T848] ksys_write+0x7a/0xf0
[ 54.700687] [ T848] do_syscall_64+0x68/0x140
[ 54.781751] [ T848] entry_SYSCALL_64_after_hwframe+0x71/0x79
[ 54.862814] [ T848]
-> #2 (&q->q_usage_counter(io)#10){++++}-{0:0}:
[ 55.345959] [ T848] lock_acquire.part.0+0xbe/0x240
[ 55.427018] [ T848] blk_alloc_queue+0x30b/0x350
[ 55.508082] [ T848] blk_mq_alloc_queue+0x62/0xd0
[ 55.589134] [ T848] scsi_alloc_sdev+0x29c/0x3d0
[ 55.670185] [ T848] scsi_probe_and_add_lun+0x1d8/0x2b0
[ 55.751232] [ T848] __scsi_add_device+0x114/0x130
[ 55.832276] [ T848] ata_scsi_scan_host+0x7a/0x190
[ 55.913321] [ T848] async_run_entry_fn+0x24/0xc0
[ 55.994358] [ T848] process_one_work+0x21e/0x5a0
[ 56.075404] [ T848] worker_thread+0x1d5/0x3c0
[ 56.156444] [ T848] kthread+0x114/0x230
[ 56.237474] [ T848] ret_from_fork+0x2c/0x50
[ 56.318492] [ T848] ret_from_fork_asm+0x1a/0x30
[ 56.399515] [ T848]
-> #1 (fs_reclaim){+.+.}-{0:0}:
[ 56.802275] [ T848] lock_acquire.part.0+0xbe/0x240
[ 56.883298] [ T848] fs_reclaim_acquire+0xa8/0xe0
[ 56.964317] [ T848] __alloc_frozen_pages_noprof+0xd2/0x340
[ 57.045338] [ T848] __alloc_pages_noprof+0x9/0x20
[ 57.126360] [ T848] pcpu_alloc_pages.isra.0+0x97/0x180
[ 57.207375] [ T848] pcpu_populate_chunk+0x34/0x70
[ 57.288388] [ T848] pcpu_alloc_noprof+0x2d1/0x840
[ 57.369366] [ T848] iommu_dma_init_fq+0x91/0x260
[ 57.450373] [ T848] iommu_dma_init_domain+0x1de/0x210
[ 57.531381] [ T848] iommu_setup_dma_ops+0x47/0x80
[ 57.612355] [ T848] iommu_device_register+0x172/0x250
[ 57.693348] [ T848] intel_iommu_init+0x5ef/0xbb0
[ 57.774344] [ T848] pci_iommu_init+0x29/0x90
[ 57.855333] [ T848] do_one_initcall+0x83/0x3c0
[ 57.936316] [ T848] do_initcalls+0x130/0x1b0
[ 58.017300] [ T848] kernel_init_freeable+0x292/0x300
[ 58.098281] [ T848] kernel_init+0x15/0x130
[ 58.179254] [ T848] ret_from_fork+0x2c/0x50
[ 58.260224] [ T848] ret_from_fork_asm+0x1a/0x30
[ 58.341195] [ T848]
-> #0 (pcpu_alloc_mutex){+.+.}-{4:4}:
[ 58.743832] [ T848] check_prev_add+0xe1/0xcf0
[ 58.824752] [ T848] __lock_acquire+0x1031/0x13b0
[ 58.905697] [ T848] lock_acquire.part.0+0xbe/0x240
[ 58.986633] [ T848] __mutex_lock+0xcb/0xfc0
[ 59.067565] [ T848] pcpu_alloc_noprof+0x524/0x840
[ 59.148463] [ T848] sbitmap_init_node+0xf0/0x230
[ 59.229392] [ T848] sbitmap_queue_init_node+0x29/0x140
[ 59.310317] [ T848] blk_mq_init_tags+0x95/0x110
[ 59.391240] [ T848] blk_mq_alloc_map_and_rqs+0x3c/0x110
[ 59.472131] [ T848] blk_mq_init_sched+0x100/0x260
[ 59.553017] [ T848] elevator_switch+0xb1/0x190
[ 59.633926] [ T848] elv_iosched_store+0x1db/0x260
[ 59.714841] [ T848] kernfs_fop_write_iter+0x160/0x240
[ 59.795728] [ T848] vfs_write+0x2ec/0x5c0
[ 59.876599] [ T848] ksys_write+0x7a/0xf0
[ 59.957500] [ T848] do_syscall_64+0x68/0x140
[ 60.038396] [ T848] entry_SYSCALL_64_after_hwframe+0x71/0x79
[ 60.119269] [ T848]
other info that might help us debug this:
[ 60.201395] [ T848] Chain exists of:
pcpu_alloc_mutex --> &q->q_usage_counter(io)#10 --> &q->elevator_lock
[ 60.845339] [ T848] Possible unsafe locking scenario:
[ 60.926872] [ T848] CPU0 CPU1
[ 61.007776] [ T848] ---- ----
[ 61.088676] [ T848] lock(&q->elevator_lock);
[ 61.330053] [ T848] lock(&q->q_usage_counter(io)#10);
[ 61.651731] [ T848] lock(&q->elevator_lock);
[ 61.893151] [ T848] lock(pcpu_alloc_mutex);
[ 62.134602] [ T848]
*** DEADLOCK ***
[ 62.216729] [ T848] 6 locks held by (udev-worker)/848:
[ 62.297630] [ T848] #0: ffff8c22c8575440 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x7a/0xf0
[ 62.860055] [ T848] #1: ffff8c22c9fb4090 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x119/0x240
[ 63.422482] [ T848] #2: ffff8c22c821fad0 (kn->active#89){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x122/0x240
[ 63.984915] [ T848] #3: ffff8c22c3391e98 (&q->q_usage_counter(io)#10){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x11/0x20
[ 64.547390] [ T848] #4: ffff8c22c3391ed8 (&q->q_usage_counter(queue)){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x11/0x20
[ 65.029578] [ T848] #5: ffff8c22c3392428 (&q->elevator_lock){+.+.}-{4:4}, at: elv_iosched_store+0xe1/0x260
[ 65.511794] [ T848]
stack backtrace:
[ 65.593348] [ T848] CPU: 1 UID: 0 PID: 848 Comm: (udev-worker) Tainted: G I T 6.14.0-next-20250327-dirty #111 PREEMPT(full) 7b54f670bb22238ec039f82f30e0fd3bcc8dbc83
[ 65.673611] [ T848] Tainte\0\0\0\0\0\0\0\0d: [I]=FIRMWARE_WORKAROUND, [T]=RANDSTRUCT
[ 65.753868] [ T848] Hardware name: Dell Inc. Unidentified System/052K07, BIOS 1.9.0 09/07/2020
[ 65.834126] [ T848] Call Trace:
[ 65.914385] [ T848] <TASK>
[ 65.994655] [ T848] dump_stack_lvl+0x65/0x90
[ 66.074916] [ T848] print_circular_bug.cold+0x38/0x46
[ 66.155178] [ T848] check_noncircular+0x169/0x190
[ 66.235441] [ T848] check_prev_add+0xe1/0xcf0
[ 66.315699] [ T848] ? add_chain_cache+0x115/0x4c0
[ 66.395961] [ T848] __lock_acquire+0x1031/0x13b0
[ 66.476225] [ T848] lock_acquire.part.0+0xbe/0x240
[ 66.556484] [ T848] ? pcpu_alloc_noprof+0x524/0x840
[ 66.636745] [ T848] ? lock_acquire+0xf8/0x140
[ 66.717006] [ T848] __mutex_lock+0xcb/0xfc0
[ 66.797266] [ T848] ? pcpu_alloc_noprof+0x524/0x840
[ 66.877526] [ T848] ? lockdep_hardirqs_on+0x84/0x130
[ 66.957786] [ T848] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 67.038044] [ T848] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 67.118303] [ T848] ? pcpu_alloc_noprof+0x524/0x840
[ 67.198566] [ T848] ? pcpu_alloc_noprof+0x524/0x840
[ 67.278825] [ T848] ? find_held_lock+0x2b/0x80
[ 67.359084] [ T848] pcpu_alloc_noprof+0x524/0x840
[ 67.439344] [ T848] ? local_clock_noinstr+0xf/0x100
[ 67.519608] [ T848] sbitmap_init_node+0xf0/0x230
[ 67.599870] [ T848] sbitmap_queue_init_node+0x29/0x140
[ 67.680131] [ T848] blk_mq_init_tags+0x95/0x110
[ 67.760392] [ T848] blk_mq_alloc_map_and_rqs+0x3c/0x110
[ 67.840654] [ T848] blk_mq_init_sched+0x100/0x260
[ 67.920916] [ T848] elevator_switch+0xb1/0x190
[ 68.001176] [ T848] ? elv_iosched_store+0x1d0/0x260
[ 68.081437] [ T848] elv_iosched_store+0x1db/0x260
[ 68.161700] [ T848] kernfs_fop_write_iter+0x160/0x240
[ 68.241960] [ T848] vfs_write+0x2ec/0x5c0
[ 68.322224] [ T848] ksys_write+0x7a/0xf0
[ 68.402485] [ T848] do_syscall_64+0x68/0x140
[ 68.482745] [ T848] entry_SYSCALL_64_after_hwframe+0x71/0x79
[ 68.563004] [ T848] RIP: 0033:0x7f2a1747ca06
[ 68.643263] [ T848] Code: 5d e8 41 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 19 83 e2 39 83 fa 08 75 11 e8 26 ff ff ff 66 0f 1f 44 00 00 48 8b 45 10 0f 05 <48> 8b 5d f8 c9 c3 0f 1f 40 00 f3 0f 1e fa 55 48 89 e5 48 83 ec 08
[ 68.723521] [ T848] RSP: 002b:00007ffe48933eb0 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 68.884037] [ T848] RAX: ffffffffffffffda RBX: 000056119a4eaff0 RCX: 00007f2a1747ca06
[ 68.964295] [ T848] RDX: 0000000000000003 RSI: 00007ffe48934200 RDI: 0000000000000027
[ 69.044553] [ T848] RBP: 00007ffe48933ed0 R08: 0000000000000000 R09: 0000000000000000
[ 69.124811] [ T848] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003
[ 69.205068] [ T848] R13: 0000000000000003 R14: 00007ffe48934200 R15: 00007ffe48934200
[ 69.285332] [ T848] </TASK>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: next-20250327 - lockdep whine and USB issues at boot
2025-03-29 6:02 ` Valdis Klētnieks
@ 2025-03-31 9:21 ` Ming Lei
0 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2025-03-31 9:21 UTC (permalink / raw)
To: Valdis Klētnieks
Cc: Stephen Rothwell, Jens Axboe, linux-next, linux-kernel,
linux-block
On Sat, Mar 29, 2025 at 02:02:10AM -0400, Valdis Klētnieks wrote:
> On Sat, 29 Mar 2025 10:06:39 +0800, Ming Lei said:
> > On Fri, Mar 28, 2025 at 02:10:53PM -0400, Valdis Kl??tnieks wrote:
> > > Saw this during boot on a Dell Inspiron 5559 laptop.
> > >
> > > In addition, the external USB ports all gave up, rendering a USB mouse and a
> > > USB external drive totally dead in the water. May or may not be related, I didn't
> > > dig too far into it.
> >
> > It shouldn't be related to the warning.
>
> > For this lockdep warning, feel free to try patch in the following link:
> >
> > https://lore.kernel.org/linux-block/Z-dUCLvf06SfTOHy@fedora/
>
> After applying that patch, USB *didn't* die during boot. So apparently
> *something* changed.
That is surprising, and maybe the USB die isn't 100% thing.
>
> Also, the patch merely caused a *different* lockdep warning.
> Rather than &q->q_usage_counter(io) and &q->elevator_lock, the
> new one is &q->elevator_lock versus pcpu_alloc_mutex...
>
> Looks like it's a bit more convoluted than first looked?
That is another story wrt. freeze lock, fs_reclaim & percpu allocator
lock, looks one real risk, I will try to work one patch and see if
all can be addressed.
Thanks,
Ming
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-03-31 9:21 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-28 18:10 next-20250327 - lockdep whine and USB issues at boot Valdis Klētnieks
2025-03-28 20:32 ` Luck, Tony
2025-03-29 2:06 ` Ming Lei
2025-03-29 6:02 ` Valdis Klētnieks
2025-03-31 9:21 ` Ming Lei
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).