* [REPORT] Possible circular locking dependency on 6.18-rc2 in blkg_conf_open_bdev_frozen+0x80/0xa0
@ 2025-10-23 21:12 David Wei
2025-10-24 4:54 ` Nilay Shroff
From: David Wei @ 2025-10-23 21:12 UTC (permalink / raw)
To: linux-block, cgroups
Cc: hch, hare, ming.lei, dlemoal, axboe, tj, josef, gjoyce, lkp,
oliver.sang, Nilay Shroff
Hi folks, hit this with lockdep on 6.18-rc2:
[ 36.862405] ======================================================
[ 36.862406] WARNING: possible circular locking dependency detected
[ 36.862408] 6.18.0-rc2-gdbafbca31432-dirty #97 Tainted: G S E
[ 36.862409] ------------------------------------------------------
[ 36.862410] fb-cgroups-setu/1420 is trying to acquire lock:
[ 36.862411] ffff8884035502a8 (&q->rq_qos_mutex){+.+.}-{4:4}, at: blkg_conf_open_bdev_frozen+0x80/0xa0
[ 36.943164]
but task is already holding lock:
[ 36.954824] ffff8884035500a8 (&q->q_usage_counter(io)#2){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0xe/0x20
[ 36.975183]
which lock already depends on the new lock.
[ 36.991541]
the existing dependency chain (in reverse order) is:
[ 37.006502]
-> #4 (&q->q_usage_counter(io)#2){++++}-{0:0}:
[ 37.020429] blk_alloc_queue+0x345/0x380
[ 37.029315] blk_mq_alloc_queue+0x51/0xb0
[ 37.038376] __blk_mq_alloc_disk+0x14/0x60
[ 37.047612] nvme_alloc_ns+0xa7/0xbc0
[ 37.055976] nvme_scan_ns+0x25a/0x320
[ 37.064339] async_run_entry_fn+0x28/0x110
[ 37.073576] process_one_work+0x1e1/0x570
[ 37.082634] worker_thread+0x184/0x330
[ 37.091170] kthread+0xe6/0x1e0
[ 37.098489] ret_from_fork+0x20b/0x260
[ 37.107026] ret_from_fork_asm+0x11/0x20
[ 37.115912]
-> #3 (fs_reclaim){+.+.}-{0:0}:
[ 37.127232] fs_reclaim_acquire+0x91/0xd0
[ 37.136293] kmem_cache_alloc_lru_noprof+0x49/0x760
[ 37.147090] __d_alloc+0x30/0x2a0
[ 37.154759] d_alloc_parallel+0x4c/0x760
[ 37.163644] __lookup_slow+0xc3/0x180
[ 37.172008] simple_start_creating+0x57/0xc0
[ 37.181590] debugfs_start_creating.part.0+0x4d/0xe0
[ 37.192561] debugfs_create_dir+0x3e/0x1f0
[ 37.201795] regulator_init+0x24/0x100
[ 37.210335] do_one_initcall+0x46/0x250
[ 37.219043] kernel_init_freeable+0x22c/0x430
[ 37.228799] kernel_init+0x16/0x1b0
[ 37.236814] ret_from_fork+0x20b/0x260
[ 37.245351] ret_from_fork_asm+0x11/0x20
[ 37.254235]
-> #2 (&sb->s_type->i_mutex_key#3){+.+.}-{4:4}:
[ 37.268336] down_write+0x25/0xa0
[ 37.276003] simple_start_creating+0x29/0xc0
[ 37.285582] debugfs_start_creating.part.0+0x4d/0xe0
[ 37.296552] debugfs_create_dir+0x3e/0x1f0
[ 37.305782] blk_register_queue+0x98/0x1c0
[ 37.315014] __add_disk+0x21e/0x3b0
[ 37.323030] add_disk_fwnode+0x75/0x160
[ 37.331738] nvme_alloc_ns+0x395/0xbc0
[ 37.340275] nvme_scan_ns+0x25a/0x320
[ 37.348638] async_run_entry_fn+0x28/0x110
[ 37.357870] process_one_work+0x1e1/0x570
[ 37.366929] worker_thread+0x184/0x330
[ 37.375461] kthread+0xe6/0x1e0
[ 37.382779] ret_from_fork+0x20b/0x260
[ 37.391315] ret_from_fork_asm+0x11/0x20
[ 37.400200]
-> #1 (&q->debugfs_mutex){+.+.}-{4:4}:
[ 37.412736] __mutex_lock+0x83/0x1070
[ 37.421100] rq_qos_add+0xde/0x130
[ 37.428942] wbt_init+0x160/0x200
[ 37.436612] blk_register_queue+0xe9/0x1c0
[ 37.445843] __add_disk+0x21e/0x3b0
[ 37.453859] add_disk_fwnode+0x75/0x160
[ 37.462568] nvme_alloc_ns+0x395/0xbc0
[ 37.471105] nvme_scan_ns+0x25a/0x320
[ 37.479469] async_run_entry_fn+0x28/0x110
[ 37.488702] process_one_work+0x1e1/0x570
[ 37.497761] worker_thread+0x184/0x330
[ 37.506296] kthread+0xe6/0x1e0
[ 37.513618] ret_from_fork+0x20b/0x260
[ 37.522154] ret_from_fork_asm+0x11/0x20
[ 37.531038]
-> #0 (&q->rq_qos_mutex){+.+.}-{4:4}:
[ 37.543399] __lock_acquire+0x15fc/0x2730
[ 37.552460] lock_acquire+0xb5/0x2a0
[ 37.560647] __mutex_lock+0x83/0x1070
[ 37.569010] blkg_conf_open_bdev_frozen+0x80/0xa0
[ 37.579457] ioc_qos_write+0x35/0x4a0
[ 37.587820] kernfs_fop_write_iter+0x15c/0x240
[ 37.597750] vfs_write+0x31f/0x4c0
[ 37.605590] ksys_write+0x58/0xd0
[ 37.613257] do_syscall_64+0x6f/0x1120
[ 37.621790] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 37.632932]
other info that might help us debug this:
[ 37.648935] Chain exists of:
&q->rq_qos_mutex --> fs_reclaim --> &q->q_usage_counter(io)#2
[ 37.671552] Possible unsafe locking scenario:
[ 37.683385]        CPU0                    CPU1
[ 37.692438]        ----                    ----
[ 37.701489]   lock(&q->q_usage_counter(io)#2);
[ 37.710374]                                lock(fs_reclaim);
[ 37.721691]                                lock(&q->q_usage_counter(io)#2);
[ 37.735615]   lock(&q->rq_qos_mutex);
[ 37.742934]
*** DEADLOCK ***
[ 37.754767] 6 locks held by fb-cgroups-setu/1420:
[ 37.764168] #0: ffff88840ce38e78 (&f->f_pos_lock){+.+.}-{4:4}, at: fdget_pos+0x7a/0xb0
[ 37.780179] #1: ffff88841c292400 (sb_writers#8){.+.+}-{0:0}, at: ksys_write+0x58/0xd0
[ 37.796018] #2: ffff88840cfdba88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0xfd/0x240
[ 37.813943] #3: ffff888404374428 (kn->active#105){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x112/0x240
[ 37.832390] #4: ffff8884035500a8 (&q->q_usage_counter(io)#2){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0xe/0x20
[ 37.853618] #5: ffff8884035500e0 (&q->q_usage_counter(queue)#2){+.+.}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0xe/0x20
[ 37.875365]
stack backtrace:
[ 37.884071] CPU: 19 UID: 0 PID: 1420 Comm: fb-cgroups-setu Tainted: G S E 6.18.0-rc2-gdbafbca31432-dirty #97 NONE
[ 37.884075] Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
[ 37.884075] Hardware name: Quanta Delta Lake MP 29F0EMA01D0/Delta Lake-Class1, BIOS F0E_3A21 06/27/2024
[ 37.884077] Call Trace:
[ 37.884078] <TASK>
[ 37.884079] dump_stack_lvl+0x7e/0xc0
[ 37.884083] print_circular_bug+0x2c2/0x400
[ 37.884087] check_noncircular+0x118/0x130
[ 37.884090] ? save_trace+0x46/0x370
[ 37.884093] ? add_lock_to_list+0x2c/0x1a0
[ 37.884096] __lock_acquire+0x15fc/0x2730
[ 37.884101] lock_acquire+0xb5/0x2a0
[ 37.884103] ? blkg_conf_open_bdev_frozen+0x80/0xa0
[ 37.884108] __mutex_lock+0x83/0x1070
[ 37.884111] ? blkg_conf_open_bdev_frozen+0x80/0xa0
[ 37.884114] ? mark_held_locks+0x49/0x70
[ 37.884135] ? blkg_conf_open_bdev_frozen+0x80/0xa0
[ 37.884140] ? blkg_conf_open_bdev_frozen+0x80/0xa0
[ 37.884143] blkg_conf_open_bdev_frozen+0x80/0xa0
[ 37.884147] ioc_qos_write+0x35/0x4a0
[ 37.884150] ? kernfs_root+0x6e/0x160
[ 37.884154] ? kernfs_root+0x73/0x160
[ 37.884157] ? kernfs_root_flags+0xa/0x10
[ 37.884160] ? kn_priv+0x29/0x70
[ 37.884164] ? cgroup_file_write+0x2b/0x260
[ 37.884168] kernfs_fop_write_iter+0x15c/0x240
[ 37.884172] vfs_write+0x31f/0x4c0
[ 37.884176] ksys_write+0x58/0xd0
[ 37.884179] do_syscall_64+0x6f/0x1120
[ 37.884182] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 37.884184] RIP: 0033:0x7f4d8171b87d
[ 37.884191] Code: e5 48 83 ec 20 48 89 55 e8 48 89 75 f0 89 7d f8 e8 c8 b3 f7 ff 41 89 c0 48 8b 55 e8 48 8b 75 f0 8b 7d f8 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3b 44 89 c7 48 89 45 f8 e8 ff b3 f7 ff 48 8b
[ 37.884193] RSP: 002b:00007ffe01969880 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 37.884195] RAX: ffffffffffffffda RBX: 0000000000000041 RCX: 00007f4d8171b87d
[ 37.884196] RDX: 0000000000000041 RSI: 00007f4d7f7aae60 RDI: 0000000000000007
[ 37.884197] RBP: 00007ffe019698a0 R08: 0000000000000000 R09: 00007f4d7f7aae60
[ 37.884199] R10: 00007f4d8160afd0 R11: 0000000000000293 R12: 0000000000000041
[ 37.884200] R13: 0000000000000007 R14: 00007ffe0196a020 R15: 0000000000000000
[ 37.884205] </TASK>
Happens consistently on boot.
Thanks,
David
* Re: [REPORT] Possible circular locking dependency on 6.18-rc2 in blkg_conf_open_bdev_frozen+0x80/0xa0
2025-10-23 21:12 [REPORT] Possible circular locking dependency on 6.18-rc2 in blkg_conf_open_bdev_frozen+0x80/0xa0 David Wei
@ 2025-10-24 4:54 ` Nilay Shroff
2025-10-27 20:30 ` Bart Van Assche
From: Nilay Shroff @ 2025-10-24 4:54 UTC (permalink / raw)
To: David Wei, linux-block, cgroups
Cc: hch, hare, ming.lei, dlemoal, axboe, tj, josef, gjoyce, lkp,
oliver.sang
On 10/24/25 2:42 AM, David Wei wrote:
> Hi folks, hit this with lockdep on 6.18-rc2:
>
> [ 36.862405] ======================================================
> [ 36.862406] WARNING: possible circular locking dependency detected
> [ 36.862408] 6.18.0-rc2-gdbafbca31432-dirty #97 Tainted: G S E
> [ 36.862409] ------------------------------------------------------
> [ 36.862410] fb-cgroups-setu/1420 is trying to acquire lock:
> [ 36.862411] ffff8884035502a8 (&q->rq_qos_mutex){+.+.}-{4:4}, at: blkg_conf_open_bdev_frozen+0x80/0xa0
> [ 36.943164]
> but task is already holding lock:
> [ 36.954824] ffff8884035500a8 (&q->q_usage_counter(io)#2){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0xe/0x20
> [ 36.975183]
> which lock already depends on the new lock.
> [ 36.991541]
> the existing dependency chain (in reverse order) is:
> [ 37.006502]
> -> #4 (&q->q_usage_counter(io)#2){++++}-{0:0}:
> [ 37.020429] blk_alloc_queue+0x345/0x380
> [ 37.029315] blk_mq_alloc_queue+0x51/0xb0
> [ 37.038376] __blk_mq_alloc_disk+0x14/0x60
> [ 37.047612] nvme_alloc_ns+0xa7/0xbc0
> [ 37.055976] nvme_scan_ns+0x25a/0x320
> [ 37.064339] async_run_entry_fn+0x28/0x110
> [ 37.073576] process_one_work+0x1e1/0x570
> [ 37.082634] worker_thread+0x184/0x330
> [ 37.091170] kthread+0xe6/0x1e0
> [ 37.098489] ret_from_fork+0x20b/0x260
> [ 37.107026] ret_from_fork_asm+0x11/0x20
> [ 37.115912]
> -> #3 (fs_reclaim){+.+.}-{0:0}:
> [ 37.127232] fs_reclaim_acquire+0x91/0xd0
> [ 37.136293] kmem_cache_alloc_lru_noprof+0x49/0x760
> [ 37.147090] __d_alloc+0x30/0x2a0
> [ 37.154759] d_alloc_parallel+0x4c/0x760
> [ 37.163644] __lookup_slow+0xc3/0x180
> [ 37.172008] simple_start_creating+0x57/0xc0
> [ 37.181590] debugfs_start_creating.part.0+0x4d/0xe0
> [ 37.192561] debugfs_create_dir+0x3e/0x1f0
> [ 37.201795] regulator_init+0x24/0x100
> [ 37.210335] do_one_initcall+0x46/0x250
> [ 37.219043] kernel_init_freeable+0x22c/0x430
> [ 37.228799] kernel_init+0x16/0x1b0
> [ 37.236814] ret_from_fork+0x20b/0x260
> [ 37.245351] ret_from_fork_asm+0x11/0x20
> [ 37.254235]
> -> #2 (&sb->s_type->i_mutex_key#3){+.+.}-{4:4}:
> [ 37.268336] down_write+0x25/0xa0
> [ 37.276003] simple_start_creating+0x29/0xc0
> [ 37.285582] debugfs_start_creating.part.0+0x4d/0xe0
> [ 37.296552] debugfs_create_dir+0x3e/0x1f0
> [ 37.305782] blk_register_queue+0x98/0x1c0
> [ 37.315014] __add_disk+0x21e/0x3b0
> [ 37.323030] add_disk_fwnode+0x75/0x160
> [ 37.331738] nvme_alloc_ns+0x395/0xbc0
> [ 37.340275] nvme_scan_ns+0x25a/0x320
> [ 37.348638] async_run_entry_fn+0x28/0x110
> [ 37.357870] process_one_work+0x1e1/0x570
> [ 37.366929] worker_thread+0x184/0x330
> [ 37.375461] kthread+0xe6/0x1e0
> [ 37.382779] ret_from_fork+0x20b/0x260
> [ 37.391315] ret_from_fork_asm+0x11/0x20
> [ 37.400200]
> -> #1 (&q->debugfs_mutex){+.+.}-{4:4}:
> [ 37.412736] __mutex_lock+0x83/0x1070
> [ 37.421100] rq_qos_add+0xde/0x130
> [ 37.428942] wbt_init+0x160/0x200
> [ 37.436612] blk_register_queue+0xe9/0x1c0
> [ 37.445843] __add_disk+0x21e/0x3b0
> [ 37.453859] add_disk_fwnode+0x75/0x160
> [ 37.462568] nvme_alloc_ns+0x395/0xbc0
> [ 37.471105] nvme_scan_ns+0x25a/0x320
> [ 37.479469] async_run_entry_fn+0x28/0x110
> [ 37.488702] process_one_work+0x1e1/0x570
> [ 37.497761] worker_thread+0x184/0x330
> [ 37.506296] kthread+0xe6/0x1e0
> [ 37.513618] ret_from_fork+0x20b/0x260
> [ 37.522154] ret_from_fork_asm+0x11/0x20
> [ 37.531038]
> -> #0 (&q->rq_qos_mutex){+.+.}-{4:4}:
> [ 37.543399] __lock_acquire+0x15fc/0x2730
> [ 37.552460] lock_acquire+0xb5/0x2a0
> [ 37.560647] __mutex_lock+0x83/0x1070
> [ 37.569010] blkg_conf_open_bdev_frozen+0x80/0xa0
> [ 37.579457] ioc_qos_write+0x35/0x4a0
> [ 37.587820] kernfs_fop_write_iter+0x15c/0x240
> [ 37.597750] vfs_write+0x31f/0x4c0
> [ 37.605590] ksys_write+0x58/0xd0
> [ 37.613257] do_syscall_64+0x6f/0x1120
> [ 37.621790] entry_SYSCALL_64_after_hwframe+0x4b/0x53
> [ 37.632932]
> other info that might help us debug this:
> [ 37.648935] Chain exists of:
> &q->rq_qos_mutex --> fs_reclaim --> &q->q_usage_counter(io)#2
> [ 37.671552] Possible unsafe locking scenario:
> [ 37.683385] CPU0 CPU1
> [ 37.692438] ---- ----
> [ 37.701489] lock(&q->q_usage_counter(io)#2);
> [ 37.710374] lock(fs_reclaim);
> [ 37.721691] lock(&q->q_usage_counter(io)#2);
> [ 37.735615] lock(&q->rq_qos_mutex);
> [ 37.742934]
Well, this appears to be a false positive, and its signature is similar to the one
reported here[1] earlier. The difference here is that we have two different NVMe
block devices being added concurrently (as can be seen in the #1 and #2 chains above),
where device #1 waits on device #2's q->debugfs_mutex; in reality those two mutexes
are distinct, as each queue has its own ->debugfs_mutex.
In the earlier report, the same locking sequence was observed with virtio-blk devices,
which similarly triggered a spurious circular-dependency warning.
IMO, we need to make lockdep aware of this distinction by assigning a separate
lockdep key/class to each queue's q->debugfs_mutex to avoid this false positive.
As this is another report of the same false-positive lockdep splat, I think we
should address it.
Any other thoughts or suggestions from others on the list?
[1]https://lore.kernel.org/all/7de6c29f-9058-41ca-af95-f3aaf67a64d3@linux.ibm.com/
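The class-vs-instance merging behind this false positive can be sketched with a
small, hypothetical model (plain Python, not lockdep itself): when dependency
edges are recorded per lock *class*, the two queues' distinct mutexes collapse
into a single node each and a spurious cycle appears; per-instance keys keep the
two queues' dependency graphs disjoint. All names here are illustrative.

```python
from collections import defaultdict

class LockdepModel:
    """Toy dependency tracker: records 'held -> acquired' edges keyed
    either per lock class or per lock instance, then looks for a cycle."""
    def __init__(self, per_instance_keys=False):
        self.per_instance = per_instance_keys
        self.edges = defaultdict(set)

    def key(self, lock):
        # A lock is modeled as a (queue_id, class_name) pair.
        return lock if self.per_instance else lock[1]

    def acquire(self, held, acquired):
        self.edges[self.key(held)].add(self.key(acquired))

    def has_cycle(self):
        seen, stack = set(), set()
        def dfs(n):
            stack.add(n)
            for m in self.edges[n]:
                if m in stack or (m not in seen and dfs(m)):
                    return True
            stack.discard(n)
            seen.add(n)
            return False
        return any(n not in seen and dfs(n) for n in list(self.edges))

def simulate(per_instance):
    m = LockdepModel(per_instance)
    # Queue A: a path takes A's debugfs_mutex, then A's rq_qos_mutex.
    m.acquire(("A", "debugfs_mutex"), ("A", "rq_qos_mutex"))
    # Queue B: a path takes B's rq_qos_mutex, then B's debugfs_mutex.
    m.acquire(("B", "rq_qos_mutex"), ("B", "debugfs_mutex"))
    return m.has_cycle()

# Class-keyed tracking merges A's and B's mutexes -> spurious cycle.
print(simulate(per_instance=False))  # True  (false positive)
# Per-instance keys keep the graphs disjoint -> no cycle.
print(simulate(per_instance=True))   # False
```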
Thanks,
--Nilay
* Re: [REPORT] Possible circular locking dependency on 6.18-rc2 in blkg_conf_open_bdev_frozen+0x80/0xa0
2025-10-24 4:54 ` Nilay Shroff
@ 2025-10-27 20:30 ` Bart Van Assche
2025-10-28 13:06 ` Nilay Shroff
From: Bart Van Assche @ 2025-10-27 20:30 UTC (permalink / raw)
To: Nilay Shroff, David Wei, linux-block, cgroups
Cc: hch, hare, ming.lei, dlemoal, axboe, tj, josef, gjoyce, lkp,
oliver.sang
On 10/23/25 9:54 PM, Nilay Shroff wrote:
> IMO, we need to make lockdep learn about this differences by assigning separate
> lockdep key/class for each queue's q->debugfs_mutex to avoid this false positive.
> As this is another report with the same false-positive lockdep splat, I think we
> should address this.
>
> Any other thoughts or suggestions from others on the list?
Please take a look at lockdep_register_key() and
lockdep_unregister_key(). I introduced these functions six years ago to
suppress false positive lockdep complaints like this one.
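For reference, a rough sketch of what this could look like for the queue's
debugfs_mutex (illustrative only, not a compilable patch; the debugfs_key field
and the helper names are hypothetical, and where exactly the init/exit calls
belong in the queue lifecycle is left open):

```c
/* Sketch only. Assumes a per-queue lock_class_key is added to
 * struct request_queue alongside the existing debugfs_mutex:
 *
 *     struct mutex debugfs_mutex;
 *     struct lock_class_key debugfs_key;   // hypothetical field
 */

static void blk_queue_debugfs_lock_init(struct request_queue *q)
{
	/* Register a dynamic key and give this queue's debugfs_mutex
	 * its own lockdep class, so dependencies recorded against
	 * different queues' mutexes are no longer merged. */
	lockdep_register_key(&q->debugfs_key);
	mutex_init(&q->debugfs_mutex);
	lockdep_set_class(&q->debugfs_mutex, &q->debugfs_key);
}

static void blk_queue_debugfs_lock_exit(struct request_queue *q)
{
	mutex_destroy(&q->debugfs_mutex);
	/* Must not be unregistered while the mutex is still in use. */
	lockdep_unregister_key(&q->debugfs_key);
}
```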
Thanks,
Bart.
* Re: [REPORT] Possible circular locking dependency on 6.18-rc2 in blkg_conf_open_bdev_frozen+0x80/0xa0
2025-10-27 20:30 ` Bart Van Assche
@ 2025-10-28 13:06 ` Nilay Shroff
2025-10-29 8:25 ` Ming Lei
From: Nilay Shroff @ 2025-10-28 13:06 UTC (permalink / raw)
To: Bart Van Assche, David Wei, linux-block, cgroups
Cc: hch, hare, ming.lei, dlemoal, axboe, tj, josef, gjoyce, lkp,
oliver.sang
On 10/28/25 2:00 AM, Bart Van Assche wrote:
> On 10/23/25 9:54 PM, Nilay Shroff wrote:
>> IMO, we need to make lockdep learn about this differences by assigning separate
>> lockdep key/class for each queue's q->debugfs_mutex to avoid this false positive.
>> As this is another report with the same false-positive lockdep splat, I think we
>> should address this.
>>
>> Any other thoughts or suggestions from others on the list?
>
> Please take a look at lockdep_register_key() and
> lockdep_unregister_key(). I introduced these functions six years ago to
> suppress false positive lockdep complaints like this one.
>
Thanks Bart! I'll send out a patch with the proposed fix.
Thanks,
--Nilay
* Re: [REPORT] Possible circular locking dependency on 6.18-rc2 in blkg_conf_open_bdev_frozen+0x80/0xa0
2025-10-28 13:06 ` Nilay Shroff
@ 2025-10-29 8:25 ` Ming Lei
2025-10-30 5:48 ` Nilay Shroff
From: Ming Lei @ 2025-10-29 8:25 UTC (permalink / raw)
To: Nilay Shroff
Cc: Bart Van Assche, David Wei, linux-block, cgroups, hch, hare,
dlemoal, axboe, tj, josef, gjoyce, lkp, oliver.sang
On Tue, Oct 28, 2025 at 06:36:20PM +0530, Nilay Shroff wrote:
>
>
> On 10/28/25 2:00 AM, Bart Van Assche wrote:
> > On 10/23/25 9:54 PM, Nilay Shroff wrote:
> >> IMO, we need to make lockdep learn about this differences by assigning separate
> >> lockdep key/class for each queue's q->debugfs_mutex to avoid this false positive.
> >> As this is another report with the same false-positive lockdep splat, I think we
> >> should address this.
> >>
> >> Any other thoughts or suggestions from others on the list?
> >
> > Please take a look at lockdep_register_key() and
> > lockdep_unregister_key(). I introduced these functions six years ago to
> > suppress false positive lockdep complaints like this one.
> >
> Thanks Bart! I'll send out patch with the above proposed fix.
IMO, that may not be a smart approach; here, the following dependency should
be cut:
#4 (&q->q_usage_counter(io)#2){++++}-{0:0}:
...
#1 (&q->debugfs_mutex){+.+.}-{4:4}:
#0 (&q->rq_qos_mutex){+.+.}-{4:4}:
Why is there a dependency between `#1 (&q->debugfs_mutex)` and `#0 (&q->rq_qos_mutex)`?
I remember that Yu Kuai is working on removing it:
https://lore.kernel.org/linux-block/20251014022149.947800-1-yukuai3@huawei.com/
Thanks,
Ming
* Re: [REPORT] Possible circular locking dependency on 6.18-rc2 in blkg_conf_open_bdev_frozen+0x80/0xa0
2025-10-29 8:25 ` Ming Lei
@ 2025-10-30 5:48 ` Nilay Shroff
2025-10-30 6:26 ` Yu Kuai
From: Nilay Shroff @ 2025-10-30 5:48 UTC (permalink / raw)
To: Ming Lei
Cc: Bart Van Assche, David Wei, linux-block, cgroups, hch, hare,
dlemoal, axboe, tj, josef, gjoyce, lkp, oliver.sang, yukuai
On 10/29/25 1:55 PM, Ming Lei wrote:
> On Tue, Oct 28, 2025 at 06:36:20PM +0530, Nilay Shroff wrote:
>>
>>
>> On 10/28/25 2:00 AM, Bart Van Assche wrote:
>>> On 10/23/25 9:54 PM, Nilay Shroff wrote:
>>>> IMO, we need to make lockdep learn about this differences by assigning separate
>>>> lockdep key/class for each queue's q->debugfs_mutex to avoid this false positive.
>>>> As this is another report with the same false-positive lockdep splat, I think we
>>>> should address this.
>>>>
>>>> Any other thoughts or suggestions from others on the list?
>>>
>>> Please take a look at lockdep_register_key() and
>>> lockdep_unregister_key(). I introduced these functions six years ago to
>>> suppress false positive lockdep complaints like this one.
>>>
>> Thanks Bart! I'll send out patch with the above proposed fix.
>
> IMO, that may not be a smart approach, here the following dependency should
> be cut:
> #4 (&q->q_usage_counter(io)#2){++++}-{0:0}:
>
> ...
>
> #1 (&q->debugfs_mutex){+.+.}-{4:4}:
> #0 (&q->rq_qos_mutex){+.+.}-{4:4}:
>
> Why is there the dependency between `#1 (&q->debugfs_mutex)` and `#0 (&q->rq_qos_mutex)`?
>
Okay, this also makes sense if we can remove ->rq_qos_mutex's dependency on
->debugfs_mutex. However, I also think lockdep may, even after that dependency
is cut, still generate spurious splats (since it's not able to distinguish
between distinct ->debugfs_mutex instances). We can check later whether that
really happens.
> I remember that Yu Kuai is working on remove it:
>
> https://lore.kernel.org/linux-block/20251014022149.947800-1-yukuai3@huawei.com/
>
Let me add Yu Kuai to the Cc. BTW, of late my emails to Yu have been bouncing,
but I found another address for Yu on lore, so I am adding it here - hope this
is the correct one.
Thanks,
--Nilay
* Re: [REPORT] Possible circular locking dependency on 6.18-rc2 in blkg_conf_open_bdev_frozen+0x80/0xa0
2025-10-30 5:48 ` Nilay Shroff
@ 2025-10-30 6:26 ` Yu Kuai
From: Yu Kuai @ 2025-10-30 6:26 UTC (permalink / raw)
To: Nilay Shroff, Ming Lei
Cc: Bart Van Assche, David Wei, linux-block, cgroups, hch, hare,
dlemoal, axboe, tj, josef, gjoyce, lkp, oliver.sang, yukuai
Hi,
On 2025/10/30 13:48, Nilay Shroff wrote:
>
> On 10/29/25 1:55 PM, Ming Lei wrote:
>> On Tue, Oct 28, 2025 at 06:36:20PM +0530, Nilay Shroff wrote:
>>>
>>> On 10/28/25 2:00 AM, Bart Van Assche wrote:
>>>> On 10/23/25 9:54 PM, Nilay Shroff wrote:
>>>>> IMO, we need to make lockdep learn about this differences by assigning separate
>>>>> lockdep key/class for each queue's q->debugfs_mutex to avoid this false positive.
>>>>> As this is another report with the same false-positive lockdep splat, I think we
>>>>> should address this.
>>>>>
>>>>> Any other thoughts or suggestions from others on the list?
>>>> Please take a look at lockdep_register_key() and
>>>> lockdep_unregister_key(). I introduced these functions six years ago to
>>>> suppress false positive lockdep complaints like this one.
>>>>
>>> Thanks Bart! I'll send out patch with the above proposed fix.
>> IMO, that may not be a smart approach, here the following dependency should
>> be cut:
>> #4 (&q->q_usage_counter(io)#2){++++}-{0:0}:
>>
>> ...
>>
>> #1 (&q->debugfs_mutex){+.+.}-{4:4}:
>> #0 (&q->rq_qos_mutex){+.+.}-{4:4}:
>>
>> Why is there the dependency between `#1 (&q->debugfs_mutex)` and `#0 (&q->rq_qos_mutex)`?
>>
> Okay this also makes sense: if we could remove the ->rq_qos_mutex depedency on
> ->debug_fs_mutex. However I also think lockdep may, even after cutting depedency,
> still generate spurious splat (due to it's not able to distinguish between
> distinct ->debugfs_mutex instances). We may check that letter if that really
> happens.
This way we have to make sure debugfs_mutex is not held while the queue is frozen, and
is also not nested under other locks that can be held while the queue is frozen. I think
that will be enough.
>> I remember that Yu Kuai is working on remove it:
>>
>> https://lore.kernel.org/linux-block/20251014022149.947800-1-yukuai3@huawei.com/
Yeah, I didn't continue working on a new version because the problem I hit was fixed
separately; this report might be another motivation.
> Let me add Yu Kuai in the Cc. BTW, of late I saw my emails to Yu are bouncing.
> But I found another email of Yu from lore, so I am adding it here - hope, this
Thanks. I just left my previous employer, and the old huawei and huaweicloud emails
can't be used anymore.
Thanks
Kuai
> is the correct email.
>
> Thanks,
> --Nilay
>