From: Nilay Shroff <nilay@linux.ibm.com>
To: Yi Zhang <yi.zhang@redhat.com>
Cc: linux-block@vger.kernel.org, hch@lst.de, ming.lei@redhat.com,
hare@suse.de, axboe@kernel.dk, sth@linux.ibm.com, gjoyce@ibm.com,
Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Subject: Re: [PATCHv7 0/3] block: move sched_tags allocation/de-allocation outside of locking context
Date: Wed, 2 Jul 2025 19:47:11 +0530 [thread overview]
Message-ID: <1b1b8767-2e08-496a-89db-385edf592d23@linux.ibm.com> (raw)
In-Reply-To: <CAHj4cs9ABpwaoywocFAHP+3k=oWsqBKbM5GFbCedgjEyxzpChA@mail.gmail.com>
On 7/2/25 7:23 PM, Yi Zhang wrote:
> Hi Nilay
>
> With the patch on the latest linux-block/for-next, I reproduced the
> following WARNING with blktests block/005, here is the full log:
>
> [ 342.845331] run blktests block/005 at 2025-07-02 09:48:55
>
> [ 343.835605] ======================================================
> [ 343.841783] WARNING: possible circular locking dependency detected
> [ 343.847966] 6.16.0-rc4.fix+ #3 Not tainted
> [ 343.852073] ------------------------------------------------------
> [ 343.858250] check/1365 is trying to acquire lock:
> [ 343.862957] ffffffff98141db0 (pcpu_alloc_mutex){+.+.}-{4:4}, at:
> pcpu_alloc_noprof+0x8eb/0xd70
> [ 343.871587]
> but task is already holding lock:
> [ 343.877421] ffff888300cfb040 (&q->elevator_lock){+.+.}-{4:4}, at:
> elevator_change+0x152/0x530
> [ 343.885958]
> which lock already depends on the new lock.
>
> [ 343.894131]
> the existing dependency chain (in reverse order) is:
> [ 343.901609]
> -> #3 (&q->elevator_lock){+.+.}-{4:4}:
> [ 343.907891] __lock_acquire+0x6f1/0xc00
> [ 343.912259] lock_acquire.part.0+0xb6/0x240
> [ 343.916966] __mutex_lock+0x17b/0x1690
> [ 343.921247] elevator_change+0x152/0x530
> [ 343.925692] elv_iosched_store+0x205/0x2f0
> [ 343.930312] queue_attr_store+0x23b/0x300
> [ 343.934853] kernfs_fop_write_iter+0x357/0x530
> [ 343.939829] vfs_write+0x9bc/0xf60
> [ 343.943763] ksys_write+0xf3/0x1d0
> [ 343.947695] do_syscall_64+0x8c/0x3d0
> [ 343.951883] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 343.957462]
> -> #2 (&q->q_usage_counter(io)#4){++++}-{0:0}:
> [ 343.964440] __lock_acquire+0x6f1/0xc00
> [ 343.968799] lock_acquire.part.0+0xb6/0x240
> [ 343.973507] blk_alloc_queue+0x5c5/0x710
> [ 343.977959] blk_mq_alloc_queue+0x14e/0x240
> [ 343.982666] __blk_mq_alloc_disk+0x15/0xd0
> [ 343.987294] nvme_alloc_ns+0x208/0x1690 [nvme_core]
> [ 343.992727] nvme_scan_ns+0x362/0x4c0 [nvme_core]
> [ 343.997978] async_run_entry_fn+0x96/0x4f0
> [ 344.002599] process_one_work+0x8cd/0x1950
> [ 344.007226] worker_thread+0x58d/0xcf0
> [ 344.011499] kthread+0x3d8/0x7a0
> [ 344.015259] ret_from_fork+0x406/0x510
> [ 344.019532] ret_from_fork_asm+0x1a/0x30
> [ 344.023980]
> -> #1 (fs_reclaim){+.+.}-{0:0}:
> [ 344.029654] __lock_acquire+0x6f1/0xc00
> [ 344.034015] lock_acquire.part.0+0xb6/0x240
> [ 344.038727] fs_reclaim_acquire+0x103/0x150
> [ 344.043433] prepare_alloc_pages+0x15f/0x600
> [ 344.048230] __alloc_frozen_pages_noprof+0x14a/0x3a0
> [ 344.053722] __alloc_pages_noprof+0xd/0x1d0
> [ 344.058438] pcpu_alloc_pages.constprop.0+0x104/0x420
> [ 344.064017] pcpu_populate_chunk+0x38/0x80
> [ 344.068644] pcpu_alloc_noprof+0x650/0xd70
> [ 344.073265] iommu_dma_init_fq+0x183/0x730
> [ 344.077893] iommu_dma_init_domain+0x566/0x990
> [ 344.082866] iommu_setup_dma_ops+0xca/0x230
> [ 344.087571] bus_iommu_probe+0x1f8/0x4a0
> [ 344.092020] iommu_device_register+0x153/0x240
> [ 344.096993] iommu_init_pci+0x53c/0x1040
> [ 344.101447] amd_iommu_init_pci+0xb6/0x5c0
> [ 344.106066] state_next+0xaf7/0xff0
> [ 344.110080] iommu_go_to_state+0x21/0x80
> [ 344.114535] amd_iommu_init+0x15/0x70
> [ 344.118728] pci_iommu_init+0x29/0x70
> [ 344.122914] do_one_initcall+0x100/0x5a0
> [ 344.127361] do_initcalls+0x138/0x1d0
> [ 344.131556] kernel_init_freeable+0x8b7/0xbd0
> [ 344.136442] kernel_init+0x1b/0x1f0
> [ 344.140456] ret_from_fork+0x406/0x510
> [ 344.144735] ret_from_fork_asm+0x1a/0x30
> [ 344.149182]
> -> #0 (pcpu_alloc_mutex){+.+.}-{4:4}:
> [ 344.155379] check_prev_add+0xf1/0xce0
> [ 344.159653] validate_chain+0x470/0x580
> [ 344.164019] __lock_acquire+0x6f1/0xc00
> [ 344.168378] lock_acquire.part.0+0xb6/0x240
> [ 344.173085] __mutex_lock+0x17b/0x1690
> [ 344.177365] pcpu_alloc_noprof+0x8eb/0xd70
> [ 344.181984] kyber_queue_data_alloc+0x16d/0x660
> [ 344.187047] kyber_init_sched+0x14/0x90
> [ 344.191413] blk_mq_init_sched+0x264/0x4e0
> [ 344.196033] elevator_switch+0x186/0x6a0
> [ 344.200478] elevator_change+0x305/0x530
> [ 344.204924] elv_iosched_store+0x205/0x2f0
> [ 344.209545] queue_attr_store+0x23b/0x300
> [ 344.214084] kernfs_fop_write_iter+0x357/0x530
> [ 344.219051] vfs_write+0x9bc/0xf60
> [ 344.222976] ksys_write+0xf3/0x1d0
> [ 344.226902] do_syscall_64+0x8c/0x3d0
> [ 344.231088] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 344.236660]
Thanks for the report!
I see that the warning above is different from the one addressed by the
current patchset. In the warning you've reported, the kyber elevator
allocates per-CPU data (pcpu_alloc_noprof via kyber_queue_data_alloc) after
acquiring ->elevator_lock, which introduces a dependency from
pcpu_alloc_mutex on ->elevator_lock.
In contrast, the current patchset addresses a separate issue [1] that arises
from elevator tag allocation. That allocation occurs while both ->freeze_lock
and ->elevator_lock are held; internally, elevator tag allocation sets up the
per-CPU sbitmap->alloc_hint, which introduces a similar per-CPU allocation
dependency on ->elevator_lock.
That said, I plan to address the issue you've reported in a separate
patch once the current patchset is merged.
Thanks,
--Nilay
[1]https://lore.kernel.org/all/0659ea8d-a463-47c8-9180-43c719e106eb@linux.ibm.com/
2025-07-01 8:18 [PATCHv7 0/3] block: move sched_tags allocation/de-allocation outside of locking context Nilay Shroff
2025-07-01 8:18 ` [PATCHv7 1/3] block: move elevator queue allocation logic into blk_mq_init_sched Nilay Shroff
2025-07-01 8:18 ` [PATCHv7 2/3] block: fix lockdep warning caused by lock dependency in elv_iosched_store Nilay Shroff
2025-07-01 10:52 ` Hannes Reinecke
2025-07-01 8:19 ` [PATCHv7 3/3] block: fix potential deadlock while running nr_hw_queue update Nilay Shroff
2025-07-01 11:00 ` Hannes Reinecke
2025-07-02 13:53 ` [PATCHv7 0/3] block: move sched_tags allocation/de-allocation outside of locking context Yi Zhang
2025-07-02 14:17 ` Nilay Shroff [this message]
2025-07-02 14:41 ` Yi Zhang